sparklyr icon indicating copy to clipboard operation
sparklyr copied to clipboard

Error in dbi_ensure_no_backtick(x) : Can't escape back tick from string

Open mknoszlig opened this issue 3 years ago • 2 comments

After upgrading to sparklyr 1.77 (and dbplyr 2.2.0) I've been getting errors in semi_joins.

Minimal example:

spark_version <- "3.1.2"
sc <- spark_connect(master = "local", version = spark_version)
df = sdf_len(sc, 20)
other = sdf_len(sc, 10)
semi_join(df, other)
# Joining, by = "id"
# -> Error in dbi_ensure_no_backtick(x) : Can't escape back tick from string

I've poked a bit at this and it appears that in the SQL generation sometimes the names RHS and LHS are generated enclosed in backticks (prematurely?) which makes db_ensure_no_backtick fail. I'm however not exactly sure where that happens.

I have a (very ad hoc) workaround for this:

setMethod("dbQuoteIdentifier", c("spark_connection", "character"), function(conn, x, ...) {
  if (length(x) == 0L) {
    x
  } else {
    
    if (regexpr("^`[^`]+`$", x)[[1]] == -1) {
      dbi_ensure_no_backtick(x)
      y <- paste("`", x, "`", sep = "")
    } else {
      y <- x
    }

    SQL(y)
  }
})

this allows strings that contain backticks only in the first and last positions and treats them as already quoted. this works fine for my current use case but ymmv.

mknoszlig avatar Jun 24 '22 14:06 mknoszlig

Hi, I'm not able to recreate the issue. Do you think you can build an reprex like I did here?:

library(sparklyr)
#> 
#> Attaching package: 'sparklyr'
#> The following object is masked from 'package:stats':
#> 
#>     filter
library(dbplyr)
library(dplyr)
#> 
#> Attaching package: 'dplyr'
#> The following objects are masked from 'package:dbplyr':
#> 
#>     ident, sql
#> The following objects are masked from 'package:stats':
#> 
#>     filter, lag
#> The following objects are masked from 'package:base':
#> 
#>     intersect, setdiff, setequal, union
spark_version <- "3.1.2"
sc <- spark_connect(master = "local", version = spark_version)
df = sdf_len(sc, 20)
other = sdf_len(sc, 10)
semi_join(df, other)
#> Joining, by = "id"
#> # Source: spark<?> [?? x 1]
#>       id
#>    <int>
#>  1     6
#>  2     9
#>  3     5
#>  4    10
#>  5     1
#>  6     3
#>  7     7
#>  8     2
#>  9     4
#> 10     8


utils::sessionInfo()
#> R version 4.2.0 (2022-04-22)
#> Platform: aarch64-apple-darwin20 (64-bit)
#> Running under: macOS Monterey 12.4
#> 
#> Matrix products: default
#> BLAS:   /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRblas.0.dylib
#> LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib
#> 
#> locale:
#> [1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> other attached packages:
#> [1] dplyr_1.0.9    dbplyr_2.2.1   sparklyr_1.7.7
#> 
#> loaded via a namespace (and not attached):
#>  [1] pillar_1.7.0      compiler_4.2.0    highr_0.9         r2d3_0.2.6       
#>  [5] base64enc_0.1-3   R.methodsS3_1.8.1 R.utils_2.11.0    tools_4.2.0      
#>  [9] uuid_1.1-0        digest_0.6.29     jsonlite_1.8.0    evaluate_0.15    
#> [13] lifecycle_1.0.1   tibble_3.1.7      R.cache_0.15.0    pkgconfig_2.0.3  
#> [17] rlang_1.0.2       reprex_2.0.1      cli_3.3.0         DBI_1.1.2        
#> [21] rstudioapi_0.13   parallel_4.2.0    yaml_2.3.5        xfun_0.30        
#> [25] fastmap_1.1.0     httr_1.4.2        withr_2.5.0       styler_1.7.0     
#> [29] stringr_1.4.0     knitr_1.39        askpass_1.1       rappdirs_0.3.3   
#> [33] htmlwidgets_1.5.4 generics_0.1.2    fs_1.5.2          vctrs_0.4.1      
#> [37] rprojroot_2.0.3   tidyselect_1.1.2  forge_0.2.0       glue_1.6.2       
#> [41] R6_2.5.1          fansi_1.0.3       rmarkdown_2.14    tidyr_1.2.0      
#> [45] purrr_0.3.4       magrittr_2.0.3    ellipsis_0.3.2    htmltools_0.5.2  
#> [49] assertthat_0.2.1  config_0.3.1      utf8_1.2.2        stringi_1.7.6    
#> [53] openssl_2.0.0     crayon_1.5.1      R.oo_1.24.0

Created on 2022-06-28 by the reprex package (v2.0.1)

edgararuiz avatar Jun 28 '22 20:06 edgararuiz

Hi, here's my output:

> utils::sessionInfo()
R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Debian GNU/Linux 10 (buster)

Matrix products: default
BLAS/LAPACK: /usr/lib/x86_64-linux-gnu/libopenblasp-r0.3.5.so

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=en_US.UTF-8     LC_MONETARY=en_US.UTF-8    LC_MESSAGES=C             
 [7] LC_PAPER=en_US.UTF-8       LC_NAME=C                  LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] dbplyr_2.2.0   dplyr_1.0.9    sparklyr_1.7.7

loaded via a namespace (and not attached):
 [1] pillar_1.7.0      compiler_3.6.3    r2d3_0.2.6        base64enc_0.1-3   tools_3.6.3       uuid_1.1-0        digest_0.6.29     jsonlite_1.8.0    lifecycle_1.0.1   tibble_3.1.7     
[11] pkgconfig_2.0.3   rlang_1.0.2       DBI_1.1.3         cli_3.3.0         rstudioapi_0.13   curl_4.3.2        yaml_2.3.5        parallel_3.6.3    fastmap_1.1.0     xml2_1.3.3       
[21] withr_2.5.0       httr_1.4.3        generics_0.1.2    vctrs_0.4.1       htmlwidgets_1.5.4 askpass_1.1       rappdirs_0.3.3    rprojroot_2.0.3   tidyselect_1.1.2  glue_1.6.2       
[31] forge_0.2.0       R6_2.5.1          fansi_1.0.3       purrr_0.3.4       tidyr_1.2.0       magrittr_2.0.3    ellipsis_0.3.2    htmltools_0.5.2   assertthat_0.2.1  config_0.3.1     
[41] utf8_1.2.2        openssl_2.0.2     crayon_1.5.1

In this project I'm using R3.6.3 which might contribute to what I'm seeing. I also noticed you're using dbplyr 2.2.1 in your comment, not 2.2.0 (not sure if that changes anything, though).

mknoszlig avatar Jul 20 '22 13:07 mknoszlig

Hi, sorry, I meant to get code I can run on my environment that gives the same error. Do you think you can provide that?

edgararuiz avatar Dec 06 '22 16:12 edgararuiz

Automatically closed because there has not been a response for 30 days. When you're ready to work on this further, please comment here and the issue will automatically reopen.

github-actions[bot] avatar Jan 06 '23 06:01 github-actions[bot]