
Invalid multibyte string when printing certain combination of characters with fs::path()

Open giocomai opened this issue 2 years ago • 2 comments

This is an issue that emerges when a certain, admittedly unusual, combination of valid characters is printed on the console via fs::path(). The shortest combination I found is the following (with the relevant error):

fs::path("śl")

Error in base::nchar(x, type, allowNA, keepNA) : invalid multibyte string, element 1

The issue does not seem to be related to the ś itself, as the following works nicely:

fs::path("ś")

Obvious solutions such as the following all return the same error:

fs::path(stringi::stri_enc_toutf8("śl"))
fs::path(stringi::stri_enc_tonative("śl"))
fs::path(fs::path_sanitize("śl"))

Error in base::nchar(x, type, allowNA, keepNA) : invalid multibyte string, element 1
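One way to check that the input string itself is well formed is a base R sketch like the one below. It only inspects the plain character vector and makes no claim about what happens inside fs when the path is printed. The `\u015b` escape stands in for ś so that the result does not depend on the encoding of the source file.

```r
# "śl", written with a Unicode escape (an assumption made for portability)
x <- "\u015bl"

# TRUE means the bytes of the string form valid UTF-8
validUTF8(x)
#> [1] TRUE

# Declared encoding mark of the string
Encoding(x)

# The failing call in the error message is nchar(); on the plain string
# it counts two characters stored in three bytes
nchar(x, type = "chars")
#> [1] 2
nchar(x, type = "bytes")
#> [1] 3
```

If these checks pass, the string is probably not the culprit, and the problem is more likely in how the fs_path object is formatted for printing.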

If you look at the reprex output, everything seemingly works fine. But if I run the same code in the console, I get the above error.

I include a reprex for reference, as well as a screenshot, since the issue is not fully visible via reprex.

In the real world, this issue emerged because I print the output of fs::path() to show progress in a script that processes data related to a bunch of cities, including the Polish city of Przemyśl.

Of course there are ways around it, but since it broke my scripts, I still decided to report this. Tested with both the current development version (see reprex below) and the version currently on CRAN (1.5.2).

library("fs")

# see e.g. https://en.wikipedia.org/wiki/Przemy%C5%9Bl
x <- "Przemyśl"

# works
print(x)
#> [1] "Przemyśl"

# throws error
fs::path(x)
#> Przemyśl

# throws error
fs::path(stringi::stri_enc_toutf8(x))
#> Przemyśl

# works
fs::path_sanitize(filename = x) 
#> [1] "Przemyśl"

# throws error
fs::path(fs::path_sanitize(filename = x))
#> Przemyśl

# works
y <- invisible(fs::path(x))

# works
stringr::str_c(y)
#> [1] "Przemyśl"

# works
filename <- fs::path(tempdir(), "Przemyśl.txt")

# throws error
fs::path(filename)
#> /tmp/RtmpzO4Dt3/Przemyśl.txt

# works
writeLines("test", fs::path(filename))

# works
fs::path("ś")
#> ś

# works
fs::path("Przemyś")
#> Przemyś

# throws error
fs::path("śl")
#> śl


devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#>  setting  value
#>  version  R version 4.1.3 (2022-03-10)
#>  os       Fedora Linux 35 (Workstation Edition)
#>  system   x86_64, linux-gnu
#>  ui       X11
#>  language (EN)
#>  collate  en_IE.UTF-8
#>  ctype    en_IE.UTF-8
#>  tz       Europe/Vienna
#>  date     2022-05-31
#>  pandoc   2.14.0.3 @ /usr/libexec/rstudio/bin/pandoc/ (via rmarkdown)
#> 
#> ─ Packages ───────────────────────────────────────────────────────────────────
#>  package     * version    date (UTC) lib source
#>  brio          1.1.3      2021-11-30 [1] CRAN (R 4.1.2)
#>  cachem        1.0.6      2021-08-19 [1] CRAN (R 4.1.1)
#>  callr         3.7.0      2021-04-20 [1] CRAN (R 4.1.2)
#>  cli           3.3.0      2022-04-25 [1] CRAN (R 4.1.3)
#>  crayon        1.5.1      2022-03-26 [1] CRAN (R 4.1.2)
#>  desc          1.4.1      2022-03-06 [1] CRAN (R 4.1.2)
#>  devtools      2.4.3      2021-11-30 [1] CRAN (R 4.1.2)
#>  digest        0.6.29     2021-12-01 [1] CRAN (R 4.1.2)
#>  ellipsis      0.3.2      2021-04-29 [2] CRAN (R 4.1.0)
#>  evaluate      0.15       2022-02-18 [1] CRAN (R 4.1.2)
#>  fansi         1.0.3      2022-03-24 [1] CRAN (R 4.1.2)
#>  fastmap       1.1.0      2021-01-25 [2] CRAN (R 4.1.0)
#>  fs          * 1.5.2.9000 2022-05-31 [1] Github (r-lib/fs@e7d98c4)
#>  glue          1.6.2      2022-02-24 [1] CRAN (R 4.1.2)
#>  highr         0.9        2021-04-16 [3] CRAN (R 4.1.0)
#>  htmltools     0.5.2      2021-08-25 [1] CRAN (R 4.1.1)
#>  knitr         1.39       2022-04-26 [1] CRAN (R 4.1.3)
#>  lifecycle     1.0.1      2021-09-24 [1] CRAN (R 4.1.1)
#>  magrittr      2.0.3      2022-03-30 [1] CRAN (R 4.1.3)
#>  memoise       2.0.1      2021-11-26 [1] CRAN (R 4.1.2)
#>  pillar        1.7.0      2022-02-01 [1] CRAN (R 4.1.2)
#>  pkgbuild      1.3.1      2021-12-20 [1] CRAN (R 4.1.2)
#>  pkgconfig     2.0.3      2019-09-22 [1] CRAN (R 4.1.1)
#>  pkgload       1.2.4      2021-11-30 [1] CRAN (R 4.1.2)
#>  prettyunits   1.1.1      2020-01-24 [1] CRAN (R 4.1.1)
#>  processx      3.5.2      2021-04-30 [2] CRAN (R 4.1.0)
#>  ps            1.6.0      2021-02-28 [2] CRAN (R 4.1.0)
#>  purrr         0.3.4      2020-04-17 [1] CRAN (R 4.1.2)
#>  R.cache       0.15.0     2021-04-30 [3] CRAN (R 4.1.0)
#>  R.methodsS3   1.8.1      2020-08-26 [3] CRAN (R 4.1.0)
#>  R.oo          1.24.0     2020-08-26 [3] CRAN (R 4.1.0)
#>  R.utils       2.11.0     2021-09-26 [1] CRAN (R 4.1.2)
#>  R6            2.5.1      2021-08-19 [1] CRAN (R 4.1.1)
#>  remotes       2.4.2      2021-11-30 [1] CRAN (R 4.1.2)
#>  reprex        2.0.1      2021-08-05 [1] CRAN (R 4.1.1)
#>  rlang         1.0.2      2022-03-04 [1] CRAN (R 4.1.2)
#>  rmarkdown     2.14       2022-04-25 [1] CRAN (R 4.1.3)
#>  rprojroot     2.0.3      2022-04-02 [1] CRAN (R 4.1.3)
#>  rstudioapi    0.13       2020-11-12 [1] CRAN (R 4.1.2)
#>  sessioninfo   1.2.2      2021-12-06 [1] CRAN (R 4.1.2)
#>  stringi       1.7.6      2021-11-29 [1] CRAN (R 4.1.2)
#>  stringr       1.4.0      2019-02-10 [1] CRAN (R 4.1.2)
#>  styler        1.7.0      2022-03-13 [1] CRAN (R 4.1.2)
#>  testthat      3.1.4      2022-04-26 [1] CRAN (R 4.1.3)
#>  tibble        3.1.7      2022-05-03 [1] CRAN (R 4.1.3)
#>  usethis       2.1.6      2022-05-25 [1] CRAN (R 4.1.3)
#>  utf8          1.2.2      2021-07-24 [1] CRAN (R 4.1.1)
#>  vctrs         0.4.1      2022-04-13 [1] CRAN (R 4.1.3)
#>  withr         2.5.0      2022-03-03 [1] CRAN (R 4.1.2)
#>  xfun          0.31       2022-05-10 [1] CRAN (R 4.1.3)
#>  yaml          2.3.5      2022-02-21 [1] CRAN (R 4.1.2)
#> 
#>  [1] /home/g/R/x86_64-redhat-linux-gnu-library/4.1
#>  [2] /usr/lib64/R/library
#>  [3] /usr/share/R/library
#> 
#> ──────────────────────────────────────────────────────────────────────────────

[Screenshots: reprex_01, reprex_02]

Created on 2022-05-31 by the reprex package (v2.0.1)

giocomai, May 31 '22 18:05

There is a chance that R 4.2.0 fixes this, if you need a workaround now.

gaborcsardi, Jun 01 '22 05:06

Thanks! Yes, ultimately even wrapping it in stringr::str_c() fixes it, so there are plenty of workarounds!
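A minimal sketch of such a workaround, for anyone landing here: convert the path to a plain character vector before printing it. This uses base `as.character()` in place of `stringr::str_c()`, on the assumption that it serves the same purpose here. `show_progress()` is a hypothetical helper, not part of fs.

```r
# Hypothetical helper: report progress without printing the fs_path object itself
show_progress <- function(p) {
  # as.character() returns a plain character vector, so message() formats
  # ordinary text rather than going through the print method for fs_path
  message("Processing: ", as.character(p))
  invisible(p)
}

# Only runs if fs is installed; the escape \u015b stands in for "ś"
if (requireNamespace("fs", quietly = TRUE)) {
  p <- fs::path(tempdir(), "Przemy\u015bl.txt")
  show_progress(p)
}
```

The path object stays an fs_path for file operations; only the text shown to the user is converted.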

giocomai avatar Jun 01 '22 08:06 giocomai