Invalid multibyte string when printing certain combinations of characters with fs::path()
This issue emerges when a certain, admittedly unusual, combination of valid characters is printed on the console via fs::path(). The shortest combination I found is the following (with the resulting error):
fs::path("śl")
Error in base::nchar(x, type, allowNA, keepNA) : invalid multibyte string, element 1
The issue does not seem to be related to the ś itself, as the following works fine:
fs::path("ś")
Obvious solutions such as the following all return the same error:
fs::path(stringi::stri_enc_toutf8("śl"))
fs::path(stringi::stri_enc_tonative("śl"))
fs::path(fs::path_sanitize("śl"))
Error in base::nchar(x, type, allowNA, keepNA) : invalid multibyte string, element 1
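Since the error message names base::nchar(), it may help to first confirm that the input string itself is well-formed. The following is only a minimal base-R sketch for that check, not a diagnosis of the fs print method; the variable name `x` is illustrative, and the string is written with a `\u` escape so that the encoding of the source file does not matter.

```r
# Illustrative check: is the input string itself valid?
x <- "\u015bl"  # "śl"

# Declared encoding of the string ("UTF-8", "latin1", or "unknown")
Encoding(x)

# TRUE if the bytes form valid UTF-8
validUTF8(x)

# Raw bytes of the UTF-8 form; "ś" is c5 9b and "l" is 6c
charToRaw(enc2utf8(x))

# The function named in the error message, called directly on the plain string
nchar(x, type = "chars")
nchar(x, type = "width")
```

If these calls succeed on the plain string, the problem presumably lies in how the path is formatted for printing rather than in the string itself.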
In the rendered reprex, everything appears to work, including the calls marked "throws error". But when I run the same code interactively on the console, I get the above error.
I include the reprex for reference, as well as a screenshot, since the issue is not fully visible via reprex.
In the real world, this issue emerged because I print fs::path() output to show progress in a script that processes data for a number of cities, including the Polish city of Przemyśl.
There are of course ways around it, but since it broke my scripts, I decided to report it anyway. Tested with both the current development version (see reprex below) and the version currently on CRAN (1.5.2).
library("fs")
# see e.g. https://en.wikipedia.org/wiki/Przemy%C5%9Bl
x <- "Przemyśl"
# works
print(x)
#> [1] "Przemyśl"
# throws error
fs::path(x)
#> Przemyśl
# throws error
fs::path(stringi::stri_enc_toutf8(x))
#> Przemyśl
# works
fs::path_sanitize(filename = x)
#> [1] "Przemyśl"
# throws error
fs::path(fs::path_sanitize(filename = x))
#> Przemyśl
# works
y <- invisible(fs::path(x))
# works
stringr::str_c(y)
#> [1] "Przemyśl"
# works
filename <- fs::path(tempdir(), "Przemyśl.txt")
# throws error
fs::path(filename)
#> /tmp/RtmpzO4Dt3/Przemyśl.txt
# works
writeLines("test", fs::path(filename))
# works
fs::path("ś")
#> ś
# works
fs::path("Przemyś")
#> Przemyś
# throws error
fs::path("śl")
#> śl
devtools::session_info()
#> ─ Session info ───────────────────────────────────────────────────────────────
#> setting value
#> version R version 4.1.3 (2022-03-10)
#> os Fedora Linux 35 (Workstation Edition)
#> system x86_64, linux-gnu
#> ui X11
#> language (EN)
#> collate en_IE.UTF-8
#> ctype en_IE.UTF-8
#> tz Europe/Vienna
#> date 2022-05-31
#> pandoc 2.14.0.3 @ /usr/libexec/rstudio/bin/pandoc/ (via rmarkdown)
#>
#> ─ Packages ───────────────────────────────────────────────────────────────────
#> package * version date (UTC) lib source
#> brio 1.1.3 2021-11-30 [1] CRAN (R 4.1.2)
#> cachem 1.0.6 2021-08-19 [1] CRAN (R 4.1.1)
#> callr 3.7.0 2021-04-20 [1] CRAN (R 4.1.2)
#> cli 3.3.0 2022-04-25 [1] CRAN (R 4.1.3)
#> crayon 1.5.1 2022-03-26 [1] CRAN (R 4.1.2)
#> desc 1.4.1 2022-03-06 [1] CRAN (R 4.1.2)
#> devtools 2.4.3 2021-11-30 [1] CRAN (R 4.1.2)
#> digest 0.6.29 2021-12-01 [1] CRAN (R 4.1.2)
#> ellipsis 0.3.2 2021-04-29 [2] CRAN (R 4.1.0)
#> evaluate 0.15 2022-02-18 [1] CRAN (R 4.1.2)
#> fansi 1.0.3 2022-03-24 [1] CRAN (R 4.1.2)
#> fastmap 1.1.0 2021-01-25 [2] CRAN (R 4.1.0)
#> fs * 1.5.2.9000 2022-05-31 [1] Github (r-lib/fs@e7d98c4)
#> glue 1.6.2 2022-02-24 [1] CRAN (R 4.1.2)
#> highr 0.9 2021-04-16 [3] CRAN (R 4.1.0)
#> htmltools 0.5.2 2021-08-25 [1] CRAN (R 4.1.1)
#> knitr 1.39 2022-04-26 [1] CRAN (R 4.1.3)
#> lifecycle 1.0.1 2021-09-24 [1] CRAN (R 4.1.1)
#> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.1.3)
#> memoise 2.0.1 2021-11-26 [1] CRAN (R 4.1.2)
#> pillar 1.7.0 2022-02-01 [1] CRAN (R 4.1.2)
#> pkgbuild 1.3.1 2021-12-20 [1] CRAN (R 4.1.2)
#> pkgconfig 2.0.3 2019-09-22 [1] CRAN (R 4.1.1)
#> pkgload 1.2.4 2021-11-30 [1] CRAN (R 4.1.2)
#> prettyunits 1.1.1 2020-01-24 [1] CRAN (R 4.1.1)
#> processx 3.5.2 2021-04-30 [2] CRAN (R 4.1.0)
#> ps 1.6.0 2021-02-28 [2] CRAN (R 4.1.0)
#> purrr 0.3.4 2020-04-17 [1] CRAN (R 4.1.2)
#> R.cache 0.15.0 2021-04-30 [3] CRAN (R 4.1.0)
#> R.methodsS3 1.8.1 2020-08-26 [3] CRAN (R 4.1.0)
#> R.oo 1.24.0 2020-08-26 [3] CRAN (R 4.1.0)
#> R.utils 2.11.0 2021-09-26 [1] CRAN (R 4.1.2)
#> R6 2.5.1 2021-08-19 [1] CRAN (R 4.1.1)
#> remotes 2.4.2 2021-11-30 [1] CRAN (R 4.1.2)
#> reprex 2.0.1 2021-08-05 [1] CRAN (R 4.1.1)
#> rlang 1.0.2 2022-03-04 [1] CRAN (R 4.1.2)
#> rmarkdown 2.14 2022-04-25 [1] CRAN (R 4.1.3)
#> rprojroot 2.0.3 2022-04-02 [1] CRAN (R 4.1.3)
#> rstudioapi 0.13 2020-11-12 [1] CRAN (R 4.1.2)
#> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.1.2)
#> stringi 1.7.6 2021-11-29 [1] CRAN (R 4.1.2)
#> stringr 1.4.0 2019-02-10 [1] CRAN (R 4.1.2)
#> styler 1.7.0 2022-03-13 [1] CRAN (R 4.1.2)
#> testthat 3.1.4 2022-04-26 [1] CRAN (R 4.1.3)
#> tibble 3.1.7 2022-05-03 [1] CRAN (R 4.1.3)
#> usethis 2.1.6 2022-05-25 [1] CRAN (R 4.1.3)
#> utf8 1.2.2 2021-07-24 [1] CRAN (R 4.1.1)
#> vctrs 0.4.1 2022-04-13 [1] CRAN (R 4.1.3)
#> withr 2.5.0 2022-03-03 [1] CRAN (R 4.1.2)
#> xfun 0.31 2022-05-10 [1] CRAN (R 4.1.3)
#> yaml 2.3.5 2022-02-21 [1] CRAN (R 4.1.2)
#>
#> [1] /home/g/R/x86_64-redhat-linux-gnu-library/4.1
#> [2] /usr/lib64/R/library
#> [3] /usr/share/R/library
#>
#> ──────────────────────────────────────────────────────────────────────────────
Created on 2022-05-31 by the reprex package (v2.0.1)
There is a chance that R 4.2.0 fixes this, if you need a workaround now.
Thanks! Yes, ultimately even wrapping it in stringr::str_c() fixes it, so there are plenty of workarounds!
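For anyone landing here with the same error, one possible workaround along the same lines can be sketched with base R only: strip the path's class before printing, so that the default print method is used instead of the one fs provides. This is an untested assumption about why str_c() helps, and the variable names `p` and `plain` are illustrative.

```r
library(fs)

x <- "Przemy\u015bl"  # "Przemyśl"
p <- fs::path(x)

# Drop the fs_path class so that printing falls back to base R's default method
plain <- as.character(unclass(p))
print(plain)

# For progress output, a plain character vector can also go through message()
message("Processing: ", plain)
```

The path object `p` itself stays usable for file operations; only the value passed to the console is converted.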