reactable icon indicating copy to clipboard operation
reactable copied to clipboard

Supporting UTF-8 encoding system in Reactable.downloadDataCSV

Open radovan-miletic opened this issue 2 years ago • 2 comments

Following issue #239
"From version 4.2.0 released in April 2022, R on Windows uses UTF-8 as the native encoding..." (here) This was "a major improvement in encoding support, allowing Windows R users to work with international text and data." (here)

The following runs smoothly with R 4.2.0 and gives the expected results:

x <- data.table::data.table(variable = c('Cláudia', 'João'))
reactable::reactable(x)

Result:
NO-UTF-8

But if your end-users use the "desktop systems older than Windows 10" or "server systems older than Windows Server 2016" (here) and download .csv file, it gives the non-expected results (as desktop systems older than Windows 10 does not support UTF-8 locales):

library(shiny)
library(data.table)

x <- data.table::data.table(variable = c('Cláudia', 'João'))

ui <- fluidPage(
    shiny::downloadButton(outputId = "CSV", label = "Download as CSV", onClick = "Reactable.downloadDataCSV('table', 'table.csv', { sep: ';' })"),
    reactable::reactableOutput("table")
)

server <- function(input, output, session) {
    output$table <- reactable::renderReactable({
        reactable::reactable(x)
    })
}

shinyApp(ui, server)

Result:
UTF-8

Is it possible to overcome this?

radovan-miletic avatar Dec 02 '22 16:12 radovan-miletic

Thanks for these details, now I get what that encoding issue was about. I think the problem here is not as much related to R as it is about the browser. The downloadDataCSV() method always downloads files in UTF-8 encoding in the browser, regardless of what R version or operating system you're on, so you always get a UTF-8 encoded CSV file in the end. And so when you open that in Excel on Windows, Excel assumes it's ISO-8859-1 or Windows-1252 (not sure, but something that's not UTF-8).

From light searching, it's apparently a limitation of the browser APIs that you can only create/download Blob strings as UTF-8, so it seems really hard or impossible to use a different encoding like ISO-8859-1. The '\uFEFF' BOM workaround you used seems to be the most commonly suggested workaround, which gets Excel to auto-detect UTF-8 files as actually UTF-8, but doesn't actually change the file encoding. And some comments report that it doesn't work on macOS, so it's not a perfect workaround either.

This will need more research, but for now, I guess you could continue to use the BOM workaround by implementing your own download function like at https://github.com/glin/reactable/issues/239#issuecomment-1127671824. Reactable.getDataCSV() should make that easier, as you'd only have to implement the downloading part like:

function downloadDataCSV(tableId, filename) {
  const content = Reactable.getDataCSV(tableId)
  // Add BOM for UTF-8 autodetection in Excel on Windows
  const blob = new Blob(["\uFEFF" + content], { type: 'text/csv;charset=UTF-8' })
  const link = document.createElement('a')
  const url = window.URL.createObjectURL(blob)
  link.href = url
  link.download = filename
  link.click()
  window.URL.revokeObjectURL(url)
}

Alternatively, you could have your users import CSV data in Excel or R with a custom-specified UTF-8 encoding. In Excel, there's a Data > From Text button that lets you do this, and frustratingly enough, it actually seems to auto-detect UTF-8 files properly without the BOM workaround. Why Excel doesn't auto-detect when you open a file directly, IDK 😕

Import CSV with UTF-8 encoding in Excel

In R <= 4.1, this would be something like read.csv(file, encoding = "UTF-8").

glin avatar Dec 03 '22 22:12 glin

Well, it seems to be a Windows system problem, neither R's nor {reactable}'s. With your workaround (including custom field separator) the previous example would look like this:

library(shiny)
library(data.table)

x <- data.table::data.table(variable = c('Cláudia', 'João'))

ui <- fluidPage(
    tagList(
        tags$script(
            HTML(
                "
                     function downloadDataCSV(tableId, filename) {
                     const content = Reactable.getDataCSV(tableId, { sep: ';' })
                     // Add BOM for UTF-8 autodetection in Excel on Windows
                     const blob = new Blob(['\uFEFF' + content], { type: 'text/csv; charset = UTF-8' })
                     const link = document.createElement('a')
                     const url = window.URL.createObjectURL(blob)
                     link.href = url
                     link.download = filename
                     link.click()
                     window.URL.revokeObjectURL(url)
                     }
                    "
            )
        ),
        shiny::downloadButton(outputId = "CSV", label = "Download as CSV", onClick = "downloadDataCSV(tableId = 'table', filename = 'table.csv')")
    ),
    reactable::reactableOutput("table")
)

server <- function(input, output, session) {
    output$table <- reactable::renderReactable({
        reactable::reactable(x)
    })
}

shinyApp(ui, server)

Result:
UTF-8

Greg, thank you for your time digging into this!

P.S. Also, it seems that Windows 10 users might need to manually enable UTF-8 encoding (if they have administrative privileges): Control Panel > Clock and Region > Region > Administrative > Change system locale button > Beta:Use Unicode UTF-8 for worldwide language support. WindowsUTF-8

radovan-miletic avatar Dec 04 '22 13:12 radovan-miletic