react-papaparse

How to detect the encoding of the loaded file before displaying data ?

Open • dimitri-hoareau-WEL opened this issue 3 years ago • 6 comments

Hi !

I had a problem displaying special characters with CSVReader because some CSV files were encoded as ISO-8859-1 instead of UTF-8 (e.g. Pyr�n�es-Atlantiques).

By adding:

<CSVReader
  config={{
    encoding: "ISO-8859-1",
  }}
>
 ...
</CSVReader>

to my code, it works: the browser can now read special characters without problems. But the new problem is that when a file encoded in UTF-8 is loaded, its characters are not displayed properly (e.g. PyrÃ©nÃ©es-Atlantiques).
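To illustrate both symptoms, here is a small sketch that uses only the standard TextEncoder and TextDecoder APIs (nothing from react-papaparse). The variable names are made up for the example, and the byte arrays stand in for the contents of two client files.

```javascript
// The same text stored in each encoding. \u00E9 is "é".
const text = "Pyr\u00E9n\u00E9es-Atlantiques";
const utf8Bytes = new TextEncoder().encode(text); // "é" becomes 0xC3 0xA9
const latin1Bytes = Uint8Array.from(text, (ch) => ch.charCodeAt(0)); // "é" becomes 0xE9

// ISO-8859-1 bytes decoded as UTF-8: each lone 0xE9 is invalid and is
// replaced by U+FFFD, the "�" character.
const latin1ReadAsUtf8 = new TextDecoder("utf-8").decode(latin1Bytes);

// UTF-8 bytes decoded as ISO-8859-1: each 2-byte "é" shows up as two characters.
const utf8ReadAsLatin1 = new TextDecoder("iso-8859-1").decode(utf8Bytes);

console.log(latin1ReadAsUtf8); // "Pyr�n�es-Atlantiques"
console.log(utf8ReadAsLatin1); // "PyrÃ©nÃ©es-Atlantiques"
```

Neither decoding raises an error by default, which is why the wrong encoding only becomes visible once the data is displayed.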

My problem is that I am working with clients who do not use the same encoding for their CSV files. Some clients use UTF-8, others use ISO-8859-1, and I cannot know in advance which encoding a given file will use.

Here is my code :

const handleOnFileLoad = (data) => {
  // A U+FFFD replacement character in any cell means the file
  // could not be decoded cleanly with the current encoding.
  const changeEncoding = data.some((row) =>
    row.data.some((cell) => typeof cell === "string" && cell.includes("\uFFFD"))
  );

  if (changeEncoding) {
    alert("Some characters of your file will not display properly. Please load your file again.");
    dispatch(setEncodingForExport("ISO-8859-1"));
  } else {
    dispatch(setEncodingForExport("UTF-8"));
  }

  // Skip the header row and keep only the cell values of each row.
  const enrollFieldArray = {
    data: data.slice(1).map((row) => row.data),
  };

  dispatch(getDataFromUploadedCsv(enrollFieldArray));
  displayTable();
};

<CSVReader
  onFileLoad={handleOnFileLoad}
  onError={handleOnError}
  ref={buttonRef}
  noClick
  noDrag
  config={encoding}
>
  ...
</CSVReader>

I use Redux, and the "encoding" variable is in the state with this default value:

encoding: {encoding: "UTF-8"}

With this solution, the client must load the file a first time to update the state with the correct value of "encoding", and load the file a second time to display data with the correct encoding.

Is there a native CSVReader method that detects the encoding of the loaded file before the data is displayed?

Thank you very much for your help

Dimitri

dimitri-hoareau-WEL • Apr 19 '21 13:04

@dimitri-hoareau-WEL Would you like to check whether the file's data is UTF-8 or ISO-8859-1 before upload?

Bunlong • Apr 20 '21 16:04

Yes! I wanted to know if it's possible to do that.
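For reference, a check like this can be sketched outside of CSVReader with the standard TextDecoder API. This is a minimal sketch under assumptions, not a react-papaparse feature: the helper name detectEncoding is hypothetical, and it assumes the only two candidates are UTF-8 and ISO-8859-1, as in this issue. It decodes the raw bytes strictly as UTF-8 and falls back to ISO-8859-1 when that fails.

```javascript
// Hypothetical helper (not part of react-papaparse): guess whether raw file
// bytes are UTF-8 or ISO-8859-1. Assumes those are the only two candidates.
function detectEncoding(bytes) {
  try {
    // With fatal: true, decode() throws a TypeError on any invalid UTF-8
    // sequence instead of silently inserting U+FFFD.
    new TextDecoder("utf-8", { fatal: true }).decode(bytes);
    return "UTF-8";
  } catch (error) {
    return "ISO-8859-1";
  }
}

// In-memory bytes standing in for two client files. \u00E9 is "é".
const sampleText = "Pyr\u00E9n\u00E9es-Atlantiques";
const utf8File = new TextEncoder().encode(sampleText);
const latin1File = Uint8Array.from(sampleText, (ch) => ch.charCodeAt(0));

console.log(detectEncoding(utf8File));   // "UTF-8"
console.log(detectEncoding(latin1File)); // "ISO-8859-1"
```

In the browser, the bytes could come from new Uint8Array(await file.arrayBuffer()) on the selected File, and the result could be stored in the "encoding" state before the file is parsed, so a single load would be enough. How to run this before CSVReader reads the file depends on the library and is not shown here. A file containing only ASCII is valid in both encodings and is reported as UTF-8, which is harmless. In rare cases ISO-8859-1 text can also happen to form valid UTF-8, so this is a heuristic rather than a guarantee.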

dimitri-hoareau-WEL • Apr 23 '21 15:04

Hi!

I am having the same issue, is this being worked on?

luuddan • Jun 16 '21 07:06

I may send a pull request in the coming weeks!

exaucae • Sep 07 '21 01:09

@exaucae would you mind sharing any update? I'm having this issue as well

andirkh • Oct 13 '21 10:10

@andirkh, thanks for the bump. The pull request is #100. Feedback welcome!

exaucae • Oct 14 '21 19:10