sheetjs
sheetjs copied to clipboard
3A-1 return as NaN but all other letters working fine 3B, 3C etc
Hi!
Can someone let me know what's causing the problem? `function ExportToExcel(type, fn, dl) { var elt = document.getElementById("content"); var oID = document.getElementById("txtOpportunityID").innerHTML; let today = new Date().toISOString().slice(0, 10) var wb = XLSX.utils.table_to_book(elt, { sheet: "OPC_" + oID + "_" + today });
return dl ?
XLSX.write(wb, { bookType: type, bookSST: true, type: 'base64' }):
XLSX.writeFile(wb, fn || ("OPC_" + oID + "_" + today + (type || '.xlsx')));
} `
What is the starting point? Are you trying to export an HTML table?
I am the developer of the add-on Mail Merge for Thunderbird. I have been using the Community Edition of SheetJS for the last couple of years very successfully to parse spreadsheets. Many thanks for providing the library in the first place!
However, after the recent upgrade to version 0.18.9 a user reported a weird problem, which I was able to further narrow down to being this already reported issue.
Cells in read CSVs starting with a combination of a number followed by either an "a" or a "p" are now apparently interpreted as a date and shown as "1/0/00". The last working version is 0.18.7. The newer versions 0.18.8, 0.18.9 and 0.18.10 are affected.
I have created this JSFiddle with version 0.18.10, which prints the contents of the cell "A1" in "Sheet1" read from a string "0a": https://jsfiddle.net/hvjwcr8k/
Version 0.18.7 - correct output in the browser console:
Object { t: "s", v: "0a" }
Version 0.18.8, 0.18.9 and 0.18.10 - wrong output in the browser console:
Object { t: "n", v: NaN, w: "1/0/00" }
I have also created this very simple CSV containing multiple cells with a number followed by a letter. You can clearly see, that only the combination of a number followed by either an "a" or a "p" is broken: test.csv
I suspect this particular combination of characters to be mistakenly interpreted as a date ("a" for AM and "p" for PM). Then further processing apprently fails and results in the value "NaN" and the respective date "1/0/00".
Adding the parameter "raw: true" to the function call XLSX.read() also fixes the issue. All the mentioned versions are then working correctly as well.
If you need any further information from my side to fix the issue, please let me know. Thanks.
Thanks for reporting and sharing a detailed test case!
There was a change to make bare times like 12:34 AM interpreted as dates. Chrome does not recognize those as date values so there's a regexp workaround.
https://github.com/SheetJS/sheetjs/blob/master/bits/20_jsutils.js#L162
var FDRE1 = /^(0?\d|1[0-2])(?:|:([0-5]?\d)(?:|(\.\d+)(?:|:([0-5]?\d))|:([0-5]?\d)(|\.\d+)))([ap])m?/;
The error here is missing required whitespace.
Testing in Excel, 3 a -> 3:00 AM but 3a is unchanged. The regexp currently does not check for that whitespace.
Since the intention is to match the entire text, the regexp probably should be
var FDRE1 = /^(0?\d|1[0-2])(?:|:([0-5]?\d)(?:|(\.\d+)(?:|:([0-5]?\d))|:([0-5]?\d)(|\.\d+)))\s+([ap])m?$/;
(we'd accept a PR with that patch)
Thank you for your quick reply!
Unfortunately I am not yet familiar with either Git or GitHub. I only just created a GitHub account and have not learned the basics. Therefore I suggest you go ahead and fix the issue in the way you seem fit and do not wait for me to first learn the basics and then file a pull request.
The proposed change sounds reasonable: The additional check for the whitespace is needed, because Microsoft Excel seems to interpret strings like "1 a", "1 am" or "1:00 am" as a date and does not interpret strings like "1a". And the additional check for a line ending is needed, because Microsoft Excel does not interpret a string like "1 amx" as a date either.
Please forgive my additional question, because it might turn out to be a little bit off-topic:
For my own add-on Mail Merge for Thunderbird I want to be able to read the raw contents of a user-provided CSV without any fancy number or format detection. I am only interested in the raw values of the cells - represented as a string - read from a CSV. In this case would you suggest to use "raw: true" as a parameter for the function call XLSX.read()?
Has using "raw: true" in this case other side-effects I should be aware of? I use XLSX.read() in my add-on (a) to read CSVs and (b) to read either CSVs or spreadsheets like ODS or XLSX. Is it safe to use "raw: true" in both cases? Or has the parameter "raw: true" additional consequences when an ODS or XLSX is read.
In previous versions of the SheetJS library it didn't seem to make a difference - or I didn't stumble across an edge-case. But with the new advanced and more aggressive number and date detection in SheetJS - introduced in version 0.18.8 (?) - it seems like it is intentional for the string "1 a" in a cell to be interpreted as a date format. (Apparently Microsoft Excel does it this way and imports the cell as "1:00 AM"; while LibreOffice imports the cell as "1 a".) Is that correct?
I couldn't find much about the specifics of the parameter "raw: true" in the documentation. (Again, please forgive me if this is too off-topic.)
No worries, and please drop an email to [email protected] if you prefer.
Docs are a work in progress. https://docs.sheetjs.com/docs/api/parse-options mentions raw. For what you are describing, raw: true is recommended.
Plaintext formats like CSV mix content and presentation. 1/2/3 could be a date or it could be the literal string "1/2/3". Excel has its own parsing behavior that SheetJS tries to mirror. raw disables that behavior and every cell will be passed back as a string with the original value.
This does not apply to formats like XLS or XLSX or ODS since they have typed formats -- each cell has a proper data type and there is a clear separation between content and presentation. raw has no effect.
cellDates determines whether presumed date values are stored in the original date code (cellDates: false) or in a proper JS date object (cellDates: true). There is a much longer discussion on dates in https://docs.sheetjs.com/docs/csf/features/dates
Thank you for the clarification and recommendation regarding "raw: true". I will use this in the future. As a result my add-on is then not directly affected by this issue anymore. Nevertheless, I am looking forward for this issue being fixed in SheetJS.
This was fixed in 0.18.11, as verified by changing the script reference in the fiddle.