
Unicode characters in text cells make Excel barf

Open wbrisett opened this issue 10 years ago • 7 comments

I've got the following code in my script for writing out the Excel spreadsheet:

@workbook.write(@mod_file)

@mod_file is my variable that contains the path and filename to write out.

With small spreadsheets, this works without any issues. However, as the files get a bit larger, Excel reports that the file is corrupted when I open it. When I tell Excel to repair and open the file, it sometimes succeeds and sometimes doesn't. I don't see any existing issues that cover this. Has anybody else seen this before?

wbrisett avatar May 20 '14 12:05 wbrisett

Looking at some text that contains the same output (I'm reading in content from another source), and looking at things in OpenOffice vs. Excel, I'm wondering if there's some odd encoding issue here (UTF-16 for example). I'm going to try enforcing UTF-8 encoding and see what happens.

wbrisett avatar May 20 '14 13:05 wbrisett
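For readers following along, here is a minimal sketch of what "enforcing UTF-8" on incoming text could look like in plain Ruby before the values are handed to the workbook. The helper name `to_utf8` and its `source_encoding` parameter are hypothetical and not part of rubyXL; the sketch assumes the text arrives as Ruby strings whose encoding tag may be wrong or whose bytes may be invalid.

```ruby
# Hypothetical helper (not part of rubyXL): return a valid UTF-8 copy of text.
# source_encoding is optional; pass it only when the real encoding of the
# bytes is known and differs from the string's current encoding tag.
def to_utf8(text, source_encoding = nil)
  str = text.dup
  # Relabel the bytes without changing them when the caller knows better.
  str.force_encoding(source_encoding) if source_encoding
  # Transcode to UTF-8, replacing anything that cannot be converted.
  str = str.encode(Encoding::UTF_8, invalid: :replace, undef: :replace, replace: "?")
  # Transcoding UTF-8 to UTF-8 does not repair invalid bytes, so scrub them too.
  str.scrub("?")
end

utf16_text = "caf\u00e9".encode("UTF-16LE")
puts to_utf8(utf16_text).encoding
```

This only guarantees that the strings are valid UTF-8; it does not by itself say whether Excel will accept the resulting file.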

Can you provide a sample of such a corrupted file?

weshatheleopard avatar May 20 '14 14:05 weshatheleopard

This is going to sound stupid, but how do I attach a file for you?

wbrisett avatar May 21 '14 06:05 wbrisett

Upload it to Dropbox and paste the link?

weshatheleopard avatar May 21 '14 06:05 weshatheleopard

https://www.dropbox.com/s/p8x016wdd6ye5bq/test_results.xlsx

wbrisett avatar May 21 '14 20:05 wbrisett

At this point I have determined that the first error happens because Excel does not like the content of cell D28.

(When I delete rows 29 and onwards, it still complains, but then if I additionally delete cell D28, it works).

Let me try to understand: you are loading some kind of existing document into rubyXL and then re-saving it, right? If so, could you also upload the original document, so I can compare it against this "corrupted" output? Thanks.

weshatheleopard avatar May 21 '14 21:05 weshatheleopard

It looks to me now that Excel seems to barf when Unicode characters are present in the cell value string (<v>...</v>) but accepts them just fine in SharedStrings. Sadly, implementation of the latter is going to happen further down the line.

weshatheleopard avatar May 21 '14 22:05 weshatheleopard
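Given the diagnosis above, one way to narrow down suspect cells without deleting rows by hand is to scan the values for non-ASCII characters. The sketch below is plain Ruby and works on rows given as arrays of values; `column_label` and `non_ascii_cells` are hypothetical helper names, not rubyXL API, and the result is only a list of candidates, not proof that a given cell is what Excel rejects.

```ruby
# Hypothetical helper: zero-based column index to a spreadsheet column
# label (0 -> "A", 25 -> "Z", 26 -> "AA").
def column_label(index)
  label = +""
  n = index + 1
  while n > 0
    n, rem = (n - 1).divmod(26)
    label.prepend((65 + rem).chr)
  end
  label
end

# Hypothetical helper: return references such as "D28" for every string
# value that contains at least one non-ASCII character.
def non_ascii_cells(rows)
  hits = []
  rows.each_with_index do |row, r|
    (row || []).each_with_index do |value, c|
      next unless value.is_a?(String) && !value.ascii_only?
      hits << "#{column_label(c)}#{r + 1}"
    end
  end
  hits
end

rows = [
  ["id", "name", "status", "notes"],
  [1, "test", "pass", "na\u00efve result"]
]
puts non_ascii_cells(rows).inspect
```

With rubyXL, the `rows` input could presumably be built from each row's cells and their `value`, but that wiring is an assumption here and is left out of the sketch.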