EPPlus icon indicating copy to clipboard operation
EPPlus copied to clipboard

Why do I get "Data at the root level is invalid. Line 1, position 1" for this file?

Open nmg196 opened this issue 1 year ago • 5 comments
trafficstars

EPPlus usage

Commercial use (I have a commercial license)

Environment

Windows console app

Epplus version

7.1.0

Spreadsheet application

Exce;

Description

Why do I get "Data at the root level is invalid. Line 1, position 1" for this file?

Notworking.xlsx

The file opens OK in Excel. If I make any change and click Save then EPPlus can then read it OK. But why can't it read the original file? I'm trying to avoid having to open each file and click save as I have hundreds to process.

Unfortunately i have no control of the process which generates the files and I'm unable to spot what's wrong with the unmodified file.

nmg196 avatar Mar 29 '24 14:03 nmg196

Hi @nmg196,

an xlsx file is just a zipped library of xml-files - to view the internal files of any *.xlsx file you can just rename the file from *.xlsx to *zip and open it.

The reason for the "Data at the root level is invalid" error is that the xml files inside this particular xlsx-file are encoded with UNICODE whereas EPPlus uses UTF-8 when reading them.

Since EPPlus assumes that the xml-files are encoded in UTF-8 the result of the file becomes like this:

��<\0?\0x\0m\0l\0 \0v\0e\0r\0s\0i\0o\0n\0=\0\"\01\0.\00\0\"\0?\0>\0\r\0\n\0 \0<\0T\0y\0p\0e\0s\0 \0x\0m\0l\0n\0s\0=\0\"\0h\0t\0t\0p\0:\0/\0/\0s\0c\0h\0e\0m\0a\0s\0.\0o\0p\0e\0n\0x\0m\0l\0f\0o\0r\0m\0a\0t\0s\0.\0o\0r\0g\0/\0p\0a\0c\0k\0a\0g\0e\0...

...and the "Data at the root level..." exception is thrown when EPPlus tries to load the xml-string into an XmlDocument.

The reason that it works when you have resaved the file with Excel is most likely converts the files to UTF-8.

I don't think that we have had this reported before, but we will have a look at it and see if we can handle this internally in EPPlus and provide a fix in a coming version.

With the current version EPPlus will unfortunately not be able to read these files.

swmal avatar Apr 03 '24 14:04 swmal

@nmg196 - May I ask how your xlsx-files were created? By Excel or another tool? If Excel, do you know which version of it?

swmal avatar Apr 04 '24 06:04 swmal

The files are output by a widely used trading platform called MetaTrader 5 (https://www.metatrader5.com/). If this is an edge case, then perhaps I can extract and repack the files if I detect this error. However that said, Excel is able to open them OK which indicates this is a supported file encoding for XML/xlsx files.

nmg196 avatar Apr 05 '24 12:04 nmg196

I did some testing and we can add support for this in EPPlus by checking the BOM of the files and then process the internal files according to the format. I will add this as a feature to implement in one of the coming versions.

swmal avatar Apr 10 '24 07:04 swmal

That would be great :)

nmg196 avatar Apr 15 '24 09:04 nmg196

Fixed in EPPlus 7.3

JanKallman avatar Aug 20 '24 11:08 JanKallman