Open-XML-SDK icon indicating copy to clipboard operation
Open-XML-SDK copied to clipboard

Getting cell values is returning different format - OpenXML

Open kev-ops opened this issue 1 year ago • 13 comments

Describe the bug Getting cell value returns many decimal places.

Screenshots I have this "4.9" value in excel image

When retrieving the cell value in OpenXML. 4.9 was stored as 4.90000000000000000000004. It should return value as it was inputted. image

To Reproduce // Please add a self-contained, minimum viable repro of the issue. // If you require external resources, please provide a gist or GitHub repro // An Xunit style test is preferred, but a console application would work too.

Steps to reproduce the behavior:

  1. Go to '...'
  2. Click on '....'
  3. Scroll down to '....'
  4. See error

Observed behavior A clear and concise description of what you expected to happen.

Expected behavior A clear and concise description of what you expected to happen.

Desktop (please complete the following information):

  • OS: [e.g. Windows]
  • Office version [e.g. 16.0.15427.20178]
  • .NET Target: (e.g. .NET Framework, .NET Core, UWP, Xamarin...)
  • DocumentFormat.OpenXml Version: (e.g. 2.11.0)

Additional context Add any other context about the problem here.

kev-ops avatar Apr 19 '23 14:04 kev-ops

Can you share how you're doing this? Sounds like this is just an artifact of using a floating point number.

twsouthwick avatar Apr 19 '23 19:04 twsouthwick

I am reading the content of excel file(.XLSX), I am having issue in reading cell with a value "4.9" (please refer to the screenshot below, highlighted in yellow) image

In my code it gets "4.90000000xxx4" as the cell value (please refer to screenshot below) image

The cell format is General & has no special formatting (see screenshot) image

I am expecting that what is inputted in excel which in my case it should be "4.9" is what should be retrieve on code level not "4.9000000xxx4"

Checked also the XML version of the excel spreadsheet (see screenshot below) image

I can't apply the try parsing or converting the cell values in code since we are dealing with different cell values.

Having said that, what could be the workaround so we can get the cell value the same way it was inputted on excel.

Currently using OpenXML SDK 2.20.0 OS: Windows 10 .NET Target 6.0

kev-ops avatar Apr 20 '23 13:04 kev-ops

This looks like something excel is adding and the SDK is surfacing that correctly. @tomjebo @mikeebowen please correct me, but I'm assuming this is because you mark it as number and general number must be using a float which are not able to accurately represent numbers. Could the s="2" value be the significant digits?

twsouthwick avatar Apr 20 '23 17:04 twsouthwick

Yes, the number in my case "4.9" which is a general number resulted to accurately represent numbers. which is unwanted.

Could the s="2" value be the significant digits?

Answer: I tried changing the value of AK3 cell on excel from 4.9 to string like "SAMPLE" and is in General format, the value of s remains "2". Refer to screenshot below:

image

So it is not a significant basis. t="s" refers to SharedString value I presume.

Please let me know if there will be some workaround on this issue. Thanks

kev-ops avatar Apr 21 '23 14:04 kev-ops

@tomjebo @mikeebowen thoughts as to what excel does here and if we can replicate it in the sdk?

twsouthwick avatar Apr 24 '23 17:04 twsouthwick

The s attribute is the index of the cell's style.

mikeebowen avatar Apr 24 '23 22:04 mikeebowen

I think the SDK is doing the correct thing here. Excel stores numbers using double-precision, so the value the SDK retrieves is correct. If the values stored by Excel in the underlying xml need to match what is displayed in the UI, users can set the precision as displayed option in Excel. With that option set Excel will store the exact decimal places showing in the UI. Note, this isn't retroactive, so if you set the "precision as displayed" option on a spreadsheet, new numbers will match the decimal places in the UI, but values entered before setting the option will remain double-precision.

mikeebowen avatar Apr 24 '23 23:04 mikeebowen

How does excel know when to show 4.9 or 4.90000000000000000000004?

gartenkralle avatar May 04 '23 18:05 gartenkralle

@gartenkralle By the number formatting.

More:

Excel uses the s attribute in the cell element in the sheet.xml part:

        <row r="7" spans="1:7" x14ac:dyDescent="0.25">
            <c r="A7" s="1">
                <v>3.05</v>
            </c>
        </row>

In this case the following s attribute is "1":

 <c r="A7" s="1">

That corresponds to the second (zero based 1) cellXfs entry in the styles.xml part:

    <cellXfs count="2">
        <xf numFmtId="0" fontId="0" fillId="0" borderId="0" xfId="0"/>
        <xf numFmtId="164" fontId="0" fillId="0" borderId="0" xfId="0" applyNumberFormat="1"/>
    </cellXfs>

which specifies the attribute values numFmtId="164" and applyNumberFormat="1" (meaning to apply the number format "164"). That identifies the format in numFmts:

    <numFmts count="1">
        <numFmt numFmtId="164" formatCode="0.000"/>
    </numFmts>

Therefore displaying the value:

image

tomjebo avatar May 04 '23 19:05 tomjebo

I also have relevant concern in date formatting

Observed behavior Custom Date format and other date format returns inconsistent NumberFormatId

Please refer to the attached file 1.xlsx image

I have two custom date formats mm/dd/yyyy and mm-dd-yyyy. Checking it on XML it has numFmtId "164" and numFmtID "165" respectively. image image

image

I have also this "2.xlsx" attached file which returns a different number format Ids for the same custom date format in "1.xlsx" file image image

Checking in on XML , it has numFmts "165" and "166" for mm/dd/yyyy & mm-dd-yyyy respectively image

I have also this "3.xlsx" attached file which returns a different range of number format ids as well. image image

Checking it on XML , it has numFmtId = 171 & numFmtId = 172 for mm/dd/yyyy & mm-dd-yyyy respectively image

This also happens to other date formats as well not limited to custom date formats. We still don't know what is the basis of what numFmtId is assigned to particular date format.

How can we identify if a cell value in excel has date format (mm-dd-yyyy) or (mm/dd/yyyy) etc. ??

What we are trying to accomplish here is to check whether a cell value has format (mm-dd-yyyy) or (mm/dd/yyyy) . Since NumFmtId assignment is not consistent.

Please see all files for your reference. Thank you! Hope to hear from you soon

1.xlsx 2.xlsx 3.xlsx

kev-ops avatar Dec 15 '23 14:12 kev-ops

In addition, Is it possible to convert a date in c# using these formatCodes? We are struggling converting the date values retrieved from excel in c# using the format codes in the screenshot.

image

kev-ops avatar Dec 18 '23 10:12 kev-ops

Hi @kev-ops,

"We still don't know what is the basis of what numFmtId is assigned to particular date format. / How can we identify if a cell value in excel has date format (mm-dd-yyyy) or (mm/dd/yyyy) etc. ?"

"Is it possible to convert a date in c# using these formatCodes? We are struggling converting the date values retrieved from excel in c# using the format codes in the screenshot."

I hope that helps.

mikeebowen avatar Feb 20 '24 21:02 mikeebowen