unioffice
Support for RichTextRun in Spreadsheet
Description
I was pulling content from an existing spreadsheet and noticed two cells that have content but return an empty string from cell.GetRawValue() and cell.GetString(). After digging into the raw XML, I found that the shared string for one of them looked like:
<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<sst count="1" uniqueCount="1" xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
<si>
<r>
<t xml:space="preserve">some content and </t>
</r>
<r>
<rPr>
<sz val="11"/>
<color rgb="FF000000"/>
<rFont val="Calibri"/>
<family val="2"/>
</rPr>
<t>example.com</t>
</r>
<r>
<rPr>
<sz val="11"/>
<color theme="1"/>
<rFont val="Calibri"/>
<family val="2"/>
<scheme val="minor"/>
</rPr>
<t>and more content.</t>
</r>
</si>
</sst>
I figured the runs were the issue. I dug through the code, and while RichTextRun exists, it doesn't seem to be used anywhere. I also inspected all attributes on cell.X() (sml.CT_Cell) and couldn't find the runs anywhere. As far as I can tell, the rich text runs are not even being parsed into the CT_Cell.
Expected Behavior
GetFormattedValue() should not be empty when a cell has content displayed in Excel. Ideally GetString() would be updated to return a plaintext version of the content, though according to how GetString is documented, it is currently working as expected.
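To illustrate what a plaintext version could mean here, below is a minimal sketch that flattens a shared string item by concatenating the text of its runs. It uses only the standard library encoding/xml package, not unioffice's API; the type and function names (sharedStrings, stringItem, run, plainText, parseSharedStrings) are hypothetical and only model the parts of the shared strings XML shown above.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// sharedStrings models only the <si> children of <sst>.
// Hypothetical type for illustration, not part of unioffice.
type sharedStrings struct {
	Items []stringItem `xml:"si"`
}

// stringItem is one <si>: either a single plain <t>,
// or a sequence of rich text runs <r>, each with its own <t>.
type stringItem struct {
	Text *string `xml:"t"`
	Runs []run   `xml:"r"`
}

// run keeps only the text of a rich text run; <rPr> formatting is ignored.
type run struct {
	Text string `xml:"t"`
}

// plainText returns the plain <t> value if present,
// otherwise the concatenation of all run texts.
func plainText(si stringItem) string {
	if si.Text != nil {
		return *si.Text
	}
	var sb strings.Builder
	for _, r := range si.Runs {
		sb.WriteString(r.Text)
	}
	return sb.String()
}

// parseSharedStrings decodes shared strings XML and flattens every item.
func parseSharedStrings(data []byte) ([]string, error) {
	var sst sharedStrings
	if err := xml.Unmarshal(data, &sst); err != nil {
		return nil, err
	}
	out := make([]string, 0, len(sst.Items))
	for _, si := range sst.Items {
		out = append(out, plainText(si))
	}
	return out, nil
}

func main() {
	data := []byte(`<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
<si><r><t xml:space="preserve">some content and </t></r><r><rPr><sz val="11"/></rPr><t>example.com</t></r></si>
</sst>`)
	texts, err := parseSharedStrings(data)
	if err != nil {
		panic(err)
	}
	for i, s := range texts {
		fmt.Printf("%d: %q\n", i, s)
	}
}
```

Run texts are joined without any separator, so the spacing of the result depends entirely on what each <t> element contains.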
Actual Behavior
GetFormattedValue() returns an empty string for cells with rich text runs. I was also unable to find any method for accessing the raw RichTextRun content directly through cell.X().
I've attached a spreadsheet with RichTextRun content in A1: wb.xlsx.