unioffice

Support for RichTextRun in Spreadsheet

Open freb opened this issue 4 years ago • 0 comments

Description

I was pulling content from an existing spreadsheet and noticed two cells that have content but return an empty string from cell.GetRawValue() and cell.GetString(). After digging into the raw XML, I noticed that the shared string for one of them looked like:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?>
<sst count="1" uniqueCount="1" xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
	<si>
		<r>
			<t xml:space="preserve">some content and </t>
		</r>
		<r>
			<rPr>
				<sz val="11"/>
				<color rgb="FF000000"/>
				<rFont val="Calibri"/>
				<family val="2"/>
			</rPr>
			<t>example.com</t>
		</r>
		<r>
			<rPr>
				<sz val="11"/>
				<color theme="1"/>
				<rFont val="Calibri"/>
				<family val="2"/>
				<scheme val="minor"/>
			</rPr>
			<t>and more content.</t>
		</r>
	</si>
</sst>

I figured the runs were the issue. I dug through the code, and while RichTextRun exists, it doesn't seem to be used anywhere. I also inspected all attributes on cell.X() (sml.CT_Cell) and couldn't find the runs anywhere. From what I can tell, RichTextRuns are not even being parsed into the CT_Cell.

Expected Behavior

GetFormattedValue() should not be empty when a cell has content displayed in Excel. Ideally GetString() would be updated to return a plaintext version of the content, though according to how GetString is documented, it is currently working as expected.
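To illustrate what a plaintext version could mean here, the following is a minimal sketch that uses only the Go standard library (encoding/xml) and reads the shared strings XML directly. It is not unioffice code: the types sharedStrings, stringItem and run and the function plainTexts are hypothetical names made up for this example. It assumes the plaintext of a shared string is the direct <t> text, if any, followed by the <t> text of each <r> run in order, with formatting (<rPr>) ignored.

```go
package main

import (
	"encoding/xml"
	"fmt"
	"strings"
)

// sharedStrings mirrors only the parts of the <sst> document needed here.
// These types are hypothetical and are not part of unioffice.
type sharedStrings struct {
	Items []stringItem `xml:"si"`
}

// stringItem is one <si> entry: either a plain <t> or a list of <r> runs.
type stringItem struct {
	Text string `xml:"t"` // direct child <t>, used by plain strings
	Runs []run  `xml:"r"` // rich text runs
}

// run is one <r> element; its <rPr> formatting is deliberately ignored.
type run struct {
	Text string `xml:"t"`
}

// plainText concatenates the direct text and the text of every run.
func (si stringItem) plainText() string {
	var b strings.Builder
	b.WriteString(si.Text)
	for _, r := range si.Runs {
		b.WriteString(r.Text)
	}
	return b.String()
}

// plainTexts parses a shared strings document and returns one plaintext
// value per <si> entry, in document order.
func plainTexts(data []byte) ([]string, error) {
	var sst sharedStrings
	if err := xml.Unmarshal(data, &sst); err != nil {
		return nil, err
	}
	out := make([]string, 0, len(sst.Items))
	for _, si := range sst.Items {
		out = append(out, si.plainText())
	}
	return out, nil
}

func main() {
	const sample = `<sst xmlns="http://schemas.openxmlformats.org/spreadsheetml/2006/main">
	<si><r><t xml:space="preserve">some content and </t></r>
	<r><rPr><sz val="11"/></rPr><t>example.com</t></r></si>
</sst>`
	texts, err := plainTexts([]byte(sample))
	if err != nil {
		panic(err)
	}
	for _, s := range texts {
		fmt.Printf("%q\n", s)
	}
}
```

The runs are joined without a separator, so whitespace comes out exactly as stored in each <t> element.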

Actual Behavior

GetFormattedValue() returns an empty string for cells with RichTextRuns. I was also unable to find any method to access the raw RichTextRun content directly through cell.X().

I've attached a spreadsheet with RichTextRun content in A1: wb.xlsx.

freb, Jun 24 '20 19:06