Open-XML-SDK icon indicating copy to clipboard operation
Open-XML-SDK copied to clipboard

Feature request: SAX Approach addition of ReadLastChild() function for OpenXmlReader

Open tauheedul opened this issue 3 years ago • 8 comments

Suggestion There are three SAX options to read from a spreadsheet document e.g.

  • Read()
  • ReadFirstChild()
  • ReadNextSibling()

Please can this be extended to include ReadLastChild() which will move the SAX reader straight to the Last Child Element of the node?

Current situation When using the DOM approach we can easily sort by Descending or take the LastChild from the element collection.

With the SAX approach, the reader must iterate through all preceeding Cells or Rows using reader.ReadNextSibling and checking if reader.IsEndElement before the last element can be loaded using the reader.LoadCurrentElement();

It would be faster if when ReadLastChild() is used the OpenXmlReader's position is moved immediately to the last child Element of the node whether it is a Row or Cell type.

The SAX reader is used for large Excel document's and this will reduce processing speed in these instances.

tauheedul avatar Aug 10 '22 13:08 tauheedul

@tauheedul Thanks for the suggestion, I've marked this as an enhancement request.

tomjebo avatar Aug 10 '22 18:08 tomjebo

Thank you!

tauheedul avatar Aug 11 '22 10:08 tauheedul

@tauheedul if you have the time, PRs are very welcome :) Please let us know before you approach it and what your general idea is

twsouthwick avatar Aug 31 '22 02:08 twsouthwick

Hello @twsouthwick , I am rather new to the inner workings of the SDK and mostly use it to consume features so I am unable to submit a PR for this.

From reviewing the methods, I had the following in mind...

OpenXmlReader.cs

  • Addition of abstract method ReadLastChild similar to the existing ReadFirstChild method

Then implementation in the following... OpenXmlPartReader.cs OpenXmlDomReader.cs

ReadLastChild() calls upon MoveToLastChild() a bit like the existing MoveToFirstChild block.

This way the reader does not need to parse elements before the last child node.

image

I hope that helps.

tauheedul avatar Sep 04 '22 17:09 tauheedul

So is the only method that is needed is ReadLastChild()? That should be fine.

Addition of abstract method ReadLastChild similar to the existing ReadFirstChild method

This is a breaking change and may be considered (we'll probably be releasing a v3 soon. However, could this be done as a virtual method? Then it will be fine.

twsouthwick avatar Sep 06 '22 18:09 twsouthwick

I don't have the time right now to do this kind of request, but happy to take contributions from anyone interested in doing so.

twsouthwick avatar Sep 06 '22 18:09 twsouthwick

@twsouthwick Hi, yes it is only the addition of one method I am requesting and would appreciate if this would be added in a future release. Thank you.

tauheedul avatar Sep 07 '22 08:09 tauheedul

Seems like a few issues have been raised around this. I still don't have time to address this, but if anyone is interested in doing so, I'm happy to review design/PRs.

This would be an ideal time to add such behavior before v3.0 since it appears that we may want an abstract method to OpenXmlReader to be added.

twsouthwick avatar Feb 03 '23 17:02 twsouthwick