quick-xml

Parser position tracking

Open Themayu opened this issue 1 year ago • 1 comments

Store an Option<quick_xml::reader::Span> inside each of the quick_xml::events::Bytes* structs. The parser computes this span during reading and passes it to the relevant wrap() function when the value is constructed. The span can be retrieved through the corresponding span() method, which clones and returns it (Range, sadly, does not implement Copy because it is itself an Iterator implementation).
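A minimal sketch of the idea, not the PR's actual code: `TextSketch` is a hypothetical stand-in for one of the Bytes* structs, and `Span` is assumed here to be an alias for `Range<usize>` over byte offsets. It shows the span being handed to a `wrap()`-style constructor and why the accessor has to clone.

```rust
use std::borrow::Cow;
use std::ops::Range;

/// Assumed alias: the description calls the span a `Range`, so a byte range is used.
type Span = Range<usize>;

/// Hypothetical stand-in for one of the `Bytes*` structs.
#[derive(Debug, Clone)]
struct TextSketch<'a> {
    content: Cow<'a, [u8]>,
    span: Option<Span>,
}

impl<'a> TextSketch<'a> {
    /// Mirrors the idea of `wrap()`: the parser passes in the span it computed.
    fn wrap(content: &'a [u8], span: Option<Span>) -> Self {
        Self { content: Cow::Borrowed(content), span }
    }

    /// `Range<usize>` is `Clone` but not `Copy`, so the accessor clones.
    fn span(&self) -> Option<Span> {
        self.span.clone()
    }
}

fn main() {
    let source = b"<a>hello</a>";
    // The text "hello" occupies bytes 3..8 of the source.
    let text = TextSketch::wrap(&source[3..8], Some(3..8));
    assert_eq!(&*text.content, b"hello");

    // The span can be used to slice the original input again.
    let span = text.span().expect("parsed events carry a span");
    assert_eq!(&source[span], b"hello");

    // The event still holds its span after the accessor call.
    assert_eq!(text.span(), Some(3..8));
}
```

Returning a clone keeps the accessor usable from `&self`; the cost is copying two `usize` values.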

~~Additionally, the derived implementations of PartialEq on the structs are replaced with manual implementations of PartialEq. This is to prevent previously true equality comparisons such as bytes == BytesStart::new("...", 4) from suddenly returning false. This does mean that we lose our implementations of StructuralPartialEq, so constants of the relevant types can no longer be used as patterns in match arms. However, since quick_xml exposes no way to construct values of these types at compile time, and since StructuralPartialEq is currently nightly-only and therefore cannot be used in trait bounds in stable Rust, I do not believe this change is sufficiently observable to require a major version bump under semver.~~ Paragraph rendered obsolete by design change.

Other than the above, I do not intend to make any possibly breaking changes to the public API with this contribution.

Changes

  • Each of the Bytes* structs under quick_xml::events receives a new field, span: crate::reader::Span.
    • If created via parsing, this field will store a value of n..m, representing the start and end positions of that event's contents: for tags, these are the positions of the < and > characters; for text, they are the start and end of the text content, taking any optional whitespace trimming into account.
    • If created via the API, this field will store 0..0.
  • Each of the structs implements a new trait, Spanned, which defines methods to retrieve and update the stored span.
  • Altering data for a given event, such as changing its element name or adding attributes, does not alter its span. This is because such edits do not alter the source the event was initially read from.
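The behaviour listed above could look roughly like the following sketch. Everything here is illustrative rather than taken from the PR: `StartSketch`, `from_parser`, `set_name` and the `set_span` method name are hypothetical, `Span` is assumed to be `Range<usize>`, and the span is assumed to be half-open (ending just past the `>`).

```rust
use std::ops::Range;

/// Assumed alias for the span type described in this PR.
type Span = Range<usize>;

/// Hypothetical shape of the `Spanned` trait: one getter, one setter.
trait Spanned {
    fn span(&self) -> Span;
    fn set_span(&mut self, span: Span);
}

/// Hypothetical stand-in for `BytesStart`.
struct StartSketch {
    name: String,
    span: Span,
}

impl StartSketch {
    /// API construction: there is no source to point at, so the span is `0..0`.
    fn new(name: &str) -> Self {
        Self { name: name.to_owned(), span: 0..0 }
    }

    /// Parser construction: the span covers `<` through `>` (assumed half-open).
    fn from_parser(name: &str, span: Span) -> Self {
        Self { name: name.to_owned(), span }
    }

    /// Renaming edits the event, not the source, so the span is left untouched.
    fn set_name(&mut self, name: &str) {
        self.name = name.to_owned();
    }
}

impl Spanned for StartSketch {
    fn span(&self) -> Span {
        self.span.clone()
    }

    fn set_span(&mut self, span: Span) {
        self.span = span;
    }
}

fn main() {
    let source = "<root><item/></root>";

    // "<root>" occupies bytes 0..6 of the source.
    let mut start = StartSketch::from_parser("root", 0..6);
    assert_eq!(&source[start.span()], "<root>");

    // Editing the event does not move the span.
    start.set_name("renamed");
    assert_eq!(start.name, "renamed");
    assert_eq!(start.span(), 0..6);

    // An event built through the API starts at 0..0 and can be updated explicitly.
    let mut built = StartSketch::new("item");
    assert_eq!(built.span(), 0..0);
    built.set_span(6..13);
    assert_eq!(&source[built.span()], "<item/>");
}
```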

Progress

  • [X] Events
    • [X] quick_xml::events::BytesStart
    • [X] quick_xml::events::BytesDecl
      • As far as I can tell, this one can just reuse the span inside BytesStart
    • [X] quick_xml::events::BytesEnd
    • [X] quick_xml::events::BytesText
    • [X] quick_xml::events::BytesCData
    • [X] Method on quick_xml::events::Event to return the underlying span if there is one
  • [ ] Attributes
    • [ ] quick_xml::events::attributes::Attributes
    • [ ] quick_xml::events::attributes::Attribute
    • [ ] quick_xml::events::attributes::Attr
  • [ ] Errors
    • [ ] quick_xml::errors::Error
    • [ ] quick_xml::escape::EscapeError
    • [ ] quick_xml::events::attributes::AttrError
  • [ ] Namespaces (unsure whether this one makes sense)
    • [ ] quick_xml::name::QName
    • [ ] quick_xml::name::LocalName
    • [ ] quick_xml::name::Prefix
    • [ ] quick_xml::name::Namespace
  • [X] Figure out where to add tests for the new functionality
  • [ ] Edit Changelog.md with new functionality

PR changelog

  • 2023/02/16 18:31 GMT: Rebased PR source on an incomplete implementation that will be continued.
  • 2023/02/17 07:16 GMT: Restructured Progress section, added mini-changelog to PR.

Themayu, Feb 11 '23 01:02

Rebased PR source on an incomplete implementation by @Mingun, which I will continue to work on. Updated PR details to reflect the change in design.

Themayu, Feb 16 '23 18:02