quick-xml
Expose underlying Cow in Event data
Would it be possible to expose the underlying Cow in event datatypes like BytesStart and BytesEnd? I have to match only specific tags, and have to keep track of the pushed tag stack.
Ideally I would keep a Vec<&'datasource [u8]> as the tag stack, but I am forced to copy since a BytesStart<'datasource> cannot return the underlying Cow, which I would then match to Cow::Borrowed.
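To make the request concrete, here is a minimal sketch of the pattern described above, using only the standard library. It assumes the name is already available as a `Cow<'a, [u8]>`, which is exactly what the event types do not expose today. `borrowed_bytes` and `TagStack` are hypothetical names for illustration, not part of quick-xml.

```rust
use std::borrow::Cow;

/// Returns the slice with the input lifetime `'a` when the data is borrowed,
/// or `None` when the `Cow` owns its bytes.
fn borrowed_bytes<'a>(name: &Cow<'a, [u8]>) -> Option<&'a [u8]> {
    match name {
        Cow::Borrowed(bytes) => Some(*bytes),
        Cow::Owned(_) => None,
    }
}

/// Hypothetical tag stack that stores slices of the input and never copies.
struct TagStack<'a> {
    open: Vec<&'a [u8]>,
}

impl<'a> TagStack<'a> {
    fn new() -> Self {
        TagStack { open: Vec::new() }
    }

    /// Pushes a tag name. Returns `false` (and stores nothing) when the name
    /// is owned, because it could not be kept without copying.
    fn push(&mut self, name: &Cow<'a, [u8]>) -> bool {
        match borrowed_bytes(name) {
            Some(bytes) => {
                self.open.push(bytes);
                true
            }
            None => false,
        }
    }

    /// Pops the innermost tag only if it equals `name`.
    fn pop_if_matches(&mut self, name: &[u8]) -> bool {
        if self.open.last().map_or(false, |top| *top == name) {
            self.open.pop();
            true
        } else {
            false
        }
    }
}
```

The returned slice carries the lifetime `'a` of the input rather than the lifetime of the `&Cow` reference, which is what lets the stack outlive each individual event.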
I would prefer to hide the fact that we store Cow<[u8]> internally and let you think that we store a str. If you are sure that you will be able to store Vec<&'datasource [u8]> (i.e. data borrowed from the input), then you could also store BytesStart / BytesEnd: they will store only offsets in that case (or store only BytesEnd if you don't need attributes; use .to_end() to convert).
Would that solution be acceptable for you? Otherwise, feel free to submit a PR.
It would add a fair amount of complication to my code, so I'd prefer to use the reference with the same lifetime. I'll submit a PR.
Note that all events borrow when they come from Reader. Also, cloning them is cheap: if you clone a borrowed Cow, the underlying Cow stays in the Borrowed state.
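The point about cheap clones can be checked on a plain `Cow<[u8]>` with the standard library alone. This sketch does not touch the event types; `is_borrowed` and `clone_shares_buffer` are hypothetical helper names used only for illustration.

```rust
use std::borrow::Cow;

/// True when the `Cow` still points into the original input.
fn is_borrowed<'a>(data: &Cow<'a, [u8]>) -> bool {
    matches!(data, Cow::Borrowed(_))
}

/// Clones `data` and reports whether the clone points at the same buffer,
/// i.e. whether no bytes were copied.
fn clone_shares_buffer<'a>(data: &Cow<'a, [u8]>) -> bool {
    let copy = data.clone();
    copy.as_ptr() == data.as_ptr()
}
```

For a `Cow::Borrowed` value both helpers return `true`, since the clone only copies the reference. For a non-empty `Cow::Owned` value the clone allocates a new `Vec`, so both return `false`.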
hello, just to chime in and provide another example / motivation ...
i also felt the need to get hold of the ownership of the underlying Cow. wanting to store the names of "start/end elements" in a HashMap/Set for later reference outside my "event reader loop" required me to either clone the underlying byte slice or write my own wrappers around BytesStart/BytesEnd in order to provide a Hash implementation (merely delegating to the underlying byte slice). later, when looking up values in the collection, i faced the situation that while BytesStart/BytesEnd expose the names as &[u8] (not as a str), they do not allow construction from such a type (i had to ...
let name: &[u8] = ...; // coming from some other event
let key = MyWrapper(quick_xml::events::BytesEnd::new(unsafe {
    std::str::from_utf8_unchecked(name)
}));
match map.get(&key) {...}
... which I'd like to avoid of course.
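One way to avoid constructing a key at all, sketched here with the standard library only, is to give the wrapper a `Borrow<[u8]>` impl so that `HashMap::get` accepts a plain `&[u8]`. `TagName` below is a hypothetical stand-in that wraps a `Cow<[u8]>` directly; whether the same can be done around BytesEnd depends on what the event type exposes, so treat this as an illustration of the technique rather than a drop-in fix.

```rust
use std::borrow::{Borrow, Cow};
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Hypothetical stand-in for a wrapper around an element name.
#[derive(Debug, Clone, PartialEq, Eq)]
struct TagName<'a>(Cow<'a, [u8]>);

impl<'a> Hash for TagName<'a> {
    fn hash<H: Hasher>(&self, state: &mut H) {
        // Hash exactly like `[u8]` so the `Borrow` contract holds.
        let bytes: &[u8] = &self.0;
        bytes.hash(state);
    }
}

impl<'a> Borrow<[u8]> for TagName<'a> {
    fn borrow(&self) -> &[u8] {
        &self.0
    }
}

/// Counts how often each name occurs, keeping the names borrowed.
fn count_names<'a>(names: &[&'a [u8]]) -> HashMap<TagName<'a>, usize> {
    let mut counts = HashMap::new();
    for &name in names {
        *counts.entry(TagName(Cow::Borrowed(name))).or_insert(0) += 1;
    }
    counts
}
```

With this in place, a lookup such as `counts.get(name)` works for `name: &[u8]` directly, with no `unsafe` and no temporary key.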
i can imagine somebody else might face similar difficulties if they need the standard Ord implementation over the names, for example. so the trouble is rather the (limited) utility of BytesStart/BytesEnd.
- Could providing these standard, derived impls (e.g. PartialOrd, Ord, Hash) help to solve the original issue? (for my use case that would do it.)
- Is there anything (semantically) speaking against deriving at least Hash for BytesStart/BytesEnd? If not, I'd set up a PR.
the trouble, of course, is that one can never foresee future use cases, so exposing the name as a Cow (e.g. BytesStart#into_bytes/_raw_name(self)) might make sense after all.
btw: many thanks for the amazing work on this library! believe it or not, but quick-xml allows me to do stuff that i can't manage to do with standard parsers in java :)