go proposal: encoding/xml: add ability to create decoder with start element already on stack

In a project that includes handlers similar to http.Handler except reading from XML, not the byte stream directly, I frequently find myself doing something like the following (where "xmlstream" is a package that contains the functionality of the io package except dealing with XML tokens instead of bytes):

// HandleXML implements XMLHandler. The token reader is the original stream
// but limited to the remainder of the element that triggered this handler, and start is the
// token that triggered the handler.
func (something) HandleXML(r xml.TokenReader, start *xml.StartElement) error { 
 	d := xml.NewTokenDecoder(xmlstream.MultiReader(
           xmlstream.Token(*start),
           t,
        ))
        // Pop the start element through the decoder, otherwise the stack is wrong
        // and when we read the end element it will throw an error about there being
        // no matching start element.
        d.Token()

        var foo someType
        d.DecodeElement(&foo, start)

In this example we could change our handler types to pass in a decoder directly, but this isn't desirable for several reasons:

the Handler type could be defined in another package that we don't control
the actual type of r is not actually xml.TokenReader but something that includes various other methods and implements xml.TokenReader, passing in a Decoder would make us have to provide two arguments, the decoder wrapping r and r itself which would be odd.
r is actually a pipeline of various wrapped XML readers that perform transformations and checks on a document. We don't want the entire document read through a decoder with its associated overhead and validity checks as we know the document is valid already which is why we create one (for the users convenience, not for checking validity) only in the handler and only if the user actually needs it.

Instead I'd like to have an equivalent of the Decode/DecodeElement pattern but for creating a new decoder, something like:

// NewTokenDecoderElement is like NewTokenDecoder except that it creates a
// decoder with start already on the stack as the outermost tag.
//
// This is to accommodate the common pattern where a token is popped from a
// reader and if it is a StartElement it is decoded and otherwise some other
// action is performed without requiring that the entire stream be subject to
// the overhead of a decoder.
func NewTokenDecoderElement(t TokenReader, start *StartElement) *Decoder

This would allow us to pop from an xml.TokenReader and only create a decoder when the start element is encountered, then decode into a struct (consuming the end element) without error. The initial example then becomes:

func (something) HandleXML(r xml.TokenReader, start *xml.StartElement) error { 
 	d := xml.NewTokenDecoderElement(r, start)

        var foo someType
        d.DecodeElement(&foo, start)

EDIT: I should mention that while this can be done in an external package using the workaround from the initial example, doing it directly in encoding/xml has the benefit of not doing namespace transformations on the start element (we can just call pushElement directly). This reduces overhead and possibly other subtle issues due to double decoding and the way XML namespaces are handled.

/cc @rsc

Nov 03 '21 12:11 SamWhited

Ping

Jan 05 '22 23:01 SamWhited

We haven't gotten this out of incoming yet, sorry. It won't be for Go 1.18.

Jan 12 '22 18:01 rsc

This proposal has been added to the active column of the proposals project and will now be reviewed at the weekly proposal review meetings. — rsc for the proposal review group

Jan 12 '22 19:01 rsc

I'm confused about

func (something) HandleXML(r xml.TokenReader, start *xml.StartElement) error { 
 	d := xml.NewTokenDecoderElement(r, start)

        var foo someType
        d.DecodeElement(&foo, start)

Why does start appear twice?

What is wrong with the current API which allows

d := xml.NewTokenDecoder(r)
d.DecodeElement(&foo, start)

?

Jan 19 '22 21:01 rsc

This contrived example shows the error:

https://go.dev/play/p/P2n9VLzEQX6

Because the start token has already been popped from the stream (to check what the name is to trigger the appropriate handler or callback) it is never read through the decoder. If you call decode there is no start element on the stack and the XML appears invalid. There is a workaround as detailed in the initial post, but this requires yet another wrapper and looks rather ugly.

Jan 20 '22 02:01 SamWhited

I still need to take time to understand this better.

Feb 09 '22 18:02 rsc

Instead of xml.NewTokenDecoderElement, what about adding a Push method to Decoder, like:

package xml

// Push pushes the token onto the input stack, so that the next call to RawToken, Token, or Decode
// will use this token as the first element of the input.
func (d *Decoder) Push(Token)

Then NewTokenDecoderElement = NewTokenDecoder+Push, but Push could be used in other contexts as well.

Thoughts?

Aug 17 '22 16:08 rsc

That seems better to me at first glance since it's a lot more flexible. I'm assuming any XML errors would be deferred until the token was read back out? Even if XML level errors are deferred, is an error or ok bool on Push necessary in case the type pushed is not one of the token types?

I'll go stick that in some of the code where I would have used NewTokenDecoderElement and see what it looks like.

Aug 18 '22 10:08 SamWhited

Actually, looking again this isn't the same thing. The next call to Token would return the start token, but in the initial examples the start token has already been consumed, its namespaces are just being added as the default (if necessary) and a matching end element will be necessary. Maybe Push should operate that way so that we don't end up with a double-decoding issue if one of the underlying readers somewhere in the pipeline is also a decoder.

Aug 18 '22 10:08 SamWhited

Actually, looking again this isn't the same thing. The next call to Token would return the start token, but in the initial examples the start token has already been consumed, its namespaces are just being added as the default (if necessary) and a matching end element will be necessary.

Apologies for the delay in replying.

I don't understand this comment. It sounds like you are saying that Push cannot be used to "un-read" a token from a decoder. I agree with that. But I thought we were talking about prepping a new decoder that hasn't been used at all yet. In that case, it seems like

d := xml.NewTokenDecoder(r)
d.Push(start)
d.DecodeElement(&foo)

should work. It's true that it will never make sense to push an element that was just read from the TokenDecoder. But you could imagine other manipulations that might make sense, such as peeking at the next element from r to see if it's something you want to modify and then pushing that directly.

Oct 26 '22 16:10 rsc

I'm having a hard time expressing this in an example as it either becomes very long or doesn't look like it makes sense. The question is, with your proposal, which of the following would be returned (ignoring for a moment that it's silly to pop a token just to push it immediately):

r := strings.NewReader(`<one xmlns="foo"><two/></one>`)
d := xml.NewDecoder(r)
one, _ := d.Token()
d.Push(one.(xml.StartElement))

// Does this result in "one" being returned, or "two"?
d.Token()

Oct 26 '22 16:10 SamWhited

d.Token would return one, but that's not how you'd use d.Push. Instead you would make a new decoder, as in your original comment above, and seed it with "one".

Oct 26 '22 17:10 rsc

Right, so in an example like the following:

r := strings.NewReader(`<one xmlns="foo"><two/></one>`)
d := xml.NewDecoder(r)
tok, _ := d.Token()
start := tok.(xml.StartElement)

switch start.Name.Space {
case "foo":
  FooHandler(Middleware(d), start)
case "bar":
  BarHandler(d, start)
}

If we wanted to use push inside FooHandler, we already know what start is, so we have to pop it even though we've already popped it once and then pushed it back onto the stack. eg.

func FooHandler(r xml.TokenReader, start xml.StartElement) {
  d := xml.NewTokenDecoder(r)
  d.Push(start)
  d.Token() // Discard, we already know what this is and don't need it again.
  d.DecodeElement(&whatever, start)
}

Instead, it would be nice if we could avoid the unnecessary d.Token since we know what we're pushing already, there's no need to pop it again.

Oct 26 '22 20:10 SamWhited

Paging this back in again. I'm OK with having d.Push and d.Token and having to use both in that example. Because you might just as often use

d := xml.NewTokenDecoder(r)
d.Push(start)
d.Decode(&whatever)

Of course as was noted in the initial post, a custom xml.TokenReader can do the same already, something like

d := xml.NewTokenDecoder(xmlcat(start, r))
d.Decode(&whatever)

I'm not sure there's really a huge amount of difference between those, and since third-party implementations of xmlcat already exist (including your xmlstream.MultiReader), I think maybe we should leave things as they are.

I thought for a bit about Push and I can't decide which semantics of Push you'd really want. That is, if you want to push <A><B><C> into a stream that continues data</C></B></A>, there are reasonably arguments for writing Push(A); Push(B); Push(C) and reasonable arguments for writing Push(C), Push(B), Push(A), depending eventually on whether you think of the token sequence being manipulated by Push as a stack attached to the input or a queue checked before the main input. Using a custom xmlstream.MultiReader lets those APIs define that detail in whatever way is natural instead of us having to decide it.

Apr 19 '23 16:04 rsc

Based on the discussion above, this proposal seems like a likely decline. — rsc for the proposal review group

May 03 '23 20:05 rsc

No change in consensus, so declined. — rsc for the proposal review group

May 10 '23 21:05 rsc