Provide source spans for tokens and DOM nodes

Open kmcallister opened this issue 11 years ago • 4 comments

We can use something like libsyntax's Span and Spanned types to track positions in the input stream.

The tokenizer will remember its current position and the position at certain events, e.g. start tag, start attribute name. The tree builder will call a tree sink method (with an empty default) to annotate the DOM with span information.
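A minimal sketch of that design, under stated assumptions: `Span`, `Spanned`, `SpanSink`, `annotate_span`, `scan_start_tags`, and `feed` are all hypothetical names invented for illustration, not html5ever's API, and the scanner is a toy that only recognizes simple start tags rather than implementing HTML5 tokenization.

```rust
/// Half-open byte range into the input, loosely modelled on libsyntax's `Span`.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub struct Span {
    pub start: usize,
    pub end: usize,
}

/// A value paired with the span it was read from.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct Spanned<T> {
    pub node: T,
    pub span: Span,
}

/// Hypothetical sink trait. The span hook has an empty default body,
/// so a sink that does not care about positions needs no changes.
pub trait SpanSink {
    fn start_tag(&mut self, name: &str);
    fn annotate_span(&mut self, _name: &str, _span: Span) {}
}

/// Toy scanner (NOT an HTML5 tokenizer): finds `<name ...>` start tags and
/// remembers the byte position at which each one began.
pub fn scan_start_tags(input: &str) -> Vec<Spanned<String>> {
    let bytes = input.as_bytes();
    let mut out = Vec::new();
    let mut pos = 0;
    while pos < bytes.len() {
        let is_tag_open = bytes[pos] == b'<'
            && pos + 1 < bytes.len()
            && bytes[pos + 1].is_ascii_alphabetic();
        if !is_tag_open {
            pos += 1;
            continue;
        }
        let start = pos; // position remembered at the "start tag" event
        let mut name_end = pos + 1;
        while name_end < bytes.len() && bytes[name_end].is_ascii_alphanumeric() {
            name_end += 1;
        }
        let mut end = name_end;
        while end < bytes.len() && bytes[end] != b'>' {
            end += 1;
        }
        if end == bytes.len() {
            break; // unterminated tag: stop scanning
        }
        out.push(Spanned {
            node: input[start + 1..name_end].to_string(),
            span: Span { start, end: end + 1 },
        });
        pos = end + 1;
    }
    out
}

/// Stand-in for the tree builder: forwards each tag and its span to the sink.
pub fn feed<S: SpanSink>(input: &str, sink: &mut S) {
    for tag in scan_start_tags(input) {
        sink.start_tag(&tag.node);
        sink.annotate_span(&tag.node, tag.span);
    }
}

/// Example sink that overrides the hook and keeps the spans.
#[derive(Default)]
pub struct RecordingSink {
    pub tags: Vec<String>,
    pub spans: Vec<(String, Span)>,
}

impl SpanSink for RecordingSink {
    fn start_tag(&mut self, name: &str) {
        self.tags.push(name.to_string());
    }
    fn annotate_span(&mut self, name: &str, span: Span) {
        self.spans.push((name.to_string(), span));
    }
}
```

Byte offsets are used here because they slice the input directly; a validator with rustc-style output would also need a line/column mapping on top of them.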

Then we can write a command-line HTML validator with the same output UI as rustc :)

Note that eventually it will be possible for a single document's nodes to come from multiple text sources, e.g. with document.write.

kmcallister, Oct 21 '14 23:10

I am interested in this feature. What is the status of this? Is there some spanning information available or is this yet to be implemented?

My use case is that I am using html5ever in my proc macros to parse templates for a web framework, and I would like to emit errors with spans that point to the relevant parts of the original HTML.

707090, Aug 17 '20 02:08

The tree builder sink is notified when the current line is updated via https://github.com/servo/html5ever/blob/36ee935f6884224d6b692cc2e8be0e4a308b8a6d/html5ever/src/tree_builder/mod.rs#L459.
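As a rough illustration of how a sink could use that notification, here is a self-contained sketch. It assumes the hook is shaped roughly like `set_current_line(line_number: u64)`; the `LineAwareSink` trait below is a local stand-in written for this example, not html5ever's actual `TreeSink`, and `LineRecordingSink` and `line_of` are hypothetical names.

```rust
/// Local stand-in for a tree sink; NOT html5ever's real `TreeSink` trait.
pub trait LineAwareSink {
    /// Called when the current input line changes (empty default).
    fn set_current_line(&mut self, _line_number: u64) {}
    /// Creates an element and returns a handle to it.
    fn create_element(&mut self, name: &str) -> usize;
}

/// Hypothetical sink that tags each created element with the line that
/// was current at the time of creation.
#[derive(Default)]
pub struct LineRecordingSink {
    current_line: u64,
    nodes: Vec<(String, u64)>, // (element name, line number)
}

impl LineAwareSink for LineRecordingSink {
    fn set_current_line(&mut self, line_number: u64) {
        self.current_line = line_number;
    }

    fn create_element(&mut self, name: &str) -> usize {
        self.nodes.push((name.to_string(), self.current_line));
        self.nodes.len() - 1 // handle is the index into `nodes`
    }
}

impl LineRecordingSink {
    /// Line recorded for a handle, if the handle is valid.
    pub fn line_of(&self, handle: usize) -> Option<u64> {
        self.nodes.get(handle).map(|(_, line)| *line)
    }
}
```

This gives line-level granularity only; column or byte-range information would still need the span support requested in this issue.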

jdm, Aug 17 '20 03:08

Are there any plans to add this to rcdom/markup5ever?

noahbald, May 07 '24 22:05