html-differ
Proposal: Rework internal html representation and diff logic
I think we should use a posthtml-like (bemjson-like) structure for the internal representation to make the diff calculation cleaner and more flexible.
Current reports are uninformative on big projects. The way we currently compare HTML documents leaks on big files and does not account for quote types in attributes.
That's why we should think about refactoring the core code.
Representation proposal
HtmlNode {
    meta: MetaData, // Various metadata: raw string and other helpful data
    tag: String,
    classList: ClassesCollection, // Set of classes
    attrs: Object<String, HtmlAttr>, // Map of attributes, with per-attribute meta info such as quote type
    //? dataAttrs: Object<String, HtmlAttr>, // Map of data attributes
    content: Array<TextNode | CommentNode | HtmlNode>
}
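To make the shape above concrete, here is a minimal sketch of how such a node could be assembled. The helper name `createHtmlNode` and the `{ name, value, quote }` input format are assumptions made for illustration only; they are not part of html-differ or posthtml.

```javascript
// Hypothetical helper: builds an HtmlNode-shaped object from a tag name and
// a list of raw attributes given as { name, value, quote }.
function createHtmlNode(tag, rawAttrs = [], content = []) {
  const classList = [];
  const attrs = {};

  for (const { name, value, quote = '"' } of rawAttrs) {
    if (name === 'class') {
      // Classes are stored as an ordered list without duplicates.
      for (const cls of value.split(/\s+/).filter(Boolean)) {
        if (!classList.includes(cls)) classList.push(cls);
      }
    } else {
      // Keep the quote type next to the value as attribute meta info.
      attrs[name] = { value, quote };
    }
  }

  return { meta: {}, tag, classList, attrs, content };
}

const node = createHtmlNode('button', [
  { name: 'class', value: 'button2 button2', quote: '"' },
  { name: 'data-bem', value: '{"button2":{}}', quote: "'" },
]);
console.log(JSON.stringify(node, null, 2));
```

Keeping `class` out of `attrs` means class order and duplicates can be handled separately from ordinary attribute comparison.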
Samples proposal
<button class="button2" data-bem='{"button2":{}}'></button>
→
{ meta: { diff: 'subset' }, tag: 'button', classList: ['button2'] }
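One possible reading of `meta.diff: 'subset'` is that the expected node only has to be contained in the actual one. The sketch below assumes exactly that; the function name `isSubsetNode` and the `{ value, quote }` attribute shape are hypothetical.

```javascript
// Returns true when everything listed in `expected` is also present in `actual`.
// Extra classes or attributes on `actual` are ignored (assumed 'subset' semantics).
function isSubsetNode(expected, actual) {
  if (expected.tag !== actual.tag) return false;

  const actualClasses = new Set(actual.classList || []);
  for (const cls of expected.classList || []) {
    if (!actualClasses.has(cls)) return false;
  }

  const actualAttrs = actual.attrs || {};
  for (const [name, attr] of Object.entries(expected.attrs || {})) {
    const present = Object.prototype.hasOwnProperty.call(actualAttrs, name);
    if (!present || actualAttrs[name].value !== attr.value) return false;
  }

  return true;
}

const expected = { meta: { diff: 'subset' }, tag: 'button', classList: ['button2'] };
const actual = {
  meta: {},
  tag: 'button',
  classList: ['button2', 'button2_theme_normal'],
  attrs: { 'data-bem': { value: '{"button2":{}}', quote: "'" } },
};

console.log(isSubsetNode(expected, actual)); // true
console.log(isSubsetNode(actual, expected)); // false
```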
Report proposal
(Something like https://github.com/chaijs/deep-eql mixed with https://github.com/debitoor/chai-subset)
/x/path
- tag: 'button'
+ tag: 'button2'
.or .selector
+ attrs.missed: 'attr'
button.or like.or__that
+ classList[2]: 'missed-class'
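A rough sketch of how report lines in this style could be produced from two nodes. It assumes that `-` marks a value from the first node and `+` a value from the second, and it simplifies attributes to plain strings; `diffNodes` is a hypothetical name, not an existing html-differ function.

```javascript
// Compares two HtmlNode-like objects and returns report lines.
// Assumption: '-' is taken from `a`, '+' is taken from `b`.
function diffNodes(a, b) {
  const lines = [];
  const has = (obj, key) => Object.prototype.hasOwnProperty.call(obj, key);

  if (a.tag !== b.tag) {
    lines.push(`- tag: '${a.tag}'`, `+ tag: '${b.tag}'`);
  }

  const aAttrs = a.attrs || {};
  const bAttrs = b.attrs || {};
  for (const name of Object.keys(aAttrs)) {
    if (!has(bAttrs, name)) {
      lines.push(`- attrs.${name}: '${aAttrs[name]}'`);
    } else if (aAttrs[name] !== bAttrs[name]) {
      lines.push(`- attrs.${name}: '${aAttrs[name]}'`, `+ attrs.${name}: '${bAttrs[name]}'`);
    }
  }
  for (const name of Object.keys(bAttrs)) {
    if (!has(aAttrs, name)) lines.push(`+ attrs.${name}: '${bAttrs[name]}'`);
  }

  const aClasses = a.classList || [];
  const bClasses = b.classList || [];
  bClasses.forEach((cls, i) => {
    if (!aClasses.includes(cls)) lines.push(`+ classList[${i}]: '${cls}'`);
  });
  aClasses.forEach((cls, i) => {
    if (!bClasses.includes(cls)) lines.push(`- classList[${i}]: '${cls}'`);
  });

  return lines;
}

const report = diffNodes(
  { tag: 'button', classList: ['or', 'like'], attrs: {} },
  { tag: 'button2', classList: ['or', 'like', 'missed-class'], attrs: { missed: 'attr' } }
);
console.log(report.join('\n'));
```

Because each line carries a path such as `attrs.missed` or `classList[2]`, the report stays readable even when the compared documents are large.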
Further possible features
- More accurate comparison methods where needed (#144, #146).
- Cleaner diff calculation logic and better testability (#139, #136, #127).
- HtmlNode can gain additional fields to compare, like bemEntities with a set of BemjsonNodes or BemEntityNames.
- etc.
Pay attention to this comment