node-html-parser
node-html-parser copied to clipboard
[Doc] Clarify wrapper behaviour of parse()
Here is a failing test case:
it("reproduces error", () => {
const html = `
<html>
<body>
<div>
<p>b</p>
<p>c</p>
<div>
</body>
</html>`
const root = parse(html)
root.querySelector('div').childNodes.unshift(parse('<p>a</p>'));
expect(root.querySelector("p:nth-of-type(2)").rawText).toEqual("b");
/*
Expected: "b"
Received: "c"
28 | root.querySelector('div').childNodes.unshift(parse('<p>a</p>'));
29 |
> 30 | expect(root.querySelector("p:nth-of-type(2)").rawText).toEqual("b");
| ^
31 | });
*/
});
My workaround was to use childNodes[0].set_content(newContent + oldContent).
The reason I wrote this code in the first place is because I couldn't figure out from the documentation how to prepend content. I looked at the tests and noticed some of them were using childNodes[0] and figured, if you can index that array, you should be able to call other functions on it and have it work.
I found another bug using my workaround: it seems to nest another div around everything. I think there are a lot of bugs to find using similar patterns to the above.
const root = parse(html);
root.querySelector('div').childNodes.unshift(parse('<p>a</p>').firstChild);
I don't understand why I need to do that firstChild. To me, it suggests that everything named root in the documentation is not really the root.
Regardless, finding a part of HTML and modifying it (or its neighbors) seems like a common use case. Can the readme be updated to show examples of this use case?
I don't know if you would prefer another issue for this, or for me to change the title of this one. Let me know.
The root node is not really a node, it is a container of all the root nodes.
The node in question is a wrapper node. It is a commonly used pattern for html parsers, however, I think it makes sense to add something to the readme which explains the behaviour of parse.
If you want to change the issue title, I'll reopen if @taoqf is amenable to my doing a PR that clarifies that bit in the readme.