node-htmlparser
peek() method
peek() is required for streaming parsers.
In browsers, the live DOM (document.*) must be updated with new output from the HTML parser in these situations (and possibly others):
1. Done reading HTML from a single packet
2. script tag found
3. document.write() call completed
So we need a way to query the parser's DOM state without calling done(), because done() prevents any further parsing.
This gist shows how peek() would be used in a simple case: https://gist.github.com/849639/ec57b97213acb92b8a20e377d06cf1cffaf01e99
This is a more complex use case where we sync the DOM whenever a script tag appears: https://gist.github.com/849639/dd05bcccefe82c0cc01d10c0ec54ce3f31bda4b8
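For readers who cannot open the gists, here is a minimal, self-contained sketch of the intended semantics. It does not use node-htmlparser at all: ToyStreamingParser, its write() method, and the whitespace-token "DOM" are hypothetical stand-ins invented for illustration. The only behaviour it is meant to show is that peek() hands the callback a snapshot of what has been parsed so far and leaves the parser usable, while done() ends parsing.

```javascript
// Toy stand-in for a streaming parser. This is NOT the node-htmlparser API;
// it "parses" whitespace-separated tokens instead of HTML.
class ToyStreamingParser {
  constructor(callback) {
    this.callback = callback; // invoked as callback(error, dom)
    this.buffer = "";         // trailing input that may be an incomplete token
    this.dom = [];            // tokens parsed so far (our pretend DOM)
    this.finished = false;
  }

  // Feed one chunk of input, e.g. the contents of a single packet.
  write(chunk) {
    if (this.finished) throw new Error("write() called after done()");
    this.buffer += chunk;
    const parts = this.buffer.split(/\s+/);
    this.buffer = parts.pop(); // the last piece may continue in the next chunk
    for (const part of parts) {
      if (part) this.dom.push(part);
    }
  }

  // Report a snapshot of the current state; the parser stays usable.
  peek() {
    this.callback(null, this.dom.slice());
  }

  // Flush any buffered input, report the final state, and stop parsing.
  done() {
    if (this.buffer) this.dom.push(this.buffer);
    this.buffer = "";
    this.finished = true;
    this.callback(null, this.dom.slice());
  }
}

const snapshots = [];
const parser = new ToyStreamingParser((err, dom) => snapshots.push(dom));

parser.write("html he");  // first packet ends mid-token
parser.peek();            // partial state: only complete tokens are visible
parser.write("ad body");  // parsing continues after peek()
parser.peek();
parser.done();            // final state; further write() calls now throw
```

Note that the callback runs three times here, once per peek() and once for done(), so a caller that needs to tell a partial result from the final one has to track that itself.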
Thanks for the contribution! Looking through it now and will get back to you with questions or just accept the pull.
Sweet! By the way, one thing is a little unclear from those gist examples: calling peek() triggers the handler callback function with the partial DOM, so receiving that callback no longer implies that parsing is completely finished.
FYI - Working on 1.8 now and working through pull requests