
peek() method

Open jtwb opened this issue 13 years ago • 3 comments

peek() is required for streaming parsers.

In browsers, the live DOM (document.*) must be updated with new output from the HTML parser in these situations (and possibly others):

1. Done reading HTML from a single packet
2. script tag found
3. document.write() call completed

So we need a way to query the parser's DOM state without calling done(), since done() prevents any further parsing.
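
To make the difference concrete, here is a minimal sketch of the idea using a toy parser. ToyStreamingParser, its write() method, and its flat node list are all hypothetical stand-ins invented for illustration; this is not node-htmlparser's actual API or the code in the gists, only the peek()/done() contrast described above.

```javascript
// Toy stand-in for a streaming parser; NOT node-htmlparser's real API.
class ToyStreamingParser {
  constructor() {
    this.nodes = [];       // "DOM" built so far (a flat list of tag/text nodes)
    this.buffer = "";      // unparsed tail of the input
    this.finished = false; // set once done() has been called
  }

  // Feed one chunk and consume every complete token in the buffer.
  write(chunk) {
    if (this.finished) throw new Error("write() after done()");
    this.buffer += chunk;
    let match;
    // A token is a complete <tag>, or a text run that is followed by '<'.
    while ((match = /^(<[^>]*>|[^<]+(?=<))/.exec(this.buffer)) !== null) {
      const token = match[0];
      this.nodes.push(token.startsWith("<")
        ? { type: "tag", raw: token }
        : { type: "text", raw: token });
      this.buffer = this.buffer.slice(token.length);
    }
  }

  // Snapshot of the nodes parsed so far; parsing may continue afterwards.
  peek() {
    return this.nodes.map((node) => ({ ...node }));
  }

  // Flush any trailing text, return the final result, refuse further input.
  done() {
    if (this.buffer.length > 0) {
      this.nodes.push({ type: "text", raw: this.buffer });
      this.buffer = "";
    }
    this.finished = true;
    return this.nodes;
  }
}

const parser = new ToyStreamingParser();
parser.write("<p>Hello, ");
const partial = parser.peek();  // state after the first "packet"
parser.write("world</p>");      // still allowed, because peek() is not done()
const complete = parser.done(); // final state; further write() calls throw
```

In this sketch the text "Hello, " is held back in the buffer until the next chunk shows where the text run ends, so the snapshot taken by peek() only contains tokens the parser is sure about.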

This gist shows how peek() would be used in a simple case: https://gist.github.com/849639/ec57b97213acb92b8a20e377d06cf1cffaf01e99

This is a more complex use case where we sync the DOM whenever a script tag appears: https://gist.github.com/849639/dd05bcccefe82c0cc01d10c0ec54ce3f31bda4b8

jtwb, Mar 01 '11 19:03

Thanks for the contribution! Looking through it now and will get back to you with questions or just accept the pull.

tautologistics, Mar 04 '11 02:03

Sweet! By the way, one thing is a little unclear from those gist examples: calling peek() triggers the handler callback function with the partial DOM, so receiving that callback no longer implies that parsing is totally finished.
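
A rough sketch of what that means for calling code, under stated assumptions: CallbackParser and its methods are hypothetical names made up for this example, not the real handler interface, and the parsingFinished flag is one possible way for the caller to tell a partial result from the final one.

```javascript
// Hypothetical callback-style parser; not node-htmlparser's real API.
class CallbackParser {
  constructor(callback) {
    this.callback = callback; // invoked as callback(error, dom)
    this.dom = [];
  }

  // Append one already-tokenized node (kept trivial for the sketch).
  write(node) {
    this.dom.push(node);
  }

  // Reports the partial DOM; more write() calls may follow.
  peek() {
    this.callback(null, this.dom.slice());
  }

  // Reports the final DOM.
  done() {
    this.callback(null, this.dom.slice());
  }
}

const seen = [];
let parsingFinished = false; // maintained by the caller, not by the parser

const cbParser = new CallbackParser((error, dom) => {
  // The same callback runs for peek() and done(), so check the flag.
  seen.push({ size: dom.length, final: parsingFinished });
});

cbParser.write("<html>");
cbParser.peek();         // callback fires with the partial DOM
cbParser.write("<body>");
parsingFinished = true;  // set just before the last call
cbParser.done();         // callback fires with the final DOM
```

Because the callback here is synchronous, setting the flag immediately before done() is enough; an asynchronous callback would need a different signal.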

jtwb, Mar 04 '11 05:03

FYI - Working on 1.8 now and working through pull requests

tautologistics, Mar 07 '11 12:03