node-unfluff
Fixed side effect from invocation of cleaner in unfluff.lazy
I was sure I had checked this for #16, but it seems I missed it. cleaner mutates the original doc object, so doc needs to be recalculated afterwards. As it stands, once cleaner has been applied we suffer from a side effect. Consider the following example:
[fs, unfluff] = ['fs', 'unfluff'].map require
html = fs.readFileSync('test_tags_kexp.html', 'utf8')
doc1 = unfluff.lazy html
doc2 = unfluff.lazy html
console.log 'tags1: ', doc1.tags() # ['Dennis Morton', 'film', 'kusp film review', 'Stand Up Guys']
console.log 'text1: ', doc1.text()
console.log 'text2: ', doc2.text()
console.log 'tags2: ', doc2.tags() # [ ]
Running this code against the test_tags_kexp.html fixture gives different results for tags() depending on call order, since cleaner is called inside text(). So whenever cleaner is called we need to reload the document. I also did some refactoring along the way.
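To make the failure mode concrete, here is a minimal, self-contained sketch in TypeScript. It does not use unfluff itself: Doc, parse, cleaner, and the reloadAfterClean flag are hypothetical stand-ins, and the "|", "," and ";" separators are an invented toy input format. It only illustrates the general pattern of a lazily cached document being mutated by a cleaning step, and of dropping the cache afterwards so the next accessor parses again.

```typescript
// Toy stand-in for a parsed document (the real library works on a DOM-like object).
interface Doc {
  tags: string[];
  paragraphs: string[];
}

// Hypothetical parser: builds a fresh Doc from the raw input on every call.
// Toy format: "tag1,tag2|paragraph one;paragraph two".
function parse(html: string): Doc {
  const [tagPart = "", textPart = ""] = html.split("|");
  return {
    tags: tagPart.split(",").filter((t) => t.length > 0),
    paragraphs: textPart.split(";").filter((p) => p.length > 0),
  };
}

// Hypothetical cleaner that mutates its argument in place,
// removing the data that tags() depends on.
function cleaner(doc: Doc): Doc {
  doc.tags.length = 0;
  return doc;
}

interface Lazy {
  tags(): string[];
  text(): string;
}

// Lazy extractor with a cached doc. When reloadAfterClean is true,
// the cache is dropped after cleaning so later accessors re-parse.
function lazy(html: string, reloadAfterClean: boolean): Lazy {
  let doc: Doc | undefined;
  let cleaned: Doc | undefined;

  const getDoc = (): Doc => {
    if (doc === undefined) {
      doc = parse(html);
    }
    return doc;
  };

  const getCleaned = (): Doc => {
    if (cleaned === undefined) {
      cleaned = cleaner(getDoc()); // mutates the cached doc
      if (reloadAfterClean) {
        doc = undefined; // force a fresh parse on next access
      }
    }
    return cleaned;
  };

  return {
    tags: () => [...getDoc().tags],
    text: () => getCleaned().paragraphs.join("\n"),
  };
}

const input = "film,review|First paragraph;Second paragraph";

const buggy = lazy(input, false);
buggy.text();
console.log(buggy.tags()); // [] because the cached doc was mutated by cleaner

const fixed = lazy(input, true);
fixed.text();
console.log(fixed.tags()); // [ 'film', 'review' ]
```

The cost of this approach is a second parse whenever a cleaned accessor is followed by an uncleaned one; cleaning a copy of the document instead would avoid that, at the price of the copy.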
Thanks for catching this! I'll take a look in detail when I have some time this weekend.
Sure. If you have ideas on how we can avoid reloading the document, bring them up.
Sorry, I've been lax on reviewing this. Still plan to get to this very soon. Thanks!