Added support for parsing huge trees
I added support for parsing huge trees (either nested > 256 levels or just big nodes) as discussed in #110.
As suggested it can be disabled when passing huge_tree=False to the Selector. I do version checking to check for lxml 4.2 for support of the huge_tree option there.
A ValueError is raised when parsing a huge document and:
- lxml doesn't support the
HUGE_TREEoption or -
huge_tree=Falseis passed to the Selector.
I think raising an Error here is okay, because the resulting DOM is not what one would expect. However, I'm fine with any changes to handling this.
There is a test case added that tests this new behavior.
Codecov Report
Merging #116 (aa93f88) into master (6451476) will not change coverage. The diff coverage is
n/a.
@@ Coverage Diff @@
## master #116 +/- ##
=========================================
Coverage 100.00% 100.00%
=========================================
Files 4 4
Lines 138 138
Branches 33 33
=========================================
Hits 138 138
Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.
@Langdi Hey, this pull request seems close to being ready, do you plan to continue working on it?
@Gallaecio Yes, I'll finish this up some time next week.
Should be done now. I added the behavior as discussed above. Please notify me if there's something left to do, I'm a first time contributor.
@Langdi Do you think you will have time to fix those two minor comments I left? Also, it would be great if you could rebase your changes to solve the conflicts with recent changes in master.
@Langdi Do you think you will have time to fix those two minor comments I left? Also, it would be great if you could rebase your changes to solve the conflicts with recent changes in
master.
Sorry, I thought because of the "Changes approved" and "Needs one more approval" that my job is done, and someone else just has to review this. I fixed those now.
You assumption was quite correct :slightly_smiling_face: ; we still need someone else to review this, which may take some time.