data_extractor
Combine XPath, CSS selectors, and JSONPath for web data extraction.
Bumps [lxml](https://github.com/lxml/lxml) from 4.8.0 to 4.9.1. Changelog (sourced from lxml's changelog): 4.9.1 (2022-07-01). Bugs fixed: A crash was resolved when using `iterwalk()` (or `canonicalize()`) after parsing certain incorrect input. Note...
#68 shows a bad example of how to remove the superclass's sub-extractors. But what is the right way to do it?
Add [elementpath](https://github.com/sissaschool/elementpath) for XPath 2.0 support. Make it an optional dependency.
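One common way to treat a package as an optional dependency is a guarded import with a fallback. The sketch below is only an illustration of that pattern, not the project's implementation: the helper name `select` is hypothetical, and the fallback to the standard library's limited XPath subset is an assumption made here so the example runs with or without `elementpath` installed.

```python
import xml.etree.ElementTree as ET

try:
    # Optional dependency: only importable if the user installed it.
    import elementpath
except ImportError:
    elementpath = None


def select(root, path):
    """Hypothetical helper: prefer XPath 2.0 when elementpath is available."""
    if elementpath is not None:
        # elementpath.select uses its XPath 2.0 parser by default.
        return elementpath.select(root, path)
    # Assumed fallback: ElementTree's limited XPath subset.
    return root.findall(path)


root = ET.fromstring("<feed><item>a</item><item>b</item></feed>")
titles = [el.text for el in select(root, ".//item")]
print(titles)
```

Expressions that rely on XPath 2.0 features would only work on the `elementpath` branch, so a real implementation would probably raise a clear error instead of silently falling back.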
- `python-json-rw` is not designed for iterable extracting
- `lxml.xpath` is not designed for iterable extracting

But implementing `AbstractSimpleExtractor.iter_extract` and `AbstractComplexExtractor.iter_extract` could reduce memory usage when...
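The idea of pairing an eager `extract` with a lazy `iter_extract` can be sketched with a generator. The class below is a hypothetical stand-in built on the standard library's `xml.etree.ElementTree`, not the library's `AbstractSimpleExtractor`; it only shows the shape such an API might take. Any memory saving depends on the underlying query engine producing results lazily, which the issue notes is not the case for the backends it names.

```python
import xml.etree.ElementTree as ET
from typing import Iterator, List


class SimpleExtractor:
    """Hypothetical stand-in for a simple extractor; not the library's class."""

    def __init__(self, expr: str) -> None:
        self.expr = expr

    def iter_extract(self, element: ET.Element) -> Iterator[str]:
        # Yield results one at a time so callers can stop early.
        for node in element.iterfind(self.expr):
            yield node.text or ""

    def extract(self, element: ET.Element) -> List[str]:
        # The eager variant is just the lazy one, materialised into a list.
        return list(self.iter_extract(element))


items = "".join(f"<li>{i}</li>" for i in range(1000))
root = ET.fromstring(f"<ul>{items}</ul>")

extractor = SimpleExtractor("li")
# Only the first match is consumed; the rest are never turned into strings.
first = next(extractor.iter_extract(root))
```

Defining `extract` in terms of `iter_extract` keeps the two code paths consistent, so subclasses would only need to implement the lazy method.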
Bumps [urllib3](https://github.com/urllib3/urllib3) from 2.0.4 to 2.0.7. Release notes (sourced from urllib3's releases): 2.0.7. Made body stripped from HTTP requests changing the request method to GET after HTTP 303 "See Other"...