docs-scraper
TypeError: argument of type 'NoneType' is not iterable
I am on the Meilisearch droplet from Digitalocean.
Fresh installation of docs-scraper via install directions, python 3.8.2.
When attempting to run the first scrape, I receive the following error:
pipenv run ./docs_scraper config.json
Traceback (most recent call last):
  File "./docs_scraper", line 22, in <module>
    run_config(sys.argv[1])
  File "/root/docs-scraper/scraper/src/index.py", line 34, in run_config
    config = ConfigLoader(config)
  File "/root/docs-scraper/scraper/src/config/config_loader.py", line 81, in __init__
    self._parse()
  File "/root/docs-scraper/scraper/src/config/config_loader.py", line 114, in _parse
    self.selectors = SelectorsParser().parse(self.selectors)
  File "/root/docs-scraper/scraper/src/config/selectors_parser.py", line 64, in parse
    if 'lvl0' in config_selectors:
TypeError: argument of type 'NoneType' is not iterable
I have reduced my config JSON to the smallest possible version to rule out any other issues:
{
  "index_uid": "docs",
  "start_urls": ["https://socialtools.io"],
  "strip_chars": " .,;:#"
}
You need to provide some selectors. I am not sure whether the selectors are a true requirement or whether this is a bug; there is some discussion here and here about this.
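If selectors do turn out to be required, the minimal config from the report could be extended along these lines. This is only a sketch: the `lvl0` key appears in the traceback, but the `lvl1` and `text` keys and all of the CSS selector values are assumptions for illustration and have not been checked against the target site or the scraper.

```python
import json

# Minimal config from the original report.
config = {
    "index_uid": "docs",
    "start_urls": ["https://socialtools.io"],
    "strip_chars": " .,;:#",
}

# Hypothetical selectors: only the "lvl0" key is confirmed by the traceback;
# the other keys and every CSS value here are placeholders.
example_selectors = {"lvl0": "h1", "lvl1": "h2", "text": "p"}


def with_selectors(cfg, selectors):
    """Return a copy of cfg with a 'selectors' mapping added."""
    merged = dict(cfg)
    merged["selectors"] = dict(selectors)
    return merged


if __name__ == "__main__":
    # Print the extended config so it can be saved as config.json.
    print(json.dumps(with_selectors(config, example_selectors), indent=2))
```

The output can be saved as config.json and passed to the same `pipenv run ./docs_scraper config.json` command.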
Interesting, the docs list the selector keys as "Optional".
I have not had a chance to test this, but I wonder whether it will work if you use this:
{
  "index_uid": "docs",
  "start_urls": ["https://socialtools.io"],
  "selectors": [],
  "strip_chars": " .,;:#"
}
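For what it's worth, the failing line in the traceback is a plain membership test, which can be reproduced in isolation. The sketch below only mimics that one check (`has_lvl0` and `try_check` are made-up helper names, not part of docs-scraper); it suggests that an empty list or dict would get past line 64, but says nothing about whether the rest of the parser accepts an empty `selectors` value.

```python
def has_lvl0(config_selectors):
    """Mimic the membership test on line 64 of selectors_parser.py."""
    return 'lvl0' in config_selectors


def try_check(value):
    """Run the check and report a TypeError as a string instead of raising."""
    try:
        return has_lvl0(value)
    except TypeError as exc:
        return f"TypeError: {exc}"


if __name__ == "__main__":
    # None is what the parser receives when "selectors" is missing from the config.
    for candidate in (None, [], {}, {"lvl0": "h1"}):
        print(repr(candidate), "->", try_check(candidate))
```

Here `None` triggers the same kind of TypeError as in the report, while `[]` and `{}` simply evaluate to False for the `lvl0` check.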
This issue hasn't received an update in a long time, so I suppose it's no longer relevant.