TypeError: argument of type 'NoneType' is not iterable

Open GarlandM opened this issue 3 years ago • 3 comments

I am on the Meilisearch droplet from Digitalocean.

Fresh installation of docs-scraper following the install directions, Python 3.8.2.

When attempting to run the first scrape, I receive the following error:

pipenv run ./docs_scraper config.json
Traceback (most recent call last):
  File "./docs_scraper", line 22, in <module>
    run_config(sys.argv[1])
  File "/root/docs-scraper/scraper/src/index.py", line 34, in run_config
    config = ConfigLoader(config)
  File "/root/docs-scraper/scraper/src/config/config_loader.py", line 81, in __init__
    self._parse()
  File "/root/docs-scraper/scraper/src/config/config_loader.py", line 114, in _parse
    self.selectors = SelectorsParser().parse(self.selectors)
  File "/root/docs-scraper/scraper/src/config/selectors_parser.py", line 64, in parse
    if 'lvl0' in config_selectors:
TypeError: argument of type 'NoneType' is not iterable

I have reduced my config JSON to the smallest possible file to rule out any other issues:

{
  "index_uid": "docs",
  "start_urls": ["https://socialtools.io"],
  "strip_chars": " .,;:#"
}
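For anyone trying to reproduce this outside the scraper, here is a minimal sketch of what the traceback suggests is happening. It assumes that a missing "selectors" key ends up as None by the time the parser sees it; that is inferred from the traceback only, not from reading the docs-scraper source. The has_lvl0 helper is a hypothetical stand-in for the check at selectors_parser.py line 64.

```python
import json

# The reduced config from this report: note there is no "selectors" key.
raw_config = (
    '{"index_uid": "docs", '
    '"start_urls": ["https://socialtools.io"], '
    '"strip_chars": " .,;:#"}'
)
config = json.loads(raw_config)

# Assumption: an absent key is passed on to the parser as None.
config_selectors = config.get("selectors")


def has_lvl0(selectors):
    """Hypothetical stand-in for the membership test shown in the traceback."""
    return 'lvl0' in selectors


try:
    has_lvl0(config_selectors)
    error_message = None
except TypeError as exc:
    # A membership test against None raises TypeError in plain Python.
    error_message = str(exc)

print(error_message)
```

Under that assumption, the error comes from the membership test itself rather than from anything specific to the site being scraped.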

GarlandM avatar Feb 18 '22 11:02 GarlandM

You need to provide some selectors. I am not sure whether the selectors are a true requirement or whether this is a bug; there is some discussion here and here about this.

sanders41 avatar Feb 18 '22 13:02 sanders41

Interesting, the docs list the selector keys as "Optional".

GarlandM avatar Feb 18 '22 14:02 GarlandM

I have not had a chance to test this, but I wonder whether the following config would work:

{
  "index_uid": "docs",
  "start_urls": ["https://socialtools.io"],
  "selectors": [],
  "strip_chars": " .,;:#"
}
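As a rough sanity check of the idea in plain Python, and not a test of docs-scraper itself: with an empty list in place of a missing key, a membership test like the one in the traceback evaluates to False instead of raising. This sketch only covers that single check; whether the rest of the parser accepts an empty list is still an open question.

```python
import json

# The suggested config, with "selectors" present but empty.
raw_config = (
    '{"index_uid": "docs", '
    '"start_urls": ["https://socialtools.io"], '
    '"selectors": [], '
    '"strip_chars": " .,;:#"}'
)
config = json.loads(raw_config)
config_selectors = config.get("selectors")

# Same shape as the check in the traceback; on a list this is a normal
# membership test, so it returns False rather than raising TypeError.
lvl0_present = 'lvl0' in config_selectors

print(lvl0_present)
```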

sanders41 avatar Feb 18 '22 17:02 sanders41

This issue hasn't received an update in a long time, so I suppose it's no longer relevant.

alallema avatar Aug 03 '23 13:08 alallema