selectorlib
Specifying a "type" other than Text, Link, HTML, Attribute or Image (even the same ones in different casing) will yield an UnboundLocalError
- selectorlib version: 0.16.0
- Python version: 3.8.10
- Operating System: Ubuntu 20.04.5 LTS
Description
Specifying a "type" in the YAML other than Text, Link, HTML, Attribute or Image (including the same names in different casing, such as "Html") raises an UnboundLocalError for the "content" variable. A quick inspection of the source code shows that the type checks in extract_field have no "else" branch, so for an unrecognized type "content" is never assigned before it is returned.
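To illustrate the failure mode, here is a minimal sketch of the pattern. It is not selectorlib's actual source: the function names, the two handled types, and the string handling are hypothetical stand-ins. The second function shows one possible fix, which is to raise a descriptive error for unknown types.

```python
def extract_field_buggy(value, item_type):
    # Hypothetical stand-in for the pattern described above,
    # not selectorlib's real extract_field.
    if item_type == "Text":
        content = value.strip()
    elif item_type == "HTML":
        content = "<p>" + value + "</p>"
    # No else branch: any other item_type leaves `content` unassigned.
    return content


def extract_field_checked(value, item_type):
    # One possible fix (an assumption, not necessarily what the PR does):
    # fail early with a message that names the offending type.
    if item_type == "Text":
        content = value.strip()
    elif item_type == "HTML":
        content = "<p>" + value + "</p>"
    else:
        raise ValueError(f"Unsupported type: {item_type!r}")
    return content


# "Html" differs from "HTML" only in casing, yet it matches no branch.
try:
    extract_field_buggy("hello", "Html")
except UnboundLocalError as exc:
    buggy_error = exc
```

With the else branch in place, a typo in the YAML would surface as a ValueError naming the bad type instead of an UnboundLocalError deep inside the library.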
What I Did
Used type: "Html" in the YAML and ran extractor.extract():
Traceback (most recent call last):
  File "scraper.py", line 18, in <module>
    print(extractor.extract(r.text))
  File "/home/kartik/dev/selectorlib-projects/demo/venv/lib/python3.8/site-packages/selectorlib/selectorlib.py", line 74, in extract
    fields_data[selector_name] = self._extract_selector(selector_config, sel)
  File "/home/kartik/dev/selectorlib-projects/demo/venv/lib/python3.8/site-packages/selectorlib/selectorlib.py", line 93, in _extract_selector
    value = self._get_child_item(field_config, element)
  File "/home/kartik/dev/selectorlib-projects/demo/venv/lib/python3.8/site-packages/selectorlib/selectorlib.py", line 113, in _get_child_item
    child_value = self._extract_selector(children_config[field], element)
  File "/home/kartik/dev/selectorlib-projects/demo/venv/lib/python3.8/site-packages/selectorlib/selectorlib.py", line 100, in _extract_selector
    value = extract_field(element, item_type, **kwargs)
  File "/home/kartik/dev/selectorlib-projects/demo/venv/lib/python3.8/site-packages/selectorlib/selectorlib.py", line 21, in extract_field
    return content
UnboundLocalError: local variable 'content' referenced before assignment
I would be more than happy to submit a PR to fix this.
Hi @ashwinrajeev, I have fixed this issue and created the following PR - https://github.com/scrapehero/selectorlib/pull/86. Please verify and let me know if any changes need to be made.