
Specifying a "type" other than Text, Link, HTML, Attribute or Image (even the same ones in different casing) will yield an UnboundLocalError

Open • ghost opened this issue 2 years ago • 1 comment

  • selectorlib version: 0.16.0
  • Python version: 3.8.10
  • Operating System: Ubuntu 20.04.5 LTS

Description

Specifying a "type" in YAML other than Text, Link, HTML, Attribute or Image (including those same names in different casing, such as "Html") raises an UnboundLocalError for the "content" variable. A quick inspection of the source code shows that the type-handling branches in extract_field have no "else" branch, so for an unrecognized type "content" is never assigned before it is returned.
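The failure mode can be reproduced in isolation. The snippet below is only a minimal sketch of the pattern described above, not selectorlib's actual source: extract_field_sketch and its two branches are hypothetical stand-ins.

```python
def extract_field_sketch(value, item_type):
    # Hypothetical stand-in for the library function; only two of the
    # supported types are modelled here.
    if item_type == "Text":
        content = value.strip()
    elif item_type == "HTML":
        content = "<p>" + value + "</p>"
    # No else branch: for any other item_type (including "Html"),
    # content is never assigned, so the return below fails.
    return content


try:
    extract_field_sketch("hello", "Html")
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__

print(error_name)
```

Because the comparison is an exact string match, "Html" falls through every branch just like a completely unknown type would.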

What I Did

Used type: "Html" (instead of "HTML") for a field in the YAML file and ran extractor.extract()

Traceback (most recent call last):
  File "scraper.py", line 18, in <module>
    print(extractor.extract(r.text))
  File "/home/kartik/dev/selectorlib-projects/demo/venv/lib/python3.8/site-packages/selectorlib/selectorlib.py", line 74, in extract
    fields_data[selector_name] = self._extract_selector(selector_config, sel)
  File "/home/kartik/dev/selectorlib-projects/demo/venv/lib/python3.8/site-packages/selectorlib/selectorlib.py", line 93, in _extract_selector
    value = self._get_child_item(field_config, element)
  File "/home/kartik/dev/selectorlib-projects/demo/venv/lib/python3.8/site-packages/selectorlib/selectorlib.py", line 113, in _get_child_item
    child_value = self._extract_selector(children_config[field], element)
  File "/home/kartik/dev/selectorlib-projects/demo/venv/lib/python3.8/site-packages/selectorlib/selectorlib.py", line 100, in _extract_selector
    value = extract_field(element, item_type, **kwargs)
  File "/home/kartik/dev/selectorlib-projects/demo/venv/lib/python3.8/site-packages/selectorlib/selectorlib.py", line 21, in extract_field
    return content
UnboundLocalError: local variable 'content' referenced before assignment

I would be more than happy to submit a PR to fix this.
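One possible shape for such a fix, sketched under assumptions: validate the type up front and fail with a clear message instead of falling through. The names SUPPORTED_TYPES and normalize_type are hypothetical and not part of selectorlib, and accepting other casings is a design choice assumed here, not something the library is known to do.

```python
# Hypothetical helper, not selectorlib API: maps a lowercased type name
# to the canonical spelling listed in this issue.
SUPPORTED_TYPES = {
    "text": "Text",
    "link": "Link",
    "html": "HTML",
    "attribute": "Attribute",
    "image": "Image",
}


def normalize_type(item_type):
    """Return the canonical type name, or raise ValueError if unsupported."""
    key = str(item_type).strip().lower()
    if key not in SUPPORTED_TYPES:
        allowed = ", ".join(SUPPORTED_TYPES.values())
        raise ValueError(
            "Unsupported type %r; expected one of: %s" % (item_type, allowed)
        )
    return SUPPORTED_TYPES[key]


print(normalize_type("Html"))
```

A stricter alternative would skip the case folding and simply add an "else" branch that raises the same ValueError, so that a misspelled type is reported at the point where it is used.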

ghost • Nov 07 '22 11:11

Hi @ashwinrajeev, I have fixed this issue and created the following PR - https://github.com/scrapehero/selectorlib/pull/86. Please verify and let me know if any changes need to be made.

sumeshmurali • Jan 26 '23 14:01