parsel icon indicating copy to clipboard operation
parsel copied to clipboard

remove_namespaces asserts on json TextResponse

Open mohmad-null opened this issue 1 year ago • 7 comments

Regression. As part of my unit tests, I have a well-formed JSON file that I'm using in some tests as "bad" data (the script wants XML).

I do something like this:

response = TextResponse(body='path-to.json', encoding='utf8')
response.selector.remove_namespaces()

This used to work (1.7.x), now (2.11.1) it asserts:

    def remove_namespaces(self) -> None:
        """
        Remove all namespaces, allowing to traverse the document using
        namespace-less xpaths. See :ref:`removing-namespaces`.
        """
		for el in self.root.iter("*"):
		AttributeError: 'dict' object has no attribute 'iter'

I appreciate JSON doesn't have namespaces, but it should fail gracefully, not raise an assertion.

mohmad-null avatar May 03 '24 18:05 mohmad-null

It's parsel, not Scrapy.

wRAR avatar May 03 '24 20:05 wRAR

Yes, I noticed that, but I figured Scrapy should catch it.

mohmad-null avatar May 03 '24 20:05 mohmad-null

I don't think Scrapy should catch it, and Scrapy will only able to do that by overriding remove_namespaces().

wRAR avatar May 04 '24 05:05 wRAR

But AttributeError: 'dict' object has no attribute 'iter' is a meaningless error for a scrapy user. If there's an exception (fine), it should be something meaningful "Cannot remove namespaces from JSON" or something - that's why I consider it a Scrapy issue.

mohmad-null avatar May 04 '24 09:05 mohmad-null

The code is in parsel, not Scrapy. It should be done in parsel, not in Scrapy. Migrating.

wRAR avatar May 04 '24 09:05 wRAR

Hello @wrar, I'm new to contributing to parsel and would be more than happy to take a shot at this. Can you elaborate a bit on what needs to be done?

ghost avatar Jul 03 '24 16:07 ghost

@kanjikinsmoke we need to make the remove_namespaces() call on JSON-type selectors not fail with an uncaught exception. I think it needs a self.type check similar to e.g. Selector.css().

Also it would be nice to check if there are any other methods or properties that behave the same and do the same fix for them, e.g. attrib also gives an unhandled exception (#284 is likely about that).

wRAR avatar Jul 05 '24 11:07 wRAR

I saw

bpdvarma avatar Jul 26 '24 10:07 bpdvarma