Add support for set type
In the previous version (1.1.0), adding a set would return a list:
$ pip freeze | grep itemloaders
itemloaders==1.1.0
$ cat test_set.py
from itemloaders import ItemLoader
test_set = {"foo", "bar"}
il = ItemLoader()
il.add_value("item_list", test_set)
print(il.load_item())
$ python ./test_set.py
{'item_list': ['bar', 'foo']}
In version 1.2.0, a list o af set would be returned:
$ pip freeze | grep itemloaders
itemloaders==1.2.0
$ python ./test_set.py
{'item_list': [{'bar', 'foo'}]}
This PR is aimed to keep the previous behaviour. Follow-up from https://github.com/scrapy/itemloaders/pull/51#issuecomment-2107173388
I must say, when you said keeping set, I thought we were talking about changing the existing isinstance to include it, not to add a separate isinstance to convert it to a list. So I’m +0 on this, I’m not against the change but it feels weird that arg_to_iter converts a set to a list, feels like that should be done by the calling code, either before or after.
@Gallaecio I honestly had the same initial idea, but then I double-checked how it used to work and started to doubt the whole idea. Ultimately though, I decided to make it backwards compatible and push it anyway, to have the conversation at least. Thinking about it again today, maybe it's better not to include the set at all as it might break existing code even more...
Adding set to the list of classes in isinstance solves the same problem while keeping the code simple:
https://github.com/scrapy/itemloaders/pull/85
I think it's good to keep backwards compatibility, otherwise, this could cause many problems. If someone wants to actually have a dict in that field they can use the Identity processor.
@ava7 I didn't mind to steal your thunder here, I just created the PR to test my assumptions. Feel free to copy my solution here and close my PR.
@VMRuiz It's alright mate, nothing to steal here, the more opinions the better :) The problem with that code is that it might not be backwards compatible against version 1.1.0, if that is what you meant. Weirdly enough, a set used to return a simple list, now a set would return a list of a set... I'm not sure which one is more confusing. Maybe the old behaviour should be treated as a bug, and proceed with your changes?
I just checked that in 1.1.0 using default_output_processor or default_input_processor will not affect set being always converted into list, so it was impossible for an item to contain a set value.
Based on that, I propose to treat the old behavior as a bug and do not try to fix it. The downside to this is that someone that was relaying on the behavior may be surprised when he is not longer getting a list of values but a list of sets of values.