Allow None values in Itemloaders/Items
Summary
I would like to pass None values to ItemLoader() and have them stored in the Item(). Right now, None values are discarded, so the resulting Item() is missing those fields entirely.
Motivation
Sometimes a value is not available on every parsed page, and when the Selector returns None, the database pipeline (Postgres) fails with a KeyError: 'fieldname'.
I worked around this by loading a placeholder "null" string that is later converted to None, but this feels like a hack.
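For reference, the workaround amounts to roughly the sketch below. None of this is Scrapy API: FIELDS, NULL_SENTINEL and row_for_insert are names made up for illustration, and a plain dict stands in for the item.

```python
# Hypothetical names throughout; a plain dict stands in for a scraped Item.
FIELDS = ("title", "price", "description")
NULL_SENTINEL = "null"  # placeholder string loaded instead of None


def row_for_insert(item):
    """Build a tuple for a DB insert, mapping missing or sentinel values to None."""
    row = []
    for name in FIELDS:
        value = item.get(name)  # .get() avoids a KeyError for fields never set
        if value == NULL_SENTINEL:
            value = None
        row.append(value)
    return tuple(row)


# "description" was never loaded, "price" carries the placeholder.
item = {"title": "Widget", "price": NULL_SENTINEL}
print(row_for_insert(item))
```

It works, but every pipeline has to know about the placeholder, which is why keeping None in the loader itself would be cleaner.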
Hey, this has been a discussion in the past, as I recall. See https://github.com/scrapy/scrapy/pull/556
Ultimately the decision was for None values to not be kept by itemloader.
But you can restore that possibility by using a custom loader like this:
https://github.com/nyov/scrapyext/blob/2dd5e0fc03f8e4b8793b808744d4dd6452e5d5b3/scrapyext/loader.py#L19-L27
Beware, this is old code I have yet to update. All you really need to do is remove the following line in the current codebase:
https://github.com/scrapy/itemloaders/blob/951e9edf2e52620db0338a4edb9015352356abc5/itemloaders/__init__.py#L264
Or we could try to overturn the old decision, now that some water has passed under the bridge (evil laugh).
Indeed, I'm in favor of having a flag or specialized ItemLoader for this behavior.
I think it's weird to call loader.add_value('field', None) and not have the field in the output.
Even though None represents the absence of a value, it is still a value in its own right.
I don't even know why that wasn't a consideration then.
But that's exactly what we should add, I think. Either a documented NoneValueItemLoader subclass or a flag such as ItemLoader(item, nonevalues=True) should work just fine?
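To make the flag idea concrete, here is a toy sketch of the semantics I have in mind. TinyLoader is a made-up stand-in, not the itemloaders implementation (no processors, no selectors), and only the nonevalues name comes from the suggestion above.

```python
class TinyLoader:
    """Toy stand-in for an item loader; NOT the itemloaders implementation."""

    def __init__(self, item=None, nonevalues=False):
        self.item = {} if item is None else item
        self.nonevalues = nonevalues  # flag name taken from the proposal above
        self._values = {}

    def add_value(self, field_name, value):
        if value is None and not self.nonevalues:
            return  # default behaviour: None is silently dropped
        self._values.setdefault(field_name, []).append(value)

    def load_item(self):
        for field_name, values in self._values.items():
            # Use the first collected value to keep the toy simple.
            self.item[field_name] = values[0]
        return self.item


default = TinyLoader()
default.add_value("price", None)
print(default.load_item())  # the "price" field is absent

keeping = TinyLoader(nonevalues=True)
keeping.add_value("price", None)
print(keeping.load_item())  # the "price" field is present, set to None
```

With the flag off nothing changes for existing users, so it would be an opt-in rather than overturning the old decision outright.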
Any updates on this?
I'm struggling with the same issue! Any updates?