scrapy-elasticsearch
ELASTICSEARCH_UNIQ_KEY from multiple item fields
An item may not have a single-field primary key, in which case this functionality is useless (or even dangerous!). Sure, you can compute a truly unique field and add it to the item, e.g. from a concatenation of fields, but then it will be indexed along with the others. Maybe ELASTICSEARCH_UNIQ_KEY could accept a list of field keys and concatenate their values (coerced to strings) before computing the hash.
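To make the proposal concrete, here is a minimal sketch of what hashing several fields into one ID could look like. `composite_id` and the `sep` parameter are hypothetical names, not part of scrapy-elasticsearch, and SHA-1 is only an assumption about the hash; a plain dict stands in for a Scrapy item.

```python
import hashlib


def composite_id(item, fields, sep="|"):
    """Hypothetical helper: hash the values of several item fields into one ID."""
    # Coerce every value to a string so ints, dates, etc. can be joined.
    parts = [str(item[name]) for name in fields]
    # Join with a separator so ("ab", "c") and ("a", "bc") do not collide.
    joined = sep.join(parts)
    # SHA-1 is an assumption here; any stable hash would do.
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()


# A plain dict stands in for a Scrapy item.
item = {"url": "https://example.com/a", "id": 42, "title": "A"}
doc_id = composite_id(item, ["url", "id"])
```

The separator matters: without it, different field combinations could concatenate to the same string and silently overwrite each other's documents. The extra field never has to be stored on the item, so nothing additional gets indexed.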
This 'feature' was implemented by the original author, and I am keeping it for backward compatibility. I agree it is sorta useless and plan to remove it in the next release.
It doesn't work anymore, so it should be removed from the README:
# can also accept a list of fields if need a composite key
ELASTICSEARCH_UNIQ_KEY = ['url', 'id']
File "/usr/local/lib/python3.9/site-packages/scrapyelasticsearch/scrapyelasticsearch.py", line 104, in get_id
item_unique_key = item[self.settings['ELASTICSEARCH_UNIQ_KEY']]
TypeError: unhashable type: 'list'
Because of this line: https://github.com/jayzeng/scrapy-elasticsearch/blob/56d60d99182874218ab75cd13a5e1b7f4f5f94d9/scrapyelasticsearch/scrapyelasticsearch.py#L104
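For anyone hitting this, the traceback above boils down to using a list as a mapping key. Below is a minimal reproduction with a plain dict standing in for the item, plus one possible workaround. `lookup_unique_key` is a hypothetical helper, not the project's code, and the `-` separator is an arbitrary choice.

```python
def lookup_unique_key(item, uniq_key):
    """Hypothetical helper: accept one field name or a list/tuple of names."""
    if isinstance(uniq_key, (list, tuple)):
        # Composite key: look up each field separately and join the values.
        return "-".join(str(item[name]) for name in uniq_key)
    # Single field name: plain lookup, as in the line from the traceback.
    return str(item[uniq_key])


# A plain dict stands in for a Scrapy item.
item = {"url": "https://example.com/a", "id": 7}

# Reproduce the failure: a list cannot be used as a mapping key.
try:
    item[["url", "id"]]
except TypeError as exc:
    error_message = str(exc)

composite = lookup_unique_key(item, ["url", "id"])
single = lookup_unique_key(item, "id")
```

With this kind of check in place, both the single-field setting and the list form shown in the README would resolve to a string that can then be hashed.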