ELASTICSEARCH_UNIQ_KEY from multiple item fields

Open jenkin opened this issue 8 years ago • 2 comments

An item may not have a single-field primary key, in which case this feature is useless (or even dangerous!). Sure, you can compute a truly unique field and add it to the item, e.g. by concatenating several fields, but then it gets indexed along with the others. Maybe ELASTICSEARCH_UNIQ_KEY could accept a list of field keys and concatenate their values (coerced to strings) before computing the hash.
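A minimal sketch of what that could look like, assuming the ID is a SHA-1 hex digest of the joined values. The function name `composite_id` and the `sep` parameter are hypothetical and not part of scrapy-elasticsearch:

```python
import hashlib


def composite_id(item, uniq_key, sep="-"):
    """Build a document ID from one field or from several fields.

    Hypothetical helper: `uniq_key` stands in for the value of the
    ELASTICSEARCH_UNIQ_KEY setting and may be a str or a list/tuple of str.
    """
    # Normalise the setting to a list of field names.
    keys = [uniq_key] if isinstance(uniq_key, str) else list(uniq_key)
    # Coerce every value to a string, then join them into one key.
    joined = sep.join(str(item[k]) for k in keys)
    # Hash the joined key so the ID has a fixed length.
    return hashlib.sha1(joined.encode("utf-8")).hexdigest()


item = {"url": "http://example.com/a", "id": 42, "title": "A"}
single = composite_id(item, "url")
multi = composite_id(item, ["url", "id"])
```

Note that a plain separator can be ambiguous if field values themselves contain it, so a separator that cannot occur in the data would be a safer choice.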

jenkin, Oct 27 '16 10:10

This 'feature' was implemented by the original author, and I am keeping it for backward compatibility. I agree it is sort of useless, and I do plan to remove it in the next release.

jayzeng, Nov 05 '16 22:11

The list form no longer works, so it should be removed from the README:

# can also accept a list of fields if need a composite key
ELASTICSEARCH_UNIQ_KEY = ['url', 'id']

With that setting, the pipeline fails with:

  File "/usr/local/lib/python3.9/site-packages/scrapyelasticsearch/scrapyelasticsearch.py", line 104, in get_id
    item_unique_key = item[self.settings['ELASTICSEARCH_UNIQ_KEY']]
TypeError: unhashable type: 'list'

Because of this line: https://github.com/jayzeng/scrapy-elasticsearch/blob/56d60d99182874218ab75cd13a5e1b7f4f5f94d9/scrapyelasticsearch/scrapyelasticsearch.py#L104
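The failure can be reproduced without Scrapy. In the sketch below a plain dict stands in for the item (an assumption made for illustration; as far as I know a Scrapy Item stores its values in a dict, so the lookup behaves the same way), and `lookup_error` is a hypothetical helper:

```python
# A plain dict stands in for the scraped item (assumption for illustration).
item = {"url": "http://example.com/a", "id": 42}
uniq_key = ["url", "id"]  # the list form shown in the README


def lookup_error(mapping, key):
    """Return the TypeError message raised by mapping[key], or None."""
    try:
        mapping[key]
    except TypeError as exc:
        return str(exc)
    return None


# Using the whole list as one key fails, because a list is not hashable.
message = lookup_error(item, uniq_key)

# Looking up each field name separately works.
values = [item[field] for field in uniq_key]
```

So the setting would have to be iterated field by field rather than used as a single key.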

kasbah, Jan 21 '21 23:01