csv2es
Add docs to same index
How do I read a new CSV file and store the records in an already existing index with this tool? If I try running the command, I get this:

raise error_class(status, error_message)
pyelasticsearch.exceptions.ElasticHttpError: (400, {u'index': u'test-index', u'root_cause': [{u'index': u'test-index', u'reason': u'already exists', u'type': u'index_already_exists_exception'}], u'type': u'index_already_exists_exception', u'reason': u'already exists'})
I want an --update-index option; is this possible with the current version?
Anything new on this?
Another related feature is the ability to add multiple document types into the same index without needing to wipe out the index each time.
I would also love to see this added. In my case, I've designed an index with edge-ngrams for auto completion in a search box, and now I need to add the data.
Kudos on the project by the way, even with this feature missing, it's an excellent utility!
The issue is that the code traps IndexAlreadyExistsError, but the actual exception raised is an ElasticHttpError. So this is not a feature request but a bug. See below:
try:
    es.create_index(index_name)
    echo('Created new index: ' + index_name, quiet)
except IndexAlreadyExistsError:
    echo('Index ' + index_name + ' already exists', quiet)
A quick workaround is something like:
try:
    es.create_index(index_name)
    echo('Created new index: ' + index_name, quiet)
except IndexAlreadyExistsError:
    echo('Index ' + index_name + ' already exists', quiet)
except ElasticHttpError:
    echo('Index ' + index_name + ' already exists', quiet)
I need this as well.
The workaround doesn't seem to work; I'm getting:
NameError: global name 'ElasticHttpError' is not defined
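That NameError just means the exception class is never imported. A minimal sketch of the fix, assuming pyelasticsearch is the client in use (its exceptions module is where the traceback above points):

from pyelasticsearch.exceptions import ElasticHttpError, IndexAlreadyExistsError

try:
    es.create_index(index_name)
    echo('Created new index: ' + index_name, quiet)
except (IndexAlreadyExistsError, ElasticHttpError):
    # note: catching ElasticHttpError this broadly also swallows
    # unrelated HTTP errors, so checking the status/type may be safer
    echo('Index ' + index_name + ' already exists', quiet)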
[+1] on both requests: (1) add documents to existing indices, (2) add new document types to existing indices*
* I don't know whether adding a new document type to an existing index is doable when a field's associated core type does not already exist.
Note that I've also tried some workarounds for specifying whether to create or delete an index, with no success.
I've also never been able to get the mapping-file feature to work. The built-in auto-detection/generation works well for most fields but does not correctly recognize geo-point or IP-address types, so the ability to specify the mapping is important.
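For reference, an explicit mapping that declares those two types could look roughly like this; the field and document-type names are hypothetical, but geo_point and ip are the actual Elasticsearch core type names:

# a rough sketch of an explicit mapping; field names are placeholders,
# 'geo_point' and 'ip' are the core types that auto-detection misses
mapping = {
    "mappings": {
        "my_doc_type": {
            "properties": {
                "location": {"type": "geo_point"},
                "client_ip": {"type": "ip"},
            }
        }
    }
}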
This is what I do to add documents to an existing index:
from elasticsearch import Elasticsearch

# get a client
es = Elasticsearch(hosts=[{"host": args.host, "port": args.port}])
# read the mapping from a file
mapping = open(args.mapping, 'r').read()
# create the index, ignoring the 400 error if it already exists
es.indices.create(index='index_name', ignore=400, body=mapping)
And then, bulk to that index.
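A rough sketch of that bulk step with the elasticsearch-py helper; rows is assumed to be an iterable of dicts parsed from the CSV, and the index/type names are placeholders:

from elasticsearch.helpers import bulk

# stream one indexing action per CSV row into the existing index
actions = (
    {"_index": "index_name", "_type": "my_doc_type", "_source": row}
    for row in rows
)
bulk(es, actions)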
Hope it's useful.