neo4j-elasticsearch

Data not written to elasticsearch

Open apowers313 opened this issue 8 years ago • 6 comments

Assuming #42 isn't the source of my problems, I'm not seeing any data being written to elasticsearch and I don't get any errors that would indicate why.

I have 6538642 nodes of type BibliographicResource which look like:

// MATCH (n:BibliographicResource) WHERE id(n) = toInteger(rand() * 6000000) RETURN n
{
  "iri": "gbr:3512409",
  "year": "2007",
  "label": "bibliographic resource 3512409 [br/3512409]",
  "title": "The Isometric Torque at Which Knee-Extensor Muscle Reoxygenation Stops",
  "record_type": "article"
}

And my configuration looks like:

elasticsearch.host_name=http://localhost:9200
elasticsearch.index_spec=br:BibliographicResource(title,iri)

I can see that the plugin loads in the log file:

2017-09-02 20:41:01.897+0000 INFO [o.n.k.i.DiagnosticsManager]   plugins:
2017-09-02 20:41:01.897+0000 INFO [o.n.k.i.DiagnosticsManager]     .DS_Store: 2017-09-02T13:40:51-0700 - 6.00 kB
2017-09-02 20:41:01.897+0000 INFO [o.n.k.i.DiagnosticsManager]     neo4j-elasticsearch-3.2.3.jar: 2017-09-02T13:40:24-0700 - 4.91 MB
2017-09-02 20:41:01.897+0000 INFO [o.n.k.i.DiagnosticsManager]   - Total: 2017-09-02T13:40:51-0700 - 4.92 MB
2017-09-02 20:41:01.897+0000 INFO [o.n.k.i.DiagnosticsManager]   schema:
2017-09-02 20:41:01.897+0000 INFO [o.n.k.i.DiagnosticsManager]     index:
2017-09-02 20:41:01.897+0000 INFO [o.n.k.i.DiagnosticsManager]       lucene:
[ ... ]

I attempt to populate the data:

MATCH (br:BibliographicResource) SET br.title = br.title, br.iri = br.iri

Set 13077247 properties, completed after 225693 ms.

But the elasticsearch index is never created / no data is populated in elasticsearch:

$ curl 'localhost:9200/_cat/indices?v'
health status index   uuid                   pri rep docs.count docs.deleted store.size pri.store.size
yellow open   .kibana qOP15da-RaK5bfIWpjmLLA   1   1          1            0      3.2kb          3.2kb

There aren't any errors in neo4j's debug.log or elasticsearch's elasticsearch.log. Any ideas of how to fix and / or debug this problem?

apowers313 Sep 02 '17 20:09

I have the same issue with 3.1.

I actually had to modify the data for elasticsearch to be populated. A SET a.b=a.b query is not enough for elasticsearch to be notified.

SeguinBe Sep 22 '17 18:09

Yes, neo4j doesn't write properties that haven't changed.

We could add some means (e.g. a procedure) to actually trigger initial indexing.

jexp Nov 23 '17 13:11

btw. updating all 13M entries at once might also overload the plugin.

Can you try to update e.g. a subset of 100k, let's say with a timestamp?
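A minimal sketch of that suggestion, not something confirmed in this thread: touch nodes in batches of 100k by setting a timestamp, repeating until nothing is left. The property name `es_touched`, the function `touch_in_batches`, and the `run_batch` callback (which would wrap whatever Neo4j client is in use and return the `updated` count) are all hypothetical. Whether the plugin picks up a change to a property that is not listed in `elasticsearch.index_spec` is also an assumption that would need to be checked.

```python
from typing import Callable

# Cypher for one batch. `es_touched` is a hypothetical marker property,
# not part of the original data model.
TOUCH_BATCH = """
MATCH (br:BibliographicResource)
WHERE br.es_touched IS NULL
WITH br LIMIT $batch_size
SET br.es_touched = timestamp()
RETURN count(br) AS updated
"""


def touch_in_batches(run_batch: Callable[[str, int], int],
                     batch_size: int = 100_000) -> int:
    """Run TOUCH_BATCH repeatedly until a batch updates no nodes.

    `run_batch(query, batch_size)` is supplied by the caller; it should
    execute the query in its own transaction and return `updated`.
    Returns the total number of nodes touched.
    """
    total = 0
    while True:
        updated = run_batch(TOUCH_BATCH, batch_size)
        if updated == 0:
            return total
        total += updated
```

Each batch runs as a separate, smaller transaction, so the plugin would see at most `batch_size` changed nodes per commit instead of all 13M property writes at once.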

jexp Nov 23 '17 13:11

Are there any updates on this? Unchanged existing data isn't sent to elasticsearch.

bradeac Sep 26 '18 11:09

@jexp would that many records still overload the plugin if one were to use apoc.periodic.commit with smaller batch sizes (since it's executeAsync anyway)?

leviwilson Oct 02 '18 15:10

Hi @jexp, is there any update on whether a procedure will be implemented to trigger initial indexing? The problem still seems to be happening.

ewc340 Oct 30 '20 22:10