Neo4j and AWS ElasticSearch Service integration failed occasionally

Open Kunal-Dethe opened this issue 9 years ago • 5 comments

Hello All,

I have been using this module to insert data into ElasticSearch from Neo4j. It works fine on the local, development and staging servers, where the ElasticSearch service runs on the same server as Neo4j.

But when the AWS ElasticSearch Service is used and data is added to the Neo4j db, the data is sometimes not inserted into ElasticSearch.

No error or exception is thrown during the transaction between Neo4j and ElasticSearch. I checked the log files at /var/log/neo4j/console.log and /var/log/neo4j/http.log.

Since the data is inserted some of the time, and always when ElasticSearch is on the same server, the settings do not seem to be the issue.

So it is difficult to debug why this is happening.

Any ideas are appreciated.

Kunal-Dethe avatar Sep 01 '16 10:09 Kunal-Dethe

Hi Kunal,

We can add some more logging if that helps you. Can you check your network setup so that the Neo4j box sees the ES box and vice versa?

Which version do you use? The one for Neo4j 3.0? The insertion happens asynchronously as fire & forget, but we can check the responses and log them if that helps.
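To illustrate the difference between fire & forget and checking responses, here is a minimal sketch in Python. It is not the module's code: `send_to_index`, `log_outcome`, `index_async` and the simulated response are hypothetical names used only for illustration, and no real ElasticSearch call is made.

```python
import logging
from concurrent.futures import Future, ThreadPoolExecutor

log = logging.getLogger("es-sync")


def send_to_index(doc):
    """Hypothetical stand-in for the HTTP write to ElasticSearch."""
    if "id" not in doc:
        raise ValueError("document has no id")
    return {"status": 201, "id": doc["id"]}


def log_outcome(future: Future) -> None:
    """Done-callback: surfaces failures that pure fire & forget would hide."""
    error = future.exception()
    if error is not None:
        log.error("indexing failed: %s", error)
    else:
        log.info("indexed: %s", future.result())


def index_async(executor, doc, check_response=True):
    """Submit the write without blocking the caller.

    With check_response=False this is plain fire & forget: a failed
    write leaves no trace unless someone inspects the returned future.
    """
    future = executor.submit(send_to_index, doc)
    if check_response:
        future.add_done_callback(log_outcome)
    return future
```

Called as `index_async(ThreadPoolExecutor(max_workers=1), {"id": 1})`, the write still runs in the background, but a failure ends up in the log instead of disappearing silently.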

jexp avatar Sep 03 '16 21:09 jexp

Hello @jexp

Thanks for the reply.

As for the network setup, Neo4j is installed on an EC2 instance, and the ElasticSearch service in question is the AWS ElasticSearch Service. Since it does work sometimes, I do not see how the network could be the issue.

Neo4j version: 2.3.6
ElasticSearch version: 2.3.2

Again, to point out, this only happens when the AWS ElasticSearch Service is connected, and not with the ElasticSearch running on the EC2 instance itself.

It would be a great help to know if there is any way to log the transactions happening between the Neo4j and ElasticSearch services.

Below is the content of the log file: /var/log/neo4j/console.log

2016-09-02 12:27:47.494+0000 INFO  Remote interface ready and available at http://0.0.0.0:7474/
12:28:42.520 [NodeChecker RUNNING] ERROR i.s.c.config.discovery.NodeChecker - Error executing NodesInfo!
io.searchbox.client.config.exception.NoServerConfiguredException: No Server is assigned to client to connect
        at io.searchbox.client.AbstractJestClient$ServerPool.getNextServer(AbstractJestClient.java:132) ~[jest-common-2.0.2.jar:na]
        at io.searchbox.client.AbstractJestClient.getNextServer(AbstractJestClient.java:81) ~[jest-common-2.0.2.jar:na]
        at io.searchbox.client.http.JestHttpClient.prepareRequest(JestHttpClient.java:80) ~[jest-2.0.2.jar:na]
        at io.searchbox.client.http.JestHttpClient.execute(JestHttpClient.java:46) ~[jest-2.0.2.jar:na]
        at io.searchbox.client.config.discovery.NodeChecker.runOneIteration(NodeChecker.java:65) ~[jest-common-2.0.2.jar:na]
        at com.google.common.util.concurrent.AbstractScheduledService$ServiceDelegate$Task.run(AbstractScheduledService.java:189) [guava-19.0.jar:na]
        at com.google.common.util.concurrent.Callables$3.run(Callables.java:100) [guava-19.0.jar:na]
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511) [na:1.8.0_101]
        at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308) [na:1.8.0_101]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180) [na:1.8.0_101]
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294) [na:1.8.0_101]
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142) [na:1.8.0_101]
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617) [na:1.8.0_101]
        at java.lang.Thread.run(Thread.java:745) [na:1.8.0_101]
12:28:42.540 [NodeChecker RUNNING] INFO  i.s.client.AbstractJestClient - Setting server pool to a list of 1 servers: [ELASTICSEARCH_URL]
12:29:42.541 [NodeChecker RUNNING] DEBUG i.s.client.http.JestHttpClient - GET method created based on client request
12:29:42.541 [NodeChecker RUNNING] DEBUG i.s.client.http.JestHttpClient - Request method=GET url=ELASTICSEARCH_URL/_nodes/_all/http
12:29:42.553 [NodeChecker RUNNING] DEBUG io.searchbox.action.AbstractAction - Request and operation succeeded
12:29:42.553 [NodeChecker RUNNING] DEBUG i.s.c.config.discovery.NodeChecker - Discovered 0 HTTP hosts:
12:29:42.553 [NodeChecker RUNNING] INFO  i.s.client.AbstractJestClient - Setting server pool to a list of 0 servers: []
12:29:42.553 [NodeChecker RUNNING] WARN  i.s.client.AbstractJestClient - No servers are currently available to connect.

The response from the API: ELASTICSEARCH_URL/_nodes/_all/http

EC2 instance:

{"cluster_name":"elasticsearch","nodes":{"X9zagEOlSK-h3l9dSG08PA":{"name":"Her","transport_address":"172.31.50.210:9300","host":"172.31.50.210","ip":"172.31.50.210","version":"2.3.0","build":"8371be8","http_address":"172.31.50.210:9200","http":{"bound_address":["[::]:9200"],"publish_address":"172.31.50.210:9200","max_content_length_in_bytes":104857600}}}}

AWS ElasticSearch instance:

{"cluster_name":"102372860153:ES_DONAIN_NAME","nodes":{"kXO7l2ZyRgaDq44Ohx4qCA":{"name":"Cassie Lang","version":"2.3.2","build":"0944b4b"}}}
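The difference between the two responses can be made concrete with a small Python sketch. It assumes the client builds its server list from a per-node `http_address` field, which is the field present in the EC2 response and absent in the AWS one; `discovered_http_hosts` is a hypothetical helper, not part of Jest or the module. The two JSON strings are copied from the responses above.

```python
import json

EC2_RESPONSE = (
    '{"cluster_name":"elasticsearch","nodes":{"X9zagEOlSK-h3l9dSG08PA":'
    '{"name":"Her","transport_address":"172.31.50.210:9300",'
    '"host":"172.31.50.210","ip":"172.31.50.210","version":"2.3.0",'
    '"build":"8371be8","http_address":"172.31.50.210:9200",'
    '"http":{"bound_address":["[::]:9200"],'
    '"publish_address":"172.31.50.210:9200",'
    '"max_content_length_in_bytes":104857600}}}}'
)

AWS_RESPONSE = (
    '{"cluster_name":"102372860153:ES_DONAIN_NAME","nodes":'
    '{"kXO7l2ZyRgaDq44Ohx4qCA":{"name":"Cassie Lang",'
    '"version":"2.3.2","build":"0944b4b"}}}'
)


def discovered_http_hosts(nodes_info_json):
    """Collect the HTTP address of every node that advertises one."""
    nodes = json.loads(nodes_info_json).get("nodes", {})
    return [
        info["http_address"]
        for info in nodes.values()
        if "http_address" in info  # assumption: this is the field discovery reads
    ]


print(discovered_http_hosts(EC2_RESPONSE))  # ['172.31.50.210:9200']
print(discovered_http_hosts(AWS_RESPONSE))  # []
```

Under that assumption, the AWS response yields no hosts at all, which would match the "Discovered 0 HTTP hosts" line in the log.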

Kunal-Dethe avatar Sep 06 '16 09:09 Kunal-Dethe

Hello @jexp ,

We would appreciate it if you could provide some further insight into the issue.

We got in touch with AWS about this, and here is what they had to say:

  • Looking over the logs, it seems that 'i.s.c.config.discovery.NodeChecker' is trying to auto discover and connect to the individual nodes of the cluster. Amazon is continuously working hard on improving the service features but unfortunately, at this moment AWS doesn't allow clients to connect to the individual nodes of the cluster. Instead, you can connect using the URL
  • AWS Elastic Search is built such that the cluster doesn't return any IPs or ports used by the individual nodes. This is the reason why you are seeing the difference in the API call output.
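If that explanation is right, a client that replaces its server list with whatever discovery returns would end up with an empty list against AWS, and writes would only succeed until a discovery round empties the pool. The Python sketch below models that behaviour. `ServerPool`, `refresh` and the `discovery_enabled` switch are hypothetical names for illustration only, and `https://es.example.invalid` is a placeholder URL; whether the module exposes such a switch is not established in this thread.

```python
class ServerPool:
    """Hypothetical client-side pool, loosely modelled on the log output."""

    def __init__(self, configured_urls, discovery_enabled=True):
        self.servers = list(configured_urls)
        self.discovery_enabled = discovery_enabled

    def refresh(self, discovered):
        """Apply one discovery round.

        With discovery on, the pool is replaced by the discovered hosts,
        even when that list is empty. With discovery off, the configured
        URL is kept.
        """
        if self.discovery_enabled:
            self.servers = list(discovered)

    def next_server(self):
        if not self.servers:
            raise RuntimeError("No server is assigned to client to connect")
        return self.servers[0]


url = "https://es.example.invalid"  # placeholder, not a real endpoint

with_discovery = ServerPool([url])
with_discovery.refresh([])  # AWS returns no node addresses
# with_discovery.next_server() would now raise RuntimeError

without_discovery = ServerPool([url], discovery_enabled=False)
without_discovery.refresh([])
print(without_discovery.next_server())  # https://es.example.invalid
```

The sketch only shows why connecting through the single service URL, without node discovery, avoids the empty pool; it does not cover request signing.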

nishadk123 avatar Oct 12 '16 07:10 nishadk123

https://github.com/searchbox-io/Jest/issues/382

We probably have to add support for the AWS signing interceptor mentioned there. Do you have capacity to test that out?

I have no AWS ES instance to test it with.

jexp avatar Nov 08 '16 20:11 jexp

@jexp Can I give you access to an AWS ES instance so you can test it?

vvavepacket avatar Jul 23 '17 10:07 vvavepacket