dask-elk
Failed to connect to Elasticsearch managed by AWS behind a VPN
client = DaskElasticClient(host='es.amazonaws.com', port=9200, scheme="https")
my_index="my_index"
df = client.read(index=my_index)
Could you provide some more information, e.g. log messages or exception messages?
Traceback (most recent call last):
File "
======================================================
Then I changed the port to 443, the one I use when connecting with the elasticsearch package:
client = DaskElasticClient(host=['es.amazonaws.com'], scheme="https", port=443)
I get this error:
Traceback (most recent call last):
File "
It works fine when I connect using the elasticsearch package:
from elasticsearch import Elasticsearch
es_client = Elasticsearch(['https://es.amazonaws.com'])
es_client
<Elasticsearch([{'host': 'es.amazonaws.com', 'port': 443, 'use_ssl': True}])>
es_client.indices.exists(my_index)
True
Can you please try creating the client with wan_only=True? The client tries to connect to each node in order to fetch data in parallel. To do that, the data nodes need to be accessible from outside the ELK cluster. This isn't always the case with managed services like Amazon's. See here for more info.
Still the same as the last error message:
client1 = DaskElasticClient(host=['es.amazonaws.com'], scheme="https", port=443, wan_only=True)
df = client1.read(index=my_index)
Traceback (most recent call last):
  File "", line 1, in
  File "/Users/dmg/opt/miniconda3/envs/es/lib/python3.6/site-packages/dask_elk/client.py", line 110, in read
    node_registry.get_nodes_from_elastic(elk_client)
  File "/Users/dmg/opt/miniconda3/envs/es/lib/python3.6/site-packages/dask_elk/elk_entities/node.py", line 45, in get_nodes_from_elastic
    publish_address = node_info['http']['publish_address']
KeyError: 'http'
==========
I think the wan_only option makes no difference to the result at the start of the read function in your code, whether I set it to True or False. Correct me if I'm wrong.
https://github.com/avlahop/dask-elk/blob/8a958a8a20e44c487f2a1c9b66a0710603b0295e/dask_elk/elk_entities/node.py#L36
=========
Except here:
https://github.com/avlahop/dask-elk/blob/8a958a8a20e44c487f2a1c9b66a0710603b0295e/dask_elk/client.py#L130
Note: I think my Elasticsearch cluster version is 7.1.1 (in case that helps).
Hello @ibnbay99, I opened a PR (#27). Can you try it with your setup and report whether it fixes your issue?
Thanks for the fast fix, but I think this leads to another error :pray:
df = client1.read(index=my_index)
Traceback (most recent call last):
File "/Users/dmg/opt/miniconda3/envs/es/lib/python3.6/site-packages/dask_elk/elk_entities/index.py", line 127, in __get_mappings
mapping = mappings["mappings"][doc_type]["properties"]
KeyError: None
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "
================
When I fill in doc_type:
df = client1.read(index=my_index, doc_type='_doc')
Traceback (most recent call last):
File "
What is the version of your Elasticsearch?
My AWS Elasticsearch cluster version is 7.1.1.
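The KeyError: None above is consistent with how Elasticsearch 7.x returns mappings: by default the response is typeless, so "properties" sits directly under "mappings" instead of under a document type as in 6.x. A minimal sketch of a lookup that tolerates both layouts follows. The helper name extract_properties and the sample dictionaries are illustrative assumptions, not dask-elk code.

```python
def extract_properties(index_mapping, doc_type=None):
    """Return the field properties from one index's mapping response.

    Hypothetical helper for illustration only; not part of dask-elk.
    """
    body = index_mapping["mappings"]
    # Typeless layout (the 7.x default): {"mappings": {"properties": {...}}}
    # Heuristic: assumes no document type is itself named "properties".
    if "properties" in body:
        return body["properties"]
    # Typed layout (6.x style): {"mappings": {doc_type: {"properties": {...}}}}
    if doc_type is None:
        raise ValueError("typed mapping found; a doc_type is required")
    return body[doc_type]["properties"]


# Sample responses, invented for the example.
typeless = {"mappings": {"properties": {"price": {"type": "float"}}}}
typed = {"mappings": {"_doc": {"properties": {"price": {"type": "float"}}}}}

print(extract_properties(typeless))
print(extract_properties(typed, doc_type="_doc"))
```

Both calls return the same properties dictionary, so the caller does not need to know which server version produced the response.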
Hi @avlahop, I tried to continue your work and I think it works.
I moved the wan_only option from here:
https://github.com/avlahop/dask-elk/blob/50841e63ac96d899eb36c9e25ede1061105e6c7f/dask_elk/client.py#L123
# Get nodes info first
node_registry = NodeRegistry()
node_registry.get_nodes_from_elastic(elk_client, self.wan_only)
=======
to here:
https://github.com/avlahop/dask-elk/blob/8a958a8a20e44c487f2a1c9b66a0710603b0295e/dask_elk/elk_entities/node.py#L45
so that it looks like this:
def get_nodes_from_elastic(self, elk_client, wan_only):
publish_address = node_info["http"]["publish_address"] if not wan_only else None
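To make the effect of that guard concrete, here is a small standalone sketch of the same conditional applied to plain dictionaries. The function name publish_address and both sample node entries are assumptions made up for illustration; only the conditional expression mirrors the change quoted above.

```python
def publish_address(node_info, wan_only):
    # With wan_only=True the node entry is never inspected, so a missing
    # "http" section (the KeyError: 'http' case) cannot raise.
    return None if wan_only else node_info["http"]["publish_address"]


# Illustrative node entries, not real cluster output.
managed_node = {"name": "node-1"}  # no "http" section
regular_node = {"name": "node-2", "http": {"publish_address": "10.0.0.2:9200"}}

print(publish_address(managed_node, wan_only=True))   # None
print(publish_address(regular_node, wan_only=False))  # 10.0.0.2:9200
```

With wan_only=False the lookup is unchanged, so a node entry without an "http" section would still raise KeyError in that mode.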
=========================
I then created a simple app to run it, and it works, with one adjustment to hosts when initializing DaskElasticClient:
- hosts must not be a list of strings when the wan_only option is True
Feel free to look at my fork:
https://github.com/ibnbay99/dask-elk/tree/bug/26/no_node_lookup_when_wan_only
I hope this PR doesn't break the other case, where the wan_only option is False :crossed_fingers:
The problem is that DaskElasticClient still tries to get the shards as they are distributed across the different data nodes. Do you want to open a new PR that merges into the upstream master?
For a faster merge, I think we can keep using your branch, and you can apply the 3-line change (see the comparison with my fork) from this link:
https://github.com/avlahop/dask-elk/compare/bug/26/no_node_lookup_when_wan_only...ibnbay99:bug/26/no_node_lookup_when_wan_only
What do you think, @avlahop?
I opened #28. Would you like to assign it to yourself? I don't know if you have the rights, but you can try.
@ibnbay99 have you tried #28 from your own branch? Is everything working?
Yes, it works, as in the last image I posted.
I'm not sure how to write a test for the case where the wan_only option is True. I just added docs for the wan_only part you commented on in the git commit.
That's because it runs inside the read function, right?
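One possible way to test the wan_only=True case without a live cluster is to pass in a mocked Elasticsearch client and assert that the node lookup is never called. The sketch below assumes a stand-in function, collect_publish_addresses, that only imitates the guarded lookup discussed in this thread; it is not dask-elk's real API, and the sample node data is invented.

```python
from unittest import mock


def collect_publish_addresses(elk_client, wan_only):
    """Hypothetical stand-in for the guarded node lookup; not dask-elk code."""
    if wan_only:
        # Skip node discovery entirely in WAN-only mode.
        return []
    nodes = elk_client.nodes.info()["nodes"]
    return [info["http"]["publish_address"] for info in nodes.values()]


def test_wan_only_skips_node_lookup():
    elk_client = mock.MagicMock()
    assert collect_publish_addresses(elk_client, wan_only=True) == []
    # The mocked client must not have been asked for node info.
    elk_client.nodes.info.assert_not_called()


def test_default_reads_publish_addresses():
    elk_client = mock.MagicMock()
    elk_client.nodes.info.return_value = {
        "nodes": {"n1": {"http": {"publish_address": "10.0.0.1:9200"}}}
    }
    result = collect_publish_addresses(elk_client, wan_only=False)
    assert result == ["10.0.0.1:9200"]
    elk_client.nodes.info.assert_called_once_with()
```

The same pattern should carry over to the real read function if the client object can be injected or patched, though how that is wired up depends on the library's internals.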