Bug Report - Can't get databuilder working with ECS. (Version mismatch)

Open cdechery opened this issue 2 years ago • 3 comments

I have noticed that the container versions used for ECS are not the same as the "official" Amundsen version, which is far ahead. One of the problems with that is that I can't seem to get databuilder running. Everything is set up, built, and installed properly, but I keep running into errors when I run it.

Important note: I have managed to get it working on some occasions in the past by changing the versions of some libs in requirements.txt. I took note of these changes and thought they were enough, but now even those aren't getting the job done.

I have tried databuilder versions 6.5.2, 6.74, 6.10, and 7.2.0. None of them work. The error below keeps popping up; it seems to be an inability to handle Elasticsearch 6.7, which is the version currently used by the ECS-ready containers.

WARNING:elasticsearch:PUT http://xxxxxxx:9200/table_7adb2685-0388-416b-aa4b-e1108a466303 [status:400 request:0.023s]
Traceback (most recent call last):
  File "example/scripts/sample_data_loader.py", line 407, in <module>
    job_es_table.launch()
  File "/home/ec2-user/amundsen/databuilder/lib64/python3.7/site-packages/amundsen_databuilder-7.1.2-py3.7.egg/databuilder/job/job.py", line 76, in launch
  File "/home/ec2-user/amundsen/databuilder/lib64/python3.7/site-packages/amundsen_databuilder-7.1.2-py3.7.egg/databuilder/job/job.py", line 72, in launch
  File "/home/ec2-user/amundsen/databuilder/lib64/python3.7/site-packages/amundsen_databuilder-7.1.2-py3.7.egg/databuilder/publisher/base_publisher.py", line 40, in publish
  File "/home/ec2-user/amundsen/databuilder/lib64/python3.7/site-packages/amundsen_databuilder-7.1.2-py3.7.egg/databuilder/publisher/base_publisher.py", line 37, in publish
  File "/home/ec2-user/amundsen/databuilder/lib64/python3.7/site-packages/amundsen_databuilder-7.1.2-py3.7.egg/databuilder/publisher/elasticsearch_publisher.py", line 104, in publish_impl
  File "/home/ec2-user/amundsen/databuilder/lib64/python3.7/site-packages/elasticsearch/client/utils.py", line 347, in _wrapped
    return func(*args, params=params, headers=headers, **kwargs)
  File "/home/ec2-user/amundsen/databuilder/lib64/python3.7/site-packages/elasticsearch/client/indices.py", line 146, in create
    "PUT", _make_path(index), params=params, headers=headers, body=body
  File "/home/ec2-user/amundsen/databuilder/lib64/python3.7/site-packages/elasticsearch/transport.py", line 466, in perform_request
    raise e
  File "/home/ec2-user/amundsen/databuilder/lib64/python3.7/site-packages/elasticsearch/transport.py", line 434, in perform_request
    timeout=timeout,
  File "/home/ec2-user/amundsen/databuilder/lib64/python3.7/site-packages/elasticsearch/connection/http_urllib3.py", line 291, in perform_request
    self._raise_error(response.status, raw_data)
  File "/home/ec2-user/amundsen/databuilder/lib64/python3.7/site-packages/elasticsearch/connection/base.py", line 329, in _raise_error
    status_code, error_message, additional_info
elasticsearch.exceptions.RequestError: RequestError(400, 'mapper_parsing_exception', 'Root mapping definition has unsupported parameters: [schema : {analyzer=simple, type=text, fields={raw={type=keyword}}}] [cluster : {analyzer=simple, type=text, fields={raw={type=keyword}}}] [description : {analyzer=simple, type=text}] [display_name : {type=keyword}] [column_descriptions : {analyzer=simple, type=text}] [programmatic_descriptions : {analyzer=simple, type=text}] [tags : {type=keyword}] [badges : {type=keyword}] [database : {analyzer=simple, type=text, fields={raw={type=keyword}}}] [total_usage : {type=long}] [name : {analyzer=simple, type=text, fields={raw={type=keyword}}}] [last_updated_timestamp : {format=epoch_second, type=date}] [unique_usage : {type=long}] [column_names : {analyzer=simple, type=text, fields={raw={normalizer=column_names_normalizer, type=keyword}}}] [key : {type=keyword}]')
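
One possible reading of this error, offered as an assumption rather than a confirmed diagnosis: Elasticsearch 6.x generally expects field definitions to sit under a mapping type name, while 7.x-style request bodies put `properties` directly under `mappings`. If a 7.x-style body reaches a 6.x cluster, `properties` may be taken for the type name and the field names reported as unsupported root parameters, which would match the message above. The sketch below wraps a typeless mapping under a type name. `wrap_mapping_for_es6` and the `_doc` default are hypothetical names chosen for illustration; they are not part of databuilder.

```python
import copy


def wrap_mapping_for_es6(index_body, type_name="_doc"):
    """Return a copy of an index-creation body with a typeless mapping
    nested under a mapping type name.

    Hypothetical helper for illustration; not part of databuilder.
    """
    body = copy.deepcopy(index_body)
    mappings = body.get("mappings", {})
    # A typeless (7.x-style) mapping has "properties" at the top level.
    if "properties" in mappings:
        body["mappings"] = {type_name: mappings}
    return body


# Trimmed-down body using two of the fields named in the error message.
typeless_body = {
    "mappings": {
        "properties": {
            "schema": {"type": "text", "analyzer": "simple"},
            "total_usage": {"type": "long"},
        }
    }
}

es6_body = wrap_mapping_for_es6(typeless_body)
print(list(es6_body["mappings"].keys()))
```

The helper leaves the input untouched and does nothing to a body that is already nested under a type, so it should be safe to apply more than once. Whether patching the mapping is preferable to pinning a matching client version depends on the setup.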

cdechery avatar Aug 31 '22 16:08 cdechery

Thanks for opening your first issue here!

boring-cyborg[bot] avatar Aug 31 '22 16:08 boring-cyborg[bot]

I had a similar error, which I think was caused by an incompatible Elasticsearch version. I use OpenSearch with Elasticsearch 7.10 (the last Elasticsearch version provided by the AWS OpenSearch Service). I'm using amundsen-databuilder = "^6.7.4" with the Python Elasticsearch client pinned to elasticsearch = "7.13.4", and it worked for me. In my opinion, you should choose the version of the elasticsearch Python package according to your Elasticsearch version.
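
As a rough illustration of the advice above, the check below compares the major version of the Python client with that of the cluster. It is a minimal sketch, assuming that matching major versions is the relevant rule; `major_version` and `client_matches_server` are hypothetical helper names, and the version strings are the ones mentioned in this thread.

```python
def major_version(version):
    """Return the leading integer of a dotted version string such as '7.13.4'."""
    return int(version.split(".")[0])


def client_matches_server(client_version, server_version):
    """True when the Python client and the cluster share a major version."""
    return major_version(client_version) == major_version(server_version)


# Client 7.13.4 against a 7.10 cluster, as in the comment above.
print(client_matches_server("7.13.4", "7.10"))  # True
# The same client against the 6.7 cluster used by the ECS containers.
print(client_matches_server("7.13.4", "6.7"))   # False
```

In a live setup the server version can usually be read from the `version.number` field returned by the client's `info()` call, and the client version from the installed package metadata.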

ggirodda avatar Sep 12 '22 17:09 ggirodda

I got it working too, with databuilder 6.7.4. This is definitely the databuilder version that works with ECS.

But I had to tweak some library versions as well; the requirements.txt that comes with 6.7.4 will not work by itself. It will either fail to build because of compatibility issues or fail to load data into Elasticsearch because of a version mismatch. After some tweaking I got it working. I can share my version of requirements.txt here if someone needs help setting this up.

cdechery avatar Sep 16 '22 12:09 cdechery

Is this issue gonna get some attention anytime soon? It seems to me no one really cares about ECS, since it is really outdated compared to the rest of the project. I have created a CloudFormation template that deploys Amundsen automatically in a highly available environment. I could contribute it to the project if it is of interest to the community.

cdechery avatar Sep 22 '22 11:09 cdechery

Hello Christian, would you mind sharing the frontend/metadata/search/neo4j versions you are running? I'm looking to do a fresh deploy, but the version mismatches do seem a little confusing to me. Thank you in advance.

caiopavanelli avatar Sep 23 '22 19:09 caiopavanelli

Hi @caiopavanelli, to get Amundsen working with ECS I used this file for reference; it lists the versions of all of Amundsen's containers: https://github.com/amundsen-io/amundsen/blob/main/docs/installation-aws-ecs/docker-ecs-amundsen.yml But as you can see, they are really outdated. So when you try to get databuilder working with it, for example, you run into a nightmare of version mismatches between the containers, Amundsen's release, the Python libraries, etc.

cdechery avatar Sep 29 '22 13:09 cdechery

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

stale[bot] avatar Oct 14 '22 04:10 stale[bot]

@cdechery - would you be able to share your updated requirements? I'm facing a similar issue on my end getting it to work with databuilder.

saranathak avatar Oct 19 '22 01:10 saranathak

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs.

stale[bot] avatar Nov 02 '22 05:11 stale[bot]

This issue has been automatically closed for inactivity. If you still wish to make these changes, please open a new pull request or reopen this one.

stale[bot] avatar Nov 27 '22 05:11 stale[bot]

@cdechery would you be able to share your updated requirements? I'm facing a similar issue on my end getting it to work with databuilder.

annamalaikasi avatar Apr 04 '23 22:04 annamalaikasi