
Airflow LocalExecutor issues with S3 logging

Open valeriozhang opened this issue 5 years ago • 6 comments

I'm having issues with Airflow's AWS S3 logging: it says it is unable to load the credentials. I added my access key and secret key to aws_default as the username and password, respectively, and also created a new connection called MyS3Conn with the login/username empty and the extra args formatted as below. I'm still receiving the error. Any ideas?

{ "aws_access_key_id":"XXXX", "aws_secret_access_key": "XXXX" }

webserver_1 | [2019-08-01 16:36:23,291] {{s3_task_handler.py:173}} ERROR - Could not write logs to s3://mdland-airflow/logs/DataConversion_ECW_Cloud/Test/2019-08-01T16:36:11.263749+00:00/1.log
webserver_1 | Traceback (most recent call last):
webserver_1 |   File "/usr/local/lib/python3.6/site-packages/airflow/utils/log/s3_task_handler.py", line 170, in s3_write
webserver_1 |     encrypt=configuration.conf.getboolean('core', 'ENCRYPT_S3_LOGS'),
webserver_1 |   File "/usr/local/lib/python3.6/site-packages/airflow/hooks/S3_hook.py", line 382, in load_string
webserver_1 |     encrypt=encrypt)
webserver_1 |   File "/usr/local/lib/python3.6/site-packages/airflow/hooks/S3_hook.py", line 422, in load_bytes
webserver_1 |     client.upload_fileobj(filelike_buffer, bucket_name, key, ExtraArgs=extra_args)
webserver_1 |   File "/usr/local/airflow/.local/lib/python3.6/site-packages/boto3/s3/inject.py", line 539, in upload_fileobj
webserver_1 |     return future.result()
webserver_1 |   File "/usr/local/airflow/.local/lib/python3.6/site-packages/s3transfer/futures.py", line 106, in result
webserver_1 |     return self._coordinator.result()
webserver_1 |   File "/usr/local/airflow/.local/lib/python3.6/site-packages/s3transfer/futures.py", line 265, in result
webserver_1 |     raise self._exception
webserver_1 |   File "/usr/local/airflow/.local/lib/python3.6/site-packages/s3transfer/tasks.py", line 126, in __call__
webserver_1 |     return self._execute_main(kwargs)
webserver_1 |   File "/usr/local/airflow/.local/lib/python3.6/site-packages/s3transfer/tasks.py", line 150, in _execute_main
webserver_1 |     return_value = self._main(**kwargs)
webserver_1 |   File "/usr/local/airflow/.local/lib/python3.6/site-packages/s3transfer/upload.py", line 692, in _main
webserver_1 |     client.put_object(Bucket=bucket, Key=key, Body=body, **extra_args)
webserver_1 |   File "/usr/local/airflow/.local/lib/python3.6/site-packages/botocore/client.py", line 357, in _api_call
webserver_1 |     return self._make_api_call(operation_name, kwargs)
webserver_1 |   File "/usr/local/airflow/.local/lib/python3.6/site-packages/botocore/client.py", line 648, in _make_api_call
webserver_1 |     operation_model, request_dict, request_context)
webserver_1 |   File "/usr/local/airflow/.local/lib/python3.6/site-packages/botocore/client.py", line 667, in _make_request
webserver_1 |     return self._endpoint.make_request(operation_model, request_dict)
webserver_1 |   File "/usr/local/airflow/.local/lib/python3.6/site-packages/botocore/endpoint.py", line 102, in make_request
webserver_1 |     return self._send_request(request_dict, operation_model)
webserver_1 |   File "/usr/local/airflow/.local/lib/python3.6/site-packages/botocore/endpoint.py", line 132, in _send_request
webserver_1 |     request = self.create_request(request_dict, operation_model)
webserver_1 |   File "/usr/local/airflow/.local/lib/python3.6/site-packages/botocore/endpoint.py", line 116, in create_request
webserver_1 |     operation_name=operation_model.name)
webserver_1 |   File "/usr/local/airflow/.local/lib/python3.6/site-packages/botocore/hooks.py", line 356, in emit
webserver_1 |     return self._emitter.emit(aliased_event_name, **kwargs)
webserver_1 |   File "/usr/local/airflow/.local/lib/python3.6/site-packages/botocore/hooks.py", line 228, in emit
webserver_1 |     return self._emit(event_name, kwargs)
webserver_1 |   File "/usr/local/airflow/.local/lib/python3.6/site-packages/botocore/hooks.py", line 211, in _emit
webserver_1 |     response = handler(**kwargs)
webserver_1 |   File "/usr/local/airflow/.local/lib/python3.6/site-packages/botocore/signers.py", line 90, in handler
webserver_1 |     return self.sign(operation_name, request)
webserver_1 |   File "/usr/local/airflow/.local/lib/python3.6/site-packages/botocore/signers.py", line 157, in sign
webserver_1 |     auth.add_auth(request)
webserver_1 |   File "/usr/local/airflow/.local/lib/python3.6/site-packages/botocore/auth.py", line 425, in add_auth
webserver_1 |     super(S3SigV4Auth, self).add_auth(request)
webserver_1 |   File "/usr/local/airflow/.local/lib/python3.6/site-packages/botocore/auth.py", line 357, in add_auth
webserver_1 |     raise NoCredentialsError
webserver_1 | botocore.exceptions.NoCredentialsError: Unable to locate credentials

valeriozhang avatar Aug 01 '19 16:08 valeriozhang
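
For anyone hitting this, one way to see what Airflow actually resolves for that connection is a quick check from a Python shell inside the webserver or worker container. This is only a sketch (the conn_id MyS3Conn comes from the post above), using BaseHook plus boto3, which is already installed since the S3 task handler depends on it:

    # hypothetical debug snippet: docker exec -it <webserver container> python
    from airflow.hooks.base_hook import BaseHook
    import boto3

    # does Airflow find the connection, and does its extra carry the keys?
    conn = BaseHook.get_connection('MyS3Conn')
    print('login set:', bool(conn.login))
    print('extra keys:', sorted(conn.extra_dejson.keys()))

    # the S3 task handler ultimately writes through boto3, so also check
    # whether boto3's default credential chain finds anything on its own
    print('boto3 default chain:', boto3.Session().get_credentials() is not None)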

Hi, in your docker-compose file, do you have environment entries such as AIRFLOW__CORE__REMOTE_LOG_CONN_ID='MyS3Conn'?

enys avatar Aug 14 '19 08:08 enys
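
One quick way to confirm that such an environment entry actually reaches Airflow's configuration inside the container is to read it back with the same accessor the task handler uses in the traceback above. A small hypothetical check:

    # run inside the webserver/worker container
    from airflow import configuration

    print(configuration.conf.get('core', 'remote_logging'))
    print(configuration.conf.get('core', 'remote_log_conn_id'))
    print(configuration.conf.get('core', 'remote_base_log_folder'))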

@valeriozhang have you been able to resolve this? I am having a similar issue: I have set up all the necessary env vars as well as the Airflow connection (s3_default) and am receiving the same error.

AIRFLOW__CORE__REMOTE_LOGGING=True
AIRFLOW__CORE__REMOTE_BASE_LOG_FOLDER=s3://my-bucket/logs
AIRFLOW__CORE__REMOTE_LOG_CONN_ID=s3_default
AIRFLOW__CORE__ENCRYPT_S3_LOGS=False

aamangeldi avatar Oct 11 '19 18:10 aamangeldi

@aamangeldi hellooo good sir, yes:

        - AIRFLOW__CORE__REMOTE_LOGGING=True
        - AIRFLOW__CORE__REMOTE_BASE_LOG_FOLDER=s3://my-bucket/logs
        - AIRFLOW__CORE__ENCRYPT_S3_LOGS=False
        - AWS_ACCESS_KEY_ID=XXXXXX
        - AWS_SECRET_ACCESS_KEY=XXXXXXX

All I did was hard-code the AWS keys, which is not smart at all; you should sort out your environment variables properly. I didn't look into it further, but you can print your OS AWS keys inside a task and see whether you are really getting the correct AWS keys.

valeriozhang avatar Oct 11 '19 18:10 valeriozhang
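
A minimal sketch of the check suggested above, written as an actual task (the DAG and task names here are made up, and this assumes the stock PythonOperator from Airflow 1.10):

    # debug_aws_env.py -- hypothetical DAG, drop it into the dags/ folder
    import os
    from datetime import datetime

    from airflow import DAG
    from airflow.operators.python_operator import PythonOperator

    def print_aws_env():
        # only report whether the variables are visible; don't log the secrets themselves
        for name in ('AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY'):
            value = os.environ.get(name)
            print(name, '->', 'set (%d chars)' % len(value) if value else 'MISSING')

    dag = DAG('debug_aws_env', start_date=datetime(2019, 1, 1), schedule_interval=None)

    PythonOperator(task_id='print_aws_env', python_callable=print_aws_env, dag=dag)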

@valeriozhang thank you for following up! I'm on puckel/docker-airflow:1.10.4, and what worked for me is setting up the connection URI the following way (instead of hardcoding AWS_ACCESS_KEY_ID and AWS_SECRET_ACCESS_KEY as separate env vars):

AIRFLOW_CONN_S3_LOGS=s3://AWS_ACCESS_KEY_ID:AWS_SECRET_ACCESS_KEY@S3

aamangeldi avatar Oct 11 '19 19:10 aamangeldi
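
For reference, with an env var named AIRFLOW_CONN_S3_LOGS the connection id Airflow looks up would normally be s3_logs, so AIRFLOW__CORE__REMOTE_LOG_CONN_ID would presumably need to point at that id as well. A hypothetical check that the URI parses into a usable connection:

    # run inside the container; the conn_id mirrors the AIRFLOW_CONN_S3_LOGS suffix
    from airflow.hooks.base_hook import BaseHook

    conn = BaseHook.get_connection('s3_logs')
    print(conn.conn_type)                          # expected: s3
    print(bool(conn.login), bool(conn.password))   # key and secret parsed out of the URI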

What I really want to be able to do is specify a role ARN and AWS account ID instead of the access keys. Unfortunately, it doesn't look like that's possible at the moment :(

aamangeldi avatar Oct 11 '19 19:10 aamangeldi

I am having the same issue when trying to run the Docker container on an EC2 instance. I am using network mode host and have set remote logging as follows:

AIRFLOW__CORE__REMOTE_LOGGING=True
AIRFLOW__CORE__REMOTE_BASE_LOG_FOLDER="s3://${AIRFLOW_BUCKET_NAME}/logs"
AIRFLOW__CORE__REMOTE_LOG_CONN_ID=aws_default

Among other things, I've made sure the containers run with network mode host so that the metadata service can be reached.

mng1dev avatar Oct 16 '19 08:10 mng1dev
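
When the credentials are supposed to come from the EC2 instance profile rather than keys, a quick hypothetical check from inside the running container is whether boto3's default chain (which includes the instance metadata service) finds anything at all:

    # docker exec -it <container> python
    import boto3

    creds = boto3.Session().get_credentials()
    if creds is None:
        print('no credentials found; the metadata service is probably not reachable from the container')
    else:
        print('credentials resolved; access key starts with', creds.access_key[:4])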