airflow icon indicating copy to clipboard operation
airflow copied to clipboard

Masking configuration values irrelevant to DAG author

Open amoghrajesh opened this issue 1 year ago • 14 comments

Some configurations are irrelevant to DAG authors and hence we need to mask those to avoid it from getting logged unknowingly.


^ Add meaningful description above Read the Pull Request Guidelines for more information. In case of fundamental code changes, an Airflow Improvement Proposal (AIP) is needed. In case of a new dependency, check compliance with the ASF 3rd Party License Policy. In case of backwards incompatible changes please leave a note in a newsfragment file, named {pr_number}.significant.rst or {issue_number}.significant.rst, in newsfragments.

amoghrajesh avatar Oct 15 '24 15:10 amoghrajesh

Just adding a unit test, and I think it's ready to go :)

potiuk avatar Oct 15 '24 15:10 potiuk

Struggling with some test related setup. I am running into this as of now:

../../tests_common/test_utils/db.py:23: in <module>
    from airflow.models import (
../../airflow/models/__init__.py:78: in __getattr__
    val = import_string(f"{path}.{name}")
../../airflow/utils/module_loading.py:39: in import_string
    module = import_module(module_path)
/Library/Developer/CommandLineTools/Library/Frameworks/Python3.framework/Versions/3.9/lib/python3.9/importlib/__init__.py:127: in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
../../airflow/models/dag.py:95: in <module>
    from airflow.models.abstractoperator import AbstractOperator, TaskStateChangeCallback
../../airflow/models/abstractoperator.py:33: in <module>
    from airflow.template.templater import Templater
../../airflow/template/templater.py:23: in <module>
    from airflow.io.path import ObjectStoragePath
../../airflow/io/__init__.py:30: in <module>
    from airflow.providers_manager import ProvidersManager
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ 

    """Manages all providers."""
    
    from __future__ import annotations
    
    import fnmatch
    import functools
    import inspect
    import json
    import logging
    import os
    import sys
    import traceback
    import warnings
    from dataclasses import dataclass
    from functools import wraps
    from time import perf_counter
    from typing import TYPE_CHECKING, Any, Callable, MutableMapping, NamedTuple, TypeVar
    
    from packaging.utils import canonicalize_name
    
    from airflow.exceptions import AirflowOptionalProviderFeatureException
>   from airflow.providers.standard.hooks.filesystem import FSHook
E   ModuleNotFoundError: No module named 'airflow.providers.standard.hooks'

I see some discussions around, does anyone have any suggestions for this? I use Pycharm

amoghrajesh avatar Oct 16 '24 05:10 amoghrajesh

I see some discussions around, does anyone have any suggestions for this? I use Pycharm

I believe (I have not yet had time to look at it) this requires to use uv workspace and the latest uv installed - @ashb - are there any instructions for that already somewhere ?

I guess those should be updated ?

  • https://github.com/apache/airflow/blob/main/contributing-docs/07_local_virtualenv.rst
  • https://github.com/apache/airflow/blob/main/contributing-docs/quick-start-ide/contributors_quick_start_pycharm.rst
  • https://github.com/apache/airflow/blob/main/contributing-docs/quick-start-ide/contributors_quick_start_vscode.rst

potiuk avatar Oct 16 '24 11:10 potiuk

I think we generally should provide rather comprehensive guide (at least with some links) to our contributors how to setup the venv now (Breeze is covered because it uses workspace installation internally I believe)? Am I right @ashb ?

potiuk avatar Oct 16 '24 11:10 potiuk

Breeze is all set up yes. UV workspace isn't 100% required yet (though I don't think anyone tested running tests from inside pycharm), but for pycharm to find imports you need to set up some paths as the right kind of folder in the UI

I'm not a pycharm user, but @kaxil @dstandish can provide some insight.

ashb avatar Oct 16 '24 11:10 ashb

https://github.com/apache/airflow/pull/42951 should help Pycharm too. Remaining "manual" steps as pycharm user are:

  1. We marked providers/src as Source Folder and providers/src/airflow as Namespace package.
  2. Verify that there is nothing on old provider path: rm -rf airflow/providers :slightly_smiling_face:
  3. Trigger Rescan Project Indexes from File -> Repair IDE.

ashb avatar Oct 16 '24 11:10 ashb

I just pulled in https://github.com/apache/airflow/pull/42951, it doesn't seem to fix it for me. I still keep getting


    """Manages all providers."""
    
    from __future__ import annotations
    
    import fnmatch
    import functools
    import inspect
    import json
    import logging
    import os
    import sys
    import traceback
    import warnings
    from dataclasses import dataclass
    from functools import wraps
    from time import perf_counter
    from typing import TYPE_CHECKING, Any, Callable, MutableMapping, NamedTuple, TypeVar
    
    from packaging.utils import canonicalize_name
    
    from airflow.exceptions import AirflowOptionalProviderFeatureException
>   from airflow.providers.standard.hooks.filesystem import FSHook
E   ModuleNotFoundError: No module named 'airflow.providers.standard'

amoghrajesh avatar Oct 16 '24 12:10 amoghrajesh

I'm not a pycharm user, but @kaxil @dstandish can provide some insight.

Anything in particular you wanted me to look at?

dstandish avatar Oct 16 '24 13:10 dstandish

@ashb @potiuk @kaxil I changed the logic to incorporate what was suggested above. Now I am able to get this result:

[2024-10-16, 18:39:13 IST] {logging_mixin.py:190} INFO - environ({'SHELL': '/bin/bash', 'AIRFLOW__SMTP__SMTP_PORT': '587', 'VERSION_SUFFIX_FOR_PYPI': '', 'NUM_RUNS': '', 'AIRFLOW__CELERY__RESULT_BACKEND': 'db+***', 'AIRFLOW_AUTH_MANAGER_CREDENTIAL_DIRECTORY': '/files', 'USE_AIRFLOW_VERSION': '', 'AIRFLOW__SMTP__SMTP_PASSWORD': 'dummy_smtp_password', 'AIRFLOW__SMTP__SMTP_USER': 'dummy_user', 'DB_RESET': 'false', 'TERM_PROGRAM_VERSION': '3.3a', 'CASS_DRIVER_NO_CYTHON': '1', 'DOCKER_IS_ROOTLESS': 'false', 'TMUX': '/root/.tmux/tmp/tmux-0/default,96,0', 'DEPENDENCIES_EPOCH_NUMBER': '11', 'HOSTNAME': '22a3f86e4c95', 'COLOR_RESET': '\x1b[0m', 'AIRFLOW__WEBSERVER__SECRET_KEY_SECRET': '***_secret', 'PYTHON_VERSION': '3.9.20', 'LANGUAGE': 'C.UTF-8', 'HOST_USER_ID': '20', 'FASTAPI_API_HOST_PORT': '29091', 'USE_PACKAGES_FROM_DIST': 'false', 'COLLECT_ONLY': 'false', 'PACKAGE_FORMAT': 'wheel', 'AIRFLOW__CORE__PLUGINS_FOLDER': '/files/plugins', 'AIRFLOW_CI_BUILD_EPOCH': '10', 'CHICKEN_EGG_PROVIDERS': '', 'MSSQL_HOST_PORT': '21433', 'SQLALCHEMY_WARN_20': 'true', 'CELERY_BROKER_URLS_MAP': "{'rabbitmq': 'amqp://guest:guest@rabbitmq:5672', 'redis': 'redis://redis:6379/0'}", 'AIRFLOW_CONSTRAINTS_LOCATION': '', 'DEV_MODE': 'false', 'AIRFLOW__CORE__DAGS_FOLDER': '/files/dags', 'COLOR_RED': '\x1b[31m', 'RUN_SYSTEM_TESTS': 'false', 'COLOR_BLUE': '\x1b[34m', 'AIRFLOW_ENABLE_AIP_44': 'true', 'AIRFLOW_CI_IMAGE': 'ghcr.io/apache/airflow/main/ci/python3.9', 'TEST_TYPE': '', 'ADDITIONAL_PATH': '~/.local/bin', 'EAGER_UPGRADE_ADDITIONAL_REQUIREMENTS': '', 'PROVIDERS_SKIP_CONSTRAINTS': 'false', 'BREEZE_INIT_COMMAND': '', 'PWD': '/opt/airflow', 'AIRFLOW_SKIP_CONSTRAINTS': 'false', 'AIRFLOW__CELERY__BROKER_URL': 'redis://redis:6379/0', 'REMOVE_ARM_PACKAGES': 'false', 'AIRFLOW_VERSION': '3.0.0.dev0', 'ENABLE_COVERAGE': 'false', 'VERBOSE_COMMANDS': 'false', 'AIRFLOW__CORE__LOAD_EXAMPLES': 'false', 'GITHUB_ACTIONS': 'false', 'INSTALL_MSSQL_CLIENT': 'true', 'CASS_DRIVER_BUILD_CONCURRENCY': '8', 'CI_EVENT_TYPE': 'pull_request', 'UPGRADE_IF_NEEDED': '--upgrade', 'INSTALL_MYSQL_CLIENT_TYPE': 'mariadb', 'BACKEND': 'mysql', 'CI_TARGET_BRANCH': 'main', 'PROVIDERS_CONSTRAINTS_MODE': 'constraints-source-providers', 'GUNICORN_CMD_ARGS': '--worker-tmp-dir /dev/shm/', 'AIRFLOW_EXTRAS': '', 'LD_PRELOAD': '/usr/lib/aarch64-linux-gnu/libstdc++.so.6', 'PIP_PROGRESS_BAR': 'on', 'AIRFLOW__DATABASE__SQL_ALCHEMY_CONN_SECRET': 'dummy_sql_alchemy_conn', 'HOME': '/root', 'QUIET': 'false', 'UV_NO_CACHE': 'true', 'LANG': 'C.UTF-8', 'SUSPENDED_PROVIDERS_FOLDERS': '', 'HELM_TEST_PACKAGE': '', 'TMUX_TMPDIR': '/root/.tmux/tmp', 'SQLITE_URL': 'sqlite:////root/airflow/sqlite/airflow.db', 'ISSUE_ID': '', 'AIRFLOW_HOME': '/root/airflow', 'MYSQL_VERSION': '8.0', 'AIRFLOW_BREEZE_CONFIG_DIR': '/files/airflow-breeze-config', 'POSTGRES_HOST_PORT': '25433', 'EXTRA_UNINSTALL_FLAGS': '--python /usr/local/bin/python', 'AIRFLOW_VERSION_SPECIFICATION': '', 'GPG_KEY': 'E3FF2839C048B25C084DEBE9B26995E310250568', 'AIRFLOW__DATABASE__SQL_ALCHEMY_CONN': '***', 'AIRFLOW_USE_UV': 'true', 'AIRFLOW_PRE_CACHED_PIP_PACKAGES': 'true', 'ADDITIONAL_DEV_APT_DEPS': 'bash-completion dumb-init git graphviz krb5-user less libenchant-2-2 libgcc-11-dev libgeos-dev libpq-dev net-tools netcat-openbsd openssh-server postgresql-client software-properties-common rsync tmux unzip vim xxd', 'COLOR_YELLOW': '\x1b[33m', 'ONLY_MIN_VERSION_UPDATE': 'false', 'AIRFLOW__CELERY__WORKER_CONCURRENCY': '8', 'STANDALONE_DAG_PROCESSOR': 'false', 'AIRFLOW__CORE__EXECUTOR': 'LocalExecutor', 'INSTALL_SELECTED_PROVIDERS': '', 'USE_XDIST': 'false', 'COMMIT_SHA': 'c03ccfc0716fdbf88eb778c920767b4d3953ac5b', 'CONSTRAINTS_GITHUB_REPOSITORY': 'apache/airflow', 'AIRFLOW__DATABASE__SQL_ALCHEMY_ENGINE_ARGS_SECRET': 'dummy_sql_engine_args', 'EXTRA_INSTALL_FLAGS': '--python /usr/local/bin/python', 'AIRFLOW_CI_IMAGE_WITH_TAG': 'ghcr.io/apache/airflow/main/ci/python3.9:latest', 'SKIP_SSH_SETUP': 'false', 'TMUX_SESSION': 'Airflow', 'AIRFLOW_REPO': 'apache/airflow', 'DEV_APT_COMMAND': '', 'AIRFLOW__WEBSERVER__SECRET_KEY': '***', 'AIRFLOW_PIP_VERSION': '24.2', 'ADDITIONAL_PIP_INSTALL_FLAGS': '', 'SKIP_ENVIRONMENT_INITIALIZATION': 'false', 'ADDITIONAL_DEV_APT_COMMAND': '', 'AIRFLOW_ENV': 'development', 'START_AIRFLOW': 'true', 'COLOR_GREEN': '\x1b[32m', 'AIRFLOW_BRANCH': 'main', 'CLEAN_AIRFLOW_INSTALLATION': 'false', 'FORCE_LOWEST_DEPENDENCIES': 'false', 'AIRFLOW__CORE__INTERNAL_API_SECRET_KEY_SECRET': '***_secret', 'HOST_GROUP_ID': '20', 'DEFAULT_BRANCH': 'main', 'INIT_SCRIPT_FILE': 'init.sh', 'AIRFLOW_CONSTRAINTS_MODE': 'constraints-source-providers', 'PYTHONPATH': '/opt/airflow', 'TERM': 'tmux-256color', 'UPGRADE_BOTO': 'false', 'DOWNGRADE_PENDULUM': 'false', 'USER': 'root', 'HOST_OS': 'darwin', 'USE_UV': 'true', 'TMUX_PANE': '%1', 'WEBSERVER_HOST_PORT': '28080', 'AIRFLOW_UV_VERSION': '0.4.22', 'AIRFLOW__CORE__INTERNAL_API_SECRET_KEY': '***', 'INSTALL_POSTGRES_CLIENT': 'true', 'COMPOSE_FILE': '/Users/adesai/Documents/OSS/airflow-repos/airflow/scripts/ci/docker-compose/base.yml:/Users/adesai/Documents/OSS/airflow-repos/airflow/scripts/ci/docker-compose/docker-socket.yml:/Users/adesai/Documents/OSS/airflow-repos/airflow/scripts/ci/docker-compose/backend-mysql.yml:/Users/adesai/Documents/OSS/airflow-repos/airflow/scripts/ci/docker-compose/backend-mysql-port.yml:/Users/adesai/Documents/OSS/airflow-repos/airflow/scripts/ci/docker-compose/files.yml:/Users/adesai/Documents/OSS/airflow-repos/airflow/scripts/ci/docker-compose/base-ports.yml:/Users/adesai/Documents/OSS/airflow-repos/airflow/scripts/ci/docker-compose/local.yml', 'SHLVL': '2', 'AIRFLOW_CONSTRAINTS_REFERENCE': 'constraints-main', 'DEFAULT_CONSTRAINTS_BRANCH': 'constraints-main', 'FILES_DIR': '/files', 'CI_BUILD_ID': '0', 'INSTALL_AIRFLOW_VERSION': '', 'LC_MESSAGES': 'C.UTF-8', 'SYSTEM_TESTS_ENV_ID': '', 'MOUNT_SOURCES': 'selected', 'PYTHONDONTWRITEBYTECODE': 'true', 'KUBECONFIG': '/files/.kube/config', 'UPGRADE_INVALIDATION_STRING': '', 'MYSQL_HOST_PORT': '23306', 'LC_CTYPE': 'C.UTF-8', 'AIRFLOW_IMAGE_KUBERNETES': 'ghcr.io/apache/airflow/main/kubernetes/python3.9', 'ANSWER': '', 'RUN_TESTS': 'false', 'LOAD_EXAMPLES': 'false', '_AIRFLOW_SKIP_DB_TESTS': 'false', 'DOWNGRADE_SQLALCHEMY': 'false', 'AIRFLOW_INSTALLATION_METHOD': '.', 'REDIS_HOST_PORT': '26379', 'LC_ALL': 'C.UTF-8', 'CI_TARGET_REPO': 'apache/airflow', 'CI_JOB_ID': '0', 'DATABASE_ISOLATION': 'false', 'INSTALL_MYSQL_CLIENT': 'true', 'SSH_PORT': '12322', 'PATH': '/files/bin/:/opt/airflow/scripts/in_container/bin/:/root/.local/bin:/usr/local/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/opt/airflow', 'PYTHON_MAJOR_MINOR_VERSION': '3.9', '_AIRFLOW_RUN_DB_TESTS_ONLY': 'false', 'AIRFLOW__CORE__DATASET_MANAGER_KWARGS_SECRET': 'dummy_dataset_manager_kwargs', 'UPGRADE_EAGERLY': '--upgrade --resolution highest', 'CELERY_FLOWER': 'false', 'INSTALL_AIRFLOW_WITH_CONSTRAINTS': 'true', 'CI': 'false', 'PACKAGING_TOOL': '', 'PACKAGING_TOOL_CMD': 'uv pip', 'LOAD_DEFAULT_CONNECTIONS': 'false', 'FLOWER_HOST_PORT': '25555', 'PIP_NO_CACHE_DIR': 'true', 'BASE_BRANCH': 'main', 'PYTHON_BASE_IMAGE': 'python:3.9-slim-bookworm', 'ENABLED_SYSTEMS': '', 'AIRFLOW__LOGGING__REMOTE_TASK_HANDLER_KWARGS_SECRET': 'dummy_remote_task_handler_kwargs', 'AIRFLOW__SMTP__SMTP_PASSWORD_SECRET': 'dummy_smtp_password_secret', 'UV_HTTP_TIMEOUT': '300', 'BREEZE': 'true', 'DRILL_HOST_PORT': '28047', 'BUILD_ID': '0', 'AIRFLOW__SENTRY__SENTRY_DSN_SECRET': 'dummy_sentry_dsn', 'AIRFLOW__CORE__FERNET_KEY': '***', 'POSTGRES_VERSION': '12', 'DEBIAN_FRONTEND': 'noninteractive', 'EXCLUDED_PROVIDERS': '', 'UV_CONCURRENT_DOWNLOADS': '10', 'OLDPWD': '/opt/airflow', 'AIRFLOW__SMTP__SMTP_SSL': 'True', 'AIRFLOW_SOURCES': '/opt/airflow', 'VERBOSE': 'false', 'TERM_PROGRAM': 'tmux', 'AIRFLOW__SMTP__SMTP_HOST': 'smtp.dummy.com', 'REGENERATE_MISSING_DOCS': 'false', '_': '/usr/local/bin/airflow', '_AIRFLOW_PARSING_CONTEXT_DAG_ID': 'os-env-variable', '_AIRFLOW_PARSING_CONTEXT_TASK_ID': 'test_me', 'AIRFLOW_CTX_DAG_OWNER': 'airflow', 'AIRFLOW_CTX_DAG_ID': 'os-env-variable', 'AIRFLOW_CTX_TASK_ID': 'test_me', 'AIRFLOW_CTX_EXECUTION_DATE': '2024-10-16T13:09:12.855444+00:00', 'AIRFLOW_CTX_TRY_NUMBER': '1', 'AIRFLOW_CTX_DAG_RUN_ID': 'manual__2024-10-16T13:09:12.855444+00:00'})

From my original list, these aren't masked:

    "AIRFLOW__SMTP__SMTP_PORT",
    "AIRFLOW__SMTP__SMTP_PASSWORD",
    "AIRFLOW__SMTP__SMTP_PASSWORD_SECRET",
    "AIRFLOW__SMTP__SMTP_USER",
    "AIRFLOW__SMTP__SMTP_SSL",
    "AIRFLOW__SMTP__SMTP_HOST",
    "AIRFLOW__SENTRY__SENTRY_DSN_SECRET",
    "AIRFLOW__DATABASE__SQL_ALCHEMY_CONN_SECRET",
    "AIRFLOW__DATABASE__SQL_ALCHEMY_ENGINE_ARGS_SECRET",
    "AIRFLOW__CORE__DATASET_MANAGER_KWARGS_SECRET",
    "AIRFLOW__LOGGING__REMOTE_TASK_HANDLER_KWARGS_SECRET",

As per the comments above, looks like AIRFLOW__SMTP__SMTP_PORT, AIRFLOW__SMTP__SMTP_USER (removed), AIRFLOW__SMTP__SMTP_PASSWORD (removed), arent relevant.

I think:

  • AIRFLOW__SMTP__SMTP_USER
  • AIRFLOW__SMTP__SMTP_SSL
  • AIRFLOW__SMTP__SMTP_HOST

Are ok to be unmasked. That leaves us with:

    "AIRFLOW__SENTRY__SENTRY_DSN_SECRET",
    "AIRFLOW__DATABASE__SQL_ALCHEMY_CONN_SECRET",
    "AIRFLOW__DATABASE__SQL_ALCHEMY_ENGINE_ARGS_SECRET",
    "AIRFLOW__CORE__DATASET_MANAGER_KWARGS_SECRET",
    "AIRFLOW__LOGGING__REMOTE_TASK_HANDLER_KWARGS_SECRET",

Should these be masked or its alright to keep them unmasked?

amoghrajesh avatar Oct 16 '24 14:10 amoghrajesh

Should these be masked or its alright to keep them unmasked?

https://github.com/apache/airflow/pull/43040#discussion_r1802935540

ashb avatar Oct 16 '24 14:10 ashb

Okay, in that case, this should be ok

amoghrajesh avatar Oct 16 '24 14:10 amoghrajesh

I just pulled in #42951, it doesn't seem to fix it for me. I still keep getting


    """Manages all providers."""
    
    from __future__ import annotations
    
    import fnmatch
    import functools
    import inspect
    import json
    import logging
    import os
    import sys
    import traceback
    import warnings
    from dataclasses import dataclass
    from functools import wraps
    from time import perf_counter
    from typing import TYPE_CHECKING, Any, Callable, MutableMapping, NamedTuple, TypeVar
    
    from packaging.utils import canonicalize_name
    
    from airflow.exceptions import AirflowOptionalProviderFeatureException
>   from airflow.providers.standard.hooks.filesystem import FSHook
E   ModuleNotFoundError: No module named 'airflow.providers.standard'

Fixed in https://github.com/apache/airflow/pull/43082

kaxil avatar Oct 16 '24 14:10 kaxil

@potiuk @ashb added a unit test here. Can you take a look at this when you have some time?

amoghrajesh avatar Oct 17 '24 03:10 amoghrajesh

Hey @ashb. Left a comment on your last review, can you please take a look when you have some time?

amoghrajesh avatar Oct 19 '24 06:10 amoghrajesh

@ashb you are right, it didn't work the last time and I landed at circular imports due to a top level import. It works now, pushed the new changes. Still assuming your +1 as I handled your review comments.

@potiuk / @kaxil could you take a look when you have some time?

amoghrajesh avatar Oct 22 '24 04:10 amoghrajesh

@ashb does it look good to go? Got a green CI

amoghrajesh avatar Oct 22 '24 15:10 amoghrajesh

Looks like all the comments are resolved and address by @amoghrajesh . LGTM. @kaxil @ashb - are you ok with merging it?

potiuk avatar Oct 23 '24 09:10 potiuk

Computer says no

Screenshot_20241023-164334.png

But yes, good to merge

ashb avatar Oct 23 '24 15:10 ashb

This change has caused the @suppress_logs_and_warning decorator to not work for CLI commands and hence CLI commands now show warnings. Example: https://github.com/apache/airflow/pull/43334

kaxil avatar Oct 23 '24 23:10 kaxil

Fix here: https://github.com/apache/airflow/pull/43335

kaxil avatar Oct 23 '24 23:10 kaxil

@amoghrajesh , shouldn't the list of sensitive_config_values be updated with the list of keys that need to redact ? Line: https://github.com/amoghrajesh/airflow/blob/840ea3efb9533837e9f36b75fa527a0fbafeb23a/airflow/configuration.py#L121

saurabhb-dev avatar Oct 24 '24 08:10 saurabhb-dev

@amoghrajesh , shouldn't the list of sensitive_config_values be updated with the list of keys that need to redact ? Line: https://github.com/amoghrajesh/airflow/blob/840ea3efb9533837e9f36b75fa527a0fbafeb23a/airflow/configuration.py#L121

Not really @saurabhb-dev , It's even better to mask values in this case, rather than keys.

The secrets_masker works in two modes:

  • for known structures it can mask sensitive key values (for example if a key in connection is "password" - it will mask the value when it prints the structure (when it is aware of the structure and acts on it)

  • but this is impossible when you log messages when the structures or values have been already converted to log string - by the time secrets masker in logger (installed as filter) gets the message, the structure is gone already (converted to string representation) and we are not aware that particular key corresponds to particular value. So our secrets masker has the second mode - where it can mask specific values. It will scan the string for all the values that are registered upfront and mask them if it finds any of them in the string.

This is what happens here - we retrieve all the secret config values (whether they come by env vars or by other means) and we add values (i.e. actual secrets) to be masked. This way when any secret is printed anywhere where secrets_masker is used, it will automatically mask it - regardless if it is a structure (secrets_masker checks values of dicts for example) or whether it's already converted to string.

potiuk avatar Oct 24 '24 09:10 potiuk

figure i'd ask here before creating a new issue

https://github.com/apache/airflow/blob/c99887ec11ce3e1a43f2794fcf36d27555140f00/airflow/configuration.py#L857-L859

self.sensitive_config_values returns both ('database', 'sql_alchemy_conn') and ('core', 'sql_alchemy_conn')

hence the self.get() raises a warning:

airflow/configuration.py:859 FutureWarning: section/key [core/sql_alchemy_conn] has been deprecated, you should use[database/sql_alchemy_conn] instead. Please update your `conf.get*` call to use the new name

which is a bit confusing since i removed all references to core.sql_alchemy_conn long ago

zachliu avatar Nov 07 '24 04:11 zachliu

@zachliu we do not return ('database', 'sql_alchemy_conn') as part of self.sensitive_config_values. Here are the sensitive values: https://github.com/amoghrajesh/airflow/blob/840ea3efb9533837e9f36b75fa527a0fbafeb23a/airflow/configuration.py#L122-L129

The warning comes from configuration.py#919

amoghrajesh avatar Nov 07 '24 05:11 amoghrajesh

@amoghrajesh i beg to differ

  1. the warning says configuration.py:859
  2. i tested with a print statement at these locations:
    • configuration.py:859
    • configuration.py:323

i'm on https://github.com/apache/airflow/tree/2.10.3

zachliu avatar Nov 07 '24 12:11 zachliu

i think i know what's going on here. somehow https://github.com/apache/airflow/commit/7dea23fd23fa347143fded6979879cf5f58fa132 (https://github.com/apache/airflow/pull/42126) didn't make the cut for 2.10.3, which caused the self.sensitive_config_values to return both ('database', 'sql_alchemy_conn') and ('core', 'sql_alchemy_conn')

zachliu avatar Nov 07 '24 12:11 zachliu

@zachliu from the PR it doesnt look like it was targetted for 2.10.3. I will leave a comment however

amoghrajesh avatar Nov 07 '24 15:11 amoghrajesh

I created a new issue to track this: https://github.com/apache/airflow/issues/43794

zachliu avatar Nov 07 '24 16:11 zachliu