Unable to install pip requirements via extraVolumeMounts due to Read-only file system

Open Gdtav opened this issue 1 year ago • 17 comments

Name and Version

bitnami/airflow 18.3.9

What architecture are you using?

amd64

What steps will reproduce the bug?

Install the Helm Chart with a requirements.txt file mounted as a configMap (extraDeploy) using extraVolumes and extraVolumeMounts as described in the documentation.

Are you using any custom parameters or values?

relevant values:

extraVolumes:
  - name: requirements-volume
    configMap:
      name: airflow-requirements
extraVolumeMounts:
  - name: requirements-volume
    mountPath: /bitnami/python
extraDeploy:
  - apiVersion: v1
    kind: ConfigMap
    metadata:
      name: airflow-requirements
    data:
      requirements.txt: |
        apache-airflow[google]
        apache-airflow-providers-postgres
        pandas
        google-cloud-bigquery
        openlineage-python
        pandas-gbq

What is the expected behavior?

During the first initialization, the scheduler, web and worker containers should execute pip install -r /bitnami/python/requirements.txt successfully and install the dependencies required by my DAGs.

What do you see instead?

This is the container log, which repeats in a crash loop (I truncated most of the "Requirement already satisfied" lines):

airflow-scheduler 15:47:58.77 INFO  ==> 
airflow-scheduler 15:47:58.78 INFO  ==> Welcome to the Bitnami airflow-scheduler container
airflow-scheduler 15:47:58.78 INFO  ==> Subscribe to project updates by watching https://github.com/bitnami/containers
airflow-scheduler 15:47:58.78 INFO  ==> Submit issues and feature requests at https://github.com/bitnami/containers/issues
airflow-scheduler 15:47:58.78 INFO  ==> Upgrade to Tanzu Application Catalog for production environments to access custom-configured and pre-packaged software components. Gain enhanced features, including Software Bill of Materials (SBOM), CVE scan result reports, and VEX documents. To learn more, visit https://bitnami.com/enterprise
airflow-scheduler 15:47:58.79 INFO  ==> 
airflow-scheduler 15:47:58.79 INFO  ==> Enabling non-root system user with nss_wrapper
WARNING: The directory '/opt/bitnami/airflow/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag.
Requirement already satisfied: apache-airflow-providers-postgres in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from -r /bitnami/python/requirements.txt (line 2)) (5.11.1)
Requirement already satisfied: pandas in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from -r /bitnami/python/requirements.txt (line 3)) (2.1.4)
Requirement already satisfied: google-cloud-bigquery in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from -r /bitnami/python/requirements.txt (line 4)) (3.20.1)
Collecting openlineage-python (from -r /bitnami/python/requirements.txt (line 5))
  Downloading openlineage_python-1.18.0-py3-none-any.whl.metadata (1.7 kB)
Requirement already satisfied: pandas-gbq in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from -r /bitnami/python/requirements.txt (line 6)) (0.23.0)
Requirement already satisfied: apache-airflow[google] in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from -r /bitnami/python/requirements.txt (line 1)) (2.9.2)
[...]
Requirement already satisfied: pydantic-core==2.18.4 in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from pydantic<3->google-cloud-aiplatform>=1.42.1->apache-airflow-providers-google->apache-airflow[google]->-r /bitnami/python/requirements.txt (line 1)) (2.18.4)
Requirement already satisfied: importlib-resources>=1.3 in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from limits>=2.8->Flask-Limiter<4,>3->flask-appbuilder==4.4.1->apache-airflow-providers-fab>=1.0.2->apache-airflow[google]->-r /bitnami/python/requirements.txt (line 1)) (6.4.0)
Downloading openlineage_python-1.18.0-py3-none-any.whl (44 kB)
   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 44.2/44.2 kB 1.1 MB/s eta 0:00:00
Installing collected packages: openlineage-python
ERROR: Could not install packages due to an OSError: [Errno 30] Read-only file system: '/opt/bitnami/airflow/venv/lib/python3.11/site-packages/openlineage'
[notice] A new release of pip is available: 24.1 -> 24.1.2
[notice] To update, run: pip install --upgrade pip

Gdtav avatar Jul 16 '24 15:07 Gdtav

Same here

airflow-worker 15:55:42.14 INFO ==>
airflow-worker 15:55:42.14 INFO ==> Welcome to the Bitnami airflow-worker container
airflow-worker 15:55:42.15 INFO ==> Subscribe to project updates by watching https://github.com/bitnami/containers
airflow-worker 15:55:42.15 INFO ==> Submit issues and feature requests at https://github.com/bitnami/containers/issues
airflow-worker 15:55:42.15 INFO ==> Upgrade to Tanzu Application Catalog for production environments to access custom-configured and pre-packaged software components. Gain enhanced features, including Software Bill of Materials (SBOM), CVE scan result reports, and VEX documents. To learn more, visit https://bitnami.com/enterprise
airflow-worker 15:55:42.15 INFO ==>
airflow-worker 15:55:42.16 INFO ==> Enabling non-root system user with nss_wrapper
WARNING: The directory '/opt/bitnami/airflow/.cache/pip' or its parent directory is not owned or is not writable by the current user. The cache has been disabled. Check the permissions and owner of that directory. If executing pip with sudo, you should use sudo's -H flag.
Collecting corsound-airflow@ https://_json_key_base64:****@us-python.pkg.dev/mvp-2023-10-10/corsound/corsound-airflow/corsound_airflow-2.0.0-py3-none-any.whl (from -r /bitnami/python/requirements.txt (line 2))
Downloading https://_json_key_base64:****@us-python.pkg.dev/mvp-2023-10-10/corsound/corsound-airflow/corsound_airflow-2.0.0-py3-none-any.whl (4.5 kB)
Requirement already satisfied: flask in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from -r /bitnami/python/requirements.txt (line 1)) (2.2.5)
Requirement already satisfied: Werkzeug>=2.2.2 in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from flask->-r /bitnami/python/requirements.txt (line 1)) (2.2.3)
Requirement already satisfied: Jinja2>=3.0 in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from flask->-r /bitnami/python/requirements.txt (line 1)) (3.1.4)
Requirement already satisfied: itsdangerous>=2.0 in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from flask->-r /bitnami/python/requirements.txt (line 1)) (2.2.0)
Requirement already satisfied: click>=8.0 in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from flask->-r /bitnami/python/requirements.txt (line 1)) (8.1.7)
Collecting peppercorn (from corsound-airflow@ https://_json_key_base64:****@us-python.pkg.dev/mvp-2023-10-10/corsound/corsound-airflow/corsound_airflow-2.0.0-py3-none-any.whl->-r /bitnami/python/requirements.txt (line 2))
Downloading peppercorn-0.6-py3-none-any.whl.metadata (3.4 kB)
Requirement already satisfied: MarkupSafe>=2.0 in /opt/bitnami/airflow/venv/lib/python3.11/site-packages (from Jinja2>=3.0->flask->-r /bitnami/python/requirements.txt (line 1)) (2.1.5)
Downloading peppercorn-0.6-py3-none-any.whl (4.8 kB)
Installing collected packages: peppercorn, corsound-airflow
ERROR: Could not install packages due to an OSError: [Errno 30] Read-only file system: '/opt/bitnami/airflow/venv/lib/python3.11/site-packages/peppercorn'
[notice] A new release of pip is available: 24.0 -> 24.1.2
[notice] To update, run: pip install --upgrade pip

vorandrew avatar Jul 16 '24 16:07 vorandrew

Apologies for the drive-by, but would you not want to provide a custom container image with the dependencies baked in? Installing onto a vanilla image at pod start feels like an anti-pattern.

kav avatar Jul 17 '24 17:07 kav

@kav I was following the instructions for adding dependencies described on the chart page. I tried the custom image approach with the official Airflow chart, but it didn't work and I couldn't figure out why, and the Bitnami chart has no instructions for doing it that way. Either solution would be fine for me, as long as I can finally deploy this.
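
For readers weighing the custom-image route discussed here, a minimal values sketch might look like the following. It assumes the chart exposes per-component image settings (web.image, scheduler.image, worker.image) in the usual Bitnami registry/repository/tag layout; the registry, repository names, and tag are hypothetical placeholders for images built on top of the Bitnami ones with the requirements already installed. The exact keys should be checked against the values.yaml of the chart version in use.

```yaml
# Sketch only: key names follow common Bitnami chart conventions and
# are an assumption; verify them in the chart's values.yaml.
# registry.example.com, my-team/* and custom-1 are hypothetical.
web:
  image:
    registry: registry.example.com
    repository: my-team/airflow
    tag: custom-1
scheduler:
  image:
    registry: registry.example.com
    repository: my-team/airflow-scheduler
    tag: custom-1
worker:
  image:
    registry: registry.example.com
    repository: my-team/airflow-worker
    tag: custom-1
```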

Gdtav avatar Jul 18 '24 13:07 Gdtav

same problem here

matheuscarreirod avatar Jul 23 '24 13:07 matheuscarreirod

As a temporary workaround, you can install the chart with podSecurityContext and containerSecurityContext disabled for the web, scheduler, and worker deployments.
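
In values form, that suggestion might look roughly like the sketch below. The key names (an enabled flag under each component's podSecurityContext and containerSecurityContext) are assumed from Bitnami chart conventions rather than taken from this thread, so they are worth verifying in values.yaml.

```yaml
# Sketch only: turns off the chart-managed security contexts for the
# three deployments named above. Key names are an assumption.
web:
  podSecurityContext:
    enabled: false
  containerSecurityContext:
    enabled: false
scheduler:
  podSecurityContext:
    enabled: false
  containerSecurityContext:
    enabled: false
worker:
  podSecurityContext:
    enabled: false
  containerSecurityContext:
    enabled: false
```

Disabling the contexts also drops whatever other hardening they carry, which is presumably why it is offered only as a temporary measure.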

heizerbalazs avatar Jul 24 '24 09:07 heizerbalazs

Hi, the filesystem is read-only, so in this case you would need to set the pod and container security contexts.
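
One narrower way to act on this, sketched under the assumption that each component's containerSecurityContext exposes a readOnlyRootFilesystem setting (as the [Errno 30] error in the logs suggests it is currently true), is to relax only that field and leave the rest of the context in place. The key path is an assumption and should be confirmed against the chart's values.yaml.

```yaml
# Sketch only: relaxes just the read-only root filesystem setting.
# The readOnlyRootFilesystem key path is assumed, not confirmed here.
web:
  containerSecurityContext:
    readOnlyRootFilesystem: false
scheduler:
  containerSecurityContext:
    readOnlyRootFilesystem: false
worker:
  containerSecurityContext:
    readOnlyRootFilesystem: false
```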

rafariossaa avatar Jul 24 '24 13:07 rafariossaa

@rafariossaa Alright! It would be nice to update the documentation to reflect that caveat. I ended up using the official Airflow chart with a custom image, as @kav suggested. I tried to do the same with this chart but couldn't get it to work; is it possible at all? If it's considered a best practice, shouldn't there be a short example of it on the chart page? (In any case, the issue can be closed as far as I'm concerned.)

Gdtav avatar Jul 25 '24 08:07 Gdtav

Another workaround, instead of setting the container security contexts, is to mount a volume holding the contents of the virtualenv. Here is an example:

extraVolumes:
  - name: requirements-volume
    configMap:
      name: airflow-requirements
  - name: venv
    emptyDir: {}
extraVolumeMounts:
  - name: requirements-volume
    mountPath: /bitnami/python
  - name: venv
    mountPath: /opt/bitnami/airflow/venv/lib
initContainers:
  - name: copy-python-env
    image: bitnami/airflow
    command:
      - /bin/bash
    args:
      - -ec
      - |
        #!/bin/bash
        cp -r /opt/bitnami/airflow/venv/lib/* /venv
    volumeMounts:
      - name: venv
        mountPath: /venv
extraDeploy:
  - apiVersion: v1
    kind: ConfigMap
    metadata:
      name: airflow-requirements
    data:
      requirements.txt: |
        apache-airflow[google]
        apache-airflow-providers-postgres
        pandas
        google-cloud-bigquery
        openlineage-python
        pandas-gbq

fmulero avatar Aug 12 '24 09:08 fmulero

I wanted to point out that PythonVirtualenvOperator can't be used, since it requires an optional Airflow dependency to be installed: apache-airflow[virtualenv]==2.9.1. Could there be a way to define optional Airflow dependencies without changing the security contexts?
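
As an illustration, the optional extra could be requested through the same requirements ConfigMap used earlier in this thread. This is only a sketch: it still needs a writable site-packages (for example via the emptyDir workaround above), and the pin is copied from this comment, so it may need adjusting to match the Airflow version actually shipped in the image.

```yaml
# Sketch only: reuses the airflow-requirements ConfigMap from the
# original report; the version pin comes from the comment above.
extraDeploy:
  - apiVersion: v1
    kind: ConfigMap
    metadata:
      name: airflow-requirements
    data:
      requirements.txt: |
        # Optional extra needed by PythonVirtualenvOperator.
        apache-airflow[virtualenv]==2.9.1
```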

keelamp avatar Aug 15 '24 23:08 keelamp

I can confirm that @fmulero's workaround above works for me. The only small modification I made was to change /opt/bitnami/airflow/venv/lib to /opt/bitnami/airflow/venv, as some Python libraries install into venv/bin as well.
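
A sketch of that modification, assuming it amounts to changing only the mount path and the copy source in the earlier example (the requirements ConfigMap under extraDeploy stays as it was):

```yaml
extraVolumes:
  - name: requirements-volume
    configMap:
      name: airflow-requirements
  - name: venv
    emptyDir: {}
extraVolumeMounts:
  - name: requirements-volume
    mountPath: /bitnami/python
  - name: venv
    # Mount over the whole virtualenv, not just lib, so bin/ is writable too.
    mountPath: /opt/bitnami/airflow/venv
initContainers:
  - name: copy-python-env
    # Untagged as in the original example; using the same image version
    # as the deployed containers is probably safer (an assumption).
    image: bitnami/airflow
    command:
      - /bin/bash
    args:
      - -ec
      - |
        # Copy the image's virtualenv contents into the emptyDir volume.
        cp -r /opt/bitnami/airflow/venv/. /venv
    volumeMounts:
      - name: venv
        mountPath: /venv
```

Copying from venv/. rather than venv/* also picks up any hidden files at the top level of the virtualenv.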

cheeyeelim avatar Aug 20 '24 11:08 cheeyeelim

Thanks @cheeyeelim for sharing your outputs.

fmulero avatar Aug 29 '24 17:08 fmulero

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

github-actions[bot] avatar Sep 14 '24 01:09 github-actions[bot]

This issue prevents us from using PythonVirtualenvOperator normally!

alimoezzi avatar Sep 16 '24 20:09 alimoezzi

There needs to be a data directory built into the chart.

alimoezzi avatar Sep 17 '24 12:09 alimoezzi

This solution https://github.com/bitnami/charts/issues/28124#issuecomment-2298639129 is working for me! However, in my case, it only works after I delete the existing Helm release and create a new one.

Updating the existing release config and saving it doesn't trigger the copy-python-env command. 🙅

panjiyudasetya avatar Sep 25 '24 10:09 panjiyudasetya

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

github-actions[bot] avatar Oct 11 '24 01:10 github-actions[bot]

In my case there is still a permission problem, and packages get installed to .local.

alimoezzi avatar Oct 11 '24 12:10 alimoezzi

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

github-actions[bot] avatar Oct 27 '24 01:10 github-actions[bot]

Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.

github-actions[bot] avatar Nov 01 '24 01:11 github-actions[bot]