Pyodbc install erroring

Open FabianDS2 opened this issue 3 years ago • 20 comments

I am relatively new to Airflow, Docker, and Bitnami but I am having trouble getting pyodbc to be installed on the bitnami airflow containers. I want to be able to use Airflow on Azure for work projects so that's how I found out about the bitnami-docker-airflow project.

I have followed the directions from this page: https://github.com/bitnami/bitnami-docker-airflow/blob/master/README.md

I started with curl -sSL https://raw.githubusercontent.com/bitnami/bitnami-docker-airflow/master/docker-compose.yml > docker-compose.yml and got the docker-compose.yml file in my documents folder. I then edited it to mount a local folder with the DAG files I wanted to use, and another folder containing a requirements.txt file.

The lines below in the docker-compose.yml are the ones I added for mounting. ./dags and ./packages are the folders that the DAG .py and requirements.txt files are in. My docker-compose file is attached.

  • ./dags:/opt/bitnami/airflow/dags # added this!
  • ./packages:/bitnami/python/ # added this!

The requirements.txt has the text pyodbc===4.0.30 as the only content in the file.

I run docker-compose up to get the containers up and running.

The output shows me that the pyodbc install is failing, but I can't figure out what the source of the error is or what could fix it. I have attached the copied output that shows the error, as I am having a hard time interpreting it. I have tried docker-compose without mounting the requirements file: Airflow starts up and I can see my DAGs at localhost:8080, as I would expect. I want to be able to use pyodbc in my DAGs, though.

Please let me know if I can provide more context

docker compose errors.txt docker-compose file.txt

FabianDS2 avatar Mar 09 '21 04:03 FabianDS2

Hi, the issue here is that our Airflow image does not include gcc (a C compiler), hence this error:

airflow-scheduler_1  |   unable to execute 'gcc': No such file or directory
airflow-scheduler_1  |   error: command 'gcc' failed with exit status 1
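
One quick way to confirm this from inside the container is to check whether a compiler is on the PATH before pip tries to build pyodbc from source. This is only a minimal sketch: the helper name `missing_build_tools` is made up for illustration, and the default list of tools (gcc and g++) is an assumption about what the build needs.

```python
import shutil


def missing_build_tools(tools=("gcc", "g++")):
    """Return the entries of `tools` that cannot be found on PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_build_tools()
    if missing:
        # These are the tools that must be installed before building from source.
        print("Missing build tools:", ", ".join(missing))
    else:
        print("All build tools found")
```

If the script reports gcc as missing, pip will not be able to compile packages that ship C or C++ extensions and have no prebuilt wheel for the platform.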

rafariossaa avatar Mar 09 '21 16:03 rafariossaa

Thank you @rafariossaa for the explanation. I am relatively new to Docker, so is there a place in the Dockerfile or docker-compose.yml where I could add that dependency, or something I should research further? For future planning, as I want to use the Bitnami image for a production workflow in Azure, is there a way to add that dependency on the Azure image? Maybe once it's set up and deployed? Looking forward to any guidance you can give, anything is helpful!

FabianDS2 avatar Mar 10 '21 13:03 FabianDS2

Hi, you can add build-essential to the list of packages here, and build your own image. I am not very sure about your Azure question. Once you have your own image, push it to a registry (e.g. Docker Hub) and you will be able to use it from wherever you need.

rafariossaa avatar Mar 11 '21 16:03 rafariossaa

This Issue has been automatically marked as "stale" because it has not had recent activity (for 15 days). It will be closed if no further activity occurs. Thanks for the feedback.

stale[bot] avatar Apr 14 '21 10:04 stale[bot]

Due to the lack of activity in the last 5 days since it was marked as "stale", we proceed to close this Issue. Do not hesitate to reopen it later if necessary.

github-actions[bot] avatar Apr 20 '21 01:04 github-actions[bot]

We also see this issue!

The airflow documentation says to do:

pip install apache-airflow[odbc]

http://airflow.apache.org/docs/apache-airflow-providers-odbc/stable/connections/odbc.html

But this leads to the same gcc error.

We'd expect this to work out of the box, given that all the other Airflow providers do. Do you think this is something that could be added?

msmerc avatar Apr 27 '21 11:04 msmerc

Hi, I am creating an internal task to evaluate and add the build tools to our container image. We will come back as soon as we have news.

rafariossaa avatar Apr 27 '21 15:04 rafariossaa

@rafariossaa - thanks, appreciated.

If you run airflow info in this image once it is running, apache-airflow-providers-odbc is conspicuously absent. I'm guessing it's all related.
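
A similar check can be done from Python inside the running container. This is just a sketch, assuming Python 3.8+ (for `importlib.metadata`); the function name `installed_versions` is hypothetical, and the two package names are the ones mentioned in this thread.

```python
from importlib import metadata


def installed_versions(packages=("pyodbc", "apache-airflow-providers-odbc")):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions().items():
        print(f"{name}: {version or 'not installed'}")
```

If both show up as not installed, that would be consistent with the pip build failing before either package was set up.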

msmerc avatar Apr 27 '21 15:04 msmerc

Hi, yes, I guess it is a consequence of the compilation error.

rafariossaa avatar Apr 28 '21 08:04 rafariossaa

Bitnami has suggested here that this is by design.

This is how I solved it. https://www.shawnmcgough.com/airflow-connect-to-sql-server-mssql/

ShawnMcGough avatar Apr 30 '21 17:04 ShawnMcGough

Hi @ShawnMcGough, thanks for the feedback and for creating that blog post, that will be super useful for other users. On our side, I think it would be nice to have the compiling tools added by default to the Airflow image. This way users won't need to extend the image just to include them.

rafariossaa avatar May 03 '21 07:05 rafariossaa

Thanks for the article @ShawnMcGough - that will be helpful in the meantime. Our use case is getting Airflow working on MS Azure. On Azure, to get Airflow set up properly on App Services, you have to select the Bitnami image you want it to provision a container for (the images are already "loaded", in that you choose among the available Bitnami images). I don't think you can modify the image, so having this as part of the official Bitnami image would make the Azure App Services one work for our use case. We're pretty early on in our Azure (and Airflow) learning, so the easier it can be, the better ;)

FabianDS2 avatar May 03 '21 21:05 FabianDS2

@ShawnMcGough, I see from your profile that you may actually be an Azure expert? We're a data science team of two people just getting to the cloud, so configuring this type of stuff is super new to us, let alone Docker, etc. We figured the pre-provided Bitnami images were the easiest way to go.

FabianDS2 avatar May 03 '21 21:05 FabianDS2

@FabianDS2 I hadn't thought to deploy Airflow within an App Service. Also, there are many deploy variants that can quickly add complexity (which it sounds like you're discovering)! Which method are you using, exactly?

Unfortunately, connecting to Microsoft SQL Server is a feature not available 'out of the box' for any of the options currently available.

ShawnMcGough avatar May 04 '21 12:05 ShawnMcGough

@ShawnMcGough up until now, I've just been working with Airflow locally to get a feel for how it works, since it was brand new to me. I took the DataCamp class on it and built the most basic POC I could locally. Our likely goal, though, is to get it to a point where it could be deployed to a non-local "site" where we could log in and manage/run pipelines that execute the SQL queries which produce a modeling data set to feed to ML algorithms or a training script (which could also be part of the pipeline). That's where the Azure part comes in. On Azure Marketplace, Bitnami has those Airflow containers, and as far as I can tell it deploys them to Azure App Services. Since we don't have DevOps resources on our team, that seemed like the best option. I think we could get away with just having Airflow set up on the Remote Desktop that we use for data science development, model training, etc. The Azure part is our idea of how to "professionalize" it, if I can make up a phrase. Eventually, it might be more than just us using Airflow (maybe some other people in Analytics, the department we sit in, will use it too), so that's where the App Services environment may become helpful.

But like you said, you can only choose between the Bitnami images that are already created, without a lot of room for customization as far as I'm aware.

FabianDS2 avatar May 04 '21 16:05 FabianDS2

Hi, I have followed the thread, and my question is whether this is still in the same state. Is it still not possible to connect to MSSQL with the Bitnami Airflow image? Thank you for your clarification.

alexisaraya avatar Apr 18 '22 12:04 alexisaraya

You can, but right now you need to add the connector yourself.

rafariossaa avatar Apr 19 '22 07:04 rafariossaa

Not sure whether the thread is still valid, but I thought it would be helpful for those who are still struggling to get Bitnami Airflow to connect to MSSQL. Here is how I rebuilt the image to make it work (using the scheduler as an example):

FROM bitnami/airflow-scheduler:latest
USER root
# Use Bash as the default shell for the following RUN instructions
SHELL ["/bin/bash", "-c"]
RUN rm /bin/sh && ln -s /bin/bash /bin/sh
# Install the compiler toolchain and ODBC headers needed to build pyodbc
RUN install_packages gcc unixodbc-dev unixodbc libpq-dev g++ build-essential python3-dev
# Add the Microsoft package repository and install the MSSQL Debian driver
# (debian/10 in the URL should match the Debian release of the base image)
RUN curl https://packages.microsoft.com/keys/microsoft.asc | apt-key add -
RUN curl https://packages.microsoft.com/config/debian/10/prod.list > /etc/apt/sources.list.d/mssql-release.list
RUN ACCEPT_EULA=Y install_packages msodbcsql18 mssql-tools18
# Put the MSSQL command-line tools on PATH. ENV persists across layers and users,
# whereas sourcing ~/.bashrc in a RUN step has no effect on later steps.
ENV PATH="$PATH:/opt/mssql-tools18/bin"
# Work around the MSSQL driver issue by lowering the minimum TLS version and security level
RUN chmod +rwx /etc/ssl/openssl.cnf
RUN sed -i 's/TLSv1\.2/TLSv1/g' /etc/ssl/openssl.cnf
RUN sed -i 's/SECLEVEL=2/SECLEVEL=1/g' /etc/ssl/openssl.cnf
# Switch back to the non-root user
USER 1001

Hope this helps.

Some things to note:

  1. I had to downgrade TLS from 1.2 to 1.0 because I was hitting a "Connection Timeout Expired" error: our vendor's MSSQL server doesn't support TLS 1.2. More information: SqlClient troubleshooting guide
  2. Also, please don't forget to include the Airflow ODBC provider, i.e. apache-airflow-providers-odbc, in your requirements.txt
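
To sanity-check a rebuilt image, something like the following could be run inside the container. It is only a sketch under stated assumptions: the function names are hypothetical, the server, database and credentials are placeholders, and it assumes pyodbc installed successfully. `ODBC Driver 18 for SQL Server` is the driver name registered by the msodbcsql18 package.

```python
def build_mssql_conn_str(server, database, user, password,
                         driver="ODBC Driver 18 for SQL Server",
                         trust_server_certificate=False):
    """Build an ODBC connection string for SQL Server."""
    def brace(value):
        # Wrap the value in braces so characters such as ';' are taken
        # literally; a literal '}' inside the value is doubled.
        return "{" + str(value).replace("}", "}}") + "}"

    trust = "yes" if trust_server_certificate else "no"
    parts = [
        f"DRIVER={brace(driver)}",
        f"SERVER={server}",
        f"DATABASE={database}",
        f"UID={user}",
        f"PWD={brace(password)}",
        f"TrustServerCertificate={trust}",
    ]
    return ";".join(parts)


def check_connection(conn_str):
    """Open a connection, run a trivial query, and return its single value."""
    import pyodbc  # imported lazily so the builder above works without pyodbc

    conn = pyodbc.connect(conn_str, timeout=10)  # timeout is the login timeout in seconds
    try:
        return conn.cursor().execute("SELECT 1").fetchone()[0]
    finally:
        conn.close()


if __name__ == "__main__":
    # Placeholder values for illustration only.
    print(build_mssql_conn_str("db.example.com", "mydb", "me", "secret"))
```

Keeping the string builder separate from the connection call means the connection string can be inspected, or passed to an Airflow connection, without needing a reachable database.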

Happy coding :).

yimingpeng avatar May 06 '22 00:05 yimingpeng

Hi @yimingpeng , Thank you very much for providing an example.

rafariossaa avatar May 06 '22 07:05 rafariossaa

We are going to transfer this issue to bitnami/containers

In order to unify the approaches followed in Bitnami containers and Bitnami charts, we are moving some issues in bitnami/bitnami-docker-<container> repositories to bitnami/containers.

Please follow bitnami/containers to keep you updated about the latest bitnami images.

More information here: https://blog.bitnami.com/2022/07/new-source-of-truth-bitnami-containers.html

carrodher avatar Jul 28 '22 13:07 carrodher