sherlock
Docker run fails
Checklist
- [x] I'm reporting a bug in Sherlock's functionality
- [x] The bug I'm reporting is not a false positive or a false negative
- [x] I've verified that I'm running the latest version of Sherlock
- [x] I've checked for similar bug reports including closed ones
- [x] I've checked for pull requests that attempt to fix this bug
Description
- Cloned repo
- Built Docker Image
docker build -t mysherlock-image .
- Tried to run
docker run --rm -t mysherlock-image username
Traceback (most recent call last):
File "sherlock.py", line 11, in <module>
import pandas as pd
File "/usr/local/lib/python3.7/site-packages/pandas/__init__.py", line 50, in <module>
from pandas.core.api import (
File "/usr/local/lib/python3.7/site-packages/pandas/core/api.py", line 48, in <module>
from pandas.core.groupby import (
File "/usr/local/lib/python3.7/site-packages/pandas/core/groupby/__init__.py", line 1, in <module>
from pandas.core.groupby.generic import (
File "/usr/local/lib/python3.7/site-packages/pandas/core/groupby/generic.py", line 73, in <module>
from pandas.core.frame import DataFrame
File "/usr/local/lib/python3.7/site-packages/pandas/core/frame.py", line 129, in <module>
from pandas.core import (
File "/usr/local/lib/python3.7/site-packages/pandas/core/generic.py", line 144, in <module>
from pandas.core.window import (
File "/usr/local/lib/python3.7/site-packages/pandas/core/window/__init__.py", line 1, in <module>
from pandas.core.window.ewm import ( # noqa:F401
File "/usr/local/lib/python3.7/site-packages/pandas/core/window/ewm.py", line 11, in <module>
import pandas._libs.window.aggregations as window_aggregations
ImportError: Error loading shared library libstdc++.so.6: No such file or directory (needed by /usr/local/lib/python3.7/site-packages/pandas/_libs/window/aggregations.cpython-37m-x86_64-linux-gnu.so)
Could you tell me which Linux distribution you use?
Have you tried running "sudo apt-get install libstdc++6"?
I get a similar message when replicating steps. Using M1 macOS 12.4.
Traceback (most recent call last):
File "sherlock.py", line 11, in <module>
import pandas as pd
File "/usr/local/lib/python3.7/site-packages/pandas/__init__.py", line 50, in <module>
from pandas.core.api import (
File "/usr/local/lib/python3.7/site-packages/pandas/core/api.py", line 48, in <module>
from pandas.core.groupby import (
File "/usr/local/lib/python3.7/site-packages/pandas/core/groupby/__init__.py", line 1, in <module>
from pandas.core.groupby.generic import (
File "/usr/local/lib/python3.7/site-packages/pandas/core/groupby/generic.py", line 73, in <module>
from pandas.core.frame import DataFrame
File "/usr/local/lib/python3.7/site-packages/pandas/core/frame.py", line 129, in <module>
from pandas.core import (
File "/usr/local/lib/python3.7/site-packages/pandas/core/generic.py", line 144, in <module>
from pandas.core.window import (
File "/usr/local/lib/python3.7/site-packages/pandas/core/window/__init__.py", line 1, in <module>
from pandas.core.window.ewm import ( # noqa:F401
File "/usr/local/lib/python3.7/site-packages/pandas/core/window/ewm.py", line 11, in <module>
import pandas._libs.window.aggregations as window_aggregations
ImportError: Error loading shared library libstdc++.so.6: No such file or directory (needed by /usr/local/lib/python3.7/site-packages/pandas/_libs/window/aggregations.cpython-37m-aarch64-linux-gnu.so)
I don't know much about Docker, but the command below gets Sherlock working for me:
docker run theyahya/sherlock user123
But of course it's fetching the image from Docker Hub rather than building from the Dockerfile in this repo.
Suggestion
Currently, we build this image on Alpine. This has downsides, however: it can lead to longer build times, obscure bugs, and performance issues.
What we can do
- Try using Ubuntu LTS, Red Hat Universal Base Image, or Debian as the base image. This should improve portability across devices.
You can assign this to me once the team comes to a conclusion.
@Murithijoshua Feel free to send a PR, I liked your suggestion
The issue is that the first (build) stage of the Dockerfile installs the required system packages, but the second stage doesn't.
The simplest fix is to add this after the ENTRYPOINT line:
RUN apk add --no-cache libxml2 libxslt libstdc++
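For context, here is a rough sketch of where that line would sit in the final stage of a two-stage Alpine build. The stage layout, paths, package list in the build stage, and working directories are assumptions inferred from the traceback and the comments in this thread, not a copy of the repository's Dockerfile. As far as I know, the line's position relative to ENTRYPOINT does not change the result, as long as it is in the final stage.

```dockerfile
# Hypothetical two-stage layout; stage contents and paths are assumptions.
FROM python:3.7-alpine AS build
WORKDIR /wheels
# Build-time toolchain and headers, only needed to compile the wheels.
RUN apk add --no-cache g++ gcc libxml2-dev libxslt-dev linux-headers
COPY requirements.txt /opt/sherlock/
# pip writes the built wheels into the current directory (/wheels).
RUN pip3 wheel -r /opt/sherlock/requirements.txt

FROM python:3.7-alpine
WORKDIR /opt/sherlock
# Runtime shared libraries: the compiled wheels link against these,
# so they must also be present in the final stage.
RUN apk add --no-cache libxml2 libxslt libstdc++
COPY --from=build /wheels /wheels
COPY . /opt/sherlock/
# Install from the prebuilt wheels, then drop them to save space.
RUN pip3 install --no-cache-dir -r requirements.txt -f /wheels \
    && rm -rf /wheels
# Assumed location of sherlock.py inside the image.
WORKDIR /opt/sherlock/sherlock
ENTRYPOINT ["python", "sherlock.py"]
```

The point of the sketch is that anything installed with apk in the build stage is gone in the final image, so shared libraries such as libstdc++ have to be installed there separately.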
Why split the Dockerfile into two stages? To reduce the image size, simply run the cleanup in the same command, like this:
RUN apk add --no-cache libxml2 \
&& apk add -t .build-deps --no-cache g++ libxml2-dev \
&& pip3 wheel -r /opt/sherlock/requirements.txt \
&& apk del --purge .build-deps
Yes, if you update the requirements you'd need to install these packages again, but that's quick and rare.
Hello! I gave this issue a try and found the following:
- After adding
RUN apk add --no-cache g++
to the Dockerfile, the issue was solved. g++ seems to be the minimal dependency needed to solve the issue.
- I also implemented a solution using Debian Slim Bullseye as the base image, as suggested in the discussion above. These were the results:
  - The Debian image ended up with a size of 284MB. For comparison, the Alpine image with the fix I described above had a final size of 344MB, a 60MB difference.
  - The Debian image took considerably less time to build than the Alpine image. I didn't take exact measurements, but the Alpine image seemed to take about 5 minutes to build, while the Debian image took <1 minute.
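A Debian Slim Bullseye variant along those lines might look roughly like the sketch below. This is not the actual implementation from this comment: the single-stage layout, paths, and working directories are assumptions for illustration.

```dockerfile
# Hypothetical single-stage Debian variant; paths are assumptions.
FROM python:3.7-slim-bullseye
WORKDIR /opt/sherlock
# Copy the requirements first so this layer is cached between code changes.
COPY requirements.txt .
# On glibc-based images pip can usually use prebuilt manylinux wheels,
# so no compiler is installed here.
RUN pip3 install --no-cache-dir -r requirements.txt
COPY . .
# Assumed location of sherlock.py inside the image.
WORKDIR /opt/sherlock/sherlock
ENTRYPOINT ["python", "sherlock.py"]
```

If pip does pick up prebuilt wheels here, that would likely explain the shorter build time: Alpine uses musl rather than glibc, so packages like pandas often have to be compiled from source there.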
I am ready to submit a PR with the implementation that you think is best. Please tell me what you think! :D
@NeoLight1010 If this fixes the issue, feel free to send a PR :)