
[BUG] model deployment fails as pytorch files fail to download in restricted environments

Open maxlepikhin opened this issue 8 months ago • 1 comments

What is the bug? The PyTorch native libraries are downloaded as part of DJL initialization. In Kubernetes environments with restricted egress, model deployment fails with the error below.

[2025-03-18T21:35:32,414][ERROR][o.o.m.e.a.DLModel        ] [opensearch-cluster-nodes-0] Failed to deploy model 3WHcqpUBVdmFfogFGkie
ai.djl.engine.EngineException: Failed to save pytorch index file
        at ai.djl.pytorch.jni.LibUtils.downloadPyTorch(LibUtils.java:429) ~[pytorch-engine-0.31.1.jar:?]
        at ai.djl.pytorch.jni.LibUtils.findNativeLibrary(LibUtils.java:314) ~[pytorch-engine-0.31.1.jar:?]
        at ai.djl.pytorch.jni.LibUtils.getLibTorch(LibUtils.java:93) ~[pytorch-engine-0.31.1.jar:?]
        at ai.djl.pytorch.jni.LibUtils.loadLibrary(LibUtils.java:81) ~[pytorch-engine-0.31.1.jar:?]
        at ai.djl.pytorch.engine.PtEngine.newInstance(PtEngine.java:53) ~[pytorch-engine-0.31.1.jar:?]
        at ai.djl.pytorch.engine.PtEngineProvider.getEngine(PtEngineProvider.java:41) ~[pytorch-engine-0.31.1.jar:?]
        at ai.djl.engine.Engine.getEngine(Engine.java:190) ~[api-0.31.1.jar:?]
        at org.opensearch.ml.engine.algorithms.DLModel.doLoadModel(DLModel.java:188) ~[opensearch-ml-algorithms-2.19.1.0.jar:?]
        at org.opensearch.ml.engine.algorithms.DLModel.lambda$loadModel$1(DLModel.java:286) [opensearch-ml-algorithms-2.19.1.0.jar:?]
        at java.base/java.security.AccessController.doPrivileged(AccessController.java:571) [?:?]
        at org.opensearch.ml.engine.algorithms.DLModel.loadModel(DLModel.java:252) [opensearch-ml-algorithms-2.19.1.0.jar:?]
        at org.opensearch.ml.engine.algorithms.DLModel.initModel(DLModel.java:142) [opensearch-ml-algorithms-2.19.1.0.jar:?]
        at org.opensearch.ml.engine.MLEngine.deploy(MLEngine.java:144) [opensearch-ml-algorithms-2.19.1.0.jar:?]
        at org.opensearch.ml.model.MLModelManager.lambda$deployModel$49(MLModelManager.java:1274) [opensearch-ml-2.19.1.0.jar:2.19.1.0]
        at org.opensearch.core.action.ActionListener$1.onResponse(ActionListener.java:82) [opensearch-core-2.19.1.jar:2.19.1]
        at org.opensearch.ml.model.MLModelManager.lambda$retrieveModelChunks$77(MLModelManager.java:2150) [opensearch-ml-2.19.1.0.jar:2.19.1.0]
        at org.opensearch.core.action.ActionListener$1.onResponse(ActionListener.java:82) [opensearch-core-2.19.1.jar:2.19.1]
        at org.opensearch.action.support.ThreadedActionListener$1.doRun(ThreadedActionListener.java:78) [opensearch-2.19.1.jar:2.19.1]
        at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1014) [opensearch-2.19.1.jar:2.19.1]
        at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [opensearch-2.19.1.jar:2.19.1]
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) [?:?]
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) [?:?]
        at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
Caused by: java.net.ConnectException: Connection timed out
        at java.base/sun.nio.ch.Net.connect0(Native Method) ~[?:?]
        at java.base/sun.nio.ch.Net.connect(Net.java:589) ~[?:?]
        at java.base/sun.nio.ch.Net.connect(Net.java:578) ~[?:?]
        at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:583) ~[?:?]
        at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327) ~[?:?]
        at java.base/java.net.Socket.connect(Socket.java:751) ~[?:?]
        at java.base/sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:304) ~[?:?]
        at java.base/sun.security.ssl.BaseSSLSocketImpl.connect(BaseSSLSocketImpl.java:181) ~[?:?]
        at java.base/sun.net.NetworkClient.doConnect(NetworkClient.java:183) ~[?:?]
        at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:531) ~[?:?]
        at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:636) ~[?:?]
        at java.base/sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264) ~[?:?]
        at java.base/sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:377) ~[?:?]
        at java.base/sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:193) ~[?:?]
        at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1252) ~[?:?]
        at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1138) ~[?:?]
        at java.base/sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:179) ~[?:?]
        at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1690) ~[?:?]
        at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1614) ~[?:?]
        at java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:223) ~[?:?]
        at ai.djl.util.Utils.openUrl(Utils.java:519) ~[api-0.31.1.jar:?]
        at ai.djl.util.Utils.openUrl(Utils.java:498) ~[api-0.31.1.jar:?]
        at ai.djl.util.Utils.openUrl(Utils.java:487) ~[api-0.31.1.jar:?]
        at ai.djl.pytorch.jni.LibUtils.downloadPyTorch(LibUtils.java:424) ~[pytorch-engine-0.31.1.jar:?]
        ... 22 more
[2025-03-18T21:35:32,451][ERROR][o.o.m.m.MLModelManager   ] [opensearch-cluster-nodes-0] Failed to retrieve model 3WHcqpUBVdmFfogFGkie
org.opensearch.ml.common.exception.MLException: Failed to deploy model 3WHcqpUBVdmFfogFGkie
        at org.opensearch.ml.engine.algorithms.DLModel.lambda$loadModel$1(DLModel.java:300) ~[?:?]
        at java.base/java.security.AccessController.doPrivileged(AccessController.java:571) ~[?:?]
        at org.opensearch.ml.engine.algorithms.DLModel.loadModel(DLModel.java:252) ~[?:?]
        at org.opensearch.ml.engine.algorithms.DLModel.initModel(DLModel.java:142) ~[?:?]
        at org.opensearch.ml.engine.MLEngine.deploy(MLEngine.java:144) ~[?:?]
        at org.opensearch.ml.model.MLModelManager.lambda$deployModel$49(MLModelManager.java:1274) ~[?:?]
        at org.opensearch.core.action.ActionListener$1.onResponse(ActionListener.java:82) [opensearch-core-2.19.1.jar:2.19.1]
        at org.opensearch.ml.model.MLModelManager.lambda$retrieveModelChunks$77(MLModelManager.java:2150) [opensearch-ml-2.19.1.0.jar:2.19.1.0]
        at org.opensearch.core.action.ActionListener$1.onResponse(ActionListener.java:82) [opensearch-core-2.19.1.jar:2.19.1]
        at org.opensearch.action.support.ThreadedActionListener$1.doRun(ThreadedActionListener.java:78) [opensearch-2.19.1.jar:2.19.1]
        at org.opensearch.common.util.concurrent.ThreadContext$ContextPreservingAbstractRunnable.doRun(ThreadContext.java:1014) [opensearch-2.19.1.jar:2.19.1]
        at org.opensearch.common.util.concurrent.AbstractRunnable.run(AbstractRunnable.java:52) [opensearch-2.19.1.jar:2.19.1]
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144) [?:?]
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642) [?:?]
        at java.base/java.lang.Thread.run(Thread.java:1583) [?:?]
Caused by: ai.djl.engine.EngineException: Failed to save pytorch index file
        at ai.djl.pytorch.jni.LibUtils.downloadPyTorch(LibUtils.java:429) ~[?:?]
        at ai.djl.pytorch.jni.LibUtils.findNativeLibrary(LibUtils.java:314) ~[?:?]
        at ai.djl.pytorch.jni.LibUtils.getLibTorch(LibUtils.java:93) ~[?:?]
        at ai.djl.pytorch.jni.LibUtils.loadLibrary(LibUtils.java:81) ~[?:?]
        at ai.djl.pytorch.engine.PtEngine.newInstance(PtEngine.java:53) ~[?:?]
        at ai.djl.pytorch.engine.PtEngineProvider.getEngine(PtEngineProvider.java:41) ~[?:?]
        at ai.djl.engine.Engine.getEngine(Engine.java:190) ~[?:?]
        at org.opensearch.ml.engine.algorithms.DLModel.doLoadModel(DLModel.java:188) ~[?:?]
        at org.opensearch.ml.engine.algorithms.DLModel.lambda$loadModel$1(DLModel.java:286) ~[?:?]
        ... 14 more
Caused by: java.net.ConnectException: Connection timed out
        at java.base/sun.nio.ch.Net.connect0(Native Method) ~[?:?]
        at java.base/sun.nio.ch.Net.connect(Net.java:589) ~[?:?]
        at java.base/sun.nio.ch.Net.connect(Net.java:578) ~[?:?]
        at java.base/sun.nio.ch.NioSocketImpl.connect(NioSocketImpl.java:583) ~[?:?]
        at java.base/java.net.SocksSocketImpl.connect(SocksSocketImpl.java:327) ~[?:?]
        at java.base/java.net.Socket.connect(Socket.java:751) ~[?:?]
        at java.base/sun.security.ssl.SSLSocketImpl.connect(SSLSocketImpl.java:304) ~[?:?]
        at java.base/sun.security.ssl.BaseSSLSocketImpl.connect(BaseSSLSocketImpl.java:181) ~[?:?]
        at java.base/sun.net.NetworkClient.doConnect(NetworkClient.java:183) ~[?:?]
        at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:531) ~[?:?]
        at java.base/sun.net.www.http.HttpClient.openServer(HttpClient.java:636) ~[?:?]
        at java.base/sun.net.www.protocol.https.HttpsClient.<init>(HttpsClient.java:264) ~[?:?]
        at java.base/sun.net.www.protocol.https.HttpsClient.New(HttpsClient.java:377) ~[?:?]
        at java.base/sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.getNewHttpClient(AbstractDelegateHttpsURLConnection.java:193) ~[?:?]
        at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect0(HttpURLConnection.java:1252) ~[?:?]
        at java.base/sun.net.www.protocol.http.HttpURLConnection.plainConnect(HttpURLConnection.java:1138) ~[?:?]
        at java.base/sun.net.www.protocol.https.AbstractDelegateHttpsURLConnection.connect(AbstractDelegateHttpsURLConnection.java:179) ~[?:?]
        at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream0(HttpURLConnection.java:1690) ~[?:?]
        at java.base/sun.net.www.protocol.http.HttpURLConnection.getInputStream(HttpURLConnection.java:1614) ~[?:?]
        at java.base/sun.net.www.protocol.https.HttpsURLConnectionImpl.getInputStream(HttpsURLConnectionImpl.java:223) ~[?:?]
        at ai.djl.util.Utils.openUrl(Utils.java:519) ~[?:?]
        at ai.djl.util.Utils.openUrl(Utils.java:498) ~[?:?]
        at ai.djl.util.Utils.openUrl(Utils.java:487) ~[?:?]
        at ai.djl.pytorch.jni.LibUtils.downloadPyTorch(LibUtils.java:424) ~[?:?]
        at ai.djl.pytorch.jni.LibUtils.findNativeLibrary(LibUtils.java:314) ~[?:?]
        at ai.djl.pytorch.jni.LibUtils.getLibTorch(LibUtils.java:93) ~[?:?]
        at ai.djl.pytorch.jni.LibUtils.loadLibrary(LibUtils.java:81) ~[?:?]
        at ai.djl.pytorch.engine.PtEngine.newInstance(PtEngine.java:53) ~[?:?]
        at ai.djl.pytorch.engine.PtEngineProvider.getEngine(PtEngineProvider.java:41) ~[?:?]
        at ai.djl.engine.Engine.getEngine(Engine.java:190) ~[?:?]
        at org.opensearch.ml.engine.algorithms.DLModel.doLoadModel(DLModel.java:188) ~[?:?]
        at org.opensearch.ml.engine.algorithms.DLModel.lambda$loadModel$1(DLModel.java:286) ~[?:?]
        ... 14 more
[2025-03-18T21:35:32,459][INFO ][o.o.m.a.d.TransportDeployModelOnNodeAction] [opensearch-cluster-nodes-0] deploy model task 

How can one reproduce the bug? Probably the simplest way:

  • Clear the opensearch data/ml_cache directory.
  • Disconnect from the internet.
  • Start OpenSearch and attempt to deploy a custom model from a local directory.

What is the expected behavior? The OpenSearch image should either contain the necessary dependencies or provide a way to add them at image build or run time.

When using the opensearch operator:

  • It's not possible to use custom init containers to copy files from another image into data/ml_cache/pytorch.
  • It's not possible to bake the pytorch files into the data/ directory at build time, because it is an empty volume mounted into the container.

Either ml_cache/pytorch needs to live outside the data/ directory, or the OpenSearch image should include the necessary files.
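A possible mitigation, assuming DJL's documented environment variables behave the same way when the engine runs inside ml-commons (an assumption, not verified against this deployment), is to point DJL at files baked into the image rather than letting it download them:

```shell
# Sketch only, based on DJL's documented configuration knobs.
# ml-commons pins the engine cache under data/ml_cache, so DJL_CACHE_DIR may be
# overridden by the plugin; PYTORCH_LIBRARY_PATH tells the DJL PyTorch engine
# to load a locally installed libtorch instead of downloading one.
# Both paths below are hypothetical and depend on your image layout.
export DJL_CACHE_DIR=/usr/share/opensearch/djl_cache
export PYTORCH_LIBRARY_PATH=/usr/share/opensearch/libtorch/lib
```

If the plugin ignores these and insists on data/ml_cache, pre-populating that directory ahead of time remains the workaround.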

What is your host/environment? Ubuntu 24.04

Do you have any screenshots? N/A

Do you have any additional context? It is critical for some of our customers to limit egress.

maxlepikhin avatar Mar 18 '25 22:03 maxlepikhin

hey

Yerzhaisang avatar Mar 25 '25 17:03 Yerzhaisang

Hi guys, let me share another way to reproduce this issue:

  1. Turn off the internet connection
  2. Allow registering model from local files:
PUT /_cluster/settings
{
  "persistent": {
    "plugins.ml_commons.allow_registering_model_via_local_file": true,
    "plugins.ml_commons.allow_registering_model_via_url": true
  }
}
  3. Use opensearch-py-ml to register the model:
from opensearchpy import OpenSearch
from opensearch_py_ml.ml_commons import MLCommonClient

# Replace CLUSTER_URL, username, and password with your cluster's values.
client = OpenSearch(
    hosts=[CLUSTER_URL],
    http_auth=(username, password),
    verify_certs=False,
)
ml_client = MLCommonClient(client)

model_path = 'sentence-transformers_all-MiniLM-L6-v2-1.0.1-torch_script.zip'  # the path to your model
model_config_path = 'config.json'  # the path to your model config

model_id = ml_client.register_model(
    model_path=model_path,
    model_config_path=model_config_path,
    isVerbose=True
)

Yerzhaisang avatar Apr 03 '25 18:04 Yerzhaisang

@maxlepikhin I uploaded local pytorch dependencies to data/ml_cache/pytorch, and everything works fine. If you need help, please let me know - [email protected]

Yerzhaisang avatar Apr 03 '25 23:04 Yerzhaisang

@jngz-es @dhrubo-os @ylwu-amzn I believe this is not an issue, because storing local pytorch dependencies in the project isn't memory efficient, so please close it

Yerzhaisang avatar Apr 03 '25 23:04 Yerzhaisang

Thanks @Yerzhaisang for raising and investigating this issue. I agree with you that it is not efficient to download everything and cache it. I am closing the issue. Feel free to reopen it if any questions remain.

jngz-es avatar Apr 04 '25 21:04 jngz-es

@jngz-es @dhrubo-os @ylwu-amzn @Yerzhaisang how are the torch/DJL binaries scanned for CVEs if they are downloaded at runtime?

For others who hit the same issue in containerized environments, the solution is to copy required files into a custom opensearch image:

#!/usr/bin/env bash
set -euo pipefail

# Define PyTorch version
# Defined at: https://github.com/deepjavalibrary/djl/blob/41f75681aab8708c375e94f0a99ad7673a74f7ae/bom/build.gradle.kts#L135
PYTORCH_VERSION="1.13.1"
# Defined at: https://github.com/opensearch-project/ml-commons/blob/5bb035e2f5edb8ea936faedb403d6414695463fe/ml-algorithms/build.gradle#L48
DJL_VERSION="0.31.1"
CACHE_DIR="./data/ml_cache/pytorch"
INDEX_FILE="${CACHE_DIR}/${PYTORCH_VERSION}.txt"
BASE_URL="https://publish.djl.ai/pytorch/${PYTORCH_VERSION}"

# Define supported platforms and flavors
PLATFORMS=("linux-x86_64" "linux-aarch64")
FLAVORS=("cpu" "cpu-precxx11")

# Ensure the cache directory exists
mkdir -p "$CACHE_DIR"

# Download index file if it does not exist
if [[ ! -f "$INDEX_FILE" ]]; then
    echo "Downloading index file..."
    curl -fsSL "${BASE_URL}/files.txt" -o "${INDEX_FILE}"
fi

# Function to decode URL-encoded filenames (fix %2B -> +, etc.)
decode_url() {
    echo -e "$(printf '%b' "${1//%/\\x}")"
}

# Download and extract all necessary files for each platform/flavor combination
for PLATFORM in "${PLATFORMS[@]}"; do
    for FLAVOR in "${FLAVORS[@]}"; do
        DEST_DIR="${CACHE_DIR}/${PYTORCH_VERSION}-${FLAVOR}-${PLATFORM}"
        mkdir -p "$DEST_DIR"

        # Download DJL JNI.
        # Example: https://publish.djl.ai/pytorch/1.13.1/jnilib/0.31.1/linux-x86_64/cpu/libdjl_torch.so
        JNI_URL="$BASE_URL/jnilib/$DJL_VERSION/$PLATFORM/$FLAVOR/libdjl_torch.so"
        echo "Downloading $JNI_URL ..."
        DEST_FILE="${DEST_DIR}/${DJL_VERSION}-libdjl_torch.so"
        set +e
        curl -fSL "$JNI_URL" -o "$DEST_FILE"
        CURL_EXIT_CODE=$?
        set -e
        if [[ $CURL_EXIT_CODE -ne 0 ]]; then
          # cpu flavor and osx are not available, report an error and continue.
          echo "--- Failed to download $JNI_URL"
        fi

        # Download PyTorch native binaries.
        echo "Downloading PyTorch native libraries for ${FLAVOR} on ${PLATFORM}..."
        while IFS= read -r line; do
            if [[ "$line" == "${FLAVOR}/${PLATFORM}/"* ]]; then
                FILE_NAME=$(basename "$line" .gz)
                DECODED_FILE_NAME=$(decode_url "$FILE_NAME")  # Fix C++ filename issues

                URL="${BASE_URL}/${line}"
                DEST_FILE="${DEST_DIR}/${DECODED_FILE_NAME}"

                echo "Downloading ${URL} -> ${DEST_FILE}..."
                curl -fsSL "${URL}" | gunzip -c > "${DEST_FILE}"
                chmod 644 "${DEST_FILE}"
            fi
        done < "$INDEX_FILE"

        echo "PyTorch native libraries downloaded to ${DEST_DIR}"
    done
done
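The `decode_url` helper in the script above relies on a bash substitution trick that is easy to misread: every `%` is replaced with `\x`, and `printf '%b'` then interprets the resulting hex escapes. A standalone sketch (the filename is a hypothetical example, not taken from DJL's actual index):

```shell
# Standalone copy of the decode_url helper from the script above (bash).
decode_url() {
    # ${1//%/\\x} turns e.g. "%2B" into "\x2B"; printf '%b' decodes it to "+".
    echo -e "$(printf '%b' "${1//%/\\x}")"
}

# DJL's file index percent-encodes '+' in some filenames, e.g. %2B -> +
decode_url "libtorch_cpu%2Bcxx11.so"   # prints: libtorch_cpu+cxx11.so
```

Note this only handles percent-escapes that `printf %b` understands as `\xHH`; for the DJL index that is sufficient, since `%2B` is the case the original script cares about.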

maxlepikhin avatar Apr 08 '25 23:04 maxlepikhin

@maxlepikhin WhiteSource Security Check does its job very well

Yerzhaisang avatar Apr 09 '25 03:04 Yerzhaisang

> @jngz-es @dhrubo-os @ylwu-amzn @Yerzhaisang how are the torch/DJL binaries scanned for CVEs if they are downloaded at runtime? [...]

Thanks @maxlepikhin, I'm glad to know that your problem is solved, and thanks for sharing the solution here as well. Can you please raise a PR to add documentation about this in the docs directory?

Regarding your question about CVE checks, I don't think there's any CVE check at runtime. However, we use the same torch version in opensearch-py-ml, which is scanned for CVE issues.

Please let me know if that answers your question. Thanks.

dhrubo-os avatar Apr 10 '25 19:04 dhrubo-os

@dhrubo-os you are welcome. The question about CVEs was not to learn how the scans are done, but to point out that anybody who takes a dependency on the opensearch docker image will scan it for vulnerabilities and will miss the DJL and pytorch binaries downloaded at runtime.

maxlepikhin avatar Apr 10 '25 20:04 maxlepikhin

> [...] anybody who takes a dependency on the opensearch docker image will scan it for vulnerabilities and will miss the DJL and pytorch binaries downloaded at runtime.

Yeah, agreed, which is why we use the same torch version that was scanned in the py-ml repo. But if you find a better way to add this at compile time, please feel free to raise a PR.

dhrubo-os avatar Apr 11 '25 19:04 dhrubo-os