
BUG: Could not start v0.11.0 from Docker Compose

Minamiyama opened this issue 9 months ago · 16 comments

Minamiyama · May 12 '24

The latest image won't start for me either.

XiaoCC · May 12 '24

Me too, +1.

yanmao2023 · May 12 '24

@Minamiyama @yanmao2023 @XiaoCC Could you please help me test this? Build a new image based on our official image:

FROM xprobe/xinference:v0.11.0

RUN pip install torchvision==0.17.1

And then test it?

On my own machine with two GPUs, I can use xinference normally with the above method.

ChengjieLi28 · May 13 '24
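For anyone who, like the original report, is launching this from Docker Compose, here is a minimal sketch of how the suggested fix could be wired up. The compose file, service name, and GPU reservation are illustrative assumptions, not part of the maintainer's instructions:

# build the patched image from the two-line Dockerfile above, then run it via compose
cat > docker-compose.yml <<'EOF'
services:
  xinference:
    build: .                      # directory containing the Dockerfile shown above
    command: xinference-local -H 0.0.0.0 --log-level debug
    ports:
      - "9997:9997"
    deploy:
      resources:
        reservations:
          devices:
            - driver: nvidia
              count: all
              capabilities: [gpu]
EOF
docker compose up --build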

The suggested image doesn't seem to work for me.

Minamiyama · May 13 '24

Please paste your error stack and the related commands.

ChengjieLi28 · May 13 '24

[screenshots]

Minamiyama · May 13 '24

What's the error here? You can also add --log-level debug to your entrypoint command. Could you just test the following:

  1. Build the new image.
  2. Run it:
docker run -p 9997:9997 --gpus all <the new image> xinference-local --log-level debug -H 0.0.0.0
  3. Then use it.

ChengjieLi28 · May 13 '24
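If the container still exits on its own, a few generic Docker commands can help surface why. The container ID below is a placeholder:

docker ps -a                                                     # find the exited xinference container and its ID
docker logs <container-id>                                       # everything the container printed before it stopped
docker inspect --format '{{.State.ExitCode}}' <container-id>     # 137 usually means the process was killed, often out of memory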

[screenshot]

Is this message useful?

Minamiyama · May 13 '24

What is this in your screenshot? It seems unrelated to xinference; it may be an issue with your CUDA environment. The xinference docker image uses the pytorch image as its base image. You can check whether you can use that base image directly:

pytorch/pytorch:2.1.2-cuda12.1-cudnn8-devel

ChengjieLi28 · May 13 '24
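A quick way to sanity-check the CUDA setup inside that base image, independent of xinference (the one-liner is a generic sketch, not from the maintainer):

docker run --rm --gpus all pytorch/pytorch:2.1.2-cuda12.1-cudnn8-devel \
  python -c "import torch; print(torch.__version__, torch.cuda.is_available())"
# prints something like "2.1.2 True" when the GPU is visible inside the container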

[screenshot]

That was caused by adding --log-level; maybe I was using it incorrectly.

Minamiyama · May 13 '24

[screenshot]

Nothing new is shown, and it still shuts down on its own.

Minamiyama · May 13 '24

Regarding pytorch/pytorch:2.1.2-cuda12.1-cudnn8-devel, just run:

docker run --gpus all pytorch/pytorch:2.1.2-cuda12.1-cudnn8-devel tail -f /dev/null

Does it still shut down on its own?

ChengjieLi28 · May 13 '24
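If that container does stay up, one way to confirm the GPU is actually visible inside it (the container name is hypothetical):

docker run -d --name torch-test --gpus all pytorch/pytorch:2.1.2-cuda12.1-cudnn8-devel tail -f /dev/null
docker exec torch-test nvidia-smi      # should list the host GPUs when the NVIDIA container runtime is set up
docker rm -f torch-test                # clean up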

The host machine runs Windows. It may not be able to use 0.0.0.0; I haven't tried Windows. Remove -H 0.0.0.0 and try again.

ChengjieLi28 · May 13 '24
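For reference, that is simply the earlier debug command with the host flag dropped (the image name remains a placeholder):

docker run -p 9997:9997 --gpus all <the new image> xinference-local --log-level debug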

[screenshot]

It runs normally.

Minamiyama · May 13 '24

I cannot reproduce this. Please try:

docker pull xprobe/xinference:nightly-bug_torchvision_version

This image is built by #1485, and I can use it normally on my Ubuntu machine.

ChengjieLi28 · May 13 '24
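Pulling and running that nightly build would follow the same pattern as before; the flags here simply mirror the earlier debug run (adjust the -H/host setting to whatever worked on your machine):

docker pull xprobe/xinference:nightly-bug_torchvision_version
docker run -p 9997:9997 --gpus all xprobe/xinference:nightly-bug_torchvision_version xinference-local --log-level debug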

[screenshot]

It failed as well.

Minamiyama · May 13 '24

@Minamiyama Try this image:

docker pull xprobe/xinference:nightly-docker_crash_due_to_llama

ChengjieLi28 · May 16 '24

0.11.1 works fine.

Minamiyama · May 20 '24