EnergonAI

Cannot start the Bloom server

Open SAI990323 opened this issue 2 years ago • 3 comments

Information
V100
CUDA 11.3
transformers==4.23.1
torch==1.12.0
colossalai==0.2.5
energonai==0.0.1+torch1.12cu11.3
Running bloom-560m and bloom-7b1

Question
When I try to start the Bloom server using the examples in this link, startup stops at the point shown below. (image) No errors are reported, but I cannot send requests to http://[ip]:[host]//generation.

SAI990323, Feb 18 '23 18:02
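When a server seems to stop during startup without printing errors, one quick check is whether anything is accepting TCP connections on the expected port, before debugging the request URL itself. The following is a minimal sketch using only the Python standard library. The host and port values are placeholders, not values from this thread, and `port_is_open` is a hypothetical helper name.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        # create_connection resolves the host and attempts a TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False


if __name__ == "__main__":
    # Placeholder values (assumptions): use the address and port the server was launched with.
    host, port = "127.0.0.1", 8000
    state = "accepting connections" if port_is_open(host, port) else "not reachable"
    print(f"{host}:{port} is {state}")
```

If the check returns False, the process has probably not reached the point of binding its port, which points at model loading or initialization rather than at the request path. If it returns True, the URL or route is the more likely culprit.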

Is there any other information? A normal startup should look like the figure below. (image)

cauyxy, Feb 23 '23 05:02

I have run into the same problem. I started bloom-560m with the Docker image hpcaitech/energon-ai:latest.

Information
4090
CUDA 11.3
transformers 4.24.0
colossalai 0.2.0+torch1.12cu11.3
energonai 0.0.1+torch1.12cu11.3
torch 1.12.1

Running bloom-560m and bloom-7b1, the application hangs. No other logs are printed.

(image)

(image)

baibaiw5, Feb 28 '23 02:02

Commenting out random_init in run.sh fixed it; the server can now be started:
python server.py --tp ${GPU_NUM} --name ${DATASET} --dtype "int8" --max_batch_size 4 --random_model_size "560m" #--random_init False

baibaiw5, Feb 28 '23 02:02
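One possible reason that removing `--random_init False` changes behavior, which this thread does not confirm, is a known `argparse` pitfall: if a flag is declared with `type=bool`, any non-empty string, including "False", is converted to `True`. The sketch below reproduces that pitfall in isolation. The flag declaration is an assumption for illustration, since the thread does not show how server.py defines `--random_init`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser()
    # Assumed declaration (hypothetical): the real server.py may define this flag differently.
    parser.add_argument("--random_init", type=bool, default=False)
    return parser


parser = build_parser()

# Flag omitted: the default applies.
omitted = parser.parse_args([]).random_init

# Flag passed as the string "False": bool("False") is True because the string is non-empty.
explicit_false = parser.parse_args(["--random_init", "False"]).random_init

print(omitted)         # False
print(explicit_false)  # True
```

Under that assumption, passing `--random_init False` would enable random initialization rather than disable it, and omitting the flag would be the only way to get the default. Checking the actual `add_argument` call in server.py would settle whether this applies.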