DeepSpeed
[BUG] transformer_inference.so: cannot open shared object file: No such file or directory
Describe the bug
This error occurs during the third stage (step 3) of RLHF training.
To Reproduce
Steps to reproduce the behavior:
sh step3_rlhf_finetuning/training_scripts/single_gpu/run_1.3b.sh
Expected behavior A clear and concise description of what you expected to happen.
ds_report output
Please run ds_report
to give us details about your setup.
Screenshots
If applicable, add screenshots to help explain your problem.
System info (please complete the following information):
- OS: [e.g. Ubuntu 18.04]
- GPU count and types: four V100 32G
- (if applicable) what DeepSpeed-MII version are you using
- (if applicable) Hugging Face Transformers/Accelerate/etc. versions
- Python version
- Any other relevant info about your setup
Docker context Are you using a specific docker image that you can share?
Additional context Add any other context about the problem here.
I am facing the same issue when running inference with DeepSpeed-MII. Any progress?
No progress.
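Can you try installing DeepSpeed from source with the ops pre-built (DS_BUILD_OPS=1), skipping the async I/O and sparse attention ops: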
git clone https://github.com/microsoft/DeepSpeed
cd DeepSpeed
DS_BUILD_OPS=1 DS_BUILD_AIO=0 DS_BUILD_SPARSE_ATTN=0 pip install -e . --global-option="build_ext" --global-option="-g" --global-option="-j8" --no-cache -v --disable-pip-version-check
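Before or after rebuilding, it may help to check whether a transformer_inference shared object exists at all where the loader is looking. The snippet below is only a rough sketch: find_op_so is a hypothetical helper written for this thread, not part of DeepSpeed, and it assumes the failing library comes from PyTorch's JIT extension cache (TORCH_EXTENSIONS_DIR, defaulting to ~/.cache/torch_extensions). If your traceback shows a different path, search that directory instead.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a DeepSpeed command):
# list shared objects under DIR whose file name starts with NAME.
# Usage: find_op_so DIR [NAME]
find_op_so() {
    local dir="$1"
    local name="${2:-transformer_inference}"
    # Fail early if the directory does not exist.
    [ -d "$dir" ] || return 1
    find "$dir" -type f -name "${name}*.so" -print
}

# Assumption: the op was JIT-compiled into PyTorch's extension cache.
ext_dir="${TORCH_EXTENSIONS_DIR:-$HOME/.cache/torch_extensions}"

if matches="$(find_op_so "$ext_dir")" && [ -n "$matches" ]; then
    echo "Found compiled op(s):"
    echo "$matches"
else
    echo "No transformer_inference*.so under $ext_dir"
fi
```

If nothing is found, or the op's build directory exists without a .so in it, an earlier build most likely did not finish. Removing that op's directory from the cache and re-running (or installing with the pre-built ops as above) forces a fresh build.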
I have the same problem with DeepSpeed inference on BLOOM 176B.