
Currently we use a single GPU for decoding. We plan to support multi-GPU decoding; the script is on the way.

Learneducn opened this issue 1 year ago • 5 comments

          Currently we use a single GPU for decoding. We plan to support multi-GPU decoding; the script is on the way.

Originally posted by @ddlBoJack in https://github.com/X-LANCE/SLAM-LLM/issues/100#issuecomment-2151660523

Learneducn avatar Jul 20 '24 13:07 Learneducn

Hello, excuse me. When I run the inference and training scripts, I specify the CUDA ID, but they always default to cuda:0. How can I solve this?

Learneducn avatar Jul 20 '24 13:07 Learneducn

Hello, excuse me. When I ran the inference and training scripts, I specified the CUDA ID, but they always defaulted to cuda:0. How should I solve this? In short, this error was reported: torch.distributed.elastic.multiprocessing.errors.ChildFailedError

Learneducn avatar Jul 20 '24 14:07 Learneducn

Have you solved it? I think this may relate to a local problem with your GPU config.

ddlBoJack avatar Jul 23 '24 12:07 ddlBoJack
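A note on the symptom above: PyTorch numbers devices relative to what the process can see, so code that hard-codes `cuda:0` will still run on the GPU you want if you mask the others with `CUDA_VISIBLE_DEVICES`. A minimal sketch, assuming a placeholder decode script `inference_batch.py` and config path (not SLAM-LLM's actual entry point or flags):

```bash
# Hedged sketch: pin decoding to physical GPU 2. Only GPU 2 is visible to
# the process, and inside the process it shows up as cuda:0.
CUDA_VISIBLE_DEVICES=2 python inference_batch.py --config conf/decode.yaml

# The same masking works with torchrun; keep nproc_per_node=1 for
# single-GPU decoding so the elastic launcher does not spawn extra workers.
CUDA_VISIBLE_DEVICES=2 torchrun --nproc_per_node=1 inference_batch.py --config conf/decode.yaml
```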

I believe there is a straightforward implementation for multi-GPU support. You can wrap the existing script with an outer script that handles the splitting of the test set and passes the GPU IDs accordingly. This approach is similar to what FunASR did previously.

fclearner avatar Jul 24 '24 01:07 fclearner
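For reference, a minimal sketch of the wrapper idea described above: split the test set into one shard per GPU, decode each shard in a background process pinned to its own card, then merge the outputs. The test-set path, the decode script `inference_batch.py`, and its `--test_data`/`--output` flags are assumptions for illustration, not SLAM-LLM's actual interface:

```bash
#!/usr/bin/env bash
# Hedged sketch of an outer multi-GPU decoding wrapper (file names and
# decode-script flags are hypothetical, not SLAM-LLM's real CLI).
set -e

NGPU=4
TESTSET=data/test.jsonl          # one utterance per line (assumption)
OUTDIR=decode_out
mkdir -p "$OUTDIR"

# 1. Split the test set into NGPU shards without breaking lines:
#    $OUTDIR/shard_00, shard_01, ... (GNU split)
split -n l/$NGPU -d "$TESTSET" "$OUTDIR/shard_"

# 2. Launch one single-GPU decode process per shard, each pinned to its
#    own card via CUDA_VISIBLE_DEVICES.
for i in $(seq 0 $((NGPU - 1))); do
  shard=$(printf "%s/shard_%02d" "$OUTDIR" "$i")
  CUDA_VISIBLE_DEVICES=$i \
    python inference_batch.py \
      --test_data "$shard" \
      --output "$OUTDIR/result_$i.txt" &
done
wait   # block until all shards finish

# 3. Merge the per-GPU results back into one file.
cat "$OUTDIR"/result_*.txt > "$OUTDIR/result_all.txt"
```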

Thank you very much. The problem of specifying a particular card for testing has been solved. I have now run into another question: I used the SLAM framework to fine-tune and then ran inference; why do these results differ from testing directly with the open-source Whisper model?

Learneducn avatar Jul 24 '24 04:07 Learneducn