ChatGLM-6B
[BUG/Help] How can I use data parallelism during inference?
### Is there an existing issue for this?
- [X] I have searched the existing issues
### Current Behavior
Could anyone advise how to run data-parallel inference across multiple GPUs (32 GB of VRAM each)? I tried the conventional data-parallel setup used for models like BERT:

```python
import os
import torch
from torch import nn
from transformers import AutoModel, AutoTokenizer

os.environ['CUDA_VISIBLE_DEVICES'] = args.gpu
device = 'cuda' if torch.cuda.is_available() else 'cpu'
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True, revision="v1.1.0")
model = AutoModel.from_pretrained(model_dir, trust_remote_code=True, revision="v1.1.0").half().cuda()
model = model.eval()
model = nn.DataParallel(model)
print("Running with data parallelism...")
model.to(device)
```

However, this does not place a model replica on each card, and it raises an error at runtime. I suspect the project already provides a helper for data-parallel inference, but I have not been able to find it. Any pointers would be greatly appreciated! @cifangyiquan @yfyang86
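For context on what was attempted: `nn.DataParallel` replicates the model inside a single process and splits each forward batch across GPUs, which does not map well onto autoregressive generation. A common workaround is to run one independent worker per GPU, each loading its own model copy, and shard the prompt list across workers. The sketch below is an illustration under assumptions, not a ChatGLM-provided API: `model_dir` is hypothetical, the actual model-loading and `model.chat` lines are commented out so the sharding logic is shown on its own, and the worker returns placeholder strings instead of real generations.

```python
import os
from multiprocessing import Pool


def shard(items, n):
    """Round-robin split of a list into n roughly equal shards."""
    return [items[i::n] for i in range(n)]


def run_on_gpu(args):
    """Run one shard of prompts on one GPU, in its own process."""
    gpu_id, prompts = args
    # Pin this worker to a single card before any CUDA context is created,
    # so the model copy it loads lands on that GPU.
    os.environ["CUDA_VISIBLE_DEVICES"] = str(gpu_id)
    # Hypothetical per-worker model load (model_dir is an assumption):
    # from transformers import AutoModel, AutoTokenizer
    # tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
    # model = AutoModel.from_pretrained(model_dir, trust_remote_code=True).half().cuda().eval()
    # return [model.chat(tokenizer, p, history=[])[0] for p in prompts]
    return [f"gpu{gpu_id}: {p}" for p in prompts]  # placeholder output


if __name__ == "__main__":
    prompts = [f"question {i}" for i in range(4)]
    num_gpus = 2
    with Pool(num_gpus) as pool:
        # One (gpu_id, shard) task per worker; results come back per shard.
        results = pool.map(run_on_gpu, enumerate(shard(prompts, num_gpus)))
```

This trades memory for simplicity: every GPU holds a full half-precision copy of the model, but each worker can call generation independently, which `DataParallel`'s per-batch scatter/gather cannot do for `chat`/`generate`-style loops.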
### Expected Behavior
_No response_
### Steps To Reproduce
```
torch~=1.10.0
numpy~=1.23.5
pandas~=1.5.3
transformers~=4.27.1
utils~=1.0.1
```
### Environment
```markdown
- OS: CentOS Linux release 7.6.1810 (Core)
- Python: 3.8.16
- Transformers: 4.27.1
- PyTorch: 1.10.0
- CUDA Support (`python -c "import torch; print(torch.cuda.is_available())"`) :
```

### Anything else?

_No response_