Jack BAI
Good grief, "just throw money at it"? Tweaking the parameters works just fine.
https://pytorch.org/docs/stable/generated/torch.nn.DataParallel.html
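For reference, a minimal `nn.DataParallel` sketch following the linked docs (the `Linear` model here is just a placeholder, not the project's actual model):

```python
import torch
import torch.nn as nn

model = nn.Linear(128, 10)
if torch.cuda.device_count() > 1:
    model = nn.DataParallel(model)   # splits each batch across visible GPUs
model.to('cuda')

x = torch.randn(64, 128, device='cuda')
out = model(x)                       # outputs gathered back on the default device
```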
Further hint: just change `device='cuda'` to `device='cuda:0'` and the problem goes away.
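A minimal sketch of the suggested change, assuming the ambiguous device string is the culprit (the model below is a stand-in):

```python
import torch

model = torch.nn.Linear(10, 10)
# Before: model.to(device='cuda')   # bare 'cuda' string, device index implicit
model.to(device='cuda:0')           # pin to an explicit GPU index, per the hint
```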
I run four V100s on a single machine with DistributedDataParallel + Apex: about 10 seconds per epoch, and a 5M corpus (0.1B) basically converges within an hour.
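For context, a minimal single-node DDP training sketch, launched with `torchrun --nproc_per_node=4 train.py`; native `torch.cuda.amp` stands in here for Apex mixed precision, and the model and loop are placeholders:

```python
import os
import torch
import torch.distributed as dist
from torch.nn.parallel import DistributedDataParallel as DDP

def main():
    dist.init_process_group(backend='nccl')
    local_rank = int(os.environ['LOCAL_RANK'])
    torch.cuda.set_device(local_rank)

    # Placeholder model; the thread's actual model is not shown.
    model = torch.nn.Linear(512, 512).cuda(local_rank)
    model = DDP(model, device_ids=[local_rank])
    opt = torch.optim.Adam(model.parameters())
    scaler = torch.cuda.amp.GradScaler()

    for _ in range(10):  # placeholder training loop
        x = torch.randn(32, 512, device=local_rank)
        with torch.cuda.amp.autocast():
            loss = model(x).pow(2).mean()
        opt.zero_grad()
        scaler.scale(loss).backward()
        scaler.step(opt)
        scaler.update()

    dist.destroy_process_group()

if __name__ == '__main__':
    main()
```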
It's been added.
This project is mainly built with PyTorch, right? Could you point out where tf is used?
Thanks a lot for your contribution. Could you provide sample snippets for using the hidden states - specifically, what does the returned `hidden_states` vector contain?
Just figured it out - so the hidden states output vector is a **concatenation** of all the hidden states at the last layer. From a functional standpoint I would strongly...
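To make the concatenation concrete, here is a sketch using Hugging Face `transformers` as a stand-in (the patched `return_hidden_states` API discussed in this thread is project-specific, and `gpt2` is just an example checkpoint):

```python
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained('gpt2')
model = AutoModel.from_pretrained('gpt2')

inputs = tok('hello world', return_tensors='pt')
with torch.no_grad():
    out = model(**inputs, output_hidden_states=True)

# out.hidden_states is a tuple: embeddings plus one tensor per layer,
# each of shape (batch, seq_len, hidden_dim). The last entry is the
# final layer; flattening it concatenates the per-token states.
last = out.hidden_states[-1]     # (1, seq_len, 768) for gpt2
flat = last.reshape(-1)          # concatenated per-token hidden states
print(len(out.hidden_states), last.shape, flat.shape)
```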
Thanks for the fix. I also found that with `return_hidden_states=True`, GPU memory usage keeps going up when applying your patch and calling `llm.generate`. I guess it can be solved by...
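For what it's worth, one general pattern for this kind of memory creep (a hedged sketch, not necessarily the fix the comment above alludes to) is to detach the returned states and move them off-GPU after each call:

```python
import torch

collected = []

def stash(hidden_states: torch.Tensor) -> None:
    # Detaching drops the autograd graph; moving to CPU lets the CUDA
    # allocation be freed once no other references remain.
    collected.append(hidden_states.detach().to('cpu'))
```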
Confirmed that this fix solved the problem.