carlos gemmell
Same here. Any update?
Same result since it's just using one GPU for 7B.
yup @Markhzz
```
(llama) user@e9242bd8ac2c:~/llama$ free -h
               total        used        free      shared  buff/cache   available
Mem:           440Gi       246Gi        44Gi       1.6Gi       149Gi       189Gi
Swap:          507Gi       365Mi       507Gi
```
I just ran this code and got the same output, @mperacchi. Cheers. My system is 2x 3090 (24 GB VRAM each). I'm going to try 13B with the following command: `CUDA_VISIBLE_DEVICES="0,1" torchrun...`
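For reference, a plausible full invocation for 13B on two GPUs, sketched from the public llama repo's README; the script name and paths are assumptions, not taken from the truncated command above:
```
# Sketch only: example.py, ./13B and ./tokenizer.model are assumed from the
# reference repo layout; 13B needs 2 model-parallel processes.
CUDA_VISIBLE_DEVICES="0,1" torchrun --nproc_per_node 2 example.py \
    --ckpt_dir ./13B \
    --tokenizer_path ./tokenizer.model
```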
@sweetpeach I see the device set for Theano is CPU in train.sh. Since CPU training from scratch would be slow, could you describe your setup for GPU training, please? I...
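For anyone else hitting this, a minimal sketch of pointing Theano at the GPU via environment flags; `train.py` stands in for whatever script train.sh actually invokes:
```
# The old backend used device=gpu; the newer libgpuarray backend uses device=cuda.
THEANO_FLAGS="device=cuda,floatX=float32" python train.py
```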
@jingtaozhan, @pertschuk Were you able to successfully run the model with the correct type embeddings? I have the same issue with the [2,1024] tensor. How do we map the current...
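Not a confirmed fix, but if the mismatch is between duoBERT's three segment types and HuggingFace BERT's default `type_vocab_size=2` (an assumption on my part), a sketch of widening the [2,1024] table would be:
```python
import torch
from transformers import BertForSequenceClassification

# Hypothetical starting point: a stock BERT-large with a [2, 1024]
# token type embedding table.
model = BertForSequenceClassification.from_pretrained("bert-large-uncased")

emb = model.bert.embeddings.token_type_embeddings      # Embedding(2, 1024)
new_emb = torch.nn.Embedding(3, emb.weight.size(1))
new_emb.weight.data[:2] = emb.weight.data              # keep the two trained rows
new_emb.weight.data[2] = emb.weight.data[1]            # assumption: seed type 2 from type 1
model.bert.embeddings.token_type_embeddings = new_emb
model.config.type_vocab_size = 3
```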
I can confirm this works correctly when loading the weights into HuggingFace (PyTorch). The pretrained dir needs to include the following, just by changing the file names:
- config.json
- duobert-large-msmarco-pretrained-and-finetuned.zip
- ...
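To illustrate, a minimal loading sketch; the directory name is hypothetical, and I'm assuming the renamed weights end up as a standard `pytorch_model.bin` alongside `config.json`:
```python
from transformers import BertTokenizer, BertForSequenceClassification

# Hypothetical local path containing config.json plus the renamed weight file.
model_dir = "./duobert-large-msmarco"

# Tokenizer vocab isn't in the file list above, so fall back to the base model.
tokenizer = BertTokenizer.from_pretrained("bert-large-uncased")
model = BertForSequenceClassification.from_pretrained(model_dir)
```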