carlos gemmell

Results: 7 comments of carlos gemmell

Same here. Any update?

Same result since it's just using one GPU for 7B.

yup @Markhzz

```
(llama) user@e9242bd8ac2c:~/llama$ free -h
               total        used        free      shared  buff/cache   available
Mem:           440Gi       246Gi        44Gi       1.6Gi       149Gi       189Gi
Swap:          507Gi       365Mi       507Gi
```

I just ran this code and get the same output, @mperacchi. Cheers. My system is 2x 3090 with 24 GB VRAM each. I'm going to try 13B with the following command: `CUDA_VISIBLE_DEVICES="0,1" torchrun...`
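As a sanity check before launching the run, here is a minimal sketch (not from the original thread) for confirming that both 3090s are actually visible to PyTorch under that `CUDA_VISIBLE_DEVICES` setting:

```python
import os
import torch

# Restrict this process to the two 3090s; torchrun workers inherit this variable.
os.environ["CUDA_VISIBLE_DEVICES"] = "0,1"

# Should report 2 if both cards are visible to the CUDA runtime.
print(torch.cuda.device_count())
for i in range(torch.cuda.device_count()):
    print(torch.cuda.get_device_name(i))
```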

@sweetpeach I see the device set for Theano is CPU in train.sh. Since CPU training from scratch would be slow, could you please describe your setup for GPU training? I...
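For what it's worth, a minimal sketch of forcing Theano onto the GPU (assuming the gpuarray backend; the exact flags depend on the Theano version used by train.sh, so treat this as an illustration rather than the author's setup):

```python
import os

# Must be set before Theano is imported; "device=cuda" targets the gpuarray
# backend, while older Theano setups use "device=gpu" instead.
os.environ["THEANO_FLAGS"] = "device=cuda,floatX=float32"

import theano

print(theano.config.device)  # expect "cuda" rather than "cpu"
```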

@jingtaozhan, @pertschuk Were you able to successfully run the model with the correct type embeddings? I have the same issue with the [2,1024] tensor. How do we map the current...
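For context on the shape: [2, 1024] is what a standard BERT-large token type (segment) embedding table looks like (type_vocab_size=2, hidden_size=1024). A minimal sketch for inspecting the corresponding tensor in HuggingFace, with `bert-large-uncased` used purely for illustration:

```python
from transformers import BertModel

# bert-large-* has hidden_size=1024 and type_vocab_size=2,
# so the token type embedding table has shape [2, 1024].
model = BertModel.from_pretrained("bert-large-uncased")
tte = model.embeddings.token_type_embeddings.weight
print(tte.shape)  # torch.Size([2, 1024])
```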

I can confirm this works correctly when loading the weights into HuggingFace (PyTorch). The pretrained dir needs to include the following, just by changing the file names:

- config.json
- duobert-large-msmarco-pretrained-and-finetuned.zip
- ...
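For anyone following along, a minimal sketch of one way to load a TensorFlow BERT checkpoint directory into the HuggingFace PyTorch classes; the directory path, checkpoint file names, and the choice of `BertForSequenceClassification` are assumptions for illustration, not taken from the thread:

```python
from transformers import BertConfig, BertForSequenceClassification

# Directory containing config.json plus the renamed TF checkpoint files
# (model.ckpt.index / model.ckpt.data-*); the name here is illustrative.
pretrained_dir = "./duobert-large-msmarco"

config = BertConfig.from_pretrained(pretrained_dir)

# from_tf=True tells transformers to convert the TF checkpoint weights
# into the PyTorch model on the fly (requires tensorflow to be installed).
model = BertForSequenceClassification.from_pretrained(
    pretrained_dir, from_tf=True, config=config
)

# Save a plain PyTorch copy for faster loading next time.
model.save_pretrained("./duobert-large-msmarco-pytorch")
```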