carlos gemmell
Same here. Any update?
Same result since it's just using one GPU for 7B.
yup @Markhzz
```
(llama) user@e9242bd8ac2c:~/llama$ free -h
               total        used        free      shared  buff/cache   available
Mem:           440Gi       246Gi        44Gi       1.6Gi       149Gi       189Gi
Swap:          507Gi       365Mi       507Gi
```
I just ran this code and got the same output, @mperacchi. Cheers. My system is 2x 3090 (24 GB VRAM each). I'm going to try 13B with the following command: `CUDA_VISIBLE_DEVICES="0,1" torchrun...`
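For reference, a plausible full invocation for 13B on two GPUs, sketched from the public llama repo's README; the script name and paths are assumptions, not taken from the truncated command above:
```
# Sketch only: example.py, ./13B and ./tokenizer.model are assumed from the
# reference repo layout; 13B needs 2 model-parallel processes.
CUDA_VISIBLE_DEVICES="0,1" torchrun --nproc_per_node 2 example.py \
    --ckpt_dir ./13B \
    --tokenizer_path ./tokenizer.model
```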
@sweetpeach I see the device set for Theano is CPU in train.sh. Since CPU training from scratch would be slow, could you describe your setup for GPU training, please? I...
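For anyone else hitting this, a minimal sketch of pointing Theano at the GPU via environment flags; `train.py` stands in for whatever script train.sh actually invokes:
```
# The old backend used device=gpu; the newer libgpuarray backend uses device=cuda.
THEANO_FLAGS="device=cuda,floatX=float32" python train.py
```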
@jingtaozhan, @pertschuk Were you able to successfully run the model with the correct type embeddings? I have the same issue with the [2,1024] tensor. How do we map the current...
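Not a confirmed fix, but if the mismatch is between duoBERT's three segment types and HuggingFace BERT's default `type_vocab_size=2` (an assumption on my part), a sketch of widening the [2,1024] table would be:
```python
import torch
from transformers import BertForSequenceClassification

# Hypothetical starting point: a stock BERT-large with a [2, 1024]
# token type embedding table.
model = BertForSequenceClassification.from_pretrained("bert-large-uncased")

emb = model.bert.embeddings.token_type_embeddings      # Embedding(2, 1024)
new_emb = torch.nn.Embedding(3, emb.weight.size(1))
new_emb.weight.data[:2] = emb.weight.data              # keep the two trained rows
new_emb.weight.data[2] = emb.weight.data[1]            # assumption: seed type 2 from type 1
model.bert.embeddings.token_type_embeddings = new_emb
model.config.type_vocab_size = 3
```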
I can confirm this works correctly when loading the weights into HuggingFace (PyTorch). The pretrained dir needs to include the following, just by changing the file names:
- config.json
- duobert-large-msmarco-pretrained-and-finetuned.zip
- ...
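To illustrate, a minimal loading sketch; the directory name is hypothetical, and I'm assuming the renamed weights end up as a standard `pytorch_model.bin` alongside `config.json`:
```python
from transformers import BertTokenizer, BertForSequenceClassification

# Hypothetical local path containing config.json plus the renamed weight file.
model_dir = "./duobert-large-msmarco"

# Tokenizer vocab isn't in the file list above, so fall back to the base model.
tokenizer = BertTokenizer.from_pretrained("bert-large-uncased")
model = BertForSequenceClassification.from_pretrained(model_dir)
```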