Benoit Favre comments


Tagging Apex version

I am using Red Hat Enterprise Linux release 8.1

Tagging Apex version

@brightbsit Apex does run with PyTorch 1.7, but Oscar doesn't. Which commit of Apex did you successfully compile with PyTorch 1.2?

Tagging Apex version

Do you mean apex@a651e2c24ecf97cbf367fd3f330df36760e1c597?

how to run the largest possible model on a single A100 80Gb

> I had an idea that you might use GPU virtualisation to make it appear like you had two GPUs. Well, it wasn't my idea, it was ChatGPT's. Seems...

how to run the largest possible model on a single A100 80Gb

I was able to run the 13B and 30B (batch size 1) models on a single A100-80GB. I used a script [1] to reshard the models and torchrun with --nproc_per_node...

Are the weights of the lm head of the model tied with the word embeddings?

Long ago, in the days of LSTMs, I used to train a lot of LMs, and tying embedding weights was a game changer: much faster convergence and less memory usage....
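To put a number on the memory side of that remark, here is a minimal arithmetic sketch. It assumes a model whose input embedding and output projection (LM head) both have shape vocabulary size × hidden size; the function name and the sizes used are hypothetical and not taken from any particular model.

```python
def lm_param_counts(vocab_size: int, hidden_dim: int) -> tuple[int, int]:
    """Return (untied, tied) parameter counts for embedding plus LM head.

    Assumes both matrices hold vocab_size * hidden_dim weights and that
    tying means the LM head reuses the embedding matrix.
    """
    embedding = vocab_size * hidden_dim
    lm_head = hidden_dim * vocab_size
    untied = embedding + lm_head  # two separate matrices
    tied = embedding              # one shared matrix
    return untied, tied


# Illustrative sizes only (assumption): 32,000 tokens, hidden size 4,096.
untied, tied = lm_param_counts(vocab_size=32_000, hidden_dim=4_096)
saved = untied - tied
```

With these assumed sizes, tying removes one full vocabulary-sized matrix, which is the saving the comment alludes to; it says nothing about the convergence claim.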

I am getting the following error when trying batched inference. Did you need any trick?

```
../aten/src/ATen/native/cuda/Indexing.cu:1088: indexSelectSmallIndex: block: [4,0,0], thread: [44,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
```

anyone tried batch inference?

Thanks, the problem came from elsewhere. Note that I had to use

```
tokenizer = LlamaTokenizer.from_pretrained(config.backbone, padding_side='left')
```
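For readers wondering what `padding_side='left'` changes, the following is a minimal pure-Python sketch of left padding, not the tokenizer's actual implementation. The helper name `left_pad` and the pad id `0` are assumptions for illustration. The usual motivation is that a decoder-only model appends new tokens at the right end of each sequence, so padding is placed on the left to keep every prompt's last real token in the final position.

```python
from typing import List, Tuple


def left_pad(batch: List[List[int]], pad_id: int) -> Tuple[List[List[int]], List[List[int]]]:
    """Pad variable-length token id lists on the left to a common length.

    Returns (input_ids, attention_mask); the mask is 0 over padding
    and 1 over real tokens.
    """
    max_len = max(len(seq) for seq in batch)
    input_ids, attention_mask = [], []
    for seq in batch:
        n_pad = max_len - len(seq)
        input_ids.append([pad_id] * n_pad + seq)
        attention_mask.append([0] * n_pad + [1] * len(seq))
    return input_ids, attention_mask


# Hypothetical token ids; pad_id=0 is an assumption.
ids, mask = left_pad([[5, 6, 7], [8]], pad_id=0)
```

In this sketch every row ends with a real token, so generation for each example in the batch continues directly from its prompt.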

Is it possible to do inference (i.e. run generate.py) without a CUDA GPU?

> Why do we need this step https://github.com/TimDettmers/bitsandbytes/blob/main/compile_from_source.md instead of just doing pip install?

It requires compiling a library that is not installed by pip.

Question: minibatch data is not contiguous?

As far as I understand, a minibatch should in general contain independent examples (so that its gradient is a good estimate of the global gradient). In RNNs, examples are not...