Benoit Favre

20 comments by Benoit Favre

I am using Red Hat Enterprise Linux release 8.1

@brightbsit Apex does run with pytorch 1.7, but Oscar doesn't. What commit of Apex did you successfully compile with pytorch 1.2?

Do you mean apex@a651e2c24ecf97cbf367fd3f330df36760e1c597?

> I had an idea that you might use GPU virtualisation to make it appear like you had two GPUs. Well, it wasn't my idea, it was ChatGPT's. Seems...

I was able to run the 13B and 30B (batch size 1) models on a single A100-80GB. I used a script [1] to reshard the models and torchrun with --nproc_per_node...

Long ago, in the days of LSTMs, I used to train a lot of LMs, and tying embedding weights was a game changer: much faster convergence and less memory usage....
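The memory side of that remark comes down to simple arithmetic: with tied weights, the input embedding and the output projection share a single matrix instead of storing two. A minimal sketch of the parameter count, assuming both matrices have shape (vocabulary size, hidden dimension) and no bias; the function name and the sizes below are hypothetical and chosen only for illustration, they do not come from the comment:

```python
def embedding_params(vocab_size: int, hidden_dim: int, tied: bool) -> int:
    """Count parameters in the input embedding plus the output projection.

    Assumes both matrices have shape (vocab_size, hidden_dim) and no bias.
    """
    per_matrix = vocab_size * hidden_dim
    # Tied weights store one shared matrix; untied weights store two.
    return per_matrix if tied else 2 * per_matrix


# Hypothetical sizes, for illustration only.
VOCAB_SIZE = 50_000
HIDDEN_DIM = 512

untied = embedding_params(VOCAB_SIZE, HIDDEN_DIM, tied=False)
tied = embedding_params(VOCAB_SIZE, HIDDEN_DIM, tied=True)
saved = untied - tied  # parameters no longer stored separately
```

Under these assumptions, tying halves the parameters spent on the vocabulary-sized matrices, which matters most when the vocabulary is large relative to the rest of the model.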

I am getting the following error when trying batched inference. Did you need any trick?

```
../aten/src/ATen/native/cuda/Indexing.cu:1088: indexSelectSmallIndex: block: [4,0,0], thread: [44,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
```

Thanks, the problem came from elsewhere. Note that I had to use

```
tokenizer = LlamaTokenizer.from_pretrained(config.backbone, padding_side='left')
```
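To illustrate why left padding matters for batched generation: a decoder-only model continues from the last position of each row, so with right padding the shorter rows would end in pad tokens rather than in real text. The sketch below is a plain-Python illustration of the two padding layouts, not the tokenizer's actual implementation; `pad_batch` and `pad_id=0` are hypothetical names and values introduced here.

```python
from typing import List, Sequence, Tuple


def pad_batch(
    sequences: Sequence[Sequence[int]], pad_id: int, side: str = "left"
) -> Tuple[List[List[int]], List[List[int]]]:
    """Pad token-id sequences to a common length.

    Returns (input_ids, attention_mask); the mask is 1 for real tokens
    and 0 for padding.
    """
    if side not in ("left", "right"):
        raise ValueError("side must be 'left' or 'right'")
    max_len = max(len(seq) for seq in sequences)
    input_ids, attention_mask = [], []
    for seq in sequences:
        n_pad = max_len - len(seq)
        pads, zeros, ones = [pad_id] * n_pad, [0] * n_pad, [1] * len(seq)
        if side == "left":
            input_ids.append(pads + list(seq))
            attention_mask.append(zeros + ones)
        else:
            input_ids.append(list(seq) + pads)
            attention_mask.append(ones + zeros)
    return input_ids, attention_mask


# Hypothetical token ids, for illustration only.
batch = [[5, 6, 7], [8, 9]]
left_ids, left_mask = pad_batch(batch, pad_id=0, side="left")
right_ids, right_mask = pad_batch(batch, pad_id=0, side="right")
```

With left padding every row ends in a real token (`left_ids[1][-1]` is `9`), whereas with right padding the shorter row ends in the pad id.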

> Why do we need this step https://github.com/TimDettmers/bitsandbytes/blob/main/compile_from_source.md instead of just doing pip install?

It requires compiling a library which is not installed by pip.

As far as I understand, in general, a minibatch should process independent examples (for the gradient to be a good estimate of the global gradient). In RNNs, examples are not...