Benoit Favre
I am using Red Hat Enterprise Linux release 8.1
@brightbsit Apex does run with pytorch 1.7, but Oscar doesn't. What commit of Apex did you successfully compile with pytorch 1.2?
Do you mean apex@a651e2c24ecf97cbf367fd3f330df36760e1c597?
> I had an idea that you might use GPU virtualisation to make it appear like you had two GPUs. Well, it wasn't my idea, it was ChatGPT's. Seems...
I was able to run the 13B and 30B (batch size 1) models on a single A100-80GB. I used a script [1] to reshard the models and ran torchrun with --nproc_per_node...
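Roughly, resharding just merges the tensor-parallel shards by concatenating each split parameter along its partition dimension. Since the script [1] itself isn't shown here, the snippet below is only a sketch of that idea: the `dim_for` rule and the file names are placeholders, and the real script picks the correct concat dimension per layer type.
```
from typing import Optional
import torch

def dim_for(name: str) -> Optional[int]:
    # Placeholder rule: the real script knows, per layer type, whether a
    # weight was split across GPUs by rows or by columns, or just replicated.
    if name.endswith("norm.weight"):
        return None        # replicated parameter: keep a single copy
    return 0               # hypothetical concat dimension

def merge_shards(paths):
    shards = [torch.load(p, map_location="cpu") for p in paths]
    merged = {}
    for name, tensor in shards[0].items():
        d = dim_for(name)
        merged[name] = tensor if d is None else torch.cat([s[name] for s in shards], dim=d)
    return merged

# e.g. merge two shards into a single checkpoint for one GPU (illustrative file names)
# merged = merge_shards(["consolidated.00.pth", "consolidated.01.pth"])
# torch.save(merged, "resharded/consolidated.00.pth")
```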
Long ago at the time of LSTMs, I used to train a lot of LMs and tying embedding weights was a game changer: much faster convergence and less memory usage....
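Concretely, tying means reusing the input embedding matrix as the output projection, so the model has one vocabulary matrix instead of two. A minimal PyTorch sketch (layer sizes and names are illustrative):
```
import torch.nn as nn

class TiedLSTMLM(nn.Module):
    """Toy LSTM language model with tied input/output embeddings."""
    def __init__(self, vocab_size: int, dim: int = 512):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, dim)
        self.lstm = nn.LSTM(dim, dim, num_layers=2, batch_first=True)
        self.out = nn.Linear(dim, vocab_size, bias=False)
        self.out.weight = self.embed.weight  # tying: one matrix for both lookup and logits

    def forward(self, tokens):                # tokens: (batch, seq_len) of token ids
        h, _ = self.lstm(self.embed(tokens))  # h: (batch, seq_len, dim)
        return self.out(h)                    # logits: (batch, seq_len, vocab_size)
```
The only constraint is that the embedding size must match the decoder's hidden size, or you add a projection in between.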
I am getting the following error when trying batched inference. Did you need any trick?
```
../aten/src/ATen/native/cuda/Indexing.cu:1088: indexSelectSmallIndex: block: [4,0,0], thread: [44,0,0] Assertion `srcIndex < srcSelectDimSize` failed.
```
Thanks, the problem came from elsewhere. Note that I had to use
```
tokenizer = LlamaTokenizer.from_pretrained(config.backbone, padding_side='left')
```
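For reference, a minimal sketch of batched generation with left padding; the model path is a placeholder standing in for `config.backbone`, and LLaMA needs a pad token set explicitly since its tokenizer does not define one:
```
from transformers import LlamaTokenizer, LlamaForCausalLM

name = "path/to/llama-hf-checkpoint"  # placeholder for config.backbone
tokenizer = LlamaTokenizer.from_pretrained(name, padding_side='left')
tokenizer.pad_token = tokenizer.eos_token  # LLaMA defines no pad token by default
model = LlamaForCausalLM.from_pretrained(name)

prompts = ["The capital of France is", "2 + 2 ="]
batch = tokenizer(prompts, return_tensors="pt", padding=True)  # left-padded batch
out = model.generate(**batch, max_new_tokens=20,
                     pad_token_id=tokenizer.pad_token_id)
print(tokenizer.batch_decode(out, skip_special_tokens=True))
```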
> Why do we need this step https://github.com/TimDettmers/bitsandbytes/blob/main/compile_from_source.md instead of just doing pip install? It requires compiling a native library that pip does not install.
As far as I understand, in general, a minibatch should process independent examples (for the gradient to be a good estimate of the global gradient). In RNNs, examples are not...