Thomas Capelle

Results 169 comments of Thomas Capelle

Lol I was using: `meta-llama/Llama-2-7B-hf` instead of `meta-llama/Llama-2-7b-hf`...
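For context, Hub repo ids are case-sensitive, so only the lowercase `7b` resolves. A minimal sketch of loading it, assuming the standard Hugging Face `transformers` API and access to the gated repo:

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

# Repo ids on the Hugging Face Hub are case-sensitive:
# "meta-llama/Llama-2-7b-hf" exists, "meta-llama/Llama-2-7B-hf" does not.
model_id = "meta-llama/Llama-2-7b-hf"

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)
```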

should we merge this?

Can I merge this, @scottire?


Nope, I have been trying all day. MNIST works fine.

I do here, but with a different codebase: https://wandb.ai/capecape/train_sd/reports/How-to-Train-a-Conditional-Diffusion-Model-from-Scratch--VmlldzoyNzIzNTQ1

Is this going to be fixed? Is there a workaround?

Already, just to train in FP16 you need a ton of memory; the 7B-parameter model needs about 14 GB in FP16 only to load the weights. The gradients would...
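To make the arithmetic explicit, a rough sketch of that estimate (illustrative only; it assumes plain FP16 weights and gradients plus two FP32 Adam-style optimizer states per parameter, and ignores activations and framework overhead):

```python
# Back-of-the-envelope memory estimate for training a 7B-parameter model in FP16.
n_params = 7e9

weights_gb = n_params * 2 / 1e9      # FP16 weights: ~14 GB just to load the model
grads_gb   = n_params * 2 / 1e9      # FP16 gradients: another ~14 GB
optim_gb   = n_params * 4 * 2 / 1e9  # two FP32 Adam states per parameter: ~56 GB

print(f"weights ~{weights_gb:.0f} GB, grads ~{grads_gb:.0f} GB, optimizer ~{optim_gb:.0f} GB")
```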

here you go!

```docker
FROM pytorch/pytorch:2.2.2-cuda12.1-cudnn8-runtime
WORKDIR /workspace/torchtune
# RUN git config --global --add safe.directory /workspace/torchtune   # add this if you have issues
COPY . /workspace/torchtune
RUN python -m pip...
```
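From the torchtune repo root, an image like this typically builds and runs with something along the lines of `docker build -t torchtune-dev .` followed by `docker run --gpus all -it torchtune-dev bash` (the `torchtune-dev` tag is just a placeholder).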