Thomas Capelle
Lol I was using: `meta-llama/Llama-2-7B-hf` instead of `meta-llama/Llama-2-7b-hf`...
should we merge this?
Can I merge this, @scottire?
Do you need help?
Can I merge this?
Nope, I have been trying all day. MNIST works fine.
I do here, but with a different codebase: https://wandb.ai/capecape/train_sd/reports/How-to-Train-a-Conditional-Diffusion-Model-from-Scratch--VmlldzoyNzIzNTQ1
Is this going to be fixed? Is there a workaround?
Even before gradients, training in FP16 takes a ton of memory; the 7B-param model needs 14GB in FP16 just to load the weights. The gradients would...
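For context, here is a rough back-of-the-envelope sketch of where the memory goes. It assumes FP16 weights and gradients plus AdamW keeping two FP32 states per parameter, and it ignores activations; those assumptions are mine, not from the thread.

```python
# Rough memory estimate for full fine-tuning a 7B-parameter model.
# Assumptions: FP16 weights and gradients (2 bytes/param each),
# AdamW with two FP32 states per parameter (8 bytes/param), activations ignored.

def training_memory_gb(n_params: float) -> dict:
    bytes_per_gb = 1e9                            # decimal GB, as in "14GB" above
    weights = n_params * 2 / bytes_per_gb         # FP16 weights
    grads = n_params * 2 / bytes_per_gb           # FP16 gradients
    optimizer = n_params * 2 * 4 / bytes_per_gb   # AdamW m and v in FP32
    return {
        "weights": weights,
        "grads": grads,
        "optimizer": optimizer,
        "total": weights + grads + optimizer,
    }

print(training_memory_gb(7e9))
# -> weights ≈ 14 GB, grads ≈ 14 GB, optimizer ≈ 56 GB, total ≈ 84 GB before activations
```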
here you go!
```docker
FROM pytorch/pytorch:2.2.2-cuda12.1-cudnn8-runtime
WORKDIR /workspace/torchtune
# RUN git config --global --add safe.directory /workspace/torchtune  # add this if you have issues
COPY . /workspace/torchtune
RUN python -m pip...
```
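If it helps, from the repo root you would then build and run it with something like `docker build -t torchtune .` followed by `docker run --gpus all --rm -it torchtune` (the image tag is just an example).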