Rasmus
The main reason a fresh install takes so long is that the patcher creates tens or hundreds of thousands of temporary files in the RADS/temp folder. You...
Well I just tested it now and got a bugsplat at 54% when doing a fresh install :( It does seem to be faster with the tmpfs folder, but maybe...
I think I'm having an issue related to this when saving/loading a struct containing arrays: > ERROR: MethodError: no method matching Array{var"#s20",1} where var"#s20" Stacktrace: > [1] convert(::Type{Array{var"#s20",1} where var"#s20"...
I've been testing this out, and ran into an issue with resuming from a checkpoint. I suspect it's because of how `StatefulDataLoader` handles the state dict: https://github.com/pytorch/data/blob/11e16da61d7f5f587627c75e99ea664efef3e0f8/torchdata/stateful_dataloader/stateful_dataloader.py#L249 That is, a...
> @rlrs Would it be possible to test it after my latest commit ([b9b045d](https://github.com/pytorch/torchtitan/pull/279/commits/b9b045d32933c2824ae6f667e944a51c3255a2d1))? I missed adding that part. I had already added that in my version. I can't get...
I have a straightforward script for converting from HF to a DCP checkpoint, if that helps. Mostly the script already exists in gpt-fast.
Alright so this is the script I'm using for HF->DCP. It uses the safetensors weights (but can easily be converted to load a torch.save instead), which only exist in https://huggingface.co/meta-llama/Meta-Llama-3-8B/tree/main...
Ogre2.3 also fails. On my side it seems to be a Freetype issue: ``` /tmp/ogre2.3-20240103-59973-jivofc/ogre-next-2.3.1/Components/Overlay/src/OgreFont.cpp:48:10: fatal error: 'ft2build.h' file not found #include <ft2build.h> ^~~~~~~~~~~~ 1 error generated. make[2]: *** [Components/Overlay/CMakeFiles/OgreOverlay.dir/src/OgreFont.cpp.o] Error...
Oops, some of these changes are for our internal use. Will remove them from here.
Apologies for the lack of explanation or tests, I rushed this a bit. So far I've used this with Mistral 7B, comparing against the standard [Transformer Math](https://blog.eleuther.ai/transformer-math/) `6PD` calculation, and...
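For context, the `6PD` rule of thumb from the Transformer Math post estimates training compute as roughly 6 FLOPs per parameter per token. A minimal sketch of that baseline follows; the function names and the round numbers in the usage lines are illustrative assumptions, not values taken from the PR.

```python
def flops_per_token_6p(num_params: float) -> float:
    """Approximate training FLOPs per token: about 6 * P (forward + backward)."""
    return 6.0 * num_params


def training_flops_6pd(num_params: float, num_tokens: float) -> float:
    """Approximate total training compute C ~= 6 * P * D, in FLOPs."""
    return flops_per_token_6p(num_params) * num_tokens


# Illustrative (assumed) numbers: a ~7e9-parameter model trained on 1e9 tokens.
params = 7e9
tokens = 1e9
print(f"{training_flops_6pd(params, tokens):.2e} FLOPs")
```

This estimate ignores attention FLOPs that scale with sequence length, so a more detailed per-layer count would be expected to come out somewhat higher.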