NanoCode012
> `vllm serve` works flawlessly

Same with `CUDA_VISIBLE_DEVICES=7` prepended?
If `vllm serve` works, can you just leave that up and run the axolotl train command?
Discord thread for reference: https://discord.com/channels/1104757954588196865/1426831119340273787/1427939353446977607
Thanks for the report. We're aware of this. As a workaround, we're currently considering a post-training script that rewrites the keys.
@zinccat, do you have a working script for the above that you can share? I don't want to duplicate the effort if you've already written one.
Yeah, leaving this gist here for others: https://gist.github.com/NanoCode012/0c971d00a32a7d691bd0c19fc3a6d6e1. @shang-zhu, please give this script a try while we debug the root cause.
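The key rewrite can be sketched roughly like this (a minimal sketch, not the gist itself; it assumes the extra nesting comes from PyTorch's activation-checkpointing wrapper, which prefixes module names with `_checkpoint_wrapped_module.`):

```python
# Minimal sketch of the key-rewriting workaround. Assumption: the offending
# prefix is "_checkpoint_wrapped_module." (PyTorch's CheckpointWrapper naming);
# adjust PREFIX if your checkpoint nests keys differently.
PREFIX = "_checkpoint_wrapped_module."

def rewrite_keys(state_dict: dict) -> dict:
    """Return a copy of state_dict with every wrapper prefix removed."""
    return {key.replace(PREFIX, ""): value for key, value in state_dict.items()}

# Toy example: one wrapped key, one clean key.
broken = {
    "model.layers.0._checkpoint_wrapped_module.self_attn.q_proj.weight": "w0",
    "model.embed_tokens.weight": "w1",
}
fixed = rewrite_keys(broken)
print(sorted(fixed))
# → ['model.embed_tokens.weight', 'model.layers.0.self_attn.q_proj.weight']
```

For a real checkpoint you would load the state dict from disk, rewrite it, and save it back; the gist above is the tested version.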
@NicholasGuerrero, hey, could you provide more of the trace? The issue above is about keys being nested under an additional `_checkpoint_wrapped...` key, which I don't see in yours.
@NicholasGuerrero, thanks for the detailed logs. Are you able to print out the model layers in your checkpoints? Can you see if the keys still contain "_checkpoint_wrapped"? To double...
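One quick way to check without loading any weights: a sharded safetensors checkpoint ships a `model.safetensors.index.json` whose `"weight_map"` maps every key to its shard, so scanning that file is enough (a sketch; the path and marker string are illustrative):

```python
import json

def wrapped_keys(index_path: str, marker: str = "_checkpoint_wrapped") -> list[str]:
    """List checkpoint keys from the index file that still carry the marker."""
    with open(index_path) as f:
        weight_map = json.load(f)["weight_map"]
    return [key for key in weight_map if marker in key]
```

An empty list means the keys were rewritten cleanly. For a single-file checkpoint you would read the key names from the safetensors header instead.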
Hi @glenn-jocher, I tested on my iPhone 11 Pro. Very nice frame rate (19-25 FPS) and accuracy (90+ on recognizable objects). It does get quite hot in a matter of minutes though...
Hey, I'm not sure if this is something we plan to support at the moment. Only `pretraining_datasets:` supports processing on the fly. For regular SFT, we pre-tokenize. There is...