Matthew Stallone

2 comments by Matthew Stallone

Thanks for sharing, @cg123! Agreed, that is promising. Would you mind sharing the scripts you used for fine-tuning? I am also very interested in this aspect of LLM research,...

Same issue here with `TRANSFORMER_BASED_WRAP`: `RuntimeError: 'weight' must be 2-D`. `SIZE_BASED_WRAP` seems to work, but then NCCL times out (30 minutes) on the last request batch. It is hanging on...