Daniel Han comments

Results: 781 comments by Daniel Han

Faster Inference & Training Roadmap

@jeromeku Oh ye let's try to be device agnostic :)) compile is OK, but I guess handwriting is best :) We can then use CUDAGraphs manually

Faster Inference & Training Roadmap

@jeromeku Fantastic work as always!! very very cool on fusing Adam and Galore!! Love this! Oh on Mixtral - https://github.com/shawntan/scattermoe/tree/main/scattermoe :) Was reading up on this as well :) On...

Add support for loading checkpoints with newly added tokens.

Wait, would this load the lm_head and embed_tokens matrices correctly?

Add support for loading checkpoints with newly added tokens.

Would it not cause them to be randomly initialized?

Add support for loading checkpoints with newly added tokens.

Whoopsies sorry on the horrible delay - I'll review this PR and test it out - so sorry!

Add support for loading checkpoints with newly added tokens.

@charlesCXK @chtmp223 Extreme apologies on the delay - I think I might have fixed it. You need to call `add_new_tokens` before `get_peft_model` to update the vocab, resize, and also save...
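To make the ordering point concrete, here is a rough, dependency-free sketch of the "update the vocab, then resize" step. It is only an illustration: plain Python lists stand in for the real `embed_tokens` / `lm_head` tensors, the helper names `resize_embeddings` and `add_tokens` are hypothetical, and the mean-of-existing-rows initialization is an assumed heuristic rather than a statement of what `add_new_tokens` actually does internally.

```python
# Toy illustration only: plain Python lists stand in for the real
# embed_tokens / lm_head tensors. All names here are hypothetical.

def resize_embeddings(table, new_vocab_size):
    """Return a copy of `table` grown to `new_vocab_size` rows.

    Existing rows are kept unchanged; each new row is set to the mean of the
    existing rows (an assumed heuristic, not necessarily Unsloth's choice).
    """
    old_vocab_size = len(table)
    if new_vocab_size < old_vocab_size:
        raise ValueError("this sketch only grows the table")
    dim = len(table[0])
    mean_row = [sum(row[j] for row in table) / old_vocab_size for j in range(dim)]
    resized = [list(row) for row in table]
    resized += [list(mean_row) for _ in range(new_vocab_size - old_vocab_size)]
    return resized


def add_tokens(vocab, table, new_tokens):
    """Extend the vocab and resize the table together, before any adapter wrapping."""
    vocab = dict(vocab)
    for tok in new_tokens:
        if tok not in vocab:
            vocab[tok] = len(vocab)  # next free row index
    return vocab, resize_embeddings(table, len(vocab))


vocab = {"hello": 0, "world": 1}
embed = [[1.0, 3.0], [3.0, 5.0]]
vocab, embed = add_tokens(vocab, embed, ["<tool>", "<end>"])
print(len(vocab), len(embed))
```

The idea is that the vocab and the matrix grow in one step, so every new token id already has a row by the time anything else (such as a PEFT wrapper) is built on top of the model.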

Problem with installation

@freQuensy23-coder Oh my apologies!! I forgot to change the name of the installation paths. I.e., change `cu121_ampere_torch211` to `cu121amperetorch211` with no underscores - hope that works! Sorry again!

Problem with installation

@freQuensy23-coder Sorry on the delay!! If you join our Discord server at https://discord.gg/u54VK8m8tk - it'll be an async discussion. But maybe try `pip install --upgrade pip` to update pip. Also...

Problem with installation

@freQuensy23-coder I just updated Unsloth's pyproject.toml - so maybe it might work better. I suggest:
```
pip install --upgrade pip
pip install --upgrade --force-reinstall --no-cache-dir "unsloth[cu121_ampere_torch211] @ git+https://github.com/unslothai/unsloth.git"
```

Any solution for MultiGPU

@gotzmann Thanks for using Unsloth again!! :) Sadly multi GPU is not yet supported - we're working on it for a future release in the OSS version