Nicolas Patry

Results 51 issues of Nicolas Patry

Should be more robust to shared tensors (OK when using `from_pretrained`), but it forces us to add new checks in our loading code (since the chosen key to keep might be...

# What does this PR do? This adds a non-flash version of MPT. Flash is harder because we need to create a bias-ready CUDA kernel of flash attention....

# What does this PR do? Adds a new flag propagated everywhere. Disjoint from `--quantize` which also changes the actual dtype of layers. Fixes #490 Fixes # (issue) ## Before...

Hi there. I am attempting to port ggml's matrix multiplication into a standalone crate: https://github.com/Narsil/ggblas For most of the operations, I was able to leverage intrinsics: https://doc.rust-lang.org/core/arch/arm/index.html However for...

**Is your feature request related to a problem? Please describe.** `Discussion` and `CommitInfo` declare two things which seem the same with different names `num` and `pr_num`. **Describe the solution you'd...

# What does this PR do? - Changed all models to extract `embed_tokens` in order to enable llava to separately call the embeddings and the core model layers. - Added...

# What does this PR do? Fixes # (issue) ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks...

Stale

# What does this PR do? Fixes # (issue) ## Before submitting - [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks...

Causes issues with `ByteLevel` messing up some `AddedTokens` with some UTF-8 range used in the byte-level mapping. This commit tests the extent of the damage of ignoring the decoder for...