Daniël de Kok

Results: 73 issues by Daniël de Kok

Plan:

- [ ] Release sticker 0.5 with all improvements so far.
- [ ] Migrate low-hanging fruit using `compat.v1`.
- [ ] Rewrite functionality that is not available in...

maintenance

If several models in the pipeline use the same word embeddings, reuse them between the models.
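A minimal sketch of the idea, assuming embeddings are identified by their file path and loaded through some loader function. `EmbeddingCache` and `fake_loader` are hypothetical names used only for illustration, not part of any existing API:

```python
from typing import Callable, Dict, List

Embeddings = List[List[float]]


class EmbeddingCache:
    """Hypothetical cache that loads each embedding file at most once."""

    def __init__(self, loader: Callable[[str], Embeddings]) -> None:
        self._loader = loader
        self._cache: Dict[str, Embeddings] = {}

    def get(self, path: str) -> Embeddings:
        # Load on first request; later requests return the same object.
        if path not in self._cache:
            self._cache[path] = self._loader(path)
        return self._cache[path]


load_calls: List[str] = []


def fake_loader(path: str) -> Embeddings:
    # Stand-in for an expensive load from disk (assumption for this sketch).
    load_calls.append(path)
    return [[0.0, 1.0], [1.0, 0.0]]


cache = EmbeddingCache(fake_loader)
tagger_embeds = cache.get("embeddings.bin")
parser_embeds = cache.get("embeddings.bin")
```

Here two models in the pipeline that request the same path receive the same in-memory object, so the embeddings are loaded only once.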

## Description

Relax the upper bound a little.

### Types of change

Maintenance

## Checklist

- [x] I confirm that I have the right to submit this contribution under the...

# What does this PR do?

The `GPTWeightLoader` was structured like this in pseudocode:

```
if marlin:
    Set up tensors in a way that GPTQ-Marlin expects
else:
    Set up tensors...
```

# What does this PR do?

**CI test run**, not for review yet.

Fixes # (issue)

## Before submitting

- [ ] This PR fixes a typo or improves the...

Stale

# What does this PR do?

Some FP8 checkpoints use a scalar weight scale. This change adds support for that.

## Before submitting

- [ ] This PR fixes a...
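To illustrate the difference between a scalar and a per-row weight scale, here is a small sketch in plain Python. The `dequantize` helper is hypothetical and is not the implementation from the PR; it only assumes that dequantization multiplies each stored weight by its scale:

```python
from typing import List, Union

Matrix = List[List[float]]


def dequantize(weight: Matrix, scale: Union[float, List[float]]) -> Matrix:
    """Multiply quantized weights by a scalar scale or by one scale per row."""
    if isinstance(scale, (int, float)):
        # Scalar scale: the same factor applies to every row.
        row_scales = [float(scale)] * len(weight)
    else:
        if len(scale) != len(weight):
            raise ValueError("expected one scale per weight row")
        row_scales = list(scale)
    return [[v * s for v in row] for row, s in zip(weight, row_scales)]


quantized = [[1.0, -2.0], [4.0, 0.5]]
scalar_result = dequantize(quantized, 0.5)
per_row_result = dequantize(quantized, [0.5, 2.0])
```

Expanding the scalar into one scale per row lets the rest of the code handle both checkpoint layouts through a single path.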

# What does this PR do?

This replaces the custom layers in both models.

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you...

# What does this PR do?

Fixes # (issue)

## Before submitting

- [ ] This PR fixes a typo or improves the docs (you can dismiss the other checks...

We saw that support for JIT compilation will be added in #507. We were wondering what the plans are for ahead-of-time compilation. We are happily using flashinfer in [Text Generation...