AlpinDale

Results 75 issues of AlpinDale

Support for Mamba, including the hybrid architectures Jamba and Zamba.

Target all PRs to this branch.

> [!WARNING]
> Highly experimental branch. A lot of things may not work.

Currently, the following are dysfunctional in this branch:

- [ ]...

Currently produces garbage outputs.

Seems like our current implementation has an issue:

```
dynatemp_logits = logits[dynatemp_mask]
ERROR: | ~~~~~~^^^^^^^^^^^^^^^
ERROR: | IndexError: The shape of the mask [1] at index 0 does not match...
```

Unit tests will be committed here as they are mass-generated and confirmed to work locally.

WIP, will revisit later after some worker refactoring. Fixes #36

### Describe the bug

I've been trying to download `NousResearch/Meta-Llama-3.1-8B-Instruct` with and without `hf-transfer`, but it consistently hangs at the 10GB point (2 shards with hf-transfer, half of each without),...

bug