Nicolas Patry
You send a request:

```
input_ids = [A, B, C, A]  # Those are tokens
new_token_D, past = forward(input_ids)
```

That's a prefill step. Then we continue generating new tokens in...
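Here is a minimal sketch of that prefill-then-decode pattern using the plain `transformers` API (the `gpt2` model id and greedy decoding are just illustrative assumptions, not what the server actually does):

```python
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")  # example model
model = AutoModelForCausalLM.from_pretrained("gpt2")

input_ids = tokenizer("Hello", return_tensors="pt").input_ids

# Prefill: run the whole prompt once and keep the KV cache (`past`).
out = model(input_ids, use_cache=True)
past = out.past_key_values
next_token = out.logits[:, -1].argmax(dim=-1, keepdim=True)

# Decode: each step feeds only the single new token, reusing the cache.
for _ in range(10):
    out = model(next_token, past_key_values=past, use_cache=True)
    past = out.past_key_values
    next_token = out.logits[:, -1].argmax(dim=-1, keepdim=True)
```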
> I understand that flash or sparse attention models won't use padding, but if the user generates a very long sequence, say thousands of tokens, how many of those tokens...
@OlivierDehaene Can we merge this?
Try disabling it? It should still download the model, just a bit more slowly. `hf_transfer` is really barebones, and any flaky network might trigger issues for you (or because you're...
Isn't there a way for you to provide environment variables?

```
HF_HUB_ENABLE_HF_TRANSFER=0
```

is what you are looking for.
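If setting it in the shell is awkward, here is a small Python sketch of the same idea (the repo id is only an example); the variable has to be set before the hub helpers are imported:

```python
import os

# Disable hf_transfer so downloads fall back to the plain Python path.
os.environ["HF_HUB_ENABLE_HF_TRANSFER"] = "0"

from huggingface_hub import snapshot_download

snapshot_download("gpt2")  # example repo id
```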
Thanks for sharing the solution! Closing this then.
It should work, but you would need the `--trust-remote-code` flag for it to work. Can you provide a full stacktrace?
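For reference, the rough `transformers`-level equivalent of that flag is the `trust_remote_code` argument (the model id below is a hypothetical repo that ships custom modeling code):

```python
from transformers import AutoModelForCausalLM

# Custom modeling code from the hub is only executed with this explicit opt-in.
model = AutoModelForCausalLM.from_pretrained(
    "some-org/custom-model",  # hypothetical repo id
    trust_remote_code=True,
)
```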
I think the first one is a very easy fix we could implement. Today there are 3 issues about this conversion, so maybe making it a bit more robust/effective in...
Hi @mayurtikundi12 You need to work with the latest version for this model to work. We're going to release 0.9 soon, which should work. @OlivierDehaene (for visibility)
Try with `--auto-convert false`. This error happens when trying to convert to safetensors, but it shouldn't be required for non-*core* models.