Nicolas Patry

Results 978 comments of Nicolas Patry

How many pages are there in the document? If there are many, you could just be measuring HTTP calls vs. raw function calls. You are also running on CPU where flash-attention...

Seems 100% like a bug to me. The behaviour should be either raising an exception (because no unk token was defined) or returning `[UNK]` as you expect. I opened a...

Any idea how that file was created and by whom? This model uses a single storage as the backend for all the tensors. Something like ``` A = torch.zeros((26_000_000_000,)) q_proj...

Do you mean this: https://github.com/huggingface/transformers/tree/main/src/transformers/models/longt5 ? It is supported, like all `transformers` models, in a "best effort" mode. That means it should work, but TP sharding, flash attention are...

> router/src/main.rs:166: Rust input length validation and truncation is disabled Note: this will hurt performance The rust router cannot count tokens and will assume longest...

No, the router needs to be able to use a `Fast` version of the tokenizer (so usually just having `tokenizer.json` present in the repo). This is because the router is written...
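A minimal sketch of the "is `tokenizer.json` present" check mentioned above, assuming the model files have already been downloaded to a local directory. The helper name `has_fast_tokenizer_file` and the directory layout are hypothetical and not part of the router; the sketch only tests for the file and does not validate its contents.

```python
from pathlib import Path


def has_fast_tokenizer_file(model_dir: str) -> bool:
    """Return True if a `tokenizer.json` file sits directly inside `model_dir`.

    Hypothetical helper for illustration: it assumes a local copy of the
    model repo and checks only that the file exists.
    """
    return (Path(model_dir) / "tokenizer.json").is_file()
```

Calling `has_fast_tokenizer_file("path/to/model")` on a local snapshot gives a quick hint about whether a `Fast` tokenizer is likely to be loadable from it.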

So far `parameters` are accepted but have never been implemented. The pipelines don't even receive them, and the post-processing in `api-inference-community` purely serializes the output. (See code here: https://github.com/huggingface/huggingface_hub/search?q=top_k)....

The params are simply not sent, no? https://github.com/huggingface/huggingface_hub/blob/main/api-inference-community/api_inference_community/routes.py#L73 The scaffolding is there, so enabling them should not be too hard (but might require changing the signature of all previous...

FWIW I ran the tokenizers tests on transformers and didn't find any related error in the test suite. ``` RUN_SLOW=1 pytest -sv tests/ -k tokenizers FAILED tests/tokenization/test_tokenization_utils.py::TokenizerUtilsTest::test_pretrained_tokenizers - AttributeError: type object...