Nicolas Patry

977 comments by Nicolas Patry

Thanks for this. The code looks like it works, but I think it could be simplified quite a lot. Is there any source/paper for trying to do fixed-size chunking? Before...

Happy to help with the rebase btw.

Closing this as we added FP8 KV cache support in https://github.com/huggingface/text-generation-inference/pull/2603. More support is coming (for pre-scaled FP8 KV cache).

We don't want to copy the python code here. This is Rust-land, the goal is to stick to the simplest possible thing. For instance `from_str` is very unrusty. Having real...

Thanks, but that doesn't apply to the `abi3-pyxx` features, does it? Here it would be something like that, but more of a way to switch features in `pyproject.toml` based on the...

> #[pyo3(signature = (url, filename, max_files, chunk_size, parallel_failures=0, max_retries=0, headers=None, callback=None))]

With `max_files` and `chunk_size` you should be able to throttle this. The default is 100 files and a 10MB chunk size.
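To illustrate why those two parameters act as a throttle, here is a minimal Python sketch. It is not the library's code: `chunk_ranges`, `fetch_all`, and the `fetch_chunk` callback are hypothetical names, and it assumes `max_files` caps how many chunk requests run at once while `chunk_size` caps the bytes per request, so roughly `max_files * chunk_size` bytes are in flight at most.

```python
import asyncio


def chunk_ranges(total_size, chunk_size):
    """Split [0, total_size) into inclusive (start, end) byte ranges
    of at most chunk_size bytes each."""
    return [
        (start, min(start + chunk_size, total_size) - 1)
        for start in range(0, total_size, chunk_size)
    ]


async def fetch_all(total_size, chunk_size, max_files, fetch_chunk):
    """Call the (hypothetical) fetch_chunk coroutine on every range,
    with at most max_files calls in flight at any time."""
    semaphore = asyncio.Semaphore(max_files)

    async def bounded(start, end):
        # The semaphore is the throttle: extra chunks wait here.
        async with semaphore:
            return await fetch_chunk(start, end)

    ranges = chunk_ranges(total_size, chunk_size)
    return await asyncio.gather(*(bounded(s, e) for s, e in ranges))
```

Lowering either value reduces the peak load under these assumptions: fewer concurrent requests, or smaller requests.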

`hf_xet` uses a transfer protocol based on `hf_transfer` so the control should be the same, but yes please report to `hf_xet` if you're using it.

As previously suggested, the fix cannot be accepted as-is. It bloats the image way too much (20GB vs 12GB). First we need to reproduce locally, then figure out why the...