compilade

Results: 109 comments by compilade

> > @ngxson I'm not sure how this will interact with multithreaded remote tensor fetching in #12820; there may be a lot of threads at once.
> >
> > Yes, I...

@ngxson I've thought about some situations regarding multi-threaded download as in #12820. The numbered top-level statements are the two situations I'm considering, and they each get split further into two...

> I assume that for very big models like Scout or Maverick, either having multithread per tensor or multithread per write, the time it takes will be the same, because...

@ngxson I've tested more cases, and it seems to still give identical resulting files as `master`, even when outputting sharded models or when there's definitely padding (all tested with 4...

Note that I did not yet test on a slow HDD, only an external SSD. I don't know to what extent slower drives are affected by read/write contention from multiple...

@ngxson I've tried this with the remote lazy tensors from , and when there's a connection error the thread which does the download raises an exception but that doesn't stop...

> hey @compilade , is there any chance we can accelerate this a bit?

@ngxson The main remaining concern is that threads in Python make error handling more difficult. I...
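To illustrate the error-handling concern in general terms, here is a minimal sketch (not the code from the PR) of how an exception raised in a worker thread stays inside its future until something asks for the result, and how a shared `threading.Event` can let the remaining workers stop cooperatively. The names `fetch` and `run_all` and the job list are hypothetical stand-ins, not anything from `convert_hf_to_gguf.py`.

```python
import threading
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch(name: str, fail: bool, stop: threading.Event) -> str:
    """Hypothetical stand-in for downloading one tensor."""
    if stop.is_set():
        # Cooperative cancellation: another worker already failed.
        return f"{name}: skipped"
    if fail:
        # Simulated connection error raised inside the worker thread.
        raise ConnectionError(f"{name}: connection error")
    return f"{name}: ok"


def run_all(jobs):
    """Run all jobs in a thread pool and collect results and errors."""
    stop = threading.Event()
    results, errors = [], []
    with ThreadPoolExecutor(max_workers=2) as pool:
        futures = [pool.submit(fetch, name, fail, stop) for name, fail in jobs]
        for fut in as_completed(futures):
            try:
                # The worker's exception is only re-raised here, in the caller.
                results.append(fut.result())
            except ConnectionError as exc:
                errors.append(str(exc))
                # Ask workers that have not started yet to bail out early.
                stop.set()
    return results, errors


if __name__ == "__main__":
    print(run_all([("a", False), ("b", True), ("c", False)]))
```

Without the `fut.result()` call, the `ConnectionError` would be stored silently in the future and the other workers would keep running, which is the kind of behavior that makes graceful shutdown awkward with threads.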

> I think it probably fine to kill the whole process if one of the thread is KO

@ngxson Right, but ideally graceful error handling would be preferred (in this...

I recently found out about , which might make thread cancellation usable.

> I think python make it complicated because they don't want people to do unsafe stuff with multithreading....

> Hey @compilade , thanks for implementing this! > > I tried converting https://huggingface.co/mistralai/Mamba-Codestral-7B-v0.1 using `convert_hf_to_gguf.py`, but it gives error: > > ``` > with open(dir_model / "config.json", "r", encoding="utf-8")...