
Feature request: in-memory download

Open Wauplin opened this issue 1 year ago • 2 comments

Implement in-memory download in hf_transfer. The output type would be a byte array. @Narsil is this feasible? Context: it would allow in-memory download in HfFileSystem, for example when calling HfFileSystem().read_bytes("hf://models/my-model/model.safetensors"), without a write-to-disk then read-from-disk step. This would be interesting for integrations where a library uses HF as a filesystem and we don't know where the file will be stored.

(from-memory upload would also be nice but less important).
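To make the request concrete, here is a minimal sketch of what an in-memory download could look like: parallel ranged reads assembled directly into a preallocated byte array. It is not hf_transfer's API. The function name `download_to_memory` and the `fetch_range(start, end)` callable are hypothetical, standing in for whatever issues the actual HTTP range requests.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def download_to_memory(
    fetch_range: Callable[[int, int], bytes],
    total_size: int,
    chunk_size: int = 10 * 1024 * 1024,
    max_workers: int = 8,
) -> bytearray:
    """Assemble a remote file in RAM from parallel ranged reads.

    `fetch_range(start, end)` is a hypothetical callable that returns the
    bytes in the half-open range [start, end) of the remote file.
    """
    # Preallocate the full buffer so each worker writes into its own slice.
    buffer = bytearray(total_size)

    def fill(start: int) -> None:
        end = min(start + chunk_size, total_size)
        data = fetch_range(start, end)
        if len(data) != end - start:
            raise IOError(f"short read for range {start}-{end}")
        # Same-length slice assignment: the buffer is never resized.
        buffer[start:end] = data

    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # list() drains the iterator so worker exceptions propagate here.
        list(pool.map(fill, range(0, total_size, chunk_size)))
    return buffer
```

Unlike a download to disk, the whole file stays resident in RAM until the caller drops the returned buffer, so peak memory use is at least the file size.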

Wauplin, Mar 25 '24 13:03

It's relatively easy to do.

I'm not sure how valuable it is for regular LLMs, where holding everything in CPU RAM is relatively wasteful (part of the transfer speed comes from using only a limited amount of RAM, since everything is dumped to file regularly).

A simple way to try is to mount a directory to tmpfs and download there.
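A minimal sketch of that tmpfs experiment, assuming a Linux machine where `/dev/shm` is a tmpfs mount (it falls back to the default temp directory otherwise). The `download` callable is hypothetical: it stands for any existing download-to-path routine, and `read_via_tmpfs` is an illustrative name, not part of hf_transfer.

```python
import os
import shutil
import tempfile
from pathlib import Path
from typing import Callable


def read_via_tmpfs(
    download: Callable[[Path], None],
    tmpfs_root: str = "/dev/shm",
) -> bytes:
    """Download to a RAM-backed directory, then return the file's bytes.

    `download(path)` is a hypothetical callable that writes the remote file
    to `path`. `/dev/shm` being tmpfs is an assumption about the host.
    """
    # Use the tmpfs root only if it exists and is writable.
    usable = os.path.isdir(tmpfs_root) and os.access(tmpfs_root, os.W_OK)
    workdir = Path(tempfile.mkdtemp(prefix="hf_mem_", dir=tmpfs_root if usable else None))
    try:
        target = workdir / "payload.bin"
        download(target)
        return target.read_bytes()
    finally:
        # Remove the temporary directory so the RAM is released.
        shutil.rmtree(workdir, ignore_errors=True)
```

Note that the bytes briefly exist twice in memory (once in tmpfs, once in the returned object), which is the main cost of this workaround compared with a true in-memory download.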

Narsil, Mar 25 '24 20:03

Thanks for confirming it's possible @Narsil! Let's delay testing and implementation for now. I was thinking about this for an integration with a library that can already load bytes from GCP (e.g. a gs://... link), so I wanted to see if it would be possible to provide the same from HF (e.g. a hf://... link). To stay as close as possible to what already exists, I'd prefer not to download to disk and then read the file.

That's the context. For now, let's see if the integration happens (without hf_transfer to start with), and if it makes sense we can come back to this feature request.

Wauplin, Mar 26 '24 16:03

Let's close this as stale for now.

Narsil, Dec 23 '24 13:12