
Initial offloading

xloem opened this issue 2 years ago · 5 comments

I'm on a system hard-limited to 40 GB of CPU RAM + swap.

When I try to load opt-30b, the process is killed by memory exhaustion. If I load the model manually with `device_map="auto", offload_folder="offload"`, the load succeeds.
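
For reference, the manual load looks roughly like this (a sketch of my local invocation; the exact model class may vary):

```python
# Sketch of the manual load that stays within my RAM limit.
# Requires the `transformers` and `accelerate` packages.
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "facebook/opt-30b",
    device_map="auto",         # let accelerate split weights across devices
    offload_folder="offload",  # spill overflow weights to this directory on disk
)
```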

Is there a way to pass these flags manually or otherwise accommodate RAM limits during initial loading?

xloem avatar Feb 21 '23 01:02 xloem

Do you know which line the out-of-memory error happens at?

Ying1123 avatar Feb 22 '23 09:02 Ying1123

It happens while loading shards from disk inside `Model.from_pretrained()` at https://github.com/FMInference/FlexGen/blob/0342e2a0e93593b2c11f84be0e9f5d5bcb73e598/flexgen/opt_config.py#L146.

xloem avatar Feb 22 '23 10:02 xloem

I'm also encountering these errors. @xloem were you able to modify the code to get it to work?

freedmand avatar Feb 22 '23 15:02 freedmand

I modified that function to pass the kwargs I mentioned, and also called it manually before anything else was constructed so that more RAM was free. That got me further, but I then hit a later crash that I haven't looked into yet.

EDIT: it looks like the second crash is because the policy needs changing for the model and system I'm using, and adding the kwargs does move past this issue. I personally also added code to wipe transformers.utils.TRANSFORMERS_CACHE after the initial download if not enough disk space remained available.
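
For concreteness, my patch looked roughly like this (a sketch only; the function names and the free-space threshold are my own, not the upstream flexgen code):

```python
# Sketch of my local patch around the weight-loading helper in
# flexgen/opt_config.py (names here are mine, not upstream's).
import os
import shutil

from transformers import AutoModelForCausalLM
from transformers.utils import TRANSFORMERS_CACHE


def load_hf_model(model_name, offload_dir="offload"):
    # Pass the offloading kwargs so from_pretrained can spill shards
    # to disk instead of materializing everything in CPU RAM.
    return AutoModelForCausalLM.from_pretrained(
        model_name,
        device_map="auto",
        offload_folder=offload_dir,
    )


def maybe_wipe_hf_cache(min_free_bytes=20 * 2**30):
    # After the initial download, delete the transformers cache if
    # less than ~20 GiB of disk remains free.
    if os.path.isdir(TRANSFORMERS_CACHE):
        if shutil.disk_usage(TRANSFORMERS_CACHE).free < min_free_bytes:
            shutil.rmtree(TRANSFORMERS_CACHE)
```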

xloem avatar Feb 22 '23 15:02 xloem

In case anyone else finds this thread and is in a similar situation to me (opt-13b on a 1080 Ti with 11 GB VRAM + 32 GB CPU RAM and a 2 GB swap file, but, unlike OP, able to enlarge it): try enlarging your swap file. I created a 16 GB swap file and now it works.

james9001 avatar Feb 23 '23 09:02 james9001

I'm observing that this issue was closed without change or explanation, and am guessing it may be out of scope for now or would need the changes introduced as a PR.

xloem avatar Feb 26 '23 05:02 xloem

> I'm observing that this issue was closed without change or explanation, and am guessing it may be out of scope for now or would need the changes introduced as a PR.

Sorry, I misread the thread and thought the problem had been resolved. I have reopened the issue and will fix it soon.

Ying1123 avatar Feb 26 '23 09:02 Ying1123

@xloem This should be fixed by #69. It is merged into the main branch. Could you try it now?

Ying1123 avatar Feb 26 '23 10:02 Ying1123

By inspection it looks like you've resolved the issue. I might also delete the .bin file after conversion to save disk space; maybe you already do that and I missed it.
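
Something like this is what I mean (a sketch; `weights_dir` is hypothetical and assumes the converted weights sit alongside the original shards):

```python
import glob
import os


def remove_bin_shards(weights_dir):
    # After converting to FlexGen's per-layer format, the original
    # HF pytorch_model*.bin shards are redundant; delete them.
    for bin_file in glob.glob(os.path.join(weights_dir, "pytorch_model*.bin")):
        os.remove(bin_file)
```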

I tried to pull the changes, but it looks like there's been a force push and the new tip doesn't merge cleanly with my old checkout. My test code doesn't quite run yet against the new codebase, but I'll keep in mind that you fixed this.

Thank you.

xloem avatar Feb 26 '23 14:02 xloem