[Misc]: Building docker container requires insane amount of memory

Open mrseeker opened this issue 1 year ago • 7 comments

Anything you want to discuss about Aphrodite.

I am trying to build a custom version of Aphrodite, but building the engine with Docker requires an insane amount of memory and CPU. Is there a way to reduce this?

I already tried setting "MAX_JOBS=1" but that did not help.

mrseeker avatar Mar 21 '24 08:03 mrseeker

Where did you set the MAX_JOBS variable? It should be set in the Dockerfile right before the build command towards the end.

AlpinDale avatar Mar 21 '24 08:03 AlpinDale

I tried setting it at line 30 in the Dockerfile, but the build still gets "Killed" by the OOM killer.

mrseeker avatar Mar 21 '24 11:03 mrseeker

Perhaps it would be best to install the aphrodite-engine package from PyPI instead of building it inside Docker. pip install aphrodite-engine==0.5.1 should do it.

AlpinDale avatar Mar 21 '24 11:03 AlpinDale

I don't think the Aphrodite package supports custom-made AWS endpoints...

mrseeker avatar Mar 21 '24 11:03 mrseeker

With MAX_JOBS=2 it compiles OK with 64 GB of RAM.

puppetm4st3r avatar Mar 21 '24 11:03 puppetm4st3r

Ah right, this reminds me, @mrseeker: we build for all GPU architectures, which takes more time and uses more memory. You can try removing the TORCH_CUDA_ARCH_LIST export; that'll probably help.

AlpinDale avatar Mar 21 '24 11:03 AlpinDale

I found that if I change the arch list to include only the arch I need, memory usage during compilation drops to almost 90 GB...

mrseeker avatar Mar 21 '24 12:03 mrseeker
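
For readers following the arch-list suggestion above, here is a minimal sketch of one way to work out which value to use for TORCH_CUDA_ARCH_LIST. It assumes PyTorch is installed on a host machine that can see the target GPU (GPUs are usually not visible during docker build, so this would be run outside the build). The helper names arch_list_for and detect_local_capabilities are hypothetical and not part of Aphrodite; torch.cuda.get_device_capability is the standard PyTorch call.

```python
def arch_list_for(capabilities):
    """Format (major, minor) compute capabilities as a TORCH_CUDA_ARCH_LIST value."""
    # Sort as integer tuples (so 10.0 sorts after 8.6) and drop duplicates.
    unique = sorted(set(capabilities))
    return ";".join(f"{major}.{minor}" for major, minor in unique)


def detect_local_capabilities():
    """Return compute capabilities of visible GPUs, or [] if torch/CUDA is missing."""
    try:
        import torch
    except ImportError:
        return []
    if not torch.cuda.is_available():
        return []
    return [
        torch.cuda.get_device_capability(i)
        for i in range(torch.cuda.device_count())
    ]


if __name__ == "__main__":
    caps = detect_local_capabilities()
    if caps:
        # Print a line that can be copied into the Dockerfile or shell.
        print(f'export TORCH_CUDA_ARCH_LIST="{arch_list_for(caps)}"')
    else:
        print("No CUDA device visible; set TORCH_CUDA_ARCH_LIST by hand.")
```

The printed value would replace the multi-architecture list in the Dockerfile; combining it with a low MAX_JOBS, as discussed earlier in the thread, may reduce peak memory further.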