galai
Launch requirements
Can anyone share the hardware specifications needed for each of the model sizes?
bump, would like to know the VRAM requirements of each model
For interest, tested using my 3090 Ti w/ 24 GB dedicated:
The standard model (inference) immediately hit a CUDA out-of-memory error... Ran the model in fp16 and it runs! Seems to hover around 15 GB (standard, fp16). Not a scientific test though haha.
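That ~15 GB figure is consistent with a back-of-the-envelope estimate: weights alone take parameter count × bytes per dtype, and the parameter counts below are the published Galactica sizes (mini 125M, base 1.3B, standard 6.7B, large 30B, huge 120B). A minimal sketch; actual usage will be higher once activations and framework overhead are included:

```python
# Rough VRAM estimate for holding the model weights: params * bytes_per_dtype.
# Parameter counts are the published Galactica model sizes.
PARAMS = {
    "mini": 125e6,
    "base": 1.3e9,
    "standard": 6.7e9,
    "large": 30e9,
    "huge": 120e9,
}
BYTES = {"float32": 4, "float16": 2}

def weight_gb(size: str, dtype: str = "float16") -> float:
    """Approximate GB needed just for the weights (excludes activations)."""
    return PARAMS[size] * BYTES[dtype] / 1e9

for size in PARAMS:
    print(f"{size:>8}: fp32 ~{weight_gb(size, 'float32'):.1f} GB, "
          f"fp16 ~{weight_gb(size, 'float16'):.1f} GB")
```

For standard in fp16 this gives ~13.4 GB of weights, which matches the ~15 GB observed once inference overhead is added, and also explains why fp32 (~26.8 GB) blows past a 24 GB card.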
Here is one more anecdotal data point:
Getting started with the Galactica language model - Prog.World says:
The “basic” version consumes about 11 GB of memory. ... in the “standard” version, our laptop simply ran out of memory
FWIW, they also say:
Galactica currently works with Python versions 3.8 and 3.9. Installation is not possible with version 3.10 and above. This limitation is currently due to the promptsource library requirement.
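If that's the blocker, a quick pre-install check saves a failed pip run. A minimal sketch, assuming the 3.8/3.9 constraint reported above still holds (worth verifying against the current galai release):

```python
import sys

# galai reportedly installs only on Python 3.8/3.9 (a promptsource
# dependency pin); check before attempting `pip install galai`.
def galai_supported(version_info=sys.version_info) -> bool:
    return tuple(version_info[:2]) in {(3, 8), (3, 9)}

if not galai_supported():
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: "
          "galai install may fail -- use a 3.8 or 3.9 environment")
```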
Haven't tested without fp16, or had time to do more tests, but standard in fp16 is giving some fantastic results. I'd imagine basic will be similarly good, though.
edit: the article looks like a good setup guide tho - mentions a couple of issues I also had to work out
Note that the "base" model works in a free colab notebook, after selecting Runtime/Change runtime type and picking "GPU".
Would you be so kind as to share an example of a colab notebook?
@vladislavivanistsev Here is an example of a colab notebook that you should be able to run for free with a GPU runtime: galactica on Colab
I also note, from the paper:
For training the largest 120B model, we use 128 NVIDIA A100 80GB nodes. For inference Galactica 120B requires a single A100 node.
Ran the model in fp16 and it runs! Seems to hover around 15 GB (standard, fp16). Not a scientific test though haha.
My experience is similar: the (standard, fp16) model runs for me with no issues across two GPUs (one RTX 2080 Ti + one GTX 1080 Ti), using about 15 GB total (8 GB + 7 GB). Not a scientific test either :) but wanted to mention it in case it helps someone.
Where do you specify that the model should be fp16 and not fp32?
Ah, seems it's something like this:
model = gal.load_model("huge", num_gpus=4, dtype='float16')
Please add a column listing the inference memory requirements for the models, so people can more easily judge how much GPU RAM they need for each version.