
Getting ZeroDivisionError when using `galai` module

Open phineas-pta opened this issue 1 year ago • 0 comments

I keep getting a `ZeroDivisionError` when loading a model with the `galai` module:

import galai
model = galai.load_model("mini", num_gpus = 1)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "███/myconda/condaGA/lib/python3.7/site-packages/galai/__init__.py", line 39, in load_model
    model._load_checkpoint(checkpoint_path=get_checkpoint_path(name))
  File "███/myconda/condaGA/lib/python3.7/site-packages/galai/model.py", line 69, in _load_checkpoint
    offload_state_dict=True
  File "███/myconda/condaGA/lib/python3.7/site-packages/accelerate/big_modeling.py", line 358, in load_checkpoint_and_dispatch
    low_zero=(device_map == "balanced_low_0"),
  File "███/myconda/condaGA/lib/python3.7/site-packages/accelerate/utils/modeling.py", line 370, in get_balanced_memory
    per_gpu = module_sizes[""] // (num_devices - 1 if low_zero else num_devices)
ZeroDivisionError: integer division or modulo by zero

However, if I use the `transformers` module instead, it works perfectly on GPU.
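
For anyone reading the traceback: the last frame divides by `num_devices` (or by `num_devices - 1` when `low_zero` is true), so the error means that divisor evaluated to zero. Below is a minimal sketch that mirrors only that one expression. The function names `divisor` and `per_gpu_share` are hypothetical and are not part of `accelerate`; how `accelerate` arrives at `num_devices` is not shown in the traceback, so that part is left out.

```python
def divisor(num_devices: int, low_zero: bool) -> int:
    """Return the divisor used in the failing line of the traceback.

    Hypothetical helper: it copies only the expression
    `num_devices - 1 if low_zero else num_devices`.
    """
    return num_devices - 1 if low_zero else num_devices


def per_gpu_share(total_size: int, num_devices: int, low_zero: bool) -> int:
    """Mirror of `module_sizes[""] // (...)` with a plain integer size."""
    return total_size // divisor(num_devices, low_zero)


# The two ways the divisor can become zero:
#   * no device counted at all (num_devices == 0)
#   * exactly one device counted while low_zero is true
for n, low in [(0, False), (1, True), (1, False), (2, True)]:
    try:
        print(n, low, per_gpu_share(1000, n, low))
    except ZeroDivisionError as exc:
        print(n, low, "ZeroDivisionError:", exc)
```

So it may be worth checking how many GPUs the process actually sees (for example with `torch.cuda.device_count()`) and which `device_map` value ends up being passed, since either case above would reproduce the message.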

phineas-pta, Nov 24 '22 21:11