gemma
Getting 'Killed' message trying to run sampling.py on 2b-it
I'm running WSL2/Ubuntu on Windows 11, deliberately using CPU mode because my GPU is too weak. Python 3.10.12.
Here is the output when trying to run sampling.py:
~/gemma$ python3 examples/sampling.py --path_checkpoint=/home/david/gemma/2b-it/ --path_tokenizer=/home/david/gemma/tokenizer.model
Loading the parameters from /home/david/gemma/2b-it/
I0224 16:15:11.469793 140378916851712 checkpointer.py:164] Restoring item from /home/david/gemma/2b-it.
I0224 16:15:27.563535 140378916851712 xla_bridge.py:689] Unable to initialize backend 'cuda':
I0224 16:15:27.563766 140378916851712 xla_bridge.py:689] Unable to initialize backend 'rocm': module 'jaxlib.xla_extension' has no attribute 'GpuAllocatorConfig'
I0224 16:15:27.568676 140378916851712 xla_bridge.py:689] Unable to initialize backend 'tpu': INTERNAL: Failed to open libtpu.so: libtpu.so: cannot open shared object file: No such file or directory
W0224 16:15:27.568863 140378916851712 xla_bridge.py:727] An NVIDIA GPU may be present on this machine, but a CUDA-enabled jaxlib is not installed. Falling back to cpu.
I0224 16:15:27.757100 140378916851712 checkpointer.py:167] Finished restoring checkpoint from /home/david/gemma/2b-it.
Parameters loaded.
Killed
Any idea what could be causing it to blow up during sampling?
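For context, a rough back-of-envelope calculation (assuming float32 weights and a transient second copy during checkpoint restore — assumptions on my part, not something confirmed in this thread) suggests why a 2B-parameter checkpoint can exhaust 16 GB of RAM on CPU:

```python
# Hypothetical sizing sketch: 2B params in float32, plus the possibility that
# deserialization briefly holds a second copy in memory.
n_params = 2_000_000_000
bytes_per_param = 4  # float32

weights_gib = n_params * bytes_per_param / 1024**3
print(f"weights alone: {weights_gib:.1f} GiB")            # → weights alone: 7.5 GiB
print(f"with a transient copy: {2 * weights_gib:.1f} GiB") # → with a transient copy: 14.9 GiB
```

On a 16 GB machine (minus whatever the OS and WSL2's memory cap take), that leaves very little headroom, which would be consistent with the OOM killer stepping in.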
I am getting the same error, and I am using a GPU.
Loading checkpoint shards: 100%|██████████| 4/4 [00:59<00:00, 14.83s/it]
WARNING:root:Some parameters are on the meta device device because they were offloaded to the disk and cpu.
/usr/local/lib/python3.9/dist-packages/transformers/generation/utils.py:1178: UserWarning: Using the model-agnostic default max_length
(=20) to control the generation length. We recommend setting max_new_tokens
to control the maximum length of the generation.
warnings.warn(
Killed
I'm thinking it's some kind of resource-exhaustion issue that's causing the process to just die.
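One way to sanity-check that theory before launching the script is to look at how much RAM is actually available inside WSL. A minimal Linux-only sketch (the helper name `mem_available_gib` is mine, not part of any library):

```python
# Read MemAvailable from /proc/meminfo to see how much RAM the model loader
# can realistically claim before the OOM killer intervenes. Linux-only.
def mem_available_gib(meminfo_text):
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            kib = int(line.split()[1])  # value is reported in kB (KiB)
            return kib / 1024**2
    return None

with open("/proc/meminfo") as f:
    avail = mem_available_gib(f.read())
print(f"MemAvailable: {avail:.1f} GiB")
```

If this prints far less than you expect, the WSL2 memory cap (discussed below in this thread) is the first thing to check.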
(gemma-venv) rpo@MSI:~/venvs/gemma-demo$ python examples/sampling.py --path_checkpoint=/home/rpo/venvs/gemma-demo/gemmamodel/2b/ --path_tokenizer=/home/rpo/venvs/gemma-demo/gemmamodel/tokenizer.model
Loading the parameters from /home/rpo/venvs/gemma-demo/gemmamodel/2b/
I0307 20:05:49.580328 139929737028032 checkpointer.py:164] Restoring item from /home/rpo/venvs/gemma-demo/gemmamodel/2b.
Killed
I was able to run the unit tests successfully, and I have also tested the JAX installation with the GPU before. I am now stuck. Does it strictly require an 8 GB GPU? I am on a 4 GB RTX 3050 instead of 8 GB, and the machine has 16 GB of CPU RAM.
I am also on WSL. It is stated to be experimental, so it is probably not fully tested or functional?
I had a similar issue running it with the Transformers library. It turned out to be resource exhaustion, as commented above: the process ran out of memory while trying to allocate resources.
You can follow this https://stackoverflow.com/questions/624857/finding-which-process-was-killed-by-linux-oom-killer to confirm whether that was the reason the process was killed.
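In short, a bare "Killed" with no traceback is usually the Linux OOM killer, and it leaves a trail in the kernel log. A quick check (reading `dmesg` may require `sudo` depending on `kernel.dmesg_restrict`):

```shell
# Look for OOM-killer activity in the kernel log (works inside WSL2 too).
dmesg 2>/dev/null | grep -iE 'oom-killer|out of memory|killed process' \
  || echo "no OOM entries visible (try: sudo dmesg, or journalctl -k)"
```

If you see a line naming your `python` process, it was memory exhaustion rather than a bug in `sampling.py`.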
Also, if you are using WSL, you should confirm that you are actually getting the full 4 + 16 GB of RAM you mention, since WSL2 caps guest memory at a fraction of host RAM by default. You can do so with something like this https://learn.microsoft.com/en-us/answers/questions/1296124/how-to-increase-memory-and-cpu-limits-for-wsl2-win
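For reference, raising the WSL2 limits is done with a `.wslconfig` file in your Windows user profile directory (`%UserProfile%\.wslconfig`). The values below are illustrative, not recommendations — set them to what your machine can spare:

```ini
[wsl2]
# Cap on RAM visible to the WSL2 VM (default is a fraction of host RAM)
memory=14GB
# Swap file size available to the VM
swap=8GB
# Number of virtual processors
processors=4
```

After editing, run `wsl --shutdown` from Windows and reopen your distro for the new limits to take effect, then re-check with `free -h` inside WSL.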