Getting 'Killed' message trying to run sampling.py on 2b-it

Open davidebbo opened this issue 1 year ago • 5 comments

I'm running on WSL2/Ubuntu on Win11. Deliberately using CPU mode as my GPU is too weak. Using Python 3.10.12.

Here is the output when trying to run sampling.py:

~/gemma$ python3 examples/sampling.py --path_checkpoint=/home/david/gemma/2b-it/ --path_tokenizer=/home/david/gemma/tokenizer.model
Loading the parameters from /home/david/gemma/2b-it/
I0224 16:15:11.469793 140378916851712 checkpointer.py:164] Restoring item from /home/david/gemma/2b-it.
I0224 16:15:27.563535 140378916851712 xla_bridge.py:689] Unable to initialize backend 'cuda':
I0224 16:15:27.563766 140378916851712 xla_bridge.py:689] Unable to initialize backend 'rocm': module 'jaxlib.xla_extension' has no attribute 'GpuAllocatorConfig'
I0224 16:15:27.568676 140378916851712 xla_bridge.py:689] Unable to initialize backend 'tpu': INTERNAL: Failed to open libtpu.so: libtpu.so: cannot open shared object file: No such file or directory
W0224 16:15:27.568863 140378916851712 xla_bridge.py:727] An NVIDIA GPU may be present on this machine, but a CUDA-enabled jaxlib is not installed. Falling back to cpu.
I0224 16:15:27.757100 140378916851712 checkpointer.py:167] Finished restoring checkpoint from /home/david/gemma/2b-it.
Parameters loaded.
Killed

Any idea what could be causing it to blow up during sampling?

davidebbo avatar Feb 24 '24 15:02 davidebbo
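
For anyone hitting this, "Killed" with no traceback usually means the Linux OOM killer terminated the process. A minimal sanity check, assuming psutil is installed (it is not a gemma dependency), to confirm which backend JAX picked and how much RAM the WSL2 VM actually sees:

import jax
import psutil  # assumed extra dependency: pip install psutil

# Which backend JAX actually selected; on this setup it should list CPU devices.
print(jax.devices())

# RAM visible inside the WSL2 VM; by default WSL2 caps this at a fraction of
# host memory, so it can be much less than the Windows total.
vm = psutil.virtual_memory()
print(f"RAM: {vm.available / 2**30:.1f} GiB available of {vm.total / 2**30:.1f} GiB total")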

I am getting the same error. I am using a GPU.

Loading checkpoint shards: 100%|███████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 4/4 [00:59<00:00, 14.83s/it]
WARNING:root:Some parameters are on the meta device device because they were offloaded to the disk and cpu.
/usr/local/lib/python3.9/dist-packages/transformers/generation/utils.py:1178: UserWarning: Using the model-agnostic default max_length (=20) to control the generation length. We recommend setting max_new_tokens to control the maximum length of the generation.
  warnings.warn(
Killed

bnarasimha avatar Feb 27 '24 09:02 bnarasimha

I'm thinking it's some kind of resource exhaustion issue that's causing the process to just die.

davidebbo avatar Feb 27 '24 11:02 davidebbo
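
One way to test this theory is to watch memory from a second terminal while sampling.py runs; if available RAM drops toward zero just before "Killed" appears, it is the OOM killer. A minimal watcher sketch, again assuming psutil:

import time
import psutil

# Print available RAM once per second; run this alongside sampling.py.
while True:
    vm = psutil.virtual_memory()
    print(f"available: {vm.available / 2**30:.2f} GiB", flush=True)
    time.sleep(1)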

(gemma-venv) rpo@MSI:~/venvs/gemma-demo$ python examples/sampling.py --path_checkpoint=/home/rpo/venvs/gemma-demo/gemmamodel/2b/ --path_tokenizer=/home/rpo/venvs/gemma-demo/gemmamodel/tokenizer.model
Loading the parameters from /home/rpo/venvs/gemma-demo/gemmamodel/2b/
I0307 20:05:49.580328 139929737028032 checkpointer.py:164] Restoring item from /home/rpo/venvs/gemma-demo/gemmamodel/2b.
Killed

I was able to run the unit tests successfully, and I have also tested the JAX installation with the GPU before. I am now stuck: does it strictly require an 8 GB GPU? I am on a 4 GB RTX 3050 rather than 8 GB, and the machine has 16 GB of CPU RAM.

rhpoon avatar Mar 07 '24 12:03 rhpoon
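
For a rough sense of scale (back-of-envelope only; the ~2.5B parameter count for Gemma 2B is an assumption here), the weights alone do not fit in a 4 GB GPU, and a float32 restore can strain 16 GB of system RAM once intermediate copies are held:

# Back-of-envelope memory for the weights alone (parameter count assumed).
params = 2.5e9
print(f"float32:  {params * 4 / 2**30:.1f} GiB")  # ~9.3 GiB
print(f"bfloat16: {params * 2 / 2**30:.1f} GiB")  # ~4.7 GiB, still > 4 GB

So a 4 GB RTX 3050 cannot hold the weights even in bfloat16, and on CPU the restore itself can transiently need well more than the final footprint.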

I am also on WSL; it is stated to be experimental, so probably not fully tested or functional?

rhpoon avatar Mar 07 '24 12:03 rhpoon

I had a similar issue running it with the Transformers library. It turned out to be resource exhaustion, as noted above; specifically, the process ran out of memory.

You can follow this https://stackoverflow.com/questions/624857/finding-which-process-was-killed-by-linux-oom-killer to confirm whether that was the reason the process was killed.
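
For convenience, the check from that answer boils down to grepping the kernel log for the kill record (may require sudo):

sudo dmesg -T | grep -i -E 'killed process|out of memory'
# or, on systemd-based distros:
sudo journalctl -k | grep -i -E 'oom|killed process'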

Also, if you are using WSL, you have to confirm that you are actually getting the full 4 + 16 GB of RAM you mention. You can check and raise the limits with something like this: https://learn.microsoft.com/en-us/answers/questions/1296124/how-to-increase-memory-and-cpu-limits-for-wsl2-win
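
As a sketch (the values below are placeholders, not recommendations), the per-user %UserProfile%\.wslconfig file on the Windows side controls the WSL2 VM's limits; after editing it, run wsl --shutdown and reopen the distro:

[wsl2]
memory=12GB    # RAM visible inside WSL2; the default is a fraction of host RAM
swap=16GB      # extra swap gives headroom before the OOM killer fires
processors=8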

eljoserass avatar May 30 '24 12:05 eljoserass