LWM
bash run_vision_chat.sh causes flax.errors.ScopeParamNotFoundError: Could not find parameter named "embedding" in scope "/transformer/wte"
While running the command "bash scripts/run_vision_chat.sh", an error occurred. How can I fix it?
(lwm) llm@llm-PowerEdge-R730xd:~/projects/LWM-main$ bash scripts/run_vision_chat.sh
I0221 14:02:43.257625 139932541391232 xla_bridge.py:660] Unable to initialize backend 'rocm': NOT_FOUND: Could not find registered platform with name: "rocm". Available platform names are: CUDA
I0221 14:02:43.260045 139932541391232 xla_bridge.py:660] Unable to initialize backend 'tpu': INTERNAL: Failed to open libtpu.so: libtpu.so: cannot open shared object file: No such file or directory
100%|██████████| 1/1 [00:05<00:00, 5.59s/it]
Traceback (most recent call last):
File "/home/llm/anaconda3/envs/lwm/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/home/llm/anaconda3/envs/lwm/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/home/llm/projects/LWM-main/lwm/vision_chat.py", line 254, in
Can anyone give me some advice?
Can you paste your run_vision_chat.sh script, as well as your jax/flax versions?
Thank you in advance. The related info is below.
run_vision_chat.sh
#! /bin/bash
export SCRIPT_DIR="$( cd -- "$( dirname -- "${BASH_SOURCE[0]}" )" &> /dev/null && pwd )"
export PROJECT_DIR="$( cd -- "$( dirname -- "$SCRIPT_DIR" )" &> /dev/null && pwd )"
cd $PROJECT_DIR
export PYTHONPATH="$PYTHONPATH:$PROJECT_DIR"

export llama_tokenizer_path="LWM-Chat-1M-Jax/tokenizer.model"
export vqgan_checkpoint="LWM-Chat-1M-Jax/vqgan"
export lwm_checkpoint="LWM-Chat-1M-Jax/params"
export input_file="demo.jpg"

python3 -u -m lwm.vision_chat \
    --prompt="What is the image about?" \
    --input_file="$input_file" \
    --vqgan_checkpoint="$vqgan_checkpoint" \
    --dtype='fp32' \
    --load_llama_config='7b' \
    --max_n_frames=8 \
    --update_llama_config="dict(sample_mode='text',theta=50000000,max_sequence_length=131072,use_flash_attention=False,scan_attention=False,scan_query_chunk_size=128,scan_key_chunk_size=128,remat_attention='',scan_mlp=False,scan_mlp_chunk_size=2048,remat_mlp='',remat_block='',scan_layers=True)" \
    --load_checkpoint="params::$lwm_checkpoint" \
    --tokenizer.vocab_file="$llama_tokenizer_path" \
2>&1 | tee ~/output.log
read
pip list
Package Version
absl-py 2.1.0 aiohttp 3.9.3 aiosignal 1.3.1 appdirs 1.4.4 asttokens 2.4.1 async-timeout 4.0.3 attrs 23.2.0 build 1.0.3 cachetools 5.3.2 certifi 2024.2.2 charset-normalizer 3.3.2 chex 0.1.82 click 8.1.7 cloudpickle 3.0.0 contextlib2 21.6.0 datasets 2.13.0 decorator 5.1.1 decord 0.6.0 dill 0.3.6 docker-pycreds 0.4.0 einops 0.7.0 etils 1.7.0 exceptiongroup 1.2.0 executing 2.0.1 filelock 3.13.1 flax 0.7.0 frozenlist 1.4.1 fsspec 2024.2.0 gcsfs 2024.2.0 gitdb 4.0.11 GitPython 3.1.42 google-api-core 2.17.1 google-auth 2.28.0 google-auth-oauthlib 1.2.0 google-cloud-core 2.4.1 google-cloud-storage 2.14.0 google-crc32c 1.5.0 google-resumable-media 2.7.0 googleapis-common-protos 1.62.0 huggingface-hub 0.20.3 idna 3.6 imageio 2.34.0 imageio-ffmpeg 0.4.9 importlib-resources 6.1.1 ipdb 0.13.13 ipython 8.21.0 jax 0.4.23 jaxlib 0.4.23+cuda12.cudnn89 jedi 0.19.1 markdown-it-py 3.0.0 matplotlib-inline 0.1.6 mdurl 0.1.2 ml-collections 0.1.1 ml-dtypes 0.3.2 msgpack 1.0.7 multidict 6.0.5 multiprocess 0.70.14 nest-asyncio 1.6.0 numpy 1.26.4 nvidia-cublas-cu12 12.3.4.1 nvidia-cuda-cupti-cu12 12.3.101 nvidia-cuda-nvcc-cu12 12.3.107 nvidia-cuda-nvrtc-cu12 12.3.107 nvidia-cuda-runtime-cu12 12.3.101 nvidia-cudnn-cu12 8.9.7.29 nvidia-cufft-cu12 11.0.12.1 nvidia-cusolver-cu12 11.5.4.101 nvidia-cusparse-cu12 12.2.0.103 nvidia-nccl-cu12 2.19.3 nvidia-nvjitlink-cu12 12.3.101 oauthlib 3.2.2 opt-einsum 3.3.0 optax 0.1.7 orbax-checkpoint 0.5.3 packaging 23.2 pandas 2.2.0 parso 0.8.3 pexpect 4.9.0 pillow 10.2.0 pip 23.3.1 prompt-toolkit 3.0.43 protobuf 4.25.3 psutil 5.9.8 ptyprocess 0.7.0 pure-eval 0.2.2 pyarrow 15.0.0 pyasn1 0.5.1 pyasn1-modules 0.3.0 Pygments 2.17.2 pyproject_hooks 1.0.0 python-dateutil 2.8.2 pytz 2024.1 PyYAML 6.0.1 regex 2023.12.25 requests 2.31.0 requests-oauthlib 1.3.1 rich 13.7.0 rsa 4.9 scipy 1.12.0 sentencepiece 0.2.0 sentry-sdk 1.40.5 setproctitle 1.3.3 setuptools 68.2.2 six 1.16.0 smmap 5.0.1 stack-data 0.6.3 tensorstore 0.1.53 tiktoken 0.6.0 tokenizers 0.13.3 tomli 2.0.1 
toolz 0.12.1 tqdm 4.66.2 traitlets 5.14.1 transformers 4.29.2 tux 0.0.2 typing_extensions 4.9.0 tzdata 2024.1 urllib3 2.2.1 wandb 0.16.3 wcwidth 0.2.13 wheel 0.41.2 xxhash 3.4.1 yarl 1.9.4 zipp 3.17.0
I encountered the same problem, but eventually found that the cause was an incomplete download of the model file.
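One way to detect a truncated download is to compare the file's checksum against a value recorded when the copy was known to be good (or published alongside the model). Below is a minimal sketch; the /tmp/ckpt file is only a stand-in for the real checkpoint path (e.g. LWM-Chat-1M-Jax/params), and the truncate step just simulates a partial download:

```shell
# Stand-in for the model file; replace with the real checkpoint path.
printf 'checkpoint-bytes' > /tmp/ckpt
good_hash=$(sha256sum /tmp/ckpt | cut -d' ' -f1)   # hash of the complete file

truncate -s 8 /tmp/ckpt                            # simulate a cut-off download
new_hash=$(sha256sum /tmp/ckpt | cut -d' ' -f1)

if [ "$new_hash" = "$good_hash" ]; then
    echo "checksum OK"
else
    echo "checksum MISMATCH: re-download the file"
fi
```

Comparing plain byte sizes with `ls -l` against the upstream listing is often enough to spot the problem, but a checksum also catches corruption that leaves the size unchanged.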
Thanks. Which model did you use? And are you able to run "bash run_vision_chat.sh" successfully?
Hello, I encountered the same problem, and eventually found that the model had not been fully uploaded to the server. Can you please check whether the model file sizes on your end match the originals? If they do, the command should run without this error; if you still hit issues, please post them again for further review. Your model file was most likely not downloaded completely, or not transferred completely; otherwise I have not seen this problem. Your environment configuration is fine, and your jax and flax versions are both correct.
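The size check suggested above can be sketched as follows. The two /tmp files are placeholders standing in for the local copy and the server copy (on a real server you would run the second wc over ssh); the head step simulates a transfer that was cut off:

```shell
# Stand-ins for the source copy and the (possibly truncated) server copy.
printf '%01000d' 0 > /tmp/src_params               # 1000-byte "original"
head -c 512 /tmp/src_params > /tmp/dst_params      # simulate a cut-off transfer

src_size=$(wc -c < /tmp/src_params)
dst_size=$(wc -c < /tmp/dst_params)

if [ "$src_size" -eq "$dst_size" ]; then
    echo "sizes match ($src_size bytes)"
else
    echo "size mismatch: src=$src_size dst=$dst_size -- re-transfer the file"
fi
```

If the sizes differ, re-fetching the checkpoint (for example with huggingface-cli download from the huggingface-hub package already in the pip list, using whichever repo you originally downloaded from) usually resolves the ScopeParamNotFoundError, since the loader simply cannot find parameters that were cut off from the file.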