VideoLLaMB
Failed to run demo because of version conflicts
While testing the demo, I ran into several version conflicts. I have tried the following two setups:
1. pytorchvideo==0.1.5, torch==2.2.1, torchvision==0.17.1, with nvcc version cuda_12.4.r12.4
2. torch==2.1.0, torchvision==0.16.0, with nvcc version cuda_12.4.r12.4
I followed the installation process as described in the documentation:
pip install -e ".[train]"
pip install flash-attn --no-build-isolation
pip install flash-attn --no-build-isolation --no-cache-dir
So, what are the correct environment and version dependencies?
The error messages:
python -m llava.serve.cli --model-path model --video-file demo/1.c.mp4
/usr/local/bin/miniconda3/envs/videollamb/lib/python3.10/site-packages/torchvision/transforms/_functional_video.py:6: UserWarning: The 'torchvision.transforms._functional_video' module is deprecated since 0.12 and will be removed in the future. Please use the 'torchvision.transforms.functional' module instead.
warnings.warn(
/usr/local/bin/miniconda3/envs/videollamb/lib/python3.10/site-packages/torchvision/transforms/_transforms_video.py:22: UserWarning: The 'torchvision.transforms._transforms_video' module is deprecated since 0.12 and will be removed in the future. Please use the 'torchvision.transforms' module instead.
warnings.warn(
Traceback (most recent call last):
File "/usr/local/bin/miniconda3/envs/videollamb/lib/python3.10/runpy.py", line 187, in _run_module_as_main
mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
File "/usr/local/bin/miniconda3/envs/videollamb/lib/python3.10/runpy.py", line 110, in _get_module_details
__import__(pkg_name)
File "/root/huangjch/VideoLLaMB/llava/__init__.py", line 1, in <module>
from .model import LlavaLlamaForCausalLM
File "/root/huangjch/VideoLLaMB/llava/model/__init__.py", line 8, in <module>
from .language_model.llava_llama import LlavaLlamaForCausalLM, LlavaConfig
File "/root/huangjch/VideoLLaMB/llava/model/language_model/llava_llama.py", line 27, in <module>
from ..llava_arch import LlavaMetaModel, LlavaMetaForCausalLM
File "/root/huangjch/VideoLLaMB/llava/model/llava_arch.py", line 22, in <module>
from .multimodal_encoder.builder import build_image_tower, build_video_tower
File "/root/huangjch/VideoLLaMB/llava/model/multimodal_encoder/builder.py", line 10, in <module>
from .languagebind import LanguageBindVideoTower, LanguageBindImageTower, RMTLanguageBindVideoTower
File "/root/huangjch/VideoLLaMB/llava/model/multimodal_encoder/languagebind/__init__.py", line 13, in <module>
from .video.processing_video import LanguageBindVideoProcessor
File "/root/huangjch/VideoLLaMB/llava/model/multimodal_encoder/languagebind/video/processing_video.py", line 21, in <module>
from pytorchvideo.transforms import ApplyTransformToKey, ShortSideScale, UniformTemporalSubsample
File "/usr/local/bin/miniconda3/envs/videollamb/lib/python3.10/site-packages/pytorchvideo/transforms/__init__.py", line 3, in <module>
from .augmix import AugMix # noqa
File "/usr/local/bin/miniconda3/envs/videollamb/lib/python3.10/site-packages/pytorchvideo/transforms/augmix.py", line 6, in <module>
from pytorchvideo.transforms.augmentations import (
File "/usr/local/bin/miniconda3/envs/videollamb/lib/python3.10/site-packages/pytorchvideo/transforms/augmentations.py", line 9, in <module>
import torchvision.transforms.functional_tensor as F_t
ModuleNotFoundError: No module named 'torchvision.transforms.functional_tensor'
python -m llava.serve.cli --model-path model --video-file demo/1.c.mp4
Traceback (most recent call last):
File "/usr/local/bin/miniconda3/envs/videollamb/lib/python3.10/runpy.py", line 187, in _run_module_as_main
mod_name, mod_spec, code = _get_module_details(mod_name, _Error)
File "/usr/local/bin/miniconda3/envs/videollamb/lib/python3.10/runpy.py", line 110, in _get_module_details
__import__(pkg_name)
File "/root/huangjch/VideoLLaMB/llava/__init__.py", line 1, in <module>
from .model import LlavaLlamaForCausalLM
File "/root/huangjch/VideoLLaMB/llava/model/__init__.py", line 8, in <module>
from .language_model.llava_llama import LlavaLlamaForCausalLM, LlavaConfig
File "/root/huangjch/VideoLLaMB/llava/model/language_model/llava_llama.py", line 21, in <module>
from transformers import AutoConfig, AutoModelForCausalLM, \
File "/usr/local/bin/miniconda3/envs/videollamb/lib/python3.10/site-packages/transformers/__init__.py", line 26, in <module>
from . import dependency_versions_check
File "/usr/local/bin/miniconda3/envs/videollamb/lib/python3.10/site-packages/transformers/dependency_versions_check.py", line 16, in <module>
from .utils.versions import require_version, require_version_core
File "/usr/local/bin/miniconda3/envs/videollamb/lib/python3.10/site-packages/transformers/utils/__init__.py", line 33, in <module>
from .generic import (
File "/usr/local/bin/miniconda3/envs/videollamb/lib/python3.10/site-packages/transformers/utils/generic.py", line 478, in <module>
_torch_pytree.register_pytree_node(
AttributeError: module 'torch.utils._pytree' has no attribute 'register_pytree_node'. Did you mean: '_register_pytree_node'?
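This second error suggests the interpreter is loading a pre-2.2 torch (e.g. setup 2's torch 2.1.0): to my understanding, the public `register_pytree_node` name that transformers 4.39 calls first appears in `torch.utils._pytree` in torch 2.2, while torch 2.1.x only exposes the underscored `_register_pytree_node`. A small sketch of that constraint (the 2.2 cutoff is my assumption from the `AttributeError`, not a documented matrix):

```python
# Sketch: which pytree registration name a given torch version exposes.
# Assumption: the public register_pytree_node name was introduced in torch 2.2;
# earlier releases only have the underscored _register_pytree_node.
def pytree_name_for(torch_version: str) -> str:
    """Return the registration helper name that torch version should expose."""
    major, minor = (int(part) for part in torch_version.split(".")[:2])
    if (major, minor) >= (2, 2):
        return "register_pytree_node"
    return "_register_pytree_node"

print(pytree_name_for("2.1.0"))  # setup 2: old name only -> transformers 4.39 fails
print(pytree_name_for("2.2.1"))  # setup 1's torch line: public name present
```

So with transformers pinned at 4.39.1, keeping torch at >= 2.2 (as in setup 1) should avoid this particular error, at the cost of the torchvision issue above.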
Sorry for any inconvenience. The following are the version details of the related packages:
pytorchvideo 0.1.5 pypi_0 pypi
torch 2.2.1+cu118 pypi_0 pypi
torchaudio 2.2.1+cu118 pypi_0 pypi
torchvision 0.17.1+cu118 pypi_0 pypi
transformers 4.39.1 pypi_0 pypi
For the torchvision issue, you can refer to this reply: https://github.com/AUTOMATIC1111/stable-diffusion-webui/issues/13985#issuecomment-1814870368
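The linked comment fixes the same `functional_tensor` import in another package by editing the installed file; the same pattern could be applied to pytorchvideo here. A sketch (the site-packages path is taken from the traceback above and will differ per machine; this only helps if the functions pytorchvideo uses also exist under the public `functional` name):

```shell
# Rewrite the removed private-module import inside the installed pytorchvideo
# to point at the public torchvision.transforms.functional module instead.
sed -i 's/torchvision\.transforms\.functional_tensor/torchvision.transforms.functional/' \
  /usr/local/bin/miniconda3/envs/videollamb/lib/python3.10/site-packages/pytorchvideo/transforms/augmentations.py
```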