ms-swift
Problem with InternVL-1.5 inference on a V100
[INFO:swift] InternVLChatModel: 25514.1861M Params (25514.1861M Trainable [100.0000%]), 402.6563M Buffers.
[INFO:swift] system: You are an AI assistant whose name is InternLM (书生·浦语).
[INFO:swift] Input `exit` or `quit` to exit the conversation.
[INFO:swift] Input `multi-line` to switch to multi-line input mode.
[INFO:swift] Input `reset-system` to reset the system and clear the history.
[INFO:swift] Input `clear` to clear the history.
[INFO:swift] Please enter the conversation content first, followed by the path to the multimedia file.
<<< Describe the content of this image
Input a media path or URL <<< https://img2.baidu.com/it/u=2085854734,3872819026&fm=253&fmt=auto&app=138&f=JPEG?w=762&h=500
Exception in thread Thread-2:
Traceback (most recent call last):
File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5/modeling_internlm2.py", line 57, in _import_flash_attn
from flash_attn import flash_attn_func as _flash_attn_func
ModuleNotFoundError: No module named 'flash_attn'
Hello, I followed your tutorial on a V100, but I still run into this error in the end. Is there a way around it on a V100 GPU?
Pull the latest code
--use_flash_attn false
Thanks for the reply, but it still errors out. Could you please advise further:
[INFO:swift] InternVLChatModel: 25514.1861M Params (613.0541M Trainable [2.4028%]), 402.6563M Buffers.
[INFO:swift] system: You are an AI assistant whose name is InternLM (书生·浦语).
[INFO:swift] Input `exit` or `quit` to exit the conversation.
[INFO:swift] Input `multi-line` to switch to multi-line input mode.
[INFO:swift] Input `reset-system` to reset the system and clear the history.
[INFO:swift] Input `clear` to clear the history.
[INFO:swift] Please enter the conversation content first, followed by the path to the multimedia file.
<<< Describe this image in as much detail as possible
Input a media path or URL <<< http://t13.baidu.com/it/u=2673063178,3630151739&fm=224&app=112&f=JPEG?w=500&h=500
Exception in thread Thread-2:
Traceback (most recent call last):
File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-Int8/modeling_internlm2.py", line 57, in _import_flash_attn
from flash_attn import flash_attn_func as _flash_attn_func
ModuleNotFoundError: No module named 'flash_attn'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/threading.py", line 980, in _bootstrap_inner
self.run()
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/threading.py", line 917, in run
self._target(*self._args, **self._kwargs)
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/swift/llm/utils/model.py", line 2447, in _new_generate
return generate(*args, **kwargs)
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-Int8/modeling_internvl_chat.py", line 353, in generate
outputs = self.language_model.generate(
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/transformers/generation/utils.py", line 1622, in generate
result = self._sample(
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/transformers/generation/utils.py", line 2791, in _sample
outputs = self(
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/swift/llm/utils/model.py", line 2470, in _new_forward
output = old_forward(*args, **kwargs)
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/accelerate/hooks.py", line 166, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-Int8/modeling_internlm2.py", line 1052, in forward
outputs = self.model(
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/accelerate/hooks.py", line 166, in new_forward
output = module._old_forward(*args, **kwargs)
File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-Int8/modeling_internlm2.py", line 859, in forward
_import_flash_attn()
File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-Int8/modeling_internlm2.py", line 67, in _import_flash_attn
raise ImportError('flash_attn is not installed.')
ImportError: flash_attn is not installed.
The command is as follows:
CUDA_VISIBLE_DEVICES=6 swift infer --model_type internvl-chat-v1_5 --model_id_or_path /data/InternVL-Chat-V1-5-Int8/ --use_flash_attn false
The int8 version hasn't been made compatible yet; the original version should work fine.
You can try changing the attn_implementation value in the local model's config.json file to eager.
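For reference, a minimal sketch of that edit in Python (the path is the local Int8 checkout from this thread; the llm_config sub-key is an assumption based on common InternVL-Chat config layouts):

```python
import json

# Local model directory from this thread; adjust to your own path.
cfg_path = "/data/InternVL-Chat-V1-5-Int8/config.json"

with open(cfg_path) as f:
    cfg = json.load(f)

# Use the eager attention implementation so modeling_internlm2.py
# never tries to import flash_attn.
cfg["attn_implementation"] = "eager"
# Assumption: the language-model sub-config may carry its own copy of the flag.
if isinstance(cfg.get("llm_config"), dict):
    cfg["llm_config"]["attn_implementation"] = "eager"

with open(cfg_path, "w") as f:
    json.dump(cfg, f, indent=2, ensure_ascii=False)
```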
I changed the attn_implementation value for the Int8 model as you suggested, but it still errors out:
[INFO:swift] InternVLChatModel: 25514.1861M Params (613.0541M Trainable [2.4028%]), 402.6563M Buffers.
[INFO:swift] system: You are an AI assistant whose name is InternLM (书生·浦语).
[INFO:swift] Input `exit` or `quit` to exit the conversation.
[INFO:swift] Input `multi-line` to switch to multi-line input mode.
[INFO:swift] Input `reset-system` to reset the system and clear the history.
[INFO:swift] Input `clear` to clear the history.
[INFO:swift] Please enter the conversation content first, followed by the path to the multimedia file.
<<< Describe the content of this image in detail
Input a media path or URL <<< http://t13.baidu.com/it/u=2673063178,3630151739&fm=224&app=112&f=JPEG?w=500&h=500
Exception in thread Thread-2:
Traceback (most recent call last):
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/threading.py", line 980, in _bootstrap_inner
self.run()
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/threading.py", line 917, in run
self._target(*self._args, **self._kwargs)
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/swift/llm/utils/model.py", line 2447, in _new_generate
return generate(*args, **kwargs)
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-Int8/modeling_internvl_chat.py", line 353, in generate
outputs = self.language_model.generate(
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/transformers/generation/utils.py", line 1622, in generate
result = self._sample(
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/transformers/generation/utils.py", line 2829, in _sample
next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0
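This RuntimeError comes from torch.multinomial rejecting an invalid probability distribution: if the model emits inf or nan logits (for example from numeric overflow in a quantized model), softmax produces nan probabilities. A minimal illustration, unrelated to this specific model:

```python
import torch

# If any logit is inf/nan, softmax yields an invalid probability tensor
# and sampling fails with the same error seen in the traceback above.
logits = torch.tensor([float("inf"), 1.0, 2.0])
probs = torch.softmax(logits, dim=-1)  # contains nan
try:
    torch.multinomial(probs, num_samples=1)
except RuntimeError as e:
    print(e)  # probability tensor contains either `inf`, `nan` or element < 0
```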
Then, instead of the quantized model, I loaded the original internvl-chat-v1_5 with the following command:
CUDA_VISIBLE_DEVICES=1,2 swift infer --model_type internvl-chat-v1_5 --model_id_or_path /data/InternVL-Chat-V1-5/ --use_flash_attn false
It still errors out:
[INFO:swift] InternVLChatModel: 25514.1861M Params (25514.1861M Trainable [100.0000%]), 402.6563M Buffers.
[INFO:swift] system: You are an AI assistant whose name is InternLM (书生·浦语).
[INFO:swift] Input `exit` or `quit` to exit the conversation.
[INFO:swift] Input `multi-line` to switch to multi-line input mode.
[INFO:swift] Input `reset-system` to reset the system and clear the history.
[INFO:swift] Input `clear` to clear the history.
[INFO:swift] Please enter the conversation content first, followed by the path to the multimedia file.
<<< Explain the content of this image in detail
Input a media path or URL <<< http://t13.baidu.com/it/u=2673063178,3630151739&fm=224&app=112&f=JPEG?w=500&h=500
Exception in thread Thread-2:
Traceback (most recent call last):
File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5/modeling_internlm2.py", line 57, in _import_flash_attn
from flash_attn import flash_attn_func as _flash_attn_func
ModuleNotFoundError: No module named 'flash_attn'
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/threading.py", line 980, in _bootstrap_inner
self.run()
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/threading.py", line 917, in run
self._target(*self._args, **self._kwargs)
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/swift/llm/utils/model.py", line 2447, in _new_generate
return generate(*args, **kwargs)
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5/modeling_internvl_chat.py", line 359, in generate
outputs = self.language_model.generate(
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
return func(*args, **kwargs)
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/transformers/generation/utils.py", line 1622, in generate
result = self._sample(
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/transformers/generation/utils.py", line 2791, in _sample
outputs = self(
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/swift/llm/utils/model.py", line 2470, in _new_forward
output = old_forward(*args, **kwargs)
File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5/modeling_internlm2.py", line 1052, in forward
outputs = self.model(
File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
return forward_call(*args, **kwargs)
File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5/modeling_internlm2.py", line 859, in forward
_import_flash_attn()
File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5/modeling_internlm2.py", line 67, in _import_flash_attn
raise ImportError('flash_attn is not installed.')
ImportError: flash_attn is not installed.
However, after changing attn_implementation to eager in the original internvl-chat-v1_5's config.json, the command above works! Thanks for your great work!
Thanks for the feedback; I'll fix this tomorrow.
The int8 model is now compatible.
For GPUs that don't support flash attention, you can now use `--use_flash_attn false` to train and run inference normally.
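For anyone adapting other custom model code, the fix amounts to a guard of roughly this shape (a sketch of the general pattern, not the actual ms-swift or modeling_internlm2.py source):

```python
# Probe for flash_attn once, and fall back to eager attention when it is
# unavailable (e.g. on V100-class GPUs that flash-attn does not support).
try:
    from flash_attn import flash_attn_func  # noqa: F401
    HAS_FLASH_ATTN = True
except ImportError:
    HAS_FLASH_ATTN = False

def resolve_attn_implementation(requested: str) -> str:
    """Return an attention implementation usable in this environment."""
    if requested == "flash_attention_2" and not HAS_FLASH_ATTN:
        return "eager"
    return requested
```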