
Problem running InternVL-1.5 inference on a V100

Open NLP-Learning opened this issue 1 year ago • 6 comments

[INFO:swift] InternVLChatModel: 25514.1861M Params (25514.1861M Trainable [100.0000%]), 402.6563M Buffers.
[INFO:swift] system: You are an AI assistant whose name is InternLM (书生·浦语).
[INFO:swift] Input `exit` or `quit` to exit the conversation.
[INFO:swift] Input `multi-line` to switch to multi-line input mode.
[INFO:swift] Input `reset-system` to reset the system and clear the history.
[INFO:swift] Input `clear` to clear the history.
[INFO:swift] Please enter the conversation content first, followed by the path to the multimedia file.
<<< Describe the content of this image
Input a media path or URL <<< https://img2.baidu.com/it/u=2085854734,3872819026&fm=253&fmt=auto&app=138&f=JPEG?w=762&h=500
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5/modeling_internlm2.py", line 57, in _import_flash_attn
    from flash_attn import flash_attn_func as _flash_attn_func
ModuleNotFoundError: No module named 'flash_attn'

Hello, I followed your tutorial on a V100, but I still run into this problem in the end. Is there a way to work around it on a V100 GPU?

NLP-Learning avatar May 06 '24 11:05 NLP-Learning

Pull the latest code and use:

`--use_flash_attn false`

hjh0119 avatar May 07 '24 05:05 hjh0119

Pull the latest code and use:

`--use_flash_attn false`

Thanks for the reply, but it still errors. Could you please advise further:

[INFO:swift] InternVLChatModel: 25514.1861M Params (613.0541M Trainable [2.4028%]), 402.6563M Buffers.
[INFO:swift] system: You are an AI assistant whose name is InternLM (书生·浦语).
[INFO:swift] Input `exit` or `quit` to exit the conversation.
[INFO:swift] Input `multi-line` to switch to multi-line input mode.
[INFO:swift] Input `reset-system` to reset the system and clear the history.
[INFO:swift] Input `clear` to clear the history.
[INFO:swift] Please enter the conversation content first, followed by the path to the multimedia file.
<<< Describe this image in as much detail as possible
Input a media path or URL <<< http://t13.baidu.com/it/u=2673063178,3630151739&fm=224&app=112&f=JPEG?w=500&h=500
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-Int8/modeling_internlm2.py", line 57, in _import_flash_attn
    from flash_attn import flash_attn_func as _flash_attn_func
ModuleNotFoundError: No module named 'flash_attn'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/swift/llm/utils/model.py", line 2447, in _new_generate
    return generate(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-Int8/modeling_internvl_chat.py", line 353, in generate
    outputs = self.language_model.generate(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/transformers/generation/utils.py", line 1622, in generate
    result = self._sample(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/transformers/generation/utils.py", line 2791, in _sample
    outputs = self(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/swift/llm/utils/model.py", line 2470, in _new_forward
    output = old_forward(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/accelerate/hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-Int8/modeling_internlm2.py", line 1052, in forward
    outputs = self.model(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/accelerate/hooks.py", line 166, in new_forward
    output = module._old_forward(*args, **kwargs)
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-Int8/modeling_internlm2.py", line 859, in forward
    _import_flash_attn()
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-Int8/modeling_internlm2.py", line 67, in _import_flash_attn
    raise ImportError('flash_attn is not installed.')
ImportError: flash_attn is not installed.

The command was:

CUDA_VISIBLE_DEVICES=6 swift infer --model_type internvl-chat-v1_5 --model_id_or_path /data/InternVL-Chat-V1-5-Int8/ --use_flash_attn false

NLP-Learning avatar May 07 '24 11:05 NLP-Learning

Thanks for the reply, but it still errors. Could you please advise further:

The Int8 version hasn't been adapted for compatibility yet; the original model should work fine.

You can try changing the `attn_implementation` value in the local model's config.json to `eager`.
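The config.json edit suggested above can also be scripted. A minimal sketch (the stand-in file below is created just for demonstration; in practice you would point `config_path` at the model's own config.json):

```python
import json
import os
import tempfile

# Create a stand-in config.json for demonstration purposes only;
# in practice, config_path would be e.g. /data/InternVL-Chat-V1-5-Int8/config.json.
tmpdir = tempfile.mkdtemp()
config_path = os.path.join(tmpdir, "config.json")
with open(config_path, "w") as f:
    json.dump({"attn_implementation": "flash_attention_2"}, f)

# The actual fix: load the config, switch to the eager attention
# implementation so flash_attn is never imported, and write it back.
with open(config_path) as f:
    config = json.load(f)
config["attn_implementation"] = "eager"
with open(config_path, "w") as f:
    json.dump(config, f, indent=2, ensure_ascii=False)
```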

hjh0119 avatar May 07 '24 13:05 hjh0119

Thanks for the reply, but it still errors. Could you please advise further:

The Int8 version hasn't been adapted for compatibility yet; the original model should work fine.

You can try changing the `attn_implementation` value in the local model's config.json to `eager`.

I changed the `attn_implementation` value for the Int8 model as you suggested, but it still errors:

[INFO:swift] InternVLChatModel: 25514.1861M Params (613.0541M Trainable [2.4028%]), 402.6563M Buffers.
[INFO:swift] system: You are an AI assistant whose name is InternLM (书生·浦语).
[INFO:swift] Input `exit` or `quit` to exit the conversation.
[INFO:swift] Input `multi-line` to switch to multi-line input mode.
[INFO:swift] Input `reset-system` to reset the system and clear the history.
[INFO:swift] Input `clear` to clear the history.
[INFO:swift] Please enter the conversation content first, followed by the path to the multimedia file.
<<< Describe the content of this image in detail
Input a media path or URL <<< http://t13.baidu.com/it/u=2673063178,3630151739&fm=224&app=112&f=JPEG?w=500&h=500
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/swift/llm/utils/model.py", line 2447, in _new_generate
    return generate(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5-Int8/modeling_internvl_chat.py", line 353, in generate
    outputs = self.language_model.generate(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/transformers/generation/utils.py", line 1622, in generate
    result = self._sample(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/transformers/generation/utils.py", line 2829, in _sample
    next_tokens = torch.multinomial(probs, num_samples=1).squeeze(1)
RuntimeError: probability tensor contains either `inf`, `nan` or element < 0

Then, instead of the quantized model, I loaded the original internvl-chat-v1_5 with the following command:

CUDA_VISIBLE_DEVICES=1,2 swift infer --model_type internvl-chat-v1_5 --model_id_or_path /data/InternVL-Chat-V1-5/ --use_flash_attn false

It still errored:

[INFO:swift] InternVLChatModel: 25514.1861M Params (25514.1861M Trainable [100.0000%]), 402.6563M Buffers.
[INFO:swift] system: You are an AI assistant whose name is InternLM (书生·浦语).
[INFO:swift] Input `exit` or `quit` to exit the conversation.
[INFO:swift] Input `multi-line` to switch to multi-line input mode.
[INFO:swift] Input `reset-system` to reset the system and clear the history.
[INFO:swift] Input `clear` to clear the history.
[INFO:swift] Please enter the conversation content first, followed by the path to the multimedia file.
<<< Explain the content of this image in detail
Input a media path or URL <<< http://t13.baidu.com/it/u=2673063178,3630151739&fm=224&app=112&f=JPEG?w=500&h=500
Exception in thread Thread-2:
Traceback (most recent call last):
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5/modeling_internlm2.py", line 57, in _import_flash_attn
    from flash_attn import flash_attn_func as _flash_attn_func
ModuleNotFoundError: No module named 'flash_attn'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/threading.py", line 980, in _bootstrap_inner
    self.run()
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/threading.py", line 917, in run
    self._target(*self._args, **self._kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/swift/llm/utils/model.py", line 2447, in _new_generate
    return generate(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5/modeling_internvl_chat.py", line 359, in generate
    outputs = self.language_model.generate(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/transformers/generation/utils.py", line 1622, in generate
    result = self._sample(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/transformers/generation/utils.py", line 2791, in _sample
    outputs = self(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/swift/llm/utils/model.py", line 2470, in _new_forward
    output = old_forward(*args, **kwargs)
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5/modeling_internlm2.py", line 1052, in forward
    outputs = self.model(
  File "/data/sunyuan/anaconda3/envs/swift/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1501, in _call_impl
    return forward_call(*args, **kwargs)
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5/modeling_internlm2.py", line 859, in forward
    _import_flash_attn()
  File "/home/sunyuan/.cache/huggingface/modules/transformers_modules/InternVL-Chat-V1-5/modeling_internlm2.py", line 67, in _import_flash_attn
    raise ImportError('flash_attn is not installed.')
ImportError: flash_attn is not installed.

However, after changing `attn_implementation` to `eager` in the original internvl-chat-v1_5's config.json, the command above works! Thanks for your great work!


NLP-Learning avatar May 07 '24 15:05 NLP-Learning

Thanks for the feedback. I'll fix it tomorrow.

hjh0119 avatar May 07 '24 16:05 hjh0119

The Int8 model is now supported.

For GPUs that don't support flash attention, you can now pass `--use_flash_attn false` to train and run inference normally.
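The behavior described above amounts to choosing the attention backend based on whether flash_attn is actually available. An illustrative sketch of that selection logic (this is not the actual ms-swift implementation; the function name is hypothetical):

```python
import importlib.util


def pick_attn_implementation(use_flash_attn: bool) -> str:
    """Choose an attention backend, falling back to eager when
    flash_attn is unavailable (e.g. on V100-class GPUs)."""
    if use_flash_attn and importlib.util.find_spec("flash_attn") is not None:
        return "flash_attention_2"
    # Either the user disabled flash attention or the package is absent.
    return "eager"
```

With `--use_flash_attn false`, the model is loaded with the `eager` backend, so the `flash_attn` import in `modeling_internlm2.py` is never attempted.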

hjh0119 avatar May 08 '24 11:05 hjh0119