SFT error: ValueError: YiForCausalLM does not support Flash Attention 2.0 yet.
Running the SFT script as described in the README:
cd finetune/scripts
bash run_sft_Yi_6b.sh
Error output:
[2024-01-02 10:43:01,920] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-01-02 10:43:04,373] [WARNING] [runner.py:203:fetch_hostfile] Unable to find hostfile, will proceed with training with local resources only.
Detected CUDA_VISIBLE_DEVICES=0,1,2,3: setting --include=localhost:0,1,2,3
[2024-01-02 10:43:04,373] [INFO] [runner.py:570:main] cmd = /data/xxxx/conda/miniconda/envs/llm_yi/bin/python -u -m deepspeed.launcher.launch --world_info=eyJsb2NhbGhvc3QiOiBbMCwgMSwgMiwgM119 --master_addr=127.0.0.1 --master_port=29500 --enable_each_rank_log=None main.py --data_path ../yi_example_dataset/ --model_name_or_path /xxxxYi/Yi-6B --per_device_train_batch_size 1 --per_device_eval_batch_size 1 --max_seq_len 4096 --learning_rate 2e-6 --weight_decay 0. --num_train_epochs 4 --training_debug_steps 20 --gradient_accumulation_steps 1 --lr_scheduler_type cosine --num_warmup_steps 0 --seed 1234 --gradient_checkpointing --zero_stage 2 --deepspeed --offload --output_dir ./finetuned_model
[2024-01-02 10:43:06,184] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-01-02 10:43:07,963] [INFO] [launch.py:145:main] WORLD INFO DICT: {'localhost': [0, 1, 2, 3]}
[2024-01-02 10:43:07,963] [INFO] [launch.py:151:main] nnodes=1, num_local_procs=4, node_rank=0
[2024-01-02 10:43:07,963] [INFO] [launch.py:162:main] global_rank_mapping=defaultdict(<class 'list'>, {'localhost': [0, 1, 2, 3]})
[2024-01-02 10:43:07,963] [INFO] [launch.py:163:main] dist_world_size=4
[2024-01-02 10:43:07,963] [INFO] [launch.py:165:main] Setting CUDA_VISIBLE_DEVICES=0,1,2,3
[2024-01-02 10:43:09,740] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-01-02 10:43:09,820] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-01-02 10:43:09,829] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-01-02 10:43:09,869] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(
[2024-01-02 10:43:11,832] [INFO] [comm.py:637:init_distributed] cdb=None
[2024-01-02 10:43:11,832] [INFO] [comm.py:668:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(
[2024-01-02 10:43:12,099] [INFO] [comm.py:637:init_distributed] cdb=None
/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(
[2024-01-02 10:43:12,181] [INFO] [comm.py:637:init_distributed] cdb=None
/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(
[2024-01-02 10:43:12,240] [INFO] [comm.py:637:init_distributed] cdb=None
tokenizer path exist   (printed by each of the 4 ranks; interleaved in the raw log)
The model was loaded with use_flash_attention_2=True, which is deprecated and may be removed in a future release. Please use `attn_implementation="flash_attention_2"` instead.
Traceback (most recent call last):
  File "/data/xxxx/ai_parse/Yi/finetune/sft/main.py", line 415, in <module>
    main()
  File "/data/xxxx/ai_parse/Yi/finetune/sft/main.py", line 253, in main
    model = create_hf_model(
  File "/data/xxxx/ai_parse/Yi/finetune/utils/model/model_utils.py", line 30, in create_hf_model
    model = model_class.from_pretrained(
  File "/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 561, in from_pretrained
    return model_class.from_pretrained(
  File "/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3456, in from_pretrained
    config = cls._autoset_attn_implementation(
  File "/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1302, in _autoset_attn_implementation
    cls._check_and_enable_flash_attn_2(
  File "/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1382, in _check_and_enable_flash_attn_2
    raise ValueError(
ValueError: YiForCausalLM does not support Flash Attention 2.0 yet. Please open an issue on GitHub to request support for this architecture: https://github.com/huggingface/transformers/issues/new
(the identical warning and traceback are emitted by all four ranks; their interleaved output is deduplicated here)
[2024-01-02 10:43:16,976] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 90923
[2024-01-02 10:43:16,991] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 90924
[2024-01-02 10:43:16,991] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 90925
[2024-01-02 10:43:16,998] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 90926
[2024-01-02 10:43:17,006] [ERROR] [launch.py:321:sigkill_handler] ['/data/xxxx/conda/miniconda/envs/llm_yi/bin/python', '-u', 'main.py', '--local_rank=3', '--data_path', '../yi_example_dataset/', '--model_name_or_path', '/xxxxYi/Yi-6B', '--per_device_train_batch_size', '1', '--per_device_eval_batch_size', '1', '--max_seq_len', '4096', '--learning_rate', '2e-6', '--weight_decay', '0.', '--num_train_epochs', '4', '--training_debug_steps', '20', '--gradient_accumulation_steps', '1', '--lr_scheduler_type', 'cosine', '--num_warmup_steps', '0', '--seed', '1234', '--gradient_checkpointing', '--zero_stage', '2', '--deepspeed', '--offload', '--output_dir', './finetuned_model'] exits with return code = 1
Environment:
GPU: A100 × 4
config.json:
{
  "architectures": [
    "YiForCausalLM"
  ],
  "auto_map": {
    "AutoConfig": "configuration_yi.YiConfig",
    "AutoModel": "modeling_yi.YiModel",
    "AutoModelForCausalLM": "modeling_yi.YiForCausalLM"
  },
  "bos_token_id": 1,
  "eos_token_id": 2,
  "hidden_act": "silu",
  "hidden_size": 4096,
  "initializer_range": 0.02,
  "intermediate_size": 11008,
  "max_position_embeddings": 200000,
  "model_type": "Yi",
  "num_attention_heads": 32,
  "num_hidden_layers": 32,
  "num_key_value_heads": 4,
  "pad_token_id": 0,
  "rms_norm_eps": 1e-05,
  "rope_theta": 5000000.0,
  "tie_word_embeddings": false,
  "torch_dtype": "bfloat16",
  "transformers_version": "4.34.0",
  "use_cache": true,
  "vocab_size": 64000
}
Python packages (pip list):
accelerate 0.23.0
aiohttp 3.8.6
aiosignal 1.3.1
annotated-types 0.6.0
asttokens 2.4.1
async-timeout 4.0.3
attrs 23.1.0
beautifulsoup4 4.12.2
certifi 2023.7.22
charset-normalizer 3.3.0
click 8.1.7
cmake 3.27.7
comm 0.2.0
conda-pack 0.7.1
datasets 2.14.5
debugpy 1.8.0
decorator 5.1.1
deepspeed 0.12.2
dill 0.3.7
einops 0.7.0
exceptiongroup 1.2.0
executing 2.0.1
filelock 3.12.4
flash-attn 2.3.3
frozenlist 1.4.0
fsspec 2023.6.0
hjson 3.1.0
huggingface-hub 0.20.1
idna 3.4
ipykernel 6.28.0
ipython 8.19.0
jedi 0.19.1
Jinja2 3.1.2
jsonschema 4.20.0
jsonschema-specifications 2023.12.1
jupyter_client 8.6.0
jupyter_core 5.5.1
lit 17.0.2
MarkupSafe 2.1.3
matplotlib-inline 0.1.6
mpmath 1.3.0
msgpack 1.0.7
multidict 6.0.4
multiprocess 0.70.15
nest-asyncio 1.5.8
networkx 3.1
ninja 1.11.1.1
numpy 1.26.0
nvidia-cublas-cu11 11.10.3.66
nvidia-cuda-cupti-cu11 11.7.101
nvidia-cuda-nvrtc-cu11 11.7.99
nvidia-cuda-runtime-cu11 11.7.99
nvidia-cudnn-cu11 8.5.0.96
nvidia-cufft-cu11 10.9.0.58
nvidia-curand-cu11 10.2.10.91
nvidia-cusolver-cu11 11.4.0.1
nvidia-cusparse-cu11 11.7.4.91
nvidia-nccl-cu11 2.14.3
nvidia-nvtx-cu11 11.7.91
packaging 23.2
pandas 2.1.1
parso 0.8.3
pexpect 4.9.0
pip 23.2.1
platformdirs 4.1.0
prompt-toolkit 3.0.43
protobuf 4.25.1
psutil 5.9.5
ptyprocess 0.7.0
pure-eval 0.2.2
py-cpuinfo 9.0.0
pyarrow 13.0.0
pydantic 2.4.2
pydantic_core 2.10.1
Pygments 2.17.2
pynvml 11.5.0
python-dateutil 2.8.2
pytz 2023.3.post1
PyYAML 6.0.1
pyzmq 25.1.2
ray 2.7.0
referencing 0.32.0
regex 2023.10.3
requests 2.31.0
rpds-py 0.16.2
safetensors 0.4.0
sentencepiece 0.1.99
setuptools 68.0.0
six 1.16.0
soupsieve 2.5
stack-data 0.6.3
sympy 1.12
tokenizers 0.15.0
torch 2.0.1
tornado 6.4
tqdm 4.66.1
traitlets 5.14.0
transformers 4.36.2
triton 2.0.0
typing_extensions 4.8.0
tzdata 2023.3
urllib3 2.0.6
wcwidth 0.2.12
wheel 0.41.2
xxhash 3.4.1
yarl 1.9.2
As advised, the fix for this first error is to download the latest model files.
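For context (my reading of the two tracebacks, not something stated in the thread): the original checkpoint's config.json maps the "Yi" architecture to custom remote code (modeling_yi.YiForCausalLM), and transformers 4.36 rejects Flash Attention 2 for any model class that does not declare support for it; the re-released files use the built-in LlamaForCausalLM (note modeling_llama.py in the traceback below), which does declare it. A quick way to see which architecture a checkpoint resolves to:

```python
# Sketch (the path is the placeholder from the logs above): inspect the
# architecture a checkpoint declares before loading any weights. The old Yi
# files report the custom "YiForCausalLM" wired in via auto_map; the
# re-released files report "LlamaForCausalLM", which passes the FA2 check.
from transformers import AutoConfig

config = AutoConfig.from_pretrained("/xxxx/Yi/Yi-6B", trust_remote_code=True)
print(config.architectures)  # ['YiForCausalLM'] (old) vs. ['LlamaForCausalLM'] (new)
```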
After downloading the latest model files and rerunning SFT, the following error is raised:
[2024-01-02 15:57:58,042] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-01-02 15:57:59,681] [INFO] [launch.py:145:main] WORLD INFO DICT: {'localhost': [0, 1, 2, 3, 4, 5, 6, 7]}
[2024-01-02 15:57:59,681] [INFO] [launch.py:151:main] nnodes=1, num_local_procs=8, node_rank=0
[2024-01-02 15:57:59,681] [INFO] [launch.py:162:main] global_rank_mapping=defaultdict(<class 'list'>, {'localhost': [0, 1, 2, 3, 4, 5, 6, 7]})
[2024-01-02 15:57:59,681] [INFO] [launch.py:163:main] dist_world_size=8
[2024-01-02 15:57:59,681] [INFO] [launch.py:165:main] Setting CUDA_VISIBLE_DEVICES=0,1,2,3,4,5,6,7
[2024-01-02 15:58:01,434] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-01-02 15:58:01,455] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-01-02 15:58:01,491] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-01-02 15:58:01,501] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-01-02 15:58:01,513] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-01-02 15:58:01,525] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-01-02 15:58:01,542] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
[2024-01-02 15:58:01,548] [INFO] [real_accelerator.py:158:get_accelerator] Setting ds_accelerator to cuda (auto detect)
/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(
[2024-01-02 15:58:04,583] [INFO] [comm.py:637:init_distributed] cdb=None
/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(
/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(
/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(
/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(
/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(
/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(
[2024-01-02 15:58:05,357] [INFO] [comm.py:637:init_distributed] cdb=None
[2024-01-02 15:58:05,357] [INFO] [comm.py:637:init_distributed] cdb=None
[2024-01-02 15:58:05,362] [INFO] [comm.py:637:init_distributed] cdb=None
/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/transformers/deepspeed.py:23: FutureWarning: transformers.deepspeed module is deprecated and will be removed in a future version. Please import deepspeed modules directly from transformers.integrations
warnings.warn(
[2024-01-02 15:58:05,393] [INFO] [comm.py:637:init_distributed] cdb=None
[2024-01-02 15:58:05,393] [INFO] [comm.py:668:init_distributed] Initializing TorchBackend in DeepSpeed with backend nccl
[2024-01-02 15:58:05,393] [INFO] [comm.py:637:init_distributed] cdb=None
[2024-01-02 15:58:05,395] [INFO] [comm.py:637:init_distributed] cdb=None
[2024-01-02 15:58:05,401] [INFO] [comm.py:637:init_distributed] cdb=None
tokenizer path exist   (printed by each of the 8 ranks; interleaved in the raw log)
The model was loaded with use_flash_attention_2=True, which is deprecated and may be removed in a future release. Please use `attn_implementation="flash_attention_2"` instead.
You are attempting to use Flash Attention 2.0 without specifying a torch dtype. This might lead to unexpected behaviour
You are attempting to use Flash Attention 2.0 with a model not initialized on GPU. Make sure to move the model to GPU after initializing it on CPU with `model.to('cuda')`.
Traceback (most recent call last):
  File "/data/xxxx/ai_parse/Yi/finetune/sft/main.py", line 415, in <module>
    main()
  File "/data/xxxx/ai_parse/Yi/finetune/sft/main.py", line 253, in main
    model = create_hf_model(
  File "/data/xxxx/ai_parse/Yi/finetune/utils/model/model_utils.py", line 30, in create_hf_model
    model = model_class.from_pretrained(
  File "/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/transformers/models/auto/auto_factory.py", line 566, in from_pretrained
    return model_class.from_pretrained(
  File "/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/transformers/modeling_utils.py", line 3462, in from_pretrained
    model = cls(config, *model_args, **model_kwargs)
  File "/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/transformers/models/llama/modeling_llama.py", line 1108, in __init__
    super().__init__(config)
  File "/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1190, in __init__
    config = self._autoset_attn_implementation(
  File "/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1302, in _autoset_attn_implementation
    cls._check_and_enable_flash_attn_2(
  File "/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/transformers/modeling_utils.py", line 1422, in _check_and_enable_flash_attn_2
    raise ValueError(
ValueError: Flash Attention 2.0 only supports torch.float16 and torch.bfloat16 dtypes. You passed torch.float32, this might lead to unexpected behaviour.
(all eight ranks emit the same warnings and traceback; their interleaved output is deduplicated here)
[2024-01-02 15:58:14,702] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 59071
[2024-01-02 15:58:14,719] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 59072
[2024-01-02 15:58:14,777] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 59073
[2024-01-02 15:58:14,784] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 59074
[2024-01-02 15:58:14,790] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 59075
[2024-01-02 15:58:14,791] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 59076
[2024-01-02 15:58:14,797] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 59077
[2024-01-02 15:58:14,803] [INFO] [launch.py:315:sigkill_handler] Killing subprocess 59078
[2024-01-02 15:58:14,810] [ERROR] [launch.py:321:sigkill_handler] ['/data/xxxx/conda/miniconda/envs/llm_yi/bin/python', '-u', 'main.py', '--local_rank=7', '--data_path', '../yi_example_dataset/', '--model_name_or_path', '/xxxx/Yi/Yi-6B', '--per_device_train_batch_size', '1', '--per_device_eval_batch_size', '1', '--max_seq_len', '4096', '--learning_rate', '2e-6', '--weight_decay', '0.', '--num_train_epochs', '4', '--training_debug_steps', '20', '--gradient_accumulation_steps', '1', '--lr_scheduler_type', 'cosine', '--num_warmup_steps', '0', '--seed', '1234', '--gradient_checkpointing', '--zero_stage', '2', '--deepspeed', '--offload', '--output_dir', './finetuned_model'] exits with return code = 1
Similar to this issue. The solution provided there might be helpful.
It turned out that Yi's SFT code is missing two parameter lines.
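The exact lines were not preserved when this thread was exported; judging from the dtype ValueError above, they are most likely FA2-compatible kwargs on the from_pretrained call in finetune/utils/model/model_utils.py, along these lines (a hypothetical reconstruction, not the repo's exact patch):

```python
# Hypothetical reconstruction; the thread does not preserve the exact lines.
# The "You passed torch.float32" ValueError suggests the model was loaded in
# the fp32 default, so from_pretrained likely needs kwargs like these:
import torch
from transformers import AutoModelForCausalLM

model = AutoModelForCausalLM.from_pretrained(
    "/xxxx/Yi/Yi-6B",                         # placeholder path from the logs
    torch_dtype=torch.bfloat16,               # FA2 only supports fp16/bf16
    attn_implementation="flash_attention_2",  # replaces deprecated use_flash_attention_2=True
)
```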
After adding them and rerunning, a new error comes up:
Loading extension module cpu_adam...
Traceback (most recent call last):
  File "/data/xxxx/ai_parse/Yi/finetune/sft/main.py", line 415, in <module>
    main()
  File "/data/xxxx/ai_parse/Yi/finetune/sft/main.py", line 330, in main
    optimizer = AdamOptimizer(
  File "/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/deepspeed/ops/adam/cpu_adam.py", line 94, in __init__
    self.ds_opt_adam = CPUAdamBuilder().load()
  File "/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/deepspeed/ops/op_builder/builder.py", line 452, in load
    return self.jit_load(verbose)
  File "/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/deepspeed/ops/op_builder/builder.py", line 501, in jit_load
    op_module = load(name=self.name,
  File "/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1284, in load
    return _jit_compile(
  File "/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1535, in _jit_compile
    return _import_module_from_library(name, build_directory, is_python_module)
  File "/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/torch/utils/cpp_extension.py", line 1929, in _import_module_from_library
    module = importlib.util.module_from_spec(spec)
  File "<frozen importlib._bootstrap>", line 571, in module_from_spec
  File "<frozen importlib._bootstrap_external>", line 1176, in create_module
  File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
ImportError: /data/xxxx/.cache/torch_extensions/py310_cu117/cpu_adam/cpu_adam.so: cannot open shared object file: No such file or directory
(the same ImportError is raised on every rank)
Exception ignored in: <function DeepSpeedCPUAdam.__del__ at 0x7f9aa027f520>
Traceback (most recent call last):
  File "/data/xxxx/conda/miniconda/envs/llm_yi/lib/python3.10/site-packages/deepspeed/ops/adam/cpu_adam.py", line 102, in __del__
    self.ds_opt_adam.destroy_adam(self.opt_id)
AttributeError: 'DeepSpeedCPUAdam' object has no attribute 'ds_opt_adam'
(one such "Exception ignored" block is printed per rank; only the first is kept here)
I saw a similar issue in the official DeepSpeed repo; it is probably due to the cuda-toolkit version. Hope you will find this helpful: microsoft/DeepSpeed#1846
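Building on that pointer, here is a sketch of what I would check (assumptions: nvcc is on PATH and the cache path matches the ImportError above). cpu_adam.so is JIT-compiled the first time DeepSpeed needs it; if the local CUDA toolkit does not match the CUDA build of torch (cu117 here), the compile fails and the later import finds no .so file. Clearing the stale extension cache forces a rebuild once the toolkit is fixed:

```python
# Diagnostic sketch: compare the CUDA version torch was built against with the
# local toolkit, then drop the stale JIT build cache so DeepSpeed recompiles
# cpu_adam on the next launch.
import shutil
import subprocess
from pathlib import Path

import torch

print("torch CUDA build:", torch.version.cuda)  # '11.7' for the cu117 wheels above
print(subprocess.run(["nvcc", "--version"], capture_output=True, text=True).stdout)

cache = Path.home() / ".cache" / "torch_extensions" / "py310_cu117" / "cpu_adam"
shutil.rmtree(cache, ignore_errors=True)        # force a clean rebuild
```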
I ran into a similar problem. Did you manage to solve it? @zhangxiann
Not yet. I have moved on to evaluating other models for now.
Don't use flash-attn 2.0 or above; install flash-attn==1.0.4 instead.
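I have not verified this workaround myself; for reference, a quick check of which flash-attn is actually installed before retrying (the suggested pin would come from `pip install flash-attn==1.0.4`):

```python
# Sketch: print the installed flash-attn version; the environment listed above
# has 2.3.3, which is what this comment advises downgrading from.
from importlib.metadata import version

print(version("flash-attn"))
```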