AlexJJJChen

10 comments by AlexJJJChen

> transformers' `neftune_noise_alpha` does not support multimodal models; use `--neftune_backend swift`.

It worked after I deleted `neftune_noise_alpha` from the .sh script.
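For readers hitting the same error, here is a sketch of what the suggested fix might look like from the Python API instead of a .sh script, assuming swift 2.x's `sft_main`/`SftArguments` entry point and that the `--neftune_backend` CLI flag maps to an `SftArguments` field of the same name. The `model_type` and `dataset` values are placeholders of mine, not from the thread:

```python
# Hedged sketch: swift 2.x fine-tuning entry point. neftune_backend is
# assumed to mirror the --neftune_backend CLI flag quoted above.
from swift.llm import sft_main, SftArguments

sft_main(SftArguments(
    model_type='minicpm-v-v2_5-chat',  # placeholder model for illustration
    dataset='my-multimodal-dataset',   # placeholder dataset name
    neftune_noise_alpha=5,             # NEFTune noise strength
    neftune_backend='swift',           # swift backend supports multimodal
))
```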

> Does `swift infer` work normally? You can refer to the single-sample inference code in the best practices and modify it.

`swift infer` does run, but how do I use my own dataset for testing with infer? Also, I found that after I change the model type, the code no longer runs:

```python
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0,1,2,3'
from swift.llm import (
    get_model_tokenizer, get_template, inference, ModelType,
    get_default_template_type, inference_stream
)
from swift.utils import seed_everything
import...
```
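For reference, the "single-sample inference" code the maintainer points to follows this shape in the swift 2.x best-practice docs. This is a sketch under that assumption; the `model_type` and `query` values are placeholders I chose for illustration, not from the thread:

```python
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '0'

from swift.llm import (
    get_model_tokenizer, get_template, inference, ModelType,
    get_default_template_type,
)
from swift.utils import seed_everything

# Placeholder model type for illustration; substitute the one being tested.
model_type = ModelType.qwen_7b_chat
template_type = get_default_template_type(model_type)
model, tokenizer = get_model_tokenizer(model_type, model_kwargs={'device_map': 'auto'})
template = get_template(template_type, tokenizer)
seed_everything(42)

# Looping this call over the rows of a custom dataset is one way to run
# tests on your own data, per the question above.
query = 'Hello, who are you?'
response, history = inference(model, template, query)
print(response)
```

Note that the template is derived from `model_type` via `get_default_template_type`, so swapping the model without updating the template is a common reason such scripts stop running.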

> Set batch_size to 1.

Setting it to 1 does not help either; GPU memory still blows up suddenly.

> Could you paste the stack trace? There should be no place in our code that explicitly loads the full model this way.

```
File "/home/jianc/miniconda3/envs/benchmark-llm/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
File "/home/jianc/miniconda3/envs/benchmark-llm/lib/python3.10/site-packages/peft/tuners/tuners_utils.py", line 161, in forward
    return self.model.forward(*args, **kwargs)
File "/home/jianc/.cache/modelscope/hub/_github/LLaVA.git/llava/model/language_model/llava_mistral.py", line 91, in forward
    return super().forward(
File "/home/jianc/miniconda3/envs/benchmark-llm/lib/python3.10/site-packages/transformers/models/mistral/modeling_mistral.py",...
```

> > > Set batch_size to 1.
> >
> > Setting it to 1 does not help either; GPU memory still blows up suddenly.
>
> You can try `export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512` to reduce memory fragmentation.

Still no luck; it runs out of GPU memory after one step.

```
Traceback (most recent call last):
  File "/home/jianc/miniconda3/envs/benchmark-llm/lib/python3.10/site-packages/swift/cli/sft.py", line 5, in <module>
    sft_main()
  File...
```
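For what it's worth, the suggested allocator setting can also be applied from Python, as long as it happens before torch allocates any CUDA memory; this is a generic PyTorch fragmentation knob rather than anything swift-specific:

```python
import os

# Cap the size of cached allocator blocks to reduce fragmentation.
# This must be set before torch allocates any CUDA memory.
os.environ['PYTORCH_CUDA_ALLOC_CONF'] = 'max_split_size_mb:512'

import torch  # noqa: E402 -- imported after the env var on purpose

print(torch.cuda.is_available())
```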

Version 2.0.5 cannot finish inference over all the data, while 2.0.4 can. Please fix this bug.

> Can you confirm it is version 2.0.5?

Yes, 2.0.5 is the one with the problem; I had to reinstall 2.0.4 to run through the full dataset. When running infer, my dataset has 50k rows, but Linux shows the val dataset as only 1.5k rows, and the progress bar only goes to 1.5k as well.

> 2.0.5 shouldn't have this problem; can you confirm it isn't 2.1.0.dev? 🤔

But `pip show` says 2.0.5. It is the latest release, though; the package was only updated in the last few days, and this bug appeared right after the update.

> I cannot reproduce it here.

My dataset has 50k rows, but here it shows 1k rows.
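One thing worth checking (my assumption, not confirmed anywhere in this thread) is whether the inference arguments are silently subsampling the evaluation set: swift 2.x exposes a `val_dataset_sample` option, and if its default changed between releases it would produce exactly this symptom. A sketch of forcing the full dataset, with a hypothetical checkpoint path:

```python
from swift.llm import infer_main, InferArguments

# val_dataset_sample is assumed to control how many evaluation rows are
# kept; -1 is assumed to mean "use the full dataset".
infer_main(InferArguments(
    ckpt_dir='output/checkpoint-xxx',  # hypothetical checkpoint path
    val_dataset_sample=-1,             # keep all 50k rows, not a sample
))
```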

```
[INFO:swift] model.max_model_len: 8192
[INFO:swift] model_config: MiniCPMVConfig {
  "_name_or_path": "/root/.cache/modelscope/hub/OpenBMB/MiniCPM-Llama3-V-2_5",
  "architectures": [
    "MiniCPMV"
  ],
  "attention_bias": false,
  "attention_dropout": 0.0,
  "auto_map": {
    "AutoConfig": "configuration_minicpm.MiniCPMVConfig",
    "AutoModel": "modeling_minicpmv.MiniCPMV",
    "AutoModelForCausalLM": "modeling_minicpmv.MiniCPMV"
  },
  "batch_vision_input": true,
  "bos_token_id": 128000,
  ...
```