jfy1016
The script I am running is the official example script:

import os
os.environ['CUDA_VISIBLE_DEVICES'] = '2'
from swift.llm import (
    get_model_tokenizer, load_dataset, get_template, EncodePreprocessor,
    get_model_arch, get_multimodal_target_regex, LazyLLMDataset
)
from swift.utils import get_logger, get_model_parameter_info, plot_images, seed_everything
from swift.tuners import Swift, LoraConfig...
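The code is unchanged. Training runs normally on the smaller dataset, but on the larger dataset it fails with an out of memory error. Apart from the number of samples, the two datasets are identical. Why?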
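One way to test the claim that the datasets differ only in sample count is to compare their largest records, since a single unusually long sample can raise peak memory even when the average sample looks the same. The following is a minimal sketch, not part of the original script. It assumes each dataset file is a single JSON array of records (the real layout of the files is not shown here), and `longest_records` is a hypothetical helper name. Serialized character length is only a rough proxy for token count and says nothing about image resolution.

```python
import json
import tempfile
from pathlib import Path


def longest_records(path, top_k=3):
    """Return (index, serialized_length) pairs for the top_k largest records.

    Assumption: the file at `path` holds one JSON array of records.
    """
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    # Character length of each record's JSON form, used as a rough size proxy.
    sizes = [
        (index, len(json.dumps(record, ensure_ascii=False)))
        for index, record in enumerate(records)
    ]
    return sorted(sizes, key=lambda pair: pair[1], reverse=True)[:top_k]


if __name__ == "__main__":
    # Self-contained demo on a throwaway file with made-up records.
    demo = [{"text": "short"}, {"text": "x" * 200}, {"text": "medium text"}]
    with tempfile.NamedTemporaryFile(
        "w", suffix=".json", delete=False, encoding="utf-8"
    ) as handle:
        json.dump(demo, handle)
        demo_path = handle.name
    for index, size in longest_records(demo_path):
        print(f"record {index}: {size} characters")
    Path(demo_path).unlink()
```

Running it on both dataset files and comparing the top entries would show whether the larger dataset contains outlier samples that the smaller one lacks.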
CUDA_VISIBLE_DEVICES=0,1,2 \
MAX_PIXELS=1003520 \
swift sft \
    --model /home/jdn/.cache/modelscope/hub/models/deepseek-ai/deepseek-vl2-tiny \
    --dataset /home/jdn/deepseek/save_json/xunlian_CT_and_Xray.json \
    --train_type lora \
    --torch_dtype float16 \
    --num_train_epochs 5 \
    --per_device_train_batch_size 1 \
    --per_device_eval_batch_size 1 \
    --learning_rate 1e-4...