Pre-training problem
File "/data/FlagEmbedding/FlagEmbedding/bge_m3.py", line 6, in
You need to install datasets: pip install datasets
Hello, I have already solved that problem, but I am still running into errors during fine-tuning and pre-training.
Pre-training error:
Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/data/FlagEmbedding/FlagEmbedding/baai_general_embedding/retromae_pretrain/run.py", line 128, in
main()
File "/data/FlagEmbedding/FlagEmbedding/baai_general_embedding/retromae_pretrain/run.py", line 123, in main
trainer.train(resume_from_checkpoint=training_args.resume_from_checkpoint)
File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 1624, in train
return inner_training_loop(
File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 1653, in _inner_training_loop
train_dataloader = self.get_train_dataloader()
File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 852, in get_train_dataloader
return self.accelerator.prepare(DataLoader(train_dataset, **dataloader_params))
File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 243, in __init__
assert prefetch_factor > 0
TypeError: '>' not supported between instances of 'NoneType' and 'int'
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 2667) of binary: /usr/bin/python3.10
Fine-tuning error:
Traceback (most recent call last):
File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
exec(code, run_globals)
File "/data/FlagEmbedding/FlagEmbedding/baai_general_embedding/finetune/run.py", line 111, in
I suggest upgrading transformers and trying again: pip install -U transformers
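Before and after upgrading, it may help to record which package versions are actually installed in the environment that runs the training script. The pre-training traceback looks like a version mismatch: transformers appears to pass prefetch_factor=None to a torch DataLoader that asserts prefetch_factor > 0. That reading is an inference from the traceback, not something confirmed in this thread. The sketch below is only a diagnostic aid; the helper name installed_versions is hypothetical and is not part of FlagEmbedding.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(packages=("torch", "transformers", "datasets")):
    """Return {package name: installed version string, or None if missing}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            # The package is not installed in this interpreter's environment.
            found[name] = None
    return found


if __name__ == "__main__":
    # Print one line per package so the output can be pasted into the issue.
    for name, ver in installed_versions().items():
        print(f"{name}: {ver or 'not installed'}")
```

Run it with the same interpreter that launches training (here /usr/bin/python3.10), since a different interpreter may see a different set of packages.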