
Pre-training issue

Open tmy123-tech opened this issue 1 year ago • 3 comments

File "/data/FlagEmbedding/FlagEmbedding/bge_m3.py", line 6, in <module>
    import datasets
ModuleNotFoundError: No module named 'datasets'

tmy123-tech · Feb 23 '24 05:02

You need to install datasets: pip install datasets
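After installing, you can confirm the package resolves in the same interpreter that runs FlagEmbedding (a minimal check; the printed version will vary with your environment):

```python
# Verify that `datasets` now imports cleanly.
import datasets

print(datasets.__version__)
```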

staoxiao · Feb 23 '24 09:02

> You need to install datasets: pip install datasets

Hello, I've solved that problem, but I still run into some issues during fine-tuning and pre-training.

Pre-training error:

Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/data/FlagEmbedding/FlagEmbedding/baai_general_embedding/retromae_pretrain/run.py", line 128, in <module>
    main()
  File "/data/FlagEmbedding/FlagEmbedding/baai_general_embedding/retromae_pretrain/run.py", line 123, in main
    trainer.train(resume_from_checkpoint=training_args.resume_from_checkpoint)
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 1624, in train
    return inner_training_loop(
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 1653, in _inner_training_loop
    train_dataloader = self.get_train_dataloader()
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 852, in get_train_dataloader
    return self.accelerator.prepare(DataLoader(train_dataset, **dataloader_params))
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 243, in __init__
    assert prefetch_factor > 0
TypeError: '>' not supported between instances of 'NoneType' and 'int'
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 2667) of binary: /usr/bin/python3.10

Fine-tuning error:

Traceback (most recent call last):
  File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib/python3.10/runpy.py", line 86, in _run_code
    exec(code, run_globals)
  File "/data/FlagEmbedding/FlagEmbedding/baai_general_embedding/finetune/run.py", line 111, in <module>
    main()
  File "/data/FlagEmbedding/FlagEmbedding/baai_general_embedding/finetune/run.py", line 102, in main
    trainer.train()
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 1624, in train
    return inner_training_loop(
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 1653, in _inner_training_loop
    train_dataloader = self.get_train_dataloader()
  File "/usr/local/lib/python3.10/dist-packages/transformers/trainer.py", line 852, in get_train_dataloader
    return self.accelerator.prepare(DataLoader(train_dataset, **dataloader_params))
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/data/dataloader.py", line 241, in __init__
    raise ValueError('prefetch_factor option could only be specified in multiprocessing.'
ValueError: prefetch_factor option could only be specified in multiprocessing.let num_workers > 0 to enable multiprocessing.

tmy123-tech · Feb 26 '24 01:02
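Both tracebacks fail in the same DataLoader constructor: the Trainer's dataloader_params carry a prefetch_factor of None, which torch only accepts alongside worker processes. A minimal standalone sketch of the constraint (a toy example of my own, not FlagEmbedding code; an explicit value is used instead of None so the error reproduces across torch versions):

```python
# Toy reproduction of the DataLoader constraint behind both tracebacks;
# the dataset here is a stand-in, not FlagEmbedding data.
from torch.utils.data import DataLoader

dataset = list(range(8))

# Passing a prefetch_factor without worker processes is rejected,
# mirroring the ValueError from the fine-tuning run.
try:
    DataLoader(dataset, num_workers=0, prefetch_factor=4)
except ValueError as err:
    print(err)

# Once num_workers > 0, the option is accepted.
loader = DataLoader(dataset, num_workers=2, prefetch_factor=4)
```

(The pre-training run instead tripped an assert comparing prefetch_factor > 0 against None, hence the TypeError; same root cause.)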

I'd suggest updating transformers and trying again: pip install -U transformers
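For context on why upgrading helps: the failure pattern suggests the installed transformers release forwards its dataloader_prefetch_factor default of None into DataLoader unconditionally, and later releases, in essence, drop the option when it was never set. A rough sketch of that guard (paraphrased from the observed behavior, not verbatim library code):

```python
# Sketch (not verbatim transformers code) of a guard that avoids the
# crash: only forward prefetch_factor when the user actually set it.
prefetch_factor = None  # default for dataloader_prefetch_factor

dataloader_params = {"batch_size": 32, "num_workers": 0}
if prefetch_factor is not None:
    dataloader_params["prefetch_factor"] = prefetch_factor

# DataLoader(train_dataset, **dataloader_params) then never sees a None
# prefetch_factor, so both the assert and the ValueError paths are avoided.
```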

staoxiao · Feb 26 '24 02:02