
TypeError: expected str, bytes or os.PathLike object, not NoneType

Open HansBug opened this issue 2 years ago • 6 comments

I tried to fine-tune using the commands provided in the README and encountered the aforementioned error. For specific details, please refer to my wandb log.

HansBug avatar Jun 28 '23 06:06 HansBug

Correct me if I am wrong but your training seems to have gone well. The problem should come after trainer.train(). Do you have a checkpoints folder with checkpoint-1000 in after your training?

ArmelRandy avatar Jun 28 '23 12:06 ArmelRandy

Correct me if I am wrong but your training seems to have gone well. The problem should come after trainer.train(). Do you have a checkpoints folder with checkpoint-1000 in after your training?

Yes, I think so

(base) root@vgpu-test-codellm-idle-20230614-serial-0:/mnt/nfs/zhangshaoang.p/starcoder# tree checkpoints
checkpoints
└── checkpoint-1000
    ├── adapter_config.json
    ├── adapter_model.bin
    ├── optimizer.pt
    ├── pytorch_model.bin
    ├── README.md
    ├── rng_state.pth
    ├── scheduler.pt
    ├── trainer_state.json
    └── training_args.bin

So what happened?

HansBug avatar Jun 29 '23 05:06 HansBug

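The issue comes from the callback that loads the best checkpoint at the end of training. The callback is registered, but load_best_model_at_end is set to False, so the trainer never records a best checkpoint and state.best_model_checkpoint stays None. I'll look into this.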
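A minimal sketch of how a None checkpoint path produces this exact TypeError, and of a guard that avoids it. It assumes the callback builds the adapter path with os.path.join; the state object, the helper best_adapter_path, and the file name default are hypothetical stand-ins for illustration, not the script's actual code.

```python
import os
from types import SimpleNamespace

# Hypothetical stand-in for the trainer state; only this one field matters here.
state = SimpleNamespace(best_model_checkpoint=None)


def best_adapter_path(state, filename="adapter_model.bin"):
    """Return the adapter path inside the best checkpoint,
    or None when no best checkpoint was recorded."""
    if state.best_model_checkpoint is None:
        return None
    return os.path.join(state.best_model_checkpoint, filename)


# Joining a path onto None raises the error reported in this issue.
try:
    os.path.join(state.best_model_checkpoint, "adapter_model.bin")
except TypeError as exc:
    error_message = str(exc)

# The guarded helper returns None instead of raising.
guarded_result = best_adapter_path(state)
```

With a guard like this, the callback could skip the reload step when nothing was recorded instead of crashing after training has already finished.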

ArmelRandy avatar Jun 29 '23 08:06 ArmelRandy

This may be helpful; put it in SavePeftModelCallback:

        if state.best_model_checkpoint is None:
            print(f"Setting best_model_checkpoint to {checkpoint_folder}")
            state.best_model_checkpoint = checkpoint_folder
        elif state.best_model_checkpoint.endswith(checkpoint_folder):
            print(f"Updating best_model_checkpoint to {checkpoint_folder}")
            state.best_model_checkpoint = checkpoint_folder
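The idea behind the snippet above can be sketched in isolation as follows. This is a simplified illustration, not the callback itself: the function remember_checkpoint, the SimpleNamespace state, and the "checkpoint-<step>" folder naming are assumptions chosen to mirror the directory listing earlier in this thread.

```python
import os
from types import SimpleNamespace


def remember_checkpoint(state, output_dir):
    """Record the checkpoint folder for the current step
    if no best checkpoint has been set yet."""
    # Assumed naming scheme: <output_dir>/checkpoint-<global_step>
    checkpoint_folder = os.path.join(output_dir, f"checkpoint-{state.global_step}")
    if state.best_model_checkpoint is None:
        print(f"Setting best_model_checkpoint to {checkpoint_folder}")
        state.best_model_checkpoint = checkpoint_folder
    return checkpoint_folder


# Hypothetical state after the first save at step 1000.
state = SimpleNamespace(global_step=1000, best_model_checkpoint=None)
saved_folder = remember_checkpoint(state, "checkpoints")
```

Note that a fallback like this only guarantees that some checkpoint path is available at the end of training. It does not pick the checkpoint with the best evaluation score.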

h-clickshift avatar Jul 04 '23 08:07 h-clickshift

I got a similar error: TypeError: expected str, bytes or os.PathLike object, not NoneType. It printed the following on the console:

Starting main loop
Training...
{'loss': 0.6581, 'learning_rate': 0.0001, 'epoch': 0.5}
{'eval_loss': 1.4904661178588867, 'eval_runtime': 7.503, 'eval_samples_per_second': 0.8, 'eval_steps_per_second': 0.8, 'epoch': 0.5}
{'loss': 0.068, 'learning_rate': 0.0, 'epoch': 1.0}
{'eval_loss': 1.9775662422180176, 'eval_runtime': 7.4805, 'eval_samples_per_second': 0.802, 'eval_steps_per_second': 0.802, 'epoch': 1.0}
{'train_runtime': 28299.0398, 'train_samples_per_second': 0.113, 'train_steps_per_second': 0.007, 'train_loss': 0.3630624198913574, 'epoch': 1.0}
Loading best peft model from None (score: None).

However, the script stopped after running into this error.

My checkpoints folder is also empty.

ruchaa0112 avatar Jul 07 '23 19:07 ruchaa0112

I think this might be the reason. "This may be helpful; put it in SavePeftModelCallback:

    if state.best_model_checkpoint is None:
        print(f"Setting best_model_checkpoint to {checkpoint_folder}")
        state.best_model_checkpoint = checkpoint_folder
    elif state.best_model_checkpoint.endswith(checkpoint_folder):
        print(f"Updating best_model_checkpoint to {checkpoint_folder}")
        state.best_model_checkpoint = checkpoint_folder

"

thanhnew2001 avatar Jul 24 '23 13:07 thanhnew2001