edixiong
Results: 2 issues of edixiong
Hi author, I am having difficulty reproducing the results on CIFAR-10. The paper claims a test error of 2.46 ± 0.03 with 600 epochs, but when I evaluate with the provided 'DrNAS_cifar10'...
For SFTTrainer, if we load the dataset in a conversational format (ChatML), the function `apply_chat_template` is called (https://github.com/huggingface/trl/blob/v0.7.11/trl/extras/dataset_formatting.py#L55) with `tokenize=False`. Later in SFTTrainer, the data is tokenized again with...