CodeUltraFeedback
Evaluation for HumanEval
What is the procedure for inference and code extraction on HumanEval? My test score is only 20.73 on HumanEval (k=1, model = CodeLlama-7b-Instruct-hf + DPO).
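To make the question concrete, here is a minimal sketch of the kind of extraction step that might affect the score, assuming the model wraps its answer in a Markdown code fence. The helper name `extract_code` and its fallback behavior are hypothetical and not taken from the CodeUltraFeedback repository; the actual evaluation script may do something different.

```python
import re

# Hypothetical helper for illustration only; not the repository's method.
# Matches the first fenced block, with an optional language tag after the fence.
FENCE_RE = re.compile(r"```[ \t]*[A-Za-z0-9_+-]*[ \t]*\n(.*?)```", re.DOTALL)


def extract_code(response: str) -> str:
    """Return the body of the first fenced code block in a model response.

    If no fence is found, the response is returned unchanged, on the
    assumption that the model emitted raw code.
    """
    match = FENCE_RE.search(response)
    if match:
        # Drop trailing whitespace but keep leading indentation intact.
        return match.group(1).rstrip() + "\n"
    return response


if __name__ == "__main__":
    reply = "Here is the solution:\n```python\ndef add(a, b):\n    return a + b\n```\nHope it helps."
    print(extract_code(reply))
```

If the extraction used for evaluation keeps the surrounding prose or drops the function signature, the generated program may fail to run even when the solution itself is correct, which is why the exact procedure matters here.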