Mustafa
I worked on the same scoring task but got poor zero-shot results on some benchmarks, in some cases worse than random chance. Here's the code I used: def calculate_perplexity(self,...
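
For context, here is a minimal sketch of how perplexity-based zero-shot scoring is commonly set up. It is not the truncated function above. The names perplexity and pick_choice are hypothetical, and it assumes you already have per-token natural-log probabilities for each candidate answer, computed over the answer tokens only.

```python
import math


def perplexity(token_logprobs):
    """Perplexity of a sequence from its per-token natural-log probabilities."""
    if not token_logprobs:
        raise ValueError("need at least one token log-probability")
    # exp of the negative mean log-probability
    return math.exp(-sum(token_logprobs) / len(token_logprobs))


def pick_choice(choice_logprobs):
    """Return the index of the candidate answer with the lowest perplexity.

    choice_logprobs: one list of token log-probabilities per candidate
    (assumed to cover only the answer tokens, not the shared prompt).
    """
    scores = [perplexity(lp) for lp in choice_logprobs]
    return min(range(len(scores)), key=scores.__getitem__)
```

Because perplexity averages over tokens, it is length-normalized; summing raw log-probabilities instead would tend to favor shorter answers. Which of the two matches a given benchmark's reference setup is something to check against that benchmark's documentation.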