lm-evaluation-harness
Loglikelihood refactor
Not sure if utils.py is the best place for these functions. Open to starting a new utils file under models.
Hey @anjor, thanks for taking this on, and I'm sorry it took me so long to look at it!
I agree with Lintang that the current refactor is a bit more confusing in terms of adding abstraction / redirection for someone reading the code.
I think an alternative way to execute this refactor would be to do something analogous to the BaseLM class in the old v0.3.0: implementing the skeleton code / outer loops of the 3 different request type functions in a subclass of LM that is an intermediate between the fully-abstract LM base class and a fully-implemented specific LM subclass. Then, things like HFLM could have the potential to offload some of the boilerplate to this shared location and just keep their specific machinery.
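A minimal sketch of that layering, to make the idea concrete. Everything here is illustrative rather than the harness's actual code: `IntermediateLM`, `ToyLM`, `tok_encode`, and `_loglikelihood_tokens` are placeholder names assumed for this example, and only the `loglikelihood` request type is shown (the other two would follow the same pattern).

```python
from abc import ABC, abstractmethod
from typing import List, Tuple


class LM(ABC):
    """Fully abstract base class: declares the request-type interface only."""

    @abstractmethod
    def loglikelihood(
        self, requests: List[Tuple[str, str]]
    ) -> List[Tuple[float, bool]]:
        ...


class IntermediateLM(LM):
    """Hypothetical middle layer: owns the shared outer loop.

    Subclasses supply only the model-specific pieces (tokenization and
    scoring of already-tokenized requests).
    """

    @abstractmethod
    def tok_encode(self, text: str) -> List[int]:
        ...

    @abstractmethod
    def _loglikelihood_tokens(
        self, requests: List[Tuple[List[int], List[int]]]
    ) -> List[Tuple[float, bool]]:
        ...

    def loglikelihood(self, requests):
        # Shared boilerplate: tokenize each (context, continuation) pair,
        # then hand the batch to the model-specific scoring hook.
        encoded = [
            (self.tok_encode(context), self.tok_encode(continuation))
            for context, continuation in requests
        ]
        return self._loglikelihood_tokens(encoded)


class ToyLM(IntermediateLM):
    """Stand-in for a concrete model such as HFLM (dummy scoring only)."""

    def tok_encode(self, text):
        # Assumption for illustration: one "token" per character.
        return [ord(ch) for ch in text]

    def _loglikelihood_tokens(self, requests):
        # Dummy score: -1.0 per continuation token, never marked greedy.
        return [(-float(len(cont)), False) for _, cont in requests]
```

With this split, a concrete class only has to implement the two hooks; the request loop lives in one place and is inherited.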
All good, thanks for the comments.
Your suggestion sounds good, I will get that implemented. Any ideas on what to call the intermediate layer?
TemplateLM perhaps?
Sorry, I have been a bit occupied with some other stuff. Hoping to get to this soon.