elk-generalization
Investigating the generalization behavior of LM probes trained to predict truth labels: (1) from one annotator to another, and (2) from easy questions to hard ones.
elk-generalization issues
Hugging Face evaluation does not allow the answer choices to be passed to the evaluation metric function, so we currently assert that the answer choices are the same for all examples.
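A minimal sketch of the kind of check described above, not the repository's actual code. It assumes each example is a dict with a hypothetical `"choices"` field holding its answer choices; the function name `shared_answer_choices` is also made up for illustration. It raises `ValueError` instead of using a bare `assert`, so the check still runs under `python -O`.

```python
from typing import Sequence


def shared_answer_choices(examples: Sequence[dict]) -> tuple[str, ...]:
    """Return the answer choices shared by every example, or raise if any differ.

    The "choices" key is an assumed field name, not one confirmed by the source.
    """
    if not examples:
        raise ValueError("need at least one example")

    # Take the first example's choices as the reference set.
    first = tuple(examples[0]["choices"])

    # Every other example must list the same choices in the same order.
    for i, example in enumerate(examples[1:], start=1):
        if tuple(example["choices"]) != first:
            raise ValueError(
                f"example {i} has choices {list(example['choices'])!r}, "
                f"expected {list(first)!r}"
            )
    return first
```

Once the check has passed, the returned tuple can be captured once (for example in a closure) and reused inside a metric function that only receives predictions and labels.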