Keeping language models honest by directly eliciting knowledge encoded in their activations.

Results: 24 elk issues

Given a set of reporters for each layer of a model and a fixed input, we can extract the model's "belief" at each layer and see how it evolves over...

enhancement
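
The idea in the issue above can be sketched roughly as follows, assuming each reporter is a simple linear probe (a weight vector plus a bias) applied to that layer's hidden state and squashed through a sigmoid. The names `layerwise_credences`, `hidden_states`, and `reporters` are hypothetical and are not taken from the elk codebase; the hidden states here are random stand-ins rather than real model activations.

```python
import math
import random


def sigmoid(x):
    """Logistic function mapping a real-valued logit to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def layerwise_credences(hidden_states, reporters):
    """Apply one linear reporter per layer to a single fixed input.

    hidden_states: list of per-layer activation vectors (lists of floats).
    reporters: list of (weights, bias) pairs, one per layer (assumed linear).
    Returns a list with one probability per layer.
    """
    credences = []
    for h, (w, b) in zip(hidden_states, reporters):
        logit = sum(wi * hi for wi, hi in zip(w, h)) + b
        credences.append(sigmoid(logit))
    return credences


# Toy data standing in for real activations and trained reporters.
rng = random.Random(0)
num_layers, hidden_dim = 4, 8
hidden_states = [[rng.gauss(0, 1) for _ in range(hidden_dim)] for _ in range(num_layers)]
reporters = [([rng.gauss(0, 1) for _ in range(hidden_dim)], 0.0) for _ in range(num_layers)]

trajectory = layerwise_credences(hidden_states, reporters)
for layer, credence in enumerate(trajectory):
    print(f"layer {layer}: credence = {credence:.3f}")
```

Reading the resulting list from the first layer to the last gives one view of how the extracted "belief" changes with depth for that input.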

This is probably useful for layer ensembling (#60). I'd also like to know how well the loss correlates with accuracy in general.

I noticed that the datasets supported in the code are all multiple-choice and classification types, such as IMDB, QNLI, and BoolQ. Can the code in this repository support free-form types...