aici
provide feedback on the probability mass dropped by the logit bias
pre = softmax(logits)
logits += bias
post = softmax(logits)
dropped = sum(max(0, pre[i] - post[i]) for i in range(len(post)))
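if dropped is close to 1, the bias is going against the model: almost all of the probability mass the model assigned has been taken away from the tokens it preferred; if it is close to 0, the bias barely changes the distribution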
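
A minimal runnable sketch of the computation above, in plain Python. The names `softmax` and `dropped_mass` are hypothetical helpers for illustration, not part of aici. It assumes `logits` and `bias` are equal-length lists of floats, and that at least one token keeps a finite biased logit (a bias of `-inf` on every token is not handled).

```python
import math


def softmax(logits):
    # Subtract the max for numerical stability; exp(-inf) evaluates to 0.0,
    # so tokens masked with a -inf bias get zero probability.
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dropped_mass(logits, bias):
    # Distribution the model proposed before the bias is applied.
    pre = softmax(logits)
    # Distribution actually sampled from after adding the bias.
    post = softmax([l + b for l, b in zip(logits, bias)])
    # Sum only the mass that tokens lost; gains are ignored.
    return sum(max(0.0, p - q) for p, q in zip(pre, post))


if __name__ == "__main__":
    logits = [10.0, 0.0, 0.0]
    # Forbid the token the model strongly prefers.
    against = dropped_mass(logits, [-math.inf, 0.0, 0.0])
    # Forbid a token the model already considered unlikely.
    harmless = dropped_mass(logits, [0.0, -math.inf, 0.0])
    print(f"against the model: {against:.4f}, harmless: {harmless:.6f}")
```

Because both distributions sum to 1, the mass lost by some tokens equals the mass gained by the others, so this value is the total variation distance between `pre` and `post` and always lies in [0, 1].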