[Proposal] Include num_kv_heads in the Model Properties Table in the docs
Proposal
Include num_kv_heads in the Model Properties Table in the docs
Motivation
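A fair number of models now use multi-query attention (i.e. a single key head and a single value head shared by all query heads) or grouped-query attention (i.e. k key/value heads and q query heads, for 1 < k < q). Listing num_kv_heads in the table would let users see at a glance which variant a model uses.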
Should be a fairly easy change.
https://neelnanda-io.github.io/TransformerLens/generated/model_properties_table.html
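To make the distinction in the motivation concrete, here is a minimal sketch of how the variant could be read off from the two head counts. The function name `attention_variant` is hypothetical and not part of TransformerLens; it assumes a missing key/value head count means ordinary multi-head attention.

```python
from typing import Optional


def attention_variant(n_heads: int, n_key_value_heads: Optional[int]) -> str:
    """Classify attention by how many key/value heads serve the query heads.

    Hypothetical helper for illustration only.
    """
    # Assumption: no explicit key/value head count means one per query head.
    if n_key_value_heads is None or n_key_value_heads == n_heads:
        return "multi-head"
    # One key head and one value head shared by every query head.
    if n_key_value_heads == 1:
        return "multi-query"
    # 1 < k < q: each key/value head is shared by a group of query heads.
    return "grouped-query"
```

For example, `attention_variant(32, 8)` gives "grouped-query", which is the kind of information the extra table column would surface.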
@neelnanda-io I've made the change locally, but noticed that only 12 of the models would have n_key_value_heads. And when I add
if "num_key_value_heads" in cfg_json:
    cfg_dict["n_key_value_heads"] = cfg_json["num_key_value_heads"]
below this line, 17 of the models have n_key_value_heads, which makes me think that some configs are missing a value for n_key_value_heads that they should have. Do you think I should open a PR that only updates the docs (and either assume the configs are correct or fix them in a separate PR), or open one PR that does all of it?
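For anyone wanting to reproduce a count like this, here is a rough sketch of the idea. The config dicts and model names below are made up for illustration, and `copy_kv_heads` is a hypothetical wrapper around the two lines quoted above, not an existing TransformerLens function.

```python
def copy_kv_heads(cfg_json: dict, cfg_dict: dict) -> dict:
    """Copy the key/value head count across only if the source config has it."""
    if "num_key_value_heads" in cfg_json:
        cfg_dict["n_key_value_heads"] = cfg_json["num_key_value_heads"]
    return cfg_dict


# Made-up configs standing in for the real per-model JSON configs.
configs = {
    "model-a": {"num_attention_heads": 32, "num_key_value_heads": 8},
    "model-b": {"num_attention_heads": 12},
}

# Names of models whose converted config ends up with n_key_value_heads.
with_kv = [
    name
    for name, cfg_json in configs.items()
    if "n_key_value_heads" in copy_kv_heads(cfg_json, {})
]
```

Comparing `len(with_kv)` before and after a change like the one above is one way to double-check how many models are affected.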
Oops, I miscounted. 17 models would have n_key_value_heads without those additions. I'll just create a PR to update the docs.