memit
Discussion: About Knowledge Editing
Dear authors: I have read two of your papers on knowledge editing and benefited a lot from them. Sorry to bother you, but I have two questions I would like to confirm with you:
- In MEMIT's Equation 9, why is the objective split into two sums? It looks like they could be combined into a single sum over $[1, u]$. Does the specific split into $[1, n]$ and $[n+1, u]$ have any special meaning?
- It looks like ROME could also perform batch editing. The original paper computes a single pair $[k^*, v^*]$ and then updates $W_{proj}$. But what if I computed $[k^*_1, k^*_2, \ldots, k^*_n]$ together with $[v^*_1, v^*_2, \ldots, v^*_n]$ and then updated $W_{proj}$ once, as in the sketch below? Could ROME then perform batch editing as well?
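To make question 2 concrete, here is a rough numpy sketch of what I mean. All the names are my own, and the minimum-norm least-squares solve is just my illustration of the idea, not the actual update rule from either paper:

```python
import numpy as np

# Hypothetical batch edit: stack several (k*, v*) pairs and apply one
# rank-n update to W_proj, instead of ROME's single rank-one update.
rng = np.random.default_rng(0)
d_k, d_v, n_edits = 64, 32, 5

W_proj = rng.standard_normal((d_v, d_k))      # existing layer weights
K_star = rng.standard_normal((d_k, n_edits))  # columns k*_1 ... k*_n
V_star = rng.standard_normal((d_v, n_edits))  # columns v*_1 ... v*_n

# Minimum-norm dW satisfying (W_proj + dW) K* = V*: dW = R K*^+,
# where R = V* - W_proj K* is the residual of the desired associations.
R = V_star - W_proj @ K_star
dW = R @ np.linalg.pinv(K_star)

W_new = W_proj + dW
print(np.allclose(W_new @ K_star, V_star))    # True: all n edits hold at once
```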
Looking forward to your reply. Thank you!
[1] Locating and Editing Factual Associations in GPT
[2] Mass-Editing Memory in a Transformer
Just for discussion, @guanghuixu. For question 1:
In MEMIT, the authors say: "In each individual layer l, we wish to store a large batch of u ≫ 1 memories." Here $u$ denotes the new associations that we want the model to learn.
On the next line, the authors say this is "assuming that the layer contains previously-stored memories that should be preserved." I believe $n$ refers to the memorized associations that we want the model to preserve.
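For reference, here is how I read Equation 9 in MEMIT with the split written out (reconstructed from the paper's notation, so please double-check against the original):

$W_1 = argmin_{\hat{W}} \left( \sum_{i=1}^{n} \|\hat{W}k_i - m_i\|^2 + \sum_{i=n+1}^{n+u} \|\hat{W}k_i - m_i\|^2 \right)$

where the first sum runs over the $n$ preserved memories and the second over the $u$ newly inserted ones.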
I guess the authors wrote it this way to emphasize that a whole batch of associations is being updated at once. As we can see in their previous work ROME (Eq. 2), they use:
the $W$ that minimizes $\|WK - V\|^2_F$
which I believe has the same meaning as:
$W_1 = argmin_{\hat{W}}\ \sum_{i=1}^{n+u} \|\hat{W}k_i - m_i\|^2 \quad \text{(Equation 9 in MEMIT, with the two sums combined)}$
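As a quick sanity check of that claim, here is a tiny numpy example (toy numbers of my own, not from either paper) showing that the Frobenius-norm objective and the summed per-key objective are the same quantity:

```python
import numpy as np

# ||W K - M||_F^2 equals the sum over columns of ||W k_i - m_i||^2,
# so ROME's Eq. 2 and MEMIT's combined Eq. 9 optimize the same thing.
rng = np.random.default_rng(0)
W = rng.standard_normal((32, 64))
K = rng.standard_normal((64, 10))   # columns k_1 ... k_10
M = rng.standard_normal((32, 10))   # columns m_1 ... m_10

frob = np.linalg.norm(W @ K - M, ord="fro") ** 2
summed = sum(np.linalg.norm(W @ K[:, i] - M[:, i]) ** 2
             for i in range(K.shape[1]))
print(np.isclose(frob, summed))     # True
```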