
"&" operator instead of "weighted product of the PDFs"

Open ZuoZicheng opened this issue 3 years ago • 4 comments

Hi! Thanks for sharing the source code. In the paper, you use a "weighted product of the PDFs" to implement the intersection operator. Why not use the "&" operator to implement the intersection instead?

ZuoZicheng avatar Mar 09 '21 03:03 ZuoZicheng

Thanks for the interest. Can you elaborate a bit more what you mean by "the & operator"?

hyren avatar Mar 09 '21 07:03 hyren

Maybe you misunderstood what I mean. For example:

```python
a = [1, 2, 3, 4, 1, 2]
b = [1, 2, 5, 9, 8, 4]
c = list(set(a) & set(b))
# c == [1, 2, 4]
```

ZuoZicheng avatar Mar 09 '21 07:03 ZuoZicheng

Each reasoning step corresponds to a fuzzy set of entities, and the number of entities/nodes in a KG can be enormous, so we really want to avoid finding all the entities and explicitly "recovering" the full set of entities at each reasoning step.

Instead, we aim to represent the outcome of each reasoning step with an embedding, e.g., a box or a Beta distribution. The benefits are twofold: first, the embedding is a nice representation of the fuzzy set, since after training the embedding at each step can lie fairly close in the latent space to the set of entities it represents; second, we only need to "decode" the actual set of entities at the very last step, when we find the answers to the query, without tracking the intermediate entities.

So, back to your question: since we do not actually decode the set of entities at each reasoning step, we cannot use the actual intersection operator. That is also why we design neural intersection operators that simulate the real one in the latent (box or Beta) space.
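To make the "weighted product of the PDFs" concrete, here is a minimal NumPy sketch of a Beta-embedding intersection. It uses the fact that a normalized weighted product of Beta PDFs is again a Beta PDF whose parameters are the weighted averages of the inputs. Note this is an illustration only: the function name `beta_intersection` and the fixed attention logits are placeholders (in the actual model the weights come from a learned attention network), not the repository's API.

```python
import numpy as np

def beta_intersection(alphas, betas, logits):
    """Sketch of a neural intersection over Beta embeddings.

    Each input fuzzy set is a per-dimension Beta(alpha, beta)
    distribution. Taking the product of the PDFs raised to weights
    w_i (softmax of the logits, so they sum to 1) yields another
    Beta distribution with parameters
        alpha* = sum_i w_i * alpha_i,  beta* = sum_i w_i * beta_i.
    """
    weights = np.exp(logits - logits.max())
    weights /= weights.sum()                        # attention weights, sum to 1
    alpha = (weights[:, None] * alphas).sum(axis=0)  # weighted average of alphas
    beta = (weights[:, None] * betas).sum(axis=0)    # weighted average of betas
    return alpha, beta

# Two input sets, each a 3-dimensional Beta embedding (hypothetical values).
alphas = np.array([[2.0, 1.0, 3.0],
                   [4.0, 1.0, 1.0]])
betas = np.array([[1.0, 2.0, 1.0],
                  [1.0, 3.0, 2.0]])

# Equal logits -> equal weights -> parameters are simple averages.
a, b = beta_intersection(alphas, betas, np.array([0.0, 0.0]))
# a == [3.0, 1.0, 2.0], b == [1.0, 2.5, 1.5]
```

The key point is that the operator stays entirely in the latent space: it never enumerates entities, yet (after training) the resulting Beta embedding approximates the true set intersection.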

hyren avatar Mar 09 '21 07:03 hyren

Thank you for your detailed answer; it helped me understand the advantages of the algorithm. Another question: what factors limit the accuracy of the algorithm, besides a random KG or a KG manipulated by adversarial, malicious attacks?

ZuoZicheng avatar Mar 10 '21 01:03 ZuoZicheng