cesi
Why is the cluster representative chosen using the ent2freq dict and NOT a sub2freq dict?
I noticed that at this line the subject embeddings and relation embeddings are passed for clustering, and the cluster representative is then found here using what is possibly the wrong dictionary, ent2freq. The subject embeddings dict contains 11878 subjects, whereas the ent2freq dict contains 23219 entities. The ent2freq dict maps from entity id, not subject id, to frequency, so looking up subject ids in it appears to mix two different id spaces. Could you please clarify this? I am happy to elaborate on my concern if needed.
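To illustrate the concern, here is a minimal sketch with made-up data. It is not the repository's code: `pick_representatives`, the toy `sub2freq` and `ent2freq` dicts, and all ids and counts are hypothetical, and I am assuming the representative is the most frequent member of each cluster.

```python
from collections import defaultdict

# Hypothetical toy data; ids and counts are invented for illustration only.
# Subject ids index the rows of the subject embedding matrix.
sub2freq = {0: 5, 1: 2, 2: 9}
# Entity ids cover a larger set (subjects and objects), so the same
# integer id may refer to a different phrase than it does as a subject id.
ent2freq = {0: 1, 1: 7, 2: 3, 3: 4, 4: 6}

# Cluster label assigned to each subject id (all three in one cluster here).
labels = [0, 0, 0]


def pick_representatives(labels, freq):
    """Return {cluster label: member id with the highest frequency}."""
    clusters = defaultdict(list)
    for item_id, label in enumerate(labels):
        clusters[label].append(item_id)
    return {
        label: max(members, key=lambda i: freq.get(i, 0))
        for label, members in clusters.items()
    }


rep_with_sub = pick_representatives(labels, sub2freq)
rep_with_ent = pick_representatives(labels, ent2freq)
print(rep_with_sub)  # {0: 2}
print(rep_with_ent)  # {0: 1}
```

With the same cluster assignment, the chosen representative changes depending on which frequency dict is indexed, which is the behaviour I would like to confirm is (or is not) intended.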