ETypeClus

Could you provide the code to evaluate the clustering results?

Open Lee-zix opened this issue 3 years ago • 2 comments

How can I reproduce the results reported in Table 4 of the paper? The command "CUDA_VISIBLE_DEVICES=0 python3 latent_space_clustering.py --dataset_path ./pandemic --input_emb_name po_tuple_features_all_svos.pk" only outputs a file recording the clustering results. Could you provide the code to evaluate those results?

Lee-zix, Nov 27 '21 12:11

Could you also comment on the fact that the same p_o pairs are (judging by the clustering result outputs) assigned to more than one topic? It's not immediately obvious whether there is a runtime option to control this behaviour. Is this something you looked at in your evaluations?

Thanks

antonyscerri, Feb 23 '22 11:02

In the file latent_space_clustering.py:

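    # Choose which items to list for cluster j.
    if args.sort_method == 'discriminative':
        # Keep only items whose predicted cluster is j, ranked by p[:, j].
        word_idx = torch.arange(embs.size(0))[pred_cluster == j]
        sorted_idx = torch.argsort(p[pred_cluster == j][:, j], descending=True)
        word_idx = word_idx[sorted_idx]
    else:
        # Take the 30 items most similar to topic j, whatever their predicted cluster.
        sim = torch.matmul(topic_cluster.topic_emb[j], z.t())
        _, word_idx = sim.topk(k=30, dim=-1)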

It seems that "sort_method" controls this behavior:

  1. When it is set to "discriminative", each P-O pair is listed only under the one cluster it is predicted to belong to.
  2. When it is set to "generative", a P-O pair may be listed under more than one cluster, because each cluster takes its top 30 pairs by similarity to the topic embedding, regardless of the predicted cluster.
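
To make the difference concrete, here is a minimal sketch of the two selection rules on toy data. It is not the repository's code: plain Python lists stand in for the torch tensors, the function names and the toy values are made up for illustration, and it assumes that pred_cluster is the argmax of p, which the snippet above does not show.

```python
# Toy illustration only; names and data are hypothetical, not from the repo.

def discriminative_members(p, j):
    """Items whose most likely cluster is j, sorted by p[i][j] descending.

    Assumes the predicted cluster is the argmax of each row of p.
    """
    pred = [max(range(len(row)), key=row.__getitem__) for row in p]
    members = [i for i, c in enumerate(pred) if c == j]
    return sorted(members, key=lambda i: p[i][j], reverse=True)


def generative_members(topic_emb, z, j, k):
    """Top-k items by dot product with topic j, ignoring predicted clusters."""
    sims = [sum(a * b for a, b in zip(topic_emb[j], item)) for item in z]
    return sorted(range(len(z)), key=lambda i: sims[i], reverse=True)[:k]


# Four items in a 2-d latent space, two topics.
topic_emb = [[1.0, 0.0], [0.0, 1.0]]
z = [[1.0, 0.0], [0.7, 0.7], [0.0, 1.0], [0.8, 0.2]]
# Soft cluster assignment probabilities for each item.
p = [[0.9, 0.1], [0.55, 0.45], [0.1, 0.9], [0.8, 0.2]]

disc = [discriminative_members(p, j) for j in range(2)]
gen = [generative_members(topic_emb, z, j, k=3) for j in range(2)]

disc_overlap = set(disc[0]) & set(disc[1])  # always empty: one cluster per item
gen_overlap = set(gen[0]) & set(gen[1])     # may be non-empty: top-k lists can share items

print("discriminative:", disc, "overlap:", disc_overlap)
print("generative:    ", gen, "overlap:", gen_overlap)
```

With the discriminative rule the per-cluster lists partition the items, so no pair can show up twice. With the top-k rule each cluster picks independently, so an item that lies between two topics can land in both lists.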

M010K, Apr 07 '22 13:04