
Having trouble reproducing the results

Open Gloria-LIU opened this issue 7 months ago • 0 comments

Hi,

Thank you for the great work.

I followed the current framework to try to reproduce the results, but I cannot outperform the RNN/Transformer baselines. It seems that the KG used in the provided code is built from GPT-KG only and is not merged with UMLS-KG. Is that correct?

The steps I followed are:

**run  /graphcare_/graph_generation/graph_gen.ipynb
outputs: 
	/graphs/condition/CCSCM/{key}.txt
	/graphs/procedure/CCSPROC/{key}.txt
	/graphs/drug/ATC3/{key}.txt

**run /graphcare_/graph_generation/umls_emb_ret.py
outputs: 
	/data/pj20/exp_data/umls_ent_emb_.pkl

**run  /graphcare_/graph_generation/umls_sim_retriever.py
outputs: 
/data/pj20/exp_data/ccscm2umls.pkl
/data/pj20/exp_data/ccsproc2umls.pkl
/data/pj20/exp_data/atc32umls.pkl

**run /KG_sampling/umls_sampling.py
outputs: 
/graphs/ccscm_umls
/graphs/ccsproc_umls
/graphs/atc3_umls

**run  /graphcare_/graph_generation/{cond,proc,drug}_emb_ret.ipynb
outputs: 
/graphs/condition/CCSCM
/graphs/procedure/CCSPROC
/graphs/drug/ATC5
id2ent.json  ent2id.json  id2rel.json  rel2id.json
entity_embedding.pkl relation_embedding.pkl
→ required by data_prepare.py

**run /graphcare_/graph_generation/ehr_emb_ret.ipynb → get the clusters
inputs:
path_1 = "../../data/pj20/exp_data/ccscm_ccsproc"
path_1_ = "../../graphs/cond_proc/CCSCM_CCSPROC"
ent2id.json entity_embedding.pkl clusters_th015.json clusters_inv_th015.json
path_2 = "../../data/pj20/exp_data/ccscm_ccsproc_atc3"
path_2_ = "../../graphs/cond_proc_drug/CCSCM_CCSPROC_ATC3"
ent2id.json entity_embedding.pkl clusters_th015.json clusters_inv_th015.json
outputs: 
	path_1 path_2
ccscm_id2clus.json  ccsproc_id2clus.json  atc3_id2clus.json
→ required by graphcare.py
	Note: it seems that clusters_th015.json is not needed here. Is that correct?

**run data_prepare.py
outputs: 
sample_dataset_{dataset}_{task}_th015.pkl
graph_{dataset}_{task}_th015.pkl
→ required by graphcare.py as dataset

**run graphcare.py
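
To rule out a missing intermediate file as the cause, a check along these lines could be run before graphcare.py. This is only a sketch: the `missing_artifacts` helper is hypothetical (not part of the repository), the paths are the ones listed in the steps above, and it assumes they can be resolved relative to a single root directory.

```python
from pathlib import Path

# Output paths copied from the steps above, written relative to a root
# directory (assumption: the leading "/" in the list is relative to that root).
EXPECTED = [
    "data/pj20/exp_data/umls_ent_emb_.pkl",
    "data/pj20/exp_data/ccscm2umls.pkl",
    "data/pj20/exp_data/ccsproc2umls.pkl",
    "data/pj20/exp_data/atc32umls.pkl",
    "graphs/ccscm_umls",
    "graphs/ccsproc_umls",
    "graphs/atc3_umls",
    "graphs/condition/CCSCM",
    "graphs/procedure/CCSPROC",
]


def missing_artifacts(root, expected=EXPECTED):
    """Return the expected paths that do not exist under `root`."""
    root = Path(root)
    return [p for p in expected if not (root / p).exists()]


# Report anything that is missing under the current directory.
for path in missing_artifacts("."):
    print("missing:", path)
```

An empty report only shows that the files exist, not that their contents are what graphcare.py expects.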

Could you kindly guide me on how to merge the KGs and reproduce the results?
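
To make the question concrete, by "merging" I mean something like the union sketched below. This rests on assumptions: that each graph can be read as a list of (head, relation, tail) string triples, and that duplicates can be detected by case-insensitive comparison. The name `merge_triples` is hypothetical and may not match how the repository actually combines GPT-KG and UMLS-KG.

```python
def merge_triples(gpt_triples, umls_triples):
    """Order-preserving union of two triple lists.

    Triples are compared case-insensitively after stripping whitespace,
    and the first spelling seen is the one that is kept.
    """
    seen = set()
    merged = []
    for head, rel, tail in list(gpt_triples) + list(umls_triples):
        triple = (head.strip(), rel.strip(), tail.strip())
        key = tuple(part.lower() for part in triple)
        if key not in seen:
            seen.add(key)
            merged.append(triple)
    return merged


# Toy triples for illustration only, not taken from the real graphs.
gpt = [("sepsis", "may cause", "organ failure")]
umls = [("Sepsis", "may cause", "Organ failure"), ("sepsis", "treated by", "antibiotics")]
print(merge_triples(gpt, umls))
```

If the merge is instead expected to happen at the embedding or cluster level, a pointer to the relevant script would be very helpful.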

Thanks a lot!

Gloria-LIU, Jul 09 '24 16:07