
Data is not downloadable

Open asad1996172 opened this issue 4 years ago • 6 comments

I tried running the code given in the README but ran into an error related to the data folder. I wasn't able to download data.zip because the link takes me to a 404 page. Can you guide me?

Thanks.

asad1996172 avatar Jul 17 '20 18:07 asad1996172

Hi,

Sorry for the late reply... Upon closer reading of the terms, I've confirmed that I'm not allowed to share data derived from UMLS, unfortunately.

Still, as I pointed out in the README, you should be able to re-generate these contents using the create_umls_kb.py script.

As another alternative, you may be able to use/adapt scispacy's UMLS KB as a replacement.

In case you're only interested in the adaptations to the MedMentions dataset, I've uploaded that separately here: https://drive.google.com/file/d/1wJdW3Tcb6VZ0z-d8XQahk2Gm4Cj0BrRu/view?usp=sharing

Best

danlou avatar Jul 31 '20 10:07 danlou

Hello, I am facing a similar issue too. I tried to run the create_umls_kb.py script, but it gives the following error:

"""

Traceback (most recent call last):
  File "scripts/create_umls_kb.py", line 10, in <module>
    umls_tree = construct_umls_tree_from_tsv('data/umls_semantic_type_tree.tsv')  # change to your location
  File "/home/keshav/anaconda3/envs/medlinker/lib/python3.6/site-packages/scispacy/umls_semantic_type_tree.py", line 82, in construct_umls_tree_from_tsv
    for line in open(filepath, "r"):
FileNotFoundError: [Errno 2] No such file or directory: 'data/umls_semantic_type_tree.tsv'

"""

I tried to download the data from the Google Drive link and have requested access as well, but still no luck. Can you please let me know what I have to do? Thanks and regards

kbiyani33 avatar Aug 20 '20 11:08 kbiyani33

Sorry, didn't realize that had restricted permissions. I've now accepted your request and updated permissions.

Best

danlou avatar Aug 20 '20 13:08 danlou

In case you're having trouble accessing the umls_semantic_type_tree.tsv file from scispacy, you may also find that here: https://drive.google.com/file/d/1UGRWvynFmLb5gSF0kc16Bsh4DTCdVMJ2/view?usp=sharing
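A minimal sketch of how the downloaded file might be wired in, assuming it is saved at data/umls_semantic_type_tree.tsv (the path the script in the traceback above looks for). The helper names `check_tree_file` and `load_umls_tree` are hypothetical and not part of this repo; the scispacy import matches the module and function shown in the traceback and may differ in other scispacy versions.

```python
from pathlib import Path

# Path taken from the traceback above; change it to your own location.
TREE_TSV = Path("data/umls_semantic_type_tree.tsv")


def check_tree_file(path: Path = TREE_TSV) -> Path:
    """Return the path if the TSV exists, else raise with a hint (hypothetical helper)."""
    if not path.is_file():
        raise FileNotFoundError(
            f"{path} not found. Download umls_semantic_type_tree.tsv and place it "
            f"in {path.parent}/ relative to the directory you run the script from."
        )
    return path


def load_umls_tree(path: Path = TREE_TSV):
    """Build the semantic type tree with scispacy (hypothetical wrapper)."""
    tsv_path = check_tree_file(path)
    # Imported lazily so the file check works even without scispacy installed.
    from scispacy.umls_semantic_type_tree import construct_umls_tree_from_tsv

    return construct_umls_tree_from_tsv(str(tsv_path))
```

Because the path is relative, it is resolved against the current working directory, so the script should be run from the directory that contains the data folder (or the path edited accordingly).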

danlou avatar Aug 20 '20 13:08 danlou

Hello Danlou, I wanted to ask one more thing: how do we train from scratch?

kbiyani33 avatar Aug 24 '20 12:08 kbiyani33

The code available in this repo can help you train from scratch.

Check the 'create' methods in the 'matcher' scripts, as well as precompute_contextual.py for extracting embeddings from the NLMs.

danlou avatar Aug 27 '20 10:08 danlou