Request for information about models and annotation formats

Open balachander1964 opened this issue 4 years ago • 4 comments

Hi, I would appreciate it if you could share the links to download the models, along with details of the data annotation format.

balachander1964 avatar Dec 25 '20 06:12 balachander1964

MedaCy reads files in the BRAT format.
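For readers unfamiliar with BRAT: in its standoff format, each text file is paired with an `.ann` file whose entity lines look like `T1<TAB>Label start end<TAB>covered text`, where the offsets are character positions in the text file. The following is a minimal sketch of reading those entity lines in plain Python. It is not medaCy's own loader; the `Entity` class, the `parse_brat_entities` helper, the sample note, and the labels `Drug`, `Dosage`, `Frequency`, and `Dosage-Drug` are all made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Entity:
    ann_id: str  # annotation ID, e.g. "T1"
    label: str   # entity type, e.g. "Drug" (hypothetical label)
    start: int   # character offset where the span begins
    end: int     # character offset just past the end of the span
    text: str    # the text covered by the span


def parse_brat_entities(ann_text: str) -> List[Entity]:
    """Collect text-bound (T) annotations from the contents of a BRAT .ann file.

    Relation (R), attribute (A), note (#) and other line types are skipped,
    as are discontinuous spans (offset pairs separated by ';').
    """
    entities = []
    for line in ann_text.splitlines():
        if not line.startswith("T"):
            continue
        ann_id, type_and_span, covered = line.split("\t", 2)
        if ";" in type_and_span:
            continue  # discontinuous span; out of scope for this sketch
        label, start, end = type_and_span.split()
        entities.append(Entity(ann_id, label, int(start), int(end), covered))
    return entities


# Hypothetical note and annotations, for illustration only.
note = "Aspirin 81 mg daily for chest pain."
ann = (
    "T1\tDrug 0 7\tAspirin\n"
    "T2\tDosage 8 13\t81 mg\n"
    "T3\tFrequency 14 19\tdaily\n"
    "R1\tDosage-Drug Arg1:T2 Arg2:T1\n"
)

entities = parse_brat_entities(ann)
for entity in entities:
    # Each offset pair should index back into the original note text.
    print(entity.ann_id, entity.label, repr(note[entity.start:entity.end]))
```

Relations between entities, such as a dosage belonging to a drug, are stored in the same `.ann` file as `R` lines that refer to the `T` IDs; the sketch above ignores them.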

There is currently only one model online, namely the clinical notes model. However, we do have other datasets available with different entity types. What types of entities do you want to be able to identify?

swfarnsworth avatar Dec 25 '20 18:12 swfarnsworth

Hi Steele,

Thank you for responding to my query. First, I wish you and your team a very happy and prosperous New Year. We are trying to extract different types of entities and the relationships among them. Those entities include:

- Drug, dosage, duration of therapy, frequency, and mode of administration
- Diseases, signs and symptoms, conditions, etc.
- Severity and grading information
- Affected part of the body
- Date and time information for when the patient first experienced the condition and when he or she was treated

It would be great if the model could help us extract even one or two items from this list. I look forward to your reply. Thank you.

Bala

balachander1964 avatar Dec 28 '20 04:12 balachander1964

Bala,

Thank you for the well wishes. I wish you and those close to you the same.

We have a dataset with most of the entity types you specified; however, I am not sure that we have one that can identify severity and affected areas. A model trained on it is given here, though its API was designed before we transitioned medaCy to a primarily command-line application. This file is the actual model; if you download it, you should be able to use it with the predict functionality of medaCy's command-line interface, with ClinicalPipeline as the pipeline option.

This particular model uses a conditional random field. We have had more success with BiLSTM and BERT models, though these have better performance when a GPU is available.

Please let us know if we can be of further assistance.

Steele

swfarnsworth avatar Dec 28 '20 05:12 swfarnsworth

Thank you, Steele.

balachander1964 avatar Dec 29 '20 03:12 balachander1964