helo-word

Domain-specific corpus

Open • Mahasweta-usc opened this issue 5 years ago • 1 comment

Hi, could you explain a way to incorporate a domain-specific corpus when training the model? My work involves identifying n-grams prevalent in medical texts, such as "sudden infant death syndrome", which appears in only a handful of instances in the corpus files. Are there any scripts we can tweak to include additional files, and if so, how? Otherwise, can the current model perform well across domains?

Mahasweta-usc avatar Nov 03 '19 04:11 Mahasweta-usc

I am having the same problem. Did you solve the issue?

Zer0-dev115 avatar Mar 17 '20 09:03 Zer0-dev115