KAZU
Scaling KAZU to run over millions of abstracts
Hi, thanks again for sharing KAZU. It looks like exactly the tool I need for setting up a simple biomedical NLP system.
I am planning to use KAZU to add ontology terms to the whole PubMed abstract dump and write the results to disk. I am a bit concerned that the approach shown in the tutorial, which processes a single document (the EGFR query), will not scale well. However, your Ray tutorial page is still marked TBA.
Do you have any general suggestions for how to scale it for my task? I don't need specific code examples, but if you already have something to share, even something "quick and dirty" would be welcome.