Scaling KAZU to run over millions of abstracts

Open kajocina opened this issue 5 months ago • 5 comments

Hi, thanks again for sharing KAZU, it looks like exactly the tool I need for setting up a simple biomed NLP system.

I am planning to use KAZU to add ontology terms to the whole PubMed abstract dump and write the results out to disk. I am a bit concerned that the single-document approach shown in the tutorial (the EGFR query) will not scale well. However, your Ray tutorial page is still marked TBA.

Do you have any general suggestions for how to scale it for my task? I don't need specific code examples, but if you already have something to share, even something "quick and dirty" would be welcome.

kajocina, Jan 30 '24 10:01