the-fair-cookbook
New recipe: Mining of electronic health records (EHR) using natural language processing (NLP)
If you needed to add a task to the list below, please think about amending the issue template: https://github.com/FAIRplus/the-fair-cookbook/blob/migrating/.github/ISSUE_TEMPLATE/meta-checklist.md Great! Now, to the actual tasks:
- [x] identify author
- [x] write abstract
- [ ] agree with editors on abstract
- [ ] write recipe
- [ ] identify reviewer
- [ ] conduct review
- [ ] incorporate reviewer's comments
- [ ] publish recipe
Hi @YojanaGadiya, thanks for adding this issue -- I guess that you volunteered as an author for this recipe, didn't you? The next step would then be to write a short abstract and post it here. Thanks in advance! :)
@robertgiessmann We have a branch that @proccaserra and I are working on. Would you like us to link this issue directly to that branch for the abstract, or would you prefer that I post the abstract here?
hi @YojanaGadiya , it would be the nlp2rdf branch, then, wouldn't it?
Shall we treat this below as the abstract?
Despite the progress in organizing scientific knowledge in specialized databases, as evidenced by the ever-expanding number of resources available from FAIRsharing.org, unstructured text remains, in many areas, the form in which domain knowledge is held. Unstructured text refers simply to natural language text as found in journal articles, scientific reports or medical notes. The qualifier 'unstructured' refers to the fact that the information is available simply as a sequence of words, without any markup or annotation. While these documents are meaningful to humans, they present two challenges. For humans, reading and synthesizing the volume of text available is impossible. For computers and software agents, it was, until recently, extremely difficult to extract meaning from unstructured text. However, in recent years, the fields of computational linguistics and machine learning have seen breakthroughs in 'natural language processing', which enable researchers and scientists to tap into the knowledge mine that unstructured text constitutes. Tools such as BERT, short for Bidirectional Encoder Representations from Transformers, and the related models known collectively as Transformers, have transformed our ability to exploit unstructured text. The tasks of Named Entity Recognition (NER) and Relation Recognition, critical to accomplishing 'natural language understanding' (NLU), can now be done with higher confidence and, as evidenced by the increasing number of publications, are delivering advances in biomedicine and healthcare that benefit patients and the advancement of science.
Yes, it would be the nlp2rdf branch. I think we can treat the above as the abstract. @proccaserra do you agree with this?
@robertgiessmann @YojanaGadiya that's indeed the starting point. AZ, Novartis and other EFPIAs have pipelines in place (see links provided in the outline). The idea is to provide content detailing the key steps.
felt there's not much more I can do, so removed my assignment