CEVOpen
📖✍️ DAILY RECORD
A daily record of activities by each contributor
20190828
Software
ami-search
tested on commandline and will be deployed here on oil1000
Compounds
finalised chemistry on E1.0 with @ambarishk. May need false positives removing.
corpus
No action on oil1000
collaborators
Welcomed @egonw and @larsgw
Sir, volunteering at the event will be a happy moment for me. You may ask me to do anything at any time, as you find convenient.
I just got approval to license terminology data from the U.S. National Library of Medicine (NLM).
Your UTS account gives you access to the following resources:
- Unified Medical Language System (UMLS)
- Value Set Authority Center (VSAC)
- RxNorm
- SNOMED CT
- NIH Common Data Elements (CDE) Repository
Unified Medical Language System (UMLS)
The UMLS integrates and distributes key terminology, classification and coding standards, and associated resources to promote creation of more effective and interoperable biomedical information systems and services, including electronic health records.
- Download the latest UMLS release
- Search and browse the UMLS Metathesaurus and UMLS Semantic Network
- Use the UMLS API
- Use natural language processing tools like MetaMap and SemRep that rely on the UMLS for identifying meaning in text
- Visit the UMLS Homepage for access to all UMLS resources and documentation
Value Set Authority Center (VSAC)
The VSAC is a repository and authoring tool for public value sets created by external programs. Value sets are lists of codes and corresponding terms, from NLM-hosted standard clinical vocabularies (such as SNOMED CT, RxNorm, LOINC and others), that define clinical concepts to support effective and interoperable health information exchange.
- Download eCQM, C-CDA and other value sets
- Search and browse the Value Set Authority Center
- Use the VSAC SVS and FHIR API Resources
- Visit the VSAC Homepage for access to all VSAC resources and documentation
RxNorm
RxNorm provides normalized names for clinical drugs and links its names to many of the drug vocabularies commonly used in pharmacy management and drug interaction software, including those of First Databank, Micromedex, Gold Standard Drug Database, and Multum. By providing links between these vocabularies, RxNorm can mediate messages between systems not using the same software and vocabulary.
- Download RxNorm weekly and monthly updates
- Search and browse RxNorm
- Use the RxNorm API
- Visit the RxNorm Homepage for access to all RxNorm resources and documentation
SNOMED CT
U.S. Edition of SNOMED CT is one of a suite of designated standards for use in U.S. Federal Government systems for the electronic exchange of clinical health information and is also a required standard in interoperability specifications of the U.S. Healthcare Information Technology Standards Panel. The clinical terminology is owned and maintained by SNOMED International, a not-for-profit association.
- Download the U.S. Edition of SNOMED CT, including the SNOMED CT to ICD-10-CM Map
- Download the International Edition of SNOMED CT, Spanish Edition, and other International derivatives
- Search and browse SNOMED CT
- Visit the NLM SNOMED CT Homepage for access to SNOMED CT resources, documentation, and additional license information
NIH Common Data Elements (CDE) Repository
The NIH Common Data Elements (CDE) Repository has been designed to provide access to structured human and machine-readable definitions of data elements that have been recommended or required by NIH Institutes and Centers and other organizations for use in research and for other purposes. Your UTS license allows you to:
- Create, edit, and comment on CDEs and forms
- Save CDEs and forms to boards
- Obtain additional privileges in an administrator role
20190901
## set up site
debugged ami-search
so in oldstyle (per-project) mode
tested sections
I think they work reasonably well, but the output is ugly. Needs a display (probably HTML).
20190902
added closed/open stats
Issue 13
Sent Peter some DRAFT text to review for VC's info for journal article, as well as possible press release.
Added some more possible schema and scraping tool resources to that issue.
Completed a draft of the Activity Table and sent it to Peter for preliminary discussion.
20190909
Created a tutorial for XML Summer School based on CEVOpen. Added it at https://github.com/petermr/CEVOpen/blob/master/docs/2019_raw_petermr.potx (needs downloading).
Gives an account of the technical steps in running download and search.
20190911
Sir,
- I have test-run (ami3) ami-search over CProject - oil186.
- The narrative of the XML Summer School slides is really very good and points out the need, importance and potential of TDM in the current research scenario.
How do I get the word frequencies shown on slide number 10?
Excellent. Can you also run ami-section -p oil186 --sections ALL? This will extract the main sections from the papers. Then we will need a search engine for sections - I will write it.
On the word frequencies: I think it's slide 15 in my deck - you may have an early one. I think they come out automatically for each CTree - in search/words, but not for the aggregated summaries underneath the CProject. This needs debugging and you will be able to help do that.
P.
-- Peter Murray-Rust Founder ContentMine.org and Reader Emeritus in Molecular Informatics Dept. Of Chemistry, University of Cambridge, CB2 1EW, UK
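For anyone following the word-frequency question: a minimal Python sketch of what aggregating per-CTree word counts up to the CProject level could look like. It is not ami3 code. The function name and the in-memory input shape (a mapping from CTree ID to its word counts) are assumptions made for illustration, and reading the actual search/words output is left out because its file format is not described in this thread.

```python
from collections import Counter


def aggregate_word_counts(per_ctree_counts):
    """Sum per-CTree word counts into one project-level Counter.

    per_ctree_counts: dict mapping a CTree ID to a dict of word -> count
    (hypothetical shape, chosen only for this sketch).
    """
    total = Counter()
    for counts in per_ctree_counts.values():
        # Counter.update adds counts rather than replacing them
        total.update(counts)
    return total
```

Calling `aggregate_word_counts(...).most_common(20)` would then list the most frequent words across the whole project.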
20190912
- build ami3
$ mvn install -Dmaven.test.skip=true
[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 01:10 min
[INFO] Finished at: 2019-09-12T11:34:29+05:30
[INFO] ------------------------------------------------------------------------
- Locate target sub-directory within ami3.
[cbl@localhost ami3]$ ls
AMI-STEM.md CONTRIBUTING.md HELP.md LICENSE PROBLEMS.md src
BUILDING.md EXAMPLES.md INSTALL.md pom.xml README.md target
- set environment path variable to access ami tools.
[cbl@localhost bin]$ pwd
/home/cbl/CEVOpen/ami3/target/appassembler/bin
[cbl@localhost bin]$ export PATH=$PATH:/home/cbl/CEVOpen/ami3/target/appassembler/bin
- Running ami-section over CProject - oil186
[cbl@localhost CEVOpen]$ ami-section -p oil186 --sections ALL
Generic values (AMISectionTool)
================================
-v to see generic values
oldstyle true
Specific values (AMISectionTool)
================================
sectionList [ABBREVIATION, ABSTRACT, ACK_FUND, APPENDIX, ARTICLE_META, ARTICLE_TITLE, CONTRIB, AUTH_CONT, BACK, BODY, CASE, CONCL, COMP_INT, DISCUSS, FINANCIAL, FIG, FRONT, INTRO, JOURNAL_META, JOURNAL_TITLE, PUBLISHER_NAME, KEYWORD, METHODS, OTHER, PMCID, REF, RESULTS, SUPPL, TABLE, SUBTITLE, TITLE]
write true
AMISectionTool cTree: PMC5080681
AMISectionTool cTree: PMC5132230
- Running ami-section over CProject - oil1000.
[cbl@localhost CEVOpen]$ ami-section -p oil1000 --sections ALL
Generic values (AMISectionTool)
================================
-v to see generic values
oldstyle true
Specific values (AMISectionTool)
================================
sectionList [ABBREVIATION, ABSTRACT, ACK_FUND, APPENDIX, ARTICLE_META, ARTICLE_TITLE, CONTRIB, AUTH_CONT, BACK, BODY, CASE, CONCL, COMP_INT, DISCUSS, FINANCIAL, FIG, FRONT, INTRO, JOURNAL_META, JOURNAL_TITLE, PUBLISHER_NAME, KEYWORD, METHODS, OTHER, PMCID, REF, RESULTS, SUPPL, TABLE, SUBTITLE, TITLE]
write true
AMISectionTool cTree: PMC5080681
AMISectionTool cTree: PMC5132230
All runs were on Linux (CentOS).
Is ami3 running satisfactorily? If so, can you give instructions on how to download the jar and run it? We have a workshop next Wednesday and we want delegates to be able to run it. Thank you.
Yes sir. ami3 is running satisfactorily. Sure sir. That would be my pleasure.
I will copy you in to a colleague who is dockerising it.
OK sir.
@petermr, while I have never been a huge fan of Maven, having Java software on Maven Central (or a repository like that) is very useful: it archives the software, ensures it compiles, and has clear dependencies. Is that something for AMI?
Yes, I need to set a version. Until about 2 years ago there were 8 different libraries in the stack. They were modular and separable. But versioning was a nightmare. Now that I have pulled them all together I think I should start versioning them in Maven Central. But as you know it takes time...
Putting on my project manager's hat ⛑, I just set up a Kanban-style project card structure here. Please bookmark it and set it as your main page.
Also, please use the template below for each new Issue you open. Aim to complete it such that any reader can be clear about the issue's purpose and importance, and perhaps find ways we can assist you in it. Thanks!
The Big WHY?
We are building AMI so that: Type of User: ______________________ can: _____________ without: _________
Goals: Describe the Challenge, the solution we will bring, and the Desired End State by which all will know we have achieved excellence.
- A.
- B.
- C.
Desired Results: A clear and concise description / outline of the final "state or vision" of the project — the evidence we will see when our goals are achieved.
- A.
- B.
- C.
Guiding principles: What principles will guide our decisions as we do our part to fulfill the mission?
- A.
- B.
- C.
Massive Action Steps: What massive actions will generate the Desired Results?
- A.
- B.
- C.
Responsibilities and Roles: Who will have what completed when?
Interim Deliverable #1:
- Milestone:
- Milestone Date:
- Single Individual Responsible for ensuring this Milestone is reached on time:
Interim Deliverable #2
- Milestone:
- Milestone Date:
- Single Individual Responsible for ensuring this Milestone is reached on time:
Tips, Tools, Shortcuts and Resources: Anything done or used to make the desired outcome more likely to occur.
- A.
- B.
- C.
Rules and Responsibilities for Achieving Excellence
Always:
- A.
- B.
- C.
Never:
- A.
- B.
- C.
No different from hangouts :-) A camera for me, a screenshare, a camera on the delegates. Maybe a chairperson.
On Fri, Sep 13, 2019 at 6:55 AM Ambarish Kumar wrote:
Sir, how will you conduct the workshop with a broken leg?
20190920
- make dictionary for reported biological activity into EssoilDB1.0
run time log (truncated)
..!0 [main] DEBUG org.contentmine.ami.lookups.WikipediaLookup - URL java.io.IOException: Server returned HTTP response code: 400 for URL: https://www.wikidata.org/w/index.php?search=&search=ace-inhibitor acaricide aldose-reductase-inhibitor - antiacetylcholinesterase – antifeedant antioxidant insectifuge irritant perfumery pesticide&title=Special:Search&go=Go
!486 [main] DEBUG org.contentmine.ami.lookups.WikipediaLookup - URL java.io.IOException: Server returned HTTP response code: 400 for URL: https://www.wikidata.org/w/index.php?search=&search=ace-inhibitor acaricide aldose-reductase-inhibitor - antiacetylcholinesterase – antifeedant antioxidant insectifuge irritant perfumery pesticide&title=Special:Search&go=Go
!!...!14677
While running the dictionary-making script, many search terms generated HTTP response code: 400 for URL.
We should only look up single-word terms. I will edit the dictionary and we'll rerun.
OK sir.
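A minimal Python sketch of the single-word filter suggested above, assuming the dictionary terms are available as a list of strings. The function names are hypothetical and not part of ami3; the base URL and query parameters are copied from the log. URL-encoding the query is an extra precaution here, not something confirmed in this thread as the cause of the 400 responses.

```python
from urllib.parse import urlencode

# Base URL and parameter names as they appear in the run-time log above
WIKIDATA_SEARCH = "https://www.wikidata.org/w/index.php"


def single_word_terms(raw_terms):
    """Keep only terms that are one whitespace-free token containing a letter or digit."""
    kept = []
    for term in raw_terms:
        term = term.strip()
        is_single_token = len(term.split()) == 1
        has_alphanumeric = any(ch.isalnum() for ch in term)
        if is_single_token and has_alphanumeric:
            kept.append(term)
    return kept


def search_url(term):
    """Build a search URL for one term, with the query string percent-encoded."""
    params = {"search": term, "title": "Special:Search", "go": "Go"}
    return WIKIDATA_SEARCH + "?" + urlencode(params)
```

With this filter, stray separators such as a lone "-" are dropped along with multi-word entries, so each lookup carries exactly one term.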
20190923
Sir, please check the new normalised activity table.
I normalised it after making one activity per row.
Total unique activities: 205.
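For readers who want to reproduce the "one activity per row" step, here is a hedged Python sketch. The column name `activity`, the `;` separator and the lower-casing are assumptions made for illustration; the actual table layout and normalisation rules are not given in this thread.

```python
def one_activity_per_row(rows, column="activity", sep=";"):
    """Split rows whose activity cell lists several activities into one row each.

    rows: list of dicts, e.g. as produced by csv.DictReader on a TSV file.
    The column name and separator are hypothetical defaults.
    """
    expanded = []
    for row in rows:
        for part in row[column].split(sep):
            name = part.strip().lower()
            if name:
                new_row = dict(row)  # copy so the input rows are not modified
                new_row[column] = name
                expanded.append(new_row)
    return expanded


def unique_activities(rows, column="activity"):
    """Return the sorted set of distinct activity names."""
    return sorted({row[column] for row in rows})
```

Counting `len(unique_activities(...))` on the expanded rows gives the kind of total reported above.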
20190930
Updated the sheet Activitytestforspecies.tsv for the first 50 articles of oil186.
Dictionary making - TargetOrganism.xml
When browsing the content of these files, I ran into this line with what seems to me to be a typo: https://github.com/petermr/CEVOpen/blob/master/dictionary/TargetOrganism.xml#L22
Yes sir. It is a typo: the term is misspelled as eschrechia coli.
Added "Manny's Activity Table RAW for Ambarish 2019-10-02.tsv" to CEVOpen/dictionary/activity/raw/. Ready for @petermr to review and deliberate before @ambarishK begins cross-referencing and normalization.
Created new Issue #42 📚DICTIONARIES to consider creating/adding
Finished organizing images of all Activity tables found in Oil186 into sub-categories by table type, activity, and measurements found in the individual tables. See issue 45 for details.
Completed two tasks:
- Used GREP to find all articles in Oil186 that mention antimicrobial, antibacterial, and/or antifungal activities, with this pattern: (?:(?:Minimum Inhibitory Concentration))|(?:(?:Minimum Bactericidal Concentration))|(?:(?:Minimum Fungicidal Concentration))|(?:(?:\bMIC\b))|(?:(?:\b(MIC)\b))|(?:(?:\bMBC\b))|(?:(?:\b(MBC)\b))|(?:(?:\bMFC\b))|(?:(?:\b(MFC)\b))
Used that list of article IDs to create a spreadsheet with the headings below, and parsed the data so @petermr can see some of the "creative" ways the article authors displayed their data, and then find a way to normalize and extract that data:
Table type | Table Image | paragraphs just before the table (with title, if any) | Table_Caption | Keywords_Phrases | Table_Footnote_KEY_Abbreviations | Measurements | Measurement Unit | Method | Plant Material | Targets | Non-Plant Control Substances, Solvents, Media, Substrate | Notes | Table Type | Col1 | Col2 | Col3 | Col4 | Col5 | Col6 | |
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
- Created templates to assist @petermr in creating scripts to "machine-read" table captions and auto-detect keywords from them. (PDF and TSV files can be found here.)
E.g. "Antibacterial activity of Achillea millefolium L. EO against bacterial pathogens." becomes "[ACTIVITY(S)] activity of [PLANT(S)] [EXTRACT(S)] against [TARGET(S)]."
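The two tasks above could be sketched in Python roughly as follows. This is an illustration under stated assumptions, not the scripts actually used. `MIC_PATTERN` is a condensed form that matches the same strings as the GREP pattern as it is written above (note that `\bMIC\b` already matches the abbreviation inside "(MIC)"). `CAPTION` covers only captions shaped like the example, and the extract alternatives `EO|essential oil`, the function names and the input shape are all hypothetical.

```python
import re

# Condensed equivalent of the GREP pattern above (case-sensitive, as written there)
MIC_PATTERN = re.compile(
    r"Minimum (?:Inhibitory|Bactericidal|Fungicidal) Concentration"
    r"|\b(?:MIC|MBC|MFC)\b"
)

# Template "[ACTIVITY] activity of [PLANT] [EXTRACT] against [TARGET]."
# The list of extract words is an assumption for this sketch.
CAPTION = re.compile(
    r"^(?P<activity>\w+) activity of (?P<plant>.+?) "
    r"(?P<extract>EO|essential oil) against (?P<target>.+?)\.?$"
)


def articles_mentioning(texts):
    """Return sorted IDs of articles whose text matches MIC_PATTERN.

    texts: dict mapping an article ID to its full text (hypothetical shape).
    """
    return sorted(aid for aid, text in texts.items() if MIC_PATTERN.search(text))


def caption_to_slots(caption):
    """Split a table caption into template slots, or return None if it does not fit."""
    match = CAPTION.match(caption.strip())
    return match.groupdict() if match else None
```

Captions that return `None` are the "creative" ones that would need their own template or manual review.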
Progress Update
I just committed the finished (I hope!) dictionary: PlantMaterialHistory.xml
- All of the issues in my previous comment are now resolved, and I have confidence in the wikidataIDs, where assigned.
- The number of entries increased from 82 to 96, owing to additional entries added for different drying and extraction methods that were absent in the earlier version.
- I've updated PlantMaterialHistoryDictionaryDescription.md and INDEXofOIL186Dictionaries.md