
📖✍️ DAILY RECORD

Open EmanuelFaria opened this issue 5 years ago • 37 comments

A daily record of activities by each contributor

EmanuelFaria avatar Aug 28 '19 20:08 EmanuelFaria

20190828

Software

ami-search tested on commandline and will be deployed here on oil1000

Compounds

Finalised chemistry on E1.0 with @ambarishk. May need false positives removed.

corpus

[No action on oil1000]

collaborators

Welcomed @egonw and @larsgw

petermr avatar Aug 28 '19 20:08 petermr

Sir, volunteering at the event will be a happy moment for me. You may ask me to do anything at any time, as you find convenient.

ambarishK avatar Aug 29 '19 05:08 ambarishK

I just got approval to license terminology data from the U.S. National Library of Medicine (NLM).

Your UTS account gives you access to the following resources:

Unified Medical Language System (UMLS)

The UMLS integrates and distributes key terminology, classification and coding standards, and associated resources to promote creation of more effective and interoperable biomedical information systems and services, including electronic health records.

  • Download the latest UMLS release
  • Search and browse the UMLS Metathesaurus and UMLS Semantic Network
  • Use the UMLS API
  • Use natural language processing tools like MetaMap and SemRep that rely on the UMLS for identifying meaning in text
  • Visit the UMLS Homepage for access to all UMLS resources and documentation

Value Set Authority Center (VSAC)

The VSAC is a repository and authoring tool for public value sets created by external programs. Value sets are lists of codes and corresponding terms, from NLM-hosted standard clinical vocabularies (such as SNOMED CT, RxNorm, LOINC and others), that define clinical concepts to support effective and interoperable health information exchange.

  • Download eCQM, C-CDA and other value sets
  • Search and browse the Value Set Authority Center
  • Use the VSAC SVS and FHIR API Resources
  • Visit the VSAC Homepage for access to all VSAC resources and documentation

RxNorm

RxNorm provides normalized names for clinical drugs and links its names to many of the drug vocabularies commonly used in pharmacy management and drug interaction software, including those of First Databank, Micromedex, Gold Standard Drug Database, and Multum. By providing links between these vocabularies, RxNorm can mediate messages between systems not using the same software and vocabulary.

  • Download RxNorm weekly and monthly updates
  • Search and browse RxNorm
  • Use the RxNorm API
  • Visit the RxNorm Homepage for access to all RxNorm resources and documentation

SNOMED CT

U.S. Edition of SNOMED CT is one of a suite of designated standards for use in U.S. Federal Government systems for the electronic exchange of clinical health information and is also a required standard in interoperability specifications of the U.S. Healthcare Information Technology Standards Panel. The clinical terminology is owned and maintained by SNOMED International, a not-for-profit association.

  • Download the U.S. Edition of SNOMED CT, including the SNOMED CT to ICD-10-CM Map
  • Download the International Edition of SNOMED CT, Spanish Edition, and other International derivatives
  • Search and browse SNOMED CT
  • Visit the NLM SNOMED CT Homepage for access to SNOMED CT resources, documentation, and additional license information

NIH Common Data Elements (CDE) Repository

The NIH Common Data Elements (CDE) Repository has been designed to provide access to structured human and machine-readable definitions of data elements that have been recommended or required by NIH Institutes and Centers and other organizations for use in research and for other purposes. Your UTS license allows you to:

  • Create, edit, and comment on CDEs and forms
  • Save CDEs and forms to boards
  • Obtain additional privileges in an administrator role

EmanuelFaria avatar Aug 30 '19 14:08 EmanuelFaria

20190901

set up site

debugged ami-search so that it runs in oldstyle (per-project) mode

tested sections

I think they work reasonably well, but the output is ugly. It needs a display (probably HTML).

petermr avatar Sep 02 '19 10:09 petermr

20190902

added closed/open stats

Issue 13

petermr avatar Sep 02 '19 11:09 petermr

Sent Peter some DRAFT text to review for VC's info for journal article, as well as possible press release.

Added some more possible schema and scraping tool resources to that issue.

Completed a draft of the Activity Table and sent it to Peter for preliminary discussion.

EmanuelFaria avatar Sep 02 '19 20:09 EmanuelFaria

20190909

Created a tutorial for XML Summer School based on CEVOpen. Added it at https://github.com/petermr/CEVOpen/blob/master/docs/2019_raw_petermr.potx (needs downloading).

Gives an account of the technical steps in running download and search.

petermr avatar Sep 09 '19 16:09 petermr

20190911

Sir,

  • I have test-run ami-search (ami3) over the CProject oil186.

  • The narrative of the XML Summer School slides is very good and points out the need, importance and potential of TDM in the current research scenario.

How do I get word frequencies like those on slide number 10?

ambarishK avatar Sep 11 '19 07:09 ambarishK

On Wed, Sep 11, 2019 at 8:23 AM Ambarish Kumar wrote:

> I have test-run (ami3) ami-search over CProject - oil186.

Excellent. Can you also run

$ ami-section -p oil186 --sections ALL

This will extract the main sections from the papers. Then we will need a search engine for sections; I will write it.

> narrative of slides of XML summer school is really very good and points-out the need, importance and potential of TDM in current research scenario.
>
> How to get word frequencies as of over slide number 10?

I think it's slide 15 in my deck; you may have an early version. I think the word frequencies come out automatically for each CTree, in search/words, but not for the aggregated summaries underneath the CProject. This needs debugging and you will be able to help do that.

P.
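The aggregation step described above as missing (per-CTree word counts summed up to the CProject level) could be sketched as follows. This is only an illustration, not ami3's code: the input is a plain dictionary rather than ami3's output files, and the function name `aggregate_word_counts` and the sample counts are hypothetical.

```python
from collections import Counter


def aggregate_word_counts(per_ctree_counts):
    """Sum word counts over all CTrees into one project-level Counter.

    per_ctree_counts maps a CTree name to a {word: count} dict.
    (Hypothetical input shape; ami3's own output files are not read here.)
    """
    total = Counter()
    for counts in per_ctree_counts.values():
        total.update(counts)  # adds counts word by word
    return total


if __name__ == "__main__":
    # Made-up counts for two placeholder CTrees.
    sample = {
        "ctree_a": {"oil": 12, "essential": 9, "activity": 4},
        "ctree_b": {"oil": 7, "antioxidant": 5, "activity": 6},
    }
    for word, count in aggregate_word_counts(sample).most_common(3):
        print(word, count)
```

Using `Counter.update` keeps the per-CTree results untouched, so the same data can feed both the per-paper and the project-wide view.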

-- Peter Murray-Rust Founder ContentMine.org and Reader Emeritus in Molecular Informatics Dept. Of Chemistry, University of Cambridge, CB2 1EW, UK

petermr avatar Sep 11 '19 11:09 petermr

20190912

  • build ami3
$ mvn install -Dmaven.test.skip=true

[INFO] ------------------------------------------------------------------------
[INFO] BUILD SUCCESS
[INFO] ------------------------------------------------------------------------
[INFO] Total time:  01:10 min
[INFO] Finished at: 2019-09-12T11:34:29+05:30
[INFO] ------------------------------------------------------------------------

Locate target sub-directory within ami3.

[cbl@localhost ami3]$ ls
AMI-STEM.md  CONTRIBUTING.md  HELP.md     LICENSE  PROBLEMS.md  src
BUILDING.md  EXAMPLES.md      INSTALL.md  pom.xml  README.md    target
  • set environment path variable to access ami tools.
[cbl@localhost bin]$ pwd
/home/cbl/CEVOpen/ami3/target/appassembler/bin

[cbl@localhost bin]$ export PATH=$PATH:/home/cbl/CEVOpen/ami3/target/appassembler/bin

  • Running ami-section over CProject - oil186
[cbl@localhost CEVOpen]$ ami-section -p oil186 --sections ALL

Generic values (AMISectionTool)
================================
-v to see generic values
oldstyle            true

Specific values (AMISectionTool)
================================
sectionList             [ABBREVIATION, ABSTRACT, ACK_FUND, APPENDIX, ARTICLE_META, ARTICLE_TITLE, CONTRIB, AUTH_CONT, BACK, BODY, CASE, CONCL, COMP_INT, DISCUSS, FINANCIAL, FIG, FRONT, INTRO, JOURNAL_META, JOURNAL_TITLE, PUBLISHER_NAME, KEYWORD, METHODS, OTHER, PMCID, REF, RESULTS, SUPPL, TABLE, SUBTITLE, TITLE]
write                   true

AMISectionTool cTree: PMC5080681
AMISectionTool cTree: PMC5132230

  • Running ami-section over CProject - oil1000.
[cbl@localhost CEVOpen]$ ami-section -p oil1000 --sections ALL

Generic values (AMISectionTool)
================================
-v to see generic values
oldstyle            true

Specific values (AMISectionTool)
================================
sectionList             [ABBREVIATION, ABSTRACT, ACK_FUND, APPENDIX, ARTICLE_META, ARTICLE_TITLE, CONTRIB, AUTH_CONT, BACK, BODY, CASE, CONCL, COMP_INT, DISCUSS, FINANCIAL, FIG, FRONT, INTRO, JOURNAL_META, JOURNAL_TITLE, PUBLISHER_NAME, KEYWORD, METHODS, OTHER, PMCID, REF, RESULTS, SUPPL, TABLE, SUBTITLE, TITLE]
write                   true

AMISectionTool cTree: PMC5080681
AMISectionTool cTree: PMC5132230

All runs were on Linux (CentOS).
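For repeating the two runs above over several CProjects, a small wrapper might look like the sketch below. It assumes the ami3 `bin` directory is already on `PATH` as set up earlier; the helper names `ami_section_command` and `run_ami_section` are hypothetical, and only the `ami-section -p <project> --sections ALL` invocation is taken from the session above.

```python
import shutil
import subprocess


def ami_section_command(project, sections="ALL"):
    """Build the argument list for the ami-section call used in the session."""
    return ["ami-section", "-p", project, "--sections", sections]


def run_ami_section(projects):
    """Run ami-section for each CProject directory; stop on the first failure."""
    if shutil.which("ami-section") is None:
        raise FileNotFoundError(
            "ami-section not found on PATH; add ami3/target/appassembler/bin first"
        )
    for project in projects:
        # check=True raises CalledProcessError on a non-zero exit status.
        subprocess.run(ami_section_command(project), check=True)
```

Calling `run_ami_section(["oil186", "oil1000"])` from the directory that holds both CProjects would repeat the two runs shown above.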

ambarishK avatar Sep 12 '19 06:09 ambarishK

Is ami3 running satisfactorily? If so, can you give instructions on how to download the jar and run it? We have a workshop next Wednesday and we want delegates to be able to run it. Thank you.

petermr avatar Sep 12 '19 08:09 petermr

Yes sir. ami3 is running satisfactorily. Sure sir. That would be my pleasure.

ambarishK avatar Sep 12 '19 08:09 ambarishK

I will copy you in on a message to a colleague who is dockerising it.

petermr avatar Sep 12 '19 08:09 petermr

OK sir.

ambarishK avatar Sep 12 '19 10:09 ambarishK

@petermr, while I have never been a huge fan of Maven, having Java software on Maven Central (or a repository like that) is very useful: it archives the software, ensures it compiles, and makes its dependencies clear. Is that something for AMI?

egonw avatar Sep 12 '19 10:09 egonw

Yes, I need to set a version. Until about 2 years ago there were 8 different libraries in the stack. They were modular and separable, but versioning was a nightmare. Now that I have pulled them all together I think I should start versioning them in Maven Central. But as you know it takes time...

petermr avatar Sep 12 '19 10:09 petermr

Putting on my project manager's hat ⛑, I just set up a Kanban-style project card structure here. Please bookmark it and set it as your main page.

Also, please use the template below for each new Issue you open. Aim to complete it such that any reader can be clear about the issue's purpose and importance, and perhaps find ways we can assist you in it. Thanks!

The Big WHY?

We are building AMI so that: Type of User: ______________________ can: _____________ without: _________

Goals: Describe the Challenge, the solution we will bring, and the Desired End State by which all will know we have achieved excellence.

  • A.
  • B.
  • C.

Desired Results: A clear and concise description / outline of the final "state or vision" of the project — the evidence we will see when our goals are achieved.

  • A.
  • B.
  • C.

Guiding principles: What principles will guide our decisions as we do our part to fulfill the mission?

  • A.
  • B.
  • C.

Massive Action Steps: What massive actions will generate the Desired Results?

  • A.
  • B.
  • C.

Responsibilities and Roles: Who will have what completed when?

Interim Deliverable #1:

  • Milestone:
  • Milestone Date:
  • Single Individual Responsible for ensuring this Milestone is reached on time:

Interim Deliverable #2

  • Milestone:
  • Milestone Date:
  • Single Individual Responsible for ensuring this Milestone is reached on time:

Tips, Tools, Shortcuts and Resources: Anything done or used to make the desired outcome more likely to occur.

  • A.
  • B.
  • C.

Rules and Responsibilities for Achieving Excellence Always:

  • A.
  • B.
  • C.

Never:

  • A.
  • B.
  • C.

EmanuelFaria avatar Sep 13 '19 03:09 EmanuelFaria

No different from hangouts :-) A camera for me, a screenshare, a camera on the delegates. Maybe a chairperson.

On Fri, Sep 13, 2019 at 6:55 AM Ambarish Kumar wrote:

> Sir, how will you conduct the workshop with a broken leg?

petermr avatar Sep 13 '19 11:09 petermr

20190920

run time log (truncated)


..!0    [main] DEBUG org.contentmine.ami.lookups.WikipediaLookup  - URL java.io.IOException: Server returned HTTP response code: 400 for URL: https://www.wikidata.org/w/index.php?search=&search=ace-inhibitor	acaricide	aldose-reductase-inhibitor	-	antiacetylcholinesterase	–	antifeedant	antioxidant	insectifuge	irritant	perfumery	pesticide&title=Special:Search&go=Go
!486  [main] DEBUG org.contentmine.ami.lookups.WikipediaLookup  - URL java.io.IOException: Server returned HTTP response code: 400 for URL: https://www.wikidata.org/w/index.php?search=&search=ace-inhibitor	acaricide	aldose-reductase-inhibitor	-	antiacetylcholinesterase	�	antifeedant	antioxidant	insectifuge	irritant	perfumery	pesticide&title=Special:Search&go=Go
!!...!14677

While running the script that makes the dictionary, many search terms generated "HTTP response code: 400" errors for the lookup URL.

ambarishK avatar Sep 20 '19 10:09 ambarishK

We should only look up single-word terms. I will edit the dictionary and we'll rerun.
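The single-word rule can be sketched as a filter applied before any lookup is attempted. This is a minimal illustration and not the `WikipediaLookup` code in ami3: the function names are hypothetical, the base URL and query parameters are copied from the log above, and whether unencoded tabs are what triggered the 400 responses is an assumption.

```python
from urllib.parse import urlencode

# Base URL as it appears in the log above.
WIKIDATA_SEARCH = "https://www.wikidata.org/w/index.php"


def single_word_terms(terms):
    """Keep terms that consist of exactly one whitespace-free token."""
    kept = []
    for term in terms:
        words = term.split()  # splits on any whitespace, including tabs
        if len(words) == 1 and words[0] not in {"-", "–"}:
            kept.append(words[0])
    return kept


def search_url(term):
    """Build a percent-encoded search URL for one term."""
    query = urlencode({"search": term, "title": "Special:Search", "go": "Go"})
    return f"{WIKIDATA_SEARCH}?{query}"


if __name__ == "__main__":
    # The second entry mimics a whole tab-separated row sent as one term.
    terms = ["antioxidant", "ace-inhibitor\tacaricide\t-\tantifeedant", "-"]
    for term in single_word_terms(terms):
        print(search_url(term))
```

Building the query with `urlencode` also percent-encodes any remaining special characters, so a stray dash or colon cannot produce a malformed URL.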

petermr avatar Sep 21 '19 16:09 petermr

OK sir.

ambarishK avatar Sep 22 '19 04:09 ambarishK

20190923

Sir, please check the new normalised activity table.

I normalised it after putting one activity per row.

Total unique activities: 205.

Script to prepare dictionary

activity dictionary
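The "one activity per row" normalisation described above can be illustrated with a short sketch. It is not the linked script: the two-column (plant, activities) layout, the tab separator, the lower-casing step and both function names are assumptions made for the example.

```python
def one_activity_per_row(rows, separator="\t"):
    """Expand (plant, activities) pairs so each output row holds one activity."""
    expanded = []
    for plant, activities in rows:
        for activity in activities.split(separator):
            activity = activity.strip().lower()
            # Skip empty cells and placeholder dashes.
            if activity and activity not in {"-", "–"}:
                expanded.append((plant, activity))
    return expanded


def unique_activities(expanded):
    """Return the sorted set of distinct activities."""
    return sorted({activity for _, activity in expanded})


if __name__ == "__main__":
    # Hypothetical rows; the activity names are taken from the log above.
    rows = [
        ("plant A", "antioxidant\tpesticide"),
        ("plant B", "Antioxidant\t-\tirritant"),
    ]
    expanded = one_activity_per_row(rows)
    print(len(expanded), "rows,", len(unique_activities(expanded)), "unique activities")
```

Counting distinct values after the expansion is what gives a figure comparable to the total reported above.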

ambarishK avatar Sep 23 '19 15:09 ambarishK

20190930

Updated the sheet Activitytestforspecies.tsv for the first 50 articles of oil186.

Dictionary making - TargetOrganism.xml

LiteratureActivity.xml

ambarishK avatar Sep 30 '19 09:09 ambarishK

When browsing the content of these files, I ran into this line with what seems to me a typo: https://github.com/petermr/CEVOpen/blob/master/dictionary/TargetOrganism.xml#L22

egonw avatar Sep 30 '19 09:09 egonw

Yes sir. It is a typo: the term was misspelled as "eschrechia coli" (it should be Escherichia coli).

ambarishK avatar Sep 30 '19 09:09 ambarishK

Added "Manny's Activity Table RAW for Ambarish 2019-10-02.tsv" to CEVOpen/dictionary/activity/raw/. Ready for @petermr to review and deliberate before @ambarishK begins cross-referencing and normalization.

EmanuelFaria avatar Oct 03 '19 03:10 EmanuelFaria

Created new Issue #42: 📚 DICTIONARIES to consider creating/adding

EmanuelFaria avatar Oct 23 '19 20:10 EmanuelFaria

Finished organizing images of all Activity tables found in Oil186 into sub-categories by table type, activity, and measurements found in the individual tables. See issue 45 for details.

EmanuelFaria avatar Nov 18 '19 19:11 EmanuelFaria

Completed two tasks:

  1. Used GREP to find all articles in Oil186 that mention antimicrobial, antibacterial, and/or antifungal activities, with this pattern:

(?:(?:Minimum Inhibitory Concentration))|(?:(?:Minimum Bactericidal Concentration))|(?:(?:Minimum Fungicidal Concentration))|(?:(?:\bMIC\b))|(?:(?:\b(MIC)\b))|(?:(?:\bMBC\b))|(?:(?:\b(MBC)\b))|(?:(?:\bMFC\b))|(?:(?:\b(MFC)\b))

Used that list of article IDs to create a spreadsheet with the headings below and parsed the data so @petermr can see some of the "creative" ways the article authors displayed their data, and then find a way to normalize and extract that data:

  Table type | Table Image | paragraphs just before the table (with title, if any) | Table_Caption | Keywords_Phrases | Table_Footnote_KEY_Abbreviations | Measurements | Measurement Unit | Method | Plant Material | Targets | Non-Plant Control Substances, Solvents, Media, Substrate | Notes | Table Type | Col1 | Col2 | Col3 | Col4 | Col5 | Col6
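For reference, the GREP search in the first task can be sketched in Python with a more compact pattern. This is an assumption-laden rewrite and not the pattern that was actually run: as printed, `\b(MIC)\b` is a capture group that matches the same text as `\bMIC\b`, so the alternation below should match the same strings. The helper names and sample texts are hypothetical.

```python
import re

# Compact form of the pattern above: three full phrases or three abbreviations.
MIC_PATTERN = re.compile(
    r"Minimum (?:Inhibitory|Bactericidal|Fungicidal) Concentration"
    r"|\b(?:MIC|MBC|MFC)\b"
)


def mentions_mic(text):
    """Return True if the text contains any of the phrases or abbreviations."""
    return MIC_PATTERN.search(text) is not None


def articles_mentioning(texts):
    """Given {article_id: full_text}, return the IDs whose text matches."""
    return [article_id for article_id, text in texts.items() if mentions_mic(text)]


if __name__ == "__main__":
    # Hypothetical article IDs and snippets.
    sample = {
        "article_1": "The MIC was determined by broth microdilution.",
        "article_2": "MICROBIAL growth was not measured.",
    }
    print(articles_mentioning(sample))
```

Because parentheses are non-word characters, text such as "(MIC)" already satisfies `\bMIC\b`; the match is case-sensitive, as in the pattern above.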
                                         
  2. Created templates to assist @petermr in creating scripts to "machine-read" table captions and auto-detect keywords from them. (PDF and TSV files can be found here.)

E.g., "Antibacterial activity of Achillea millefolium L. EO against bacterial pathogens." becomes "[ACTIVITY(S)] activity of [PLANT(S)] [EXTRACT(S)] against [TARGET(S)]."
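One way to "machine-read" a caption that follows this template is a regular expression with named groups. The sketch below is only a starting point under stated assumptions: the function name `parse_caption` is hypothetical, and the extract vocabulary is limited to "EO" and "essential oil", which is a guess rather than the contents of the templates mentioned above.

```python
import re

# [ACTIVITY] activity of [PLANT] [EXTRACT] against [TARGET].
CAPTION_PATTERN = re.compile(
    r"^(?P<activity>.+?) activity of (?P<plant>.+) "
    r"(?P<extract>EO|essential oil) against (?P<target>.+?)\.?$",
    re.IGNORECASE,
)


def parse_caption(caption):
    """Return the template slots as a dict, or None if the caption does not fit."""
    match = CAPTION_PATTERN.match(caption.strip())
    return match.groupdict() if match else None


if __name__ == "__main__":
    caption = (
        "Antibacterial activity of Achillea millefolium L. EO "
        "against bacterial pathogens."
    )
    print(parse_caption(caption))
```

Captions that do not fit return `None`, which makes it easy to collect the "creative" cases for a further template.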

EmanuelFaria avatar Dec 13 '19 20:12 EmanuelFaria

Progress Update

I just committed the finished (I hope!) dictionary: PlantMaterialHistory.xml

  • All of the issues in my previous comment are now resolved, and I have confidence in the wikidataIDs, where assigned.
  • The number of entries increased from 82 to 96, owing to additional entries added for different drying and extraction methods that were absent in the earlier version.
  • I've updated PlantMaterialHistoryDictionaryDescription.md and INDEXofOIL186Dictionaries.md

EmanuelFaria avatar Feb 04 '20 21:02 EmanuelFaria