📚 Documentation: MASTER INDEX of Dictionary Descriptions for Oil186 test batch

Open EmanuelFaria opened this issue 5 years ago • 13 comments

Here we describe the process of:

  1. creating a master INDEX (INDEXofOIL186Dictionaries.md) of [DictionaryName]DictionaryDescription.md documents, which will describe the contents of the individual dictionaries created to date for data collected for Oil186,
  2. creating individual "DictionaryDescription" documents for each dictionary — which will each have their own Github Issue number, to facilitate discussion and correction.

EmanuelFaria avatar Jan 24 '20 15:01 EmanuelFaria

I started the task of creating individual Dictionary Description Documentation (“DDD”) for each by the following steps:

  1. Since there were a lot of .tsv and .csv files in (A), I first created a new directory in https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw … called “DictionaryDuplicateTablesOrganized”

  2. Copied duplicates of them within the following sub-directories: https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw/DictionaryDuplicateTablesOrganized

  • ACTIVITY
  • CHEMICAL ANALYSIS CONSTITUENTS
  • CHEMICAL ANALYSIS METHODS
  • PLANT ORIGIN
  • TARGET SPECIES
  3. Examined the contents of each file in each (now sorted) directory and, hopefully, picked the right ones to begin drafting DDDs for each, along with an “AboutOIL186Dictionaries.md” master description document. All of these can be found here: https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw/DictionaryDescriptionsOIL186
  • AboutOIL186Dictionaries.md
  • ChemicalConstituentsDictionaryDescription.md
  • PlantOriginDescription.md
  • ExtractionAndChemicalAnalysisMethodsDictionaryDescription.md
  • TargetOrganismDictionaryDescription.md

I will provide further details as updates are made.

EmanuelFaria avatar Jan 24 '20 15:01 EmanuelFaria

Clarification requested:

@petermr Should I actually be writing up Dictionary Descriptions as requested for the items in here: A) https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw … or for the ones I just found here: B) https://github.com/petermr/CEVOpen/tree/master/dictionary ?

Direction requested:

Please have a look and provide feedback as to:

  • Have I chosen the correct tables to describe? If not, please point me to the right ones. (eg. the Chemical Constituents file I chose was the only one with wikidataIDs, but had very few entries).
  • What you want me to name the files. (Please check the names of the dictionaries themselves: the titles inside the dictionary documents, their names in the "AboutOIL186Dictionaries.md" file, and the .md file names themselves.)
  • Where you want them to be posted
  • Would you like any changes to the formatting?

Please note:

  • The source files for the descriptions are in the .md documents
  • I have pasted questions for you at the bottom of some of them.

Thank you.

EmanuelFaria avatar Jan 24 '20 15:01 EmanuelFaria

On Fri, Jan 24, 2020 at 3:43 PM Emanuel Faria [email protected] wrote:

I started the task of creating individual Dictionary Description Documentation (“DDD”) for each by the following steps:

Since there were a lot of .tsv and .csv files in (A), I first created a new directory in https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw … called “DictionaryDuplicateTablesOrganized”

Yes, there was no system in naming the files so there are almost certainly duplicates. Important to try to identify the latest one.
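As a rough illustration of the duplicate problem described here, the sketch below groups .tsv and .csv files by a hash of their contents. It only finds byte-for-byte identical copies; it does not decide which copy is the latest, which would still need the git history or a manual look. The function name `find_exact_duplicates` is made up for this example.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def find_exact_duplicates(root):
    """Group .tsv/.csv files under `root` that have identical bytes."""
    groups = defaultdict(list)
    for path in sorted(Path(root).rglob("*")):
        if path.is_file() and path.suffix.lower() in {".tsv", ".csv"}:
            # Files with the same SHA-256 digest have the same content.
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            groups[digest].append(path)
    # Keep only digests shared by two or more files.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Calling `find_exact_duplicates("articleAnalysis/oil186/raw")` on a local checkout would return one list of paths per set of identical files; near-duplicates with small edits would not be caught.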

Copied duplicates of them within the following sub-directories:

https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw/DictionaryDuplicateTablesOrganized

  • ACTIVITY
  • CHEMICAL ANALYSIS CONSTITUENTS
  • CHEMICAL ANALYSIS METHODS
  • PLANT ORIGIN
  • TARGET SPECIES

Looks appropriate.

  1. Examined the contents of each file in each (now sorted) directory and — hopefully — picked the right ones to begin drafting DDDs for each — along with an “AboutOIL186Dictionaries.md” master description document — all of which can be found here: https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw/DictionaryDescriptionsOIL186
  • AboutOIL186Dictionaries.md
  • ChemicalConstituentsDictionaryDescription.md
  • CountryDictionaryDescription.md
  • ExtractionAndChemicalAnalysisMethodsDictionaryDescription.md
  • TargetOrganismDictionaryDescription.md

I will provide further details as updates are made.

Thanks.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/petermr/CEVOpen/issues/74?email_source=notifications&email_token=AAFTCS24FR27KDDFPPX72ZDQ7MEANA5CNFSM4KLHT7W2YY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOEJ3GCQQ#issuecomment-578183490, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAFTCSYEYBJDJLLFLRXMQQTQ7MEANANCNFSM4KLHT7WQ .

-- Peter Murray-Rust Founder ContentMine.org and Reader Emeritus in Molecular Informatics Dept. Of Chemistry, University of Cambridge, CB2 1EW, UK

petermr avatar Jan 24 '20 16:01 petermr

On Fri, Jan 24, 2020 at 3:49 PM Emanuel Faria [email protected] wrote:

Clarification requested:

@petermr https://github.com/petermr Should I actually be writing up Dictionary Descriptions as requested for the items in here: A) https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw … or for the ones I just found here: B) https://github.com/petermr/CEVOpen/tree/master/dictionary ?

Note that "tree/master/" chunk is an artefact of Github and won't appear on your disk
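To make the point about "tree/master/" concrete, here is a small sketch that maps a GitHub browser URL to the relative path it would have in a local clone. It assumes the usual `owner/repo/tree|blob/branch/path` URL layout and a branch name without slashes; the function name is hypothetical.

```python
from urllib.parse import urlparse


def github_url_to_local_path(url):
    """Drop the owner and the 'tree/<branch>' or 'blob/<branch>' chunk."""
    parts = urlparse(url).path.strip("/").split("/")
    # Expected layout: owner, repo, "tree" or "blob", branch, path...
    if len(parts) >= 4 and parts[2] in ("tree", "blob"):
        return "/".join([parts[1]] + parts[4:])
    # No tree/blob chunk: just drop the owner.
    return "/".join(parts[1:])


print(github_url_to_local_path(
    "https://github.com/petermr/CEVOpen/tree/master/dictionary"))
# → CEVOpen/dictionary
```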

B) is the production version, but you should check if there is an obviously larger or newer/cleaner version in A);

A dictionary has a list of entries like:

... each entry MUST have a term and SHOULD have a wikidata ID. It MAY have a name (which is often the same as the term, but not always). Ideally they should all have IDs. The description is normally the Wikidata description
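The MUST/SHOULD/MAY rules above could be checked mechanically. The sketch below is only an illustration: the attribute names `term`, `name` and `wikidataID`, the `<dictionary>`/`<entry>` element names, and the sample entries (including the placeholder ID) are assumptions, since the example entries did not survive in this mail.

```python
import xml.etree.ElementTree as ET

# Made-up sample; "Q_PLACEHOLDER" is not a real Wikidata ID.
SAMPLE = """<dictionary title="example">
  <entry term="first term" name="First term" wikidataID="Q_PLACEHOLDER"/>
  <entry term="second term"/>
  <entry name="no term here"/>
</dictionary>"""


def check_entries(xml_text):
    """Return (errors, warnings) for the MUST and SHOULD rules."""
    root = ET.fromstring(xml_text)
    errors, warnings = [], []
    for i, entry in enumerate(root.iter("entry"), start=1):
        term = entry.get("term")
        if not term:
            # MUST have a term.
            errors.append(f"entry {i}: missing term")
            continue
        if not entry.get("wikidataID"):
            # SHOULD have a Wikidata ID.
            warnings.append(f"entry {i} ({term}): no Wikidata ID")
        # A name is optional (MAY), so nothing is reported for it.
    return errors, warnings


errors, warnings = check_entries(SAMPLE)
print(errors)    # ['entry 3: missing term']
print(warnings)  # ['entry 2 (second term): no Wikidata ID']
```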

Direction requested:

Please have a look and provide feedback as to:

  • Have I chosen the correct tables to describe? If not, please point me to the right ones. (eg. the Chemical Constituents file I chose was the only one with wikidataIDs, but had very few entries).

The dictionaries should end up in https://github.com/petermr/CEVOpen/[tree/master/]dictionary https://github.com/petermr/CEVOpen/tree/master/dictionary

  • What you want me to name the files

For the dictionary, use the title given in the file, e.g. CEVOpen/dictionary/targetOrganism/targetOrganism.xml (https://github.com/petermr/CEVOpen/tree/master/dictionary/targetOrganism) starts ...

The "targetOrganism" is the name of the file (+.xml) and also the title of the dictionary. If they are different the software will throw an error.
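A quick way to catch a title/filename mismatch before the software does might look like the sketch below. It assumes the dictionary title is stored in a `title` attribute on the root element, which is a guess based on the description above; the function name is hypothetical.

```python
import xml.etree.ElementTree as ET
from pathlib import Path


def title_matches_filename(xml_path):
    """True if the root element's title equals the file name minus .xml."""
    path = Path(xml_path)
    title = ET.parse(str(path)).getroot().get("title")
    return title == path.stem
```

For a file named targetOrganism.xml this returns True only when the root element carries `title="targetOrganism"`.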

converted from essoldb1.0

  • Where you want them to be posted
  • Would you like any changes to the formatting?

Please note:

  • The source files for the descriptions are in the .md documents

Please put the links in the issue so I can go straight there...

  • I have pasted questions for you at the bottom of some of them.

Thank you.


petermr avatar Jan 24 '20 16:01 petermr

Working on description document for compounds.xml (Draft of CompoundDictionaryDescription.md in the same folder now. It was made with the texts.app I told you about, @petermr ... look ok to you?).

What are the definitions for the following, please:

/desc /entry/@name /entry/@term

EmanuelFaria avatar Jan 24 '20 23:01 EmanuelFaria

Is there information missing from this mail?

On Fri, Jan 24, 2020 at 11:18 PM Emanuel Faria [email protected] wrote:

What are the definitions for the following, please:


petermr avatar Jan 25 '20 09:01 petermr

Whoops. Yes.... added links to files below.

Working on the description document for compounds.xml. The draft of CompoundDictionaryDescription.md is in the same folder now. (It was made with the free WYSIWYG markdown editor I told you about, @petermr ... look ok to you?).

I don't know how to distinguish/describe the definitions for the following column headings. Can you help with that?

/entry/@name /entry/@term

EmanuelFaria avatar Jan 25 '20 15:01 EmanuelFaria

The term is the precise string used to identify the concept. The name is a human-readable string describing the concept. They are often the same.
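One way to picture the distinction is a lookup from term to name that falls back to the term when no name is given. This is a sketch only; the attribute names `term` and `name` follow the XPaths quoted in the question, and the sample entries are invented for illustration.

```python
import xml.etree.ElementTree as ET

# Invented sample entries for illustration.
SAMPLE_ENTRIES = """<dictionary title="example">
  <entry term="e. coli" name="Escherichia coli"/>
  <entry term="thymol"/>
</dictionary>"""


def term_to_name(xml_text):
    """Map each term to its name, or to itself if no name is set."""
    mapping = {}
    for entry in ET.fromstring(xml_text).iter("entry"):
        term = entry.get("term")
        if term:
            mapping[term] = entry.get("name") or term
    return mapping


print(term_to_name(SAMPLE_ENTRIES))
# {'e. coli': 'Escherichia coli', 'thymol': 'thymol'}
```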

On Sat, 25 Jan 2020, 15:58 Emanuel Faria, [email protected] wrote:

Whoops. Yes.... added links to files below.

Working on description document for compounds.xml https://github.com/petermr/CEVOpen/blob/master/dictionary/compound/compound.xml For the draft of CompoundDictionaryDescription.md https://github.com/petermr/CEVOpen/blob/master/dictionary/compound/CompoundDictionaryDescription.md in the same folder now. (It was made with the free WYSIWYG markdown editor http://www.texts.io/ I told you about, @petermr https://github.com/petermr ... look ok to you?).

I don't know how to distinguish/describe the definitions for the following column headings. Can you help with that?

/entry/@name /entry/@term


petermr avatar Jan 25 '20 16:01 petermr

Thanks Peter. Compound Dictionary description is now ready for review. https://github.com/petermr/CEVOpen/blob/master/dictionary/compound/CompoundDictionaryDescription.md

Interestingly, it contains a table of contents at the top of the page, which I did not create. Does github do this by default, or was it the WYSIWYG editor I'm using?

EmanuelFaria avatar Jan 25 '20 17:01 EmanuelFaria

I've just posted draft DictionaryDescriptions for the dictionary .xml files I could find.

Location of Main Description of Descriptions .md

The main document that provides a description of all the DictionaryDescriptions is AboutOIL186Dictionaries.md. From here, you can click on the name of any of the sub-sub-headings that end with .md to get to the individual DictionaryDescription for that topic.

Location of Individual Descriptions files

Because there were two sources of .xml files to work with (either in CEVOpen/tree/master/dictionary or CEVOpen/tree/master/articleAnalysis/oil186/raw), I have stored the individual DictionaryDescription .md files accordingly in:

  • https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw/DictionaryDuplicateTablesOrganized or
  • https://github.com/petermr/CEVOpen/tree/master/dictionary

(Remember: I created the directory /DictionaryDuplicateTablesOrganized and copied the existing files in https://github.com/petermr/CEVOpen/tree/master/articleAnalysis/oil186/raw/ in order to better organize them for my work on creating these dictionaries.)

Heads up

Currently, there are notes at the bottom of each of the individual dictionaries (things to fix, clean up, consider, decide, etc.). I will now begin copying the contents of each of them, including their notes, into separate comment entries for discussion and instruction for correction.

EDIT: On second thought... I'll paste the contents of the master description of descriptions below, and begin new issues for the individual ones. It will be easier to manage the conversation about corrections that way.

Manny

EmanuelFaria avatar Jan 28 '20 02:01 EmanuelFaria

[Index of the OIL186 Dictionaries](https://github.com/petermr/CEVOpen/blob/master/articleAnalysis/oil186/raw/DictionaryDescriptionsOIL186/INDEXofOIL186Dictionaries.md)

This document contains information about the Manually Created Dictionaries for OIL186.

The purpose/function of Dictionaries:

  1. Identify objects/concepts (eg. “e.coli" is a concept.).

  2. Give each object clear lexical names by which they can be searched. (When an object goes by more than one name, those names are synonyms.)

  3. Give each object a link to wikidata (or other authorities) by which we can learn more about them.

PLEASE NOTE: Rather than in alphabetical order, the dictionaries are listed here in the logical progression from Plants -> Extracts -> Testing Methods and Instruments -> Results Analysis -> Activities -> Target Organisms the activities were tested upon -> Diseases related to those target organisms

 

Plants

Layman and Botanical Names / Species

 

OilPlantDictionaryDescription.md

  • Description: A dictionary of 1678 constituent chemical compounds extracted from Essential Oils mentioned in the 186 test articles downloaded from PubMed. Of the 1678 entries, ?????? had their names normalized and tagged with corresponding Wikidata IDs, the other 112 remain to be resolved.

  • Filename: OilPlant.xml

  • File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/plant/oilplant.xml

 

Plant Parts

The plant part or parts from which the mentioned oils are extracted

 

PlantPartsDictionaryDescription.md

 

Locations

The geographical origins of the harvested plant material

 

PlantOriginDescription.md

 

Plant Material History

 

ProcessDictionaryDescription.md

 

EO Extraction and Chemical Analysis Methods

Equipment, methods and materials used for EO extraction

ExtractionAndChemicalAnalysisMethodsDictionaryDescription.md

 

EO Analysis Instruments

A dictionary of [24] makes/models of Gas chromatography–mass spectrometry equipment used to identify different substances within a test sample — in this case, Essential Oils mentioned in the 186 test articles downloaded from PubMed.

 

InstrumentDictionaryDescription.md

 

EO Chemical Analysis Results - Constituents and Concentrations

Essential Oils (EOs) are concentrated hydrophobic liquids containing volatile chemical compounds extracted from plants. Essential oils are also known as volatile oils, ethereal oils, aetherolea, or simply as the oil of the plant from which they were extracted, such as oil of clove.

Qualitative (constituent compounds) and quantitative (%) analysis of the chemical composition of the tested Essential Oils (Extracts?), with each known compound linked to its IUPAC International Chemical Identifier (InChI).

 

CompoundDictionaryDescription.md

  • Description: A dictionary of 2114 constituent chemical compounds extracted from Essential Oils mentioned in the 186 test articles downloaded from PubMed. Of the 2114 entries, 1010 had their names normalized and tagged with corresponding Wikidata IDs, the other 1104 remain to be resolved.

  • Filename: compound.xml

  • File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/compound/compound.xml
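The resolved/unresolved figures in a description like this one (1010 + 1104 = 2114) could be recomputed from the XML rather than counted by hand. The sketch below assumes each entry is an `<entry>` element and that a resolved entry carries a non-empty `wikidataID` attribute; both are assumptions about the file layout, and the function name is hypothetical.

```python
import xml.etree.ElementTree as ET


def wikidata_coverage(xml_text):
    """Return (total, resolved, unresolved) entry counts."""
    entries = list(ET.fromstring(xml_text).iter("entry"))
    # An entry counts as resolved if it has a non-empty wikidataID.
    resolved = sum(1 for e in entries if e.get("wikidataID"))
    return len(entries), resolved, len(entries) - resolved
```

Reading compound.xml as text and passing it to `wikidata_coverage` would give the three numbers to paste into the Description line.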

 

 

EO Activities

Tested biochemical and/or biological activities, and where available, their measured results.

 

ActivityDictionaryDescription.md

 

Target Organisms

The organisms used as targets of experiments conducted to determine what effect(s) (Activities) tested EOs may have on them. They may occur as A) single-cells or colonies, such as bacteria, fungi, yeasts and molds, protozoa, algae, or viruses; B) insects such as mosquitos, flies, etc.; or, C) they may be helminths, such as Nematodes (roundworms), Cestodes (tapeworms), and Trematodes (flukes).

 

TargetOrganismDictionaryDescription.md

 

 

Diseases

Text for definitions goes here

This dictionary does not yet exist

EmanuelFaria avatar Jan 28 '20 02:01 EmanuelFaria

FYI: As I clean up each [dictionary].xml file and update their unique [DictionaryName]DictionaryDescription.md files, I have also updated the master INDEX of Oil186 Dictionary Descriptions here: (INDEXofOIL186Dictionaries.md)

EmanuelFaria avatar Feb 22 '20 15:02 EmanuelFaria

As of today, we have 11 finished dictionaries. They are:

  1. eoActivity
  2. eoAnalysisMethod
  3. eoCompound
  4. eoExtractionMethod
  5. eoPlant
  6. eoPlantMaterialHistory
  7. eoPlantPart
  8. eoTargetOrganism
  9. geoLocation
  10. humanDiseases
  11. pests

... as well as a master INDEX of their descriptions, pasted below:

Index Oil186 Dictionaries

This index contains information about the Manually Created Dictionaries for OIL186.

PLEASE NOTE: Rather than in alphabetical order, the dictionaries are listed here in their logical progression.

The purpose/function of Dictionaries:

  1. Identify “things” as objects or concepts (eg. “e.coli" is a concept.).

  2. Give each object clear lexical names by which they can be searched.
    (When an object goes by more than one name, those names are synonyms.)

  3. Give each object a link to wikidata (or other authorities) by which we can learn more about them.

 


EO Plant

eoPlant.md

 


EO Plant Part

eoPlantPart.md

 


Geo Location

geoLocation.md

  • Description: A dictionary of 9568 entries for geolocations including country, countryISOcode, city, latitude, longitude, postal code and time zone sourced from http://www.ip2location.com, along with augmenting data on Indian states and cities, created and maintained over the years, obtained from https://network.convergenceservices.in/forum/12-joomla-development/4305-mysql-tables-for-country-states-and-indian-states-cities.html.

  • License information: This site or product includes IP2Location LITE data available from http://www.ip2location.com

  • Filename: geoLocation.xml

  • File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/geoLocation/geoLocation.xml

 


EO Plant Material History

eoPlantMaterialHistory.md

  • Description: A dictionary of 81 entries relating to the plant material history leading up to the extraction of Essential Oils mentioned in selected literature chosen from the 186 test articles downloaded from PubMed. The entries include key words and phrases describing: growth conditions, plant life stages, plant material selection, post-harvest treatment methods, and extracted plant material products. Of the 82 entries, 58 were resolved to WikidataIDs.

  • Filename: eoPlantMaterialHistory.xml

  • File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/eoPlantMaterialHistory/eoPlantMaterialHistory.xml

 


EO Extraction Method

eoExtractionMethod.md

  • Description: A dictionary of 87 terms for Essential Oil extraction methods and apparatus.

  • Filename: eoExtractionMethod.xml

  • File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/eoExtractionMethod/eoExtractionMethod.xml

 


EO Analysis Method

Analytical chemistry studies and uses instruments and methods used to separate, identify, and quantify matter. In practice, separation, identification or quantification may constitute the entire analysis or be combined with another method. Separation isolates analytes. Qualitative analysis identifies analytes, while quantitative analysis determines the numerical amount or concentration.

Analytical chemistry consists of classical, wet chemical methods and modern, instrumental methods. Classical qualitative methods use separations such as precipitation, extraction, and distillation. Identification may be based on differences in color, odor, melting point, boiling point, radioactivity or reactivity. Classical quantitative analysis uses mass or volume changes to quantify amount. Instrumental methods may be used to separate samples using chromatography, electrophoresis or field flow fractionation. Then qualitative and quantitative analysis can be performed, often with the same instrument and may use light interaction, heat interaction, electric fields or magnetic fields. Often the same instrument can separate, identify and quantify an analyte.

(Source: https://en.wikipedia.org/wiki/Analytical_chemistry)

 

eoAnalysisMethod.md

 


EO Compound

Essential Oils (EOs) are concentrated hydrophobic liquids containing volatile chemical compounds extracted from plants. Essential oils are also known as volatile oils, ethereal oils, aetherolea, or simply as the oil of the plant from which they were extracted, such as oil of clove.

Qualitative (constituent compounds) and quantitative (%) analysis of the chemical composition of the tested Essential Oils (Extracts?), with each known compound linked to its IUPAC International Chemical Identifier (InChI).

 

eoCompound.md

  • Description: A dictionary of 2114 constituent chemical compounds extracted from Essential Oils converted from essoldb1.0 data. Of the 2114 entries, 1010 had their names normalized and tagged with corresponding Wikidata IDs, the other 1104 remain to be resolved as no Wikidata IDs currently exist for them.

  • Filename: eoCompound.xml

  • File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/eoCompound/eoCompound.xml

 


EO Activity

eoActivity.md

 


EO Target Organism

The organisms used as targets of experiments conducted to determine what effect(s) (Activities) tested EOs may have on them. They may occur as A) single-cells or colonies, such as bacteria, fungi, yeasts and molds, protozoa, algae, or viruses; B) insects such as mosquitos, flies, etc.; or, C) they may be helminths, such as Nematodes (roundworms), Cestodes (tapeworms), and Trematodes (flukes).

 

eoTargetOrganism.md

 


Human Diseases

humanDiseases.md

 


Pests

disease.md

  • Description: A dictionary of 1032 terms for two categories of insects: A) Insect vectors of human pathogens sourced from https://en.wikipedia.org/wiki/Category:Insect_vectors_of_human_pathogens, and B) Winged insects sourced from https://www.insectidentification.org/winged-insect-key.asp

  • Filename: pests.xml

  • File Location: https://github.com/petermr/CEVOpen/blob/master/dictionary/pests/pests.xml

EmanuelFaria avatar Mar 25 '20 20:03 EmanuelFaria