License issues

Open stanstrup opened this issue 8 years ago • 7 comments

From @stanstrup on October 19, 2017 9:1

  1. Which databases can I include data from?
  2. If there are ones I cannot include, they will need to be downloaded and the table generated by the user. Is there such a thing as an "in-package cache"?
  3. Which license can the package have if it includes db data?
  4. Is licensing a concern at all? As far as I know, data cannot be copyrighted, so is there any real issue?
  • MONA (lipidblast) is CC BY 4.0. So should be OK? http://mona.fiehnlab.ucdavis.edu/documentation/license
  • I cannot find what license LipidMaps has.
  • It seems HMDB requires explicit permission to include its data: http://www.hmdb.ca/downloads

The info I extract is: id, name, inchi, formula, and mass.

For the moment I force-removed the files until this is settled.

Copied from original issue: stanstrup/PeakABro#1

stanstrup avatar Oct 27 '17 12:10 stanstrup

From @egonw on October 19, 2017 13:31

I don't think LipidMaps is Open Data. Wikidata is, PubChem is.

stanstrup avatar Oct 27 '17 12:10 stanstrup

From @chasemc on October 19, 2017 22:45

"LMSD lipid structures are deposited into PubChem database (http://pubchem.ncbi.nlm.nih.gov/) periodically and a link to PubChem substance ID (SID) is also maintained within LMSD. Access to complete set of LMSD lipid structures in PubChem is available at www.ncbi.nlm.nih.gov/entrez/query.fcgi?CMD=search&DB=pcsubstance&term=LipidMAPS[sourcename])."

stanstrup avatar Oct 27 '17 12:10 stanstrup

@chasemc thanks! That is very useful info. So I might be able to get around that one by just including PubChem and keeping an indicator that an entry comes from LipidMaps, so that you can later filter for the LipidMaps compounds.

stanstrup avatar Oct 27 '17 12:10 stanstrup

@chasemc It seems the source is only recorded in the SID entries, not the CIDs. However, the LipidMaps IDs have been added as names, so it is possible to filter by those prefixes.
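
A minimal sketch of that prefix filter, in Python for illustration only. The ID pattern (`LM`, a two-letter category code, then 8 to 10 alphanumeric characters) is an assumption about how LipidMaps IDs look, and the records and the helper names `lipidmaps_ids` and `filter_lipidmaps` are hypothetical, not part of any package:

```python
import re

# Assumed shape of a LipidMaps ID: "LM", a two-letter category code,
# then 8-10 alphanumeric characters (e.g. "LMFA01010001").
LM_ID = re.compile(r"^LM[A-Z]{2}[0-9A-Z]{8,10}$")


def lipidmaps_ids(names):
    """Return the names that look like LipidMaps IDs."""
    return [n for n in names if LM_ID.match(n)]


def filter_lipidmaps(compounds):
    """Keep only compounds with at least one LipidMaps-style name."""
    return {cid: names for cid, names in compounds.items() if lipidmaps_ids(names)}


# Hypothetical records: compound key -> list of names attached to it.
compounds = {
    "CID_A": ["example lipid", "LMFA01010001"],
    "CID_B": ["caffeine"],
}

print(filter_lipidmaps(compounds))
```

A pattern this loose can match unrelated names that happen to start with `LM`, so it would be worth checking the hits against the SID source field where that is available.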

stanstrup avatar Oct 27 '17 12:10 stanstrup

From @egonw on October 22, 2017 9:20

@chasemc also note that PubChem is not formally Open Data: it mixes their own public domain data with copyrighted upstream material. Legally, this is quite hard to untangle.

Generally, just contact LipidMaps and ask if it is OK to index their structures in the table as you want to do, and if you are allowed to make that available under terms compatible with the license of the R package.

For LipidMaps, a subset of about 1400 lipids is available under CCZero from Wikidata: http://tinyurl.com/ycbm9gfq

stanstrup avatar Oct 27 '17 12:10 stanstrup

Thanks! I already contacted LipidMaps. Waiting for an answer.

stanstrup avatar Oct 27 '17 12:10 stanstrup

@egonw what do you mean by upstream material? All the calculated properties? So if I only use basic info such as name and inchi, it should be OK?

stanstrup avatar Oct 27 '17 12:10 stanstrup