rentrez

Question about entrez_link function

Open sckott opened this issue 10 years ago • 6 comments

hey @dwinter - tried to figure this out, but you can probably do so much faster.

Question from twitter https://twitter.com/neilfws/status/461109878262493184

entrez_link(dbfrom="pccompound", db="all", id="62857") gives no results from db "gds". But asking for db="gds" directly (entrez_link(dbfrom="pccompound", db="gds", id="62857")) gives lots of results from gds.

sckott avatar Apr 29 '14 16:04 sckott

This appears to be happening on the NCBI's end (the XML file for the first query doesn't contain any 'gds' IDs). I have just sent the following email to the EUtils group and will update here when I hear back.

Hello,

I am the maintainer of rentrez, an R library that interfaces with the EUtils API (https://github.com/ropensci/rentrez).

Following a question from a user, I have a question about the meaning of "all" as the destination database for ELink queries. As the user points out, a search for an ID from "pccompound" to "all" doesn't turn up any links to "gds".

(http://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pccompound&id=62857&db=all)

But a search on the same ID, with "gds" specified as the destination database, uncovers many linked IDs in that database.

(http://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi?dbfrom=pccompound&id=62857&db=gds)

Is this an expected behavior? If so, is there some way to tell which databases will be included when db is set to "all" (possibly this table? http://www.ncbi.nlm.nih.gov/entrez/query/static/entrezlinks.html)

I would love to include a note about this behavior in our documentation.

Thank you in advance for your help on this, David Winter

dwinter avatar Apr 29 '14 17:04 dwinter

thanks @dwinter !

sckott avatar Apr 29 '14 17:04 sckott

Thanks guys; yes, I noticed after posting that the raw EUtils URL returns the same result, so this is unexpected behaviour of db=all at the NCBI end. Hope we hear from them soon.
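
For anyone wanting to repeat that check outside rentrez, here is a minimal sketch in base R that rebuilds the two raw ELink URLs from the email above. `elink_url()` and `elink_base` are hypothetical names used only for illustration (they are not part of rentrez), and the helper does no URL encoding, so it assumes simple values like the ones in this thread.

```r
# Base URL copied from the email earlier in this thread.
# elink_url() is a hypothetical helper, not a rentrez function.
elink_base <- "http://eutils.ncbi.nlm.nih.gov/entrez/eutils/elink.fcgi"

elink_url <- function(dbfrom, id, db, base = elink_base) {
  # Same parameter order as the URLs quoted in the email: dbfrom, id, db.
  # No URL encoding is done, so this assumes plain alphanumeric values.
  sprintf("%s?dbfrom=%s&id=%s&db=%s", base, dbfrom, id, db)
}

# The two queries being compared: destination "all" versus destination "gds"
url_all <- elink_url("pccompound", "62857", "all")
url_gds <- elink_url("pccompound", "62857", "gds")

print(url_all)
print(url_gds)
```

Opening both URLs in a browser and searching the returned XML for "gds" is enough to see whether the difference comes from NCBI rather than from rentrez.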

neilfws avatar Apr 29 '14 21:04 neilfws

Just a small update to say I haven't heard from anyone at NCBI other than to say they had received my email. Will report back if I hear anything.

Going to remove the "bug" label because it's not a problem with rentrez, and the thought of having an open bug for this long is annoying to me :)

dwinter avatar May 14 '14 19:05 dwinter

FYI @neilfws

sckott avatar May 14 '14 19:05 sckott

Digging this one out of the time tunnel @sckott and @neilfws.

I never heard back from Entrez about this, but I've just added some "higher level" functions that at least make what's going on clearer.

entrez_db_links lists all the possible links for a given database (I guess this is what you get from "all"):

# install the development version from GitHub (requires the devtools package)
devtools::install_github("ropensci/rentrez")
library(rentrez)

(links <- entrez_db_links("pccompound"))

#Linked dbs result with the following fields:
# [1] "pccompound_biosystems"                             
# [2] "pccompound_gene"                                   
# [3] "pccompound_mesh"                                   
# [4] "pccompound_nuccore"                                
...

There is a little bit of information about each one of these links, but nothing very helpful:

links$pccompound_structure
#$Name
#[1] "pccompound_structure"
#
#$Menu
#[1] "Protein Structures"
#
#$Description
#[1] "Related Protein Structure"
#
#$DbTo
#[1] "structure"
#

It's still a mystery to me why you can get information by specifying a database that isn't listed as having linked information, but I guess this at least lets you know what to expect from "all"?
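
To check a particular destination database against that list, something like the sketch below might help. It assumes the object returned by entrez_db_links behaves like a named list whose elements each carry a DbTo field, as in the output above. `linked_targets()` and `has_target()` are hypothetical helpers, not rentrez functions, and `example_links` is a hand-built stand-in containing only the one entry printed above, so nothing here touches the network.

```r
# Hypothetical helpers (not part of rentrez). They assume each element of the
# links object is a list with a DbTo field, as in the printed output above.
linked_targets <- function(links) {
  # Collect the destination database of every link, dropping duplicates
  unique(vapply(unclass(links), function(x) x[["DbTo"]],
                character(1), USE.NAMES = FALSE))
}

has_target <- function(links, db) {
  # TRUE if at least one listed link points at the database `db`
  db %in% linked_targets(links)
}

# Hand-built stand-in mirroring the single entry shown above; with rentrez
# loaded you would pass entrez_db_links("pccompound") instead.
example_links <- list(
  pccompound_structure = list(
    Name        = "pccompound_structure",
    Menu        = "Protein Structures",
    Description = "Related Protein Structure",
    DbTo        = "structure"
  )
)

print(linked_targets(example_links))
print(has_target(example_links, "structure"))
print(has_target(example_links, "gds"))
```

With the real result, `has_target(links, "gds")` would show whether gds is among the listed destinations for pccompound.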

dwinter avatar Oct 04 '14 20:10 dwinter