add pub fetcher as post-processing utility

Open nlwashington opened this issue 10 years ago • 1 comments

To display the publications referenced by any source with readable labels, it would be prudent to fetch the publication details from PubMed when they are available. This could be done as a post-processing step like the following:

  1. After the entire graph is built for a source, run a query to get all nodes that are publications.
  2. In batches, query eutils for the basic publication information for any PMIDs. Add the authors, title, short citation, year, and other publication metadata. Consider adding the abstract too, if we want to show it on hover.
  3. Either insert the publication metadata back into the graph, or dump the publication metadata separately into a new graph file.
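
A minimal sketch of step 2, assuming publication nodes carry `PMID:`-prefixed identifiers and that the eutils ESummary endpoint is queried with `retmode=json`. The function names, the batch size of 200, and the choice of output fields are illustrative assumptions, not existing dipper code. The response field names (`uids`, `title`, `authors`, `source`, `pubdate`) reflect my understanding of the ESummary JSON layout and should be checked against a live response.

```python
import json
import urllib.parse
import urllib.request

# NCBI eutils ESummary endpoint.
ESUMMARY_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esummary.fcgi"


def batches(items, size=200):
    """Yield consecutive slices of at most `size` items (size is an assumption)."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def summary_url(pmids):
    """Build one ESummary request URL for a batch of bare PMIDs."""
    query = urllib.parse.urlencode(
        {"db": "pubmed", "id": ",".join(pmids), "retmode": "json"}
    )
    return f"{ESUMMARY_URL}?{query}"


def parse_summaries(payload):
    """Map an ESummary JSON payload to {"PMID:<id>": metadata} records."""
    result = payload.get("result", {})
    records = {}
    for uid in result.get("uids", []):
        doc = result.get(uid, {})
        records["PMID:" + uid] = {
            "title": doc.get("title"),
            "authors": [a.get("name") for a in doc.get("authors", [])],
            "journal": doc.get("source"),
            # pubdate is assumed to start with a four-digit year.
            "year": (doc.get("pubdate") or "")[:4] or None,
        }
    return records


def fetch_publications(curies):
    """Fetch metadata for every PMID curie in `curies`, one request per batch."""
    pmids = sorted({c.split(":", 1)[1] for c in curies if c.startswith("PMID:")})
    records = {}
    for batch in batches(pmids):
        with urllib.request.urlopen(summary_url(batch)) as response:
            records.update(parse_summaries(json.load(response)))
    return records
```

Only `fetch_publications` touches the network; the parsing and batching helpers are pure, so they can be tested against a saved response. Abstracts are not part of this sketch and would need a separate request.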

Also, this could be made optional and enabled by a command-line flag, because the time spent querying eutils might be quite high, depending on the source.
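
One way the opt-in could look, as a sketch using `argparse`. The flag name `--fetch-pubs` and the `build_parser` helper are hypothetical and do not correspond to an existing dipper option.

```python
import argparse


def build_parser():
    """Build a parser with a hypothetical opt-in flag for the PubMed lookup."""
    parser = argparse.ArgumentParser(description="Process a source")
    parser.add_argument(
        "--fetch-pubs",  # hypothetical flag name
        action="store_true",
        help="after building the graph, fetch publication metadata from eutils",
    )
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    if args.fetch_pubs:
        print("publication post-processing enabled")
```

With `store_true` the lookup is off by default, so regular builds pay no eutils cost unless the flag is passed.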

@cmungall do you like the idea of doing this at dipper time, or at golr time?

nlwashington avatar Jan 29 '16 21:01 nlwashington

It would be good to avoid re-querying each time. I think we'd want to maintain our own permanent cache. This could be SG, or another solution. If it's SG, would it be easy to merge the SG publication graph into each fresh monarch data graph?

cmungall avatar Jan 29 '16 22:01 cmungall
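
To illustrate the caching idea in the comment above, here is a sketch that uses a flat JSON file as a stand-in for the permanent cache. This is a swapped-in simplification, not SG. The `load_cache` and `update_cache` names and the `fetch` callable (any function mapping a list of identifiers to a dict of metadata) are assumptions made for the example.

```python
import json
from pathlib import Path


def load_cache(path):
    """Return the cached records, or an empty dict if no cache file exists yet."""
    path = Path(path)
    return json.loads(path.read_text()) if path.exists() else {}


def update_cache(path, curies, fetch):
    """Fetch only the identifiers missing from the cache, then persist the result.

    `fetch` is a caller-supplied callable: list of ids -> {id: metadata}.
    """
    cache = load_cache(path)
    missing = sorted(set(curies) - set(cache))
    if missing:
        cache.update(fetch(missing))
        Path(path).write_text(json.dumps(cache, indent=2, sort_keys=True))
    return cache
```

Because only missing identifiers reach `fetch`, rebuilding a source whose publications are already cached triggers no new queries.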