user doesn't know which version/date of CORD19 a carrel was created from

Open nkmeyers opened this issue 4 years ago • 3 comments

Problem: a user, whether creating a carrel or reading one, doesn't know which version/date of the CORD19 dataset the carrel was created from.

Question(s):

  1. How do we know the last date on which the cord19 dataset was updated and indexed in Distant Reader? @artunit @ralphlevan @ericleasemorgan
  2. @ericleasemorgan How can we best write the last cord19dataset-updatetoDR info and the last cord19dataset-indexed-byDRdate info into the visible body content of /export/reader/www/internal/index.cgi?
  3. @ericleasemorgan How can we best write the last cord19dataset-updatetoDR info and the last cord19dataset-indexed-byDRdate info into the provenance.tsv file for a carrel?

Possible solution(s): Once we can populate that date info as a data element in the provenance file, can we then write the entire contents of the provenance file into a new variable-derived section in the MANIFEST.htm and index.htm files for each carrel?

Anyone have other ideas for how to get this info in front of the user(s) at carrel creation time? And at carrel "reading" time?

nkmeyers avatar Jun 28 '20 17:06 nkmeyers

A little late, but I have a way to find out when my database was last updated. http://solr-01:8983/solr/cord/admin/file?wt=json&_=1594418120634&file=dataimport.properties&contentType=text%2Fplain%3Bcharset%3Dutf-8

That will return a file that includes the date of the last indexing.
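For anyone who wants to pull the date out of that file programmatically, here is a minimal sketch in Python. It assumes the file is a Java-style properties file containing a `last_index_time` key with colons escaped as `\:`, which is how Solr's DataImportHandler conventionally writes it; the function name and the sample text are made up for illustration, so check them against what the URL above actually returns.

```python
from datetime import datetime


def parse_last_index_time(properties_text, key="last_index_time"):
    """Return the datetime stored under `key` in properties-style text, or None.

    Assumption: the value looks like "2020-07-10 21\\:05\\:12"
    (colons escaped, as Java properties files usually do).
    """
    for raw_line in properties_text.splitlines():
        line = raw_line.strip()
        # Skip blank lines and comment lines.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and name.strip() == key:
            # Undo the "\:" escaping before parsing the date.
            cleaned = value.strip().replace("\\:", ":")
            return datetime.strptime(cleaned, "%Y-%m-%d %H:%M:%S")
    return None


# Illustrative sample only; not real output from solr-01.
sample = "#Fri Jul 10 21:05:12 UTC 2020\nlast_index_time=2020-07-10 21\\:05\\:12\n"
print(parse_last_index_time(sample))
```

The text passed in would be the body fetched from the admin/file URL; fetching is left out because the host is only reachable internally.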

ralphlevan avatar Jul 10 '20 21:07 ralphlevan

There's also this: http://solr-01:8983/solr/admin/metrics/history?action=status&name=solr.collection.cord

That will return a JSON file. status.lastModified is a timestamp in seconds. The usual tools will convert that into a date and time.
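As one example of such a tool, a short Python sketch for the conversion. It assumes the response is a JSON object with a top-level `status` object holding `lastModified` as seconds since the Unix epoch, as described above; the function name and the sample payload are hypothetical, and the real response will likely carry more fields.

```python
import json
from datetime import datetime, timezone


def last_modified_utc(response_text):
    """Convert status.lastModified (assumed epoch seconds) to a UTC datetime."""
    payload = json.loads(response_text)
    seconds = payload["status"]["lastModified"]
    return datetime.fromtimestamp(seconds, tz=timezone.utc)


# Illustrative payload only; the real response shape is an assumption.
sample = '{"status": {"lastModified": 1594418120}}'
print(last_modified_utc(sample).isoformat())
```

Keeping the result in UTC avoids ambiguity if the date later gets written into a provenance file and read on a machine in another time zone.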

ralphlevan avatar Jul 10 '20 22:07 ralphlevan

On Jun 28, 2020, at 1:24 PM, Natalie Meyers [email protected] wrote:

Problem: a user, whether creating a carrel or reading one, doesn't know which version/date of the CORD19 dataset the carrel was created from.

Question(s):

• How do we know the last date on which the cord19 dataset was updated and indexed in Distant Reader? @artunit @ralphlevan @ericleasemorgan
• @ericleasemorgan How can we best write the last cord19dataset-updatetoDR info and the last cord19dataset-indexed-byDRdate info into the visible body content of /export/reader/www/internal/index.cgi?
• @ericleasemorgan How can we best write the last cord19dataset-updatetoDR info and the last cord19dataset-indexed-byDRdate info into the provenance.tsv file for a carrel?

Possible solution(s): Once we can populate that date info as a data element in the provenance file, can we then write the entire contents of the provenance file into a new variable-derived section in the MANIFEST.htm and index.htm files for each carrel?

Anyone have other ideas for how to get this info in front of the user(s) at carrel creation time? And at carrel "reading" time?

In short, "Correct."

In long, when CORD is harvested, a date stamp ought to be added to its database. Then, when a carrel is created, the date stamp goes along for the ride in our provenance file. The content of the provenance file will then get inserted into various HTML pages and reports.
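A rough sketch of that flow in Python, purely for illustration. It assumes the provenance file is a single tab-delimited line and that the date stamp can simply be appended as one more field; the actual layout of provenance.tsv is not described in this thread, and both function names are hypothetical.

```python
import html


def add_date_stamp(provenance_line, harvest_date):
    """Append a harvest date as an extra tab-delimited field (assumed layout)."""
    fields = provenance_line.rstrip("\n").split("\t")
    fields.append(harvest_date)
    return "\t".join(fields) + "\n"


def provenance_to_html(provenance_line):
    """Render the provenance fields as an HTML list for insertion into a page."""
    fields = provenance_line.rstrip("\n").split("\t")
    # Escape each field so provenance values cannot break the page markup.
    items = "".join(f"<li>{html.escape(field)}</li>" for field in fields)
    return f"<ul>{items}</ul>"


# Made-up provenance values, not the real provenance.tsv columns.
line = add_date_stamp("cord\tmy-carrel\n", "2020-07-10")
print(provenance_to_html(line))
```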

-- Eric

ericleasemorgan avatar Jul 27 '20 14:07 ericleasemorgan