
few questions about 'database' metadata, and versioning

Open yarikoptic opened this issue 8 years ago • 7 comments

Very well done and lovely project! Thanks!

We were thinking of providing access to the datasets you host via DataLad (https://github.com/datalad/datalad/), which is based on git/git-annex. Here, for example, is a sample git/annex repository, http://datasets.datalad.org/test/physionet/eegmmidb/, which accesses data from your website. Before jumping in and crawling the entire website from https://physionet.org/physiobank/database/, I wanted to ask:

  • Is the list of databases (datasets) available in some machine-readable form?
  • For each database, does all that information (citations, description) live solely in HTML, or is it composed from some centralized DB?
  • How do you "version" files? I.e., if authors re-upload newer/fixed versions, do they just replace the old copies? Or has that never happened (yet)?

Thank you in advance!

yarikoptic avatar Oct 26 '16 11:10 yarikoptic

  • Yes https://physionet.org/physiobank/database/DBS
  • All HTML pages are written by hand, each one independently (although copied from a general template). Unfortunately, at this point they are not generated automatically and do not depend on a set of metadata files.
  • Unfortunately we don't have version control. This is something we are working on implementing. Currently we just put a note on the html page about things being changed.
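
For anyone wanting to consume the DBS list mentioned above, here is a minimal sketch of a parser. It assumes the file is plain text with one database per line, the short name first and the description after one or more tab characters; that layout is an assumption and should be checked against the actual file. The function name `parse_dbs` and the sample text are hypothetical, and the sketch works on a string so it does not depend on the network.

```python
import re


def parse_dbs(text):
    """Parse DBS-style text into (name, description) pairs.

    Assumption: one database per line, with the short name and the
    description separated by one or more tab characters.
    """
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        # Split only on the first run of tabs; descriptions may contain spaces.
        parts = re.split(r"\t+", line, maxsplit=1)
        name = parts[0].strip()
        description = parts[1].strip() if len(parts) > 1 else ""
        entries.append((name, description))
    return entries


# Illustrative sample in the assumed layout, not a copy of the real file.
sample = "eegmmidb\tEEG Motor Movement/Imagery Dataset\n\nmitdb\t\tMIT-BIH Arrhythmia Database\n"
for name, description in parse_dbs(sample):
    print(f"{name}: {description}")
```

Each name could then be joined to https://physionet.org/physiobank/database/ to get a per-database starting URL for crawling, again assuming the directory names match the short names.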

cx1111 avatar Oct 28 '16 19:10 cx1111

I'm reopening the issue for now to help us keep it in mind, because it would be great to make the freely-available PhysioNet datasets available in this way. At the moment the metadata is not available in machine readable form, but we are working on updating the platform and this is definitely something that we would like to address.

tompollard avatar Nov 03 '16 20:11 tompollard

Thank you Tom!

yarikoptic avatar Nov 03 '16 21:11 yarikoptic

Are any of you going to SfN next week?

yarikoptic avatar Nov 03 '16 21:11 yarikoptic

As far as I know, none of the MIT-based team will be at the Society for Neuroscience meeting (https://www.sfn.org/annual-meeting/neuroscience-2016), but I'll let others comment if so!

tompollard avatar Nov 03 '16 21:11 tompollard

Just in case -- if anyone is interested in chatting at SfN, you can find me at the DataLad booth, 4113.

yarikoptic avatar Nov 03 '16 21:11 yarikoptic

FWIW, a sample dataset is now available from http://datasets.datalad.org/?dir=/physionet . So don't be surprised if you see lots of "git-annex" agents in your web logs ;)

yarikoptic avatar Sep 17 '18 17:09 yarikoptic