physionet
few questions about 'database' metadata, and versioning
Very well done and lovely project! Thanks.
We were thinking of providing access to the datasets you host via DataLad (https://github.com/datalad/datalad/), which is based on git/git-annex. Here is, e.g., a sample git/git-annex repository, http://datasets.datalad.org/test/physionet/eegmmidb/, which accesses data from your website. Before jumping in and just crawling the entire website from https://physionet.org/physiobank/database/, I wanted to ask:
- Is the list of databases (datasets) available in some machine-readable form?
- For each database, does all that information (citations, description) live solely in HTML, or is it composed from some centralized DB?
- How do you "version" files? I.e., if authors re-upload newer/fixed versions, do they just replace the old copies? Or has that never happened (yet)?
Thank you in advance!
- Yes: https://physionet.org/physiobank/database/DBS
- All HTML pages are written by hand, independently of each other (though copied from a general template). They are not generated automatically and, unfortunately, do not depend on a set of metadata files at this point.
- Unfortunately we don't have version control. This is something we are working on implementing. Currently we just put a note on the HTML page when things have changed.
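For anyone wanting to consume that DBS list programmatically, here is a minimal Python sketch. It assumes each line of the file holds a database name, then whitespace (such as a tab), then a free-text description; that layout is an assumption and should be checked against the actual file. The names `parse_dbs` and `fetch_dbs` are hypothetical, and the sample text is made up for illustration.

```python
from typing import Dict
from urllib.request import urlopen

# URL taken from the reply above.
DBS_URL = "https://physionet.org/physiobank/database/DBS"


def parse_dbs(text: str) -> Dict[str, str]:
    """Map database name -> description from DBS-style text.

    Assumed layout (not confirmed): "<name><whitespace><description>" per line.
    """
    databases: Dict[str, str] = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        parts = line.split(None, 1)  # split on the first run of whitespace
        name = parts[0]
        description = parts[1].strip() if len(parts) > 1 else ""
        databases[name] = description
    return databases


def fetch_dbs(url: str = DBS_URL) -> Dict[str, str]:
    """Download the list and parse it (performs a network request)."""
    with urlopen(url, timeout=30) as response:
        text = response.read().decode("utf-8", errors="replace")
    return parse_dbs(text)


# Illustrative input only; these descriptions are placeholders.
sample = "eegmmidb\tfirst example description\notherdb\tsecond example description\n"
listing = parse_dbs(sample)
```

A crawler could then visit one directory per key of the returned mapping instead of walking the whole site.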
I'm reopening the issue for now to help us keep it in mind, because it would be great to make the freely available PhysioNet datasets available in this way. At the moment the metadata is not available in machine-readable form, but we are working on updating the platform and this is definitely something that we would like to address.
Thank you Tom!
Are any of you going to SfN next week?
As far as I know, none of the MIT-based team will be at the Society for Neuroscience meeting (https://www.sfn.org/annual-meeting/neuroscience-2016), but I'll let others comment if so!
Just in case: if anyone is interested in chatting at SfN, you can find me at the DataLad booth, 4113.
FWIW, a sample dataset is now available from http://datasets.datalad.org/?dir=/physionet . So don't be surprised if you see lots of "git-annex" agents showing up in your web logs ;)