reproman
reproman run - simplify datalad credentials specification for getting data
Could you provide details on what you're referring to here?
I should have been more verbose indeed, and the exact use case escapes my memory now. But I think it was about the fact that we do `datalad get` on the remote resource to fetch data. Data access might require authentication (e.g. any dataset from CRCNS, such as the `///crcns` ones) or S3 credentials (when accessing from `s3://`), and I am not sure if those credential requests would be visible to the user during `reproman run`.
OK, thanks.
> I am not sure if those credential requests would be visible to user during `reproman run`.
Yeah, I doubt they would. The `get` calls were very much done with publicly accessible data in mind and are followed up with a `publish` call to try to sync data that's available on the local machine but not publicly.
I guess one way would be to pass credentials via environment variables. https://github.com/datalad/datalad/blob/master/datalad/support/keyring_.py#L48 is the place where we access them. So for CRCNS it would be `DATALAD_CRCNS_USER` and `DATALAD_CRCNS_PASSWORD`. Ideally this would be interfaced by reproman, so the user could just list the datalad credentials to be passed through (e.g. "crcns" here, but there could be multiple, "crcns, s3"). reproman would query those from the local credential store (it would need to ask for the fields specific to each credential), and the remote datalad execution would then get those env variables set.
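
To make the idea concrete, here is a rough sketch of the "list credentials, look them up locally, export as env variables" step. Nothing here is existing reproman or datalad API: `credential_env`, `CREDENTIAL_FIELDS`, and the `lookup` callable are hypothetical names, and the `DATALAD_<NAME>_<FIELD>` naming simply follows the `DATALAD_CRCNS_USER` / `DATALAD_CRCNS_PASSWORD` example above. The fields needed for other credentials (e.g. "s3") are not spelled out in this thread, so only "crcns" is filled in.

```python
from typing import Callable, Dict, Iterable, Mapping, Optional, Sequence

# Hypothetical table: credential name -> fields that must be forwarded.
# Only "crcns" is listed because its two fields are named in this thread.
CREDENTIAL_FIELDS: Dict[str, Sequence[str]] = {
    "crcns": ("user", "password"),
}


def credential_env(
    names: Iterable[str],
    fields: Mapping[str, Sequence[str]],
    lookup: Callable[[str, str], Optional[str]],
) -> Dict[str, str]:
    """Build the env variables to set for the remote datalad execution.

    `lookup(name, field)` stands in for a query against the local
    credential store and should return None if the value is missing.
    """
    env: Dict[str, str] = {}
    for name in names:
        if name not in fields:
            raise KeyError(f"no known fields for credential {name!r}")
        for field in fields[name]:
            value = lookup(name, field)
            if value is None:
                raise KeyError(f"credential {name!r} has no {field!r} stored")
            # Naming assumed from the DATALAD_CRCNS_USER example above.
            var = f"DATALAD_{name}_{field}".upper().replace("-", "_")
            env[var] = value
    return env


# Usage with an in-memory stand-in for the credential store.
fake_store = {("crcns", "user"): "alice", ("crcns", "password"): "secret"}
env = credential_env(
    ["crcns"], CREDENTIAL_FIELDS, lambda n, f: fake_store.get((n, f))
)
print(sorted(env))
```

Failing early on a missing field seems preferable to silently skipping it, since otherwise the remote `datalad get` would fall back to prompting, which is exactly the case that is probably not visible during `reproman run`.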