reproman run - simplify datalad credentials specification for getting data

Open yarikoptic opened this issue 5 years ago • 4 comments

yarikoptic avatar Oct 23 '19 15:10 yarikoptic

Could you provide details on what you're referring to here?

kyleam avatar Nov 18 '19 21:11 kyleam

I should indeed have been more verbose, and the exact use case has escaped my memory by now. But I think it was about the fact that we run datalad get on the remote resource to fetch data. Data access might require authentication (e.g. any dataset from CRCNS, such as the ///crcns ones) or S3 credentials (when accessing s3:// URLs), and I am not sure if those credential requests would be visible to the user during reproman run.

yarikoptic avatar Nov 18 '19 23:11 yarikoptic

OK, thanks.

> I am not sure if those credential requests would be visible to the user during reproman run.

Yeah, I doubt they would. The get calls were very much done with publicly accessible data in mind and are followed up with a publish call to try to sync data that's available on the local machine but not publicly.

kyleam avatar Nov 19 '19 00:11 kyleam

I guess one way would be to pass credentials via environment variables. https://github.com/datalad/datalad/blob/master/datalad/support/keyring_.py#L48 is the place where we access them. So for CRCNS it would be DATALAD_CRCNS_USER and DATALAD_CRCNS_PASSWORD. Ideally this should be interfaced by reproman, so the user could just list the datalad credentials to be passed through (e.g. "crcns" here, but there could be multiple, "crcns, s3"). reproman would query those from the local credential store (it would need to ask for the fields specific to each credential), and the remote datalad execution would then get those env variables set.
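
A minimal sketch of that idea, to make the proposal concrete. Nothing here is existing reproman or datalad API: `env_var_name`, `collect_credential_env`, the `fields_by_name` mapping and the `lookup` callable are all hypothetical names, and `lookup` stands in for whatever would query the local credential store. Variable names are spelled upper-case as in DATALAD_CRCNS_USER above; whether datalad matches them case-sensitively should be checked against keyring_.py.

```python
from typing import Callable, Dict, Iterable, Mapping, Optional


def env_var_name(name: str, field: str) -> str:
    """Build the variable name for one credential field.

    Assumption: follows the DATALAD_<NAME>_<FIELD> spelling used in this
    thread (e.g. DATALAD_CRCNS_USER); dashes become underscores.
    """
    return f"DATALAD_{name}_{field}".replace("-", "_").upper()


def collect_credential_env(
    names: Iterable[str],
    fields_by_name: Mapping[str, Iterable[str]],
    lookup: Callable[[str, str], Optional[str]],
) -> Dict[str, str]:
    """Turn a list of credential names into env variables for the remote run.

    `fields_by_name` says which fields each credential has (hypothetical,
    supplied by the caller); `lookup(name, field)` is a stand-in for a query
    against the local credential store and returns None if nothing is stored.
    """
    env: Dict[str, str] = {}
    for name in names:
        for field in fields_by_name[name]:
            value = lookup(name, field)
            if value is None:
                # Fail early rather than let the remote `datalad get` prompt
                # for input that the user would never see.
                raise KeyError(f"no stored value for {name!r} field {field!r}")
            env[env_var_name(name, field)] = value
    return env


if __name__ == "__main__":
    # Illustrative in-memory "store"; a real one would be the local keyring.
    store = {("crcns", "user"): "alice", ("crcns", "password"): "not-a-real-password"}
    env = collect_credential_env(
        ["crcns"],
        {"crcns": ("user", "password")},
        lambda name, field: store.get((name, field)),
    )
    print(sorted(env))
```

The resulting mapping would then have to be exported in the environment of the remote job before datalad get runs. How reproman would do that, and how to keep the secrets out of logs and job scripts, is left open here.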

yarikoptic avatar Nov 19 '19 13:11 yarikoptic