airr-standards

Introduce a field to indicate the license of a data set

Open bussec opened this issue 6 years ago • 5 comments

Until now, our focus has been on data sets that are in the public domain, i.e. have been deposited within the infrastructure of INSDC. However, when thinking about a more diversified structure of the AIRR Data Commons, data sets might come under a variety of licenses.

Taking the recommendations of RDA & CODATA - especially Principle 4: "State the rights transparently and clearly" - into account, should we introduce a field in the AIRR Schema that indicates the license of a data set? And if yes, what would be the best level in the hierarchy for this? Sample? Repertoire?

bussec avatar Apr 25 '19 22:04 bussec

I suggest that the license of a dataset be clearly stated at the sample level.

enkelejdamiho avatar Apr 26 '19 22:04 enkelejdamiho

@bussec I think that, for simplicity, the user should be able to specify a license at the study level, thus covering all data in the study, and also at the "sample" level as an override. Technically, the override would sit at the sequencing run, as that's where the filenames are specified. I assume we are talking about a license for the informatic (digital) data and not for the organic material in a tube?
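The study-level default with a per-sample override could be resolved roughly as sketched below. This is only an illustration: the field name `data_license`, the dictionary layout, and the helper `resolve_license` are hypothetical and not part of the AIRR Schema; the license values are written as SPDX identifiers purely as an example.

```python
from typing import Optional


def resolve_license(study: dict, sample: Optional[dict] = None) -> Optional[str]:
    """Return the sample-level license if one is set, else the study-level one.

    `data_license` is a hypothetical field name used for illustration only.
    """
    if sample is not None and sample.get("data_license"):
        return sample["data_license"]
    return study.get("data_license")


# Hypothetical study with one license that covers all of its data.
study = {"study_id": "S1", "data_license": "CC-BY-4.0"}

# Sample B overrides the study-level license; sample A inherits it.
samples = [
    {"sample_id": "A"},
    {"sample_id": "B", "data_license": "CC0-1.0"},
]

resolved = {s["sample_id"]: resolve_license(study, s) for s in samples}
```

With this layout, a user who only sets the license once at the study level gets it applied everywhere, while mixed-license studies remain expressible.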

schristley avatar Sep 04 '19 22:09 schristley

Yes, digital. I suggest keeping the option open on granularity. If the user can specify the license at the study level, this is quickly done. I was focusing more on the case where parts/samples of the same study might be handled under different terms.

enkelejdamiho avatar Sep 26 '19 23:09 enkelejdamiho

@schristley @bcorrie I just realized that there is a license field in the ADC info object. Are your repos using that in a dataset-specific way?

bussec avatar Jun 01 '22 13:06 bussec

No, that's the license for the API service. Data licenses likely need to be in the data itself.

schristley avatar Jun 01 '22 16:06 schristley