ga4gh-schemas
Allow more flexible linking of individuals and phenotypes
via @sarahhunt
Attaching disease to BioSample seems odd - many sequenced samples are not associated with a disease; if they were it would be more normal to flag this at the individual level. This model does not allow for individuals with multiple diseases. Disease is a limiting term - what about other traits which may be investigated? Wouldn't this be better modelled as a record linking an individual to a phenotype/trait/disease with date of linkage?
If I edit a mutation into a normal iPSC to mimic a disease, then the disease belongs to the BioSample and not to the individual. How else would we model this?
Thanks for the example use case @helenp. So we need to be able to hold such information at both levels.
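To make the "both levels" idea concrete, here is a minimal sketch of a linking record that can point at either an individual or a biosample, can be repeated for multiple traits, and carries a date of linkage. All names (`PhenotypeAssociation`, `SubjectKind`, `phenotypes_for`) and the `EX:` term identifiers are hypothetical placeholders for illustration; none of this is taken from the current schema.

```python
from dataclasses import dataclass
from datetime import date
from enum import Enum
from typing import List, Optional


class SubjectKind(Enum):
    """Which kind of record the phenotype is attached to (hypothetical)."""
    INDIVIDUAL = "individual"
    BIOSAMPLE = "biosample"


@dataclass(frozen=True)
class PhenotypeAssociation:
    """Links one subject to one phenotype/trait/disease term."""
    subject_kind: SubjectKind
    subject_id: str
    term_id: str                      # placeholder ontology ID, not a real term
    term_label: str
    date_recorded: Optional[date] = None


def phenotypes_for(associations: List[PhenotypeAssociation],
                   kind: SubjectKind, subject_id: str) -> List[str]:
    """Return the labels attached directly to the given subject."""
    return [a.term_label for a in associations
            if a.subject_kind is kind and a.subject_id == subject_id]


associations = [
    # An individual with more than one linked trait.
    PhenotypeAssociation(SubjectKind.INDIVIDUAL, "ind-1", "EX:0001",
                         "example disease A", date(2016, 1, 1)),
    PhenotypeAssociation(SubjectKind.INDIVIDUAL, "ind-1", "EX:0002",
                         "example trait B"),
    # An edited iPSC line: the disease is inherent in the sample itself.
    PhenotypeAssociation(SubjectKind.BIOSAMPLE, "bs-ipsc", "EX:0001",
                         "example disease A"),
]

# A blood sample from ind-1 has no association of its own.
print(phenotypes_for(associations, SubjectKind.BIOSAMPLE, "bs-blood"))
```

In this sketch a sample does not inherit its donor's diseases automatically; whether it should is a separate design decision.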
Yes, but we have to document this better. We also need to make it very explicit, with examples, that the "disease" is inherent in the sample. E.g. peripheral blood from a patient with CRC does not have a disease value of CRC. This is a cause of endless grief when mapping data.
It seems like BioSample is the basis for a preparation. Is it possible we need another entity to better describe these experiments? For example, would it make more sense to model @helenp's use case as a preparation of a biosample? For example, I may have a single piece of tissue that will be treated three different ways. Are these three new biosamples, or three preparations of the same BioSample?
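As a rough illustration of the second option (three preparations of the same BioSample), a separate entity could hang off the sample as sketched below. `Preparation`, `prepare`, and the ID scheme are assumptions made up for this example, not part of the existing model.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Preparation:
    """One treatment applied to a biosample (hypothetical entity)."""
    id: str
    biosample_id: str
    treatment: str


@dataclass
class BioSample:
    id: str
    individual_id: str
    preparations: List[Preparation] = field(default_factory=list)

    def prepare(self, treatment: str) -> Preparation:
        """Record a new preparation; the sample identity is unchanged."""
        prep = Preparation(
            id=f"{self.id}-prep{len(self.preparations) + 1}",
            biosample_id=self.id,
            treatment=treatment,
        )
        self.preparations.append(prep)
        return prep


tissue = BioSample(id="bs-1", individual_id="ind-1")
for treatment in ("treatment A", "treatment B", "untreated control"):
    tissue.prepare(treatment)

# One BioSample, three preparations that all point back to it.
print([p.id for p in tissue.preparations])
```

The alternative reading, three new biosamples, would instead create three `BioSample` records that share the same `individual_id`.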
The current model collapses layers found in more complete models. A sample is split into aliquots for analysis, which are then used to produce libraries. In the current GA4GH metadata model, the aliquot and library would be in Experiment. It's still the same sample.
This model might evolve once we get more experience.
Although I'm not suggesting we use TCGA barcodes, I found this image helpful in thinking about the biosample data coverage.
https://wiki.nci.nih.gov/display/TCGA/TCGA+barcode