Extension of the Gene type expressedIn property
Hi
Currently the expressedIn property of the Gene type expects an anatomical entity (e.g. Uberon).
Could it be extended to also include developmental stages, sex, and strain conditions, for example?
A condition instead of an anatomical structure would be great.
By condition I mean
anatomical_structure [developmental_stage] [sex] [strain]
to be able to represent something like
expressed in liver, at the embryonic stage, in female of native Americans
or simply expressed in liver in male OR expressed in liver
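To make the request concrete, here is a minimal Turtle sketch of what such a condition could look like. The `ex:` names (`ex:Condition`, `ex:anatomicalStructure`, `ex:developmentalStage`, `ex:sex`) are hypothetical placeholders, not existing schema.org or Bioschemas terms, and the prefix IRIs are assumptions. A strain could be attached in the same way.

```turtle
@prefix bioschema: <https://bioschemas.org/> .   # assumed namespace IRI
@prefix ex: <http://example.org/> .              # hypothetical terms and data

# Full condition: liver, embryonic stage, female
ex:gene1 a bioschema:Gene ;
    bioschema:expressedIn [
        a ex:Condition ;                         # hypothetical type
        ex:anatomicalStructure <http://purl.obolibrary.org/obo/UBERON_0002107> ;  # UBERON term for liver
        ex:developmentalStage "embryonic stage" ;
        ex:sex "female"
    ] .

# Partial condition: liver in male
ex:gene2 a bioschema:Gene ;
    bioschema:expressedIn [
        a ex:Condition ;
        ex:anatomicalStructure <http://purl.obolibrary.org/obo/UBERON_0002107> ;
        ex:sex "male"
    ] .

# Anatomy only: the optional parts are simply left out
ex:gene3 a bioschema:Gene ;
    bioschema:expressedIn [
        a ex:Condition ;
        ex:anatomicalStructure <http://purl.obolibrary.org/obo/UBERON_0002107>
    ] .
```

Only the anatomical structure is mandatory here, which matches the bracketed (optional) parts of the pattern above.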
Can you give a consumption use case where this extra detail is needed? We need to be very careful that we are not trying to completely model the biology.
For example, the hemoglobin subunit gamma gene is expressed only at the fetal developmental stage, not in adults. Ovary- and testis-specific genes are expressed in females and males, respectively.
I'm using this property to link all sorts of conditions coming from the EBI GXA (example).
DefinedTerm, currently being proposed as the range for this property, could be more or less fine for this general case, but I think PropertyValue would be better. Conceptually, a condition's domain might include information artefacts such as name/value pairs, because this is what many databases or datasets can export automatically.
Moreover, I and others from the plant community are thinking of proposing ExperimentalFactorValue for cases like this. As you can see in the linked example (click on the condition URIs), I have a provisional term like this under the agrischemas namespace, which subclasses schema:PropertyValue.
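As a rough illustration of the PropertyValue option, the sketch below encodes parts of a condition as name/value pairs and declares a provisional subclass for experimental factors. The `ex:` namespace stands in for the provisional term, and its IRI is made up. Using `bioschema:expressedIn` with a `schema:PropertyValue` range, and the `propertyID` strings, are likewise assumptions for illustration only.

```turtle
@prefix schema: <https://schema.org/> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix bioschema: <https://bioschemas.org/> .   # assumed namespace IRI
@prefix ex: <http://example.org/> .              # hypothetical terms and data

# Provisional type: a PropertyValue that marks an experimental factor
ex:ExperimentalFactorValue rdfs:subClassOf schema:PropertyValue .

ex:gene1 a bioschema:Gene ;
    bioschema:expressedIn [
        a schema:PropertyValue ;                 # plain name/value pair
        schema:propertyID "organism part" ;
        schema:value "liver"
    ] , [
        a ex:ExperimentalFactorValue ;           # the factor varied on purpose
        schema:propertyID "developmental stage" ;
        schema:value "embryonic stage"
    ] .
```

One caveat with this flat form: two separate `expressedIn` values do not by themselves state that the pairs hold jointly, so a combined condition would probably need an extra grouping node.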
Hi all, I am jumping into the discussion :) (@smoretti pointed me to this thread.)
I think the best solution is to create something such as ExperimentalFactorValue or Condition as the range of the expressedIn property. A higher-level concept such as "Condition" might be better received by the schema.org community. I say this because schema.org/DefinedTerm does not solve our problem: going by the DefinedTerm specification, using it for this purpose would amount to forcing the schema to accommodate our data. PropertyValue could be a temporary solution at first. I call it temporary because, with it, we do not have a highly structured schema but a semi-structured one; essentially, the schema part is pushed down to the data level.
Thanks @tarcisiotmf. Condition could be a good addition to schema.org, but I think the life science community needs something to identify the more specific concept of experimental factor. A condition might be a measured condition in which you find a sample, a patient or a plant, whereas an experimental factor is the condition that you want to study and that you typically vary on purpose (e.g., with respect to the baseline condition) in order to see whether it makes a difference (e.g., to gene expression).
@marco-brandizi, I agree with you: having a concept such as ExperimentalFactorValue would already be quite adequate for these use cases.
It would be helpful to have the consumption use case, e.g. the search, that this extension is seeking to support and some details to state why the less accurate approach is not sufficient.
@AlasdairGray I've real data examples like this. I'll report another, simpler case:
# Prefix IRIs are assumed for illustration.
@prefix schema: <https://schema.org/> .
@prefix bioschema: <https://bioschemas.org/> .
@prefix ex: <http://example.org/> .

ex:sample1 a bioschema:Sample;
  schema:isPartOf ex:study1;
  schema:name "The sample 1";
  schema:additionalProperty [
    a bioschema:BiologicalCondition; # new type, or schema:PropertyValue
    schema:propertyID "initial seed size";
    schema:value "5 mm"
  ];
  bioschema:experimentalFactor [ # new property or schema:additionalProperty
    a bioschema:ExperimentalFactorValue; # marks the perturbed/varied condition
    schema:propertyID "harvesting time";
    schema:value "4 weeks";
  ]
.
As you can see, the sample is annotated with two different types of properties, and one is stated as the experimental factor.