Normalizing/standardizing data storage (enforcing best practices) through PL/pgSQL functions

Open guignonv opened this issue 2 years ago • 2 comments

I'd like to bring a discussion (yesterday on gather.town) here for the community.

The main strength of a schema like Chado is that it normalizes how biological data are stored. That lets us create generic tools that work with everybody's Chado instance, as long as they store things the (exact) same way. However, it is often possible to store the same thing in different ways, which makes writing generic tools for Chado harder.

To encourage standard ways of storing things, there is a best-practices documentation, but people may not read it, may forget about it, or may misunderstand it.

I propose an additional approach: embedding functions in the Chado schema whose role is to store data appropriately. I see several advantages. First, they would enforce the way things are stored (e.g. use the appropriate tables and create the appropriate links between things). Second, they can check data integrity (e.g. if a date is stored in a *prop.value column, the function can raise an error if the date is invalid). Third, they can adapt to schema changes (e.g. the same function can be kept from one schema version to the next, even when tables change significantly). There are some drawbacks: there is a lot of code to write and maintain, and many functions would be needed to cover the many use cases, so how do we decide what is needed? I will post some examples in a follow-up comment.
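To make the data-integrity point concrete, here is a minimal sketch of what such a function could look like. It is not existing Chado code: the table `demo_featureprop` is a simplified, hypothetical stand-in for a *prop table (no foreign keys to feature or cvterm), and the function name `demo_store_feature_date` and its parameters are made up for illustration. It assumes PostgreSQL 9.5 or later (for `ON CONFLICT`).

```sql
-- Hypothetical, simplified stand-in for a Chado *prop table.
CREATE TABLE IF NOT EXISTS demo_featureprop (
    featureprop_id bigserial PRIMARY KEY,
    feature_id     bigint  NOT NULL,
    type_id        bigint  NOT NULL,
    value          text,
    rank           integer NOT NULL DEFAULT 0,
    UNIQUE (feature_id, type_id, rank)
);

-- Hypothetical helper: store a date property for a feature,
-- refusing anything that is not a valid YYYY-MM-DD date.
CREATE OR REPLACE FUNCTION demo_store_feature_date(
    p_feature_id bigint,
    p_type_id    bigint,
    p_value      text
) RETURNS bigint
LANGUAGE plpgsql AS $$
DECLARE
    v_date date;
    v_id   bigint;
BEGIN
    -- Enforce one storage format, so tools never have to guess.
    -- A NULL value fails this check too, since NULL is not TRUE.
    IF (p_value ~ '^\d{4}-\d{2}-\d{2}$') IS NOT TRUE THEN
        RAISE EXCEPTION 'Date "%" is not in YYYY-MM-DD format', p_value;
    END IF;

    -- The cast rejects impossible calendar dates such as 2022-02-30.
    BEGIN
        v_date := p_value::date;
    EXCEPTION
        WHEN invalid_datetime_format OR datetime_field_overflow THEN
            RAISE EXCEPTION 'Invalid date "%" for feature %',
                p_value, p_feature_id;
    END;

    -- Insert the property, or overwrite the existing rank-0 value.
    INSERT INTO demo_featureprop (feature_id, type_id, value, rank)
    VALUES (p_feature_id, p_type_id, to_char(v_date, 'YYYY-MM-DD'), 0)
    ON CONFLICT (feature_id, type_id, rank)
    DO UPDATE SET value = EXCLUDED.value
    RETURNING featureprop_id INTO v_id;

    RETURN v_id;
END;
$$;
```

With this in place, `SELECT demo_store_feature_date(1, 42, '2022-02-08');` stores the value, while `SELECT demo_store_feature_date(1, 42, '2022-02-30');` raises an error instead of writing a bad date. A real version would presumably look up the type through cvterm rather than take a raw `type_id`.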

So far, it's just a thought (related to a problem I faced), with no code written. It would require a lot of work, but I'd like to have the community's opinion about this approach.

Edit: well, there is already some code for basic stuff (store_db, store_dbxref, store_organism, store_feature, store_featureloc, store_feature_synonym, store_analysis), but it could be generalized. For example, "store_feature" could be called by new functions like store_dna, store_snp, ...

guignonv, Feb 08 '22 11:02