pgsc_calc
Improving information about PGS Catalog scoring file versions
Description of feature
It would be great if there were some form of version control, so that the state of the PGS Catalog could be reconstructed as of a given date. At minimum, publishing a running change log (e.g. when a change occurred, what was changed, why the change was made) would be extremely useful. Apologies if this feature already exists; if it does, making it more prominent would be great. A longer-term goal might be to allow users of pgsc_calc to request scores based on a given version/release of the PGS Catalog.
Motivation
My lab and collaborators have noticed that executing the exact same pgsc_calc command that pulls scores from the PGS Catalog has resulted in different output when run on different days. In troubleshooting, we noticed a few issues:
- In one case, the #trait_efo assigned to a scorefile changed from one day to the next.
- In two other cases, we noticed that the sign of the effect_weight column was changed within the scorefiles.
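For illustration, the kind of manual check described above could be sketched roughly as follows. This is only a sketch, not anything pgsc_calc or the PGS Catalog provides: the function names are hypothetical, and it assumes scorefile header lines have the form #key=value and that the table is tab-separated with an effect_weight column plus some of rsID, chr_name, chr_position, and effect_allele to identify each variant.

```python
import csv
import io

# Columns assumed to identify a variant; missing columns are treated as empty.
KEY_COLUMNS = ("rsID", "chr_name", "chr_position", "effect_allele")


def parse_scorefile(text):
    """Split scorefile text into header metadata and per-variant weights."""
    header, rows = {}, []
    for line in text.splitlines():
        if line.startswith("#"):
            # Assumed metadata form "#key=value"; banner lines without "=" are skipped.
            if "=" in line:
                key, value = line[1:].split("=", 1)
                header[key.strip()] = value.strip()
        elif line.strip():
            rows.append(line)
    reader = csv.DictReader(io.StringIO("\n".join(rows)), delimiter="\t")
    weights = {}
    for row in reader:
        variant = tuple(row.get(col, "") for col in KEY_COLUMNS)
        weights[variant] = float(row["effect_weight"])
    return header, weights


def compare_scorefiles(old_text, new_text):
    """Return changed header fields and variants whose weight changed sign."""
    old_header, old_weights = parse_scorefile(old_text)
    new_header, new_weights = parse_scorefile(new_text)
    header_changes = {
        key: (old_header.get(key), new_header.get(key))
        for key in sorted(set(old_header) | set(new_header))
        if old_header.get(key) != new_header.get(key)
    }
    sign_flips = [
        variant
        for variant in sorted(old_weights.keys() & new_weights.keys())
        if old_weights[variant] * new_weights[variant] < 0
    ]
    return header_changes, sign_flips
```

Running compare_scorefiles on the text of an archived scorefile and the current one would list fields such as trait_efo whose values differ, along with the variants whose effect_weight flipped sign. Doing this across many scores and archived versions is exactly the effort a published change log would save.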
It's great that archived versions of each scorefile are maintained on the PGS Catalog FTP site, which eventually allowed us to troubleshoot these issues. However, tracking down these individual scorefile changes is very time-consuming, particularly as the number of scores and the number of archived versions increase. This problem also raises the potential for broader transparency and reproducibility issues. Thanks for all the hard work making this resource possible!