myvariant.info
Data source: HGMD
http://www.hgmd.cf.ac.uk/ac/index.php
HGMD seems to be used very frequently by labs working on variant annotation pipelines.
It has a licensing issue we need to be aware of: there is an academic version and a commercial version.
We can provide a "data plugin" for standalone myvariant.info instances, for users who have permission to use HGMD. But it won't be available in the public myvariant.info API due to the license restriction.
I think that's the easiest possible route. One thing to remember is that HGMD is a product of Qiagen, which also has a stake in annotation distribution as they provide AnnoVar as a service. Interestingly, it was publicly announced just a few weeks ago that Alamut is no longer able to provide HGMD as an annotation, so that's a big reason why labs all over the place are clamoring for an annotation solution with HGMD included. If Qiagen were willing to set up tokenization/authentication with BioThings, there could be an option to provide HGMD-PRO annotation via the public API, but that would take plenty of time and effort. HGMD-PRO as a plugin that any institution can just enable is certainly the way to go right now. We could also look at having a second source, HGMD-Public (half the data and out of date, but at least it's publicly available, right?).
Another possible option is for us to provide only a parser. Users of a standalone instance (in-house, not the public one) who hold an HGMD license can then get the dumped HGMD-PRO file and run the parser to merge the data into their own instance.
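To make the "run the parser and merge" idea concrete, here is a minimal sketch of the merge step. It assumes the parser yields dicts with an `_id` and an `hgmd` sub-document, and it uses a plain in-memory dict as a stand-in for the standalone instance's storage. The name `merge_hgmd` is hypothetical and not part of any existing BioThings API.

```python
from typing import Dict, Iterable


def merge_hgmd(existing: Dict[str, dict], hgmd_docs: Iterable[dict]) -> Dict[str, dict]:
    """Attach each parsed HGMD record to the variant with the same "_id".

    `existing` stands in for the standalone instance's variant store
    (an assumption for this sketch); variants not seen before are created.
    """
    for doc in hgmd_docs:
        target = existing.setdefault(doc["_id"], {"_id": doc["_id"]})
        # Keep all HGMD fields under one key so other sources stay untouched.
        target["hgmd"] = doc["hgmd"]
    return existing
```

Keeping everything under a single `hgmd` key means the licensed data can be added or dropped without touching annotations from other sources.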
@raymond301 We don't have access to the commercial version of HGMD. Do you know if a dumped file will be available to HGMD-PRO subscribers?
Reaching out to Qiagen is something we'd like to do soon, just to see if they are open to any solution for including HGMD in MyVariant.info.
For HGMD-PRO (Qiagen's commercial product) there are a number of different licenses. The key differences are clinical vs. research use, and access through their web interface vs. just a data-dump download.
I have "HGMD Download, Research Use" which consists of a number of files:
- HGMD_Data_Download_Page.pdf
- HGMD_download_installation_<version>.pdf
- HGMD_FAQ_<version>.pdf
- HGMD_Schema_<version>.pdf
- hgmd_phenbase-<version>.dump.gz
- hgmd_pro-<version>.dump.gz
- hgmd_snp-<version>.dump.gz
- hgmd_views-<version>.dump.gz
- hgmd_pro_<version>_hg19.vcf
- hgmd_pro_<version>_hg38.vcf
It's not overly complex to parse and load the VCFs provided, but they differ slightly from the MySQL database dump files, which include all the functional annotation & curation notes. So we would need to merge the additional details from the database files with the included VCF files. Please note that the file structure & formats have changed over the years, so all of this is subject to change with each release, which comes out every quarter.
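As a rough illustration of the VCF half of that work, the sketch below reads tab-separated VCF lines and emits one document per single-nucleotide variant, keyed by an HGVS-style genomic ID. It assumes nothing about which INFO keys the HGMD files actually carry and simply copies whatever is present. The function names and the `hgmd_id` field are hypothetical, and indels and multi-allelic rows are skipped to keep it short.

```python
from typing import Dict, Iterable, Iterator


def parse_info(info: str) -> Dict[str, object]:
    """Split a VCF INFO column into a dict; flag-only keys map to True."""
    fields: Dict[str, object] = {}
    if info in ("", "."):
        return fields
    for item in info.split(";"):
        if not item:
            continue
        key, sep, value = item.partition("=")
        fields[key.lower()] = value.strip('"') if sep else True
    return fields


def snv_hgvs_id(chrom: str, pos: str, ref: str, alt: str) -> str:
    """Build an HGVS-style genomic ID such as chr1:g.12345A>G."""
    if chrom.lower().startswith("chr"):
        chrom = chrom[3:]
    return f"chr{chrom}:g.{pos}{ref}>{alt}"


def load_hgmd_vcf(lines: Iterable[str]) -> Iterator[dict]:
    """Yield one document per SNV row; headers and non-SNV rows are skipped."""
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue
        cols = line.rstrip("\n").split("\t")
        if len(cols) < 8:
            continue  # not a complete VCF record
        chrom, pos, vcf_id, ref, alt = cols[:5]
        if ref not in "ACGT" or alt not in "ACGT" or len(ref) != 1 or len(alt) != 1:
            continue  # this sketch only handles simple SNVs
        doc = {"hgmd_id": vcf_id}
        doc.update(parse_info(cols[7]))
        yield {"_id": snv_hgvs_id(chrom, pos, ref, alt), "hgmd": doc}
```

With a licensed file this could be called as `load_hgmd_vcf(open(path))`. The extra functional annotation and curation notes from the database dumps would still need to be joined onto these documents in a separate step, and the column handling may need adjusting per release.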
It's a very doable task. I cannot speak for Qiagen's position on inclusion in MyVariant.info, but it may be worth looking into what can be obtained through their public version.