mygene.info
Create a new NCBI data source to get complete gene summary from ASN dump
The current gene summary data (the `summary` field) returned by the MyGene.info API is extracted from RefSeq records (see the current refseq data source).
It appears that RefSeq does not contain all of the gene summary text available from NCBI. For example, as reported in #129, the gene POLA2 has a summary text that is not available from its RefSeq record, so it is missing from the current MyGene.info API.
As suggested by the NCBI support team (Case #: CAS-941135-X3W9H8 for the record), the complete gene summary text is available from NCBI's ASN.1 binary dump files. We can create a new `ncbi_gene` data source based on the ASN.1 binary dump files to extract the gene summary text.
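
As a rough starting point, the extraction step might look something like the sketch below. It assumes the ASN.1 binary dump has already been converted to XML (for example with NCBI's gene2xml tool) and that each gene record is an `Entrezgene` element carrying its ID under `Entrezgene_track-info/Gene-track/Gene-track_geneid` and its summary under `Entrezgene_summary`. Those element paths should be checked against a real dump. The function name `iter_gene_summaries` and the sample records are hypothetical and only there for illustration.

```python
import io
import xml.etree.ElementTree as ET

# Toy input mimicking the assumed gene2xml layout; IDs and text are made up.
SAMPLE_XML = b"""<Entrezgene-Set>
  <Entrezgene>
    <Entrezgene_track-info><Gene-track>
      <Gene-track_geneid>12345</Gene-track_geneid>
    </Gene-track></Entrezgene_track-info>
    <Entrezgene_summary>Placeholder summary text.</Entrezgene_summary>
  </Entrezgene>
  <Entrezgene>
    <Entrezgene_track-info><Gene-track>
      <Gene-track_geneid>67890</Gene-track_geneid>
    </Gene-track></Entrezgene_track-info>
  </Entrezgene>
</Entrezgene-Set>"""


def iter_gene_summaries(xml_source):
    """Yield (gene_id, summary) for every gene record that has a summary.

    xml_source is a file path or a binary file-like object. Records are
    streamed with iterparse so a large dump is not loaded all at once.
    """
    for _event, elem in ET.iterparse(xml_source, events=("end",)):
        if elem.tag != "Entrezgene":
            continue
        gene_id = elem.findtext(
            "Entrezgene_track-info/Gene-track/Gene-track_geneid"
        )
        summary = elem.findtext("Entrezgene_summary")
        if gene_id and summary:
            yield gene_id, summary.strip()
        # Drop the processed record's children to keep memory usage low.
        elem.clear()


if __name__ == "__main__":
    for gene_id, summary in iter_gene_summaries(io.BytesIO(SAMPLE_XML)):
        print(gene_id, summary)
```

Genes without a summary are simply skipped here. Whether the real data source should skip them or emit an empty field is a design choice left open.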