Make more publication metadata available via `datasets download virus genome taxon 186538`
Unfortunately, some of the valuable metadata contained in GenBank files is lost on the way to datasets. In particular, crucial author/submitter metadata is lost: fields such as `TITLE`, `JOURNAL`, `PUBMED` and `DOI` are dropped entirely.
It would be great if these could be included in the `submitter` part of the jsonl report.
Compare what's available in the `.gb` file:
REFERENCE 1 (bases 1 to 2408)
AUTHORS Sanchez,A., Trappier,S.G., Mahy,B.W., Peters,C.J. and Nichol,S.T.
TITLE The virion glycoproteins of Ebola viruses are encoded in two
reading frames and are expressed through transcriptional editing
JOURNAL Proc. Natl. Acad. Sci. U.S.A. 93 (8), 3602-3607 (1996)
PUBMED [8622982](https://www.ncbi.nlm.nih.gov/pubmed/8622982)
REFERENCE 2 (bases 1 to 2408)
AUTHORS Sanchez,A., Trappier,S., Conaty,A.L., Brammer,L., Mahy,B.J.W.,
Peters,C.J. and Nichol,S.T.
TITLE Direct Submission
JOURNAL Submitted (22-MAR-1995) Anthony Sanchez, Special Pathogens Branch,
Division of Viral and Rickettsial Diseases, Centers for Disease
Control and Prevention, 1600 Clifton Road, Bldg. 15, Room SB611,
Mail Stop G14, Atlanta, GA 30333, USA
with what's available via `datasets download virus genome`:
"submitter": {
"affiliation": "Anthony Sanchez, Special Pathogens Branch, Division of Viral and Rickettsial Diseases, Centers for Disease Control and Prevention, 1600 Clifton Road, Bldg. 15, Room SB611, Mail Stop G14, Atlanta, GA 30333",
"country": "USA",
"names": [
"Sanchez,A.",
"Trappier,S.G.",
"Mahy,B.W.",
"Peters,C.J.",
"Nichol,S.T.",
"Trappier,S.",
"Conaty,A.L.",
"Brammer,L.",
"Mahy,B.J.W."
]
}
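To make the gap concrete, here is a minimal sketch of how the per-reference fields could be recovered from the `.gb` file in the meantime. It is a workaround idea, not anything datasets provides: the function name `parse_references` and the output layout are hypothetical, and the parser assumes the usual GenBank flat-file layout in which keywords occupy the first 12 columns and continuation lines are indented by 12 spaces.

```python
import json

# Sub-keywords that may appear inside a REFERENCE block of a GenBank flat file.
FIELDS = {"AUTHORS", "CONSRTM", "TITLE", "JOURNAL", "PUBMED", "REMARK"}

# Excerpt of the record quoted above, laid out in the assumed 12-column format.
SAMPLE = """\
REFERENCE   1  (bases 1 to 2408)
  AUTHORS   Sanchez,A., Trappier,S.G., Mahy,B.W., Peters,C.J. and Nichol,S.T.
  TITLE     The virion glycoproteins of Ebola viruses are encoded in two
            reading frames and are expressed through transcriptional editing
  JOURNAL   Proc. Natl. Acad. Sci. U.S.A. 93 (8), 3602-3607 (1996)
   PUBMED   8622982
REFERENCE   2  (bases 1 to 2408)
  AUTHORS   Sanchez,A., Trappier,S., Conaty,A.L., Brammer,L., Mahy,B.J.W.,
            Peters,C.J. and Nichol,S.T.
  TITLE     Direct Submission
"""


def parse_references(genbank_text):
    """Return one dict per REFERENCE block (hypothetical helper, not part of datasets)."""
    references = []
    current = None  # reference block currently being filled in
    field = None    # last keyword seen, so continuation lines know where to go
    for line in genbank_text.splitlines():
        keyword = line[:12].strip()  # keywords live in the first 12 columns
        value = line[12:].strip()
        if keyword == "REFERENCE":
            current = {"REFERENCE": value}
            references.append(current)
            field = "REFERENCE"
        elif current is not None and keyword in FIELDS:
            current[keyword] = value
            field = keyword
        elif current is not None and not keyword and field and value:
            # Continuation line: append to the field started above.
            current[field] += " " + value
        elif keyword:
            # Any other keyword (COMMENT, FEATURES, ...) ends the reference section.
            current = None
            field = None
    return references


if __name__ == "__main__":
    print(json.dumps(parse_references(SAMPLE), indent=2))
```

Keeping each reference as its own object preserves which authors belong to the publication and which to the direct submission, a distinction that the merged `names` list above no longer carries.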