uta
Universal Transcript Archive: comprehensive genome-transcript alignments; multiple transcript sources, versions, and alignment methods; available as a docker image
$ sudo docker run --name uta_20180821 -v uta_20180821:/var/lib/postgresql/data -p 50827:5432 biocommons/uta:uta_20180821
$ psql -h localhost -p 50827 -U anonymous -d uta -c "select * from uta_20180821.meta"
ERROR: relation "uta_20180821.meta" does...
Hi- I'm doing some variant mapping. Starting with this:
```
v = hp.parse_hgvs_variant("NM_007194.4(CHEK2):c.1611T>A")
print(v)
print(v.posedit.pos.start)
print(v.posedit.edit.ref)
```
I get:
NM_007194.4(CHEK2):c.1611T>A
1611
T
All is good. I can get the genomic...
**Originally reported by Reece Hart (Bitbucket: [reece](http://bitbucket.org/reece), GitHub: [reece](http://github.com/reece)) in [biocommons/uta](https://bitbucket.org/biocommons/uta/) [#198](https://bitbucket.org/biocommons/uta/issue/198)**

Migrated by [bitbucket-issue-migration](https://github.com/reece/bitbucket-issue-migration) on 2016-09-09 15:15:07

---

In UTA's data model, an `exon_set` represents a set of exons...
When aligning sequences with indels in repeat regions, the alignment is ambiguous. Most aligners implicitly right- or left-shuffle gaps, but this choice is arbitrary: the real region of ambiguity...
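To make the ambiguity concrete, here is a minimal sketch of left- and right-shuffling a deletion within a repeat. It is illustrative only and is not UTA's or any aligner's implementation; the function name `shuffle_deletion`, the example sequence, and the 0-based half-open coordinates are all assumptions.

```python
from typing import Tuple


def shuffle_deletion(ref: str, start: int, end: int) -> Tuple[int, int]:
    """Return (leftmost_start, rightmost_start) for deleting ref[start:end].

    Coordinates are 0-based, half-open (an assumption for this sketch).
    Every start position in the returned range yields the same sequence
    after the deletion.
    """
    length = end - start

    # Shift left while the base entering the window on the left equals
    # the base leaving the window on the right.
    left = start
    while left > 0 and ref[left - 1] == ref[left - 1 + length]:
        left -= 1

    # Shift right while the base leaving the window on the left equals
    # the base entering the window on the right.
    right = start
    while right + length < len(ref) and ref[right] == ref[right + length]:
        right += 1

    return left, right


def apply_deletion(ref: str, start: int, length: int) -> str:
    """Delete `length` bases of `ref` beginning at `start`."""
    return ref[:start] + ref[start + length:]


# Hypothetical example: delete one "CA" unit from a (CA)3 repeat.
ref = "GATCACACAG"
left, right = shuffle_deletion(ref, 5, 7)
products = {apply_deletion(ref, s, 2) for s in range(left, right + 1)}
```

In this example every start position between `left` and `right` produces the same sequence, so reporting only the left- or right-shuffled placement hides the full extent of the ambiguous region.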
At the end of the `uta_20171026` dump, three `REFRESH MATERIALIZED VIEW` statements are executed for the views `exon_set_exons_fp_mv`, `tx_exon_set_summary_mv`, and `tx_def_summary_mv`. However, restoration of `uta_20180821` ends with `GRANT` statements without any `REFRESH MATERIALIZED...
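As a possible workaround sketch, the statements for the three views named above could be generated and run manually after a restore. The helper name `refresh_statements` is hypothetical, and schema-qualifying the views with `uta_20180821` is an assumption, not something the report confirms.

```python
from typing import List, Sequence

# View names as listed in the report, kept in the reported order.
MATERIALIZED_VIEWS = (
    "exon_set_exons_fp_mv",
    "tx_exon_set_summary_mv",
    "tx_def_summary_mv",
)


def refresh_statements(
    schema: str, views: Sequence[str] = MATERIALIZED_VIEWS
) -> List[str]:
    """Build one REFRESH MATERIALIZED VIEW statement per view.

    Assumes the views live in `schema`; adjust if they do not.
    """
    return [f"REFRESH MATERIALIZED VIEW {schema}.{view};" for view in views]


statements = refresh_statements("uta_20180821")
```

The resulting strings can be passed to `psql -c` or executed through any PostgreSQL client by a role with sufficient privileges on the views.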
Our team recently noticed that for a small subset of transcripts within UTA the hgnc field is empty. See entry below comparing the record for the transcript of BRAF vs....
This transcript was updated to NM_005117.3 on 23-NOV-2018. `HGVSDataNotAvailableError: No alignments for NM_005117.3 in GRCh37 using splign` I don't know if that's the root cause.
Ensembl data in UTA is very old. During a future schema overhaul, the loading process will need to be updated. That will be a good time to update data as...
* exon structure: seq, exon coordinates
* cds-clipped exon structure: seq, cds-clipped exon coordinates
* exon lengths
* cds exon lengths
* cds
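The items above can be illustrated with a small sketch that derives cds-clipped exon coordinates and lengths from an exon structure. This is not UTA's code: the function names, the 0-based half-open transcript coordinates, and the example values are all assumptions made for illustration.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # (start, end), 0-based half-open (assumed)


def clip_exons_to_cds(
    exons: List[Interval], cds_start: int, cds_end: int
) -> List[Interval]:
    """Trim each exon to the CDS and drop exons entirely outside it."""
    clipped = []
    for start, end in exons:
        s = max(start, cds_start)
        e = min(end, cds_end)
        if s < e:  # keep only exons that overlap the CDS
            clipped.append((s, e))
    return clipped


def interval_lengths(intervals: List[Interval]) -> List[int]:
    """Length of each interval."""
    return [end - start for start, end in intervals]


# Hypothetical transcript: three exons, CDS spanning positions 50..300.
exons = [(0, 100), (100, 250), (250, 400)]
cds_exons = clip_exons_to_cds(exons, cds_start=50, cds_end=300)

exon_lens = interval_lengths(exons)
cds_exon_lens = interval_lengths(cds_exons)
cds_length = sum(cds_exon_lens)
```

Comparing `exon_lens` with `cds_exon_lens` shows which exons carry untranslated sequence: only the first and last exons differ in this example.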