
Ensembl transcript ENST00000617537.5 sequence is genomic, not cDNA

Open davmlaw opened this issue 4 months ago • 9 comments

https://asia.ensembl.org/Homo_sapiens/Transcript/Summary?db=core;g=ENSG00000136250;r=7:36512941-36724494;t=ENST00000617537

The web page reports that the sequence is 2385 bases long.

The Ensembl REST API agrees:

In [25]: import requests
In [26]: url = "https://rest.ensembl.org/sequence/id/ENST00000617537?type=cdna"
In [27]: r = requests.get(url, headers={"Content-Type": "application/json"}, timeout=60)
In [28]: len(r.json()["seq"])
Out[28]: 2385

SeqRepo returns a much longer sequence:

In [20]: from hgvs.dataproviders.seqfetcher import SeqFetcher

In [21]: sf = SeqFetcher()

In [22]: seq = sf.fetch_seq("ENST00000617537.5")

In [23]: len(seq)
Out[23]: 211554
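
As a quick sanity check on the "genomic" claim, the length of the region in the URL above can be compared with what SeqRepo returned. This is only a sketch: `region_length` is a hypothetical helper, and it assumes the `r=` parameter uses 1-based, inclusive coordinates.

```python
def region_length(region: str) -> int:
    """Length of a 'chrom:start-end' region, assuming 1-based inclusive coordinates."""
    _, coords = region.split(":")
    start, end = (int(x) for x in coords.split("-"))
    return end - start + 1


region = "7:36512941-36724494"  # r= parameter from the Ensembl URL above
seqrepo_len = 211554            # len(seq) returned via SeqFetcher
cdna_len = 2385                 # len of the Ensembl REST cDNA sequence

print(region_length(region))                 # span of the genomic region
print(region_length(region) == seqrepo_len)  # does SeqRepo match the genomic span?
print(region_length(region) == cdna_len)     # does the cDNA match it?
```

If the first comparison is true and the second is false, the SeqRepo sequence has the length of the genomic span rather than of the spliced transcript.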

davmlaw, Oct 01 '24 07:10