
How about GENCODE retrieval?

Open stephenturner opened this issue 10 years ago • 6 comments

Data and links to FTP sites for human and mouse.

stephenturner avatar Dec 03 '14 10:12 stephenturner

Yep, good idea. I am currently looking into how best to pull data from Ensembl. Not that easy, unfortunately. Also, if you are interested in adding a recipe to ggd-recipes, that would be great!

arq5x avatar Dec 03 '14 10:12 arq5x

Could have a look at how https://github.com/hammerlab/pyensembl rolls.

martijnvermaat avatar Dec 03 '14 22:12 martijnvermaat

Ah, thanks. Will have a look.

arq5x avatar Dec 03 '14 22:12 arq5x

By my eye, pyensembl works by downloading the relevant gzipped GTF (and related) files from Ensembl and building a local SQLite database from them. This strategy could work, but ideally I would really like GGD to use the existing APIs to pull data. This is going to be a challenge, however...
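
For illustration, the download-then-load-into-SQLite idea can be sketched roughly as below. This is not pyensembl's actual code: the function names, the `features` table, and its column names are all made up for the example, and it only assumes the standard nine-column, tab-separated GTF layout.

```python
import gzip
import io
import sqlite3


def parse_gtf_line(line):
    """Split one GTF record into its columns; return None if it is malformed."""
    fields = line.rstrip("\n").split("\t")
    if len(fields) != 9:
        return None
    seqname, source, feature, start, end, _score, strand, _frame, attributes = fields
    return (seqname, source, feature, int(start), int(end), strand, attributes)


def load_gtf(handle, conn):
    """Load GTF records from a text handle into a hypothetical `features` table.

    Returns the number of rows in the table after loading.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS features ("
        "seqname TEXT, source TEXT, feature TEXT, "
        "start_pos INTEGER, end_pos INTEGER, strand TEXT, attributes TEXT)"
    )
    # Skip '#' header lines, then drop anything that does not parse.
    parsed = (parse_gtf_line(line) for line in handle if not line.startswith("#"))
    rows = (row for row in parsed if row is not None)
    conn.executemany("INSERT INTO features VALUES (?, ?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM features").fetchone()[0]


if __name__ == "__main__":
    # Tiny in-memory stand-in for a downloaded file. For a real gzipped GTF,
    # open it with gzip.open(path, "rt") and pass that handle instead.
    demo = io.StringIO('chr1\tHAVANA\tgene\t11869\t14409\t.\t+\t.\tgene_id "ENSG00000223972";\n')
    print(load_gtf(demo, sqlite3.connect(":memory:")))
```

The attributes column is stored as raw text here; a real recipe would probably want to split out `gene_id`, `transcript_id`, and so on into their own indexed columns.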

arq5x avatar Dec 04 '14 02:12 arq5x

Aren't the Ensembl APIs in Perl?

mw55309 avatar Dec 10 '14 20:12 mw55309

They have a MySQL database under the hood as well.

arq5x avatar Dec 10 '14 20:12 arq5x