rnacentral-webcode
Downloading JSON files gives a file that is gzipped twice
When a JSON export is downloaded through Safari, the file has to be run through gzip -d twice before it is readable. Steps to reproduce:
- Go to 'http://rnacentral.org/export/download-result?job=4e4dc29d-c53d-41a3-be5f-54908ec622f3'
- Hit download
- Try the following:
✔ bsweeney@bsweeney-ml ~/Downloads
$ gzip -d HOTAIR_AND_TAXONOMY9606_AND_rna_typelncRNA_etc.json.gz
✔ bsweeney@bsweeney-ml ~/Downloads
$ jq '.[0]' < HOTAIR_AND_TAXONOMY9606_AND_rna_typelncRNA_etc.json
parse error: Invalid numeric literal at line 1, column 54
✘ bsweeney@bsweeney-ml ~/Downloads
$ mv HOTAIR_AND_TAXONOMY9606_AND_rna_typelncRNA_etc.json HOTAIR_AND_TAXONOMY9606_AND_rna_typelncRNA_etc.json.gz
✔ bsweeney@bsweeney-ml ~/Downloads
$ gzip -d HOTAIR_AND_TAXONOMY9606_AND_rna_typelncRNA_etc.json.gz
✔ bsweeney@bsweeney-ml ~/Downloads
$ jq '.[0]' < HOTAIR_AND_TAXONOMY9606_AND_rna_typelncRNA_etc.json
{
"url": "http://rnacentral.org/api/v1/rna/URS000011D1F0",
...
Note that if you copy the URL that the download button leads to (http://rnacentral.org/export/download-result?job=4e4dc29d-c53d-41a3-be5f-54908ec622f3), download it with wget, and then run gzip -d once, you get the JSON file right away; the second decompression step is not needed.
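As a workaround until this is fixed, the manual rename-and-decompress steps in the transcript can be wrapped in a small helper. This is only a sketch: `is_gzip` and `gunzip_all` are hypothetical names, not part of RNAcentral, and the helper assumes a file counts as gzipped whenever it starts with the gzip magic bytes 1f 8b.

```shell
# Hypothetical helpers (not part of RNAcentral): strip every gzip layer
# from a downloaded file, however many times it was compressed.

# Succeeds if the file starts with the gzip magic bytes 1f 8b.
is_gzip() {
    magic=$(head -c 2 "$1" | od -An -tx1 | tr -d ' \n')
    [ "$magic" = "1f8b" ]
}

# Usage: gunzip_all INPUT OUTPUT
# Copies INPUT to OUTPUT, then decompresses OUTPUT in place until it is
# no longer gzip data. Prints the number of layers that were removed.
gunzip_all() {
    cp "$1" "$2" || return 1
    layers=0
    while is_gzip "$2"; do
        gzip -dc < "$2" > "$2.tmp" || return 1
        mv "$2.tmp" "$2"
        layers=$((layers + 1))
    done
    echo "$layers"
}
```

For the Safari download above, `gunzip_all HOTAIR_AND_TAXONOMY9606_AND_rna_typelncRNA_etc.json.gz out.json` should report two layers, while the wget download should report one.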