go-site
Change sources: Human, chicken, dog, pig, cow files
Hello,
@alexsign /GOA is now producing 'combined' files for human, chicken, dog, pig, and cow, containing all Swiss-Prot isoforms (not the TrEMBL isoforms), complexes, and RNAs.
The links are here:
- Chicken http://ftp.ebi.ac.uk/pub/databases/GO/goa/CHICKEN/goa_chicken_plus.gaf.gz
- Cow: http://ftp.ebi.ac.uk/pub/databases/GO/goa/COW/goa_cow_plus.gaf.gz
- Dog: http://ftp.ebi.ac.uk/pub/databases/GO/goa/DOG/goa_dog_plus.gaf.gz
- Human: http://ftp.ebi.ac.uk/pub/databases/GO/goa/HUMAN/goa_human_plus.gaf.gz
- Pig: http://ftp.ebi.ac.uk/pub/databases/GO/goa/PIG/goa_pig_plus.gaf.gz
We need to change where we get this data in our 'sources'.
Thanks, Pascale
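The five links above share one naming pattern, so anything that consumes them could derive them from the species name rather than hard-coding each URL. A minimal sketch, assuming the pattern holds for exactly these five species; `plus_gaf_url` and the constants are hypothetical names for illustration, not part of go-site:

```python
# Species directory names exactly as they appear in the links above.
SPECIES = ["CHICKEN", "COW", "DOG", "HUMAN", "PIG"]

# Base path copied from the links above.
BASE = "http://ftp.ebi.ac.uk/pub/databases/GO/goa"


def plus_gaf_url(species: str) -> str:
    """Build the 'combined' (plus) GAF URL for one species.

    The directory is upper case and the file name is lower case,
    matching the links posted in this ticket.
    """
    name = species.upper()
    return f"{BASE}/{name}/goa_{name.lower()}_plus.gaf.gz"


if __name__ == "__main__":
    for species in SPECIES:
        print(plus_gaf_url(species))
```

This only reproduces the URLs listed in the comment; it does not check that the files exist.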
Talking to @pgaudet, we'll wait for this next release to pass and then push the change. Possible locations of friction:
- [ ] neo
- [x] downloads
- [x] stats
@pgaudet I noticed that goa_pdb (https://ftp.ebi.ac.uk/pub/databases/GO/goa/PDB/goa_pdb.gaf.gz) exists in the metadata. Is this used for anything? I don't think we use it; the only references to it I can find, going back to 2019, are about it causing problems.
The files in the first comment are correct. GOA produces various files for various groups; we can ignore the rest.
Initial changes have been made and we're waiting on a snapshot run to test.
Talking to @pgaudet, the stats seem to be good. The test downloads page (http://snapshot.geneontology.org/products/pages/downloads.html , ignoring the links) also seems to be good.
The final item to ensure is the NEO build. Building now.
NEO built:
1734706857 golr-index-contents.tgz
on machine:
1738730937 golr_new.tgz
Given how close these are, I think it's reasonable that nothing extreme happened. Allowing snapshot to proceed.
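To put a number on "how close", the two figures can be compared directly. A small sketch, assuming both values are archive sizes in bytes (the thread does not state the unit); `relative_difference` is a hypothetical helper:

```python
def relative_difference(a: int, b: int) -> float:
    """Absolute difference between a and b, as a fraction of the larger value."""
    return abs(a - b) / max(a, b)


# Values copied from the comment above; assumed to be sizes in bytes.
new_build = 1734706857   # golr-index-contents.tgz
on_machine = 1738730937  # golr_new.tgz

diff = abs(new_build - on_machine)
rel = relative_difference(new_build, on_machine)

if __name__ == "__main__":
    print(f"difference: {diff} ({rel:.2%} of the larger file)")
```

Under that assumption the two archives differ by about 4 million units, well under 1% of either file.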
Single file for human, dog, cow, chicken and pig: :)
compared to 2024-04-24 release:
I think this is complete? The only concern I see now is that the entity type is incorrect: it is currently "protein" when the file is actually a mix of protein, various RNAs, "gene_product", etc. But I think the requirements of this actual ticket are complete.
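One way to see the actual mix of entity types in a combined file is to tally column 12 (DB Object Type) of the GAF. A rough sketch, assuming GAF 2.x tab-separated layout with `!` comment lines; `count_object_types` and `count_in_gzip` are hypothetical helpers, and the local file path is whatever you downloaded:

```python
import gzip
from collections import Counter
from typing import Iterable


def count_object_types(lines: Iterable[str]) -> Counter:
    """Tally the DB Object Type (GAF column 12) over annotation lines."""
    counts: Counter = Counter()
    for line in lines:
        # Skip header/comment lines and blank lines.
        if line.startswith("!") or not line.strip():
            continue
        cols = line.rstrip("\n").split("\t")
        # Skip malformed lines that do not reach column 12.
        if len(cols) < 12:
            continue
        counts[cols[11]] += 1
    return counts


def count_in_gzip(path: str) -> Counter:
    """Same tally, reading a locally downloaded .gaf.gz file."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return count_object_types(handle)
```

For example, `count_in_gzip("goa_human_plus.gaf.gz")` on a downloaded copy would return a `Counter` keyed by object type, which makes it easy to confirm whether a single "protein" label fits the file.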
Right, next we need to fix the downloads page and the documentation.