Try downloading compressed versions of remote ontologies
Whenever the user declares they want to import a foreign ontology (say, ro):
import_group:
  products:
    - id: ro
the generated mirroring code should first try to download http://purl.obolibrary.org/obo/ro.owl.gz, and then fall back to http://purl.obolibrary.org/obo/ro.owl if we get a 404.
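To make the intended behaviour concrete, here is a rough sketch in Python (the real mirroring code is generated, so this is only an illustration, not the actual implementation). The names `mirror_ontology`, `default_fetch` and `OBO_BASE` are made up for this example, and the injectable `fetch` parameter is there only so the logic can be exercised without network access.

```python
import urllib.error
import urllib.request

# Assumption for this sketch: default PURLs follow the pattern quoted above.
OBO_BASE = "http://purl.obolibrary.org/obo/"


def default_fetch(url):
    """Download a URL and return the raw response body as bytes."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def mirror_ontology(ontology_id, fetch=default_fetch):
    """Try <id>.owl.gz first; fall back to <id>.owl only on a 404.

    Returns (url_used, raw_bytes). If the .gz URL was used, the bytes
    are still gzip-compressed and the caller has to decompress them.
    """
    plain_url = f"{OBO_BASE}{ontology_id}.owl"
    gz_url = plain_url + ".gz"
    try:
        return gz_url, fetch(gz_url)
    except urllib.error.HTTPError as err:
        # Only a 404 triggers the fallback; other HTTP errors propagate.
        if err.code != 404:
            raise
    return plain_url, fetch(plain_url)
```

Note that only a 404 on the compressed URL triggers the fallback here; whether other errors (timeouts, 5xx) should also fall back is a separate question.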
This would in effect make the use_gzipped option redundant (it would be as if this option was always on by default – except for the fact that currently this option does not include the fallback behaviour).
Open to discussion about what we should do when the user explicitly provides a mirror_from URL:
import_group:
  products:
    - id: ro
      mirror_from: https://example.org/my/custom/mirroring/site/ro.owl
For now I am inclined to say that we should not try to mess with any explicitly provided URL (so, we do not try to append .gz).
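Sketched in Python, that rule could look like the following. This is only an illustration of the proposal, which is still open to discussion; `candidate_urls` is a hypothetical helper, and the product is assumed to be a plain dict mirroring the YAML entries above.

```python
def candidate_urls(product):
    """Return the URLs to try, in order, for one import product.

    An explicit mirror_from is used verbatim (no .gz is appended);
    otherwise the default PURL is tried compressed first, then plain.
    """
    mirror_from = product.get("mirror_from")
    if mirror_from:
        return [mirror_from]
    plain_url = f"http://purl.obolibrary.org/obo/{product['id']}.owl"
    return [plain_url + ".gz", plain_url]
```

With this, `{"id": "ro"}` yields the `.owl.gz` URL followed by the `.owl` URL, while a product with `mirror_from` set yields only that URL.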
We should check with @jamesaoverton, but I am not entirely sure if http://purl.obolibrary.org/obo/ro.owl.gz is allowed by the OBO PURL system or if it has to be http://purl.obolibrary.org/obo/ro/ro.owl.gz. It would be good if appending .gz was the only thing that is needed.
I like what you are proposing!
RO (and any other project) would have to opt-in by adding an ro.owl.gz entry to the products list: https://github.com/OBOFoundry/purl.obolibrary.org/blob/master/config/ro.yml#L8
The PURL code might warn about not liking the "owl.gz" extension, but we can tweak that.
And we can have a wider discussion about policy for providing compressed artifacts. A decade ago we figured that clients and servers would transparently compress data during transfer, and we weren't worried about storage, but our ontology files continue to get larger.