Improved API for document search and export
Please excuse the single ticket for multiple related problems. I can create separate tickets if you'd prefer that.
Is your feature request related to a problem? Please describe.
I enabled API access to programmatically export documents. I found that I can retrieve project documents and then export annotations in the desired format. However, there are some shortcomings:
- The `/api/aero/v1/projects/{projectId}/documents` endpoint returns all documents, requiring filtering on the client side, for example to get only `state=CURATION-COMPLETE`
- Documents can only be exported one-by-one with the `/api/aero/v1/projects/{projectId}/documents/{documentId}` endpoint
- When exporting, the response is always `Content-Type: application/octet-stream`. Additionally, sending an `Accept` header causes status `406 Not Acceptable`, even if a matching media type is requested. I consider both of these to be bugs
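To illustrate the current workaround, here is a minimal Python sketch of the client-side steps: filter the full listing by state, then build one export URL per remaining document. The shape of the listing (`id`, `name`, `state` fields), the `format` query parameter on the single-document endpoint, and the `https://example.org` base URL are assumptions made for illustration, not confirmed API details.

```python
from urllib.parse import urlencode


def filter_by_state(documents, state):
    """Keep only the documents whose 'state' field equals the given state."""
    return [doc for doc in documents if doc.get("state") == state]


def export_url(base_url, project_id, document_id, fmt):
    """Build the single-document export URL (the 'format' parameter is an assumption)."""
    query = urlencode({"format": fmt})
    return f"{base_url}/api/aero/v1/projects/{project_id}/documents/{document_id}?{query}"


# Hypothetical listing, standing in for the body returned by the documents endpoint.
listing = [
    {"id": 1, "name": "a.txt", "state": "CURATION-COMPLETE"},
    {"id": 2, "name": "b.txt", "state": "NEW"},
]

# Step 1: filter on the client, because the endpoint returns every document.
finished = filter_by_state(listing, "CURATION-COMPLETE")

# Step 2: one export request per document, because there is no bulk export.
urls = [export_url("https://example.org", 7, doc["id"], "conllu") for doc in finished]
```

Each URL in `urls` then has to be fetched separately, so the number of requests grows with the number of matching documents.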
Describe the solution you'd like
Ideally, it would be possible to directly export all matching documents, without doing a search first. Something like
GET /api/aero/v1/projects/{projectId}/documents{?format,state}
The response could be a ZIP with each document exported in the chosen format
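From the client's point of view, the proposal could look roughly like the Python sketch below: expand the URI template into a request URL, then unpack the ZIP response into one entry per document. The endpoint does not exist yet, and the helper names, the base URL, and the archive layout (one file per document) are hypothetical.

```python
import io
import zipfile
from urllib.parse import urlencode


def bulk_export_url(base_url, project_id, fmt=None, state=None):
    """Expand the proposed template .../documents{?format,state}, skipping unset parameters."""
    params = {key: value for key, value in (("format", fmt), ("state", state)) if value is not None}
    url = f"{base_url}/api/aero/v1/projects/{project_id}/documents"
    return f"{url}?{urlencode(params)}" if params else url


def unpack_zip(payload):
    """Map each file name in a ZIP payload (bytes) to that file's content (bytes)."""
    with zipfile.ZipFile(io.BytesIO(payload)) as archive:
        return {name: archive.read(name) for name in archive.namelist()}


# A single request would replace the "list, filter, export one-by-one" loop.
url = bulk_export_url("https://example.org", 7, fmt="rdfcas", state="CURATION-COMPLETE")
```

The body returned for `url` would then be passed to `unpack_zip` to get the exported documents.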
Describe alternatives you've considered
If a new endpoint is not feasible, it would be nice to introduce some improvements
- Add `?state` query param to document search endpoint
- Respond with matching content type, such as `rdfcas => text/turtle`, `conllu => text/plain`, `jsoncas => application/json`, etc
- (Optionally) Allow content negotiation of RDF formats. For example, requesting with `Accept: application/n-triples` should be honoured and respond with RDF in n-triples format
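The last two points could behave roughly as in the Python sketch below. It is not the server's actual implementation: the format-to-media-type table only contains the three pairs named above, the list of negotiable RDF types is an assumption, and the `Accept` parsing is deliberately simplified (it honours `q` values and wildcards but ignores the specificity rules of full HTTP content negotiation).

```python
# Mapping taken from the examples above; any other format keeps the current fallback.
CONTENT_TYPES = {
    "rdfcas": "text/turtle",
    "conllu": "text/plain",
    "jsoncas": "application/json",
}
FALLBACK_TYPE = "application/octet-stream"

# Assumed set of RDF serialisations the server would be able to produce.
RDF_TYPES = ["text/turtle", "application/n-triples"]


def content_type_for(fmt):
    """Return the media type to send for an export format."""
    return CONTENT_TYPES.get(fmt, FALLBACK_TYPE)


def negotiate(accept, supported):
    """Pick the supported media type preferred by an Accept header, or None (i.e. 406)."""
    ranked = []
    for position, item in enumerate(accept.split(",")):
        parts = [part.strip() for part in item.split(";")]
        media_range = parts[0].lower()
        if not media_range:
            continue
        quality = 1.0
        for param in parts[1:]:
            name, _, value = param.partition("=")
            if name.strip().lower() == "q":
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        ranked.append((-quality, position, media_range))
    # Highest q first; ties keep the order in which the client listed them.
    for negated_quality, _, media_range in sorted(ranked):
        if negated_quality == 0:
            continue  # q=0 marks a type as not acceptable
        for candidate in supported:
            type_wildcard = candidate.split("/")[0] + "/*"
            if media_range in ("*/*", type_wildcard, candidate):
                return candidate
    return None
```

With this logic, `Accept: application/n-triples` selects N-Triples, and `406 Not Acceptable` is only returned when none of the requested types can be produced.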