aioelasticsearch
Prevent double-encoding of path components in queries.
What do these changes do?
This change fixes double-encoding of URL path components when performing a request against Elasticsearch. The problem shows up, for example, when retrieving a document by its id and the id contains a character that needs to be percent-encoded:
async with Elasticsearch("myserver.example") as es:
    result = await es.get(index="myindex", doc_type="_doc", id="1234+5678")
Without this patch, a 404 error (document not found) is returned even when a document with the specified id exists in the specified index. Note that the equivalent query works correctly when performed synchronously via elasticsearch.Elasticsearch.
The reason for this is:
- Looking at elasticsearch.client.Elasticsearch, you'll see that each time, before perform_request is invoked on the transport object, elasticsearch.client.utils._make_path has already been used to url-encode the path argument (which comes down to using urllib.parse.quote).
- Using aioelasticsearch, these requests end up at aioelasticsearch.connection.AIOHTTPConnection.perform_request, where the full URL is built by adding the url argument (representing the aforementioned, already encoded path component) to self.base_url (which is a yarl.URL) using the / operator.
- This causes yarl.URL to url-encode the path component a second time.
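The effect of the second encoding pass can be reproduced with the standard library alone. The following is a minimal sketch, not the code path in aioelasticsearch or yarl: it simply applies urllib.parse.quote twice to the example id, on the assumption that this mirrors what the two layers do to the path component.

```python
from urllib.parse import quote, unquote

doc_id = "1234+5678"

# First pass: roughly what the client layer does to the path component.
encoded_once = quote(doc_id, safe="")

# Second pass: what happens when the already-encoded value is encoded again.
# The "%" from the first pass is itself escaped to "%25".
encoded_twice = quote(encoded_once, safe="")

print(encoded_once)   # 1234%2B5678
print(encoded_twice)  # 1234%252B5678

# A server that decodes the path once sees "1234%2B5678" instead of
# "1234+5678", so the lookup targets an id that does not exist.
print(unquote(encoded_twice))  # 1234%2B5678
```

Decoding the doubly encoded value once does not give back the original id, which is consistent with the 404 described above.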
This PR avoids the issue by:
- first constructing a relative yarl.URL instance for the path component to be appended, specifying encoded=True to avoid double url-encoding;
- then using URL.join to build the final URL rather than the / operator.
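The same idea (encode each segment exactly once, then join without re-encoding) can be sketched with the standard library. This is an analogue for illustration only, not the yarl-based code in this PR: make_path and build_url are hypothetical helpers, and the port 9200 in the base URL is an assumption.

```python
from urllib.parse import quote, urljoin


def make_path(*parts: str) -> str:
    """Hypothetical helper: percent-encode each path segment exactly once."""
    return "/" + "/".join(quote(part, safe="") for part in parts)


def build_url(base_url: str, encoded_path: str) -> str:
    """Join an already-encoded path onto the base URL.

    urljoin leaves existing percent-escapes untouched, so the path is
    not encoded a second time.
    """
    return urljoin(base_url, encoded_path)


path = make_path("myindex", "_doc", "1234+5678")
url = build_url("http://myserver.example:9200", path)

print(path)  # /myindex/_doc/1234%2B5678
print(url)   # http://myserver.example:9200/myindex/_doc/1234%2B5678
```

The key design point is that only one layer is responsible for encoding; every later step treats the path as already encoded.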
Are there changes in behavior for the user?
No (bugfix).
Related issue number
None
Checklist
- [x] I think the code is well written
- [x] Unit tests for the changes exist
- [x] Documentation reflects the changes (N/A)
- [ ] Add a new news fragment into the CHANGES folder
  - name it <issue_id>.<type> (e.g. 588.bugfix)
  - if you don't have an issue_id change it to the pr id after creating the PR
  - ensure type is one of the following:
    - .feature: Signifying a new feature.
    - .bugfix: Signifying a bug fix.
    - .doc: Signifying a documentation improvement.
    - .removal: Signifying a deprecation or removal of public API.
    - .misc: A ticket has been closed, but it is not of interest to users.
  - Make sure to use full sentences with correct case and punctuation, for example: Fix issue with non-ascii contents in doctest text files.
Note
The checklist item above regarding the CHANGES folder seems out of date, since I can't find such a folder. If you want me to add a blurb about this PR to CHANGES.rst, please let me know.