meilisearch-python
Using Request.Sessions for faster and less open connections
Pull Request
Related issue
Fixes #<issue_number>
What does this PR do?
- Changes all `requests.get`/etc. calls to `session.get`/etc., so a new connection is not created for each request
PR checklist
Please check if your PR fulfills the following requirements:
- [x] Does this PR fix an existing issue, or have you listed the changes applied in the PR description (and why they are needed)?
- [x] Have you read the contributing guidelines?
- [x] Have you made sure that the title is accurate and descriptive of the changes?
I need some help figuring out why this test doesn't pass:
==================================================================== short test summary info =====================================================================
FAILED tests/errors/test_timeout_error_meilisearch.py::test_client_timeout_error - Failed: DID NOT RAISE <class 'meilisearch.errors.MeilisearchTimeoutError'>
=========================================================== 1 failed, 243 passed, 1 xfailed in 33.49s ============================================================
> I need some help figuring out why this test doesn't pass:

It fails because the test patches `requests.get`, but you changed the code to use `session.get`, so the patch never takes effect.

I understand why in theory this could be faster in the right situation, but did you do any benchmarking? In my extremely quick and crude benchmark there was no improvement.
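A minimal sketch of why the patch misses, assuming the test uses `unittest.mock.patch` on the module-level function (the URL below is a placeholder; no real request is made while the mocks are active):

```python
from unittest import mock

import requests

# Patching the module-level requests.get replaces only that attribute on
# the requests module; Session.get is a separate method on the Session
# class, so session-based code bypasses the mock entirely.
with mock.patch("requests.get", return_value="mocked"):
    assert requests.get("http://example.com") == "mocked"
    # requests.Session().get(...) here would still perform a real request.

# Patching the method on the Session class intercepts session-based calls:
with mock.patch("requests.Session.get", return_value="mocked") as patched:
    session = requests.Session()
    assert session.get("http://example.com") == "mocked"
    assert patched.called
```

So the timeout test would need its patch target updated to `requests.Session.get` (or wherever the session is created) for the error to be raised again.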
> It fails because the test patches `requests.get`, but you changed the code to use `session.get`, so the patch never takes effect. I understand why in theory this could be faster in the right situation, but did you do any benchmarking? In my extremely quick and crude benchmark there was no improvement.
I haven't done any benchmarks, other than knowing that `requests.get()` needs to open a new TCP connection while `session.get()` can reuse the same connection.

The bigger issue that arose in my setup was connection timeouts due to the number of connections being made to my Meilisearch server, probably caused by nginx settings. Beyond that, there is no reason not to reuse a TCP connection when possible. Other SDKs use the same logic, sharing the client/session between `meilisearch.Client` and `meilisearch.Index`: https://github.com/meilisearch/meilisearch-go/blob/main/index.go#L82-L88
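For illustration, this is roughly what a shared `Session` buys: a per-host connection pool, with `HTTPAdapter` controlling its size. The pool sizes below are arbitrary, and no Meilisearch specifics are assumed:

```python
import requests
from requests.adapters import HTTPAdapter

# A single shared Session keeps a per-host connection pool, so repeated
# requests to the same Meilisearch server reuse TCP connections
# (HTTP keep-alive) instead of opening a new one per call.
session = requests.Session()
adapter = HTTPAdapter(pool_connections=10, pool_maxsize=10)
session.mount("http://", adapter)
session.mount("https://", adapter)

# Every call like session.get("http://localhost:7700/health") now draws
# from this pool; plain requests.get() builds and tears down a throwaway
# Session (and its connection) on every call.
```

Capping `pool_maxsize` is also one way to avoid flooding a reverse proxy like nginx with new connections.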
Currently trying to apply your changes, but not passing the `HttpRequests` around also affects how tasks are created, as each task also creates its own `HttpRequests` object. https://github.com/LGXerxes/meilisearch-python/blob/lgx/requests-session/meilisearch/task.py#L14-L29

Perhaps it is easier to do something like:
```python
import requests

global_session = None


def get_global_session():
    global global_session
    if global_session:
        return global_session
    global_session = requests.Session()
    return global_session


class HttpRequests:
    def __init__(self, config: Config) -> None:
        self.config = config
        self.headers = {
            "Authorization": f"Bearer {self.config.api_key}",
            "User-Agent": _build_user_agent(config.client_agents),
        }
        self.session = get_global_session()
```
This way, creating an `HttpRequests` just fetches the shared session, and it also becomes easier to swap the session out for other libraries, tests, etc.
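As a sketch of that swap, a hypothetical `set_global_session` helper (not part of the PR) would let tests or alternative transports replace the shared session in one place:

```python
import requests

_global_session = None


def get_global_session():
    global _global_session
    if _global_session is None:
        _global_session = requests.Session()
    return _global_session


def set_global_session(session):
    # Hypothetical helper, not in the PR: swap the shared session,
    # e.g. for a stub in tests or a wrapper around another HTTP library.
    global _global_session
    _global_session = session


# e.g. in a test, replace the session with a stub:
class FakeSession:
    def get(self, url, **kwargs):
        return "stubbed"


set_global_session(FakeSession())
print(get_global_session().get("http://localhost:7700/health"))  # stubbed
```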
Good point. I think what we could do is have the `TaskHandler` `__init__` take an optional session, like in the `Index`, and create the handler the same way the index is created. This is an off-the-top-of-my-head solution that I haven't tested yet, so feel free to point out any issues if you see some.
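A rough sketch of that suggestion; `TaskHandler` and `config` here are stand-ins following the discussion, and the real meilisearch-python signatures may differ:

```python
import requests


class TaskHandler:
    """Sketch: accept an optional shared session so Client can pass its
    own down, the same way Index would receive one."""

    def __init__(self, config, session=None):
        self.config = config
        # Reuse the caller's session when given; otherwise create one.
        self.session = session if session is not None else requests.Session()


shared = requests.Session()
handler = TaskHandler(config=None, session=shared)
assert handler.session is shared  # same connection pool as the caller's
```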
Another option would be to create an HTTP singleton in `_httprequests.py` and have everything use that.
@sanders41 I think the singleton pattern is probably the best choice here.
Don't see any issues with this implementation.
Closing this for lack of activity, and the tests are still failing. Feel free to re-open, of course, if you want to keep working on it.