cheroot
cheroot copied to clipboard
connections: expiration_interval as select timeout
β What kind of change does this PR introduce?
- [ ] π bug fix
- [x] π£ feature
- [ ] π docs update
- [ ] π tests/coverage improvement
- [x] π refactoring
- [ ] π₯ other
π What is the related issue number (starting with #)
https://github.com/cherrypy/cheroot/pull/311 made a change to process connections as they become active, rather than polling. One of the motivations was to ensure that the latency with which connections are processed is not tied to the select timeout, as it was originally - see https://github.com/cherrypy/cheroot/issues/305
I originally had this change in #311 and intended to submit with it, but must have somehow dropped it during editing/rebasing.
Also includes a minor change to avoid an unnecessary call to time.time() when handling expirations.
β What is the current behavior? (You can also link to an open issue here)
β What is the new behavior (if this is a feature change)?
π Other information:
π Checklist:
- [x] I think the code is well written
- [x] I wrote good commit messages
- [x] I have squashed related commits together after the changes have been approved
- [ ] Unit tests for the changes exist
- [ ] Integration tests for the changes exist (if applicable)
- [x] I used the same coding conventions as the rest of the project
- [x] The new code doesn't generate linter offenses
- [x] Documentation reflects the changes
- [x] The PR relates to only one subject with a clear title and description in grammatically correct, complete sentences
@mar10 any chance you can re-run the benchmarking from https://github.com/cherrypy/cheroot/issues/305 to validate this change? π
mar10 any chance you can re-run the benchmarking from #305 to validate this change? π
Sure: 2021-01-04, PC, Windows 10:
- Cheroot 8.5.1 Executed 16,660 activities
- Cheroot 8.4.8 Executed 16,364 activities
- Cheroot 8.3.1 Executed 2,928 activities
- Cheroot 8.2.1 Executed 2,828 activities
- Cheroot 8.1.0 Executed 2,960 activities
- Cheroot 8.0.0: Executed 17,092 activities
- gevent 20.12.1: Executed 15,420 activities
Click to expand details
> stressor -Vv
stressor/0.5.1-a1 Python/3.9.1(64 bit) Windows-10-10.0.19041-SP0
> wsgidav -Vv
WsgiDAV/3.0.5-a4 Python/3.9.1(64 bit) Windows-10-10.0.19041-SP0
Cheroot 8.5.1
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Stressor scenario 'test_rw' finished.
Tag: 'n.a.'
Base URL: http://127.0.0.1:8082
Start: 2021-01-04 21:24:12
End: 2021-01-04 21:24:42
Run time 30.281 sec, net: 4:58.43 min.
Executed 16,660 activities in 4,175 sequences, using 10 parallel sessions.
Sequence duration: 0.007253 sec average.
rate: 8,273 sequences per minute (per user: 827.3).
Activity rate: 550.2 activities per second (per user: 55.02).
3 monitored activities:
- /main/1/GetRequest(/private/test.html)
n: 4,155, min: 0.004999 sec, avg: 0.011 sec, max: 0.035 sec
- /main/2/PutRequest(/temp/wsgidav_test_file~$(session_id).txt)
n: 4,155, min: 0.007987 sec, avg: 0.019 sec, max: 0.046 sec
- /main/3/GetRequest(/temp/wsgidav_test_file~$(session_id).txt)
n: 4,155, min: 0.004061 sec, avg: 0.018 sec, max: 0.048 sec
Result: Ok.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Cheroot 8.4.8
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Stressor scenario 'test_rw' finished.
Tag: 'n.a.'
Base URL: http://127.0.0.1:8082
Start: 2021-01-04 21:22:12
End: 2021-01-04 21:22:42
Run time 30.265 sec, net: 4:58.40 min.
Executed 16,364 activities in 4,101 sequences, using 10 parallel sessions.
Sequence duration: 0.007380 sec average.
rate: 8,130 sequences per minute (per user: 813.0).
Activity rate: 540.7 activities per second (per user: 54.07).
3 monitored activities:
- /main/1/GetRequest(/private/test.html)
n: 4,081, min: 0.004999 sec, avg: 0.012 sec, max: 0.042 sec
- /main/2/PutRequest(/temp/wsgidav_test_file~$(session_id).txt)
n: 4,081, min: 0.003999 sec, avg: 0.019 sec, max: 0.059 sec
- /main/3/GetRequest(/temp/wsgidav_test_file~$(session_id).txt)
n: 4,081, min: 0.007000 sec, avg: 0.018 sec, max: 0.072 sec
Result: Ok.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Cheroot 8.3.1
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Stressor scenario 'test_rw' finished.
Tag: 'n.a.'
Base URL: http://127.0.0.1:8082
Start: 2021-01-04 21:20:48
End: 2021-01-04 21:21:19
Run time 30.453 sec, net: 5:01.63 min.
Executed 2,928 activities in 742 sequences, using 10 parallel sessions.
Sequence duration: 0.041 sec average.
rate: 1,462 sequences per minute (per user: 146.2).
Activity rate: 96.15 activities per second (per user: 9.615).
3 monitored activities:
- /main/1/GetRequest(/private/test.html)
n: 722, min: 0.006999 sec, avg: 0.085 sec, max: 0.155 sec
- /main/2/PutRequest(/temp/wsgidav_test_file~$(session_id).txt)
n: 722, min: 0.008998 sec, avg: 0.106 sec, max: 0.147 sec
- /main/3/GetRequest(/temp/wsgidav_test_file~$(session_id).txt)
n: 722, min: 0.009990 sec, avg: 0.100 sec, max: 0.139 sec
Result: Ok.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Cheroot 8.2.1
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Stressor scenario 'test_rw' finished.
Tag: 'n.a.'
Base URL: http://127.0.0.1:8082
Start: 2021-01-04 21:18:48
End: 2021-01-04 21:19:18
Run time 30.547 sec, net: 5:01.84 min.
Executed 2,828 activities in 717 sequences, using 10 parallel sessions.
Sequence duration: 0.043 sec average.
rate: 1,408 sequences per minute (per user: 140.8).
Activity rate: 92.58 activities per second (per user: 9.258).
3 monitored activities:
- /main/1/GetRequest(/private/test.html)
n: 697, min: 0.007997 sec, avg: 0.089 sec, max: 0.143 sec
- /main/2/PutRequest(/temp/wsgidav_test_file~$(session_id).txt)
n: 697, min: 0.014 sec, avg: 0.108 sec, max: 0.148 sec
- /main/3/GetRequest(/temp/wsgidav_test_file~$(session_id).txt)
n: 697, min: 0.013 sec, avg: 0.105 sec, max: 0.146 sec
Result: Ok.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Cheroot 8.1.0
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Stressor scenario 'test_rw' finished.
Tag: 'n.a.'
Base URL: http://127.0.0.1:8082
Start: 2021-01-04 21:17:07
End: 2021-01-04 21:17:37
Run time 30.485 sec, net: 5:01.40 min.
Executed 2,960 activities in 750 sequences, using 10 parallel sessions.
Sequence duration: 0.041 sec average.
rate: 1,476 sequences per minute (per user: 147.6).
Activity rate: 97.1 activities per second (per user: 9.71).
3 monitored activities:
- /main/1/GetRequest(/private/test.html)
n: 730, min: 0.005002 sec, avg: 0.083 sec, max: 0.141 sec
- /main/2/PutRequest(/temp/wsgidav_test_file~$(session_id).txt)
n: 730, min: 0.012 sec, avg: 0.105 sec, max: 0.156 sec
- /main/3/GetRequest(/temp/wsgidav_test_file~$(session_id).txt)
n: 730, min: 0.011 sec, avg: 0.099 sec, max: 0.138 sec
Result: Ok.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Cheroot 8.0.0
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Stressor scenario 'test_rw' finished.
Tag: 'n.a.'
Base URL: http://127.0.0.1:8082
Start: 2021-01-04 21:13:26
End: 2021-01-04 21:13:56
Run time 30.281 sec, net: 4:58.11 min.
Executed 17,092 activities in 4,283 sequences, using 10 parallel sessions.
Sequence duration: 0.007070 sec average.
rate: 8,487 sequences per minute (per user: 848.7).
Activity rate: 564.4 activities per second (per user: 56.44).
3 monitored activities:
- /main/1/GetRequest(/private/test.html)
n: 4,263, min: 0.002996 sec, avg: 0.010 sec, max: 0.031 sec
- /main/2/PutRequest(/temp/wsgidav_test_file~$(session_id).txt)
n: 4,263, min: 0.004000 sec, avg: 0.018 sec, max: 0.058 sec
- /main/3/GetRequest(/temp/wsgidav_test_file~$(session_id).txt)
n: 4,263, min: 0.002999 sec, avg: 0.017 sec, max: 0.045 sec
Result: Ok.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
gevent 20.12.1
Compared to gevent as Wsgi-Server
$ pip install gevent
$ wsgidav --root tests/stressor/htdocs/ --host 127.0.0.1 --port 8082 --auth anonymous --no-config --server gevent -q
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Stressor scenario 'test_rw' finished.
Tag: 'n.a.'
Base URL: http://127.0.0.1:8082
Start: 2021-01-04 21:43:29
End: 2021-01-04 21:43:59
Run time 30.266 sec, net: 4:58.30 min.
Executed 15,420 activities in 3,865 sequences, using 10 parallel sessions.
Sequence duration: 0.007831 sec average.
rate: 7,662 sequences per minute (per user: 766.2).
Activity rate: 509.5 activities per second (per user: 50.95).
3 monitored activities:
- /main/1/GetRequest(/private/test.html)
n: 3,845, min: 0.005235 sec, avg: 0.018 sec, max: 0.033 sec
- /main/2/PutRequest(/temp/wsgidav_test_file~$(session_id).txt)
n: 3,845, min: 0.006213 sec, avg: 0.019 sec, max: 0.039 sec
- /main/3/GetRequest(/temp/wsgidav_test_file~$(session_id).txt)
n: 3,845, min: 0.002064 sec, avg: 0.019 sec, max: 0.036 sec
Result: Ok.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
@mar10 that log looks like you were testing the released versions but not this PR branch. Is that right?
@liamstask rebasing the PR should pull-in the pytest-related fixes should get more jobs green.
rebased
@mar10 that log looks like you were testing the released versions but not this PR branch. Is that right?
that's true. I just tried with
pip install git+https://github.com/cherrypy/cheroot.git@refs/pull/352/head
and now see errors (PUT timeouts after 0.5 seconds, which is a threshold I set in my test). I will look into it a bit more, when I find some time.
I removed the timeout check and re-run the test again (30 seconds, 10 parallel users).
With this PR I see that most requests seem to take ~0.5 seconds.
(There are no errors and some requests are fast, as seen in the 'min' column):

I searched cheroot for '0.5' and found server.expiration_interval = 0.5
I then re-run with server.expiration_interval = 0.7 and get this result.
Note that Max column is now at ~0.7, so there is a clear evidence that this setting is related:
For reference, here are the results with cheroot 8.5.1:

Do you think this setting should be a part of the public API? As in, should users be expected to tweak it per their needs similar to the amount threads?
hm, no - this is not expected behavior, as the selector should return immediately once an entry becomes ready to read.
will see if i can reproduce and understand what might be responsible.
@liamstask do you want to mark this PR as a Draft or do you want to work on a separate one?
Also, is this PR considered a feature or a bugfix? Depending on this, I may consider still publishing it under 8.x (it's still in sync with the master but 9.x is supposed to be Python 3-only).
marking as draft until we understand the nature of this issue - depending on what we find, it may make sense to open another PR.
and hm - i think we could probably consider this PR a feature, as the current behavior isn't a problem necessarily.
but i do wonder if we should see even lower latency than the 0.01 minimum in the numbers above if we were running as expected, since it should really just be limited by the time it takes to wake up from the selector, which should be shorter than that.
it should really just be limited by the time it takes to wake up from the selector, which should be shorter than that.
Hm... I'm curious if that's a side effect of the GIL allowing only one thread at a time.
i am not immediately able to reproduce. i've locally added the following to test_wsgi.py and run via tox -- cheroot/test/test_wsgi.py::test_response_latency:
def test_response_latency(simple_wsgi_server):
import statistics
from timeit import default_timer as timer
session = Session(base_url=simple_wsgi_server['url'])
pooled = requests.adapters.HTTPAdapter(
pool_connections=1, pool_maxsize=1000,
)
session.mount('http://', pooled)
def do_request():
start = timer()
resp = session.get('info')
resp.raise_for_status()
return (timer() - start)
with ThreadPoolExecutor(max_workers=10) as pool:
tasks = [
pool.submit(do_request)
for _ in range(1000)
]
durations = [task.result() for task in tasks]
print("done.")
print("min: {}".format(min(durations)))
print("mean: {}".format(statistics.mean(durations)))
print("max: {}".format(max(durations)))
print("stdev: {}".format(statistics.stdev(durations)))
print("variance: {}".format(statistics.variance(durations)))
assert False # dump stdout to console
and varied expiration_interval to see if the resulting durations appear to be related to that timeout, and at least from an initial unscientific take, they do not appear to be:
expiration_interval = 0.01
min: 0.013620100999999885
mean: 0.034936168017999994
max: 0.24456356999999995
stdev: 0.014986918771672504
variance: 0.00022460773426870968
expiration_interval = 0.5
min: 0.013251182999999944
mean: 0.033912222677999995
max: 0.23156637300000016
stdev: 0.013886721685052835
variance: 0.00019284103915811663
expiration_interval = 1.0
min: 0.01254098800000003
mean: 0.033798407693999985
max: 0.2403869270000003
stdev: 0.014740100437246145
variance: 0.000217270560900104
expiration_interval = 5.0
min: 0.01327589600000012
mean: 0.034817831576
max: 0.25336556600000026
stdev: 0.01470904346964532
variance: 0.00021635595979191567
expiration_interval = 10.0
min: 0.012837041000000049
mean: 0.034476956264000005
max: 0.23227233400000014
stdev: 0.014599182796090602
variance: 0.0002131361383136678
These all look fairly consistent, within a margin of jitter, regardless of the timeout, as would be expected.
This is on my local machine - macOS, python3 - I wonder if there could be some unexpected platform-specific Windows behavior in the selector module?
@mar10 are you able to run your test on any other platforms to get another data point?
@mar10 if it's not straightforward to run your previous tests on another platform, it would still be useful to run the quick test I posted above on your platform, if possible, to help understand where this divergence might be coming from.
@webknjaz if you have a moment to run the test i posted above, or recommend another strategy, that would be helpful too. thanks!
Here's what I get on my Gentoo Linux laptop (P1 Gen2) under Python 3.9:
# 0.5
min: 0.0061725289560854435
mean: 0.03574465919518843
max: 0.1078346953727305
stdev: 0.01949915910102327
variance: 0.0003802172056470187
# 0.01
min: 0.002142445184290409
mean: 0.0351595618231222
max: 0.1128864698112011
stdev: 0.01742951475130368
variance: 0.00030378798446591257
# 1.0
min: 0.005018966272473335
mean: 0.032176667262334376
max: 1.0674880309961736
stdev: 0.03516322005452696
variance: 0.0012364520446030871
# 5.0
min: 0.006421465892344713
mean: 0.03070156146818772
max: 0.08956986200064421
stdev: 0.01611755839335192
variance: 0.00025977568856310894
# 10.0
min: 0.00646551325917244
mean: 0.027719526661094278
max: 1.0390562019310892
stdev: 0.0465421165957302
variance: 0.002166168617210545
On an older macbook pro i don't see a performance drop like on windows
cheroot 8.5.1
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Stressor scenario 'test_rw' finished.
Tag: 'n.a.'
Base URL: http://127.0.0.1:8082
Start: 2021-01-19 19:46:46
End: 2021-01-19 19:47:16
Run time 30.281 sec, net: 4:58.38 min.
Executed 8,744 activities in 2,196 sequences, using 10 parallel sessions.
Sequence duration: 0.014 sec average.
rate: 4,351 sequences per minute (per user: 435.1).
Activity rate: 288.8 activities per second (per user: 28.88).
3 monitored activities:
- /main/1/GetRequest(/private/test.html)
n: 2,176, min: 0.012 sec, avg: 0.025 sec, max: 0.052 sec
- /main/2/PutRequest(/temp/wsgidav_test_file~$(session_id).txt)
n: 2,176, min: 0.011 sec, avg: 0.034 sec, max: 0.065 sec
- /main/3/GetRequest(/temp/wsgidav_test_file~$(session_id).txt)
n: 2,176, min: 0.007975 sec, avg: 0.030 sec, max: 0.072 sec
Result: Ok. β¨ π° β¨
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
cheroot 8.5.2.dev8+g2af88bb8
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Stressor scenario 'test_rw' finished.
Tag: 'n.a.'
Base URL: http://127.0.0.1:8082
Start: 2021-01-19 19:50:09
End: 2021-01-19 19:50:39
Run time 30.289 sec, net: 4:58.44 min.
Executed 8,544 activities in 2,146 sequences, using 10 parallel sessions.
Sequence duration: 0.014 sec average.
rate: 4,251 sequences per minute (per user: 425.1).
Activity rate: 282.1 activities per second (per user: 28.21).
3 monitored activities:
- /main/1/GetRequest(/private/test.html)
n: 2,126, min: 0.011 sec, avg: 0.026 sec, max: 0.070 sec
- /main/2/PutRequest(/temp/wsgidav_test_file~$(session_id).txt)
n: 2,126, min: 0.013 sec, avg: 0.035 sec, max: 0.083 sec
- /main/3/GetRequest(/temp/wsgidav_test_file~$(session_id).txt)
n: 2,126, min: 0.006162 sec, avg: 0.030 sec, max: 0.074 sec
Result: Ok. β¨ π° β¨
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
@mar10 thank you. 2 other thoughts/requests to test on Windows, if/when you have time:
- can you try replacing the
self.sock.settimeout(1)here withself.sock.setblocking(False)? - are you able to also run the test i posted up in https://github.com/cherrypy/cheroot/pull/352#issuecomment-757554330? just to isolate whether there's something meaningfully different in your test to the one that I've run
Sorry for being so unresponsive, I will try to address the this week...
As of #378 and https://github.com/cherrypy/cherrypy/issues/1908, hopefully we can finish this. Indeed select() should immediately return as fast as a connection is ready and the timeout only have an effect on idle situations: https://docs.python.org/3/library/selectors.html?highlight=selector#selectors.BaseSelector.select
So it would be good to know whether the issue that select() seems to wait for the timeout regardless is still present on the Windows system(s) it showed up before. I'm willing to try replicating it here, but not sure how @mar10 exactly did the tests?
Another test could be to add some debug time output before and after the select() call, to verify that it's really that one and not something in the expire method or elsewhere (although I cannot see any way this is the case).
So it would be good to know whether the issue that select() seems to wait for the timeout regardless is still present on the Windows system(s) it showed up before. I'm willing to try replicating it here, but not sure how @mar10 exactly did the tests?
I ran a test tool I wrote against a WebDAV server that uses Cheroot (among others) as WSGI server.
This is the test suite. I can help running it, or repeat the test myself if it helps.
Not sure how to use tox, it always misses some config file... I ran https://github.com/cherrypy/cheroot/pull/352#issuecomment-757554330 now via py -m pytest -k test_response_latency test_wsgi.py on current Windows 10 21H1 and Python 3.9.6:
Unmodified cheroot 8.5.2
min: 0.011346500000000148
mean: 0.0310407035
max: 0.10277959999999997
stdev: 0.01078381715002545
variance: 0.00011629071232518302
min: 0.011402799999999935
mean: 0.0309045372
max: 0.6013285
stdev: 0.02022699558124092
variance: 0.0004091313502435398
cheroot 8.5.2 with timeout=10
- ... never finishes in reasonable time
cheroot 8.5.2 with timeout=0.1
min: 0.012011800000000017
mean: 0.1067621261
max: 0.5968496999999999
stdev: 0.02901034300752099
variance: 0.0008416000014140222
min: 0.01785569999999992
mean: 0.10330534
max: 0.14574089999999984
stdev: 0.023048536605140002
variance: 0.0005312350396384786
- The error is there, no question. As this looks wrong to me, probably something to report at Python Windows guys?
- Same behaviour with
timeout=Noneor not definit it at all. It seems that a request simply does not triggerselect(). But since connections are possible, when using a timeout, they seem to be successfully registered with the selector.
Replaced self.socket.settimeout(1) with self.socket.setblocking(False) https://github.com/cherrypy/cheroot/pull/352#issuecomment-765137683
min: 0.009096400000000004
mean: 0.10749972850000002
max: 0.15891839999999924
stdev: 0.019692086392565155
variance: 0.00038777826649224974
min: 0.01853629999999984
mean: 0.1093117002
max: 0.6129125
stdev: 0.0229338039484658
variance: 0.0005259593635466656
- No difference.
Maybe this is related?
Probably someone has an idea how to narrow it down to the naked selector's select() method, with a single socket, a single select() call without timeout, then registering the socket and see whether select() returns or not.
I noticed that the CPU utilisation on a raspberry pi 1st generation drops from ~20% to ~10% with cheroot 8.5.2 (or maybe a little earlier), but only when settings 'server.thread_pool' : 1 and/or 'engine.autoreload.on' : False are set before quickstart. This effectively removes the original problem / bug I lodged in #1908 and taken over in #378. So something else must have changed in the meantime.....
It makes sense that the load is reduced when the server threads are reduced, but also 10% idle load is much too high and limiting the threads has negative impact on concurrent connection performance likely. With this commit is should be possible to reduce idle load to 1/5, i.e. something around 5% in your case, without affecting concurrent connection performance.
Autoreload is a CherryPy feature, not a cheroot one, and it is disabled by default, isn't it? However, it is another (unrelated) layer of idle load of course, when CherryPy regularly checks for file changes.
So while your settings changes are workarounds, it doesn't mean that cheroot performs well or that this PR looses its reason.
"It makes sense that the load it reduced when the server threads are reduced" - well, previously that never made any difference, so something else has happened in the meantime which practically removes the problem. Nevertheless, agreed that this PR is still worthwhile having.
Would it be an option to add a workaround for Windows, to either set a lower expiration_interval or a e.g. hardcoded timeout=0.1 only on Windows?
I currently do not find the time to nail it down to selector.select simply not returning when a new socket is ready as it should, but at least the way cheroot has it implemented, this is obviously the case. Would be nice to have a minimal example to replicate the issue independently of cheroot to have a chance this getting fixed or worked around upstream.
I don't use Windows so I can't comment on the suggested workaround.
Basically I mean to leave things for Windows as they are currently, until we find a better solution, but apply this PR/change to all other systems, where it clearly enhances things. Observing idle CPU load with and without this PR should help to build an opinion whether it's worth it on non-Windows systems, and request performance on Windows and non-Windows systems are well covered by the tests above.
I'm just not sure how to reliably check whether it's a Windows platform or not in Python?
I try to push this topic since the high CPU idle load on small SBCs is currently really a deal breaker for one of our projects, where we're forced to use CherryPy < 9.0 because of this, which has other downsides, obviously.
Understood and makes sense - and I don't use Windows, so fine for me.
To go on, I opened a new PR: #401
It's the same as this one, but caps the select() timeout to 0.05 seconds on Windows. Whether this is a sufficiently small cap or not can be discussed, but I think it is small enough to not be recognised by humans.
Maybe this is related?
Probably someone has an idea how to narrow it down to the naked selector's
select()method, with a single socket, a single select() call without timeout, then registering the socket and see whether select() returns or not.
FWIW since that has mentions of select([], [], [], timeout=xxx) on windows becoming time.sleep(timeout=xxx), assuming that it's the case, we could try having a counter in cheroot.connections._ThreadsafeSelector for register()/unregister() and based on that, skip the actual calls to the underlying select().