Use highest thread priority in linux
Trying to fix #49317
I believe the fact that it is not clearly communicated that the GC on platforms other than Windows doesn't run at high priority has led a lot of teams to spend time debugging issues caused by it.
The same facts as in https://github.com/dotnet/runtime/pull/89682 apply for the author.
| Author: | ntovas |
|---|---|
| Assignees: | - |
| Labels: | |
| Milestone: | - |
Hello @ntovas, thanks for your contribution here. Have you validated whether this change actually has any material impact on memory utilization in your (or other) scenarios?
Hello @mangod9, it is based on the assumption that in a very thread-hungry app, like ours, priority will help the GC complete its tasks faster, which is why we didn't face similar problems on Windows. We are trying to find a way to build a runtime with these changes that we can deploy in a traffic-heavy environment, because it is very hard for us to fully reproduce our case with mocked data/traffic. From my understanding it may be necessary to reduce the heap count to a number lower than the core count, but we are ready to test a lot of configurations if they seem like plausible improvements.
cool sounds good. Thanks for the due diligence.
@ntovas you can easily build a libcoreclr.so with your change and do a drop in replacement in your environment to verify its effect. Here is how to do that:
- Find the commit hash of the libcoreclr.so you are currently using in production by running `strings /path/to/libcoreclr.so | grep "@(#)"`. This will print out the commit hash.
- Check out the dotnet/runtime repo at the hash you've found.
- In the root of the repo, run `./build.sh clr+libs -c Release -rc Release`.
- The libcoreclr.so to use is in artifacts/bin/coreclr/linux.x64.Release/libcoreclr.so. You can just copy it over the one that you have on your production systems; since it was built from the same commit, it should just work.
Note: You need to build it on a distro that is the same as the one you are using in production, or on a distro based on the same libc (glibc or musl) with a version that's the same as or lower than the one on your production distro. On glibc-based distros, you can run `ldd --version` to find out the glibc version.
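The libc version rule in the note above can be checked mechanically. The following is only a sketch: the helper names `parse_version` and `build_host_is_compatible` are made up for illustration, and the version strings would come from running `ldd --version` (or the standard `platform.libc_ver()`) on each machine.

```python
import platform


def parse_version(text):
    """Turn a dotted version such as '2.31' into (2, 31) for numeric comparison."""
    return tuple(int(part) for part in text.split("."))


def build_host_is_compatible(build_libc_version, production_libc_version):
    """Return True if the build host's libc is the same as or older than production's.

    Hypothetical helper: it only encodes the "same or lower version" rule and
    assumes both machines use the same libc family (glibc or musl).
    """
    return parse_version(build_libc_version) <= parse_version(production_libc_version)


if __name__ == "__main__":
    # platform.libc_ver() reports e.g. ('glibc', '2.31') on glibc systems;
    # on other libcs it may return empty strings.
    print(platform.libc_ver())
```

Comparing parsed tuples rather than raw strings matters because "2.9" sorts after "2.10" as text but is the older version.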
Please let me know if you need any additional guidance with the build.
as a note, since we do not check the return value of calling GCToOSInterface::BoostThreadPriority it would be worthwhile making sure the priority is indeed changed :)
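A small dummy app can confirm whether a priority change really takes effect in a given environment before testing a modified runtime. The sketch below is not the runtime's code and does not call GCToOSInterface::BoostThreadPriority; it assumes, for illustration only, that the boost is expressed as a nice value set through `setpriority(2)`, and the function name `try_set_nice` is hypothetical. Which mechanism the PR actually uses is not stated in this thread.

```python
import os


def try_set_nice(target_nice):
    """Try to set the caller's nice value and report what actually happened.

    Returns (succeeded, nice_before, nice_after). The value is read back
    afterwards instead of trusting the call, which mirrors the advice to
    verify that the priority really changed.
    """
    before = os.getpriority(os.PRIO_PROCESS, 0)
    try:
        os.setpriority(os.PRIO_PROCESS, 0, target_nice)
    except OSError:
        # Typically a PermissionError: lowering the nice value below the
        # current one is refused for unprivileged processes.
        pass
    after = os.getpriority(os.PRIO_PROCESS, 0)
    return after == target_nice, before, after


if __name__ == "__main__":
    # -20 is the highest priority a nice value can express on Linux.
    print(try_set_nice(-20))
```

Running it inside the target container and on a bare host makes the difference visible: where the call is refused, `succeeded` is False and the nice value is unchanged.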
Hello, @janvorli thanks for the info, I will try it. :)
@Maoni0 yes we will check it first with a dummy app in our environments, and then with changes in runtime, thank you. :)
Hello @ntovas, just checking in whether you were able to validate the change to ensure it has the desired impact?
Hello @ntovas, checking in again on whether you were able to validate perf improvement with this change?
@mangod9 Hello, I have done some initial tests and it seems that in specific use cases it improves how the GC works, but in containerized environments (Docker, OpenShift) it seems that by default the app will not have the privileges to change the thread scheduler/priority.
Perhaps it's not worth taking the change then?
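On Linux, raising scheduling priority generally requires the CAP_SYS_NICE capability, so one way to see whether a container would allow it is to inspect the process's effective capability mask. This is a sketch under that assumption; the function name `has_cap_sys_nice` is hypothetical, and the thread does not say which exact permission check failed in the author's tests.

```python
CAP_SYS_NICE = 23  # capability bit index as defined in linux/capability.h


def has_cap_sys_nice(status_text):
    """Return True if the CapEff mask in /proc/<pid>/status text has CAP_SYS_NICE set."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            mask = int(line.split()[1], 16)  # the mask is printed as hex
            return bool(mask & (1 << CAP_SYS_NICE))
    # No CapEff line found: treat the capability as absent.
    return False


if __name__ == "__main__":
    with open("/proc/self/status") as status_file:
        print(has_cap_sys_nice(status_file.read()))
```

If this prints False inside the container, the capability would have to be granted by the container configuration (for Docker, for example, with `--cap-add SYS_NICE`) before any priority boost could succeed.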
This pull request has been automatically marked no-recent-activity because it has not had any activity for 14 days. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will remove no-recent-activity.
This pull request will now be closed since it had been marked no-recent-activity but received no further activity in the past 14 days. It is still possible to reopen or comment on the pull request, but please note that it will be locked if it remains inactive for another 30 days.