Use highest thread priority in linux

Open ntovas opened this issue 2 years ago • 12 comments

Trying to fix #49317

I believe that, because it is not clearly communicated that the GC does not run at high priority in environments other than Windows, a lot of teams have ended up trying to debug issues caused by this.

The same facts as in https://github.com/dotnet/runtime/pull/89682 apply for the author.

ntovas avatar Aug 02 '23 12:08 ntovas

Tagging subscribers to this area: @dotnet/gc See info in area-owners.md if you want to be subscribed.

Author: ntovas
Assignees: -
Labels:

area-GC-coreclr

Milestone: -

ghost avatar Aug 02 '23 12:08 ghost

Hello @ntovas, thanks for your contribution here. Have you validated whether this change actually has any material impact on memory utilization in your (or other) scenarios?

mangod9 avatar Aug 02 '23 20:08 mangod9

Hello @mangod9, it is based on the assumption that in a very thread-hungry app, like ours, a higher priority will help the GC complete its work faster, which is why we didn't face similar problems on Windows. We are trying to find a way to build a runtime with these changes that we can deploy in a traffic-heavy environment, because it is very hard for us to fully reproduce our case with mocked data/traffic. From my understanding, it may be necessary to reduce the number of heaps to fewer than the core count, but we are ready to test many configurations if they seem like plausible improvements.

ntovas avatar Aug 02 '23 20:08 ntovas

Cool, sounds good. Thanks for the due diligence.

mangod9 avatar Aug 02 '23 21:08 mangod9

@ntovas you can easily build a libcoreclr.so with your change and do a drop in replacement in your environment to verify its effect. Here is how to do that:

  • Find the commit hash of the libcoreclr.so you are currently using in production by running strings /path/to/libcoreclr.so | grep "@(#)". This will print out the commit hash.
  • Check out dotnet/runtime repo at the hash you've found
  • In the root of the repo, run ./build.sh clr+libs -c Release -rc Release
  • The libcoreclr.so to use is in artifacts/bin/coreclr/linux.x64.Release/libcoreclr.so. You can just copy it over the one that you have on your production systems. Since it was built from the same commit, it should just work.

Note: You need to build it on the same distro that you are using in production, or on a distro based on the same libc (glibc or MUSL) with a version that is the same as or older than the one on your production distro. On glibc-based distros, you can run ldd --version to find out the glibc version. Please let me know if you need any additional guidance with the build.

janvorli avatar Aug 02 '23 22:08 janvorli

As a note, since we do not check the return value of calling GCToOSInterface::BoostThreadPriority, it would be worthwhile making sure the priority is indeed changed :)
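
For anyone who wants to run such a check in a dummy app, here is a minimal sketch in C. It is not the change in this PR and not how the runtime implements GCToOSInterface::BoostThreadPriority. The helper name try_boost_current_thread is hypothetical, and the choice of SCHED_FIFO at its maximum priority is an assumption made only for illustration. The helper asks for the boost with pthread_setschedparam, checks the return value, and then reads the policy back from the kernel with sched_getscheduler(0) and sched_getparam(0, ...), which on Linux refer to the calling thread.

```c
#include <errno.h>
#include <pthread.h>
#include <sched.h>
#include <string.h>

/*
 * Hypothetical helper, not part of dotnet/runtime.
 * Tries to move the calling thread to SCHED_FIFO at the maximum priority
 * (an assumed policy, chosen only for illustration) and verifies the result.
 *
 * Returns:
 *    0   the boost was applied and confirmed by the kernel
 *   >0   an errno value; the request (or the read-back) failed
 *   -1   the call reported success but the kernel shows different settings
 */
static int try_boost_current_thread(void)
{
    int max_priority = sched_get_priority_max(SCHED_FIFO);
    if (max_priority == -1)
        return errno;

    struct sched_param requested;
    memset(&requested, 0, sizeof(requested));
    requested.sched_priority = max_priority;

    /* pthread_setschedparam returns the error number directly (not via errno). */
    int rc = pthread_setschedparam(pthread_self(), SCHED_FIFO, &requested);
    if (rc != 0)
        return rc;

    /* Read the settings back from the kernel; on Linux, pid 0 means the calling thread. */
    int policy = sched_getscheduler(0);
    if (policy == -1)
        return errno;

    struct sched_param actual;
    if (sched_getparam(0, &actual) == -1)
        return errno;

    if (policy != SCHED_FIFO || actual.sched_priority != max_priority)
        return -1;

    return 0;
}
```

Calling this at the start of a test thread and logging the result (for example with strerror for positive values) should show whether the boost really took effect. On Linux, switching to a real-time policy generally needs the CAP_SYS_NICE capability or a nonzero RLIMIT_RTPRIO limit, so without them I would expect EPERM rather than a silent success.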

Maoni0 avatar Aug 02 '23 23:08 Maoni0

Hello, @janvorli thanks for the info, I will try it. :)

@Maoni0 yes we will check it first with a dummy app in our environments, and then with changes in runtime, thank you. :)

ntovas avatar Aug 03 '23 07:08 ntovas

Hello @ntovas, just checking in whether you were able to validate the change to ensure it has the desired impact?

mangod9 avatar Sep 11 '23 15:09 mangod9

Hello @ntovas, checking in again on whether you were able to validate perf improvement with this change?

mangod9 avatar Oct 09 '23 15:10 mangod9

@mangod9 Hello, I have done some initial tests, and it seems that in specific use cases it improves how the GC works, but in containerized environments (docker, openshift) it seems that by default the app will not have the privileges to change the thread scheduler/priority.

ntovas avatar Oct 10 '23 07:10 ntovas

Perhaps it's not worth taking the change then?

mangod9 avatar Nov 06 '23 18:11 mangod9

This pull request has been automatically marked no-recent-activity because it has not had any activity for 14 days. It will be closed if no further activity occurs within 14 more days. Any new comment (by anyone, not necessarily the author) will remove no-recent-activity.

ghost avatar Feb 12 '24 21:02 ghost

This pull request will now be closed since it had been marked no-recent-activity but received no further activity in the past 14 days. It is still possible to reopen or comment on the pull request, but please note that it will be locked if it remains inactive for another 30 days.

ghost avatar Feb 27 '24 00:02 ghost