Add an option to avoid OutOfDirectMemoryError for AlluxioFuse

Open secfree opened this issue 2 years ago • 6 comments

What changes are proposed in this pull request?

Add an option to avoid OutOfDirectMemoryError for AlluxioFuse.

Why are the changes needed?

Fix https://github.com/Alluxio/alluxio/issues/16094

Does this PR introduce any user facing changes?

No

secfree avatar Aug 25 '22 04:08 secfree

Automated checks report:

  • Commits associated with Github account: PASS
  • PR title follows the conventions: FAIL
    • The title of the PR does not pass all the checks. Please fix the following issues:
      • First word of title ("Doc") is not an imperative verb. Please use one of the valid words

Some checks failed. Please fix the reported issues and reply 'alluxio-bot, check this please' to re-run checks.

alluxio-bot avatar Aug 25 '22 04:08 alluxio-bot

Automated checks report:

  • Commits associated with Github account: PASS
  • PR title follows the conventions: PASS

All checks passed!

alluxio-bot avatar Aug 25 '22 04:08 alluxio-bot

Do you have any hard evidence that this option fixes the issue? How did you test and verify it?

yyongycy avatar Aug 25 '22 07:08 yyongycy

Do you have any hard evidence that this option fixes the issue? How did you test and verify it?

I tested it as follows:

  1. Mount alluxio-fuse with and without the property -Dio.netty.noPreferDirect=true, keeping all other parameters the same
  2. Use the same number of processes to read from the alluxio-fuse mount path

The results:

  • Without -Dio.netty.noPreferDirect=true, there are many OutOfDirectMemoryError occurrences
  • With -Dio.netty.noPreferDirect=true, there is no OutOfDirectMemoryError

I was able to reproduce it repeatedly.

alluxio-fuse uses Netty, and Netty manages its direct memory pool itself. With -Dio.netty.noPreferDirect=true, Netty allocates heap buffers instead of preferring direct buffers. Heap memory is managed by the GC, and the GC releases it faster.
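
To make the heap-versus-direct distinction concrete, here is a minimal sketch that uses only the JDK's `ByteBuffer`, not Netty's allocator. The class `BufferKindSketch`, the helper `allocate`, and its `noPreferDirect` parameter are hypothetical names for illustration; they mirror the idea behind the flag and are not Netty's actual code path.

```java
import java.nio.ByteBuffer;

class BufferKindSketch {
    // Hypothetical helper: picks a buffer kind the way a "prefer heap" switch would.
    // This is an illustration only, not how Netty implements io.netty.noPreferDirect.
    static ByteBuffer allocate(boolean noPreferDirect, int capacity) {
        if (noPreferDirect) {
            // Heap buffer: backed by a byte[] on the Java heap, reclaimed by the GC.
            return ByteBuffer.allocate(capacity);
        }
        // Direct buffer: off-heap memory; in the JDK it is limited by -XX:MaxDirectMemorySize.
        return ByteBuffer.allocateDirect(capacity);
    }

    public static void main(String[] args) {
        ByteBuffer heap = allocate(true, 1024);
        ByteBuffer direct = allocate(false, 1024);
        System.out.println("heap buffer isDirect:   " + heap.isDirect());
        System.out.println("direct buffer isDirect: " + direct.isDirect());
        System.out.println("heap buffer hasArray:   " + heap.hasArray());
    }
}
```

A heap buffer never counts against the direct memory limit, which is consistent with the error disappearing in the test above once heap buffers are used.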

secfree avatar Aug 25 '22 07:08 secfree

Any side effects? Any other options? Wondering in what situation it is triggered?

yyongycy avatar Aug 25 '22 07:08 yyongycy

Any side effects? Any other options?

The side effect is that, without direct memory, zero copy is disabled: the data has to be copied one more time between user space and kernel space.

Wondering in what situation it is triggered?

Background of my case: the platform has to do an alluxio-fuse mount for each user's job inside that job's pod. Different jobs run with different degrees of parallelism, so it is difficult to pick one fixed value for MaxDirectMemorySize that suits all of them.
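
When a fixed MaxDirectMemorySize is hard to choose, one way to gather data is to observe how much direct memory a process actually holds. The sketch below reads the JDK's standard `BufferPoolMXBean` for the "direct" pool; the class `DirectPoolProbe` and its method are hypothetical names. One caveat, stated as an assumption: direct buffers that Netty allocates through its own no-cleaner path may not show up in this pool, so the number is a lower bound rather than a full accounting.

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;

class DirectPoolProbe {
    // Returns the bytes currently used by the JDK "direct" buffer pool,
    // or -1 if the JVM does not expose such a pool.
    static long directBytesUsed() {
        for (BufferPoolMXBean pool : ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class)) {
            if ("direct".equals(pool.getName())) {
                return pool.getMemoryUsed();
            }
        }
        return -1L;
    }

    public static void main(String[] args) {
        long before = directBytesUsed();
        // Allocate 1 MiB off-heap and keep it reachable while measuring.
        ByteBuffer held = ByteBuffer.allocateDirect(1 << 20);
        long after = directBytesUsed();
        System.out.println("direct pool before: " + before + " bytes");
        System.out.println("direct pool after:  " + after + " bytes");
        System.out.println("buffer capacity:    " + held.capacity() + " bytes");
    }
}
```

Sampling a value like this per pod under a realistic read load would give a rough basis for sizing, though it does not remove the underlying difficulty that the right limit varies with each job's parallelism.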

secfree avatar Aug 25 '22 07:08 secfree

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Feb 02 '23 15:02 github-actions[bot]

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in two weeks if no further activity occurs. Thank you for your contributions.

github-actions[bot] avatar Jun 16 '23 15:06 github-actions[bot]