rocketmq icon indicating copy to clipboard operation
rocketmq copied to clipboard

[ISSUE 4902]fix-warmMappedFile:Place a safepoint before the loop, the processing effect takes precedence over sleep

Open isysc1 opened this issue 3 years ago • 18 comments

Place a safepoint before the loop, the processing effect takes precedence over sleep

isysc1 avatar Aug 26 '22 09:08 isysc1

而 fileSize 又是 int 类型,现在把 i 改为 long 类型完全没有意义

@Hccake filesize是不是int类型和jvm能不能在循环前放置安全点并没有什么关系,它看的就是循环的这个i是不是long类型的 参考这篇文章

sunheyi6 avatar Aug 27 '22 06:08 sunheyi6

而 fileSize 又是 int 类型,现在把 i 改为 long 类型完全没有意义

@Hccake filesize是不是int类型和jvm能不能在循环前放置安全点并没有什么关系,它看的就是循环的这个i是不是long类型的 参考这篇文章

受教,学习了

Hccake avatar Aug 27 '22 07:08 Hccake

The existing solution is to write one byte per page as warming. According to https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux_for_real_time/7/html/reference_guide/using_mlock_to_avoid_page_io, it looks sufficient to mlock;

Another thing, actual empty write is actually preferred for some reason, it would be better to iterate from file tail to head, as the head of the file region is more likely to be written with data in the near future.

lizhanhui avatar Aug 27 '22 12:08 lizhanhui

Codecov Report

Merging #4903 (8325414) into develop (c411713) will decrease coverage by 0.00%. The diff coverage is 0.00%.

@@              Coverage Diff              @@
##             develop    #4903      +/-   ##
=============================================
- Coverage      43.29%   43.28%   -0.01%     
+ Complexity      7688     7684       -4     
=============================================
  Files            991      991              
  Lines          68784    68776       -8     
  Branches        9112     9111       -1     
=============================================
- Hits           29779    29771       -8     
+ Misses         35263    35257       -6     
- Partials        3742     3748       +6     
Impacted Files Coverage Δ
...ache/rocketmq/store/logfile/DefaultMappedFile.java 49.67% <0.00%> (+1.26%) :arrow_up:
...a/org/apache/rocketmq/filter/util/BloomFilter.java 60.43% <0.00%> (-2.20%) :arrow_down:
.../apache/rocketmq/controller/ControllerManager.java 72.82% <0.00%> (-2.18%) :arrow_down:
...mq/store/ha/autoswitch/AutoSwitchHAConnection.java 73.29% <0.00%> (-1.64%) :arrow_down:
...client/consumer/store/RemoteBrokerOffsetStore.java 68.14% <0.00%> (-0.89%) :arrow_down:
...apache/rocketmq/store/queue/BatchConsumeQueue.java 69.50% <0.00%> (-0.63%) :arrow_down:
...rocketmq/broker/processor/PopMessageProcessor.java 37.63% <0.00%> (-0.54%) :arrow_down:
...he/rocketmq/client/impl/consumer/ProcessQueue.java 61.46% <0.00%> (-0.46%) :arrow_down:
...e/rocketmq/remoting/netty/NettyRemotingClient.java 43.73% <0.00%> (-0.44%) :arrow_down:
...ava/org/apache/rocketmq/filter/util/BitsArray.java 59.82% <0.00%> (ø)
... and 8 more

:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more

codecov-commenter avatar Aug 27 '22 12:08 codecov-commenter

The existing solution is to write one byte per page as warming. According to https://access.redhat.com/documentation/en-us/red_hat_enterprise_linux_for_real_time/7/html/reference_guide/using_mlock_to_avoid_page_io, it looks sufficient to mlock;

@lizhanhui checked some information, and some places said that mlock alone cannot guarantee that memory can be locked, because there may be copy-on-write problems. I'm not sure if this is the problem. refer to this article

sunheyi6 avatar Aug 28 '22 09:08 sunheyi6

@shy-share The article you posted may talk about anonymous memory regions. For file-backed memory area, mlock should be the behavior discussed in the redhat doc.

lizhanhui avatar Aug 29 '22 09:08 lizhanhui

why the comment of sleep call is 'prevent gc'? In fact it's helping gc, right?

Knowden avatar Aug 30 '22 02:08 Knowden

why the comment of sleep call is 'prevent gc'? In fact it's helping gc, right?

@Knowden In fact, it can't prevent gc, it main purpose is to increase gc frequency and reduce single gc time. refer to why-thread-sleep0-can-prevent-gc-in-rocketmq

sunheyi6 avatar Aug 30 '22 08:08 sunheyi6

Would it be better to place safepoint if we had GC problems in the current situation ?

Oliverwqcwrw avatar Sep 01 '22 00:09 Oliverwqcwrw

@Oliverwqcwrw Why not perform an A-B test between the current solution and the proposed one, let data talk.

lizhanhui avatar Sep 01 '22 01:09 lizhanhui

@Oliverwqcwrw Why not perform an A-B test between the current solution and the proposed one, let data talk.

agree with, The data makes it easier for us to choose specific solutions

Oliverwqcwrw avatar Sep 01 '22 01:09 Oliverwqcwrw

@lizhanhui @Oliverwqcwrw This article has done a simple test, it seems that there is no big difference, and it is only limited to versions below jdk8, this problem has been optimized in jdk10

sunheyi6 avatar Sep 07 '22 01:09 sunheyi6

@shy-share Pretty good analysis. Well done.

lizhanhui avatar Sep 07 '22 02:09 lizhanhui

@shy-share the article seems deprecated https://stackoverflow.com/questions/72753599/counted-uncounted-loops-and-safepoints-is-while-i-someint-considered-u loop with a long variable becomes COUNTED after JDK16.

lizhanhui avatar Sep 07 '22 02:09 lizhanhui

@shy-share the article seems deprecated https://stackoverflow.com/questions/72753599/counted-uncounted-loops-and-safepoints-is-while-i-someint-considered-u loop with a long variable becomes COUNTED after JDK16.

@lizhanhui ok,so I think it is still necessary to modify this code. For jdk8, there does not seem to be a big performance gap after modification, but in order to be compatible with subsequent versions of jdk, it seems better to do so

sunheyi6 avatar Sep 07 '22 02:09 sunheyi6

@lizhanhui @Oliverwqcwrw

i write a benchmark code,refer to this

sunheyi6 avatar Sep 10 '22 00:09 sunheyi6

@Oliverwqcwrw Either Uncounted Loop or Sleep(0) is fine, IMO. Be sure to add proper documentation, explaining the purpose of sleep(0) or long-uncounted loop.

lizhanhui avatar Sep 20 '22 02:09 lizhanhui

@Oliverwqcwrw Either Uncounted Loop or Sleep(0) is fine, IMO. Be sure to add proper documentation, explaining the purpose of sleep(0) or long-uncounted loop.

agree with :)

Oliverwqcwrw avatar Sep 20 '22 03:09 Oliverwqcwrw

实际上,这里的改动无效:for循环里有方法调用,就不存在这个问题。方法调用栈退出的地方,是可以加check safepoint的。

kimmking avatar Mar 05 '23 11:03 kimmking

for循环退不出的场景,也可以通过-XX:+UseCountedLoopSafepoints解决,不需要改代码。

kimmking avatar Mar 05 '23 11:03 kimmking

实际上,这里的改动无效:for循环里有方法调用,就不存在这个问题。方法调用栈退出的地方,是可以加check safepoint的。

方法可能会被JIT编译器优化,从而导致无法正确识别safepoint

sunheyi6 avatar Mar 09 '23 02:03 sunheyi6

for循环退不出的场景,也可以通过-XX:+UseCountedLoopSafepoints解决,不需要改代码。

一般很少修改jdk参数,如果线上用

sunheyi6 avatar Mar 09 '23 02:03 sunheyi6

循环体内不是已经有方法调用了嘛: byteBuffer.put((int) i, (byte) 0); 难道这个方法调用不能当做“安全点”? 我猜测不能当安全点的原因是:JIT在编译byteBuffer.put((int) i, (byte) 0); 这段代码的时候,会进行内联操作?

yinyinnie avatar Oct 06 '23 02:10 yinyinnie

还有一个疑问,我理解无论是 1. Thread.sleep 2. 还是把int 换成long, 都是为了手动在for循环体内放置一个安全点。 为什么手动放置一个安全点就达到优化性能的效果了呢?

原因难道是这样: 1、有安全点的话,虚拟机通过“短暂的停顿”,就能枚举需要被回收的对象,所有“短暂的停顿”所累积的时间记为t1 2、如果不放置安全点,可能会导致full GC, 将GC的时间记为t2 t1 远小于 t2?

如果原因真是这样,想请问 t1 和 t2之间的量级差多少?

yinyinnie avatar Oct 06 '23 02:10 yinyinnie