datafaker icon indicating copy to clipboard operation
datafaker copied to clipboard

#1306 add test that reproduces the flaky problem with CopyOnWriteMap

Open asolntsev opened this issue 1 year ago • 1 comments

it's a simplified version of what happens with FakeValuesService.MAP_OF_METHOD_AND_COERCED_ARGS.

@snuyanzin Can you help with fixing CopyOnWriteMap?

asolntsev avatar Jul 21 '24 11:07 asolntsev

PR Summary

  • New Test Class Added A new test class named CopyOnWriteMapTest.java has been added. The primary function of this class is to thoroughly check and determine the performance and reliability of multi-threaded, simultaneous data access in a specific object class called CopyOnWriteMap.

  • Introduction of concurrentPutAndGet() Method This method is a part of the new test class. It is specifically designed to carry out 'put' and 'get' actions on the CopyOnWriteMap object in a concurrent environment. Put simply, it tests if multiple threads can simultaneously and safely access and modify the data within this object type.

The overall goal of these changes is to ensure that our CopyOnWriteMap operates robustly and correctly under situations where multiple processes try to access and alter data at once.

what-the-diff[bot] avatar Jul 21 '24 11:07 what-the-diff[bot]

Hi @asolntsev , how about getting rid of the COWMap? I just did a performance test with Spring Boot native, and I did about 300.000 requests, and in my "production" code I'm getting nullpointers. For some reason, I got more NPEs in GraalVM than while using a JVM. If you want, you can easily reproduce it here by running the k6 performance test: https://github.com/datafaker-net/datafaker-native-demo

bodiam avatar Oct 13 '24 01:10 bodiam

Wasn't the COWMap originally introduced to address performance issues?

kingthorin avatar Oct 13 '24 02:10 kingthorin

@kingthorin I believe the idea is that it has some benefits regarding performance, but at the cost that the library no longer works reliably.

bodiam avatar Oct 13 '24 04:10 bodiam

how about getting rid of the COWMap?

Yes, I like the idea. I even tried to do it in https://github.com/datafaker-net/datafaker/pull/1203

The argument against it was that COWMap is faster. But I never checked this statement by myself. And I don't think this speed is really important.

asolntsev avatar Oct 13 '24 13:10 asolntsev

Here is where it was introduced (I believe), it has further details.

https://github.com/datafaker-net/datafaker/pull/770

kingthorin avatar Oct 13 '24 13:10 kingthorin

Will using a synchronized map alleviate the issue? I know that might cause a slight performance hit but hopefully it still facilitates the original case.

Edit: or ConcurrentHashMap

kingthorin avatar Oct 13 '24 14:10 kingthorin

@kingthorin @bodiam I investigated the problem a bit more.

Conclusion: the problem is not in CopyOnWriteMap, but it's used.

When we have map of maps, then we often call get, then if != null and put. This sequence doesn't work well in concurrent execution by multiple threads. Instead, we should use computeIfAbsent.

I submitted PR https://github.com/datafaker-net/datafaker/pull/1381 that should fix our flaky problem.

asolntsev avatar Oct 16 '24 07:10 asolntsev

The problem was fixed in PR https://github.com/datafaker-net/datafaker/pull/1381. This PR is not needed anymore.

asolntsev avatar Oct 16 '24 20:10 asolntsev