tinkerpop
tinkerpop copied to clipboard
TINKERPOP-2837 Fix NPE caused by a thread safe issue
The current annotations object created by Collections.synchronizedMap() will be shared across multiple threads, and some methods which access this object do not apply lock (synchronized), as a result, NPE occurs. This patch fixed this issue.
Codecov Report
Merging #1908 (ea83329) into 3.5-dev (341793e) will decrease coverage by
5.17%
. The diff coverage isn/a
.
@@ Coverage Diff @@
## 3.5-dev #1908 +/- ##
=============================================
- Coverage 69.32% 64.16% -5.17%
=============================================
Files 865 25 -840
Lines 41037 3750 -37287
Branches 5407 0 -5407
=============================================
- Hits 28450 2406 -26044
+ Misses 10667 1166 -9501
+ Partials 1920 178 -1742
see 852 files with indirect coverage changes
:mega: We’re building smart automated test selection to slash your CI/CD build times. Learn more
@ministat Change LGTM, however is it possible to create a test case that breaks before your change and works after?
It is not easy to reproduce this thread safe issue in a UT, anyway let me try.
The UT cannot reproduce this issue. Only the production env can see it. If the UT is a must, I suggest abondoning this PR.
sorry - lost track of this one. i don't know that a unit test was a must. lyndon just asked for one. i'm curious about performance implications. anyway, i don' think you needed to close it if you were having a problem.
@spmallette What performance implications do you mean here? Do you mean the performance comparison of Collections.synchronizedMap and ConcurrentHashMap? If yes, https://www.baeldung.com/java-synchronizedmap-vs-concurrenthashmap already gives the result.
looking at this further reveals that annotations will lose their order. i assume providers inserting annotations would want to preserve the order they insert things as it may make a difference in presentation to the user. note that a test fails in 3.6-dev that isn't here on 3.5-dev as a result of this change. i think we'd want to find a fix that doesn't alter this behavior somehow.
Hi @ministat, do you have any plans to come back to this one? Thanks!
@xiazcy I cannot reproduce the issue through UT. This only occurs in my production environment. So, If a UT is a must, I do not want to continue invest on this.
at this point i think the issue is less with unit tests and more with my comment here:
https://github.com/apache/tinkerpop/pull/1908#issuecomment-1425880289
@spmallette What about applying concurrentlinkedhashmap? Maybe this one: https://github.com/ben-manes/concurrentlinkedhashmap which suggests using caffeine cache. Anyway, let us check whether the concurrentlinkedhashmap can pass the failed case in 3.6-dev.
thanks for looking into this further. unfortunately, it's creating more questions. Not sure how others feel but I'm hesitant to include another dependency here to solve this, especially a deprecated one that is not maintained anymore. it's also changing gryo which would make server backward incompatible with drivers, e.g. 3.5.6 server would stop working with a 3.5.5 driver. gryo is deprecated i suppose but i'd say it remains a concern in my mind.
Given those issues, I'd like this approach to be a last resort and even then I'd consider relegating it to a newer version that doesn't break compatibility. Do you feel like we have reached a position of last resort at this point? If so, could you share other approaches you've tried that didn't really work well?
thanks for your patience on this - sometimes the smallest bugs can be a nuisance to fix and generate a lot of discussion.
@spmallette Thanks for your attention on this. Well, for this issue, I did not find other approaches. I have temporarily applied ConcurrentHashMap in my local environment even though it sacrisfied the insertion order as you pointed. I agreed to archive this issue before we have a good solution.
It looks like we haven't made progress on this issue for a while. I would suggest that we close it for now until someone has a suggestion that satisfies @spmallette concerns.
If there aren't any objections in the next 72 hours then I'll close this. Again, feel free to open it again once another solution has been proposed.