jfx icon indicating copy to clipboard operation
jfx copied to clipboard

8252936: Optimize removal of listeners from ExpressionHelper.Generic

Open dannygonzalez opened this issue 5 years ago • 68 comments

https://bugs.openjdk.java.net/browse/JDK-8185886

Optimisation to ExpressionHelper.Generic class to use Sets rather than Arrays to improve listener removal speed.

This was due to the removal of listeners in TableView taking up to 50% of CPU time on the JavaFX Application thread when scrolling/adding rows to TableViews.

This may alleviate some of the issues seen here:

TableView has a horrific performance with many columns #409 https://github.com/javafxports/openjdk-jfx/issues/409#event-2206515033

JDK-8088394 : Huge memory consumption in TableView with too many columns JDK-8166956: JavaFX TreeTableView slow scroll performance JDK-8185887: TableRowSkinBase fails to correctly virtualise cells in horizontal direction

OpenJFX mailing list thread: TableView slow vertical scrolling with 300+ columns https://mail.openjdk.java.net/pipermail/openjfx-dev/2020-January/024780.html


Progress

  • [x] Change must not contain extraneous whitespace
  • [x] Commit message must refer to an issue
  • [ ] Change must be properly reviewed

Issue

  • JDK-8252936: Optimize removal of listeners from ExpressionHelper.Generic

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.java.net/jfx pull/108/head:pull/108
$ git checkout pull/108

Update a local copy of the PR:
$ git checkout pull/108
$ git pull https://git.openjdk.java.net/jfx pull/108/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 108

View PR using the GUI difftool:
$ git pr show -t 108

Using diff file

Download this PR as a diff file:
https://git.openjdk.java.net/jfx/pull/108.diff

dannygonzalez avatar Feb 06 '20 16:02 dannygonzalez

Hi dannygonzalez, welcome to this OpenJDK project and thanks for contributing!

We do not recognize you as Contributor and need to ensure you have signed the Oracle Contributor Agreement (OCA). If you have not signed the OCA, please follow the instructions. Please fill in your GitHub username in the "Username" field of the application. Once you have signed the OCA, please let us know by writing /signed in a comment in this pull request.

If you already are an OpenJDK Author, Committer or Reviewer, please click here to open a new issue so that we can record that fact. Please use "Add GitHub user dannygonzalez" as summary for the issue.

If you are contributing this work on behalf of your employer and your employer has signed the OCA, please let us know by writing /covered in a comment in this pull request.

bridgekeeper[bot] avatar Feb 06 '20 16:02 bridgekeeper[bot]

/signed

I have signed the Oracle Contributor Agreement today. Have not received back any confirmation yet though.

dannygonzalez avatar Feb 06 '20 16:02 dannygonzalez

Thank you! Please allow for up to two weeks to process your OCA, although it is usually done within one to two business days. Also, please note that pull requests that are pending an OCA check will not usually be evaluated, so your patience is appreciated!

bridgekeeper[bot] avatar Feb 06 '20 16:02 bridgekeeper[bot]

hmm ... wouldn't the change violate spec of adding listeners:

If the same listener is added more than once, then it will be notified more than once.

kleopatra avatar Feb 12 '20 11:02 kleopatra

hmm ... wouldn't the change violate spec of adding listeners:

If the same listener is added more than once, then it will be notified more than once.

True, I hadn't read that spec in ObservableValueBase. Although that does seem odd behaviour to me. Obviously as the original implementation was using an array I can see how the implementation drove that specification.

Non of the JavaFx unit tests test for that specific case as the unit tests all passed. It would be nice if there was a specific test case for this behaviour.

I would need to store a registration count for each listener to satisfy this requirement.

dannygonzalez avatar Feb 12 '20 12:02 dannygonzalez

Although that does seem odd behaviour to me. Obviously as the original implementation was using an array I can see how the implementation drove that specification.

whatever drove it (had been so since the beginning ot java desktop, at least since the days of swing), there is no way to change it, is it?

Non of the JavaFx unit tests test for that specific case as the unit tests all passed. It would be nice if there was a specific test case for this behaviour.

yeah, the test coverage is ... not optimal :)

I would need to store a registration count for each listener to satisfy this requirement.

a count plus some marker as to where it was added:

addListener(firstL) addListener(secondL) addListener(firstL)

must result in firstL.invalidated, seconL.invalidated, firstL.invalidated .. which brings us back to .. an array?

kleopatra avatar Feb 12 '20 12:02 kleopatra

The listeners are called back in the order they were registered in my implementation although I didn’t see this requirement in the spec unless I missed something.

The performance unregistering thousands of listeners by TableView from the array is killing the performance of our JavaFX application. It takes up to 60% of JavaFX Application thread CPU and there are various bug reports around this same TableView performance issue. The set implementation has lowered this to below 1%.

This pull request may be too fundamental a change to go into mainline. We however have to build our local OpenJFX with this fix or our application is unusable.

I’m happy to receive any suggestions.

Danny

On 12 Feb 2020, at 12:22, Jeanette Winzenburg <[email protected]mailto:[email protected]> wrote:

Although that does seem odd behaviour to me. Obviously as the original implementation was using an array I can see how the implementation drove that specification.

whatever drove it (had been so since the beginning ot java desktop, at least since the days of swing), there is no way to change it, is it?

Non of the JavaFx unit tests test for that specific case as the unit tests all passed. It would be nice if there was a specific test case for this behaviour.

yeah, the test coverage is ... not optimal :)

I would need to store a registration count for each listener to satisfy this requirement.

a count plus some marker as to where it was added:

addListener(firstL) addListener(secondL) addListener(firstL)

must result in firstL.invalidated, seconL.invalidated, firstL.invalidated .. which brings us back to .. an array?

— You are receiving this because you authored the thread. Reply to this email directly, view it on GitHubhttps://github.com/openjdk/jfx/pull/108?email_source=notifications&email_token=ABTEOIWBBLX3XA3JV23OP6LRCPSXRA5CNFSM4KQ7YBNKYY3PNVWWK3TUL52HS4DFVREXG43VMVBW63LNMVXHJKTDN5WW2ZLOORPWSZGOELQSLEY#issuecomment-585180563, or unsubscribehttps://github.com/notifications/unsubscribe-auth/ABTEOIVSPJEJ6FAJ5SFT7V3RCPSXRANCNFSM4KQ7YBNA.

dannygonzalez avatar Feb 12 '20 12:02 dannygonzalez

The listeners are called back in the order they were registered in my implementation although I didn’t see this requirement in the spec unless I missed something.

yeah, you are right can't find that spec on sequence right now - maybe I dreamed it :)

kleopatra avatar Feb 12 '20 13:02 kleopatra

/covered

dannygonzalez avatar Feb 12 '20 14:02 dannygonzalez

Thank you! Please allow for a few business days to verify that your employer has signed the OCA. Also, please note that pull requests that are pending an OCA check will not usually be evaluated, so your patience is appreciated!

bridgekeeper[bot] avatar Feb 12 '20 14:02 bridgekeeper[bot]

/covered

dannygonzalez avatar Feb 17 '20 11:02 dannygonzalez

You are already a known contributor!

bridgekeeper[bot] avatar Feb 17 '20 11:02 bridgekeeper[bot]

@dannygonzalez the reason for the jcheck failure is that you have commits with two different email addresses in your branch. At this point, it's probably best to squash the commits into a single commit with git rebase -i master (presuming that your local master is up to date), and then do a force-push.

kevinrushforth avatar Feb 17 '20 13:02 kevinrushforth

@kevinrushforth just a note to say there are other ExpressionHelper classes (i.e. MapExpressionHelper, SetExpressionHelper and ListExpressionHelper) that also use arrays and suffer from the linear search issue when removing listeners.

These however didn't appear in the critical path of the JavaFXThread and didn't come up in my profiling of TableView.

If this pull request is accepted, my opinion is that they should probably all move to using the same pattern as here, which is to use Maps instead of Arrays for their listener lists so that all these classes are uniform.

Thanks

dannygonzalez avatar Feb 18 '20 09:02 dannygonzalez

Webrevs

mlbridge[bot] avatar Feb 18 '20 09:02 mlbridge[bot]

Sorry for the interruption, send a PR that corrects the same problem.

yososs avatar Feb 22 '20 08:02 yososs

I haven't done a detailed review yet, but I worry about the memory consumption and performance of using a Map for the simple cases where there is only a single (or very small number) of listeners. These methods are used in many more places than just the TableView / TreeTableView implementation.

kevinrushforth avatar Feb 25 '20 00:02 kevinrushforth

Replying to @nlisker and @kevinrushforth general comments about the memory consumption of using a LinkedHashMap:

I agree that at the very least these should be lazy initialised when they are needed and that we should initially allocate a small capacity and load factor.

We could also have some sort of strategy where we could use arrays or lists if the number of listeners are less than some threshold (i.e. keeping the implementation with a similar overhead to the original) and then switch to the HashMap when it exceeds this threshold. This would make the code more complicated and I wonder if this is the worth the extra complexity.

I know that in our application, the thousands of listeners unregistering when using a TableView was stalling the JavaFX thread.

I am happy to submit code that lazy initialises and minimises initial capacity and load factor as suggested or happy to take direction and suggestions from anyone with a deeper understanding of the code than myself.

dannygonzalez avatar Feb 25 '20 12:02 dannygonzalez

I think that a starting point would be to decide on a spec for the listener notification order: is it specified by their registration order, or that is it unspecified, in which case we can take advantage of this for better performance. Leaving is unspecified and restricting ourselves to having it ordered is losing on both sides.

nlisker avatar Feb 25 '20 13:02 nlisker

Even though the order of notifications is unspecified, the following tests fail if we use a HashMap vs LinkedHashMap i.e. the notifications are not in order of registration:

test.javafx.scene.control.TableViewTest > ensureSelectionIsCorrectWhenItemsChange FAILED
    java.lang.AssertionError: expected:<0> but was:<-1>
        at org.junit.Assert.fail(Assert.java:91)
        at org.junit.Assert.failNotEquals(Assert.java:645)
        at org.junit.Assert.assertEquals(Assert.java:126)
        at org.junit.Assert.assertEquals(Assert.java:470)
        at org.junit.Assert.assertEquals(Assert.java:454)
        at test.javafx.scene.control.TableViewTest.ensureSelectionIsCorrectWhenItemsChange(TableViewTest.java:333)

test.javafx.scene.control.TreeTableViewTest > test_rt27180_expandBranch_laterSiblingSelected_singleSelection FAILED
    java.lang.AssertionError: 
        at org.junit.Assert.fail(Assert.java:91)
        at org.junit.Assert.assertTrue(Assert.java:43)
        at org.junit.Assert.assertTrue(Assert.java:54)
        at test.javafx.scene.control.TreeTableViewTest.test_rt27180_expandBranch_laterSiblingSelected_singleSelection(TreeTableViewTest.java:2042)

test.javafx.scene.control.TreeViewTest > test_rt27185 FAILED
    java.lang.AssertionError: expected:<TreeItem [ value: Mike Graham ]> but was:<null>
        at org.junit.Assert.fail(Assert.java:91)
        at org.junit.Assert.failNotEquals(Assert.java:645)
        at org.junit.Assert.assertEquals(Assert.java:126)
        at org.junit.Assert.assertEquals(Assert.java:145)
        at test.javafx.scene.control.TreeViewTest.test_rt27185(TreeViewTest.java:603)

test.javafx.scene.control.TreeViewTest > test_rt27180_collapseBranch_laterSiblingAndChildrenSelected FAILED
    java.lang.AssertionError: 
        at org.junit.Assert.fail(Assert.java:91)
        at org.junit.Assert.assertTrue(Assert.java:43)
        at org.junit.Assert.assertTrue(Assert.java:54)
        at test.javafx.scene.control.TreeViewTest.test_rt27180_collapseBranch_laterSiblingAndChildrenSelected(TreeViewTest.java:973)

test.javafx.scene.control.TreeViewTest > test_rt27180_expandBranch_laterSiblingSelected_singleSelection FAILED
    java.lang.AssertionError: 
        at org.junit.Assert.fail(Assert.java:91)
        at org.junit.Assert.assertTrue(Assert.java:43)
        at org.junit.Assert.assertTrue(Assert.java:54)
        at test.javafx.scene.control.TreeViewTest.test_rt27180_expandBranch_laterSiblingSelected_singleSelection(TreeViewTest.java:992)

These would need to be investigated to see why they assume this order.

dannygonzalez avatar Feb 25 '20 14:02 dannygonzalez

Hmm .. personally, I would consider changing such a basic (and probably highly optimized) implementation used all over the framework is a very high risk change that at the worst outcome would detoriate internal and external code. So before going that lane, I would try - as you probably have, just me not seeing it :) - to tackle the problem from the other end:

I know that in our application, the thousands of listeners unregistering when using a TableView was stalling the JavaFX thread.

.. sounds like a lot. Any idea, where exactly they come into play?

kleopatra avatar Feb 26 '20 09:02 kleopatra

I think that a starting point would be to decide on a spec for the listener notification order: is it specified by their registration order, or that is it unspecified, in which case we can take advantage of this for better performance. Leaving is unspecified and restricting ourselves to having it ordered is losing on both sides.

technically true - but the implementation was linear with a fixed sequence since-the-beginning-of-java-desktop-time (and sometimes, for good ol' beans properties, even exposed as api to access the array of listeners). So technically, we could go the path of explicitely spec'ing that the order is unspecified. Pretty sure that doing so and implementing it will break tons of application code that's subtly relying on fifo notification (f.i. register before or after skin has its own is a simple wide-spread trick) .. which all did it wrong and might deserve it ;-) But then if even internal code does it ..

kleopatra avatar Feb 26 '20 10:02 kleopatra

In use cases where a large number of listeners are being discarded, it may be better to first consider changing the design to receive event notifications on nodes whose listener registrations are not frequently discarded.

yososs avatar Feb 27 '20 03:02 yososs

Hmm .. personally, I would consider changing such a basic (and probably highly optimized) implementation used all over the framework is a very high risk change that at the worst outcome would detoriate internal and external code. So before going that lane, I would try - as you probably have, just me not seeing it :) - to tackle the problem from the other end:

I know that in our application, the thousands of listeners unregistering when using a TableView was stalling the JavaFX thread.

.. sounds like a lot. Any idea, where exactly they come into play?

I did start to look at why there were so many change listeners unregistering but felt that would take a deeper understanding of the architecture and design decisions of JavaFX scene graph and I didn't have that time to invest. I do know that there are thousands of them unregistering in a TableView and unregistering them is a bottleneck for the JavaFX thread.

There are multiple change listeners on a Node for example, so you can imagine with the number of nodes in a TableView this is going to be a problem if the unregistering is slow.

To get our application usable I profiled the code and saw this unregistering as a major bottleneck, hence why I looked at this more obvious solution.

I'm happy to look at the problem from the other angle and happy to listen to suggestions from people with more experience of the design and architecture but tackling the problem from the perspective of re-architecting the behaviour of listeners in the scene graph would be more work than I could feasibly take on.

dannygonzalez avatar Mar 02 '20 08:03 dannygonzalez

technically true - but the implementation was linear with a fixed sequence since-the-beginning-of-java-desktop-time (and sometimes, for good ol' beans properties, even exposed as api to access the array of listeners). So technically, we could go the path of explicitely spec'ing that the order is unspecified. Pretty sure that doing so and implementing it will break tons of application code that's subtly relying on fifo notification (f.i. register before or after skin has its own is a simple wide-spread trick) .. which all did it wrong and might deserve it ;-) But then if even internal code does it ..

So we can choose to specify it as ordered.

These are the 2 options I mentioned:

  1. Not specify the behavior and change the implementation in favor of performance. This could break applications as you've mentioned.
  2. Specify that the order is preserved and keep the ordered implementation behavior. This will allow applications to rely on the behavior safely.

Right now we're losing on both sides. We keep a promise and we don't tell anyone about it. The only reason for this is if we intend to change the behavior in the future, in which case we should add a doc comment saying that the order is unspecified and is planned to change in the future, so there will be at least some warning.

Once we choose what to do here it will make sense to continue to review the code with that decision in mind.

nlisker avatar Mar 02 '20 11:03 nlisker

In a minimal test I wrote (not a microbenchmark that removes listeners), I tried this PR code, but did not reproduce the performance improvement. I have attached a test program in my PR(#125)

yososs avatar Mar 10 '20 06:03 yososs

/reviewers 2

kevinrushforth avatar Apr 14 '20 11:04 kevinrushforth

@kevinrushforth The number of required reviews for this PR is now set to 2 (with at least 1 of role reviewers).

openjdk[bot] avatar Apr 14 '20 11:04 openjdk[bot]

@dannygonzalez You mentioned "There are multiple change listeners on a Node for example, so you can imagine with the number of nodes in a TableView this is going to be a problem if the unregistering is slow.".

Your changes in this PR seem to be focused on improving performance of unregistering listeners when there are thousands of listeners listening on changes or invalidations of the same property. Is this actually the case? Is there a single property in TableView or TreeTableView where such a high amount of listeners are present? If so, I think the problem should be solved there.

As it is, I donot think this PR will help unregistration performance of listeners when the listeners are spread out over dozens of properties, and although high in total number, the total listeners per property would be low and their listener lists would be short. Performance of short lists (<50 entries) is almost unbeatable, so it is relevant for this PR to know if there are any properties with long listener lists.

hjohn avatar Apr 14 '20 12:04 hjohn

@hjohn I haven't quantified exactly where all the listeners are coming from but the Node class for example has various listeners on it.

The following changeset (which added an extra listener on the Node class) impacted TableView performance significantly (although it was still bad before this change in my previous tests):

commit e21606d3a1b73cd4b44383babc750a4b4721edfd Author: Florian Kirmaier <fk at sandec.de<mailto:fk at sandec.de>> Date: Tue Jan 22 09:08:36 2019 -0800

8216377: Fix memoryleak for initial nodes of Window
8207837: Indeterminate ProgressBar does not animate if content is added after scene is set on window

Add or remove the windowShowingChangedListener when the scene changes

As you can imagine, a TableView with many columns and rows can be made up of many Node classes. The impact of having multiple listeners deregistering on the Node when new rows are being added to a TableView can be a significant performance hit on the JavaFX thread.

I don't have the time or knowledge to investigate why these listeners are all needed especially around the Node class. I went directly to the source of the problem which was the linear search to deregister each listener.

I have been running my proposed fix in our JavaFX application for the past few months and it has saved it from being unusable due to the JavaFx thread being swamped.

dannygonzalez avatar Apr 15 '20 07:04 dannygonzalez