jfx icon indicating copy to clipboard operation
jfx copied to clipboard

8359599: Calling refresh() for all virtualized controls recreates all cells instead of refreshing the cells

Open Maran23 opened this issue 5 months ago • 7 comments

When calling refresh() on virtualized Controls (ListView, TreeView, TableView, TreeTableView), all cells will be recreated completely, instead of just refreshing them.

This is because recreateCells() of the VirtualFlow is called when refresh() was called. This is not needed, since refreshing the cells can be done much cheaper with rebuildCells().

This will reset all cells (index = -1), add them to the pile and fill them back in the viewport with an index again. This ensures updateItem() is called.

The contract of refresh() is also a big vague, stating:

Calling {@code refresh()} forces the XXX control to recreate and repopulate the cells 
necessary to populate the visual bounds of the control.
In other words, this forces the XXX to update what it is showing to the user. 
This is useful in cases where the underlying data source has changed in a way that is not observed by the XXX itself.

As written above, recreating is not needed in order to fulfull the contract of updating what is shown to the user in case the underlying data source changed without JavaFX noticing (e.g. calling a normal Setter without any Property and therefore listener involved).


Progress

  • [x] Change must not contain extraneous whitespace
  • [x] Commit message must refer to an issue
  • [ ] Change must be properly reviewed (2 reviews required, with at least 2 Reviewers)

Issue

  • JDK-8359599: Calling refresh() for all virtualized controls recreates all cells instead of refreshing the cells (Enhancement - P4)

Reviewing

Using git

Checkout this PR locally:
$ git fetch https://git.openjdk.org/jfx.git pull/1830/head:pull/1830
$ git checkout pull/1830

Update a local copy of the PR:
$ git checkout pull/1830
$ git pull https://git.openjdk.org/jfx.git pull/1830/head

Using Skara CLI tools

Checkout this PR locally:
$ git pr checkout 1830

View PR using the GUI difftool:
$ git pr show -t 1830

Using diff file

Download this PR as a diff file:
https://git.openjdk.org/jfx/pull/1830.diff

Using Webrev

Link to Webrev Comment

Maran23 avatar Jun 15 '25 14:06 Maran23

:wave: Welcome back mhanl! A progress list of the required criteria for merging this PR into master will be added to the body of your pull request. There are additional pull request commands available for use with this pull request.

bridgekeeper[bot] avatar Jun 15 '25 14:06 bridgekeeper[bot]

@Maran23 This change now passes all automated pre-integration checks.

ℹ️ This project also has non-automated pre-integration requirements. Please see the file CONTRIBUTING.md for details.

After integration, the commit message for the final commit will be:

8359599: Calling refresh() for all virtualized controls recreates all cells instead of refreshing the cells

Reviewed-by: angorya, kcr, jhendrikx, jvos

You can use pull request commands such as /summary, /contributor and /issue to adjust it as needed.

At the time when this comment was updated there had been 4 new commits pushed to the master branch:

  • a4e5451373beed93942e0bf2b6c8b81d3e6ad66d: 8371087: Remove unused dependency on jdk.unsupported from javafx.graphics
  • c77c2335856a967907aaacc9546f44943c069add: 8364547: Window size may be incorrect when constrained to min or max
  • 5d069436ebaf0394f4c91b838f86b9391b618296: 8368478: RichTextArea: add IME support
  • ... and 1 more: https://git.openjdk.org/jfx/compare/bf76ed2a6dd1dd74baf865e405690ce996699513...master

As there are no conflicts, your changes will automatically be rebased on top of these commits when integrating. If you prefer to avoid this automatic rebasing, please check the documentation for the /integrate command for further details.

➡️ To integrate this PR with the above commit message to the master branch, type /integrate in a new comment.

openjdk[bot] avatar Jun 15 '25 14:06 openjdk[bot]

This is a risky change and will need to be carefully tested.

Reviewers: @andy-goryachev-oracle @johanvos

/reviewers 2 reviewers

kevinrushforth avatar Jun 16 '25 14:06 kevinrushforth

@kevinrushforth The total number of required reviews for this PR (including the jcheck configuration and the last /reviewers command) is now set to 2 (with at least 2 Reviewers).

openjdk[bot] avatar Jun 16 '25 14:06 openjdk[bot]

It is indeed true that refresh() is often a very expensive operation. Whenever VirtualFlow.recreateCells() is called, most of the internal state of the VirtualFlow is destroyed, and everything is recalculated from scratch. I believe that is not a problem, as long as recreateCells() is not called (by the controls) unless really needed. In that light, this PR seems interesting, as it will limit the number of rebuild-from-scratch cases. I ran the basic controls tests after applying the PR, and (a bit to my surprise) they all passed, which is great. However, it is very likely that some code out there implicitly rely on the rebuild-from-scratch logic, and that code will then work different after applying this PR. I believe it would be good to find such a case where the behavior (which, I agree, is often a bit vaguely defined) changes, so that we can discuss this with a concrete example.

Hence, while I like the idea here (avoiding unneeded heavy-cost operations in VirtualFlow), I would like to avoid a number of follow-up issues once this is merged -- driven by a change in "expected" behavior.

johanvos avatar Jun 17 '25 07:06 johanvos

Hence, while I like the idea here (avoiding unneeded heavy-cost operations in VirtualFlow), I would like to avoid a number of follow-up issues once this is merged -- driven by a change in "expected" behavior.

I completely agree. We need to be careful with such changes.

However, it is very likely that some code out there implicitly rely on the rebuild-from-scratch logic, and that code will then work different after applying this PR.

Since all rows (and cells) are reset and then updated, all changes that were not taken into account by the control are taken into account in any case then. On a side note here: In any JavaFX project, I have overwritten the refresh method (since it is not final) and always implemented the lightweight method as proposed here. I never found any problem.

But I think there is one concrete case which breaks now. Take the following example (slightly modified version of the code in the ticket):

Example
import java.util.concurrent.atomic.AtomicBoolean;

import javafx.application.Application;
import javafx.beans.property.SimpleStringProperty;
import javafx.collections.FXCollections;
import javafx.geometry.Insets;
import javafx.scene.Scene;
import javafx.scene.control.Button;
import javafx.scene.control.TableColumn;
import javafx.scene.control.TableRow;
import javafx.scene.control.TableView;
import javafx.scene.layout.Priority;
import javafx.scene.layout.VBox;
import javafx.stage.Stage;

public class TableRefresh extends Application {

    public static void main(String[] args) {
        launch(args);
    }

    @Override
    public void start(Stage stage) throws Exception {
        TableView<String> tv = new RefreshTableView<>();
        tv.setEditable(true);
        AtomicBoolean alternate = new AtomicBoolean(false);

        tv.setRowFactory(e -> {
            if (alternate.get()) {
                System.out.println("Creating alternative row");
                return new TableRow<>() {
                    {
                        setPadding(new Insets(5));
                    }
                };
            }

            System.out.println("Creating row");
            return new TableRow<>();
        });
        TableColumn<String, String> col = new TableColumn<>("Name");
        col.setCellValueFactory(i -> new SimpleStringProperty(i.getValue()));

        tv.getColumns().addAll(col);

        tv.setItems(FXCollections.observableArrayList("A", "B"));
        VBox.setVgrow(tv, Priority.ALWAYS);
        Button btn = new Button("Refresh");
        btn.setOnAction(e -> {
            alternate.set(true);
            tv.refresh();
        });
        Scene scene = new Scene(new VBox(tv, btn));

        stage.setScene(scene);
        stage.setTitle("Table Refresh");
        stage.show();
    }
}

Here, I'm aware that refresh() is recreating all rows, and update a boolean flag before calling refresh(), which leads to another path that is picked up and therefore other rows. In this case, it is much better to just set a new TableRow factory. This will do the same as what refresh() is doing right now (but not after this PR). In never saw this code in the wild though.

But that is still a valid risk that we need to consider. This is the only problem I see right now.

Maran23 avatar Jun 18 '25 10:06 Maran23

@Maran23 This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply issue a /touch or /keepalive command to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

bridgekeeper[bot] avatar Jul 16 '25 16:07 bridgekeeper[bot]

My main concern with this PR is that we might get a minute performance improvement, while risking regression. Would it be possible to get some measurements using a modern system?

What do you think?

andy-goryachev-oracle avatar Jul 31 '25 21:07 andy-goryachev-oracle

I think the performance improvements due to this PR can be pretty significant. The problem is indeed the risk on regression. I believe we need improvements in 2 areas before we can safely merge this:

  1. More functional regression tests
  2. Performance tests.

johanvos avatar Aug 01 '25 13:08 johanvos

I think a CSR is needed if we're changing the documentation. It specifically says it recreates the cells (although the purpose of that eludes me, aside from badly written cells).

Since cells are supposed to be updated, and not "retain" anything from previous contents, I think only only buggy cells would benefit from recreating. Such buggy cells however would probably have subtle problems already when they're never recreated, like listener leaks.

So although I agree that recreating is not needed as cell functionality is defined by their update methods, I do think this then requires a CSR.

hjohn avatar Aug 01 '25 18:08 hjohn

So although I agree that recreating is not needed as cell functionality is defined by their update methods, I do think this then requires a CSR.

I agree.

/csr needed

kevinrushforth avatar Aug 01 '25 20:08 kevinrushforth

@kevinrushforth has indicated that a compatibility and specification (CSR) request is needed for this pull request.

@Maran23 please create a CSR request for issue JDK-8359599 with the correct fix version. This pull request cannot be integrated until the CSR request is approved.

openjdk[bot] avatar Aug 01 '25 20:08 openjdk[bot]

@Maran23 This pull request has been inactive for more than 4 weeks and will be automatically closed if another 4 weeks passes without any activity. To avoid this, simply issue a /touch or /keepalive command to the pull request. Feel free to ask for assistance if you need help with progressing this pull request towards integration!

bridgekeeper[bot] avatar Aug 29 '25 23:08 bridgekeeper[bot]

@Maran23 This pull request has been inactive for more than 8 weeks and will now be automatically closed. If you would like to continue working on this pull request in the future, feel free to reopen it! This can be done using the /open pull request command.

bridgekeeper[bot] avatar Sep 27 '25 08:09 bridgekeeper[bot]

/open

Will do the CSR + try to benchmark when I have more time!

Maran23 avatar Sep 30 '25 08:09 Maran23

@Maran23 This pull request is now open

openjdk[bot] avatar Sep 30 '25 08:09 openjdk[bot]

I did some benchmarks and attached them to the description. Results look very good and is expected from what I measured myself when I implemented that.

Since we do not need to throw away all TableRows and TableCells, and then recreate them, we get a massive boost. Especially when we have many items (-> rows) and many columns. If TableCells have special logic, like showing a graphic or styling, this will be even bigger.

Maran23 avatar Oct 02 '25 20:10 Maran23

@Maran23 Not a developer but I'd just like to say thanks for this. For the longest time I thought it was because of lazy coding on my part(new ReadOnlyObjectWrapper instances instead of caching) but it turns out that even if you optimize your code, TableView's refresh method still performs terribly.

I'd like also to point out that the terrible performance isn't just from having 100s of rows. Comparing the CPU usage difference between viewing many updating Label(s) at once vs a single TableView shows a CPU usage increase(0.01%-0.03%ish to 0.06%-0.08%ish) on relatively extremely powerful modern hardware with just about a dozen rows.

But what's even worse is the garbage allocation rate. TableView in my application allocates entire MEGABYTES of garbage in an application that has a relatively low garbage allocation rate. This not only results in more GCs but could result in expanding the heap, resulting in more app memory usage.

Can this please get more attention?

BlueGoliath avatar Oct 02 '25 20:10 BlueGoliath

I think the changes look good. I'm a bit confused in the performance table with what is meant with the 50 ms -> 0 ms in the "after" cases though?

Every refresh() will trigger 2 layouts for some reason, where the second one does nothing as nothing is dirty, so basically a noop. I can have a look into that (maybe as a follow up?) but I remember that this happens sometimes in general for the VirtualFlow and we should check that generally at one point.

@andy-goryachev-oracle solved that problem in the RichTextArea by isolating the method which should be called e.g. two times (due to e.g. ScrollBars) instead of a real relayout.

Maran23 avatar Oct 03 '25 14:10 Maran23

I think the changes look good. I'm a bit confused in the performance table with what is meant with the 50 ms -> 0 ms in the "after" cases though?

Every refresh() will trigger 2 layouts for some reason, where the second one does nothing as nothing is dirty, so basically a noop. I can have a look into that (maybe as a follow up?) but I remember that this happens sometimes in general for the VirtualFlow and we should check that generally at one point.

No need to address that in this PR, I was just confused what the numbers meant (shouldn't the before column than not also have X ms -> 0 ms?). So it seems like quite a good performance improvement.

As a side note, even 30-40 ms seems incredibly slow, that's bound to create noticeable input lag or frame skips :/ How many cells were visible? 1000 or 100x1000? If the latter, than 30-40 ms seems okayish.

hjohn avatar Oct 04 '25 04:10 hjohn

As a side note, even 30-40 ms seems incredibly slow, that's bound to create noticeable input lag or frame skips :/ How many cells were visible? 1000 or 100x1000? If the latter, than 30-40 ms seems okayish.

I agree. One problem here is, that all cells will be updated (via updateItem) of a TableRow, even if not visible.

I counted all updateItem calls, results below.

Code from Benchmark above:

  • TableRow updateItem: 78
  • TableCell updateItem: 7800

39 rows are displayed, and they are updated twice (first with -1 to reset, then with the actual index). And all rows have 100 cells.

A TableCell updateItem without any code (no setText, no setGraphic) is indeed faster, around 10-20ms. Looking into the code, there are some unnecessary requestLayout calls as well.

Maran23 avatar Oct 05 '25 17:10 Maran23

What needs to happen for this to get merged? Testing?

BlueGoliath avatar Oct 10 '25 21:10 BlueGoliath

@Maran23 Can you add the performance test program you ran to validate the performance numbers to this PR? A new directory under tests/performance seems a good place for it.

I'll review the CSR.

@johanvos had some comments earlier, so allow time for him to respond as well.

kevinrushforth avatar Oct 11 '25 12:10 kevinrushforth

@Maran23 Can you add the performance test program you ran to validate the performance numbers to this PR? A new directory under tests/performance seems a good place for it.

Sure. I don't know how easy it is, but it might be worth to consider using JMH in the future. No idea if this is easy to wire up with JavaFX though.

Maran23 avatar Oct 12 '25 14:10 Maran23

Every refresh() will trigger 2 layouts for some reason, where the second one does nothing as nothing is dirty, so basically a noop. I can have a look into that (maybe as a follow up?) but I remember that this happens sometimes in general for the VirtualFlow and we should check that generally at one point.

This is something that worries me. One of the main issues I see with e.g. VirtualFlow, is that some methods can be invoked both during a rendering pulse as well as (indirectly) via explicit invocations, most often by code invoked with Platform.runLater(). Depending on whether code is called during a pulse or not, the behavior can be very different. A major problematic consequence of a method that used to be called outside the pulse, and that is now for some reason (e.g. due to concurrency, as it depends whether the Runnable submitted to Platform.runLater() is executed before or after the pulse) called during a pulse, is that this can lead to flickering. If requestNextPulse is called during the layout phase, it is very well possible that a "wrong" layout is rendered briefly before the correct layout is shown.

I am not saying that this PR makes the situation worse or better, but I think there is a reasonable chance that some applications will behave differently (and show flickering). Having said that, I don't think that this PR can be blamed in case there is a different behavior.

johanvos avatar Oct 12 '25 17:10 johanvos

This is something that worries me. One of the main issues I see with e.g. VirtualFlow, is that some methods can be invoked both during a rendering pulse as well as (indirectly) via explicit invocations, most often by code invoked with Platform.runLater(). Depending on whether code is called during a pulse or not, the behavior can be very different. A major problematic consequence of a method that used to be called outside the pulse, and that is now for some reason (e.g. due to concurrency, as it depends whether the Runnable submitted to Platform.runLater() is executed before or after the pulse) called during a pulse, is that this can lead to flickering. If requestNextPulse is called during the layout phase, it is very well possible that a "wrong" layout is rendered briefly before the correct layout is shown.

I'm also not happy with the current situation. There are also some cases where cells request a layout while they are layouted. There is quite some room to optimize several scenarios here. I would like to check this out and optimize at one point, but really afraid that there might be regressions. So this will probably take a while, but I have a far better understanding of the VirtualFlow and surrounding classes than e.g. a year ago + we have far more tests as well!

Maran23 avatar Oct 13 '25 12:10 Maran23

There seems to be an intermediate test failure, maybe a race condition unrelated to this change:

TaskEventTest > cancelledCalledAfterHandler() FAILED
    org.opentest4j.AssertionFailedError: expected: <true> but was: <false>
        at app//org.junit.jupiter.api.AssertionFailureBuilder.build(AssertionFailureBuilder.java:151)
        at app//org.junit.jupiter.api.AssertionFailureBuilder.buildAndThrow(AssertionFailureBuilder.java:132)
        at app//org.junit.jupiter.api.AssertTrue.failNotTrue(AssertTrue.java:63)
        at app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:36)
        at app//org.junit.jupiter.api.AssertTrue.assertTrue(AssertTrue.java:31)
        at app//org.junit.jupiter.api.Assertions.assertTrue(Assertions.java:183)
        at app//test.javafx.concurrent.TaskEventTest.cancelledCalledAfterHandler(TaskEventTest.java:410)

Maran23 avatar Oct 13 '25 12:10 Maran23

There seems to be an intermediate test failure, maybe a race condition unrelated to this change:

Yes, it is already filed and tracked by JDK-8357459.

kevinrushforth avatar Oct 13 '25 13:10 kevinrushforth

@Maran23 In case you missed it, this is pending changes from you.

  1. I noted some final doc changes in this comment above.
  2. I had added an earlier comment about adding your perf test to the PR somewhere under tests/performance.

kevinrushforth avatar Oct 29 '25 17:10 kevinrushforth