What is the proper way to lazily add styles to paragraphs in RichTextFX?

Open PavelTurk opened this issue 8 months ago • 52 comments

Suppose I have a document with a huge number of paragraphs (e.g., 10,000,000+). It makes no sense to add styles to all paragraphs at once for performance reasons. What is the correct approach to:

  • Apply styles only when paragraphs become visible (e.g., during scrolling).
  • Avoid showing unstyled paragraphs (ensure styles are applied before rendering).

Is there a built-in mechanism in RichTextFX for this, or do I need a custom solution?

PavelTurk avatar Apr 24 '25 16:04 PavelTurk

For example, the JavaFX CodeArea uses a decorator:

        codeArea.setSyntaxDecorator(new SyntaxDecorator() {
            @Override
            public RichParagraph createRichParagraph(CodeTextModel ctm, int i) {
                System.out.println("LINE: " + i);
                StyleAttributeMap a = StyleAttributeMap.builder().setBold(true).build();
                RichParagraph.Builder b = RichParagraph.builder();
                b.addSegment(ctm.getPlainText(i), a);
                b.addHighlight(19, 4, Color.rgb(255, 128, 128, 0.5));
                b.addHighlight(20, 7, Color.rgb(128, 255, 128, 0.5));
                return b.build();
            }

            @Override
            public void handleChange(CodeTextModel ctm, TextPos tp, TextPos tp1, int i, int i1, int i2) {

            }
        });

The createRichParagraph method is called every time the view is updated. For example, when the user types one character, about 110 lines are updated and createRichParagraph is called for each of those paragraphs. Is it possible to have a handler in RichTextFX that is called when a paragraph is updated?

PavelTurk avatar Apr 25 '25 09:04 PavelTurk

Yes, have a look at JavaKeywordsDemo

Jugen avatar Apr 25 '25 11:04 Jugen

@Jugen Thank you very much for your help. Do I understand correctly - you are talking about this one:

        // recompute syntax highlighting only for visible paragraph changes
        // Note that this shows how it can be done but is not recommended for production where multi-
        // line syntax requirements are needed, like comment blocks without a leading * on each line. 
        codeArea.getVisibleParagraphs().addModificationObserver
        (
            new VisibleParagraphStyler<>( codeArea, this::computeHighlighting )
        );

PavelTurk avatar Apr 25 '25 12:04 PavelTurk

Yes :-)

Jugen avatar Apr 25 '25 12:04 Jugen

@Jugen Thank you very much. This is what I was looking for. The only problem is that it doesn't work as expected :).

This is my code; the idea is simply to make every paragraph red:

class ParagraphStyler implements
        Consumer<ListModification<? extends Paragraph<Collection<String>, String, Collection<String>>>> {

    private final org.fxmisc.richtext.CodeArea area;

    private final BiFunction<Integer, String, StyleSpans<Collection<String>>> styler;

    private int prevParagraph, prevTextLength;

    public ParagraphStyler(org.fxmisc.richtext.CodeArea area, BiFunction<Integer, String, StyleSpans<Collection<String>>> styler) {
        this.styler = styler;
        this.area = area;
    }

    @Override
    public void accept(ListModification<? extends Paragraph<Collection<String>, String, Collection<String>>> lm) {
        if (lm.getAddedSize() > 0) Platform.runLater(() -> {
            int paragraph = Math.min(area.firstVisibleParToAllParIndex() + lm.getFrom(),
                    area.getParagraphs().size() - 1);
            String text = area.getText(paragraph, 0, paragraph, area.getParagraphLength(paragraph));
            if (paragraph != prevParagraph || text.length() != prevTextLength) {
                if (paragraph < area.getParagraphs().size() - 1) {
                    int startPos = area.getAbsolutePosition(paragraph, 0);
                    area.setStyleSpans(startPos, styler.apply(paragraph, text));
                }
                prevTextLength = text.length();
                prevParagraph = paragraph;
            }
        });
    }
};

public class CodeAreaTest extends Application {

    @Override
    public void start(Stage primaryStage) throws Exception {
        CodeArea codeArea = new CodeArea();

        var red = Collections.singleton("red");

        BiFunction<Integer, String, StyleSpans<Collection<String>>> styler = (paragraphNumber, paragraphText) -> {
            System.out.println("LINE: " + paragraphNumber + ", text: " + paragraphText);
            StyleSpansBuilder<Collection<String>> spansBuilder = new StyleSpansBuilder<>();
            spansBuilder.add(red, paragraphText.length() + 1); // 1 - EOL
            return spansBuilder.create();
        };
        codeArea.getVisibleParagraphs().addModificationObserver(new ParagraphStyler(codeArea, styler));
        VirtualizedScrollPane scrollPane = new VirtualizedScrollPane(codeArea);

        Scene scene = new Scene(scrollPane, 1000, 600);
        String css = ".red { -fx-fill: red; } ";
        scene.getStylesheets().add("data:text/css," + css);

        primaryStage.setScene(scene);
        primaryStage.show();

        String content = getContent();
        codeArea.appendText(content);
    }

    private String getContent() throws Exception {
        String url = "https://raw.githubusercontent.com/openjdk/valhalla/e1280b3e11a98d98c0fdad73ce9c8bb9d2417a70/src/jdk.compiler/share/classes/com/sun/tools/javac/parser/JavacParser.java";
        HttpClient client = HttpClient.newHttpClient();
        HttpRequest request = HttpRequest.newBuilder()
                .uri(URI.create(url))
                .build();
        HttpResponse<String> response = client.send(request, HttpResponse.BodyHandlers.ofString());
        String content = response.body();
        return content;
    }

    public static void main(String[] args) {
        launch(args);
    }

}

And this is the result:

Image

As you can see, the number of unpainted lines at the end of the document keeps increasing. The console output shows that not all lines are passed to the styler. I use RichTextFX 0.11.5. Could you tell me how to fix it?

PavelTurk avatar Apr 25 '25 18:04 PavelTurk

@Jugen Could you tell me the proper way to update styles when this approach is used? For example, when only the styles of paragraphs 1 - 100 need to be updated? I tried this:

        var green = Collections.singleton("green");
        var button = new Button("Test");
        button.setOnAction(e -> {
            for (var i = 0; i < 100; i++) {
                var p = codeArea.getParagraphs().get(i);
                StyleSpansBuilder<Collection<String>> ssb = new StyleSpansBuilder<>();
                ssb.add(green, p.getText().length() + 1);
                p.restyle(0, ssb.create());
            }
            System.out.println("Updated");
        });

but it didn't work. I also tried using the absolute position with codeArea.setStyleSpans(startPos, styles), but that didn't work either.

PavelTurk avatar May 04 '25 19:05 PavelTurk

@Jugen Any ideas? Or is it a bug that should be fixed?

I’m currently working on a project that uses two code areas — one based on JavaFX and one based on RichTextFX. The core of the project is already complete and works with both JFX and RTFX. The only remaining tasks are implementing lazy style application and style updates in RTFX. Once those are done, the project will be ready to share with the community for feedback and review.

PavelTurk avatar May 06 '25 13:05 PavelTurk

I'll try and have a look next week .....

Jugen avatar May 09 '25 15:05 Jugen

@Jugen Thank you very much!

PavelTurk avatar May 09 '25 15:05 PavelTurk

@PavelTurk Sorry for the delayed response. Last week didn't go as planned and then this issue took me some time to figure out. Anyway, see PR #1277 to address the issue you show in the video clip you posted.

Jugen avatar May 22 '25 08:05 Jugen

WRT your green button test above. It doesn't work because Paragraph is an immutable object, so p.restyle( ... ) returns a new Paragraph object without changing the original. If you replace that line with codeArea.setStyleSpans( i, 0, ssb.create() ); instead, it should work.
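
To show the pitfall in isolation, here is a minimal sketch using a hypothetical stand-in class. `Para` is not RichTextFX's `Paragraph`; it only mimics the "immutable object with a restyle method" shape, and all names in it are assumptions made for illustration. Discarding the result of `restyle` changes nothing, while storing the returned object does.

```java
import java.util.ArrayList;
import java.util.List;

final class ImmutableRestyleDemo {

    // Hypothetical stand-in for an immutable paragraph (NOT the RichTextFX class).
    static final class Para {
        final String text;
        final String style;

        Para(String text, String style) {
            this.text = text;
            this.style = style;
        }

        // Returns a NEW object; the receiver is left unchanged.
        Para restyle(String newStyle) {
            return new Para(text, newStyle);
        }
    }

    // Mirrors the button handler above: the result of restyle(...) is thrown away.
    static void restyleAndDiscard(List<Para> doc, String style) {
        for (Para p : doc) {
            p.restyle(style); // no effect on doc
        }
    }

    // The owner of the list has to store the new object for the change to stick.
    static void restyleAndStore(List<Para> doc, String style) {
        for (int i = 0; i < doc.size(); i++) {
            doc.set(i, doc.get(i).restyle(style));
        }
    }

    public static void main(String[] args) {
        List<Para> doc = new ArrayList<>(List.of(new Para("a", "red"), new Para("b", "red")));
        restyleAndDiscard(doc, "green");
        System.out.println(doc.get(0).style); // prints "red"
        restyleAndStore(doc, "green");
        System.out.println(doc.get(0).style); // prints "green"
    }
}
```

In RichTextFX the "store" step goes through the area itself, which is what the codeArea.setStyleSpans( i, 0, ssb.create() ) call suggested above does.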

Jugen avatar May 22 '25 09:05 Jugen

@Jugen Thank you very much for your help. I am testing #1277 . In your example you use

int startPos = area.getAbsolutePosition( addList.get(0).get1(), 0 );

to calculate the offset. But can we avoid it? We do have

var paragraph = tuple.get2();
paragraph.restyle(0, styleSpans); //doesn't work

We have a reference to the paragraph and this paragraph has a "beautiful" restyle method. Why can't we use it?

PavelTurk avatar May 23 '25 10:05 PavelTurk

I share your sentiment, but the problem is that Paragraph is an immutable object, so p.restyle( ... ) returns a new Paragraph object without changing the original.

Jugen avatar May 23 '25 11:05 Jugen

@Jugen I am testing #1277, the situation has become much better. All paragraphs are styled. However, I found two problems.

The first one is that paragraphs are styled AFTER they are displayed to the user during scrolling. The cause seems to be Platform.runLater():

Image

The styles are applied to all paragraphs, but with a slight delay, which results in the user seeing both unstyled and styled text.

I want to be extremely precise here. At the moment, the styling speed (which is very important!) is very good, and I strongly want to avoid decreasing it by making the code more complex. But maybe there's a way to handle this — for example, by adding a boolean property to control whether the unstyled text is shown or not.

Why a separate property? Because we can add an observer to either codeArea.getVisibleParagraphs() or codeArea.getVisibleParagraphIndexes().

I haven’t identified the cause of the second issue yet — it might be in my own code.

PavelTurk avatar May 23 '25 12:05 PavelTurk

But maybe the problem is that I style each paragraph individually rather than all of them at once:

    @Override
    public void accept(ListModification<? extends Tuple2<Integer, Paragraph<Collection<String>, String, Collection<String>>>> lm) {
        for (var tuple : lm.getAddedSubList()) {
            var paragraphIndex = tuple.get1();
            var paragraph = tuple.get2();

            //System.out.print(paragraphIndex + ";");
            var styles = styler.apply(paragraphIndex, paragraph.getText());
            if (styles != null) {
                Platform.runLater(() -> {
                    area.setStyleSpans(area.getAbsolutePosition(paragraphIndex, 0), styles);
                });
            }
        }
    }

PavelTurk avatar May 23 '25 12:05 PavelTurk

I think a different mechanism is needed here. Currently the styling is happening after the text is added to the viewport. I'll try and have a look next week and see if the styling can't be applied just before it's added to the viewport instead.

BTW you can also use area.setStyleSpans( paragraphIndex, 0, styles );

Jugen avatar May 23 '25 13:05 Jugen

Adding my fifty cents, as I did exactly the same thing last week (implementing syntax identification for group comments in Python). Platform.runLater does indeed delay the display (and I assume it is mandatory for the library, as removing it causes an exception). My solution was to style all paragraphs when opening the text area and then update the styles when changes occur, in the getVisibleParagraphs() change listener. This doesn't fix the delay, but in most cases it will not be visible to the user. Also, you should update the style when a paragraph's content changes (for that, observe plainTextChanges() and not getParagraphs(), because the latter also reacts to style changes while the former reacts only to content changes).

(It obviously depends on what you are trying to do. I assume your red colouring is just a test for some other purpose; if you have more details on your end goal, I can maybe compare it to the issues I faced.)

Update: For a file with 120k lines, the colouring of Python syntax on my side takes about 400ms, which is acceptable for a file that large. But again, that depends on your end goal.

Symeon94 avatar May 23 '25 13:05 Symeon94

@Jugen

  1. Thank you for pointing out the method area.setStyleSpans( paragraphIndex, 0, styles ); it is what I was looking for. I didn't notice that paragraph.restyle(...) returns a new paragraph; I was sure it was void. But, as I understand it, an offset is always required when new style spans are applied. So, are offsets recalculated when we call area.setStyleSpans( paragraphIndex, 0, styles );? In other words, does RTFX keep offsets in its model, or does it calculate them every time by looping over the paragraph text lengths?

  2. What about a different mechanism - maybe we should add a functional interface that will be called when RTFX prepares a paragraph for display - on scrolling, update (AFTER CHANGE EVENTS), etc, BUT BEFORE the processed paragraph becomes visible. I mean something like SyntaxDecorator in JFX CA.

For example:

@FunctionalInterface
public interface StyleSpansFactory<T> {
     //returns styleSpans OR NULL!
     StyleSpans<T> create(int paragraphIndex);
}
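
A rough sketch of how such a factory could be driven, to make the proposal concrete. Nothing here is RichTextFX API: `StyleFactory`, `stylesForViewport` and the index range are hypothetical names, and a plain generic type stands in for `StyleSpans<T>` so that the example is self-contained. The point is only that the factory is asked about paragraphs in the visible range and nothing else.

```java
import java.util.ArrayList;
import java.util.List;

final class LazyStyleFactoryDemo {

    // Same shape as the interface proposed above; T stands in for the style type.
    // Returning null means "leave this paragraph unstyled".
    @FunctionalInterface
    interface StyleFactory<T> {
        T create(int paragraphIndex);
    }

    // Hypothetical viewport hook: calls the factory only for the paragraphs
    // in [firstVisible, lastVisible], never for the rest of the document.
    static <T> List<T> stylesForViewport(StyleFactory<T> factory, int firstVisible, int lastVisible) {
        List<T> result = new ArrayList<>();
        for (int i = firstVisible; i <= lastVisible; i++) {
            result.add(factory.create(i));
        }
        return result;
    }

    public static void main(String[] args) {
        int[] calls = {0};
        StyleFactory<String> factory = index -> {
            calls[0]++;
            return index % 2 == 0 ? "even-line" : null;
        };
        // Pretend the document has millions of paragraphs and five are visible.
        List<String> styles = stylesForViewport(factory, 1_000_000, 1_000_004);
        System.out.println(styles.size() + " styles computed, " + calls[0] + " factory calls");
    }
}
```

With this shape, the cost of styling depends on the viewport size rather than on the document size.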

@Symeon94 Yes, the red color is just for test. I think, when this issue is resolved we will have a solid and reliable mechanism for code styling.

PavelTurk avatar May 23 '25 14:05 PavelTurk

For your first point, GenericEditableStyledDocumentBase (used for setStyleSpans) seems to recalculate the offset:

public void setStyleSpans(int paragraphIndex, int from, StyleSpans<? extends S> styleSpans) {
    setStyleSpans(doc.position(paragraphIndex, from).toOffset(), styleSpans);
}
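
For comparison, this is what the "loop over paragraph text lengths" variant from the question would look like. It is only an illustration of that naive approach, with hypothetical names and plain strings as paragraphs; it is not a claim about how doc.position(...).toOffset() is implemented inside RichTextFX.

```java
import java.util.List;

final class ParagraphOffsetDemo {

    // Absolute offset of (paragraphIndex, column), assuming every paragraph
    // before the target is followed by exactly one line-terminator character.
    static int toOffset(List<String> paragraphs, int paragraphIndex, int column) {
        int offset = 0;
        for (int i = 0; i < paragraphIndex; i++) {
            offset += paragraphs.get(i).length() + 1; // +1 for the line terminator
        }
        return offset + column;
    }

    public static void main(String[] args) {
        List<String> paragraphs = List.of("ab", "cde", "f");
        System.out.println(toOffset(paragraphs, 0, 0)); // prints 0
        System.out.println(toOffset(paragraphs, 1, 1)); // prints 4
        System.out.println(toOffset(paragraphs, 2, 0)); // prints 7
    }
}
```

A loop like this costs time proportional to the paragraph index on every call, which is why the question of how the library stores its paragraphs matters for very large documents.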

If there is a way to apply styles without calling Platform.runLater, that seems to me the only way to avoid the delay. I'd be interested to know whether it can be fixed (and I'll be following the topic).

Symeon94 avatar May 23 '25 14:05 Symeon94

Okay, I've updated PR #1277 by reverting the previous solution and adding a setVisibleOnlyStyler method instead. This styler is applied to Paragraphs only, just before they are displayed.

Important Notes

  1. The result of this styling does NOT modify the document model.
  2. Paragraph is immutable, so don't return the same object expecting changes.
  3. The styler should return the result of one of Paragraph's restyle methods.
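
The three notes can be illustrated with a small stand-alone sketch. It does not use the PR's setVisibleOnlyStyler; `Para` and `render` are hypothetical stand-ins, assumed here only to show the idea of a UnaryOperator that restyles paragraphs on their way to the screen while the model stays as it was.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

final class DisplayTimeStylerDemo {

    // Hypothetical immutable paragraph (NOT the RichTextFX class).
    static final class Para {
        final String text;
        final String style;

        Para(String text, String style) {
            this.text = text;
            this.style = style;
        }

        Para restyle(String newStyle) {
            return new Para(text, newStyle);
        }
    }

    // Builds the paragraphs to show for [from, to) by passing each one through
    // the styler. The model list is only read, never written.
    static List<Para> render(List<Para> model, int from, int to, UnaryOperator<Para> styler) {
        List<Para> shown = new ArrayList<>();
        for (Para p : model.subList(from, to)) {
            shown.add(styler.apply(p));
        }
        return shown;
    }

    public static void main(String[] args) {
        List<Para> model = List.of(new Para("a", "plain"), new Para("b", "plain"), new Para("c", "plain"));
        // Per note 3: return the result of restyle, not the paragraph that was passed in.
        List<Para> shown = render(model, 1, 3, p -> p.restyle("red"));
        System.out.println(shown.get(0).style); // prints "red"
        System.out.println(model.get(1).style); // prints "plain"
    }
}
```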

Please assess and provide feedback, thanks.

Jugen avatar May 26 '25 11:05 Jugen

I had a quick look. I have no idea how to test the code on my side (I assume you have) so considering the simplicity of the change it seems ok (I only had a small comment on the comment of the provided UnaryOperator).

Symeon94 avatar Jul 02 '25 05:07 Symeon94

I’ve switched to other tasks for now, so I haven’t tested it yet. I hope to sort everything out within the next 1–2 weeks and then test the PR in various scenarios.

PavelTurk avatar Jul 02 '25 09:07 PavelTurk

I have managed to build and compile with the changes. But I have difficulty integrating them into my application, because in my case the style depends on the paragraph position (for styling group comments in Python) and the UnaryOperator applying the style is only given the paragraph content. Would there be a way to provide the paragraph index?

Symeon94 avatar Jul 02 '25 10:07 Symeon94

@Jugen Thank you very much for PR https://github.com/FXMisc/RichTextFX/pull/1277 .

Sorry for the late response — I only had a chance to test this PR today. However, I wasn't able to integrate it with my library. The issue arises because there's no access to the paragraph index. When this Styler is invoked, it is not supposed to handle text parsing logic—that’s not its responsibility. It can only generate styles based on the data model of the given paragraph. And to retrieve the model for a specific paragraph, we need its index.

I believe the index is essential, because accurate text parsing is usually done in a background thread and may affect not only the visible paragraphs. For example, if there are 100 lines, and the user types the beginning of a multiline comment (/*) in the first line, the styling of other lines may also be affected.
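
The multiline comment case can be made concrete with a deliberately simplified sketch. It is not a real parser (it ignores string literals and // comments), and the class and method names are hypothetical. It only shows that a change on the first line alters the state of lines the user never touched.

```java
import java.util.Arrays;
import java.util.List;

final class BlockCommentStateDemo {

    // For each line, true if the line STARTS inside a /* ... */ block comment.
    // Simplified on purpose: string literals and // comments are ignored.
    static boolean[] startsInsideComment(List<String> lines) {
        boolean[] state = new boolean[lines.size()];
        boolean inside = false;
        for (int i = 0; i < lines.size(); i++) {
            state[i] = inside;
            String line = lines.get(i);
            int pos = 0;
            while (pos < line.length()) {
                // Look for the delimiter that would flip the current state.
                int next = line.indexOf(inside ? "*/" : "/*", pos);
                if (next < 0) {
                    break;
                }
                inside = !inside;
                pos = next + 2;
            }
        }
        return state;
    }

    public static void main(String[] args) {
        List<String> before = List.of("int a;", "int b;", "int c;");
        List<String> after = List.of("/* int a;", "int b;", "int c;");
        System.out.println(Arrays.toString(startsInsideComment(before))); // [false, false, false]
        System.out.println(Arrays.toString(startsInsideComment(after)));  // [false, true, true]
    }
}
```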

So, in order for everything to work seamlessly, we need 1) a data model for each paragraph (retrievable by index), and 2) the paragraph index when the styler is called.
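
As a sketch of those two requirements together: below, a per-line model (imagined as the output of a background parser) is looked up by index inside the styler. `LineKind`, `stylerFor` and the style names are hypothetical; a BiFunction stands in for whatever index-aware callback the library might offer.

```java
import java.util.List;
import java.util.function.BiFunction;

final class IndexAwareStylerDemo {

    // Hypothetical per-line model, e.g. produced by a background parser.
    enum LineKind { CODE, COMMENT }

    // The styler does no parsing. It needs the paragraph index to find the
    // model entry; the paragraph text alone cannot decide the style.
    static BiFunction<Integer, String, String> stylerFor(List<LineKind> model) {
        return (index, text) -> model.get(index) == LineKind.COMMENT ? "comment" : "code";
    }

    public static void main(String[] args) {
        List<LineKind> model = List.of(LineKind.CODE, LineKind.COMMENT, LineKind.COMMENT);
        BiFunction<Integer, String, String> styler = stylerFor(model);
        // The same text gets a different style depending on where it sits.
        System.out.println(styler.apply(0, "int b;")); // prints "code"
        System.out.println(styler.apply(1, "int b;")); // prints "comment"
    }
}
```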

I just verified: in JavaFX’s CodeArea, the index is also passed — https://github.com/openjdk/jfx/blob/1a2a50b593a1abcb767a3c6b0287996bdfb26973/modules/jfx.incubator.richtext/src/main/java/jfx/incubator/scene/control/richtext/SyntaxDecorator.java#L54

Perhaps we should consider adding an index field to the Paragraph. While this would require updating all subsequent paragraphs on insertion or removal, it would only involve a simple loop that updates a single int field (setIndex(i)).

PavelTurk avatar Jul 16 '25 10:07 PavelTurk

Some preliminary information on where the index lives, to help the investigation (I made a few assumptions along the way; it might require deeper investigation and testing, but I'm rather confident). I'm not an expert on this library, so if someone with more expertise knows more, let me know where I might have made wrong assumptions. createCell(...) is passed to a method creating the VirtualFlow in GenericStyledArea (line 782). It is then passed to the CellListManager, which gives it to the CellPool:

public CellListManager(
        Node owner,
        ObservableList<T> items,
        Function<? super T, ? extends C> cellFactory) {
    this.owner = owner;
    this.cellPool = new CellPool<>(cellFactory);
    this.cells = LiveList.map(items, this::cellForItem).memoize();
    // ...
}

CellPool uses it via this.cellForItem(T item) which is passed to a memoized list (you can see it in the code above).

This method is used to create the cell for an index when getIfMemoized() is called. Those calls are made in the function below, and I assume that the index here is the paragraph index (from what I have seen, I'm confident that it is).

public Optional<C> getCellIfPresent(int itemIndex) {
    if (itemIndex>=cells.size()||itemIndex<0) {
       return Optional.empty();
    }
    return cells.getIfMemoized(itemIndex); // getIfMemoized() may throw
}

Concretely, this should (another assumption) end up calling the following in MappedList<E, F> (where mapper is the CellListManager::cellForItem(T item)):

@Override
public F get(int index) {
    return mapper.apply(source.get(index)); // You can assume that this is cellForItem(source.get(index));
}

Source in that context is the ObservableList<T> items in the first code block above.

This is the place where the paragraph index can be used. Ideally, this method would do:

    return mapper.apply(index, source.get(index));

This would go to a method private C cellForItem(int index, T item), which would call cellPool.getCell(index, item), which in turn would call createCell in GenericStyledArea (this is a modified version of the code):

virtualFlow = VirtualFlow.createVertical(
    getParagraphs(),
    (index, par) -> {
        Cell<Paragraph<PS, SEG, S>, ParagraphBox<PS, SEG, S>> cell = createCell(
                index,
                par,
                applyParagraphStyle,
                nodeFactory);
        nonEmptyCells.add(cell.getNode());
        return cell.beforeReset(() -> nonEmptyCells.remove(cell.getNode()))
                .afterUpdateItem(p -> nonEmptyCells.add(cell.getNode()));
    });

But that would be a potentially big impactful change. There might be other options.
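
To make the mapper.apply(index, source.get(index)) idea tangible, here is a minimal stand-alone sketch of a mapped list whose mapper also receives the index. `IndexedMappedList` is a hypothetical class built on java.util.AbstractList; it is not ReactFX's MappedList, and it does no memoization or change notification.

```java
import java.util.AbstractList;
import java.util.List;
import java.util.function.BiFunction;

final class IndexedMappedListDemo {

    // Read-only view over `source` whose mapper is given the element index
    // as well as the element. Hypothetical sketch, not the ReactFX class.
    static final class IndexedMappedList<E, F> extends AbstractList<F> {
        private final List<? extends E> source;
        private final BiFunction<Integer, ? super E, ? extends F> mapper;

        IndexedMappedList(List<? extends E> source,
                          BiFunction<Integer, ? super E, ? extends F> mapper) {
            this.source = source;
            this.mapper = mapper;
        }

        @Override
        public F get(int index) {
            return mapper.apply(index, source.get(index));
        }

        @Override
        public int size() {
            return source.size();
        }
    }

    public static void main(String[] args) {
        List<String> paragraphs = List.of("alpha", "beta");
        // The "cell factory" here just builds a label from index and text.
        List<String> cells = new IndexedMappedList<String, String>(
                paragraphs, (index, par) -> index + ":" + par);
        System.out.println(cells.get(1)); // prints "1:beta"
    }
}
```

The real change would also have to carry the index through memoization and cell reuse, which this sketch leaves out.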

Symeon94 avatar Jul 16 '25 11:07 Symeon94

"Small" update: another solution (probably much simpler) would be that the T item contains both the paragraph and the index.

This is the createVertical, which is passed Function<? super T, ? extends C> cellFactory, i.e. the method (par) -> createCell(...) receiving a Paragraph (T) and returning a Cell (C). This depends on the ObservableList; it seems that the method getting an element there will create the item from the index. That is where we should be able to add the index to the paragraph. I'm trying to find exactly where this place is.

Additional update:

To include the index in the paragraph, we must be sure that the paragraph index is recreated on each modification (otherwise it might hold outdated information). From what I saw long ago, I think that each modification recreates the paragraph, but I might be wrong on that one. More generally, from what I have seen in the code, I don't think adding the index to the paragraph will be easy to achieve. To be confirmed.

Now, when it comes to where the paragraph is updated, the code is a bit of a maze, but the list containing them is ParagraphList, which contains the document. So to get a paragraph it calls ReadOnlyStyledDocument::getParagraph(index), and the paragraphs appear to be kept in some sort of tree.

If I'm not wrong, this document is built from "ReadOnlyStyledDocumentBuilder". Paragraphs are added and then the document is created. So the paragraph already exists before being given a position inside the document.

Symeon94 avatar Jul 16 '25 11:07 Symeon94

Last comment from me, and then I'll wait for your input: it doesn't actually seem too difficult to add a flavour to the libraries that passes the index (my first proposal). You just need to create variations in Flowless and ReactFX that use this function (where it receives a tuple of the index and the paragraph instead of only the paragraph):

Function<Tuple2<Integer, ? super E>, ? extends F> mapper

Symeon94 avatar Jul 16 '25 12:07 Symeon94

@Jugen Any ideas?

PavelTurk avatar Aug 03 '25 07:08 PavelTurk

Yesterday, this project was presented to the community — https://github.com/mkpaz/tm4javafx — but without support for RTFX. We need to improve CodeArea.

PavelTurk avatar Aug 03 '25 07:08 PavelTurk

As soon as I get some time this week, I'll try to implement the fix I proposed, if there is no additional input. It will involve a small addition to ReactFX and an update of this library to use a Tuple with the index. Unless @Jugen sees a better solution to that problem.

Symeon94 avatar Aug 04 '25 06:08 Symeon94