API Redesign and Large-Scale Refactoring
In this issue, I’m proposing a rather drastic change: a large-scale refactoring of both the codebase and the public API. Yes, this will introduce breaking API changes.
The reason for this proposal is that I often notice clear inconsistencies in method names, properties, and other parts of the API — as if they were created by different people (which is most likely the case). As for the implementation, many parts seem to have been built more as workarounds than as clean solutions.
I understand that this would be a massive amount of work. I’m also aware that JFX RichTextArea is currently evolving rapidly. However, after working with both implementations, I see that each has its own strengths and weaknesses, and they are not interchangeable. From this, I conclude that this project will continue to be relevant and developed for quite some time, at least as long as JavaFX remains relevant.
If the project maintainers agree with this proposal, I suggest first gathering input from all users about the changes they consider necessary, and only then making a decision.
+1 from me, I have no objections to this in principle.
I'm not against the principle (I would even say I'm in favor of the idea), as the code certainly has a lot of parts that could require refactoring, but based on my work experience I would tend to favor incremental refactoring and not large ones.
Maybe we could list elements or sections of the codebase that could be improved this way and proceed piece by piece. The classical methodology would be to ensure test coverage of the section to be modified first and to then proceed with refactoring.
Now, obviously I'm a newcomer in this project, but I think a conservative approach tends to lead to better outcomes. The risk of going with a big overall refactoring is:
- Breaking interfaces as you mentioned
- Increased risk in adding new bugs
- Actually never completing the refactoring as a whole
I am only a follower of this project, not a user, yet. I have plans to use it or the web client with java script as a code editor. Based on the little I have browed this project it seems like it may require a lot of user setup and custom code, not just plug in play. I am not at all opposed to a major refactor, but if you are breaking the API anyway I would suggest starting fresh with a new project. Perhaps even use an AI agent such as Claude Code to write it. It might not take that long.
I don’t see breaking the API as inherently problematic—provided there’s a clear and justified purpose behind it. Typically, breaking changes are signaled by incrementing the major version, following the Semantic Versioning standard widely adopted in the JavaScript ecosystem (I know we are not in a JS project, but this system makes sense).
At the moment, we’re on 0.x.x. Introducing a breaking change would be a good opportunity to move to 1.x.x, giving users time to transition smoothly. That said, breaking changes should be introduced progressively, allowing developers to adapt from their current reality to the new.
The key question we should ask is: what are we aiming to achieve with this redesign?
- Are there particular features we want to introduce?
- Is the main goal to improve maintainability, and if so, in which areas of the codebase?
Clarifying these objectives will help us align on purpose and ensure we do not redesign for the sake of redesigning (which can be fun, no doubt 😉)
I was going through the code, and I wanted to document some of things we could consider for cleanup.
One thing I saw that could definitely need some cleanup are these huge methods in the codebase. I'd consider that above 20 lines is already too much, in some rare instance it can be accepted, but if that becomes a habit it is a good sign something is off.
These could be decomposed and tested as separate unit. In the long run that can help moving things around and cleaning the readability of the code.
For instance, I have been through GenericStyledArea and found:
- The constructor is 104 lines long!
-
handleInputMethodEventis 37 lines long -
allParToVisibleParIndexis 21 lines long -
visibleParToAllParIndexis 22 lines long -
getCharacterBoundsOnScreenis 69 lines long -
setLineHighlighterOnis 31 lines long -
unfoldParagraphsis 27 lines long -
layoutChildrenis 24 lines long -
createCellis 109 lines long! -
followCaretis 41 lines long
Another point that can be corrected before a big refactoring: huge classes are also usually a sign a class has too much responsibility (GenericStyledArea itself is arlready 2165 lines big). Take for instance followCaret mentioned above. Usually when you have a method Verb + Name you know there might be an opportunity to move things away. In that case followCaret could be a method follow on a class Caret (I know there is already a Caret interface, but I'm only trying to highlight points to investigate for overall improvement).
So if we could already:
- Subdivide and test methods
- Separate responsibilities
We would end up with clearer responsability, smaller units which are easier to modified and clearer layer separation (it will arise from the need to test units). And once you have these smaller units it becomes easier to redefine the API or to extend them.
I see that each has its own strengths and weaknesses, and they are not interchangeable.
Would be interested in hearing which cases / niches each fits better into and why.
Are there particular features we want to introduce?
Some things I've thought would be nice to have as a user:
- Ability to get
x,ylayout offsets of text positions- Current workaround
- Use cases: Drawing overlays such as "error squiggles" in an IDE-like editor, drawing lines between two lines of assembly code that reference one another
- Faster alternatives for various operations that force
layout()calls- Current workaround
- Or at least the option to extend the classes to make our own exposed wrappers for things so that I can get rid of my ugly reflection code 🙃
But generally my main desire of such a refactoring effort would be higher test coverage and more thorough tests (Though, I understand testing a UI focused library is typically challenging).
But generally my main desire of such a refactoring effort would be higher test coverage and more thorough tests (Though, I understand testing a UI focused library is typically challenging).
I agree that the objective of refactoring to improve test coverage. But why does better coverage matter? It has two major benefits:
- It reduces the risk of breaking existing features when making changes, which in turn speeds up feature development and bug fixing.
- It naturally encourages a clean separation of layers, since proper testing requires isolating the core logic from the UI.
Right now, we have 72 unit tests, and most of the remaining coverage comes from integration tests that actually start the UI. My IDE reports only 16% unit test coverage, which is quite low. A good first step would be to extract the model from the UI and aim for at least 80% unit test coverage on the model layer.
As far as your feature proposals, have you looked at how to add these? Was there some issue in the structure that made it more difficult to implement these changes that we could refactor?
I have done PR #1303 to showcase what could be a methodology:
- I extracted some part of the logic into a unit that can be tested (with minimal/no modification)
- I wrote some UT (actually I will write more, but for today I'll limit myself to that). Note that these tests are covering what the code is actually doing right now (I make no assumption on what it should do).
- Once there is enough test, we create a PR (at this point there is no real refactoring having happened yet)
- Once PR is merged, we can start refactoring the code that is now covered in test. Separating PR allows to guarantee to minimize the risk of breakage
The goal of this step is to slowly break the code into smaller units that are tested and then we can start grouping them in logical units to refactor the overall architecture.
To differentiate PR we could assign as a rule that "[Refactoring preparation]" could be the first type of PR and "[Refactor]" the second one.
As far as your feature proposals, have you looked at how to add these? Was there some issue in the structure that made it more difficult to implement these changes that we could refactor?
It wasn't that the existing structure made it so that these couldn't be implemented. All that would be necessary for my use case are some getters.
Generally when I see a library keeping implementation details private I don't expect exposure requests to be taken too seriously. Especially if there is a reason for some of those for being private, like the cases where the public paragraph index requests do a layout() to ensure correctness (at the cost of perf).