Fixed JavaKeywordsDemo not updating highlighting on the correct lines when multiple changes happen in succession
This pull request fixes issue #1301 by accounting for the fact that later changes affect the line numbers of earlier changes.
This solution may be of help to those wanting a way to run updates only on lines that have been changed (e.g. for efficiency), whether that's for syntax highlighting or other operations. Issue #1273 seems related to this.
Actually it appears this still isn't correct because offsetToPosition() is relying on the latest version of the text, not the text at the time of the edit. I think what we need to do is find the offsets affected across all the changes first (accounting for their influence on each other), then convert these to line numbers.
Everything should be correct now. I used IndexRanges to minimise calls to offsetToPosition() in case those may be slow for very large texts.
@willplatt Thanks for the PR, however this will need careful review and some changes.
@Jugen I assumed that while the subscribe function was running there couldn't be any more changes to the text. I thought all this code should be running on the FX thread, which is why there's a separate async demo. Are you saying some other code might change the text outside the FX thread? Or are you saying that at the moment the subscribe function is called there already could have been another change to the text? (If that's the case it seems like something to address in the implementation of EventStream because it's common to want the event you're processing to be the latest one, otherwise the event no longer corresponds to the model.)
The problem is that reduceSuccessions takes whatever time we've set it to wait before emitting an event and in that time further changes to the document may happen that could change the conversion result.
@Jugen I understand that, but the subscribe() lambda shouldn't be called unless no changes have occurred during that wait period. Is the issue that that the timer runs on another thread, so it's possible for it to run out but just before it notifies the FX thread another change occurs to the text on the FX thread?
Let's use an example: "The dog chased the cat"
- Change cat to mouse at position 19
- Then immediately change dog to mouse as well, within the succession period.
- Pause editing so succession ends and the events are emitted. If you now calculate where cat was, at 19, you'll get a different answer because the document isn't in the state it was when the edit occurred. Sometimes this may be the result that is needed, other times not - so we need to be careful that we are working with the value expected.
Note this isn't a real example that'll provide meaningful values showing the problem.
I don't see the problem in your example. In my code both events will be passed to the subscribed listener, which will see the order in which the changes were made and work out that the first change is no longer at position 19, it's at 21, while the second change is at 4. Then it computes the lines of these positions based on the final text, which is what we want.
@Symeon94 I agree that the code would be nicer if broken into small functions, but I'm aware others might have a different opinion. You're probably right that it would be better to map() the Set instead of using removeIf(). But I'm a little confused about your comment on "threshold". It's a word I've often seen used in programming or when talking about numbers in any field. Some dictionary definitions are "The quantitative point at which an action is triggered, especially a lower limit" and "The point that must be exceeded to begin producing a given effect or result or to elicit a response" (source).
But I think first @Jugen needs to be convinced that my proposed changes work correctly.
Ok, but you get the general idea of my comments:
- Split methods to keep them short
- Have significant naming for variables and methods
Breaking in small methods is a good way to convey meaning. In this case, you used comments describing what happens as replacement for breaking into smaller units. It is also easier to understand when methods do a single task.
As far as threshold is concerned, the word itself is not conveying meaning on the underlying reality. You kept the method naming generic, which is good. Re-reading your method, it seems to me that threshold is fine, but it is the method name that could be improved by referring to the range end as this is what is compared to the threshold. This small change would clarify a lot I think. I'm thinking of shiftWhenIndexesEndAboveThreshold or to keep it short shiftWhenEndAboveThreshold.
@willplatt Okay, I've taken a closer look at what you have submitted, and I understand it better now. So, first off apologies for my earlier response it was premature.
In summary the primary bug that is being addressed here is that edits that occur positionally earlier but temporally later cause the previous edit positions to be incorrect and need to be updated. In order to fix this, changes during the succession time are collected together and then processed to correct for any shifts.
So, I was wondering if sorting the change list on position before processing it wouldn't simplify the algorithm a bit ?
Then I'm not clear on why in the shift method with the adjustments to newEnd and newStart we want the greater (max) of the adjustment and threshold. Maybe put a comment there to explain the necessity.
Wrt to the shift method's parameter "threshold" I think "changePosition" would be a better name for it and then change the shift method's name accordingly to "shiftIndicesAtOrAbovePosition".
Since this is a demo, I'm not going to nitpick too much about any it - however do consider @Symeon94 input in terms of clarifying things.
@Jugen I think your summary of the bug and the fix is correct.
I don't think we want to sort the changes on position because then you could have an early change affecting the last change (which is already up-to-date and shouldn't be altered). We only want later changes to affect earlier change positions. But what you could do instead of how I did it is iterate through the list backwards (or reverse it and go forwards), having each item modify the ones before it.
The reason I use Math.max() of the adjusted position and the threshold is because you can't have a change starting at position x moving some text at position y > x to a position before x. However, the shift calculation could give you a position before x if the change involved replacement of the text from x to beyond y. In such cases we no longer care about the text that was at position y because it was deleted. But instead of figuring out what the adjusted range technically is in light of this deletion it's simpler just to pretend that position y was shifted to x. We know x is one of our change positions anyway so there's no extra syntax highlighting that results from this simplification. Writing it out makes it sound complicated but it's not really. I can spread this step out over a few lines with a comment or two to make it clearer.
It sounds like the rest of the code makes sense to you so I'll just tidy it up.
@Jugen @Symeon94 I tried to incorporate both your suggestions, so please review again and hopefully it is clearer now.