winmerge icon indicating copy to clipboard operation
winmerge copied to clipboard

Fixed Substitution Filters

Open EugeneLaptev opened this issue 4 years ago • 8 comments

I tried to use Substitution Filters for my code changes that contain many symbol renames and found that most of the time Substitution Filters don't work. After looking at the code you appear to have removed some important logic. I've introduced the logic back and for me the feature works properly again.

There is also a slight improvement gui-wise. If a selection is present on both panes then the command Add This Change To Substitution Filters uses the selection. Comes very handy for me.

Keep doing the great work!

EugeneLaptev avatar Jul 16 '21 16:07 EugeneLaptev

Thank you for PR.

Specifically, in what cases does substitution filters not work? This PR introduces a lot of complexity, so I'd like to do that if I can fix it with the current logic.

Perhaps what you're trying to do is replace only word-by-word differences. I'm afraid that this may not work in some cases.

If your problem cannot be solved without introducing word-by-word difference substitution, I would like to be able to switch to the current logic as an option.

sdottaka avatar Jul 16 '21 22:07 sdottaka

Hi! For me the algorithm updated by you worked half of the time. For example I have a rename that turns bool gbUseVectorRaycast = true; into bool gbUseIspcForRaycasts = true; The current WinMerge cannot ignore this change, even if I explicitly enter the change to ignore: bUseVectorRaycast->gbUseIspcForRaycasts

The problem might be the following. If you go through the text and try to apply the list of substitutions then it's not guaranteed that it'll work correctly for the individual pairs. You have to consider all the pairs individually. For example, consider this substitution list A->B A1->C According to this list the texts CA1T and CCT should appear to be the same, up-to the trivial changes defined by the substitution list, but they won't.

I've just discovered that if the substitution list contains ""->"" (empty string to empty string) then it hangs on opening a file.

I agree that an apparently simple feature can involve complicated logic. I am an interested end user and you hold the entire picture of the codebase. We should eventually come up with a solution that works for me and makes sense for everyone.

(Btw, my current commit (at work) contains so many renames mixed with changes of interest that I considered that it would be quicker to go fix WinMerge than ignore the renames with my eyes)

EugeneLaptev avatar Jul 17 '21 11:07 EugeneLaptev

I understand that the following substitution filters do not work for texts CA1T and CCT.

A-> B
A1-> C

However, since the above filter has too few characters, I feel that it is necessary to change the order or change using more appropriate character strings and regular expressions.

I couldn't understand in the text below that it doesn't work with the current substitution filter implementation.

[left]

bool gbUseVectorRaycast = true;

[right]

bool gbUseIspcForRaycasts = true;

I defined it as below, but it looks like it's working. Does it mean that the substitution filters on other lines have an effect and don't work as in the example above?

image

SubstitutionList::AreAllChangesTrivial() in this PR doesn't seem to work when you want to ignore date differences like this:

[left]

2021-07-17 01:23:45

[right]

2021-07-18 23:45:01

[Substition filters(regular expression)] "\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d" → "xxxx-xx-xx xx: xx: xx"

However, in this PR, the above substitution filter works because the existing processing executes after AreAllChangesTrivial(). Is this intended?

After all, it would make sense to target the substitution filters only on word-by-word differences, but since that may not work in some cases, I would like to add a checkbox to enable or disable it.

In addition, I would like to release the stable version 2.16.14 within this month, so I would like to accept this PR after the release.

sdottaka avatar Jul 17 '21 14:07 sdottaka

I'll come up with a clear repro a bit later. My text was a bit more involved than those two bools. Btw the reason I split the patern/replacement pairs into common and different parts was because the changed blocks are analysed for the differences first. These differences would match the middlePart. But this may well be clear to you anyway. Of course, dont compromise stability, if you think so. My fix can wait. Plus those options for the substitution pairs (regexp, match case, match whole word only) do not work and need to be either removed or implemented for my version. At least, I have a version that allows me to do my main work more effeciently. Good luck.

EugeneLaptev avatar Jul 17 '21 21:07 EugeneLaptev

I've just discovered that if the substitution list contains ""->"" (empty string to empty string) then it hangs on opening a file.

I've fixed that issue with commit 9840d43.

sdottaka avatar Jul 18 '21 01:07 sdottaka

Ok, I could not create an easy repro for those two bools. What must be have been happening was that my substitution list, being rather big (~30 pairs), contined strings that somehow mapped to the substrings for the names. Thus it was failing for the reason outlined above.

Anyway for any significantly big subsitution list this problem will likely be hitting users occasionally. So the current algorithm in WinMerge of applying the full list and then comparing the result is not going to be a reialble solution.

EugeneLaptev avatar Jul 19 '21 13:07 EugeneLaptev

Thinking of it, it's not clear what's best to do.

Because this looks interesting as well:

[Substition filters(regular expression)] "\d\d\d\d-\d\d-\d\d \d\d:\d\d:\d\d" → "xxxx-xx-xx xx: xx: xx"

Maybe an option like you suggest.

Btw is there a function that takes two chars and tells if there is a word boundary bettween them? Thanks

EugeneLaptev avatar Jul 22 '21 09:07 EugeneLaptev

Btw is there a function that takes two chars and tells if there is a word boundary bettween them?

The isWordBreak () function in stringdiffs.cpp may be close to that.

sdottaka avatar Jul 22 '21 10:07 sdottaka

If you have a fix, please create a new PR

sdottaka avatar Mar 13 '23 00:03 sdottaka