mattermost-plugin-autolink icon indicating copy to clipboard operation
mattermost-plugin-autolink copied to clipboard

Match patterns in parentheses

Open wodny opened this issue 5 years ago • 22 comments

Add matching opening parenthesis to the non-word pattern prefix. Closing parenthesis has already been there.

wodny avatar Sep 24 '20 18:09 wodny

Thanks @wodny for the pull request!

Per the CONTRIBUTING.md file displayed when you created this pull request, we need to add you to the list of approved contributors for the Mattermost project.

Please help complete the Mattermost contribution license agreement?

This is a standard procedure for many open source projects. Your form should be processed within 24 hours and reviewers for your pull request will be able to proceed.

Please let us know if you have any questions.

We are very happy to have you join our growing community! If you're not yet a member, please consider joining our Contributors community channel to meet other contributors and discuss new opportunities with the core team.

mattermod avatar Sep 24 '20 18:09 mattermod

Codecov Report

Merging #130 into master will not change coverage. The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #130   +/-   ##
=======================================
  Coverage   45.04%   45.04%           
=======================================
  Files           6        6           
  Lines         535      535           
=======================================
  Hits          241      241           
  Misses        278      278           
  Partials       16       16           
Impacted Files Coverage Δ
server/autolink/autolink.go 51.16% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 8acad23...e9dacd9. Read the comment docs.

codecov-commenter avatar Sep 24 '20 18:09 codecov-commenter

@wodny Appreciate the contribution and understand the intent. I'll need to look into the history carefully. The custom delimiter set (as opposed to WordMatch=true) has a long history of fine tuning, and I am afraid that this change would break an existing use case; unfortunately not covered by a unit test?

levb avatar Sep 25 '20 13:09 levb

@levb I've looked at the history and it seems that the set of characters for matching without using \b is the same since the beginning, i.e. d0760cfb3bd49895ce37e99d589181972088be30 (went through all the file renames).

I can see tuning regarding:

  • temporary replacement of custom prefix/suffix with \b only,
  • return of custom prefix/suffix and option to use \b as an alternative,
  • ReplaceAll called once, twice and replaced with some custom code (multiple versions),
  • double-quoted strings replaced with backtick strings.

So it looks like the parenthesis addition shouldn't break any assumptions. I've done some tests with patterns containing parentheses, those patterns inside another parentheses and so on...

wodny avatar Sep 25 '20 15:09 wodny

@wodny Appreciate the contribution and understand the intent. I'll need to look into the history carefully. The custom delimiter set (as opposed to WordMatch=true) has a long history of fine tuning, and I am afraid that this change would break an existing use case; unfortunately not covered by a unit test?

@levb @iomodo So it seems that I missed the fact that my change actually makes already existing tests fail. I ignored the failed part of CI on GitHub because of its name - "coverage". I assumed it's about percentage of coverage and not about tests. I've logged into Circle CI a couple of minutes ago to see the details.

I think that unit tests treating the ( character as non-boundary need a discussion.

But the second group of fails is much more interesting: more than one pattern/template seems to be applied to a single text fragment and MM-12345 gets expanded twice.

I'm going to experiment a bit with the code despite my lack of knowledge about Go ;)

BTW, with a more complicated pattern (?P<MattermostNonWordPrefix>(^|^\(|\s|\s\()) tests pass but I have to think if it would be a solution or just a dirty workaround.

wodny avatar Sep 28 '20 16:09 wodny

@levb @iomodo So I came to a conclusion that the more complex regex isn't pretty. Modifying the code to apply only the first matching pattern/template to a piece of text is beyond my capabilities taking no Go experience into account. I assume that it would be preferable not to search for a pattern in an already expanded text.

I've worked around my problem with ticket numbers in parentheses (eg. See foo bar (#1234).) by creating two pattern/template entries. One regular for #\d+ and one for (#\d+) with DisableNonWordPrefix set to true. It works good enough.

Is the DisableNonWordPrefix setting a part of the official API? It's covered by tests and is available via the /autolink command but is not documented in README. So maybe a better PR would be the one containing description of DisableNonWordPrefix and DisableNonWordSuffix and an example of matching things like ticket numbers in parentheses by creating two entries as mentioned above? What do you think?

wodny avatar Sep 28 '20 19:09 wodny

This PR has been automatically labelled "stale" because it hasn't had recent activity. A core team member will check in on the status of the PR to help with questions. Thank you for your contribution!

/cc @jasonblais @jfrerich

mattermod avatar Oct 09 '20 01:10 mattermod

/check-cla

jasonblais avatar Oct 09 '20 21:10 jasonblais

@levb @iomodo please see above note from @wodny and help provide guidance on next steps?

Thanks @wodny for the contribution!

jasonblais avatar Oct 09 '20 21:10 jasonblais

/cla-check

jasonblais avatar Oct 09 '20 21:10 jasonblais

I have also confirmed @wodny has signed the CLA, uncertain why the CLA check fails.

jasonblais avatar Oct 09 '20 21:10 jasonblais

Codecov Report

Merging #130 into master will not change coverage. The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #130   +/-   ##
=======================================
  Coverage   45.04%   45.04%           
=======================================
  Files           6        6           
  Lines         535      535           
=======================================
  Hits          241      241           
  Misses        278      278           
  Partials       16       16           
Impacted Files Coverage Δ
server/autolink/autolink.go 51.16% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 8acad23...e9dacd9. Read the comment docs.

codecov-io avatar Oct 09 '20 22:10 codecov-io

@wodny apologies for not responding sooner.

RE: "change actually makes already existing tests fail." - exactly, I think had I tried something like this before and then realized that it was there for a reason.

I wonder if the best solution would just be to make the entire prefix (and suffix) user-customizable, with a meaningful (\b?) default. It can be added as an extra parameter in the link, effectively subsuming "word match". @aaronrothschild @iomodo @wodny What do you think?

levb avatar Oct 12 '20 01:10 levb

Codecov Report

Merging #130 into master will not change coverage. The diff coverage is 100.00%.

Impacted file tree graph

@@           Coverage Diff           @@
##           master     #130   +/-   ##
=======================================
  Coverage   45.04%   45.04%           
=======================================
  Files           6        6           
  Lines         535      535           
=======================================
  Hits          241      241           
  Misses        278      278           
  Partials       16       16           
Impacted Files Coverage Δ
server/autolink/autolink.go 51.16% <100.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more Δ = absolute <relative> (impact), ø = not affected, ? = missing data Powered by Codecov. Last update 8acad23...e9dacd9. Read the comment docs.

codecov-commenter avatar Oct 12 '20 01:10 codecov-commenter

@wodny apologies for not responding sooner.

RE: "change actually makes already existing tests fail." - exactly, I think had I tried something like this before and then realized that it was there for a reason.

I wonder if the best solution would just be to make the entire prefix (and suffix) user-customizable, with a meaningful (\b?) default. It can be added as an extra parameter in the link, effectively subsuming "word match". @aaronrothschild @iomodo @wodny What do you think?

Sorry, I'm not following here.

aaronrothschild avatar Oct 12 '20 10:10 aaronrothschild

I wonder if the best solution would just be to make the entire prefix (and suffix) user-customizable, with a meaningful (\b?) default. It can be added as an extra parameter in the link, effectively subsuming "word match". @aaronrothschild @iomodo @wodny What do you think?

Sorry, I'm not following here.

Currently, the only option is to select between two predefined sets of boundaries using ~~DisableNonWordPrefix and DisableNonWordSuffix~~ WordMatch, ie. ^|\s + $|[\s\.\!\?\,\)] or \b or to disable boundaries with DisableNonWordPrefix and DisableNonWordSuffix. I suppose @levb suggests to generalize those ~~two~~ settings and allow custom user boundaries for those who eg. don't have JIRA and don't need current defaults and like me need to support Redmine issue hash notation in parentheses. Although, probably the default should be to use the predefined longer and more versatile boundary, not \b.

wodny avatar Oct 12 '20 11:10 wodny

I wonder if the best solution would just be to make the entire prefix (and suffix) user-customizable, with a meaningful (\b?) default. It can be added as an extra parameter in the link, effectively subsuming "word match". @aaronrothschild @iomodo @wodny What do you think?

I think it would work for me as this would provide functionality of the original PR but with the major advantage of not being hard-coded.

wodny avatar Oct 12 '20 11:10 wodny

This PR has been automatically labelled "stale" because it hasn't had recent activity. A core team member will check in on the status of the PR to help with questions. Thank you for your contribution!

/cc @jasonblais @jfrerich

mattermod avatar Oct 23 '20 01:10 mattermod

@aaronrothschild @levb can you help advice on next steps?

jasonblais avatar Oct 23 '20 16:10 jasonblais

This PR has been automatically labelled "stale" because it hasn't had recent activity. A core team member will check in on the status of the PR to help with questions. Thank you for your contribution!

/cc @jasonblais @jfrerich @emilyacook

mattermod avatar Oct 31 '20 01:10 mattermod

@aaronrothschild @levb reminder to help advise on next steps?

jfrerich avatar Oct 31 '20 14:10 jfrerich

Hi @levb @aaronrothschild - I'm helping review stale PRs and it seems this PR is waiting on your advice for next steps? Please help prioritize. TIA

/cc @wodny

lindy65 avatar May 26 '21 08:05 lindy65

Closing as inactive

hanzei avatar Feb 21 '23 11:02 hanzei