Email cloak plugin fails for emails with IDN
Pull Request for Issue #39811
Summary of Changes
change regular expression of Search email and patttern
Testing Instructions
use any language email for cloaking
Actual result BEFORE applying this Pull Request
All language email not cloaked
Expected result AFTER applying this Pull Request
All language email will cloaked
Link to documentations
Please select:
-
[ ] Documentation link for docs.joomla.org:
-
[ ] No documentation changes for docs.joomla.org needed
-
[ ] Pull Request link for manual.joomla.org:
-
[ ] No documentation changes for manual.joomla.org needed
You are welcome to try my suggestion, which does not have so many changes and is more reliable in these examples. My suggestion: https://regex101.com/r/1wQv4v/1

Your suggestion: https://regex101.com/r/mOVnlL/1

We have to be careful with this, because many other versions of email addresses are handled in the file. These must all continue to work afterwards. Otherwise we have again quite many new Issius to it.
a@b is a valid email address (intranets) but it fails both of the regex above
Do these email addresses need to be cloaked in intranets? Isn't it much more a protection against spam or abuse from the outside?
@LukasHH @richard67 with your regular expression email that contain only latin latters before @ will accepted so replacing this with p{L} take care for that is this fine if we change regular expression to :
"([\p{L}.'-+]+@(?:[a-z0-9.-\p{L}]+.)+(?:[a-zA-Z0-9-\p{L}]{2,24}))";
we can also short this regular expression to
"([\p{L}.'-+]+@(?:[.-\p{L}]+.)+(?:[-\p{L}]{2,24}))";
as p{L} accept all alphnumaric character no need to specify other like a-z0-9 and a-zA-Z0-9
OK - that's better. But the special characters must be escaped.
([\p{L}.\'\-\+]+\@(?:[\.\-\p{L}]+.)+(?:[\-\p{L}]{2,24}))
@LukasHH @richard67 please test it !
I have tested this item :white_check_mark: successfully on 02e6a09af50ec2c78dfa9c3423540e9e9105d6c9
This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/39888.
Please test all variants that the plugin covers and have already occurred. This change affects all variants. I have here an example code to test, which you can copy in a new article about source code.
https://github.com/LukasHH/other-test/blob/main/email_cloaking_test_source.html
The change is still running on errors. Especially because of the wrong brackets. Here is an example, the test page with the sample code. https://j4-test.it-conserv.de/test/issiu
Please test all variants that the plugin covers and have already occurred. This change affects all variants. I have here an example code to test, which you can copy in a new article about source code.
even with the fixed regexp some tests failed. Tests 17, 18, 19 have a wrong email(partial) in both a and b. 20 - emails not cloaked in unordered list and between <p>. Failed test 21 related to https://github.com/joomla/joomla-cms/issues/39786
@Kaushik1216 see https://github.com/joomla/joomla-cms/pull/39888#issuecomment-1442566361
Please check my suggestion. In my tests this worked without any problems.
row 71 (add u at the end of the pattern)
$pattern = '~(?:<a ([^>]*)href\s*=\s*"mailto:' . $link . '"([^>]*))>' . $text . '</a>~iu';
row 106 (replace the pattern)
$searchEmail = "([\p{L}\p{N}\.\'\-\+]+\@(?:[\.\-\p{L}\p{N}]+\.)+(?:[\-\p{L}\p{N}]{2,24}))";
row 444 (add u at the end of the pattern)
$pattern = '~<[^<]*(?<!\/)>(*SKIP)(*F)|<[^>]+?(\w*=\"' . $searchEmail . '\")[^>]*\/>~iu';
row 458 (add ~ at the start and ~iu at the end of the pattern)
$pattern = '~(<[^>]+?(\w*=\"' . $searchEmail . '")[^>]*>[^<]+<[^<]+>)~iu';
row 474 (add u at the end of the pattern)
$pattern = '~<[^<]*(?<!\/)>(*SKIP)(*F)|' . $searchEmail . '~iu';
This should support all variants with unicode characters.
Please check my suggestion. In my tests this worked without any problems.
row 71 (add u at the end of the pattern)
$pattern = '~(?:<a ([^>]*)href\s*=\s*"mailto:' . $link . '"([^>]*))>' . $text . '</a>~iu';row 106 (replace the pattern)
$searchEmail = "([\p{L}\p{N}\.\'\-\+]+\@(?:[\.\-\p{L}\p{N}]+\.)+(?:[\-\p{L}\p{N}]{2,24}))";row 444 (add u at the end of the pattern)
$pattern = '~<[^<]*(?<!\/)>(*SKIP)(*F)|<[^>]+?(\w*=\"' . $searchEmail . '\")[^>]*\/>~iu';row 458 (add ~ at the start and ~iu at the end of the pattern)
$pattern = '~(<[^>]+?(\w*=\"' . $searchEmail . '")[^>]*>[^<]+<[^<]+>)~iu';row 474 (add u at the end of the pattern)
$pattern = '~<[^<]*(?<!\/)>(*SKIP)(*F)|' . $searchEmail . '~iu';This should support all variants with unicode characters.
ok I will do
@LukasHH thanks you for helping me .Please test it !
I have tested this item :white_check_mark: successfully on 4b7d61365a1b88b9a70ba4de969f590bdd1b694f
This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/39888.
This pull request has been automatically rebased to 4.3-dev.
This pull request has been automatically rebased to 4.4-dev.
I have tested this item :white_check_mark: successfully on 4b7d61365a1b88b9a70ba4de969f590bdd1b694f
This comment was created with the J!Tracker Application at issues.joomla.org/tracker/joomla-cms/39888.
Unfortunately the PR has a conflict in file plugins/content/emailcloak/emailcloak.php which cannot be resolved with the GitHub UI so it needs to be done locally in a git clone.
The conflict comes from the emailcloak plugin having been converted to the new structure in 4.4-dev meanwhile. I will see if I can fix that.
Conflict solved.