com.florianingerl.util.regex

Captured Named Groups Always return nulls for indexes above 0

Open matthew-elisha opened this issue 4 years ago • 9 comments

I tested the following code:

Matcher matcher = Pattern.compile("(?x)"
    + "(?(DEFINE)"
    + "(?<sum> (?'summand')(?:\\+(?'summand'))+ )"
    + "(?<summand> (?'product') | (?'number') )"
    + "(?<product> (?'factor')(?:\\*(?'factor'))+ )"
    + "(?<factor>(?'number') )"
    + "(?<number>\\d++)"
    + ")"
    + "(?'sum')").matcher("5+6*8");
matcher.matches();

But the following line simply printed null:

System.out.println(matcher.group("number"));

What am I getting wrong, please?

matthew-elisha avatar Jan 09 '21 12:01 matthew-elisha

Hello Matthew, there is no group called number on the lowest level; number is only a group nested inside another group. You have to inspect the capture tree to see what the group number actually captured!

All the best, Florian

florianingerl avatar Jan 09 '21 13:01 florianingerl

Thanks a lot.

matthew-elisha avatar Jan 09 '21 13:01 matthew-elisha

Hi,

I still have an issue, please. The captured value of every group above index 1 is null. Is that normal? How can I access the captured groups by name with the following pattern?

Matcher matcher = Pattern.compile("(?x)"
    + "(?(DEFINE)"
    + "(?<sum> (?'summand')(?:\\+(?'summand'))+ )"
    + "(?<summand> (?'product') | (?'number') )"
    + "(?<product> (?'factor')(?:\\*(?'factor'))+ )"
    + "(?<factor>(?'number') )"
    + "(?<number>\\d++)"
    + ")"
    + "(?'sum')").matcher("5+6*8");
matcher.matches();

EDIT: My point is that I want to access the value of a captured subsequence of the recursive pattern above with something like this: matcher.group("name");

matthew-elisha avatar Jan 09 '21 14:01 matthew-elisha

Hello,

Any response to the above please?

matthew-elisha avatar Jan 11 '21 08:01 matthew-elisha

You have to inspect the capture tree to see what these groups captured!

florianingerl avatar Jan 11 '21 09:01 florianingerl

Hello, I agree with matthew-elisha, assuming that he invoked Matcher.find() or the like. Groups work for Matchers but not for MatchResults, which is quite clear from the implementation of Matcher.toMatchResult().

I was happy to see that MatchResult.group(String) promises to return named group matches, but the problem also affects numbered groups.

Example:

import java.util.regex.Pattern;
import java.util.regex.Matcher;
import java.util.regex.MatchResult;

// import com.florianingerl.util.regex.Pattern;
// import com.florianingerl.util.regex.Matcher;
// import com.florianingerl.util.regex.MatchResult;

public class test {
  public static void main(String[] args) {
    Pattern pattern = Pattern.compile("a(\\w+)b");
    Matcher matcher = pattern.matcher("ahellob");
    System.out.println("match: " + matcher.find());
    System.out.println("group1: "+matcher.toMatchResult().group(1));
  }
}

This prints hello as expected. With the ingerl classes the result is null, because the returned MatchResult is an unmatched matcher on which find() has not been applied yet. It should be hello.

The same is true for named groups.

A fix I would accept is simply that Matcher.toMatchResult() returns this. That is not really safe, because one could access the internals by casting, but I would consider that criminal. Alternatively, you could copy the relevant pieces of information.
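The "copy the relevant pieces of information" option could look roughly like the sketch below. It is written against the JDK's java.util.regex classes only, so it is not the change made in the library; the names SnapshotDemo and SnapshotResult are hypothetical, and the constructor assumes the matcher has already matched.

```java
import java.util.regex.MatchResult;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration; not the library's actual implementation.
final class SnapshotDemo {

    /** Immutable copy of the group data of a matcher that has already matched. */
    static final class SnapshotResult implements MatchResult {
        private final String[] groups;
        private final int[] starts;
        private final int[] ends;

        SnapshotResult(Matcher m) {
            int n = m.groupCount() + 1; // slot 0 holds the whole match
            groups = new String[n];
            starts = new int[n];
            ends = new int[n];
            for (int i = 0; i < n; i++) {
                groups[i] = m.group(i); // null if group i did not participate
                starts[i] = m.start(i); // -1 if group i did not participate
                ends[i] = m.end(i);
            }
        }

        @Override public int start() { return starts[0]; }
        @Override public int start(int group) { return starts[group]; }
        @Override public int end() { return ends[0]; }
        @Override public int end(int group) { return ends[group]; }
        @Override public String group() { return groups[0]; }
        @Override public String group(int group) { return groups[group]; }
        @Override public int groupCount() { return groups.length - 1; }
    }

    public static void main(String[] args) {
        Matcher matcher = Pattern.compile("a(\\w+)b").matcher("ahellob");
        if (matcher.find()) {
            MatchResult snapshot = new SnapshotResult(matcher);
            matcher.reset(); // the snapshot no longer depends on the matcher's state
            System.out.println("group1: " + snapshot.group(1)); // prints "group1: hello"
        }
    }
}
```

Because the snapshot holds only copied strings and offsets, casting it back to a Matcher is impossible, which avoids the safety concern of returning this.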

Reissner avatar Dec 07 '23 19:12 Reissner

The latest commit should fix the problem:

https://github.com/florianingerl/com.florianingerl.util.regex/commit/289e8b106d7beae740ff049f238905135e7e5062

florianingerl avatar Dec 07 '23 20:12 florianingerl

You are really fast! Ah, not as minimal as I suggested. Better.

Hm, did you publish? Which version? 1.1.10?

Reissner avatar Dec 07 '23 20:12 Reissner

Just to be sure: does it also work for named groups? A corresponding test case may be a good idea.
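A named-group check along these lines might serve as a starting point. It is only a sketch against the JDK's java.util.regex classes, which give the reference behaviour; the class name NamedGroupCheck, the method capturedWord and the group name word are made up for illustration. A test for the ingerl classes would presumably swap the imports and read the group through toMatchResult() instead, which is not exercised here.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper; shows the expected value for a named group.
final class NamedGroupCheck {

    /** Returns the text captured by the named group "word", or null if nothing matches. */
    static String capturedWord(String input) {
        Matcher matcher = Pattern.compile("a(?<word>\\w+)b").matcher(input);
        return matcher.find() ? matcher.group("word") : null;
    }

    public static void main(String[] args) {
        System.out.println("word: " + capturedWord("ahellob")); // prints "word: hello"
    }
}
```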

Reissner avatar Dec 07 '23 21:12 Reissner

Hello Reissner, I just managed to deploy a new version in which the bug is fixed. It is version 1.1.11 and can now be downloaded from Maven Central. This was a lot of work.

All the best, Florian

florianingerl avatar Jun 03 '24 17:06 florianingerl

The new version 1.1.11, in which the bug is fixed, can be downloaded from Maven Central.

florianingerl avatar Jun 03 '24 17:06 florianingerl