Regex#each_name(&) and Regex#names
fwiw Regex#names exists in Ruby.
Regex#each_name was added as a helper and a way to access group names without additional allocations.
Could this get a 2nd approval?
These were never discussed before, sorry. I don't understand their usefulness so I won't approve them.
2nd approval anyone? /cc @bcardiff @jhass
This needs a rebase
@straight-shoota done.
Can you provide a use case?
Also, UInt16 should be replace with Int32.
Then, I don't understand why names is useful for, and why there needs to be an unsorted and sorted variants. This makes the API bigger but I can't see how it can be used for anything useful.
Can you provide a use case?
@asterite These new methods are documented and I consider that enough for explaining the use cases.
Also, UInt16 should be replace with Int32.
That would be probably better, yes.
Then, I don't understand why names is useful for, and why there needs to be an unsorted and sorted variants. This makes the API bigger but I can't see how it can be used for anything useful.
Agreed, although unsorted version is there for efficiency reasons, yet if deemed unnecessary I can leave only sorted version.
@asterite ping 👋
Sorry, I don't have much more time to review or decide things.
We can't just change the return type of #name_table from Hash(UInt16, String) to Hash(Int32, String). That would break the API. For now, we need to keep that method.
There could be a different method returning Hash(Int32, String), though.
I've also some doubts if #names is a good addition. I see some risk that the behaviour cause confusion: The absence of unnamed capture groups might be missed. So I'm leaning towards not having this method.
Is there a good use case for that, besides "Ruby has it"? I don't think it's much relevant and can easily be implemented when needed.
We can't just change the return type of
#name_tablefromHash(UInt16, String)toHash(Int32, String). That would break the API. For now, we need to keep that method. There could be a different method returningHash(Int32, String), though.
https://github.com/search?q=name_table+language%3ACrystal&type=code
There's literally no single case there which would necessitate adding a new method.
I've also some doubts if
#namesis a good addition. I see some risk that the behaviour cause confusion: The absence of unnamed capture groups might be missed. So I'm leaning towards not having this method.
I don't get the point. If you call names you'd expect to see named captures, what else could you expect?
Is there a good use case for that, besides "Ruby has it"? I don't think it's much relevant and can easily be implemented when needed.
There's no other way to get the list of named captures, so that's enough for the use case for me, and this kind of - already accessible - feature, shouldn't force you to monkey-patch stdlib.
There's literally no single case there which would necessitate adding a new method.
That just means no single use published at GitHub. And it doesn't matter. It's a breaking change.
I don't get the point. If you call names you'd expect to see named captures, what else could you expect?
You could expect the named captures to be indexed by their capture group index.
There's no other way to get the list of named captures, so that's enough for the use case for me,
That's not a use case at all. There's no point in adding a method unless there is a real use case.
And sure, you can get an array of names from name_table.values (unsorted) or name_table.to_a.sort!.map(&.[1]) (sorted).
Hi @Sija , thanks for taking the time to write the PR. But we need to ask you a little bit extra:
Can you provide a use case?
@asterite These new methods are documented and I consider that enough for explaining the use cases.
The documentation explains the use of the method, but does not provide enough evidence that the change is necessary. What we're asking is what makes you want to do this change. What couldn't be done before, or how inefficient or ugly was to do it, and what is better with your proposal? Ideally, you should first open an issue stating the problem, and perhaps drafting a solution, and then move to coding when we all agree on your proposal. Since we're discussing it here now, can you provide an answer to this question in this PR? Thanks!
@beta-ziliani My use case was the replacement/escaping named group values without knowing the group names in advance. Without #names I had to resort to #name_table.to_a.sort!.map(&.[1]) call. And sorry for the delay 🙏
#each_name sounds reasonable, but I agree the method name should be changed. We now have Regex::PCRE2#each_named_capture_group which is pretty much the same idea. Note that you could only skip the Hash allocation, not the Strings themselves; the standard library avoids exposing strings as Bytes publicly in this way, even if it means fewer allocations.
#names is entirely redundant. With #each_name you are free to implement your own #name or #name_array (https://github.com/crystal-lang/crystal/pull/9760#discussion_r628544426) based on it.
#each_namesounds reasonable, but I agree the method name should be changed. We now haveRegex::PCRE2#each_named_capture_groupwhich is pretty much the same idea. Note that you could only skip theHashallocation, not theStrings themselves; the standard library avoids exposing strings asBytespublicly in this way, even if it means fewer allocations.
Yeah, name is up for the debate.
#namesis entirely redundant. With#each_nameyou are free to implement your own#nameor#name_array(#9760 (comment)) based on it.
This I'm not sure I follow. Why is it redundant? You could pose the same argument for String#chars vs String#each_char. Also, Ruby has it, so they didn't see it as redundant either.