bug in strsplit with /regex keyword and special string
Can anybody confirm this bug?
GDL> print,strsplit('{B}','{',/regex)
% STRTOK: Error processing regular expression: {
Invalid preceding regular expression.
% Error occurred at: STRSPLIT 92 strsplit.pro
IDL> print,strsplit('{B}','{',/regex)
1
confirmed :cry:
GDL uses regcomp() (see man regexp), and '{' is an EXTENDED regexp feature that introduces a so-called 'bound'. The documentation says: "A '{' followed by a character other than a digit is an ordinary character, not the beginning of a bound." Obviously the Linux regcomp() has a bug, since it should follow the documentation. On OSX there is no error. I would not change the GDL code (and change it to what?) if this is a Linux library problem.
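To check how a given libc handles this, here is a minimal sketch in C that compiles a few spellings of a literal '{' with POSIX regcomp() and reports the regerror() text. The helper names try_compile and report_brace_handling are hypothetical and not taken from GDL. The output depends on the platform: going by this thread, the bare '{' under REG_EXTENDED is rejected on Linux and accepted on OSX. The bracketed and escaped spellings are included only as candidates that are expected to compile on both.

```c
#include <assert.h>
#include <regex.h>
#include <stdio.h>

/* Hypothetical helper: compile `pattern` with `cflags` and copy either "ok"
 * or the library's regerror() text into `msg`. Returns regcomp()'s code. */
int try_compile(const char *pattern, int cflags, char *msg, size_t msg_len)
{
    regex_t re;
    assert(pattern != NULL && msg != NULL && msg_len > 0);

    int rc = regcomp(&re, pattern, cflags);
    if (rc == 0) {
        snprintf(msg, msg_len, "ok");
        regfree(&re);
    } else {
        regerror(rc, &re, msg, msg_len);
    }
    return rc;
}

/* Hypothetical helper: print how the local libc treats each spelling. */
void report_brace_handling(void)
{
    static const struct {
        const char *pattern;
        int cflags;
        const char *label;
    } cases[] = {
        { "{",   REG_EXTENDED, "extended, bare brace" },
        { "[{]", REG_EXTENDED, "extended, bracketed" },
        { "\\{", REG_EXTENDED, "extended, escaped" },
        { "{",   0,            "basic, bare brace" },
    };
    char msg[128];

    for (size_t i = 0; i < sizeof cases / sizeof cases[0]; i++) {
        try_compile(cases[i].pattern, cases[i].cflags, msg, sizeof msg);
        printf("%-22s %-4s -> %s\n", cases[i].label, cases[i].pattern, msg);
    }
}
```

Calling report_brace_handling() from a small main() on each platform shows whether only the bare brace in extended mode is affected.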
Several points:
- I won't agree with a wontfix flag!
- if I change line 7319 in basic_fun.cpp to "int cflags = 0;//REG_EXTENDED;", the code works fine on my Linux U22.04
- the story in the IDL STREGEX documentation is amazing: "STREGEX is based on the regex package written by Henry Spencer, modified by L3Harris Geospatial Solutions only to the extent required to integrate it into IDL. This package is freely available at: https://garyhouston.github.io/regex/." This should help us to easily patch our code.
Yes, cflags = 0 will work in this particular case, because '{' is then not recognized as an extension trigger. But IDL uses the extended features:
IDL> print,strsplit('{B}','{1}',/regex)
% STRTOK: Error processing regular expression: {1}
repetition-operator operand invalid
% Execution halted at: $MAIN$
IDL> print,strsplit('{ABBBAAB}','B{1}',/regex)
0 5 8
IDL> print,strsplit('{ABBBAAB}','B{3}',/regex)
0 5
GDL> print,strsplit('{B}','{1}',/regex)
% STRTOK: Error processing regular expression: {1}
Invalid preceding regular expression.
% Error occurred at: STRSPLIT 92 /usr/local/share/gnudatalanguage/lib/strsplit.pro
% $MAIN$
% Execution halted at: $MAIN$
GDL> print,strsplit('{ABBBAAB}','B{1}',/regex)
0 5 8
GDL> print,strsplit('{ABBBAAB}','B{3}',/regex)
0 5
So with cflags=0, GDL would lose important functionality.
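The loss can be sketched directly against POSIX regcomp()/regexec(). This is only an illustration, and the helper first_match is hypothetical, not GDL code. With REG_EXTENDED, 'B{3}' is a bound and matches the three consecutive B's; with cflags = 0 (basic syntax) the braces are ordinary characters, so the same pattern looks for the literal text "B{3}" and the bound would have to be spelled 'B\{3\}'.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

/* Hypothetical helper: find the first match of `pattern` in `text`.
 * Returns 1 and fills *start / *len on a match, 0 if nothing matches,
 * and -1 if the pattern does not compile. */
int first_match(const char *pattern, int cflags, const char *text,
                size_t *start, size_t *len)
{
    regex_t re;
    regmatch_t m;
    assert(pattern != NULL && text != NULL && start != NULL && len != NULL);

    if (regcomp(&re, pattern, cflags) != 0)
        return -1;

    int rc = regexec(&re, text, 1, &m, 0);
    regfree(&re);
    if (rc != 0)
        return 0;

    *start = (size_t)m.rm_so;
    *len = (size_t)(m.rm_eo - m.rm_so);
    return 1;
}
```

For the string '{ABBBAAB}' used above, first_match("B{3}", REG_EXTENDED, ...) should report a match, while first_match("B{3}", 0, ...) should not, which is why existing user patterns would silently change meaning.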
What IDL has changed is unknown. At least we know that GDL works under OSX, so it is not a GDL problem at all, and we should report the issue on the Linux side. Using a GitHub 'regexp' library instead of the system's one is probably a bit safer and should be attempted.
Submitted a bug report to Mageia (my distro) in the hope that they can get it fixed by the glibc guys (or get the documentation modified!).
@brandy125 you did (I quote distro maintainers) "discover this very obscure fault which requires a specially crafted program to show it?" This is going upstairs, to glibc...
Pushed to the glibc bug reports.
As I do not want to argue with the GNU people about the interpretation of the POSIX bible, we'll follow @alaingdl's suggestion and incorporate https://github.com/garyhouston/regex as a submodule.