patchutils
patchutils copied to clipboard
grepdiff: `--exended-regexp` does not work when compiled `--with-pcre2`
I discovered that if the package is compiled with --with-pcre2
(which is the default behaviour if libpcre2 is found on the system), then the following invocation does not return anything for the diff copied at the end of this message.
grepdiff --output-matching=hunk -E 'landblockelim|blockall' out.txt
Since either landblockelim
or blockall
are found in several of the hunks, I should get an output. However, if I use the BRE syntax instead:
grepdiff --output-matching=hunk -E 'landblockelim\|blockall' out.txt
then I get the expected output.
I ran the code through GDB and I see the regexp is correctly passed to regcomp
with REG_EXTENDED
set, but in <pcre2posix.h>
, which is included instead of <regex.h>
under --with-pcre2
, we have the following:
/* This is not used by PCRE2, but by defining it we make it easier
to slot PCRE2 into existing programs that make POSIX calls. */
#define REG_EXTENDED 0
so it seems something is missing for a working "out-of-the box" experience when compiling with libpcre2...
The diff in question:
commit 078aab4844983ccbb9c9fddc1a70e060e1d99a34
Author: Tony Craig <[email protected]>
Date: May 10 2022
Merge cgridDEV branch including C grid implementation and other fixes (#715)
Merge cgridDEV branch to main
diff --git a/cicecore/cicedynB/infrastructure/ice_domain.F90 b/cicecore/cicedynB/infrastructure/ice_domain.F90
index c44e896a..79f5bcb9 100644
--- a/cicecore/cicedynB/infrastructure/ice_domain.F90
+++ b/cicecore/cicedynB/infrastructure/ice_domain.F90
@@ -63,6 +63,8 @@ module ice_domain
maskhalo_dyn , & ! if true, use masked halo updates for dynamics
maskhalo_remap , & ! if true, use masked halo updates for transport
maskhalo_bound , & ! if true, use masked halo updates for bound_state
+ halo_dynbundle , & ! if true, bundle halo update in dynamics
+ landblockelim , & ! if true, land block elimination is on
orca_halogrid ! if true, input fields are haloed as defined by orca grid
!-----------------------------------------------------------------------
@@ -79,6 +81,7 @@ module ice_domain
! 'rake', 'spacecurve', etc
distribution_wght ! method for weighting work per block
! 'block' = POP default configuration
+ ! 'blockall' = no land block elimination
! 'latitude' = no. ocean points * |lat|
! 'file' = read distribution_wgth_file
character (char_len_long) :: &
@@ -153,13 +156,15 @@ subroutine init_domain_blocks
maskhalo_dyn = .false. ! if true, use masked halos for dynamics
maskhalo_remap = .false. ! if true, use masked halos for transport
maskhalo_bound = .false. ! if true, use masked halos for bound_state
+ halo_dynbundle = .true. ! if true, bundle halo updates in dynamics
add_mpi_barriers = .false. ! if true, throttle communication
debug_blocks = .false. ! if true, print verbose block information
- max_blocks = -1 ! max number of blocks per processor
+ max_blocks = -1 ! max number of blocks per processor
block_size_x = -1 ! size of block in first horiz dimension
block_size_y = -1 ! size of block in second horiz dimension
nx_global = -1 ! NXGLOB, i-axis size
ny_global = -1 ! NYGLOB, j-axis size
+ landblockelim = .true. ! on by default
if (my_task == master_task) then
write(nu_diag,*) subname,' Reading domain_nml'
@@ -284,7 +289,7 @@ end subroutine init_domain_blocks
!***********************************************************************
- subroutine init_domain_distribution(KMTG,ULATG)
+ subroutine init_domain_distribution(KMTG,ULATG,grid_ice)
! This routine calls appropriate setup routines to distribute blocks
! across processors and defines arrays with block ids for any local
@@ -299,6 +304,9 @@ subroutine init_domain_distribution(KMTG,ULATG)
KMTG ,&! global topography
ULATG ! global latitude field (radians)
+ character(len=*), intent(in) :: &
+ grid_ice ! grid_ice, B, C, CD, etc
+
!----------------------------------------------------------------------
!
! local variables
@@ -316,6 +324,7 @@ subroutine init_domain_distribution(KMTG,ULATG)
integer (int_kind) :: &
i,j,n ,&! dummy loop indices
ig,jg ,&! global indices
+ igm1,igp1,jgm1,jgp1,&! global indices
ninfo ,&! ice_distributionGet check
work_unit ,&! size of quantized work unit
#ifdef USE_NETCDF
@@ -449,6 +458,8 @@ subroutine init_domain_distribution(KMTG,ULATG)
flat = 1
endif
+ if (distribution_wght == 'blockall') landblockelim = .false.
+
allocate(nocn(nblocks_tot))
if (distribution_wght == 'file') then
@@ -504,10 +515,25 @@ subroutine init_domain_distribution(KMTG,ULATG)
if (this_block%i_glob(i) > 0) then
ig = this_block%i_glob(i)
jg = this_block%j_glob(j)
- if (KMTG(ig,jg) > puny .and. &
- (ULATG(ig,jg) < shlat/rad_to_deg .or. &
- ULATG(ig,jg) > nhlat/rad_to_deg) ) &
- nocn(n) = nocn(n) + flat(ig,jg)
+ if (grid_ice == 'C' .or. grid_ice == 'CD') then
+ ! Have to be careful about block elimination with C/CD
+ ! Use a bigger stencil
+ igm1 = mod(ig-2+nx_global,nx_global)+1
+ igp1 = mod(ig,nx_global)+1
+ jgm1 = max(jg-1,1)
+ jgp1 = min(jg+1,ny_global)
+ if ((KMTG(ig ,jg ) > puny .or. &
+ KMTG(igm1,jg ) > puny .or. KMTG(igp1,jg ) > puny .or. &
+ KMTG(ig ,jgp1) > puny .or. KMTG(ig ,jgm1) > puny) .and. &
+ (ULATG(ig,jg) < shlat/rad_to_deg .or. &
+ ULATG(ig,jg) > nhlat/rad_to_deg) ) &
+ nocn(n) = nocn(n) + flat(ig,jg)
+ else
+ if (KMTG(ig,jg) > puny .and. &
+ (ULATG(ig,jg) < shlat/rad_to_deg .or. &
+ ULATG(ig,jg) > nhlat/rad_to_deg) ) &
+ nocn(n) = nocn(n) + flat(ig,jg)
+ endif
endif
end do
endif
@@ -524,8 +550,8 @@ subroutine init_domain_distribution(KMTG,ULATG)
! Keep all blocks even the ones only containing land points
if (distribution_wght == 'block') nocn(n) = nx_block*ny_block
#else
- if (distribution_wght == 'block' .and. & ! POP style
- nocn(n) > 0) nocn(n) = nx_block*ny_block
+ if (distribution_wght == 'block' .and. nocn(n) > 0) nocn(n) = nx_block*ny_block
+ if (.not. landblockelim) nocn(n) = max(nocn(n),1)
#endif
end do
endif ! distribution_wght = file
So, after more investigation, this behaviour only appears on RHEL8, with a self-compiled patchutils 0.4.2 using the system-installed libpcre2 (pcre2-10.32-2.el8.src.rpm
).
It does not appear:
- On Ubuntu 22 with a self-compiled patchutils 0.4.2 using the system-installed
libpcre2
(10.39-3ubuntu0.1
) - On Ubuntu 22 with a self-compiled patchutils 0.4.2 using a self-compiled pcre2 at the same version as on RHEL 8 (10.32)
Turns out, the system where I was seeing this behaviour had LD_PRELOAD=/path/to/libc.so
set in the environment, which caused the libc regcomp
and regexec
functions to be linked at runtime, which explains why I had to use the BRE alternation syntax \|
. And since prec2posix.h
defines REG_EXTENDED=0
, using -E
had no effect.
Note that this would not have happened with pcre2 10.33 or later since the POSIX function names are only defined as preprocessor macros starting in that release: https://github.com/PCRE2Project/pcre2/blob/4a6a8b056f39079d5e958eac84c2ad173f4680bc/ChangeLog#L1050-L1054.
Two things to mention in retrospect:
- The commit that added pcre2 support, 8e61838a2c852c1789062a42a36761851e7fec4a, could have adjusted the help text when compiling with PCRE2 to mention that PCRE regexes are used, and that
-E
has no effect, since a user in general does not know how the program was compiled. - The changes to the configure script could be improved, since if we pass
--with-pcre2=/path/to/self/compiled/pcre2
, it results only in-lpcre2posix
being added toLIBS
, but not in-I/path/to/self/compiled/pcre2/include
being added toCFLAGS
, nor-L/path/to/self/compiled/pcre2/lib
being added toLDFLAGS
, so we end up using the system-installed version of the library.
@KangOl CC-ing you since you authored that commit :)
Can you resume the problem ? it is only on RHEL 8 ?
Hi @sergiomb2, in the end, the behaviour I was seeing was caused by LD_PRELOAD
being set by the sysadmins on the system I was using, which I had forgotten about. This is what I wrote in: https://github.com/twaugh/patchutils/issues/60#issuecomment-1982274526. There is nothing much to be done in patchutils on that front, except maybe not using the POSIX names regcomp
and regexec
when compiled with PCRE2, and instead using pcre2_regcomp
and pcre2_regexec
. This would have alleviated the problem.
So the issue can be closed, although I noted two possible improvements in https://github.com/twaugh/patchutils/issues/60#issuecomment-1982285550.