tst_filter fails on s390x

Open opoplawski opened this issue 6 years ago • 31 comments

Environment Information

Fedora Rawhide s390x

Summary of Issue

FAIL: tst_filter
================
++ top_srcdir=/builddir/build/BUILD/netcdf-c-4.6.2.1/build/..
++ top_builddir=/builddir/build/BUILD/netcdf-c-4.6.2.1/build
++ test x../../nc_test4 = x
+++ pwd
++ builddir=/builddir/build/BUILD/netcdf-c-4.6.2.1/build/nc_test4
++ execdir=/builddir/build/BUILD/netcdf-c-4.6.2.1/build/nc_test4
+++ basename ../../nc_test4
++ thisdir=nc_test4
+++ pwd
++ WD=/builddir/build/BUILD/netcdf-c-4.6.2.1/build/nc_test4
++ cd ../../nc_test4
+++ pwd
++ srcdir=/builddir/build/BUILD/netcdf-c-4.6.2.1/nc_test4
++ cd /builddir/build/BUILD/netcdf-c-4.6.2.1/build/nc_test4
++ cd /builddir/build/BUILD/netcdf-c-4.6.2.1/build/..
+++ pwd
++ top_srcdir=/builddir/build/BUILD/netcdf-c-4.6.2.1
++ cd /builddir/build/BUILD/netcdf-c-4.6.2.1/build/nc_test4
++ cd /builddir/build/BUILD/netcdf-c-4.6.2.1/build/nc_test4
+++ pwd
++ builddir=/builddir/build/BUILD/netcdf-c-4.6.2.1/build/nc_test4
++ cd /builddir/build/BUILD/netcdf-c-4.6.2.1/build/nc_test4
++ cd /builddir/build/BUILD/netcdf-c-4.6.2.1/build
+++ pwd
++ top_builddir=/builddir/build/BUILD/netcdf-c-4.6.2.1/build
++ cd /builddir/build/BUILD/netcdf-c-4.6.2.1/build/nc_test4
++ cd /builddir/build/BUILD/netcdf-c-4.6.2.1/build/nc_test4
+++ pwd
++ execdir=/builddir/build/BUILD/netcdf-c-4.6.2.1/build/nc_test4
++ cd /builddir/build/BUILD/netcdf-c-4.6.2.1/build/nc_test4
++ export srcdir top_srcdir builddir top_builddir execdir
++ test -e /builddir/build/BUILD/netcdf-c-4.6.2.1/build/ncdump/ncdump.exe
++ ext=
++ export NCDUMP=/builddir/build/BUILD/netcdf-c-4.6.2.1/build/ncdump/ncdump
++ NCDUMP=/builddir/build/BUILD/netcdf-c-4.6.2.1/build/ncdump/ncdump
++ export NCCOPY=/builddir/build/BUILD/netcdf-c-4.6.2.1/build/ncdump/nccopy
++ NCCOPY=/builddir/build/BUILD/netcdf-c-4.6.2.1/build/ncdump/nccopy
++ export NCGEN=/builddir/build/BUILD/netcdf-c-4.6.2.1/build/ncgen/ncgen
++ NCGEN=/builddir/build/BUILD/netcdf-c-4.6.2.1/build/ncgen/ncgen
++ export NCGEN3=/builddir/build/BUILD/netcdf-c-4.6.2.1/build/ncgen3/ncgen3
++ NCGEN3=/builddir/build/BUILD/netcdf-c-4.6.2.1/build/ncgen3/ncgen3
++ ncgen3c0=/builddir/build/BUILD/netcdf-c-4.6.2.1/ncgen3/c0.cdl
++ ncgenc0=/builddir/build/BUILD/netcdf-c-4.6.2.1/ncgen/c0.cdl
++ ncgenc04=/builddir/build/BUILD/netcdf-c-4.6.2.1/ncgen/c0_4.cdl
++ cd /builddir/build/BUILD/netcdf-c-4.6.2.1/build/nc_test4
+ API=1
+ NG=1
+ NCP=1
+ UNK=1
+ NGC=1
+ MISC=1
+ . /builddir/build/BUILD/netcdf-c-4.6.2.1/build/nc_test4/findplugin.sh
++ test x '!=' x
+ echo 'findplugin.sh loaded'
findplugin.sh loaded
+ findplugin bzip2
+ FP_NAME=bzip2
+ FP_ISCMAKE=
+ FP_ISMSVC=
+ topbuilddir=/builddir/build/BUILD/netcdf-c-4.6.2.1/build
++ uname
++ cut -d _ -f 1
+ FP_OS=Linux
+ test xLinux = xDarwin
++ uname
++ cut -d _ -f 1
+ FP_OS=Linux
+ test xLinux = xCYGWIN
+ FP_PLUGINS=/builddir/build/BUILD/netcdf-c-4.6.2.1/build/plugins
+ FP_PLUGIN_LIB=
+ FP_PLUGIN_PATH=
+ test x '!=' x
+ test x '!=' x
+ test x '!=' x
+ FP_PLUGIN_LIB=libbzip2.so
+ test x '!=' x -a x '!=' x
+ test -f /builddir/build/BUILD/netcdf-c-4.6.2.1/build/plugins/.libs/libbzip2.so
+ FP_PLUGIN_PATH=/builddir/build/BUILD/netcdf-c-4.6.2.1/build/plugins/.libs
+ test x/builddir/build/BUILD/netcdf-c-4.6.2.1/build/plugins/.libs = x
+ test -f /builddir/build/BUILD/netcdf-c-4.6.2.1/build/plugins/.libs/libbzip2.so
+ test x '!=' x -a x '!=' x
+ HDF5_PLUGIN_LIB=libbzip2.so
+ HDF5_PLUGIN_PATH=/builddir/build/BUILD/netcdf-c-4.6.2.1/build/plugins/.libs
+ return 0
+ BZIP2PATH=/builddir/build/BUILD/netcdf-c-4.6.2.1/build/plugins/.libs/libbzip2.so
+ findplugin misc
+ FP_NAME=misc
+ FP_ISCMAKE=
+ FP_ISMSVC=
+ topbuilddir=/builddir/build/BUILD/netcdf-c-4.6.2.1/build
++ uname
++ cut -d _ -f 1
+ FP_OS=Linux
+ test xLinux = xDarwin
++ uname
++ cut -d _ -f 1
+ FP_OS=Linux
+ test xLinux = xCYGWIN
+ FP_PLUGINS=/builddir/build/BUILD/netcdf-c-4.6.2.1/build/plugins
+ FP_PLUGIN_LIB=
+ FP_PLUGIN_PATH=
+ test x '!=' x
+ test x '!=' x
+ test x '!=' x
+ FP_PLUGIN_LIB=libmisc.so
+ test x '!=' x -a x '!=' x
+ test -f /builddir/build/BUILD/netcdf-c-4.6.2.1/build/plugins/.libs/libmisc.so
+ FP_PLUGIN_PATH=/builddir/build/BUILD/netcdf-c-4.6.2.1/build/plugins/.libs
+ test x/builddir/build/BUILD/netcdf-c-4.6.2.1/build/plugins/.libs = x
+ test -f /builddir/build/BUILD/netcdf-c-4.6.2.1/build/plugins/.libs/libmisc.so
+ test x '!=' x -a x '!=' x
+ HDF5_PLUGIN_LIB=libmisc.so
+ HDF5_PLUGIN_PATH=/builddir/build/BUILD/netcdf-c-4.6.2.1/build/plugins/.libs
+ return 0
+ MISCPATH=/builddir/build/BUILD/netcdf-c-4.6.2.1/build/plugins/.libs/libmisc.so
+ echo 'final HDF5_PLUGIN_PATH=/builddir/build/BUILD/netcdf-c-4.6.2.1/build/plugins/.libs'
final HDF5_PLUGIN_PATH=/builddir/build/BUILD/netcdf-c-4.6.2.1/build/plugins/.libs
+ export HDF5_PLUGIN_PATH
+ test -f /builddir/build/BUILD/netcdf-c-4.6.2.1/build/plugins/.libs/libbzip2.so
+ test -f /builddir/build/BUILD/netcdf-c-4.6.2.1/build/plugins/.libs/libmisc.so
+ test x1 = x1
+ echo '*** Testing dynamic filters using API'
*** Testing dynamic filters using API
+ rm -f ./bzip2.nc ./bzip2.dump ./tst_filter.txt
+ /builddir/build/BUILD/netcdf-c-4.6.2.1/build/nc_test4/test_filter
*** Testing API: bzip2 compression.
show parameters for bzip2: level=9
show chunks: chunks=4,4,4,4
*** Testing API: bzip2 decompression.
data comparison: |array|=256
no data errors
+ /builddir/build/BUILD/netcdf-c-4.6.2.1/build/ncdump/ncdump -s bzip2.nc
+ sclean ./tst_filter.txt ./bzip2.dump
+ cat ./tst_filter.txt
+ sed -e /_NCProperties/d
+ sed -e /_SuperblockVersion/d
+ sed -e /_IsNetcdf4/d
+ cat
+ sed -e /var:_Endianness/d
+ diff -b -w /builddir/build/BUILD/netcdf-c-4.6.2.1/nc_test4/bzip2.cdl ./bzip2.dump
+ echo '*** Pass: API dynamic filter'
*** Pass: API dynamic filter
+ test x1 = x1
+ echo
+ echo '*** Testing dynamic filters parameter passing'
*** Testing dynamic filters parameter passing
+ rm -f ./testmisc.nc tst_filter.txt tst_filter2.txt
+ /builddir/build/BUILD/netcdf-c-4.6.2.1/build/nc_test4/test_filter_misc
test1: compression.
test: nparams=14: params= 1 239 23 65511 27 77 93 1145389056 3287505826 1097305129 1 2147483648 4294967295 4294967295
 chunks=4,4,4,4
mismatch: [11] signed long long
fail (361): NetCDF: HDF error
test1: decompression.
data comparison: |array|=256
no data errors
mismatch: [11] signed long long
mismatch: [11] signed long long
+ /builddir/build/BUILD/netcdf-c-4.6.2.1/build/ncdump/ncdump -s testmisc.nc
+ getfilterattr ./tst_filter.txt ./tst_filter2.txt
+ case "$1" in
+ sed -e /var:_Filter/p -ed
+ rm -f ./tst_filter.txt
+ trimleft ./tst_filter2.txt ./tst_filter.txt
+ sed -e 's/[ 	]*\([^ 	].*\)/\1/'
+ rm -f ./tst_filter2.txt
+ cat
+ diff -b -w ./tst_filter.txt ./tst_filter2.txt
+ echo '*** Pass: parameter passing'
*** Pass: parameter passing
+ test x1 = x1
+ echo '*** Testing dynamic filters using ncgen'
*** Testing dynamic filters using ncgen
+ rm -f ./bzip2.nc ./bzip2.dump ./tst_filter.txt
+ /builddir/build/BUILD/netcdf-c-4.6.2.1/build/ncgen/ncgen -lb -4 -o bzip2.nc /builddir/build/BUILD/netcdf-c-4.6.2.1/nc_test4/bzip2.cdl
+ /builddir/build/BUILD/netcdf-c-4.6.2.1/build/ncdump/ncdump -s bzip2.nc
+ sclean ./tst_filter.txt ./bzip2.dump
+ cat ./tst_filter.txt
+ sed -e /var:_Endianness/d
+ sed -e /_NCProperties/d
+ sed -e /_SuperblockVersion/d
+ sed -e /_IsNetcdf4/d
+ cat
+ diff -b -w /builddir/build/BUILD/netcdf-c-4.6.2.1/nc_test4/bzip2.cdl ./bzip2.dump
+ echo '*** Pass: ncgen dynamic filter'
*** Pass: ncgen dynamic filter
+ test x1 = x1
+ echo '*** Testing dynamic filters using nccopy'
*** Testing dynamic filters using nccopy
+ rm -f ./unfiltered.nc ./filtered.nc ./tmp.nc ./filtered.dump ./tst_filter.txt
+ /builddir/build/BUILD/netcdf-c-4.6.2.1/build/ncgen/ncgen -4 -lb -o unfiltered.nc /builddir/build/BUILD/netcdf-c-4.6.2.1/nc_test4/ref_unfiltered.cdl
+ /builddir/build/BUILD/netcdf-c-4.6.2.1/build/ncgen/ncgen -4 -lb -o unfilteredvv.nc /builddir/build/BUILD/netcdf-c-4.6.2.1/nc_test4/ref_unfilteredvv.cdl
+ echo '	*** Testing simple filter application'
	*** Testing simple filter application
+ /builddir/build/BUILD/netcdf-c-4.6.2.1/build/ncdump/nccopy -M0 -F /g/var,307,9,4 unfiltered.nc filtered.nc
+ /builddir/build/BUILD/netcdf-c-4.6.2.1/build/ncdump/ncdump -s filtered.nc
+ sclean ./tst_filter.txt ./filtered.dump
+ sed -e /var:_Endianness/d
+ cat
+ sed -e /_SuperblockVersion/d
+ sed -e /_NCProperties/d
+ sed -e /_IsNetcdf4/d
+ cat ./tst_filter.txt
+ diff -b -w /builddir/build/BUILD/netcdf-c-4.6.2.1/nc_test4/ref_filtered.cdl ./filtered.dump
+ echo '	*** Pass: nccopy simple filter'
	*** Pass: nccopy simple filter
+ echo '	*** Testing '\''*'\'' filter application'
	*** Testing '*' filter application
+ /builddir/build/BUILD/netcdf-c-4.6.2.1/build/ncdump/nccopy -M0 -F '*,307,9,4' unfilteredvv.nc filteredvv.nc
+ /builddir/build/BUILD/netcdf-c-4.6.2.1/build/ncdump/ncdump -s filteredvv.nc
+ sclean ./tst_filtervv.txt ./filteredvv.dump
+ cat ./tst_filtervv.txt
+ sed -e /_IsNetcdf4/d
+ cat
+ sed -e /var:_Endianness/d
+ sed -e /_SuperblockVersion/d
+ sed -e /_NCProperties/d
+ diff -b -w /builddir/build/BUILD/netcdf-c-4.6.2.1/nc_test4/ref_filteredvv.cdl ./filteredvv.dump
9c9
< 		var1:_Endianness = "little" ;
---
> 		var1:_Endianness = "big" ;
28c28
<   		var2:_Endianness = "little" ;
---
>   		var2:_Endianness = "big" ;
FAIL tst_filter.sh (exit status: 1)

opoplawski avatar Feb 25 '19 00:02 opoplawski

Am I correct in thinking that the s390x is a big-endian machine?

DennisHeimbigner avatar Feb 25 '19 03:02 DennisHeimbigner

Yes
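For anyone who wants to confirm the byte order of a build host directly, here is a minimal sketch. It assumes a POSIX shell with `printf` and `od`; the `byte_order` function name is hypothetical and not part of netcdf-c.

```shell
# Print the byte order of the machine this runs on.
# The two bytes 0x01 0x00 are read back by od as one native 16-bit
# unsigned integer: 1 on a little-endian host, 256 on a big-endian host.
byte_order() {
    value=$(printf '\001\000' | od -An -tu2 | tr -d ' \n')
    case "$value" in
        1)   echo little ;;
        256) echo big ;;
        *)   echo unknown ;;
    esac
}

byte_order
```

On an s390x builder this would be expected to report `big`, which matches the `_Endianness = "big"` lines in the diff output.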

opoplawski avatar Feb 25 '19 03:02 opoplawski

OK, I have a fix. As soon as I see it does not break anything, I will post the modified file for you to test on the s390x.

DennisHeimbigner avatar Feb 25 '19 03:02 DennisHeimbigner

Attached are the two files in nc_test4 that need to be replaced.

Try these two: files.zip

DennisHeimbigner avatar Feb 25 '19 04:02 DennisHeimbigner

Thanks, works for me.

opoplawski avatar Feb 26 '19 15:02 opoplawski

This appears to have returned with 4.7.1

opoplawski avatar Sep 14 '19 03:09 opoplawski

Ok. Let me take a look, reopening.

WardF avatar Sep 16 '19 21:09 WardF

Still there with 4.7.2

opoplawski avatar Oct 26 '19 02:10 opoplawski

Is this just that ref_filteredvv.cdl contains the string "little"? If so, we need to fix the test so that sed strips the endianness line from both ref_filteredvv.cdl and filteredvv.dump before they are compared.

DennisHeimbigner avatar Oct 26 '19 02:10 DennisHeimbigner

@DennisHeimbigner at a glance it looks like tst_filter.sh should already be removing the Endianness string comparison. Unless you have a fix in the pipeline, I'll follow up on this and figure out what's going on.

WardF avatar Oct 28 '19 20:10 WardF

I wonder if that test should put the leading colon in square brackets, e.g. /[:]_Endianness?
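One possible reading of the trace: the cleanup step deletes lines matching `var:_Endianness`, while the variables in the failing diff are named `var1` and `var2`, so that pattern would never match them. A minimal sketch comparing the two patterns follows; the sample lines are modeled on the diff output, and the function names `strip_narrow` and `strip_broad` are hypothetical.

```shell
# Sample lines modeled on the failing diff output (illustrative only).
sample='var1:_Endianness = "big" ;
var1:_Storage = "chunked" ;'

# Pattern as it appears in the trace: requires the variable to be named exactly "var".
strip_narrow() { sed -e '/var:_Endianness/d'; }

# Bracketed-colon form: matches the attribute on any variable name.
strip_broad() { sed -e '/[:]_Endianness/d'; }

echo "narrow pattern:"
printf '%s\n' "$sample" | strip_narrow   # both lines survive

echo "broad pattern:"
printf '%s\n' "$sample" | strip_broad    # only the _Storage line survives
```

This is only a demonstration of how the two regular expressions behave; whether it fully explains the failure would still need to be checked against the actual script.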

DennisHeimbigner avatar Oct 28 '19 21:10 DennisHeimbigner

@DennisHeimbigner there is an Endianness flag present in one of the reference files in the repository; I’m not sure why it’s not being stripped out by the sed invocation in tst_filter.sh. Your suggestion might work, although I wonder if the Endianness string should just be removed altogether?

WardF avatar Oct 28 '19 21:10 WardF

The reason the ref file still has it is apparently that it is not being passed through the sclean function. We can just remove the _Endianness from that file if that solves the problem.
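An alternative to editing the reference file would be to clean both sides before comparing. A minimal sketch, assuming `mktemp` is available; the helper names `strip_endianness` and `diff_ignoring_endianness` are hypothetical and are not the functions in tst_filter.sh.

```shell
# Hypothetical helper: print a CDL file without its _Endianness lines.
strip_endianness() {
    sed -e '/_Endianness/d' "$1"
}

# Hypothetical helper: diff a reference file against a dump, ignoring
# _Endianness on both sides. Returns diff's exit status (2 if mktemp fails).
diff_ignoring_endianness() {
    ref_clean=$(mktemp) || return 2
    out_clean=$(mktemp) || { rm -f "$ref_clean"; return 2; }
    strip_endianness "$1" > "$ref_clean"
    strip_endianness "$2" > "$out_clean"
    diff -b -w "$ref_clean" "$out_clean"
    status=$?
    rm -f "$ref_clean" "$out_clean"
    return "$status"
}
```

Called as `diff_ignoring_endianness ref_filteredvv.cdl filteredvv.dump`, this would treat files that differ only in "little" versus "big" as equal, at the cost of no longer checking the endianness attribute at all.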

DennisHeimbigner avatar Oct 28 '19 21:10 DennisHeimbigner

This is unobvious. Where is that Endianness flag coming from? It is not in any of the ref files as near as I can tell.

DennisHeimbigner avatar Oct 28 '19 21:10 DennisHeimbigner

I take it back, but it is in an input to ncgen and is not referenced after that. I wonder if this is related to the fact that there is an NC_ENDIAN_NATIVE that ncdump tests for.

DennisHeimbigner avatar Oct 28 '19 21:10 DennisHeimbigner

Without access to a big-endian system I can't fully test a fix, but I can remove the endianness flag from the reference file.

WardF avatar Oct 28 '19 22:10 WardF

Ok, I'm setting up with qemu for testing, using the process found here.

WardF avatar Oct 28 '19 23:10 WardF

Just a follow-up, hdf5 is still compiling.

WardF avatar Oct 29 '19 22:10 WardF

I can say that tst_filter fails on a big endian machine with commit 6555488dfc379af4ad6d6da5a4717b2dc788be90.

test-suite-logs.tar.gz

@WardF Feel free to ask questions about the referenced QEMU mips repo.

t-b avatar Oct 31 '19 14:10 t-b

Ok, the failures in that test log are coming from test_filter_misc.c and are errors from the meta-data creation actions. I do not have enough info to resolve them.

DennisHeimbigner avatar Oct 31 '19 16:10 DennisHeimbigner

Hi Dennis,

I'm seeing the following:

http://cdash.unidata.ucar.edu/testDetails.php?test=387708&build=14371

This is a different failure from the one reported above. Any insight? It took a day to build hdf5, so I'd like to avoid redoing that unless need be.

WardF avatar Nov 01 '19 17:11 WardF

Oh, I see they're the same as the ones @t-b submitted. Never mind, @DennisHeimbigner, you've answered it already. Will keep looking at this.

WardF avatar Nov 01 '19 17:11 WardF

I can say that tst_filter fails on a big endian machine with commit 6555488.

test-suite-logs.tar.gz

@WardF Feel free to ask questions about the referenced QEMU mips repo.

My only question would be if there is a way to speed up the emulated machine; my cursory investigation says no, but I'm hardly a qemu expert :).

WardF avatar Nov 01 '19 18:11 WardF

@WardF Not really at the moment. That is a limitation of the emulated architecture. I've tried to set up an s390x Debian in QEMU as well, but the installer on that platform is really different from the default installer.

t-b avatar Nov 01 '19 20:11 t-b

I've determined the failure was introduced between 4.7.0 and 4.7.1. Setting up a git bisect to track this down, but I expect it will take some time; compiling each commit takes about an hour.
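For reference, a bisect like this can be automated with `git bisect run`, which interprets the exit status of a script: 0 marks the commit good, 125 skips it, and other codes from 1 to 127 mark it bad. The following is a minimal sketch of such a step; the `bisect_step` name, the tag names, and the build and test commands in the usage comment are assumptions, not the setup actually used here.

```shell
# Hypothetical step for `git bisect run`.
# $1: build command, $2: test command (both passed as strings).
bisect_step() {
    build_cmd=$1
    test_cmd=$2
    # A commit that cannot be built says nothing about the bug: skip it.
    sh -c "$build_cmd" >/dev/null 2>&1 || return 125
    # Build succeeded: a failing test marks the commit as bad.
    sh -c "$test_cmd" >/dev/null 2>&1 || return 1
    return 0
}

# Assumed usage, with the function saved in a file named bisect_step.sh
# (file name, tag names, and make targets are illustrative):
#   git bisect start v4.7.1 v4.7.0
#   git bisect run sh -c '. ./bisect_step.sh; bisect_step "make -j4" "make check"'
```

With roughly an hour per build, skipping unbuildable commits automatically avoids having to babysit the run.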

WardF avatar Nov 04 '19 18:11 WardF

Ran git bisect overnight, currently reporting commit 8d0bced6 as the first failure. Investigating this morning.

WardF avatar Nov 05 '19 16:11 WardF

Any progress here? I have to admit that I think the error is different now than before:

FAIL: tst_filter
================
findplugin.sh loaded
final HDF5_PLUGIN_PATH=/builddir/build/BUILD/netcdf-c-4.7.3/build/plugins/.libs
*** Testing dynamic filters using API
*** Testing API: bzip2 compression.
show parameters for bzip2: level=9
show chunks: chunks=4,4,4,4
*** Testing API: bzip2 decompression.
data comparison: |array|=256
no data errors
*** Pass: API dynamic filter
*** Testing dynamic filters parameter passing
test1: compression.
test: nparams=14: params= 1 239 23 65511 27 77 93 1145389056 3287505826 1097305129 1 2147483648 4294967295 4294967295
dimsizes=4,4,4,4
chunksizes=4,4,4,4
mismatch: [11] signed long long
fail (378): NetCDF: HDF error
test1: decompression.
data comparison: |array|=256
no data errors
test2: dimsize % chunksize != 0: compress.
fail (134): Permission denied
fail (135): NetCDF: Not a valid ID
fail (139): NetCDF: Not a valid ID
fail (139): NetCDF: Not a valid ID
fail (139): NetCDF: Not a valid ID
fail (139): NetCDF: Not a valid ID
fail (141): NetCDF: Not a valid ID
fail (214): NetCDF: Not a valid ID
fail (114): NetCDF: Not a valid ID
bad chunk store
fail (148): NetCDF: Not a valid ID
fail (156): NetCDF: Not a valid ID
fail: line=161 param mismatch
mismatch: [11] signed long long
mismatch: [11] signed long long
FAIL tst_filter.sh (exit status: 1)

Still seeing it with 4.7.3.

opoplawski avatar Nov 24 '19 23:11 opoplawski

I'm able to duplicate the issue but have not had a chance to work on it these last couple of weeks. I will switch gears and take a look. We are hindered by the slowness of the qemu machine, but we should be able to get this sorted out. The 'fail (134): Permission denied' is an interesting one; I wonder how many of the subsequent failures are the result of not being able to access whatever file or chunk 'Permission denied' is referring to.

WardF avatar Nov 25 '19 23:11 WardF

Any progress here? I think this is the last thing holding us back from updating netcdf in Fedora to 4.7.3 (assuming we don't find anything in any dependent packages).

opoplawski avatar Feb 04 '20 03:02 opoplawski

The length of time it takes to debug this has made it difficult to work on and easy to lose in the shuffle of other issues. Our IT dept. has now secured some big-endian hardware that we can use to debug this much more efficiently; I'm hoping to be able to hop back onto this next week and have a fix in for the 4.7.4 release we are beginning to prepare. Thanks for your patience!

WardF avatar Feb 14 '20 22:02 WardF