Test failures on s390x

Open ellert opened this issue 3 years ago • 4 comments

  • [x] Checked for duplicates

Describe the bug

When running the unit tests on s390x there are several failures:

97% tests passed, 34 tests failed out of 1232
Label Time Summary:
longtest               = 572.89 sec*proc (27 tests)
multithreaded          = 889.21 sec*proc (2 tests)
python_runtime_deps    =  58.00 sec*proc (11 tests)
tutorial               = 4391.45 sec*proc (786 tests)
Total Test time (real) = 2976.41 sec
The following tests FAILED:
	 16 - pyunittests-pyroot-pyz-stl-vector (Failed)
	 57 - pyunittests-pyroot-pyz-rtensor (Failed)
	213 - gtest-roofit-histfactory-test-testHistFactory (Failed)
	219 - gtest-roofit-roofit-test-testRooCrystalBall (Failed)
	220 - gtest-roofit-roofit-test-testRooJohnson (Failed)
	223 - gtest-roofit-roofit-test-testSumW2Error (Failed)
	228 - gtest-roofit-roofitcore-test-testRooBinSamplingPdf (Failed)
	229 - gtest-roofit-roofitcore-test-testRooSimPdfBuilder (Failed)
	230 - gtest-roofit-roofitcore-test-testRooWrapperPdf (Failed)
	231 - gtest-roofit-roofitcore-test-testRooFitDriver (Failed)
	233 - gtest-roofit-roofitcore-test-testRooAbsPdf (Failed)
	237 - gtest-roofit-roofitcore-test-testRooProdPdf (Failed)
	241 - gtest-roofit-roofitcore-test-testTestStatistics (Failed)
	242 - gtest-roofit-roofitcore-test-testRooProductPdf (Failed)
	243 - gtest-roofit-roofitcore-test-testNaNPacker (Failed)
	244 - gtest-roofit-roofitcore-test-testRooSimultaneous (Failed)
	245 - gtest-roofit-roofitcore-test-testRooGradMinimizerFcn (Failed)
	247 - gtest-roofit-roofitcore-test-testLikelihoodSerial (Failed)
	248 - gtest-roofit-roofitcore-test-testRooRealL (Failed)
	249 - gtest-roofit-roofitcore-test-testGlobalObservables (Failed)
	252 - gtest-roofit-roostats-test-testSPlot (Failed)
	274 - test-stresshistogram (Failed)
	275 - test-stresshistogram-interpreted (Failed)
	296 - test-stresshistofit (Failed)
	297 - test-stresshistofit-interpreted (Failed)
	396 - gtest-tree-dataframe-test-datasource-ntuple (Failed)
	402 - gtest-tree-ntuple-v7-test-ntuple-basics (Failed)
	406 - gtest-tree-ntuple-v7-test-ntuple-merger (Failed)
	412 - gtest-tree-ntuple-v7-test-ntuple-serialize (Failed)
	420 - gtest-tree-ntuple-v7-test-ntuple-minifile (Failed)
	423 - gtest-tree-ntuple-v7-test-ntuple-extended (Failed)
	870 - tutorial-roofit-rf612_recoverFromInvalidParameters (Failed)
	1077 - tutorial-dataframe-df006_ranges-py (Failed)
	1106 - tutorial-math-exampleFunction-py (Failed)
Errors while running CTest

With the proposed change in #10303 to not fail on the warning about RooNaNPacker not being implemented for big endian, the list of failures is shorter:

99% tests passed, 17 tests failed out of 1232
Label Time Summary:
longtest               = 540.69 sec*proc (27 tests)
multithreaded          = 678.83 sec*proc (2 tests)
python_runtime_deps    =  60.39 sec*proc (11 tests)
tutorial               = 3802.05 sec*proc (786 tests)
Total Test time (real) = 2732.59 sec
The following tests FAILED:
	 16 - pyunittests-pyroot-pyz-stl-vector (Failed)
	 57 - pyunittests-pyroot-pyz-rtensor (Failed)
	237 - gtest-roofit-roofitcore-test-testRooProdPdf (Failed)
	243 - gtest-roofit-roofitcore-test-testNaNPacker (Failed)
	274 - test-stresshistogram (Failed)
	275 - test-stresshistogram-interpreted (Failed)
	296 - test-stresshistofit (Failed)
	297 - test-stresshistofit-interpreted (Failed)
	396 - gtest-tree-dataframe-test-datasource-ntuple (Failed)
	402 - gtest-tree-ntuple-v7-test-ntuple-basics (Failed)
	406 - gtest-tree-ntuple-v7-test-ntuple-merger (Failed)
	412 - gtest-tree-ntuple-v7-test-ntuple-serialize (Failed)
	420 - gtest-tree-ntuple-v7-test-ntuple-minifile (Failed)
	423 - gtest-tree-ntuple-v7-test-ntuple-extended (Failed)
	870 - tutorial-roofit-rf612_recoverFromInvalidParameters (Failed)
	1077 - tutorial-dataframe-df006_ranges-py (Failed)
	1106 - tutorial-math-exampleFunction-py (Failed)
Errors while running CTest
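
The two CTest summaries above can be compared mechanically rather than by eye. The following is a minimal sketch, assuming the summary lines keep the `<number> - <name> (Failed)` shape shown in the logs; the helper name `failed_tests` and the shortened excerpts are illustrative only and not part of ROOT or CTest.

```python
import re
from typing import Set

# Matches CTest summary lines such as "\t243 - some-test-name (Failed)".
# Trailing label columns (e.g. "longtest") after the status are ignored.
FAILED_LINE = re.compile(r"^\s*\d+\s+-\s+(\S+)\s+\((?:Failed|Timeout)\)")


def failed_tests(ctest_summary: str) -> Set[str]:
    """Return the names of the tests reported as Failed or Timeout."""
    names = set()
    for line in ctest_summary.splitlines():
        match = FAILED_LINE.match(line)
        if match:
            names.add(match.group(1))
    return names


# Shortened excerpts from the two runs quoted in this issue.
before = failed_tests("""
\t243 - gtest-roofit-roofitcore-test-testNaNPacker (Failed)
\t244 - gtest-roofit-roofitcore-test-testRooSimultaneous (Failed)
\t274 - test-stresshistogram (Failed)
""")
after = failed_tests("""
\t243 - gtest-roofit-roofitcore-test-testNaNPacker (Failed)
\t274 - test-stresshistogram (Failed)
""")

fixed = sorted(before - after)          # failing before, passing after
still_failing = sorted(before & after)  # failing in both runs
```

Matching on test names rather than test numbers keeps the comparison usable across ROOT versions, since the numbering shifts when tests are added or removed.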

For both lists the proposed change in #10308 was applied.
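
As background on the RooNaNPacker warning mentioned above: packing a payload into a NaN generally means storing extra bits in the mantissa of a quiet NaN. The sketch below illustrates that general idea and why code that accesses the payload byte by byte depends on the host byte order. The bit layout (`PAYLOAD_MASK`) and the function names are assumptions made for illustration, not RooNaNPacker's actual layout.

```python
import math
import struct

QUIET_NAN_BITS = 0x7FF8000000000000  # exponent all ones, quiet bit set
PAYLOAD_MASK = 0xFFFFFFFF            # hypothetical: low 32 mantissa bits hold the payload


def pack_payload(payload: int) -> float:
    """Build a quiet NaN whose low mantissa bits carry `payload`."""
    bits = QUIET_NAN_BITS | (payload & PAYLOAD_MASK)
    return struct.unpack("<d", struct.pack("<Q", bits))[0]


def unpack_payload(value: float) -> int:
    """Recover the payload from the bit pattern of `value`."""
    bits = struct.unpack("<Q", struct.pack("<d", value))[0]
    return bits & PAYLOAD_MASK


packed = pack_payload(0xDEADBEEF)
assert math.isnan(packed)
assert unpack_payload(packed) == 0xDEADBEEF

# The same double laid out in memory on little- and big-endian hosts.
le_bytes = struct.pack("<d", packed)  # payload occupies the first four bytes
be_bytes = struct.pack(">d", packed)  # payload occupies the last four bytes
```

Working on the 64-bit integer value, as above, is independent of byte order; code that instead reads the payload at a fixed byte offset will find it at opposite ends of the double on the two kinds of host.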

Expected behavior

Ideally there should be no test failures.

To Reproduce

Steps to reproduce the behaviour:

  1. Build ROOT for s390x.
  2. Run the unit tests.

Setup

  1. ROOT version 6.26.02.
  2. Fedora rawhide for s390x. The list of failures is similar for other Fedora and EPEL releases.
  3. Build from source (during package build for Fedora/EPEL).

Additional context

The log

ellert avatar Apr 17 '22 13:04 ellert

Assigning to Axel because I am not sure whether ROOT tries to support s390x at all.

eguiraud avatar Apr 19 '22 13:04 eguiraud

Regarding the RNTuple failures, one issue is the endianness (big-endian), which is addressed in #10402
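
To illustrate the kind of problem endianness causes for an on-disk format: if integers are written in host byte order, a file produced on a little-endian machine is misread on a big-endian one such as s390x. The sketch below shows the usual remedy, serializing with an explicit byte order. It assumes a little-endian on-disk layout and uses hypothetical helper names; it is not the RNTuple code.

```python
import struct
import sys


def serialize_u32_le(value: int) -> bytes:
    """Encode a 32-bit unsigned integer as little-endian on any host."""
    return struct.pack("<I", value)


def deserialize_u32_le(buffer: bytes) -> int:
    """Decode a little-endian 32-bit unsigned integer on any host."""
    return struct.unpack("<I", buffer)[0]


value = 0x01020304
portable = serialize_u32_le(value)          # same bytes on every platform
native = value.to_bytes(4, sys.byteorder)   # depends on the host byte order

# Reading little-endian bytes as if they were big-endian scrambles the value.
misread = struct.unpack(">I", portable)[0]
```

On a little-endian host `native` and `portable` coincide, which is why such bugs typically surface only on big-endian platforms.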

jblomer avatar Apr 21 '22 07:04 jblomer

@ellert What is the status with current master? The PR with the ntuple issues has been merged. Thanks!

ferdymercury avatar May 08 '25 07:05 ferdymercury

Pinging @ellert

ferdymercury avatar Jun 08 '25 07:06 ferdymercury

Here is an s390x build using master from yesterday (Jun 18): https://koji.fedoraproject.org/koji/taskinfo?taskID=134115604

Many RNTuple tests still fail:

The following tests FAILED:

  1 - cppinterop-CppInterOpTests (Failed)               DEPENDS
 13 - pyunittests-bindings-distrdf-backend-distrdf-unit-backend-graph-caching (Failed) python
 43 - pyunittests-bindings-pyroot-pythonizations-pyroot-pyz-stl-vector (Failed) python
 83 - pyunittests-bindings-pyroot-pythonizations-pyroot-pyz-rvec-asrvec (Failed) python python_runtime_deps
 84 - pyunittests-bindings-pyroot-pythonizations-pyroot-pyz-rdataframe-makenumpy (Failed) python python_runtime_deps
 89 - pyunittests-bindings-pyroot-pythonizations-pyroot-pyz-rtensor (Failed) python python_runtime_deps
126 - gtest-core-dictgen-dictgen-base (Failed)
144 - gtest-core-metacling-TClingTest (Failed)
424 - test-stresshistogram (Failed)                     longtest
425 - test-stresshistogram-interpreted (Failed)
436 - test-stresshistofit (Failed)                      longtest
437 - test-stresshistofit-interpreted (Failed)
450 - gtest-tmva-sofie-TestCustomModelsFromONNX (Failed)
543 - gtest-tree-dataframe-dataframe-snapshot-ntuple (Failed)
544 - gtest-tree-dataframe-dataframe-unified-constructor (Failed)
552 - gtest-tree-ntuple-ntuple-basics (Failed)
553 - gtest-tree-ntuple-ntuple-bulk (Failed)
554 - gtest-tree-ntuple-ntuple-cast (Failed)
557 - gtest-tree-ntuple-ntuple-compat (Failed)
562 - gtest-tree-ntuple-ntuple-join-table (Failed)
563 - gtest-tree-ntuple-ntuple-merger (Failed)
565 - gtest-tree-ntuple-ntuple-model (Failed)
566 - gtest-tree-ntuple-ntuple-multi-column (Failed)
567 - gtest-tree-ntuple-ntuple-packing (Timeout)
570 - gtest-tree-ntuple-ntuple-processor (Failed)
571 - gtest-tree-ntuple-ntuple-processor-chain (Failed)
572 - gtest-tree-ntuple-ntuple-processor-join (Failed)
573 - gtest-tree-ntuple-ntuple-project (Failed)
574 - gtest-tree-ntuple-ntuple-modelext (Failed)
577 - gtest-tree-ntuple-ntuple-types (Failed)
578 - gtest-tree-ntuple-ntuple-view (Failed)
581 - gtest-tree-ntuple-rfield-class (Failed)
583 - gtest-tree-ntuple-rfield-variant (Failed)
584 - gtest-tree-ntuple-rfield-vector (Failed)
586 - gtest-tree-ntuple-ntuple-show (Failed)
587 - gtest-tree-ntuple-ntuple-storage (Failed)
588 - gtest-tree-ntuple-ntuple-extended (Failed)
590 - gtest-tree-ntuple-ntuple-largefile2 (Failed)
592 - gtest-tree-ntuple-ntuple-parallel-writer (Failed)
594 - gtest-tree-ntuple-rfield-streamer (Failed)
595 - gtest-tree-ntuple-ntuple-storage-daos (Failed)
598 - gtest-tree-ntupleutil-v7-ntuple-importer (Failed)
737 - tutorial-hist-hist102_TH2_contour_list (Timeout)  tutorial
1266 - tutorial-analysis-dataframe-df006_ranges-py (Failed) tutorial
1290 - tutorial-hist-hist007_TH1_liveupdate-py (Failed)  tutorial
1317 - tutorial-math-exampleFunction-py (Failed)         tutorial
1318 - tutorial-math-fit-NumericalMinimization-py (Failed) python_runtime_deps tutorial
1319 - tutorial-math-fit-combinedFit-py (Failed)         python_runtime_deps tutorial
1475 - tutorial-visualisation-rcanvas-rbox-py (Failed)   tutorial

See the logs in the link above for details.

ellert avatar Jun 19 '25 09:06 ellert

I'm looking into them with a VM. It's a slow process so it will take a few days before I have results.

jblomer avatar Jun 27 '25 06:06 jblomer

I created a debug build and the only failing tests I see are the ones connected to quantized floats (@silverweed something we need to look into).

I'll create a release build with debug info to see if the situation changes.

jblomer avatar Jun 30 '25 14:06 jblomer

I get the same result with a RelWithDebInfo build. Trying again with a Release build.

jblomer avatar Jul 14 '25 09:07 jblomer

Maybe related? https://github.com/root-project/root/pull/19411 and https://github.com/root-project/root/pull/19420

Also: https://github.com/root-project/root/issues/14512 https://github.com/root-project/root/issues/12431 https://github.com/root-project/root/issues/12429

ferdymercury avatar Jul 22 '25 09:07 ferdymercury

@ellert I do see issues with quantized floats -- but these are the only issues I can reproduce. Neither debug nor release builds show the avalanche of issues from the logs. This is with a standard QEMU s390x VM. I'm afraid I can't do much about it unless I can get access to an environment where the error is present.
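
For readers unfamiliar with the term, a quantized float is typically stored as a small integer that indexes evenly spaced levels within a known value range. The following is a minimal sketch of that general technique, with hypothetical function names and rounding choices; it is not ROOT's implementation and does not attempt to reproduce the failure discussed here.

```python
import struct


def quantize(value: float, lo: float, hi: float, nbits: int) -> int:
    """Map `value` in [lo, hi] to an integer in [0, 2**nbits - 1]."""
    if not lo <= value <= hi:
        raise ValueError("value outside the quantization range")
    levels = (1 << nbits) - 1
    return round((value - lo) / (hi - lo) * levels)


def dequantize(index: int, lo: float, hi: float, nbits: int) -> float:
    """Map a quantized integer back to an approximate float."""
    levels = (1 << nbits) - 1
    return lo + index / levels * (hi - lo)


lo, hi, nbits = -1.0, 1.0, 12
original = 0.3
index = quantize(original, lo, hi, nbits)
restored = dequantize(index, lo, hi, nbits)

# The round-trip error is bounded by half a quantization step.
half_step = (hi - lo) / ((1 << nbits) - 1) / 2

# The integer must be written with a fixed byte order to be portable.
le_bytes = struct.pack("<H", index)
be_bytes = struct.pack(">H", index)
```

Because the quantized value is an integer of non-standard width, any bit packing or byte swapping applied to it is a plausible place for a big-endian-only bug, although the thread does not confirm where the actual problem lies.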

jblomer avatar Aug 05 '25 20:08 jblomer

Waiting for input. Do not hesitate to re-open.

dpiparo avatar Oct 26 '25 09:10 dpiparo