DAOS-9218 test: data-mover erasure code support
Quick-Functional: true
Test-tag: iosysadmin
Test-repeat: 20
Signed-off-by: Cedric Koch-Hofer [email protected]
Bug-tracker data: Ticket title is 'serialization/deserialization not working with data protection' Status is 'In Review' Labels: 'tds,triaged' Job should run at elevated priority (3) https://daosio.atlassian.net/browse/DAOS-9218
Test stage Functional Hardware Large completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-10008/2/testReport/(root)/
Test stage Functional Hardware Large completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-10008/3/testReport/(root)/
Test stage Functional Hardware Large completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-10008/8/testReport/(root)/
Test stage Functional Hardware Large completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-10008/9/testReport/(root)/
Test stage Functional Hardware Large completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-10008/11/testReport/(root)/
Test stage Functional Hardware Large completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-10008/12/testReport/(root)/
Test stage Functional Hardware Large completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-10008/13/testReport/(root)/
Test stage Functional Hardware Large completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10008/17/execution/node/1139/log
Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10008/17/execution/node/1084/log
Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10008/18/execution/node/1083/log
Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10008/19/execution/node/1083/log
Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10008/20/execution/node/1130/log
The CI error should not be related to this PR: the same error as the one reported in ticket https://daosio.atlassian.net/browse/DAOS-11943 also occurs in the latest CI build.
@liuxuezhao, @wangdi1 or @jolivier23, could you please check whether the C code looks OK to you? Only this part of the code still needs review before I can request landing of this prehistoric ticket ;)
This PR did not run https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-10008/20/artifact/Functional%20Hardware%20Large/deployment/io_sys_admin.py/results.html due to DAOS-9218.
The most recent commit message needs to start with DAOS-9218 <some text>. It looks like this was missing in the last commit message. Please merge with master and include DAOS-9218 in the commit message along with the required tags.
As suggested by @phender, I have merged with master and pushed to CI with a valid commit message. I also took the opportunity to add a small patch that allows the daos-launch.sh script to be used with git bisect.
Test stage checkpatch completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10008/22/execution/node/145/log
Test stage NLT on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10008/22/execution/node/638/log
Test stage NLT on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10008/23/execution/node/246/log
Test stage Build RPM on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10008/24/execution/node/333/log
Test stage Build RPM on Leap 15 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10008/24/execution/node/307/log
Test stage Build DEB on Ubuntu 20.04 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10008/24/execution/node/330/log
Test stage Build on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10008/24/execution/node/438/log
Test stage Build on Leap 15 with Intel-C and TARGET_PREFIX completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10008/24/execution/node/477/log
Test stage NLT on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10008/25/execution/node/638/log
I assume you know how to read the failures from this. https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-10008/25/valgrindResult/pid=34159,0x63/
There's a segfault from a read at address 0 (a NULL pointer) on this line: if (shard_arg->la_recxs[i].rx_idx & PARITY_INDICATOR)
I have to admit that this is my first time working with an NLT failure. I have been trying to run the tests locally without much success so far :'( However, with your pointers, I should be able to fix the issue easily :) Thanks a lot :)
In the end, I was able to run the NLT tests locally, reproduce the NULL pointer read, and fix it. Thanks again @ashleypittman for your help 😄
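For readers following the thread, here is a minimal sketch of the kind of guard that avoids a NULL read on the flagged line. It is not the actual patch from this PR: the struct layouts, the la_nr field, the count_parity_recxs helper, and the value of PARITY_INDICATOR are all assumptions made for illustration. Only the expression shard_arg->la_recxs[i].rx_idx & PARITY_INDICATOR comes from the report above.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed value: the top bit of the record index marks a parity extent. */
#define PARITY_INDICATOR (1ULL << 63)

/* Hypothetical stand-ins for the real DAOS types. */
struct recx {
	uint64_t rx_idx; /* start index of the extent */
	uint64_t rx_nr;  /* number of records in the extent */
};

struct shard_list_args {
	struct recx  *la_recxs; /* may be NULL when no extents were returned */
	unsigned int  la_nr;    /* hypothetical: number of entries in la_recxs */
};

/* Count parity extents, tolerating a NULL argument or a NULL extent array. */
unsigned int
count_parity_recxs(const struct shard_list_args *shard_arg)
{
	unsigned int i;
	unsigned int count = 0;

	/* Guard before indexing: this is the check the crashing line lacked. */
	if (shard_arg == NULL || shard_arg->la_recxs == NULL)
		return 0;

	for (i = 0; i < shard_arg->la_nr; i++) {
		if (shard_arg->la_recxs[i].rx_idx & PARITY_INDICATOR)
			count++;
	}
	return count;
}
```

Whether returning early is the right behavior depends on why la_recxs was NULL in the failing NLT run; the sketch only shows that the pointer is checked before it is indexed.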
This is largely about knowing how/where Jenkins reports valgrind failures but NLT is the number 1 reporter of these. When there's a crash you typically see a large number of failures reported and it can be hard to know which ones to focus on.