DAOS-11624 test: Update server storage yaml configuration

Open phender opened this issue 2 years ago • 12 comments

In preparation for metadata on SSDs update the existing functional tests from using the legacy storage configuration specification to using the storage tier specification.
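As a rough illustration of the kind of change described above, the sketch below converts a legacy, flat engine storage mapping into a list of storage tiers. The key names (`scm_class`, `scm_mount`, `scm_size`, `bdev_class`, `bdev_list`, and the per-tier `class`) are assumed from the DAOS server configuration format, and the helper `legacy_to_tiers` and the sample values are hypothetical; none of this is code taken from this PR.

```python
def legacy_to_tiers(engine):
    """Convert a legacy flat storage mapping into a list of storage tiers.

    Assumption: legacy keys are prefixed with 'scm_' or 'bdev_' and the
    '<prefix>class' key names the tier class. This helper is hypothetical.
    """
    tiers = []
    for prefix in ("scm_", "bdev_"):
        tier_class = engine.get(prefix + "class")
        if tier_class is None:
            continue  # this tier is not configured in the legacy mapping
        tier = {"class": tier_class}
        # Carry over the remaining keys of this tier unchanged.
        for key, value in engine.items():
            if key.startswith(prefix) and key != prefix + "class":
                tier[key] = value
        tiers.append(tier)
    return tiers


# Hypothetical legacy engine settings, for illustration only.
legacy = {
    "scm_class": "ram",
    "scm_mount": "/mnt/daos",
    "scm_size": 16,
    "bdev_class": "nvme",
    "bdev_list": ["0000:81:00.0"],
}
tiers = legacy_to_tiers(legacy)
```

Here `tiers` holds one mapping per tier, in SCM-then-NVMe order, which is the shape a `storage:` list in the test yaml would take under these assumptions.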

Skip-unit-tests: true

Test-tag: aggregatebasic aggregationchecksum test_dfusespacecheck aggregateiosmall aggregate_single_pool

Required-githooks: true

Signed-off-by: Phil Henderson [email protected]

Before requesting gatekeeper:

  • [ ] Two review approvals and any prior change requests have been resolved.
  • [ ] Testing is complete and all tests passed or there is a reason documented in the PR why it should be force landed and forced-landing tag is set.
  • [ ] Commit messages follow the guidelines outlined here.
  • [ ] Any tests skipped by the ticket being addressed have been run and passed in the PR.

Gatekeeper:

  • [ ] You are the appropriate gatekeeper to be landing the patch.
  • [ ] The PR has 2 reviews by people familiar with the code, including appropriate watchers.
  • [ ] Githooks were used. If not, request that user install them and check copyright dates.
  • [ ] Checkpatch issues are resolved. Pay particular attention to ones that will show up on future PRs.
  • [ ] All builds have passed. Check non-required builds for any new compiler warnings.
  • [ ] Sufficient testing is done. Check feature pragmas and test tags and that tests skipped for the ticket are run and now pass with the changes.
  • [ ] If applicable, the PR has addressed any potential version compatibility issues.
  • [ ] Check the target branch. If it is master branch, should the PR go to a feature branch? If it is a release branch, does it have merge approval in the JIRA ticket?
  • [ ] Extra checks if forced landing is requested
    • [ ] Review comments are sufficiently resolved, particularly by prior reviewers that requested changes.
    • [ ] No new NLT or valgrind warnings. Check the classic view.
    • [ ] Quick-build or Quick-functional is not used.
  • [ ] Fix the commit message upon landing. Check the standard here. Edit it to create a single commit. If necessary, ask submitter for a new summary.

phender avatar Oct 12 '22 23:10 phender

Bug-tracker data:
Ticket title is 'Adjust test infrastructure to support md_on_ssd yaml file changes'
Status is 'In Progress'
Labels: 'md_on_ssd'
Job should run at elevated priority (3)
https://daosio.atlassian.net/browse/DAOS-11624

github-actions[bot] avatar Oct 12 '22 23:10 github-actions[bot]

Test stage Functional Hardware Small completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10555/1/execution/node/725/log

daosbuild1 avatar Oct 13 '22 01:10 daosbuild1

Test stage Functional Hardware Small completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10555/2/execution/node/725/log

daosbuild1 avatar Oct 13 '22 05:10 daosbuild1

Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10555/2/execution/node/815/log

daosbuild1 avatar Oct 13 '22 06:10 daosbuild1

Test stage Functional Hardware Large completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10555/2/execution/node/904/log

daosbuild1 avatar Oct 13 '22 09:10 daosbuild1

Test stage Functional Hardware Small completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10555/4/execution/node/725/log

daosbuild1 avatar Oct 13 '22 13:10 daosbuild1

Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10555/4/execution/node/815/log

daosbuild1 avatar Oct 13 '22 16:10 daosbuild1

Test stage Functional Hardware Large completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10555/4/execution/node/904/log

daosbuild1 avatar Oct 13 '22 18:10 daosbuild1

Test stage Functional Hardware Small completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10555/5/execution/node/724/log

daosbuild1 avatar Oct 13 '22 22:10 daosbuild1

Test stage Functional on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10555/8/execution/node/860/log

daosbuild1 avatar Oct 14 '22 12:10 daosbuild1

Test stage Functional Hardware Small completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10555/8/execution/node/1002/log

daosbuild1 avatar Oct 14 '22 17:10 daosbuild1

Issues seen in https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-10555/8:

  • container/auto_oc_selection.py: https://daosio.atlassian.net/browse/DAOS-11907
  • datamover/dst_create.py: https://daosio.atlassian.net/browse/DAOS-11909
  • dfuse/daos_build.py: https://daosio.atlassian.net/browse/DAOS-11441 / https://daosio.atlassian.net/browse/DAOS-11376
  • harness/advanced.py: missing required server storage specification in test yaml
  • pool/svc.py: https://daosio.atlassian.net/browse/DAOS-11724
  • rebuild/cascading_failures.py: No objects written to rank 3 - possibly a new error; will verify with next weekly results
  • rebuild/container_rf.py: No objects written to rank 3
  • server/daos_server_dump.py: manual tests that are expected to fail
  • control/config_generate_output.py: missing required server storage specification in test yaml
  • control/config_generate_run.py: missing required server storage specification in test yaml
  • control/daos_server_helper.py: missing required scm_mount specification in test yaml
  • daos_vol/bigio.py: missing required engines specification in test yaml
  • dfuse/root_container.py: missing required engines_per_host specification in test yaml
  • harness/timeout.py: manual test that is expected to be interrupted

phender avatar Oct 14 '22 21:10 phender

Test stage Functional on EL 8 completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10555/9/execution/node/860/log

daosbuild1 avatar Oct 15 '22 11:10 daosbuild1

Test stage Functional Hardware Small completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10555/9/execution/node/1002/log

daosbuild1 avatar Oct 16 '22 08:10 daosbuild1

Test stage Functional Hardware Medium completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10555/9/execution/node/1138/log

daosbuild1 avatar Oct 17 '22 03:10 daosbuild1

Test stage Functional Hardware Large completed with status FAILURE. https://build.hpdd.intel.com//job/daos-stack/job/daos/view/change-requests/job/PR-10555/9/execution/node/1091/log

daosbuild1 avatar Oct 17 '22 11:10 daosbuild1

Analysis of https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-10555/9 test failures:

  • Errors from tests that intentionally fail or are not intended to run in CI (behaving as expected):

    • Functional on EL 8 / FTEST_launch.harness-advanced.103-./harness/advanced.py
    • Functional on EL 8 / no_cmocka_xml_file_test.4-./harness/basic.py:HarnessBasicTest.test_no_cmocka_xml
    • Functional Hardware Small / FTEST_launch.harness-advanced.28-./harness/advanced.py
    • Functional Hardware Medium / FTEST_launch.harness-advanced.30-./harness/advanced.py
    • Functional Hardware Large / FTEST_launch.harness-advanced.16-./harness/advanced.py
    • Functional Hardware Large / FTEST_performance.IorHard.2-./performance/ior_hard.py:IorHard.test_performance_ior_hard_dfs_ec_16p2gx
    • Functional Hardware Large / FTEST_performance.IorHard.4-./performance/ior_hard.py:IorHard.test_performance_ior_hard_dfuse_ec_16p2gx
    • Functional Hardware Large / FTEST_performance.MdtestEasy.2-./performance/mdtest_easy.py:MdtestEasy.test_performance_mdtest_easy_dfs_ec_16p2g1
    • Functional Hardware Large / FTEST_performance.MdtestEasy.5-./performance/mdtest_easy.py:MdtestEasy.test_performance_mdtest_easy_dfs_ec_16p2g1_stop
    • Functional Hardware Large / FTEST_performance.MdtestHard.2-./performance/mdtest_hard.py:MdtestHard.test_performance_mdtest_hard_dfs_ec_16p2g1
  • https://daosio.atlassian.net/browse/DAOS-11925

    • Functional on EL 8 / FTEST_launch.datamover-dst_create.12-./datamover/dst_create.py
    • Functional Hardware Large / FTEST_launch.datamover-obj_large_posix.04-./datamover/obj_large_posix.py
  • https://daosio.atlassian.net/browse/DAOS-11907

    • Functional on EL 8 / FTEST_container.AutoOCSelectionTest.1-./container/auto_oc_selection.py:AutoOCSelectionTest.test_oc_selection
  • https://daosio.atlassian.net/browse/DAOS-11909

    • Functional on EL 8 / FTEST_datamover.DmvrDstCreate.1-./datamover/dst_create.py:DmvrDstCreate.test_dm_dst_create_dcp_posix_dfs
    • Functional on EL 8 / FTEST_datamover.DmvrDstCreate.2-./datamover/dst_create.py:DmvrDstCreate.test_dm_dst_create_dcp_posix_daos
    • Functional Hardware Large / FTEST_datamover.DmvrObjLargePosix.1-./datamover/obj_large_posix.py:DmvrObjLargePosix.test_dm_obj_large_posix_dcp
  • https://daosio.atlassian.net/browse/DAOS-11928

    • Functional on EL 8 / FTEST_rebuild.RbldContRfTest.3-./rebuild/container_rf.py:RbldContRfTest.test_rebuild_with_container_rf
    • Functional on EL 8 / FTEST_rebuild.RbldContRfTest.4-./rebuild/container_rf.py:RbldContRfTest.test_rebuild_with_container_rf
  • https://daosio.atlassian.net/browse/DAOS-11446

    • Functional Hardware Large / FTEST_erasurecode.EcodFioRebuild.07-./erasurecode/rebuild_fio.py:EcodFioRebuild.test_ec_online_rebuild_fio
  • https://daosio.atlassian.net/browse/DAOS-11651

    • Functional Hardware Large / FTEST_erasurecode.EcodFioRebuild.11-./erasurecode/rebuild_fio.py:EcodFioRebuild.test_ec_online_rebuild_fio
  • https://daosio.atlassian.net/browse/DAOS-11780

    • Functional Hardware Large / FTEST_ior.EcodIorHardRebuild.2-./ior/hard_rebuild.py:EcodIorHardRebuild.test_ec_ior_hard_online_rebuild
  • https://daosio.atlassian.net/browse/DAOS-11927

    • Functional Hardware Large / FTEST_nvme.NvmePoolExclude.1-./nvme/pool_exclude.py:NvmePoolExclude.test_nvme_pool_excluded
    • Functional Hardware Large / FTEST_pool.PoolRedunFacProperty.1-./pool/rf.py:PoolRedunFacProperty.test_rf_pool_property
  • https://daosio.atlassian.net/browse/DAOS-10769

    • Functional Hardware Large / FTEST_rebuild.RbldWidelyStriped.1-./rebuild/widely_striped.py:RbldWidelyStriped.test_rebuild_widely_striped

The following errors need to be fixed in this PR:

  • Functional Hardware Small / FTEST_control.ConfigGenerateRun.1-./control/config_generate_run.py:ConfigGenerateRun.test_config_generate_run
  • Functional Hardware Small / FTEST_control.ConfigGenerateRun.2-./control/config_generate_run.py:ConfigGenerateRun.test_config_generate_run
  • Functional Hardware Small / FTEST_control.ConfigGenerateRun.3-./control/config_generate_run.py:ConfigGenerateRun.test_config_generate_run
  • Functional Hardware Small / FTEST_control.ConfigGenerateRun.4-./control/config_generate_run.py:ConfigGenerateRun.test_config_generate_run
  • Functional Hardware Small / FTEST_control.ConfigGenerateRun.5-./control/config_generate_run.py:ConfigGenerateRun.test_config_generate_run
  • Functional Hardware Small / FTEST_control.ConfigGenerateRun.6-./control/config_generate_run.py:ConfigGenerateRun.test_config_generate_run
  • Functional Hardware Small / FTEST_control.ConfigGenerateRun.7-./control/config_generate_run.py:ConfigGenerateRun.test_config_generate_run
  • Functional Hardware Large / FTEST_deployment.AgentFailure.1-./deployment/agent_failure.py:AgentFailure.test_agent_failure
  • Functional Hardware Large / FTEST_deployment.AgentFailure.2-./deployment/agent_failure.py:AgentFailure.test_agent_failure_isolation
  • Functional Hardware Large / FTEST_pool.PoolCreateAllHwTests.01-./pool/create_all_hw.py:PoolCreateAllHwTests.test_one_pool
  • Functional Hardware Large / FTEST_pool.PoolCreateAllHwTests.02-./pool/create_all_hw.py:PoolCreateAllHwTests.test_one_pool
  • Functional Hardware Large / FTEST_pool.PoolCreateAllHwTests.03-./pool/create_all_hw.py:PoolCreateAllHwTests.test_one_pool
  • Functional Hardware Large / FTEST_pool.PoolCreateAllHwTests.04-./pool/create_all_hw.py:PoolCreateAllHwTests.test_one_pool
  • Functional Hardware Large / FTEST_pool.PoolCreateAllHwTests.05-./pool/create_all_hw.py:PoolCreateAllHwTests.test_recycle_pools
  • Functional Hardware Large / FTEST_pool.PoolCreateAllHwTests.06-./pool/create_all_hw.py:PoolCreateAllHwTests.test_recycle_pools
  • Functional Hardware Large / FTEST_pool.PoolCreateAllHwTests.07-./pool/create_all_hw.py:PoolCreateAllHwTests.test_recycle_pools
  • Functional Hardware Large / FTEST_pool.PoolCreateAllHwTests.08-./pool/create_all_hw.py:PoolCreateAllHwTests.test_recycle_pools
  • Functional Hardware Large / FTEST_pool.PoolCreateAllHwTests.09-./pool/create_all_hw.py:PoolCreateAllHwTests.test_two_pools
  • Functional Hardware Large / FTEST_pool.PoolCreateAllHwTests.10-./pool/create_all_hw.py:PoolCreateAllHwTests.test_two_pools
  • Functional Hardware Large / FTEST_pool.PoolCreateAllHwTests.11-./pool/create_all_hw.py:PoolCreateAllHwTests.test_two_pools
  • Functional Hardware Large / FTEST_pool.PoolCreateAllHwTests.12-./pool/create_all_hw.py:PoolCreateAllHwTests.test_two_pools
  • Functional Hardware Large / FTEST_performance.IorEasy.01-./performance/ior_easy.py:IorEasy.test_performance_ior_easy_dfs_sx
  • Functional Hardware Large / FTEST_performance.IorEasy.02-./performance/ior_easy.py:IorEasy.test_performance_ior_easy_dfs_ec_16p2gx
  • Functional Hardware Large / FTEST_performance.IorEasy.03-./performance/ior_easy.py:IorEasy.test_performance_ior_easy_dfuse_sx
  • Functional Hardware Large / FTEST_performance.IorEasy.04-./performance/ior_easy.py:IorEasy.test_performance_ior_easy_dfuse_ec_16p2gx
  • Functional Hardware Large / FTEST_performance.IorEasy.05-./performance/ior_easy.py:IorEasy.test_performance_ior_easy_dfs_ec_4p2gx_stop_write
  • Functional Hardware Large / FTEST_performance.IorEasy.06-./performance/ior_easy.py:IorEasy.test_performance_ior_easy_dfs_ec_4p2gx_stop_read
  • Functional Hardware Large / FTEST_performance.IorEasy.07-./performance/ior_easy.py:IorEasy.test_performance_ior_easy_dfs_ec_16p2gx_stop_write
  • Functional Hardware Large / FTEST_performance.IorEasy.08-./performance/ior_easy.py:IorEasy.test_performance_ior_easy_dfs_ec_16p2gx_stop_read
  • Functional Hardware Large / FTEST_performance.IorEasy.09-./performance/ior_easy.py:IorEasy.test_performance_ior_easy_hdf5_sx
  • Functional Hardware Large / FTEST_performance.IorEasy.10-./performance/ior_easy.py:IorEasy.test_performance_ior_easy_mpiio_sx
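To make lists like the ones above easier to triage, the entries can be split into stage, test file, and test method and then counted per file. The sketch below is a minimal example; the entry layout (`<stage> / <name>-./<path>.py[:<Class>.<method>]`) is inferred only from the entries in this thread, not from a documented format, and `parse_entry` and `count_by_file` are hypothetical helpers.

```python
import re
from collections import Counter

# Layout inferred from the entries in this thread (an assumption):
#   <stage> / <name>-./<path>.py[:<Class>.<method>]
ENTRY = re.compile(
    r"^(?P<stage>.+?) / (?P<name>\S+?)-\./(?P<path>[^:\s]+\.py)(?::(?P<test>\S+))?$"
)


def parse_entry(line):
    """Return the fields of one failure entry, or None if it does not match."""
    match = ENTRY.match(line.strip())
    return match.groupdict() if match else None


def count_by_file(lines):
    """Count failure entries per (stage, test file) pair."""
    counts = Counter()
    for line in lines:
        entry = parse_entry(line)
        if entry:
            counts[(entry["stage"], entry["path"])] += 1
    return counts


sample = [
    "Functional Hardware Small / FTEST_control.ConfigGenerateRun.1-./control/config_generate_run.py:ConfigGenerateRun.test_config_generate_run",
    "Functional Hardware Small / FTEST_control.ConfigGenerateRun.2-./control/config_generate_run.py:ConfigGenerateRun.test_config_generate_run",
    "Functional Hardware Large / FTEST_deployment.AgentFailure.1-./deployment/agent_failure.py:AgentFailure.test_agent_failure",
]
counts = count_by_file(sample)
```

Launch-level entries such as `FTEST_launch.harness-advanced.103-./harness/advanced.py` have no method suffix, so the `test` field is left as `None` for them.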

phender avatar Oct 18 '22 04:10 phender

Test stage Functional Hardware Large completed with status UNSTABLE. https://build.hpdd.intel.com/job/daos-stack/job/daos//view/change-requests/job/PR-10555/10/testReport/(root)/

daosbuild1 avatar Oct 18 '22 09:10 daosbuild1

Testing of updated files from build 9 passed in https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-10555/10/. Build 10 did have the following expected errors:

  • Errors from tests that fail due to requiring more resources than supported by CI (18 engines):
    • Functional Hardware Large / FTEST_performance.IorEasy.02-./performance/ior_easy.py:IorEasy.test_performance_ior_easy_dfs_ec_16p2gx
    • Functional Hardware Large / FTEST_performance.IorEasy.04-./performance/ior_easy.py:IorEasy.test_performance_ior_easy_dfuse_ec_16p2gx
    • Functional Hardware Large / FTEST_performance.IorEasy.07-./performance/ior_easy.py:IorEasy.test_performance_ior_easy_dfs_ec_16p2gx_stop_write
    • Functional Hardware Large / FTEST_performance.IorEasy.08-./performance/ior_easy.py:IorEasy.test_performance_ior_easy_dfs_ec_16p2gx_stop_read

phender avatar Oct 18 '22 13:10 phender

The CodeSpell error is for a file not modified in this PR:

Error: ./src/vea/tests/vea_stress.c:925: reserv ==> reserve

phender avatar Oct 18 '22 13:10 phender

Summary of testing:

  • https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-10555/9
    • ran all tests
    • of the 58 failures, all but 31 were known issues (see previous comments for details)
  • https://build.hpdd.intel.com/job/daos-stack/job/daos/job/PR-10555/10/
    • ran the 31 failed tests from build 9
    • all but 4 passed
    • the 4 failures are expected as they require more resources than are provided by CI (see previous comments for details)
  • the only other failure is a CodeSpell error for a file not modified by this PR
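The numbers in this summary can be cross-checked against the earlier list of errors to fix. The short sketch below does that arithmetic; the per-file counts are tallied by hand from that list, and the variable names are only illustrative.

```python
# Entries per test file in the "need to be fixed in this PR" list from build 9.
to_fix_by_file = {
    "control/config_generate_run.py": 7,
    "deployment/agent_failure.py": 2,
    "pool/create_all_hw.py": 12,
    "performance/ior_easy.py": 10,
}

build9_failures = 58
build9_to_fix = sum(to_fix_by_file.values())        # tests rerun in build 10
build9_known_issues = build9_failures - build9_to_fix

build10_failed = 4                                  # need more resources than CI provides
build10_passed = build9_to_fix - build10_failed
```

Under these counts, `build9_to_fix` matches the 31 tests rerun in build 10, leaving 27 known issues in build 9 and 27 passing reruns in build 10.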

phender avatar Oct 18 '22 13:10 phender

One question, should this be on the branch?

jolivier23 avatar Oct 19 '22 00:10 jolivier23

fine to land on master since it just moves to the new-style yaml file that was added in 2.0.

johannlombardi avatar Oct 19 '22 08:10 johannlombardi