Improvements to .molecule tests (Jenkins)

Open chrisjsewell opened this issue 3 years ago • 6 comments

A few things noted to improve at a later time, after merging #4565:

  1. remove create_docker.yml (https://github.com/aiidateam/aiida-core/pull/4565#discussion_r572159376)

Well, the key thing is that we need to be able to set the Docker context (see context: "../.." above in .molecule/default/config_local.yml), i.e. setting docker.image.path. I've re-opened this in https://github.com/ansible-community/molecule-docker/pull/37, without the bug fixes (which maybe were not correct), to hopefully make it easier to accept. I imagine it will take a while to make its way into a release though, so we might have to open an issue to remove it at a later date?

  2. dynamic python version selection (once aiida-prerequisites has an environment variable with the python version, https://github.com/aiidateam/aiida-core/pull/4565#discussion_r572052668)

  3. Fix pip cache (folder permissions, https://github.com/aiidateam/aiida-core/pull/4565#issuecomment-775146939)

chrisjsewell avatar Feb 08 '21 16:02 chrisjsewell

~~Also for Jenkins it might be good to run reentry scan once in the preparation step (see https://github.com/aiidateam/aiida-core/pull/4719#issuecomment-775861310)~~ (fixed with https://github.com/aiidateam/aiida-core/commit/3ad071244332cc78e4be597a05ad5703d17787bd)

chrisjsewell avatar Feb 09 '21 11:02 chrisjsewell

Thanks Chris. Also, I get a new error here:

    TASK [run polish workchains] ***************************************************
    Tuesday 09 February 2021  11:53:41 +0000 (0:00:00.054)       0:00:25.080 ******
    fatal: [molecule-aiida-django-d44d5143-8757-499e-bee6-a4952d2d9314]: FAILED! => changed=true
      cmd: |-
        set -e
        declare -a EXPRESSIONS=('1 -2 -1 4 -5 -5 * * * * +' '2 1 3 3 -1 + ^ ^ +' '3 -5 -1 -4 + * ^' '2 4 2 -4 * * +' '3 1 1 5 ^ ^ ^')
        for expression in "${EXPRESSIONS[@]}"; do
          /opt/conda/bin/verdi -p django run --auto-group -l polish -- "${HOME}/django/polish/cli.py" -X add! -C -F -d -t 600 "$expression"
        done
      delta: '0:12:15.919423'
      end: '2021-02-09 12:05:58.306043'
      msg: non-zero return code
      rc: 1
      start: '2021-02-09 11:53:42.386620'
      stderr: ''
      stderr_lines: <omitted>
      stdout: |-
        Expression: 1 -2 -1 4 -5 -5 * * * * +
        Evaluated : 201
        Workchain : uuid: 0df81fa3-34c3-4fc8-95c1-d8b2d66a9f10 (pk: 131) value: 201 <4>
        Success: the workchain accurately reproduced the evaluated value in 270.50s
        Expression: 2 1 3 3 -1 + ^ ^ +
        Evaluated : 10
        Workchain : uuid: 8bb1ec7f-24c5-4932-ae83-337d837fa471 (pk: 182) value: 10 <134>

    PLAY RECAP *********************************************************************
        Success: the workchain accurately reproduced the evaluated value in 115.19s
        Expression: 3 -5 -1 -4 + * ^
        Evaluated : 15625
        Workchain : uuid: e0393f2f-db68-4fe9-96bf-948ab0fc1820 (pk: 285) value: 15625 <185>
        Success: the workchain accurately reproduced the evaluated value in 138.40s
        Expression: 2 4 2 -4 * * +
        Evaluated : 999970
        Workchain : uuid: 8b95f070-b36f-405b-a83b-d2d29b45001c (pk: 319) value: 999970 <288>
        Success: the workchain accurately reproduced the evaluated value in 85.14s
        Expression: 3 1 1 5 ^ ^ ^
        Failed: the workchain<322> did not return a result output node
      stdout_lines: <omitted>
    molecule-aiida-django-d44d5143-8757-499e-bee6-a4952d2d9314 : ok=6    changed=4    unreachable=0    failed=1    skipped=1    rescued=0    ignored=1

    Playbook run took 0 days, 0 hours, 12 minutes, 41 seconds
    Tuesday 09 February 2021  12:05:58 +0000 (0:12:16.398)       0:12:41.479 ******
    ===============================================================================
    run polish workchains ------------------------------------------------- 736.40s
    Reset pythonpath of daemon (2 workers) ---------------------------------- 9.43s
    Copy workchain files ---------------------------------------------------- 6.19s
    verdi add code setup ---------------------------------------------------- 4.70s
    Check if add code is already present ------------------------------------ 4.12s
    get python path including workchains ------------------------------------ 0.45s
    set_fact ---------------------------------------------------------------- 0.10s
    include_tasks ----------------------------------------------------------- 0.05s
    ERROR:
    An error occurred during the test sequence action: 'verify'. Cleaning up.

Do you know why?
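
For anyone reading the log: the four "Evaluated" values above are consistent with folding the operands from the right (each operator is applied to the running result and the next operand popped from the end) and reducing the final value modulo 1000000. The sketch below is inferred from the logged numbers only; it is not taken from the polish source, and the function name `evaluate` and the default modulo are assumptions made for illustration.

```python
import operator
from typing import Optional

# Operators that appear in the logged expressions.
OPERATORS = {"+": operator.add, "*": operator.mul, "^": operator.pow}


def evaluate(expression: str, modulo: Optional[int] = 1_000_000) -> int:
    """Evaluate an expression written as '<operands...> <operators...>'.

    Assumption (inferred from the log, not from the polish code): operands
    are consumed from the right, and the final result is reduced by `modulo`.
    """
    tokens = expression.split()
    operands = [int(token) for token in tokens if token not in OPERATORS]
    symbols = [token for token in tokens if token in OPERATORS]
    if len(operands) != len(symbols) + 1:
        raise ValueError("expected exactly one more operand than operators")

    result = operands.pop()
    for symbol in symbols:
        # Running result on the left, next right-most operand on the right.
        result = OPERATORS[symbol](result, operands.pop())
    return result % modulo if modulo is not None else result


if __name__ == "__main__":
    for expr in ("1 -2 -1 4 -5 -5 * * * * +", "2 4 2 -4 * * +"):
        print(expr, "->", evaluate(expr))
```

Under this reading, `2 4 2 -4 * * +` is `-4 * 2 * 4 + 2 = -30`, which becomes 999970 after the modulo, matching the log.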

giovannipizzi avatar Feb 09 '21 12:02 giovannipizzi

No, that's why I've merged #4729; it seems to happen sometimes when Jenkins is under heavy load.

chrisjsewell avatar Feb 09 '21 12:02 chrisjsewell

There actually seem to be multiple reasons:

https://theossrv6.epfl.ch/jenkins/blue/organizations/jenkins/aiida_core_aiidateam/detail/PR-4729/1/pipeline

{'version': {'core': '1.5.2'}, 'exception': 'Traceback (most recent call last):\n  File "/opt/conda/lib/python3.7/site-packages/aiida/orm/utils/managers.py", line 83, in __getattr__\n    return self._get_node_by_link_label(label=name)\n  File "/opt/conda/lib/python3.7/site-packages/aiida/orm/utils/managers.py", line 64, in _get_node_by_link_label\n    return self._node.get_outgoing(link_type=self._link_type).get_node_by_label(label)\n  File "/opt/conda/lib/python3.7/site-packages/aiida/orm/utils/links.py", line 300, in get_node_by_label\n    raise exceptions.NotExistent(f\'no neighbor with the label {label} found\')\naiida.common.exceptions.NotExistent: no neighbor with the label result found\n\nDuring handling of the above exception, another exception occurred:\n\naiida.common.exceptions.NotExistentAttributeError: Node<143> does not have an output with link label \'result\'\n', 'checkpoints': '!plumpy:bundle\n\'!!meta\':\n  class_name: polish_workchains.polish_f7863ed90d43505883874975e9377e76:Polish00WorkChain\n  types:\n    _future: S\n  user:\n    object_loader: aiida.engine.persistence:ObjectLoader\nCONTEXT: !aiida_attributedict\n  calculations:\n  - !aiida_node \'b563bcc2-bbdf-4ab0-b423-925296b105ca\'\n  iterators: []\n  iterators_sign: []\n  iterators_stack: []\n  operands:\n  - 3\n  - 3\n  - -1\n  result: !aiida_node \'3b945e28-368d-4564-ab19-c70ce2f5fc57\'\n  workchains:\n  - !aiida_node \'155a41b0-7501-45f5-bcea-2f76d21297a9\'\nINPUTS_PARSED: "!plumpy:attributes_frozendict\\ncode: !aiida_node \'7cc094b0-55ef-4b66-b0e4-c233ba2026e1\'\\n\\\n  metadata: !plumpy:attributes_frozendict\\n  call_link_label: CALL\\n  store_provenance:\\\n  \\ true\\nmodulo: !aiida_node \'960b50ab-3288-4897-b67a-6df9bf683fc8\'\\noperands: !aiida_node\\\n  \\ \'4da6ca30-4030-492d-aaf5-c5d724b8f2a6\'\\n"\nINPUTS_RAW: \'!plumpy:attributes_frozendict\n\n  code: !aiida_node \'\'7cc094b0-55ef-4b66-b0e4-c233ba2026e1\'\'\n\n  modulo: !aiida_node \'\'960b50ab-3288-4897-b67a-6df9bf683fc8\'\'\n\n  operands: !aiida_node \'\'4da6ca30-4030-492d-aaf5-c5d724b8f2a6\'\'\n\n  \'\n_awaitables: []\n_creation_time: 1612875513.0140836\n_enable_persistence: true\n_future:\n  \'!!meta\':\n    class_name: plumpy.persistence:SavableFuture\n  _result: null\n  _state: PENDING\n_parent_pid: null\n_paused: null\n_pid: 134\n_pre_paused_status: null\n_state:\n  \'!!meta\':\n    class_name: plumpy.process_states:Running\n  args: !!python/tuple []\n  in_state: true\n  kwargs: {}\n  run_fn: _do_step\n_status: null\ncalc_id: 134\nstepper_state:\n  \'!!meta\':\n    class_name: plumpy.workchains:_BlockStepper\n  _pos: 3\n  stepper_state:\n    \'!!meta\':\n      class_name: plumpy.workchains:_FunctionStepper\n    _fn: post_raise_power\n', 'process_label': 'Polish00WorkChain', 'process_state': 'excepted', 'stepper_state_info': '3:post_raise_power'}

https://theossrv6.epfl.ch/jenkins/blue/organizations/jenkins/aiida_core_aiidateam/detail/develop/996/pipeline

{'sealed': True, 'version': {'core': '1.5.2'}, 'exception': 'concurrent.futures._base.TimeoutError\n', 'process_label': 'Polish00WorkChain', 'process_state': 'excepted', 'process_status': 'Waiting for child processes: 325, 326', 'stepper_state_info': '1:raise_power'}
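
Attribute dumps like the two above are hard to compare by eye. As a rough triage aid, here is a minimal sketch that reduces such a dictionary to the process label, state, step and the last line of the stored traceback. The helper name `summarize_failure` is hypothetical, and it only assumes the keys visible in the dumps (`exception`, `process_label`, `process_state`, `stepper_state_info`).

```python
from typing import Any, Dict


def summarize_failure(attributes: Dict[str, Any]) -> Dict[str, str]:
    """Reduce a node attribute dump to the fields needed for triage."""
    traceback_text = attributes.get("exception", "")
    # The last non-empty line of a traceback names the exception raised.
    lines = [line.strip() for line in traceback_text.splitlines() if line.strip()]
    return {
        "process_label": attributes.get("process_label", ""),
        "process_state": attributes.get("process_state", ""),
        "step": attributes.get("stepper_state_info", ""),
        "error": lines[-1] if lines else "",
    }


# The second dump from this comment, copied verbatim.
timeout_attributes = {
    "sealed": True,
    "version": {"core": "1.5.2"},
    "exception": "concurrent.futures._base.TimeoutError\n",
    "process_label": "Polish00WorkChain",
    "process_state": "excepted",
    "process_status": "Waiting for child processes: 325, 326",
    "stepper_state_info": "1:raise_power",
}

if __name__ == "__main__":
    print(summarize_failure(timeout_attributes))
```

Applied to both dumps, this makes the difference explicit: one excepts with `NotExistentAttributeError` in `post_raise_power`, the other with a `TimeoutError` in `raise_power`.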

In #4733 I am going to add a retry for the workchain executions, to see if that will mitigate the failures, but obviously in time I/we should look into these more closely.
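
To make the retry idea concrete, here is a minimal, generic sketch of retrying a flaky callable a fixed number of times. It is not the change made in #4733; the helper name `retry` and its parameters are hypothetical and only illustrate the pattern.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def retry(
    func: Callable[[], T],
    attempts: int = 3,
    delay: float = 0.0,
    exceptions: Tuple[Type[BaseException], ...] = (Exception,),
) -> T:
    """Call `func` until it succeeds or `attempts` calls have failed.

    The exception from the final failed attempt is re-raised unchanged,
    so a persistent failure is still visible to the caller.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except exceptions:
            if attempt == attempts:
                raise
            time.sleep(delay)  # give a loaded machine time to recover
    raise AssertionError("unreachable")
```

A retry like this hides intermittent failures rather than explaining them, so it is worth logging each failed attempt if the underlying causes are to be investigated later.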

chrisjsewell avatar Feb 09 '21 14:02 chrisjsewell

@chrisjsewell since Jenkins has been decommissioned, can we close this? Or was this rather molecule-specific? That folder is still present in the source tree on develop. Should that be removed, or is that still being used?

sphuber avatar Mar 13 '22 21:03 sphuber

Pinging @chrisjsewell. Are we still using the tests in .molecule? Are they even up to date?

sphuber avatar Apr 28 '22 22:04 sphuber