Requesting two multicore tasks at the same time ends in failure for both tasks

Open ederenn opened this issue 5 years ago • 0 comments

Description

Golem Version: 0.21.0+dev470.g7817ff3

Golem-Messages version (leave empty if unsure): 3.14.1

Electron version (if used): 0.21.0

OS [e.g. Windows 10 Pro]: Linux 18.04, Windows 10 Pro

Branch (if launched from source): b0.22

Mainnet/Testnet: mainnet

Priority label is set to the lowest by default. To set a higher priority, please change the label:

  - P0 label is set for Severity-Critical/Effort-easy
  - P1 label is set for Severity-Critical/Effort-hard
  - P2 label is set for Severity-Low/Effort-easy
  - P3 label is set for Severity-Low/Effort-hard

Description of the issue:

A gwasm task was requested simultaneously on two requestors. One provider started computations on all of its available cores for both sets of subtasks at the same time. For example, a provider with 8 cores in total accepted 8 subtasks from each of requestor 1 (R1) and requestor 2 (R2). The provider then continued computing 7 subtasks from R1 and one subtask from R2. The offers for all other subtasks were not cancelled; they remained in the Starting state and finished in failure. The provider was banned on both requestors.
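The arithmetic in the example above can be pictured with a toy model. This is only an illustrative sketch and not Golem's actual code: the `Provider` class and its methods `accept_unchecked`, `accept_checked` and `overcommitted` are hypothetical names invented here. The unchecked variant mirrors the reported behaviour (each requestor is offered the full core count), while the checked variant is shown purely as a contrast.

```python
from dataclasses import dataclass, field


@dataclass
class Provider:
    """Toy model of a provider's core accounting (hypothetical, not Golem code)."""
    total_cores: int
    # Maps requestor name -> number of subtasks (cores) promised to it.
    reserved: dict = field(default_factory=dict)

    def accept_unchecked(self, requestor: str, subtasks: int) -> int:
        """Grant up to total_cores subtasks, ignoring what other requestors hold."""
        granted = min(subtasks, self.total_cores)
        self.reserved[requestor] = self.reserved.get(requestor, 0) + granted
        return granted

    def accept_checked(self, requestor: str, subtasks: int) -> int:
        """Grant only as many subtasks as there are cores still free."""
        free = self.total_cores - sum(self.reserved.values())
        granted = max(0, min(subtasks, free))
        if granted:
            self.reserved[requestor] = self.reserved.get(requestor, 0) + granted
        return granted

    def overcommitted(self) -> int:
        """Number of promised subtasks that have no core to run on."""
        return max(0, sum(self.reserved.values()) - self.total_cores)


if __name__ == "__main__":
    p = Provider(total_cores=8)
    p.accept_unchecked("R1", 8)
    p.accept_unchecked("R2", 8)
    print(p.overcommitted())  # 8 subtasks promised without a free core
```

In this model, 16 subtasks are promised against 8 cores, which matches the report: 8 subtasks run (7 from R1, one from R2) and the other 8 can only time out.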

Actual result:

Task has ended in timeout.

Steps To Reproduce

Short description of steps to reproduce the behavior:

  1. Launch a small network of 4 nodes.
  2. On two nodes, request a gflite task at the same time. Task settings: text file of 553 words, 10 subtasks, subtask timeout 2 min, task timeout 10 min.
  3. On both requestors, check the list of subtasks from the CLI.
  4. Compare the lists of providers for both tasks.
  5. Observe that providers commit their maximum number of cores to both requestors at the same time.
  6. After the subtasks time out, the requestors ban the providers.
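Steps 3 and 4 can be done by eye, but a small script makes the comparison easier to repeat. The following is a minimal sketch that assumes the output of `tasks subtasks list` has been pasted into a string in the box-drawn format shown in the logs of this report; the function names `parse_subtasks`, `status_by_node` and `shared_providers` are made up for this example and are not part of the Golem CLI.

```python
from collections import Counter, defaultdict


def parse_subtasks(cli_output: str):
    """Return (node, subtask_id, status) tuples from a box-drawn subtasks table."""
    rows = []
    for raw in cli_output.splitlines():
        line = raw.strip()
        if not line.startswith("│"):
            continue  # skip borders, prompts and blank lines
        cells = [cell.strip() for cell in line.strip("│").split("│")]
        if len(cells) != 4 or cells[0] == "node":
            continue  # skip the header row and malformed lines
        node, subtask_id, status, _progress = cells
        rows.append((node, subtask_id, status))
    return rows


def status_by_node(rows):
    """Count subtask statuses per provider node."""
    summary = defaultdict(Counter)
    for node, _subtask_id, status in rows:
        summary[node][status] += 1
    return summary


def shared_providers(rows_a, rows_b):
    """Nodes that appear in both requestors' subtask lists."""
    return {node for node, _, _ in rows_a} & {node for node, _, _ in rows_b}


if __name__ == "__main__":
    r1_output = """
    │  node     │  subtask id                            │  status     │  progress  │
    │  qbam     │  490751e4-1515-11ea-9789-2f67ff27ad1e  │  Finished   │  100.0 %   │
    │  tempest  │  49152e06-1515-11ea-a051-2f67ff27ad1e  │  Failure    │  0.0 %     │
    """
    r2_output = """
    │  node     │  subtask id                            │  status     │  progress  │
    │  qbam     │  4afd8f8d-1515-11ea-8af0-86728d43f73d  │  Failure    │  0.0 %     │
    │  vvv lin  │  4b0bdc10-1515-11ea-b955-86728d43f73d  │  Finished   │  100.0 %   │
    """
    r1, r2 = parse_subtasks(r1_output), parse_subtasks(r2_output)
    print(shared_providers(r1, r2))
    for node, counts in status_by_node(r2).items():
        print(node, dict(counts))
```

A provider that shows up in `shared_providers` with many `Failure` rows on one requestor and `Finished` rows on the other is a candidate for the over-commitment described in this issue.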

Logs and any additional context

From CLI, R1

┌───────────┬────────────────────────────────────────┬────────────┬────────────┐
│  node     │  subtask id                            │  status    │  progress  │
├───────────┼────────────────────────────────────────┼────────────┼────────────┤
│  qbam     │  490751e4-1515-11ea-9789-2f67ff27ad1e  │  Starting  │  0.0 %     │
│  qbam     │  4907eea4-1515-11ea-9b12-2f67ff27ad1e  │  Starting  │  0.0 %     │
│  qbam     │  4908df40-1515-11ea-ac1c-2f67ff27ad1e  │  Starting  │  0.0 %     │
│  qbam     │  490a5238-1515-11ea-96bd-2f67ff27ad1e  │  Starting  │  0.0 %     │
│  qbam     │  490ac3ac-1515-11ea-a779-2f67ff27ad1e  │  Starting  │  0.0 %     │
│  qbam     │  490bce3e-1515-11ea-a83b-2f67ff27ad1e  │  Starting  │  0.0 %     │
│  qbam     │  490d26ca-1515-11ea-9312-2f67ff27ad1e  │  Starting  │  0.0 %     │
│  qbam     │  490e73be-1515-11ea-8877-2f67ff27ad1e  │  Starting  │  0.0 %     │
│  qbam     │  490fd726-1515-11ea-a3f8-2f67ff27ad1e  │  Starting  │  0.0 %     │
│  qbam     │  49111a90-1515-11ea-a282-2f67ff27ad1e  │  Starting  │  0.0 %     │
│  tempest  │  49129b3e-1515-11ea-bf29-2f67ff27ad1e  │  Starting  │  0.0 %     │
│  tempest  │  4913c54c-1515-11ea-9aaa-2f67ff27ad1e  │  Starting  │  0.0 %     │
│  tempest  │  49152e06-1515-11ea-a051-2f67ff27ad1e  │  Starting  │  0.0 %     │
│  tempest  │  49167026-1515-11ea-be93-2f67ff27ad1e  │  Starting  │  0.0 %     │
│  tempest  │  4917defe-1515-11ea-9830-2f67ff27ad1e  │  Starting  │  0.0 %     │
│  tempest  │  49192a94-1515-11ea-9615-2f67ff27ad1e  │  Starting  │  0.0 %     │
│  tempest  │  491a778a-1515-11ea-bedc-2f67ff27ad1e  │  Starting  │  0.0 %     │
│  tempest  │  491bc724-1515-11ea-8f94-2f67ff27ad1e  │  Starting  │  0.0 %     │
└───────────┴────────────────────────────────────────┴────────────┴────────────┘

>> tasks subtasks list 3e2353a8-1515-11ea-a554-2f67ff27ad1e
┌───────────┬────────────────────────────────────────┬─────────────┬────────────┐
│  node     │  subtask id                            │  status     │  progress  │
├───────────┼────────────────────────────────────────┼─────────────┼────────────┤
│  qbam     │  490751e4-1515-11ea-9789-2f67ff27ad1e  │  Finished   │  100.0 %   │
│  qbam     │  4907eea4-1515-11ea-9b12-2f67ff27ad1e  │  Finished   │  100.0 %   │
│  qbam     │  4908df40-1515-11ea-ac1c-2f67ff27ad1e  │  Verifying  │  0.0 %     │
│  qbam     │  490a5238-1515-11ea-96bd-2f67ff27ad1e  │  Finished   │  100.0 %   │
│  qbam     │  490ac3ac-1515-11ea-a779-2f67ff27ad1e  │  Finished   │  100.0 %   │
│  qbam     │  490bce3e-1515-11ea-a83b-2f67ff27ad1e  │  Finished   │  100.0 %   │
│  qbam     │  490d26ca-1515-11ea-9312-2f67ff27ad1e  │  Finished   │  100.0 %   │
│  qbam     │  490e73be-1515-11ea-8877-2f67ff27ad1e  │  Finished   │  100.0 %   │
│  qbam     │  490fd726-1515-11ea-a3f8-2f67ff27ad1e  │  Finished   │  100.0 %   │
│  qbam     │  49111a90-1515-11ea-a282-2f67ff27ad1e  │  Finished   │  100.0 %   │
│  tempest  │  49129b3e-1515-11ea-bf29-2f67ff27ad1e  │  Finished   │  100.0 %   │
│  tempest  │  4913c54c-1515-11ea-9aaa-2f67ff27ad1e  │  Finished   │  100.0 %   │
│  tempest  │  49152e06-1515-11ea-a051-2f67ff27ad1e  │  Failure    │  0.0 %     │
│  tempest  │  49167026-1515-11ea-be93-2f67ff27ad1e  │  Finished   │  100.0 %   │
│  tempest  │  4917defe-1515-11ea-9830-2f67ff27ad1e  │  Finished   │  100.0 %   │
│  tempest  │  49192a94-1515-11ea-9615-2f67ff27ad1e  │  Finished   │  100.0 %   │
│  tempest  │  491a778a-1515-11ea-bedc-2f67ff27ad1e  │  Finished   │  100.0 %   │
│  tempest  │  491bc724-1515-11ea-8f94-2f67ff27ad1e  │  Finished   │  100.0 %   │
│  tempest  │  59423ce8-1515-11ea-b07d-2f67ff27ad1e  │  Finished   │  100.0 %   │
│  tempest  │  59430892-1515-11ea-a9cd-2f67ff27ad1e  │  Finished   │  100.0 %   │
└───────────┴────────────────────────────────────────┴─────────────┴────────────┘
>> 

R2

>> tasks subtasks list  4100084a-1514-11ea-8af0-86728d43f73d
┌───────────┬────────────────────────────────────────┬────────────┬────────────┐
│  node     │  subtask id                            │  status    │  progress  │
├───────────┼────────────────────────────────────────┼────────────┼────────────┤
│  qbam     │  4e368c42-1514-11ea-8d75-86728d43f73d  │  Finished  │  100.0 %   │
│  qbam     │  4e38ecda-1514-11ea-89ba-86728d43f73d  │  Finished  │  100.0 %   │
│  qbam     │  4e38ecdb-1514-11ea-96bc-86728d43f73d  │  Finished  │  100.0 %   │
│  qbam     │  4e3b4e58-1514-11ea-a231-86728d43f73d  │  Finished  │  100.0 %   │
│  qbam     │  4e3b4e59-1514-11ea-bf56-86728d43f73d  │  Finished  │  100.0 %   │
│  qbam     │  4e3b4e5a-1514-11ea-89ab-86728d43f73d  │  Finished  │  100.0 %   │
│  qbam     │  4e3db040-1514-11ea-a972-86728d43f73d  │  Finished  │  100.0 %   │
│  qbam     │  4e3db041-1514-11ea-9bcf-86728d43f73d  │  Finished  │  100.0 %   │
│  qbam     │  4e40149c-1514-11ea-9867-86728d43f73d  │  Finished  │  100.0 %   │
│  qbam     │  4e4276d8-1514-11ea-b3fd-86728d43f73d  │  Finished  │  100.0 %   │
│  vvv lin  │  4e4276d9-1514-11ea-bcd5-86728d43f73d  │  Finished  │  100.0 %   │
│  vvv lin  │  4e44d94c-1514-11ea-a4c8-86728d43f73d  │  Finished  │  100.0 %   │
│  vvv lin  │  4e44d94d-1514-11ea-9632-86728d43f73d  │  Finished  │  100.0 %   │
│  tempest  │  4e473b92-1514-11ea-a669-86728d43f73d  │  Finished  │  100.0 %   │
│  tempest  │  4e473b93-1514-11ea-af54-86728d43f73d  │  Finished  │  100.0 %   │
│  tempest  │  4e49a8de-1514-11ea-86c2-86728d43f73d  │  Finished  │  100.0 %   │
│  tempest  │  4e49a8df-1514-11ea-b499-86728d43f73d  │  Finished  │  100.0 %   │
│  tempest  │  4e4bff1a-1514-11ea-bd30-86728d43f73d  │  Finished  │  100.0 %   │
│  tempest  │  4e4bff1b-1514-11ea-afe8-86728d43f73d  │  Finished  │  100.0 %   │
│  tempest  │  4e4e68e2-1514-11ea-b037-86728d43f73d  │  Finished  │  100.0 %   │
└───────────┴────────────────────────────────────────┴────────────┴────────────┘

>> tasks subtasks list 3c757476-1515-11ea-8a4a-86728d43f73d
┌───────────┬────────────────────────────────────────┬─────────────┬────────────┐
│  node     │  subtask id                            │  status     │  progress  │
├───────────┼────────────────────────────────────────┼─────────────┼────────────┤
│  qbam     │  4afd8f8c-1515-11ea-b789-86728d43f73d  │  Finished   │  100.0 %   │
│  qbam     │  4afd8f8d-1515-11ea-8af0-86728d43f73d  │  Failure    │  0.0 %     │
│  qbam     │  4afff068-1515-11ea-aa51-86728d43f73d  │  Failure    │  0.0 %     │
│  qbam     │  4afff069-1515-11ea-97b6-86728d43f73d  │  Failure    │  0.0 %     │
│  qbam     │  4b025158-1515-11ea-a155-86728d43f73d  │  Failure    │  0.0 %     │
│  qbam     │  4b04b374-1515-11ea-8cc4-86728d43f73d  │  Failure    │  0.0 %     │
│  qbam     │  4b04b375-1515-11ea-b374-86728d43f73d  │  Failure    │  0.0 %     │
│  qbam     │  4b07162c-1515-11ea-9dbc-86728d43f73d  │  Failure    │  0.0 %     │
│  qbam     │  4b07162d-1515-11ea-9f83-86728d43f73d  │  Failure    │  0.0 %     │
│  qbam     │  4b0979e4-1515-11ea-88b7-86728d43f73d  │  Failure    │  0.0 %     │
│  vvv lin  │  4b0bdc10-1515-11ea-b955-86728d43f73d  │  Finished   │  100.0 %   │
│  vvv lin  │  4b0bdc11-1515-11ea-9787-86728d43f73d  │  Verifying  │  0.0 %     │
│  vvv lin  │  4b0e3e78-1515-11ea-b5c8-86728d43f73d  │  Verifying  │  0.0 %     │
│  tempest  │  4b0e3e79-1515-11ea-987b-86728d43f73d  │  Failure    │  0.0 %     │
│  tempest  │  4b10aa8a-1515-11ea-a69d-86728d43f73d  │  Failure    │  0.0 %     │
│  tempest  │  4b10aa8b-1515-11ea-a931-86728d43f73d  │  Failure    │  0.0 %     │
│  tempest  │  4b13020a-1515-11ea-9c67-86728d43f73d  │  Failure    │  0.0 %     │
│  tempest  │  4b13020b-1515-11ea-87ee-86728d43f73d  │  Failure    │  0.0 %     │
│  tempest  │  4b156fda-1515-11ea-ba3e-86728d43f73d  │  Failure    │  0.0 %     │
│  tempest  │  4b156fdb-1515-11ea-8fc8-86728d43f73d  │  Failure    │  0.0 %     │
│  vvv lin  │  c2c7c7c8-1515-11ea-abcd-86728d43f73d  │  Starting   │  0.0 %     │
│  vvv lin  │  c2c7c7c9-1515-11ea-8f19-86728d43f73d  │  Starting   │  0.0 %     │
│  vvv lin  │  c2ca2ac6-1515-11ea-b0fb-86728d43f73d  │  Starting   │  0.0 %     │
└───────────┴────────────────────────────────────────┴─────────────┴────────────┘
>> tasks subtasks list 3c757476-1515-11ea-8a4a-86728d43f73d
┌───────────┬────────────────────────────────────────┬─────────────┬────────────┐
│  node     │  subtask id                            │  status     │  progress  │
├───────────┼────────────────────────────────────────┼─────────────┼────────────┤
│  qbam     │  4afd8f8c-1515-11ea-b789-86728d43f73d  │  Finished   │  100.0 %   │
│  qbam     │  4afd8f8d-1515-11ea-8af0-86728d43f73d  │  Failure    │  0.0 %     │
│  qbam     │  4afff068-1515-11ea-aa51-86728d43f73d  │  Failure    │  0.0 %     │
│  qbam     │  4afff069-1515-11ea-97b6-86728d43f73d  │  Failure    │  0.0 %     │
│  qbam     │  4b025158-1515-11ea-a155-86728d43f73d  │  Failure    │  0.0 %     │
│  qbam     │  4b04b374-1515-11ea-8cc4-86728d43f73d  │  Failure    │  0.0 %     │
│  qbam     │  4b04b375-1515-11ea-b374-86728d43f73d  │  Failure    │  0.0 %     │
│  qbam     │  4b07162c-1515-11ea-9dbc-86728d43f73d  │  Failure    │  0.0 %     │
│  qbam     │  4b07162d-1515-11ea-9f83-86728d43f73d  │  Failure    │  0.0 %     │
│  qbam     │  4b0979e4-1515-11ea-88b7-86728d43f73d  │  Failure    │  0.0 %     │
│  vvv lin  │  4b0bdc10-1515-11ea-b955-86728d43f73d  │  Finished   │  100.0 %   │
│  vvv lin  │  4b0bdc11-1515-11ea-9787-86728d43f73d  │  Verifying  │  0.0 %     │
│  vvv lin  │  4b0e3e78-1515-11ea-b5c8-86728d43f73d  │  Verifying  │  0.0 %     │
│  tempest  │  4b0e3e79-1515-11ea-987b-86728d43f73d  │  Failure    │  0.0 %     │
│  tempest  │  4b10aa8a-1515-11ea-a69d-86728d43f73d  │  Failure    │  0.0 %     │
│  tempest  │  4b10aa8b-1515-11ea-a931-86728d43f73d  │  Failure    │  0.0 %     │
│  tempest  │  4b13020a-1515-11ea-9c67-86728d43f73d  │  Failure    │  0.0 %     │
│  tempest  │  4b13020b-1515-11ea-87ee-86728d43f73d  │  Failure    │  0.0 %     │
│  tempest  │  4b156fda-1515-11ea-ba3e-86728d43f73d  │  Failure    │  0.0 %     │
│  tempest  │  4b156fdb-1515-11ea-8fc8-86728d43f73d  │  Failure    │  0.0 %     │
│  vvv lin  │  c2c7c7c8-1515-11ea-abcd-86728d43f73d  │  Failure    │  0.0 %     │
│  vvv lin  │  c2c7c7c9-1515-11ea-8f19-86728d43f73d  │  Failure    │  0.0 %     │
│  vvv lin  │  c2ca2ac6-1515-11ea-b0fb-86728d43f73d  │  Failure    │  0.0 %     │
└───────────┴────────────────────────────────────────┴─────────────┴────────────┘
>>

## Proposed Solution?
_(Optional: What could be a solution for that issue)_

ederenn, Dec 02 '19 17:12