
Control block forging through NodeKernel

coot opened this issue 3 years ago • 3 comments

Addresses #3159 on the ouroboros-consensus side.

  • Fixed a typo
  • Added setBlockForging to NodeKernel
  • Updated ouroboros-consensus-test
  • Updated ouroboros-consensus-mock
  • Updated ouroboros-consensus-mock-test
  • Updated ouroboros-consensus-byron
  • Updated ouroboros-consensus-byron-test
  • Updated ouroboros-consensus-shelley
  • Updated ouroboros-consensus-shelley-test
  • Updated ouroboros-consensus-cardano library
  • Updated ouroboros-consensus-cardano:db-analyzer
  • Updated ouroboros-consensus-cardano-test
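
The idea behind `setBlockForging` can be pictured with a standalone toy model: the `NodeKernel` keeps its block-forging credentials in a mutable cell so they can be installed or replaced after the kernel has started, rather than being fixed at initialisation. The types and signature below are illustrative stand-ins, not the real ouroboros-consensus API:

```haskell
-- Toy model (standalone; not the real ouroboros-consensus types).
import Control.Concurrent.STM

-- stand-in for the real BlockForging record
newtype BlockForging = BlockForging { forgeLabel :: String }

-- stand-in for the real NodeKernel
newtype NodeKernel = NodeKernel { nkForging :: TVar [BlockForging] }

newNodeKernel :: IO NodeKernel
newNodeKernel = NodeKernel <$> newTVarIO []

-- the control operation added by this PR (signature here is a guess):
-- replace whatever forging credentials are currently installed
setBlockForging :: NodeKernel -> [BlockForging] -> IO ()
setBlockForging kernel = atomically . writeTVar (nkForging kernel)

main :: IO ()
main = do
  kernel <- newNodeKernel
  setBlockForging kernel [BlockForging "core-node-0"]
  labels <- map forgeLabel <$> readTVarIO (nkForging kernel)
  print labels
```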

Checklist

  • Branch
    • [x] Commit sequence broadly makes sense
    • [x] Commits have useful messages
    • [x] New tests are added if needed and existing tests are updated
    • [x] If this branch changes Consensus and has any consequences for downstream repositories or end users, said changes must be documented in interface-CHANGELOG.md
    • [x] If this branch changes Network and has any consequences for downstream repositories or end users, said changes must be documented in interface-CHANGELOG.md
    • [x] If serialization changes, user-facing consequences (e.g. replay from genesis) are confirmed to be intentional.
  • Pull Request
    • [x] Self-reviewed the diff
    • [x] Useful pull request description at least containing the following information:
      • What does this PR change?
      • Why were these changes needed?
      • How does this affect downstream repositories and/or end-users?
      • Which ticket does this PR close (if any)? If it does, is it linked?
    • [x] Reviewer requested

coot avatar Jun 08 '22 15:06 coot

I am getting some test failures:

ouroboros-consensus-byron-test

byron
  Byron
    simple convergence: FAIL (295.37s)
      *** Failed! Falsified (after 4 tests):
      TestSetup {setupEBBs = NoEBBs, setupK = SecurityParam 1, setupTestConfig = TestConfig {initSeed = Seed 4918591009071197805, nodeTopology = NodeTopology (fromList [(CoreNodeId 0,fromList []),(CoreNodeId 1,fromList [CoreNodeId 0]),(CoreNodeId 2,fromList [CoreNodeId 0,CoreNodeId 1]),(CoreNodeId 3,fromList [CoreNodeId 1])]), numCoreNodes = NumCoreNodes 4, numSlots = NumSlots 67}, setupNodeJoinPlan = NodeJoinPlan (fromList [(CoreNodeId 0,SlotNo 1),(CoreNodeId 1,SlotNo 1),(CoreNodeId 2,SlotNo 1),(CoreNodeId 3,SlotNo 35)]), setupNodeRestarts = NodeRestarts (fromList [(SlotNo 21,fromList [(CoreNodeId 1,NodeRestart)]),(SlotNo 25,fromList [(CoreNodeId 1,NodeRestart)]),(SlotNo 28,fromList [(CoreNodeId 0,NodeRestart)]),(SlotNo 29,fromList [(CoreNodeId 1,NodeRestart)]),(SlotNo 33,fromList [(CoreNodeId 1,NodeRestart)]),(SlotNo 34,fromList [(CoreNodeId 2,NodeRestart)]),(SlotNo 35,fromList [(CoreNodeId 3,NodeRekey)]),(SlotNo 37,fromList [(CoreNodeId 1,NodeRestart)]),(SlotNo 44,fromList [(CoreNodeId 0,NodeRestart)]),(SlotNo 45,fromList [(CoreNodeId 1,NodeRestart)]),(SlotNo 55,fromList [(CoreNodeId 3,NodeRestart)]),(SlotNo 56,fromList [(CoreNodeId 0,NodeRestart)]),(SlotNo 62,fromList [(CoreNodeId 2,NodeRestart)]),(SlotNo 63,fromList [(Cor

... 
     
      consensus expected: True
      maxForkLength: 0
      There were unexpected CannotForges: fromList [(SlotNo 55,[PBftCannotForgeInvalidDelegation (KeyHash {unKeyHash = 911965b94fe206522fe8fb1683abf0bd39d9f092e9c9553ddbe35ae0})]),(SlotNo 59,[PBftCannotForgeInvalidDelegation (KeyHash {unKeyHash = 911965b94fe206522fe8fb1683abf0bd39d9f092e9c9553ddbe35ae0})]),(SlotNo 63,[PBftCannotForgeInvalidDelegation (KeyHash {unKeyHash = 911965b94fe206522fe8fb1683abf0bd39d9f092e9c9553ddbe35ae0}),PBftCannotForgeInvalidDelegation (KeyHash {unKeyHash = 911965b94fe206522fe8fb1683abf0bd39d9f092e9c9553ddbe35ae0})])]
      Use --quickcheck-replay=254433 to reproduce.

It's interesting that in slot 55 the node was not able to produce a block, and that's exactly the slot at which node 3 was scheduled to restart (if I interpret the TestSetup correctly).

The full ouroboros-consensus-byron-test log.

ouroboros-consensus-cardano-test

And also another one in ouroboros-consensus-cardano-test:

          Exception thrown while showing test case:
            Assertion failed
            CallStack (from HasCallStack):
              assert, called at src/Cardano/Crypto/KES/Mock.hs:98:9 in cardano-crypto-class-2.0.0-49b4213a40d8c0265da39810113c14e195218a76babd0ce512365b752abc1e6e:Cardano.Crypto.KES.Mock
              signKES, called at src/Cardano/Crypto/KES/Class.hs:355:40 in cardano-crypto-class-2.0.0-49b4213a40d8c0265da39810113c14e195218a76babd0ce512365b752abc1e6e:Cardano.Crypto.KES.Class
          
          Use --quickcheck-replay=900463 to reproduce.
          Use -p '/SerialiseDisk.roundtrip Header/' to rerun this test only.
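
For reference, with tasty-based test-suites both flags from the output above can be passed on the command line to replay just this failure (the exact test-suite target name is a guess):

```
cabal run ouroboros-consensus-cardano-test -- \
  --quickcheck-replay=900463 \
  -p '/SerialiseDisk.roundtrip Header/'
```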

The last one is caused by this assertion failure.

The full ouroboros-consensus-cardano-test log.

coot avatar Jun 14 '22 13:06 coot

Ah ha! Hydra built green with my typo fixup! 🙌

nfrisby avatar Jun 29 '22 00:06 nfrisby

I'm mentally gassed at this point in my day. I'll do a last pass tomorrow during/after our call.

nfrisby avatar Jun 29 '22 00:06 nfrisby

Hmm. Do we also need the shutdown logic of the addBlockRunner thread to flush the cdbBlocksToAdd queue, sending Nothing to every request it hadn't started processing before it began terminating?

@nfrisby do we ever restart the ChainDB? The queue is created when we start the ChainDB; if it has the same lifetime as the ChainDB then it doesn't matter, since a new ChainDB will use a new queue anyway. But I suspect we never re-open the ChainDB, is that right?
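
The flushing concern above can be sketched in isolation: on shutdown, drain whatever is left in the queue and complete each pending request with Nothing so that callers blocked on a result are released. This is a standalone model, not the real ChainDB types:

```haskell
-- Toy sketch of flushing cdbBlocksToAdd on shutdown (standalone model).
import Control.Concurrent.STM

-- each queued request carries a cell through which the runner reports
-- the outcome; Nothing means "never processed before shutdown"
data BlockToAdd blk = BlockToAdd
  { blockToAdd :: blk
  , varResult  :: TMVar (Maybe Bool)  -- stand-in for the real result type
  }

-- on shutdown: empty the queue and answer every pending request
flushBlocksToAdd :: TBQueue (BlockToAdd blk) -> STM ()
flushBlocksToAdd queue = do
  pending <- flushTBQueue queue
  mapM_ (\req -> putTMVar (varResult req) Nothing) pending
```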

coot avatar Nov 08 '22 19:11 coot

But I suspect we never re-open ChainDB, is this right?

That is right. Some tests do it, but not the implementation.

nfrisby avatar Nov 15 '22 20:11 nfrisby

Closed in favor of https://github.com/input-output-hk/ouroboros-consensus/pull/140

bolt12 avatar Jun 07 '23 14:06 bolt12