Avoid forcing the Ledger pulsers when serializing the Ledger State

Open dnadales opened this issue 1 year ago • 2 comments

Storing the ledger state on disk requires forcing the pulser, which consumes a significant amount of CPU time and memory, and raises garbage collection activity significantly. As a consequence, slot leadership checks are missed.

Possible ways to mitigate this are:

  • Store a representation of the thunks involved in the pulser.
  • Drop the information used by the pulser, and restore it when we load the ledger state from disk.

dnadales avatar Mar 11 '24 17:03 dnadales

Some notes. The pulser is a complicated structure. It is a polymorphic data type that is used at only one type: RewardPulser c ShelleyBase (RewardAns c). Here are the definitions of some of the data types stored in the pulser:

data FreeVars c = FreeVars
  { fvDelegs :: !(VMap VB VB (Credential 'Staking c) (KeyHash 'StakePool c))
  , fvAddrsRew :: !(Set (Credential 'Staking c))
  , fvTotalStake :: !Coin
  , fvProtVer :: !ProtVer
  , fvPoolRewardInfo :: !(Map (KeyHash 'StakePool c) (PoolRewardInfo c))
  }
data RewardAns c = RewardAns
  { accumRewardAns :: !(Map (Credential 'Staking c) (Reward c))
  , recentRewardAns :: !(RewardEvent c)
  }

Instantiating the RewardPulser at the only type it is ever used at, the type of its constructor is thus

 RSLP ::
    !Int ->
    !(FreeVars c) ->
    !(VMap.VMap VMap.VB VMap.VP (Credential 'Staking c) (CompactForm Coin)) ->
    !ans ->
    RewardPulser c m ans

Here is a pattern that matches RSLP at runtime. It helps us visualize what is stored inside and which components need to be serialized, so that only the initial state is written out.

RSLP itemsPerPulse 
     (FreeVars delegates rewaddress totalstake protocolversion rewinfo)
     itemsLeftToPulse
     (RewardAns accumAns deltaAnsLastPulse)

  • The itemsPerPulse never changes and is needed to reset to the initial state.
  • The whole FreeVars structure never changes and is needed to reset to the initial state.
  • The itemsLeftToPulse is an intermediate value. Its initial value is not available.
  • The whole RewardAns is an intermediate value, but it should be reset to (RewardAns Map.empty Map.empty) for the initial state.

So to reset we need to remember the initial state of the itemsLeftToPulse, which is a huge data structure with type

(VMap.VMap VMap.VB VMap.VP (Credential 'Staking c) (CompactForm Coin))

Luckily, this structure consists of a wrapper and a huge array. Even luckier, the function that updates it at each pulse, VMap.splitAt, creates a new wrapper but leaves the huge array untouched. So we can keep two copies (the initial value and the current value) with almost no increase in storage size.
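To illustrate the sharing argument, here is a minimal toy sketch. It does not use the real VMap API: `Slice`, `sliceFromList`, `splitAtSlice` and `sliceToList` are hypothetical names, and a plain `Data.Array` stands in for the huge backing array. The point is only that splitting rebuilds a small wrapper (offset and length) while both halves keep pointing at the same storage.

```haskell
import Data.Array (Array, listArray, (!))

-- | Toy stand-in for a VMap: a small wrapper (offset, length) over a
-- shared, immutable backing array. Hypothetical type, not cardano-ledger's.
data Slice a = Slice
  { sliceOffset  :: !Int          -- index of the first element in the window
  , sliceLength  :: !Int          -- number of elements in the window
  , sliceBacking :: Array Int a   -- shared storage, never copied by a split
  }

-- | Build a slice covering a whole list.
sliceFromList :: [a] -> Slice a
sliceFromList xs = Slice 0 n (listArray (0, n - 1) xs)
  where
    n = length xs

-- | Split a window in two. Only the wrappers are new; both results
-- reference the same backing array as the input.
splitAtSlice :: Int -> Slice a -> (Slice a, Slice a)
splitAtSlice k (Slice off len arr) =
  (Slice off k' arr, Slice (off + k') (len - k') arr)
  where
    k' = max 0 (min k len)  -- clamp so the split never leaves the window

-- | Read the elements visible through the window.
sliceToList :: Slice a -> [a]
sliceToList (Slice off len arr) = [arr ! i | i <- [off .. off + len - 1]]
```

In this model, holding on to the original slice next to the remainder returned by `splitAtSlice` costs two extra `Int` fields and a pointer, which is the property the proposal relies on.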

So the strategy is to change the pulser constructor to have the type

 RSLP ::
    !Int ->
    !(FreeVars c) ->
    !(VMap.VMap VMap.VB VMap.VP (Credential 'Staking c) (CompactForm Coin)) ->
    -- ^ the initial value
    !(VMap.VMap VMap.VB VMap.VP (Credential 'Staking c) (CompactForm Coin)) ->
    -- ^ the current value
    !ans ->
    RewardPulser c m ans

To serialize, we get the protVer from the FreeVars to know how to serialize, and then

  1. serialize the Int
  2. serialize the FreeVars
  3. serialize the initial value

To deserialize

  1. n <- deserialize
  2. free <- deserialize
  3. initial <- deserialize
  4. return (RSLP n free initial initial (RewardAns Map.empty Map.empty))
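The round trip can be sketched with a small toy model. None of the names below are the cardano-ledger types: `Pulser`, `rate`, `store` and `restore` are hypothetical, association lists stand in for VMap and Map, and a single `Int` stands in for FreeVars. The sketch only shows the shape of the idea: write out the constant parts, and rebuild a fresh pulser whose current value equals its initial value.

```haskell
-- Toy model: names echo the discussion but are NOT the cardano-ledger types.
data Pulser = Pulser
  { perPulse :: !Int             -- constant: items handled per pulse
  , rate     :: !Int             -- constant: hypothetical stand-in for FreeVars
  , initial  :: [(String, Int)]  -- constant: the full work list
  , current  :: [(String, Int)]  -- intermediate: items left to pulse
  , answer   :: [(String, Int)]  -- intermediate: rewards computed so far
  } deriving (Eq, Show)

-- | A fresh pulser: nothing processed yet, empty answer.
newPulser :: Int -> Int -> [(String, Int)] -> Pulser
newPulser n r items = Pulser n r items items []

-- | One pulse: process the next chunk and accumulate its rewards.
pulse :: Pulser -> Pulser
pulse p = p { current = rest, answer = answer p ++ map reward chunk }
  where
    (chunk, rest) = splitAt (max 1 (perPulse p)) (current p)
    reward (who, stake) = (who, stake * rate p)

-- | Run pulses until no items are left.
complete :: Pulser -> Pulser
complete p
  | null (current p) = p
  | otherwise = complete (pulse p)

-- | Only the constant parts go to disk; intermediate state is dropped.
type Stored = (Int, Int, [(String, Int)])

store :: Pulser -> Stored
store p = (perPulse p, rate p, initial p)

-- | Rebuild the initial state: current = initial, answer = empty.
restore :: Stored -> Pulser
restore (n, r, items) = newPulser n r items
```

In this model, `complete (restore (store p))` yields the same `answer` as `complete p` for any `p` reached by pulsing a fresh pulser, but the restored one repeats every pulse that `p` had already performed. Whether the same equality holds for the real reward calculation is what the proposal would need to confirm.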

Hopefully, completing this pulser will give the same answer as completing the one we serialized. It will have to do some extra work, because it restarts from the initial state.

TimSheard avatar Mar 12 '24 00:03 TimSheard

Wow, thank you for the excellent analysis. The solution makes sense; I wonder if it'd be possible to benchmark the memory consumption of patch #4196 against the baseline (current master).

dnadales avatar Mar 13 '24 10:03 dnadales