Avoid forcing the Ledger pulsers when serializing the Ledger State
Storing the ledger state on disk requires forcing the pulser, which consumes a significant amount of CPU time and memory, and raises garbage collection activity significantly. As a consequence, slot leadership checks are missed.
Possible ways to mitigate this are:
- Store a representation of the thunks involved in the pulser.
- Drop the information used by the pulser, and restore it when we load the ledger state from disk.
Some notes. The pulser is a complicated structure. It is a polymorphic data type that is used at only one type: RewardPulser c ShelleyBase (RewardAns c). Here are the definitions of some of the data types stored in the pulser:
data FreeVars c = FreeVars
{ fvDelegs :: !(VMap VB VB (Credential 'Staking c) (KeyHash 'StakePool c))
, fvAddrsRew :: !(Set (Credential 'Staking c))
, fvTotalStake :: !Coin
, fvProtVer :: !ProtVer
, fvPoolRewardInfo :: !(Map (KeyHash 'StakePool c) (PoolRewardInfo c))
}
data RewardAns c = RewardAns
{ accumRewardAns :: !(Map (Credential 'Staking c) (Reward c))
, recentRewardAns :: !(RewardEvent c)
}
Instantiating the RewardPulser at the only type it is ever used at, the type of its constructor is thus
RSLP ::
!Int ->
!(FreeVars c) ->
!(VMap.VMap VMap.VB VMap.VP (Credential 'Staking c) (CompactForm Coin)) ->
!ans ->
RewardPulser c m ans
Here is a pattern that matches RSLP at runtime. It helps us visualize what is stored inside, and which components need to be serialized so that only the initial state is written out.
RSLP itemsPerPulse
(FreeVars delegates rewaddress totalstake protocolversion rewinfo)
itemsLeftToPulse
(RewardAns accumAns deltaAnsLastPulse)
- The itemsPerPulse never changes, and is needed to reset to the initial state.
- The whole FreeVars structure never changes, and is needed to reset to the initial state.
- The itemsLeftToPulse is an intermediate value. Its initial value is not available.
- The whole RewardAns is an intermediate value, but it should be reset to (RewardAns Map.empty Map.empty) for the initial state.
So to reset we need to remember the initial state of the itemsLeftToPulse, which is a huge data structure with type
(VMap.VMap VMap.VB VMap.VP (Credential 'Staking c) (CompactForm Coin))
Luckily, this structure consists of a wrapper and a huge array. Even luckier, the function that updates it at each pulse, VMap.splitAt, creates a new wrapper but leaves the huge array untouched. So we can keep two copies (the initial value and the current value) with almost no increase in storage size.
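To illustrate why the second copy is cheap, here is a minimal sketch of a wrapper over a shared array. It is a toy model only, not the VMap implementation: Slice, sliceFromList, splitSlice and sliceToList are hypothetical names, and Data.Array stands in for the real backing vectors.

```haskell
import Data.Array (Array, listArray, (!))

-- Toy stand-in for VMap: a small wrapper (offset and length)
-- over a backing array that is shared, never copied.
data Slice a = Slice
  { sliceOffset :: !Int
  , sliceLength :: !Int
  , sliceArray  :: !(Array Int a)
  }

sliceFromList :: [a] -> Slice a
sliceFromList xs = Slice 0 n (listArray (0, n - 1) xs)
  where
    n = length xs

-- Analogue of splitAt: both results are new wrappers that
-- point at the same backing array as the input.
splitSlice :: Int -> Slice a -> (Slice a, Slice a)
splitSlice k (Slice off len arr) =
  (Slice off k' arr, Slice (off + k') (len - k') arr)
  where
    k' = max 0 (min k len)

sliceToList :: Slice a -> [a]
sliceToList (Slice off len arr) = [arr ! i | i <- [off .. off + len - 1]]
```

In this model, holding on to the original Slice while pulsing through the remainder costs only one extra wrapper, because every wrapper references the same array.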
So the strategy is to change the pulser constructor to have the type
RSLP ::
!Int ->
!(FreeVars c) ->
!(VMap.VMap VMap.VB VMap.VP (Credential 'Staking c) (CompactForm Coin)) ->
-- ^ the initial value
!(VMap.VMap VMap.VB VMap.VP (Credential 'Staking c) (CompactForm Coin)) ->
-- ^ the current value
!ans ->
RewardPulser c m ans
To serialize, we get the protVer from the FreeVars to know how to serialize, then:
- serialize the Int
- serialize the FreeVars
- serialize the initial value
To deserialize:
- n <- deserialize
- free <- deserialize
- initial <- deserialize
- return (RSLP n free initial initial (RewardAns Map.empty Map.empty))
Hopefully, completing this pulser will give the same answer as completing the one we serialized. It will have to do some extra work, since it restarts from the initial state and repeats any pulses that had already run.
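The round trip described above can be sketched on a heavily simplified pulser. Everything here is a hypothetical stand-in rather than the ledger code: ToyPulser replaces RewardPulser, an Integer factor replaces FreeVars, a plain list replaces the VMap, and a tuple replaces the real encoding. The sketch assumes itemsPerPulse is positive.

```haskell
import qualified Data.Map.Strict as Map

-- Hypothetical, simplified pulser.
data ToyPulser = ToyPulser
  { tpItemsPerPulse :: !Int
  , tpRewardFactor  :: !Integer                -- stands in for FreeVars
  , tpInitial       :: ![(String, Integer)]    -- the initial value, never modified
  , tpCurrent       :: ![(String, Integer)]    -- the current value (items left to pulse)
  , tpAns           :: !(Map.Map String Integer) -- accumulated answer so far
  }

-- Process one batch of items and fold it into the answer.
pulse :: ToyPulser -> ToyPulser
pulse p = p {tpCurrent = rest, tpAns = foldr step (tpAns p) now}
  where
    (now, rest) = splitAt (tpItemsPerPulse p) (tpCurrent p)
    step (cred, stake) = Map.insertWith (+) cred (stake * tpRewardFactor p)

-- Pulse until nothing is left (assumes tpItemsPerPulse > 0).
complete :: ToyPulser -> Map.Map String Integer
complete p
  | null (tpCurrent p) = tpAns p
  | otherwise = complete (pulse p)

-- Only the initial state is written out; intermediate values are dropped.
type Serialized = (Int, Integer, [(String, Integer)])

serialize :: ToyPulser -> Serialized
serialize p = (tpItemsPerPulse p, tpRewardFactor p, tpInitial p)

-- Restore with current = initial and an empty answer.
deserialize :: Serialized -> ToyPulser
deserialize (n, free, initial) = ToyPulser n free initial initial Map.empty
```

In this toy model, serializing a pulser that is part-way through and completing the restored copy gives the same map as completing the original, at the cost of redoing the pulses that had already run.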
Wow, thank you for the excellent analysis. The solution makes sense; I wonder if it'd be possible to benchmark the memory consumption of patch #4196 against the baseline (current master).