Avoid converting to and from compact representation for multi-assets.

Open dnadales opened this issue 1 year ago • 2 comments

Storing the ledger state on disk requires converting multi-assets out of their compact representation. The conversion allocates a significant amount of memory and sharply increases garbage collection activity. As a result, slot leadership checks are missed.

dnadales, Mar 11 '24 17:03

I dug around to understand why this might be expensive. The most expensive thing to serialize is the UTxO. The UTxO contains TxOuts, each TxOut has a Value, and a Value may contain a MultiAsset. Here is how the Alonzo TxOut is serialized:


instance
  (Era era, Val (Value era)) =>
  EncCBOR (AlonzoTxOut era)
  where
  encCBOR (TxOutCompact addr cv) =
    encodeListLen 2
      <> encCBOR addr
      <> encCBOR cv
  encCBOR (TxOutCompactDH addr cv dh) =
    encodeListLen 3
      <> encCBOR addr
      <> encCBOR cv
      <> encCBOR dh

We will concentrate on the pattern TxOutCompact (the pattern TxOutCompactDH is similar):

pattern TxOutCompact ::
  (Era era, Val (Value era), HasCallStack) =>
  CompactAddr (EraCrypto era) ->
  CompactForm (Value era) ->
  AlonzoTxOut era
pattern TxOutCompact addr vl <-
  (viewCompactTxOut -> (addr, vl, SNothing))
  where
    TxOutCompact cAddr cVal = mkTxOutCompact (decompactAddr cAddr) cAddr cVal SNothing

Note that cVal has type CompactForm (Value era). How is that serialized?

instance Crypto c => Compactible (MaryValue c) where
  newtype CompactForm (MaryValue c) = CompactValue (CompactValue c)
    deriving (Eq, Typeable, Show, NoThunks, EncCBOR, DecCBOR, NFData)
  toCompact x = CompactValue <$> to x
  fromCompact (CompactValue x) = from x

The EncCBOR instance is derived, which means it uses the underlying instance EncCBOR (CompactValue c):

instance Crypto c => EncCBOR (CompactValue c) where
  encCBOR = encCBOR . from

So `from` expands the compact value into a Map of Maps, and that is what gets serialized. We could probably write out the bytes inside CompactValue directly instead, but that would break backward compatibility, and we could no longer sync the chain.
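To make the delegation chain concrete, here is a minimal sketch using toy types. Everything in it is an assumption made for illustration: `Enc`, `Packed`, `Wrapped`, `unpack` and `encodeExpanded` are hypothetical stand-ins for EncCBOR, CompactValue, CompactForm, `from` and the Map encoder, and the output is a list of string tokens rather than CBOR.

```haskell
{-# LANGUAGE GeneralizedNewtypeDeriving #-}
module Main where

import qualified Data.Map.Strict as Map
import Data.Word (Word64)

-- Toy encoder class standing in for EncCBOR (string tokens, not CBOR bytes).
class Enc a where
  enc :: a -> [String]

-- Hypothetical packed form: a flat list of (policy, asset, quantity) triples.
newtype Packed = Packed [(String, String, Word64)]

-- Expanded form: a map of maps.
type Expanded = Map.Map String (Map.Map String Word64)

-- Stand-in for 'from': rebuilds the nested maps from the packed form.
unpack :: Packed -> Expanded
unpack (Packed ts) =
  Map.fromListWith (Map.unionWith (+)) [(p, Map.singleton a q) | (p, a, q) <- ts]

-- Encodes the expanded form: outer size, then each policy with its inner map.
encodeExpanded :: Expanded -> [String]
encodeExpanded m =
  ("map:" ++ show (Map.size m))
    : concat
        [ p : ("map:" ++ show (Map.size inner))
            : concat [[a, show q] | (a, q) <- Map.toList inner]
        | (p, inner) <- Map.toList m
        ]

-- Mirrors `encCBOR = encCBOR . from`: every call unpacks before encoding.
instance Enc Packed where
  enc = encodeExpanded . unpack

-- Mirrors the derived instance: the newtype reuses the instance of the
-- type it wraps, so it also pays for the unpacking step.
newtype Wrapped = Wrapped Packed
  deriving (Enc)

sample :: Wrapped
sample = Wrapped (Packed [("policyA", "tok1", 5), ("policyA", "tok2", 7), ("policyB", "tok1", 1)])

main :: IO ()
main = mapM_ putStrLn (enc sample)
```

In this toy model, every call to `enc` on a `Wrapped` value builds the nested maps first, which is the allocation pattern suspected of causing the GC pressure.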

What we need is two specially designed functions, used only by nodes to snapshot the ledger state and not for syncing the chain. They would have a lot in common with the current serialisers, differing only in how they handle the TxOut of the UTxO.
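The shape of that idea can be sketched with toy types: an encoder and a decoder for the whole state that take the TxOut codec as an argument, so the snapshot path and the chain path share everything else. All names here (`Ledger`, `encodeLedgerWith`, `decodeLedgerWith`, `encChain`, `encSnap` and so on) are hypothetical, and string tokens stand in for CBOR; this is not the cardano-ledger API.

```haskell
module Main where

import qualified Data.Map.Strict as Map
import Text.Read (readMaybe)

-- Toy stand-ins for the ledger types.
data TxOut = TxOut String Integer deriving (Eq, Show)
data Ledger = Ledger Integer (Map.Map Int TxOut) deriving (Eq, Show)

-- Whole-state encoder, parameterised over the TxOut encoder.
encodeLedgerWith :: (TxOut -> [String]) -> Ledger -> [String]
encodeLedgerWith encTxOut (Ledger e u) =
  show e : show (Map.size u) : concat [show k : encTxOut o | (k, o) <- Map.toList u]

-- Whole-state decoder, parameterised over the TxOut decoder.
-- The TxOut decoder consumes tokens and returns the unconsumed rest.
decodeLedgerWith :: ([String] -> Maybe (TxOut, [String])) -> [String] -> Maybe Ledger
decodeLedgerWith decTxOut (e : n : rest) = do
  e' <- readMaybe e
  n' <- readMaybe n
  entries <- go n' rest
  pure (Ledger e' (Map.fromList entries))
  where
    go :: Int -> [String] -> Maybe [(Int, TxOut)]
    go 0 [] = Just []
    go 0 _ = Nothing
    go k (key : ts) = do
      key' <- readMaybe key
      (o, ts') <- decTxOut ts
      ((key', o) :) <$> go (k - 1) ts'
    go _ [] = Nothing
decodeLedgerWith _ _ = Nothing

-- "Chain" codec: two tokens per TxOut (the backward-compatible layout).
encChain :: TxOut -> [String]
encChain (TxOut a c) = [a, show c]

decChain :: [String] -> Maybe (TxOut, [String])
decChain (a : c : ts) = do
  n <- readMaybe c
  pure (TxOut a n, ts)
decChain _ = Nothing

-- "Snapshot" codec: one token per TxOut (a different, snapshot-only layout).
-- Assumes the address contains no '|' character.
encSnap :: TxOut -> [String]
encSnap (TxOut a c) = [a ++ "|" ++ show c]

decSnap :: [String] -> Maybe (TxOut, [String])
decSnap (t : ts) = case break (== '|') t of
  (a, '|' : c) -> do
    n <- readMaybe c
    pure (TxOut a n, ts)
  _ -> Nothing
decSnap [] = Nothing

sample :: Ledger
sample = Ledger 7 (Map.fromList [(0, TxOut "addr1" 10), (1, TxOut "addr2" 25)])

main :: IO ()
main = do
  print (decodeLedgerWith decChain (encodeLedgerWith encChain sample) == Just sample)
  print (decodeLedgerWith decSnap (encodeLedgerWith encSnap sample) == Just sample)
```

Because only the TxOut codec varies, the chain format stays untouched while the snapshot format is free to store TxOut in whatever layout is cheapest to write.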

TimSheard, Mar 25 '24 17:03

Fortunately, with the Coders library this is not much code at all. For making a snapshot:

-- | An Encode function for NewEpochState that does something different with the TxOut in the UTxO
--   (encode (serialNES encodeTxOut nes)) can be used just like (encCBOR nes)
serialNES ::
  ( Era era
  , EncCBOR (TxOut era)
  , EncCBOR (GovState era)
  , EncCBOR (StashedAVVMAddresses era)
  ) => (TxOut era -> Encoding) -> NewEpochState era -> Encode ('Closed 'Dense) (NewEpochState era)
serialNES encTxOut (NewEpochState e bp bc es ru pd av) =
  (Rec NewEpochState
   !> To e
   !> To bp
   !> To bc
   !> (let EpochState acct ls snap myop = es
       in (Rec EpochState
           !> To acct
           !> (let LedgerState utxo cert = ls
               in (Rec (\ cert u -> LedgerState u cert)
                   !> To cert -- certstate first to improve sharing
                   !> (let UTxOState (UTxO u) dp fs us sd don = utxo
                       in (Rec (\ u -> UTxOState (UTxO u))
                           !> E (encodeMap encCBOR encTxOut) u
                           !> To dp
                           !> To fs
                           !> To us
                           !> To sd
                           !> To don))))
           !> To snap
           !> To myop))
   !> To ru
   !> To pd
   !> To av)

And if we ever have to recover from a snapshot, we can use:

-- | A Decode function for NewEpochState that does something different with the TxOut in the UTxO
--   It can Decode what 'serialNES' encodes, and can use more efficient algorithms on TxOut
--   in particular how it handles the compact form of MultiAsset.
deSerialNES ::
  ( Era era
  , DecCBOR (TxOut era)
  , DecCBOR (GovState era) -- DecShareCBOR ??
  , DecCBOR (CertState era)
  , DecCBOR (StashedAVVMAddresses era)
  , DecCBOR (IncrementalStake (EraCrypto era)) -- FIXME we have DecShareCBOR instance
  , DecCBOR (NonMyopic (EraCrypto era))  
  ) => (forall s . Decoder s (TxOut era)) -> Decode ('Closed 'Dense) (NewEpochState era)
deSerialNES decTxOut =
  (RecD NewEpochState
   <! From 
   <! From 
   <! From 
   <! (RecD EpochState
       <! From 
       <! (RecD (\ cert utxo -> LedgerState utxo cert)
           <! From  -- Cert state first to improve sharing
           <! (RecD (\ u -> UTxOState (UTxO u))
               <! D (decodeMap decCBOR decTxOut)
               <! From 
               <! From 
               <! From 
               <! From 
               <! From))
           <! From 
           <! From)
   <! From
   <! From
   <! From)

There are a few details about sharing that still need to be worked out.

TimSheard, Mar 25 '24 20:03

Closing this ticket as a duplicate of #4078.

lehins, Nov 14 '24 00:11