plutus icon indicating copy to clipboard operation
plutus copied to clipboard

PIR case-of-case is exponential and causes OOMs

Open koslambrou opened this issue 1 year ago • 6 comments

Summary

After updating to the latest plutus-tx, we noticed some significant regression in the instances generated by PlutusTx.makeIsDataIndexed ''C [...] .

Compilation times skyrocketed and memory usage is between 60-100GB of RAM (I have a lot of memory).

The issue seems to be the simplification process of the script compilation. For now, we reduced as much as possible the simplifier iterations with PlutusTx.Plugin:max-simplifier-iterations-pir=1. After that change, memory usage is back to normal.

I don't know exactly the issue, but I started by writing my own UnsafeFromData instance for C which resulted in the same issue. Then, I noticed that the amount of used memory seemed to be proportional to the number of constructors of C.

Steps to reproduce the behavior

For a minimal reproductible example:

import PlutusTx qualified
import PlutusTx.Prelude hiding (Semigroup (..), unless)
import Prelude qualified as H

data C = A | B | C | D | E | F | G | H | I | J | K | L | M | N | O | P | Q | R

PlutusTx.makeIsDataIndexed ''C [ ('A, 0) , ('B, 1) , ('C, 2) , ('D, 3) , ('E, 4) , ('F, 5) , ('G, 6) , ('H, 7) , ('I, 8) , ('J, 9) , ('K, 10) , ('L, 11) , ('M, 12) , ('N, 13) , ('O, 14) , ('P, 15) , ('Q, 16) , ('R, 17) ]

{-# INLINABLE validator #-}
validator ::
  BuiltinData
  -> BuiltinData
  -> BuiltinData
  -> Bool
validator d _ _ =
  let !_ = PlutusTx.unsafeFromBuiltinData @C d
   in True

unappliedValidator ::
  PlutusTx.CompiledCode (PlutusTx.BuiltinData -> PlutusTx.BuiltinData -> PlutusTx.BuiltinData -> BuiltinUnit)
unappliedValidator =
   $$(PlutusTx.compile [|| \d r c -> check $ validator d r c ||])

Actual Result

No response

Expected Result

No response

Describe the approach you would take to fix this

No response

System info

Plutus: 88ad493ad94ecdb34196f7a9aef558613bdee8b2

koslambrou avatar Jun 06 '24 13:06 koslambrou

There's no comment or any other indication on how this issue was triaged and why and I haven't seen any internal discussion on when we're going to address the issue and who's responsible for making sure we will, so I'm "untriaging" the issue. Sorry!

effectfully avatar Jun 13 '24 02:06 effectfully

Thank you for creating the issue and the PR with a reproducer! This is a perfect bug report.

I just took a look and the bug is indeed in the case-of-case transformation as I originally guessed. Defining caseOfCase like this:

caseOfCase binfo conservative newAnn t = pure t

makes exponential memory consumption vanish.

The culprit is most likely a known bug (see this thread) in the implementation of case-of-case.

effectfully avatar Jun 25 '24 04:06 effectfully

Oh and sorry it took so long for us to look into the issue! And thank you again for the amazing report.

effectfully avatar Jun 25 '24 04:06 effectfully

I thought I might be able to sketch a quick solution, but it didn't work out

The culprit is most likely a known bug (see this thread) in the implementation of case-of-case.

I've tried making caseOfCase do caseReduce to remove the exponential blow up in this case, like this:

-    liftQuote $ transformMOf termSubterms (processTerm binfo vinfo conservative newAnn) t
+        procTerm = fmap CaseReduce.caseReduce . processTerm binfo vinfo conservative newAnn
+    liftQuote $ transformMOf termSubterms procTerm t

It didn't help, so the problem must be elsewhere.

I've also tried making

-        , _posCaseOfCaseConservative = False
+        , _posCaseOfCaseConservative = True

as non-conservative case-of-case can also be exponential. Didn't help either.

Note [Case-of-case and duplicating code] described another source of exponentiality:

If the inner branch bodies are conapps (see Note [Case-of-case and conapps]) then
that will reduce but not necessarily eliminate the duplication. Consider

case (case s of { C1 a -> True; C2 b -> True; C3 c -> False }) of
  True -> v
  False -> w

--> (case-of-case)

case s of
  C1 a -> case True of { True -> v; False -> w; }
  C2 b -> case True of { True -> v; False -> w; }
  C3 c -> case False of { True -> v; False -> w; }

--> (case-of-known-constructor)

case s of
  C1 a -> v
  C2 b -> v
  C3 c -> w

i.e. we still duplicate v!

Maybe it's that, maybe something else. Fixing this bug requires taking a deep dive and looking at what actually happens during optimization.

In the meantime I think we should simply turn of case-of-case with all its many a source of exponentiality.

effectfully avatar Jun 25 '24 17:06 effectfully

I've now removed case-of-case entirely from the optimization pipeline in #6248.

Please check if this addresses your problem.

I'll keep the issue open until we figure out what went wrong with case-of-case and reintroduce it into the optimization pipeline.

effectfully avatar Jun 27 '24 08:06 effectfully

It seems turning case-of-case off solved performance issues for a lot of folks. We should fix the issues with the case-of-case pass and enable it back but in meantime I'm going to claim that it's a low priority task and move it back to the backlog.

effectfully avatar Jul 15 '24 14:07 effectfully