agda Exponential module chain leads to infeasible scope checking

-- {-# OPTIONS -v scope:10 #-}

data Bool : Set where
  true false : Bool

postulate
  f : Bool → Bool
  g : Bool → Bool
  P : Bool → Set

module M0 (x : Bool) where
  postulate
    a : P (f x)
    b : P (g x)

module       M1 (x : Bool) where
  module M = M0
  module N = M (f x)
  module O = M (g x)

module       M2 (x : Bool) where
  module M = M1
  module N = M (f x)
  module O = M (g x)

module       M3 (x : Bool) where
  module M = M2
  module N = M (f x)
  module O = M (g x)

module       M4 (x : Bool) where
  module M = M3
  module N = M (f x)
  module O = M (g x)

module       M5 (x : Bool) where
  module M = M4
  module N = M (f x)
  module O = M (g x)

module       M6 (x : Bool) where
  module M = M5
  module N = M (f x)
  module O = M (g x)

module       M7 (x : Bool) where
  module M = M6
  module N = M (f x)
  module O = M (g x)

module       M8 (x : Bool) where
  module M = M7
  module N = M (f x)
  module O = M (g x)

{- DONT DO THIS AT HOME:
module       M9 (x : Bool) where
  module M = M8
  module N = M (f x)
  module O = M (g x)
-}

On a Dell E7440 (16GB, SSD), up to and including M8, it takes 40sec (16GB allocated), including M8 it takes 5min 40 sec (54GB allocated).

This performance bottleneck can probably be attributed to the eager copying behavior of module aliases.

Sep 13 '15 10:09 andreasabel

Basically all the time is spent on serialisation, and removing the insideScope from the interface had no measurable effect on the time, but changing the copied definitions to dummy definitions buys a factor 4 on serialisation.

One thing to try could be to remove copied definitions from the signature in the interface file and just store the fact that a module application needs to happen when you load it. It would mean more (potentially a lot?) of extra work when loading interface files though, so it's not clear that it would be worth it.

I've fixed things so that you only copy definitions that you actually bring into scope (with the help of an old reverted commit by Nisse: e6cb239e), so the simple solution to these problems is to keep the number of definitions you export from applied modules at a sensible level (that means no exponential module chains!).

No changes:

Total                      12,135ms
Miscellaneous                   2ms
Parsing                         1ms
Parsing.Operators               4ms
Import                         10ms
Deserialization                 0ms
Scoping                       882ms
Typing                        917ms
Typing.OccursCheck              0ms
Termination                     0ms
Positivity                      0ms
Injectivity                     0ms
ProjectionLikeness              0ms
Highlighting                    2ms
Serialization               8,278ms
Serialization.Sort             84ms
Serialization.BinaryEncode  1,859ms
Serialization.Compress         82ms
ModuleName                      9ms

No scope in interface

Total                      12,128ms
Miscellaneous                   2ms
Parsing                         1ms
Parsing.Operators               4ms
Import                         10ms
Deserialization                 0ms
Scoping                       966ms
Typing                      1,161ms
Typing.OccursCheck              0ms
Termination                     0ms
Positivity                      0ms
Injectivity                     0ms
ProjectionLikeness              0ms
Highlighting                    2ms
Serialization               8,350ms
Serialization.Sort             76ms
Serialization.BinaryEncode  1,463ms
Serialization.Compress         66ms
ModuleName                     22ms

Minimal copied definitions in signature (all copies replaced by postulate f : Setω).

Total                      3,879ms
Miscellaneous                  3ms
Parsing                        1ms
Parsing.Operators              4ms
Import                         5ms
Deserialization                0ms
Scoping                      848ms
Typing                       288ms
Typing.OccursCheck             0ms
Termination                    0ms
Positivity                     0ms
Injectivity                    0ms
ProjectionLikeness             0ms
Highlighting                   4ms
Serialization              2,038ms
Serialization.Sort            32ms
Serialization.BinaryEncode   589ms
Serialization.Compress        27ms
ModuleName                    35ms

Sep 18 '15 22:09 UlfNorell

The module alias isn't causing the performance problem. Replacing it by a module application has the same performance characteristics:

data Bool : Set where
  true false : Bool

postulate
  f : Bool → Bool
  g : Bool → Bool
  h : Bool → Bool
  P : Bool → Set

module M0 (x : Bool) where
  postulate
    a : P (f x)
    b : P (g x)

module       M1 (x : Bool) where
  module M = M0 (h x)
  module N = M0 (f x)
  module O = M0 (g x)

module       M2 (x : Bool) where
  module M = M1 (h x)
  module N = M1 (f x)
  module O = M1 (g x)

module       M3 (x : Bool) where
  module M = M2 (h x)
  module N = M2 (f x)
  module O = M2 (g x)

module       M4 (x : Bool) where
  module M = M3 (h x)
  module N = M3 (f x)
  module O = M3 (g x)
...

May 28 '18 14:05 UlfNorell

Here is an example without application that exhibits exponential scoping performance:

module A0 where

module A1 where
  module U = A0
  module V = A0

module A2 where
  module U = A1
  module V = A1

module A3 where
  module U = A2
  module V = A2

module A4 where
  module U = A3
  module V = A3

module A5 where
  module U = A4
  module V = A4

module A6 where
  module U = A5
  module V = A5

module A7 where
  module U = A6
  module V = A6

module A8 where
  module U = A7
  module V = A7

module A9 where
  module U = A8
  module V = A8

module A10 where
  module U = A9
  module V = A9

module A11 where
  module U = A10
  module V = A10

module A12 where
  module U = A11
  module V = A11

Sep 26 '22 20:09 sattlerc

That might be what's affecting some of the deeper modules in agda-categories. I've had to remove a number of 'helper modules' to help with type checking time. Those helpers really were... helpful.

Sep 26 '22 20:09 JacquesCarette

I suspect that also may have been a factor in the cost explosion in my early attempts at formalizing category theory (I never got beyond interchange of vertical and horizontal composition of natural transformations). I used to think module definition statements were evaluated lazily (like other definitions) and liberally used nested helper modules and also shorthands like:

record Functor (C D : Category) [...] where
  module C = Category C
  module D = Category D
  [...]

record NatTrans [...] where
  module F = Functor F
  module G = Functor G
  [...]

Relatedly, I suspect the performance of formalizations of abstract areas like category theory with lots of iterated type dependency would be a lot better if we had a proper parametric module system that does away with the lambda lifting hack (so that within a module scope, the module parameters are really fixed). Lambda lifting ought to happen only at an explicit open module statement. Instead, there should be a syntax for applying a module in an expression, so that given

module M (x y : A) where
  f : (z : B) → C

we can write something like (M x y).f z or M[x, y].f z instead of M.f x y z. This would also help with the surface-level problem of argument shifting for mixfix operators that will be familiar to anyone who has seen Agda print something like (C Category.∘ g) f.

Sep 26 '22 21:09 sattlerc