linter icon indicating copy to clipboard operation
linter copied to clipboard

Compilation time explodes in combination with shapeless

Open vdichev opened this issue 8 years ago • 2 comments

When a project has converted some data structures to e.g. shapeless records, compilation time has naturally increased. However, using linter, it spends a disproportionate amount of time in the linter-typed and linter-typed-inter. I've done some measurements using "-verbose" and the scalac-aspects project (https://github.com/gkossakowski/scalac-aspects) and for a particular file:

  • Without sqltyped, without linter: total: 917197, typer: 360658
  • without sqltyped, with linter: total: 1766577, typer: 513275, linter-typed: 392068, linter-typed-interpreter: 221284
  • with sqltyped, without linter: total: 5001996, typer: 3124241
  • with sqltyped, with linter: total: 32682659, typer: 3281671, linter-typed: 15140647, linter-typed-interpreter: 12101858

So while linter phases took about 35% of the time without sqltyped, they took a whopping 83% with sqltyped. So while the time in typer increased 8-9 times, linter phases increased 43 times!

I guess this is because shapeless generates a very deep type tree, but I didn't expect the linter time to grow so drastically. Do you have an idea how this could be improved? Perhaps a way to indicate to the traverser to prune a type branch and not process it at all? For instance, if there is a shapeless HList, this might indicate a very deep type structure and could be skipped, as there isn't much value in statically checking it. Maybe if shapeless indicates in some way that some of the members are synthetically generated?

Any tips are welcome.

vdichev avatar May 20 '16 13:05 vdichev

I have a few ideas what might improve the situation with very complex types: One is that almost all types are put into a map in the first phase to later determine which types are inferred and some other use I can't remember, but those checks can maybe be refactored to work without this map or as you've said - discarded if the types get too deep. Next thing is that more synthetic trees could probably be skipped, but determining where to add these tree.isSynthetic checks to the pattern matches could be hard (unless a profiler pinpoints some obvious candidates). The third one is refactoring hacky checks, which convert the ast or type tree into string for checking, or do another deep tree operation, to maybe not do it for every level again and again.

I'd be a good idea to add some big complex test cases, and run the tests with a profiler. I don't know how to write a test that would be a good approximation of shapeless code, without including shapeless in test (which is also an option if necessary).

Related: https://github.com/HairyFotr/linter/issues/20 (I see I'm really bad with time estimates :))

You wanna try any of these, or have other thoughts? There might be (ironically) very messy code around the parts I mentioned, so please don't bother too much if it looks bad. Also, I'll be travelling / camping for a few weeks now, so my responses might be slow as I don't know when I'll have internet, or time and means to write or test code.

HairyFotr avatar May 21 '16 21:05 HairyFotr

Thanks for the ideas! I know performance is often not the first priority and it certainly takes a lot of time to measure and improve. Let me check which area would yield biggest returns.

vdichev avatar May 25 '16 07:05 vdichev