scala-dev icon indicating copy to clipboard operation
scala-dev copied to clipboard

Eliminate unnecessary module loads

Open retronym opened this issue 10 years ago • 16 comments
trafficstars

After inlining the module instance field loads remain - they have to, the static initializer of the module class may have side effects:

object T {
  println("hui")
  @inline def f = 1
}

class C {
  def foo = T.f + T.f
}

generates

  public foo()I
    GETSTATIC T$.MODULE$ : LT$;
    DUP
    IFNONNULL L0
    ACONST_NULL
    ATHROW
   L0
   FRAME SAME1 T$
    ASTORE 1
    ICONST_1
    ISTORE 2
    ILOAD 2
    GETSTATIC T$.MODULE$ : LT$;
    DUP
    IFNONNULL L1
    ACONST_NULL
    ATHROW
   L1
   FRAME FULL [C T$ I] [I T$]
    ASTORE 3
    ICONST_1
    ISTORE 4
    ILOAD 4
    IADD
    IRETURN

The first GETSTATIC is required for the side effect. The second is not. The null checks are not required, module instances are always non-null, but that will be handled by scala-opt/scala#8.

Idea: for module classes we could add a bit to the ScalaInlineInfo attribute which tells whether the module constructor is known to have no side-effects (the module has only defs, for example).

retronym avatar Aug 05 '15 04:08 retronym

When inlining from a module, we end up with a null check for the module:

object O {
  @inline def f = 100
}
class C {
  def t = O.f
}
qsc -Yopt:l:classpath,-copy-propagation Test.scala

gives

  public t()I
   L0
    LINENUMBER 5 L0
    GETSTATIC O$.MODULE$ : LO$;
    DUP
    IFNONNULL L1
    ACONST_NULL
    ATHROW
   L1
   FRAME SAME1 O$
    ASTORE 1
    BIPUSH 100
    ISTORE 2
    ILOAD 2
    IRETURN

enabling copy propagation eliminates the local holding the module:

  public t()I
   L0
    LINENUMBER 5 L0
    GETSTATIC O$.MODULE$ : LO$;
    IFNONNULL L1
    ACONST_NULL
    ATHROW
   L1
   FRAME SAME
    BIPUSH 100
    IRETURN

We can eliminate the null check if the module constructor has a basic shape. A module load can only be null if the module is still being constructed, i.e. if the MODULE$ field is accessed

  • either in a superclass constructor of the module
  • or in a statement preceding the module constructor's super call (early init)

Example with superclass constructor

class C { val b = O }
object O extends C

object Test extends App {
  println(O.b) // null
}

Example with early init:

object A1 extends {
  val e = B1
} with Object {
  val b1 = B1
}

object B1 {
  val a1 = A1
}

object Test extends App {
  println(A1.b1) // B1
  println(B1.a1) // null
  println(A1.e)  // B1
}

We can look at the constructor bytecode for the module and its super classes. If they all don't have side-effects, the module load cannot be null. However, this is a cross-class optimization, so we have to scope it according to the current optimizer level (l:project / l:classpath).

In the same way we can identify modules that don't have any side-effects and eliminate unused loads.

NOTE: the idea of storing the module purity in the InlineInfo attribute (while the module is being compiled) doesn't work: given object O extends S, class S may come from a different library. It would be a bad idea to embed into O information about the specific version of S that was on the classpath while compiling O.

lrytz avatar Nov 04 '15 12:11 lrytz

Some modules have fields, so their initializers are not pure. Maybe: special case some of ours (Predef for example, which has val Map = immutable.Map).

  private val sideEffectFreeModules = Set.empty[String] ++ {
    Iterable("Predef", "Array", "Console", "Function", "None", "Option",
      "Boolean", "Byte", "Char", "Double", "Float", "Int", "Long", "Short", "Unit").map("scala/" + _ + "$")
  }
  private def isSideEffectFreeModuleDesc(s: String) = {
    s.startsWith("L") && s.endsWith(";") && sideEffectFreeModules(s.substring(1, s.length - 1))
  }
  def isSideEffectFreeModuleLoad(insn: AbstractInsnNode) = {
    insn.getOpcode == GETSTATIC && {
      val fi = insn.asInstanceOf[FieldInsnNode]
      sideEffectFreeModules(fi.owner) && fi.name == "MODULE$" && isSideEffectFreeModuleDesc(fi.desc)
    }
  }

lrytz avatar Nov 04 '15 13:11 lrytz

Some WIP code for identifying pure modules. Still need to check super classes (the version here just supports Object), and make it depend on the optimizer level (corss-classpath or not)


  private def findMethods(classNode: ClassNode, name: String): List[MethodNode] = classNode.methods.asScala.find(_.name == name).toList

  private def isModuleClassExtendingObject(classNode: ClassNode) = {
    classNode.name.endsWith("$") &&
      classNode.superName == "java/lang/Object" &&
      classNode.attrs.asScala.exists(a => a.`type` == "Scala" || a.`type` == "ScalaSignature")
  }

  private def basicClassInit(classNode: ClassNode) = findMethods(classNode, CLASS_CONSTRUCTOR_NAME) match {
    case List(m) => m.instructions.iterator.asScala.filter(_.getOpcode >= 0).toList match {
      case List(n: TypeInsnNode, i: MethodInsnNode, r) =>
        n.getOpcode == NEW && n.desc == classNode.name &&
          i.getOpcode == INVOKESPECIAL && i.name == INSTANCE_CONSTRUCTOR_NAME &&
          r.getOpcode == ARETURN

      case _ => false
    }
    case _ => false
  }

  def basicSuperCall(classNode: ClassNode, onlyCheckSuperAndStore: Boolean) = findMethods(classNode, CLASS_CONSTRUCTOR_NAME) match {
    case List(m) => 
      val insns = m.instructions.iterator.asScala.filter(_.getOpcode >= 0)
      val superAndStore = insns.take(4).toList match {
        case List(l1: VarInsnNode, i: MethodInsnNode, l2: VarInsnNode, s: FieldInsnNode) =>
          l1.getOpcode == ALOAD && l1.`var` == 0 &&
            i.getOpcode == INVOKESPECIAL && i.name == INSTANCE_CONSTRUCTOR_NAME && i.owner == "java/lang/Object" &&
            l2.getOpcode == ALOAD && l2.`var` == 0 &&
            s.getOpcode == PUTSTATIC && s.name == "MODULE$" && s.owner == classNode.name

        case _ => false
      }
      if (onlyCheckSuperAndStore) superAndStore else superAndStore && (insns.drop(4).toList match {
        case List(r) => r.getOpcode == RETURN
        case _ => false
      })

    case _ => false
  }

  private def isModuleWithBasicSuperCall(moduleClass: ClassNode): Boolean = {
    isModuleClassExtendingObject(moduleClass) && basicClassInit(moduleClass) && basicSuperCall(moduleClass, onlyCheckSuperAndStore = true)
  }

  def hasOnlyModuleField(classNode: ClassNode) = !classNode.fields.asScala.exists(_.name != "MODULE$")

  private def isSideEffectFreeModuleClass(moduleClass: ClassNode) = {
    isModuleClassExtendingObject(moduleClass) && hasOnlyModuleField(moduleClass) && basicClassInit(moduleClass) && basicSuperCall(moduleClass, onlyCheckSuperAndStore = false)
  }

lrytz avatar Nov 04 '15 13:11 lrytz

once Predef$.MODULE$ loads are eliminated, an inlined implicitly call is zero-cost, like https://github.com/non/imp

lrytz avatar Dec 14 '15 10:12 lrytz

Test case: ArrowAssoc

  @Test
  def arrowAssocInline(): Unit = {
    val code =
      """class C {
        |  def t1 = "a" -> "b"
        |  def t2 = 1 -> true // ArrowAssoc is not specialized, this creates a Tuple2
        |}
      """.stripMargin
    val List(c) = compile(code)
    assertDoesNotInvoke(getSingleMethod(c, "t1"), "$minus$greater$extension")

    assertDoesNotInvoke(getSingleMethod(c, "t2"), "$minus$greater$extension")
    assertInvoke(getSingleMethod(c, "t2"), "scala/runtime/BoxesRunTime", "boxToInteger")

    // TODO: should get rid of a lot more code.
    //   - Predef.ArrowAssoc identity function is called
    //   - null check on Predef.ArrowAssoc$.MODULE$
  }

lrytz avatar Jan 06 '16 11:01 lrytz

FTR, here are the parts of the Scala.js optimizer where we do that:

sjrd avatar Jan 06 '16 12:01 sjrd

/link #112

retronym avatar Mar 20 '17 23:03 retronym

A related benchmark that suggests that module load / null checks might be eliminating the benefit of implicit value classes: https://github.com/isaacl/lila-jmh-benchmarks/commit/c8e767bc9305b9d53b51fd223bf9c06a4e49535b

retronym avatar Mar 20 '17 23:03 retronym

I started playing with @lrytz's code above.

WIP in retronym:ticket/sd16, with a first step of using a hard-coded list of pure modules, and also adding a new inliner heuristic to inline methods of the form xLOAD; xRETURN (which catches, among other things, implicit value class factories)

% type jars
jars is a function
jars ()
{
    command ls -G /Users/jz/.ivy2/local/org.scala-lang/scala-*/2.12.3-bin-$1-SNAPSHOT/jars/scala-*.jar | paste -s -d : -
}

% jardiff -q --git /tmp/sd16-diff $(jars 5dcabaf) $(jars e80f00e)

% git --git-dir /tmp/sd16-diff log --shortstat
commit 7b46aa7c54a6dacc364887f11a18b0cd33800c3c (HEAD -> master, origin/master)
Author: Jason Zaugg <[email protected]>
Date:   Fri Jun 9 09:18:27 2017 +1000

    jardiff textified output of: /Users/jz/.ivy2/local/org.scala-lang/scala-compiler/2.12.3-bin-e80f00e-SNAPSHOT/jars/scala-compiler.jar:/Users/jz/.ivy2/local/org.scala-lang/scala-dist/2.12.3-bin-e80f00e-SNAPSHOT/jars/scala-dist.jar:/Users/jz/.ivy2/local/org.scala-lang/scala-library/2.12.3-bin-e80f00e-SNAPSHOT/jars/scala-library.jar:/Users/jz/.ivy2/local/org.scala-lang/scala-reflect/2.12.3-bin-e80f00e-SNAPSHOT/jars/scala-reflect.jar

 1012 files changed, 23985 insertions(+), 102153 deletions(-)

commit a21407f61ea6d15679e9ea9b5b8fc813a395c9b8
Author: Jason Zaugg <[email protected]>
Date:   Fri Jun 9 09:18:20 2017 +1000

    jardiff textified output of: /Users/jz/.ivy2/local/org.scala-lang/scala-compiler/2.12.3-bin-5dcabaf-SNAPSHOT/jars/scala-compiler.jar:/Users/jz/.ivy2/local/org.scala-lang/scala-dist/2.12.3-bin-5dcabaf-SNAPSHOT/jars/scala-dist.jar:/Users/jz/.ivy2/local/org.scala-lang/scala-library/2.12.3-bin-5dcabaf-SNAPSHOT/jars/scala-library.jar:/Users/jz/.ivy2/local/org.scala-lang/scala-reflect/2.12.3-bin-5dcabaf-SNAPSHOT/jars/scala-reflect.jar

 8369 files changed, 2839334 insertions(+)

https://github.com/retronym/sd16-diff

It makes a modest 2% reduction in the sum size of classfiles in scala-*.jar:

⚡ (for sha in 5dcabaf e80f00e; do git --git-dir /code/scala/.git log -1 --oneline $sha; mkdir -p /tmp/$sha; cd /tmp/$sha; for f in /Users/jz/.ivy2/local/org.scala-lang/scala-*/2.12.3-bin-$sha-SNAPSHOT/jars/scala-*.jar; do jar xf $f; done; gdu -sb /tmp/$sha; done)
5dcabaf753 WIP optimizer: eliminate pure module loads, implicit value class factory calls
48019854	/tmp/5dcabaf
e80f00e911 (HEAD -> ticket/sd16) restarr
47054566	/tmp/e80f00e

I can't measure a difference in compiler performance. Perhaps there are microbenchmarks that would benefit more, like the one I linked above. It might also improve performance in the interpreter or C1-compiled code, but that's just a guess.

retronym avatar Jun 08 '17 23:06 retronym

@lrytz Do you think there is a useful / low-risk patch in here for 2.12.3? I'm thinking we should just retarget this to 2.13.x, and backport the most useful improvements under new opt-in optimizer flags if they are profitable.

retronym avatar Jun 09 '17 00:06 retronym

Not sure about this diff after my change:

image

retronym avatar Jun 09 '17 00:06 retronym

Agree about 2.13.x, especially since they are cleanups that don't seem to affect steady state performance. I don't see how your patch causes the diff in your screenshot, we'd have to check that in detail for sure. I agree special-casing a few modules is a pragmatic approach, as we probably cannot ever analyze Predef to be pure (with all its field initializers). We can do that independently of an analysis for other module classes.

lrytz avatar Jun 09 '17 14:06 lrytz

... or should we just close this as out of scope for Scala 2?

dwijnand avatar Oct 16 '20 15:10 dwijnand

Not sure, Scala 2 has a lot of time to develop :-)

lrytz avatar Oct 19 '20 13:10 lrytz

I think I suggested that because I had approximated that this had behavioural and/or binary impacts that couldn't be made within a 2.13 minor release. But if that's not the case then fixing this might prove to have an amazing benefit.

dwijnand avatar Oct 19 '20 13:10 dwijnand

Given that we're talking about the inliner, binary compatibility is maybe less of a concern

lrytz avatar Oct 19 '20 13:10 lrytz