zinc
Writing the analysis file is slow
In a project with many source files, writing the analysis file is slow. On my machine it takes on the order of 500ms to write the analysis file for a project with 5000 files. Almost all of the time is spent in the Google protobuf library's write function. I'm not sure whether writing protobufs to disk is inherently slow or whether there is something we can do to make it faster.
It would be good to compare this against the text Analysis file.
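A rough way to compare the two side by side could look something like the sketch below. It assumes zinc's sbt.internal.inc.FileAnalysisStore binary/text factories and an analysis/setup pair obtained from an actual compile of the large project; the harness itself (names, temp-file handling) is purely illustrative.

import java.io.File
import sbt.internal.inc.FileAnalysisStore
import xsbti.compile.{ AnalysisContents, AnalysisStore, CompileAnalysis, MiniSetup }

object AnalysisWriteTiming {
  // Time a single write of the given analysis using the given store factory.
  def timeWrite(
      label: String,
      makeStore: File => AnalysisStore,
      analysis: CompileAnalysis,
      setup: MiniSetup
  ): Unit = {
    val out = File.createTempFile("analysis-", ".tmp")
    val store = makeStore(out)
    val start = System.nanoTime()
    store.set(AnalysisContents.create(analysis, setup))
    val elapsedMs = (System.nanoTime() - start) / 1000000
    println(s"[$label] ${elapsedMs}ms (${out.length()} bytes)")
    out.delete()
  }

  // `analysis` and `setup` would come from compiling the 5000-file project.
  def compare(analysis: CompileAnalysis, setup: MiniSetup): Unit = {
    timeWrite("binary", f => FileAnalysisStore.binary(f), analysis, setup)
    timeWrite("text", f => FileAnalysisStore.text(f), analysis, setup)
  }
}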
@eatkins I think it would be useful if you could try using the Java protobuf encoders/decoders instead of the Scala ones; I would expect it to be significantly faster.
I believe that I was incorrect that converting the data to protobuf was expensive. The performance is more or less the same whether I use the text analysis file or the binary analysis file. I turned on the sbt.analysis.debug.timing system property and got the following output when writing an analysis file for 5000 source files (which are autogenerated and basically look like { "name": "Foo$N.scala", "content": "package foo; object Foo$N" } for N in [1, 5000]):
[write setup] 1ms
[write relations] 171ms
[write stamps] 46ms
[bytes -> base64] 42ms
[byte copy] 4ms
[sbinary write] 329ms
[write apis] 134ms
[write sourceinfos] 22ms
[write compilations] 0ms
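For reference, the test sources were generated with something along these lines (a sketch; the output directory and the exact generator are illustrative, only the file naming and contents match what I described above):

import java.nio.file.{ Files, Paths }

object GenerateSources {
  def main(args: Array[String]): Unit = {
    val srcDir = Paths.get("src/main/scala/foo")
    Files.createDirectories(srcDir)
    // One tiny object per file: Foo1.scala .. Foo5000.scala
    for (n <- 1 to 5000) {
      val file = srcDir.resolve(s"Foo$n.scala")
      Files.write(file, s"package foo; object Foo$n".getBytes("UTF-8"))
    }
  }
}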
I'm now wondering if there is an I/O bottleneck in computing the stamps. I haven't looked closely at how the output stamps are generated, but my guess is that the bulk of this time is spent hashing the source files and the classpath. The source hashes could be read from a cache fairly easily. The classpath hashes could also be provided from a cache, as long as we invalidated all of the entries that were modified during incremental compilation; a rough sketch of that idea is below. I'm not sure how excited I am to work on this, since it really only matters for projects with an unreasonably large number of source files. Most projects with that many source files are probably split into smaller subprojects, which makes this issue far less noticeable.
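Roughly what I have in mind, as a sketch only: a hash cache keyed by path plus last-modified time and size, with explicit invalidation for entries rewritten during incremental compilation. FileHashCache and its methods are hypothetical and not wired into zinc's actual Stamps machinery.

import java.nio.file.{ Files, Path }
import java.security.MessageDigest
import scala.collection.concurrent.TrieMap

// Hypothetical cache: recompute a file's hash only when its last-modified
// time or size changes, otherwise reuse the cached digest.
object FileHashCache {
  private final case class Key(lastModified: Long, size: Long)
  private val cache = TrieMap.empty[Path, (Key, String)]

  def hash(file: Path): String = {
    val key = Key(Files.getLastModifiedTime(file).toMillis, Files.size(file))
    cache.get(file) match {
      case Some((cachedKey, digest)) if cachedKey == key => digest
      case _ =>
        val digest = sha1Hex(Files.readAllBytes(file))
        cache.put(file, (key, digest))
        digest
    }
  }

  // Entries touched by incremental compilation (e.g. rewritten class files
  // or jars on the classpath) would need to be invalidated explicitly.
  def invalidate(file: Path): Unit = cache.remove(file)

  private def sha1Hex(bytes: Array[Byte]): String =
    MessageDigest.getInstance("SHA-1").digest(bytes).map("%02x".format(_)).mkString
}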