scalafix icon indicating copy to clipboard operation
scalafix copied to clipboard

--diff appears to be slow / not actually using git diff

Open james-s-w-clark opened this issue 1 year ago • 2 comments

I am trying to speed up our CI, which runs stuff like scalafix and scalafmt. scalafmt (with our Mill task, at least) is able to use caching to prevent unnecessary work (inital checks may take a few minutes; further checks with no/small diffs take ~seconds).

Most PRs in CI only change a small percentage of files, so there shouldn't be much to format or fix.

I noted in the documentation there are the --diff and --diff-base branch/tag/commit options for scalafix. I expected that:

  1. files not in git diff would not be checked
  2. possibly even better, only changed content in added/modified/renames would need to be checked
  3. the above would presumably make scalafix much faster with the --diff options, when the percentage of files changed is small. This assumes git is fast, and scalafix has some intense work to do.

I have read your CliGitDiffSuite tests, and it looks like functionality for 1 & 2 is there.

However, at least through the mill-scalafix plugin, --diff actually seems to slow things down.

Here is a link to an issue I opened in mill-scalafix. From a few runs, it looks like --diff is about 50% slower.

In this scalafix repo I tried sbt "scalafixAll --check" and sbt "scalafixAll --check --diff-base main, and didn't see much difference in time taken (even with no diff, on a branch off of main).

james-s-w-clark avatar Dec 15 '22 21:12 james-s-w-clark

scalafmt (with our Mill task, at least) is able to use caching to prevent unnecessary work (inital checks may take a few minutes; further checks with no/small diffs take ~seconds)

I assume that's a feature of the mill-scalafmt plugin? If you were using sbt-scalafix, you would also benefit from incremental runs out of the box. I don't think mill-scalafix has anything like that at the moment.

However, at least through the mill-scalafix plugin, --diff actually seems to slow things down.

Could you at least confirm that there were less files targeted (via --verbose)? I am not familiar with mill, so I don't know what kind of side effects we could have (a significant time might be spent outside scalafix, recompiling sources with semanticdb for example). Could you run the same experiment with the CLI?

In this scalafix repo I tried sbt "scalafixAll --check" and sbt "scalafixAll --check --diff-base main, and didn't see much difference in time taken (even with no diff, on a branch off of main).

As sbt-scalafix is incremental, this test is probably biased. Same suggestion as above: you should try to use the CLI directly if you want to assess the impact of --diff.

bjaglin avatar Dec 15 '22 22:12 bjaglin

Good call on trying with cs's scalafix to remove bias.

I already gave it a shot yesterday:

  • in our private repo, even with no DisableSyntaxs and only 1 rule (didn't matter which) in .scalafix.conf it was unhappy with errors like } expected but identifier found... java.util.List as JList for example. I didn't expect all the rules (1 at a time) to pick up on that.

  • in this repo I got error: Unknown rule 'OrganizeImports'. If I comment that rule out, there's an error for each file about SemanticDB not found.

I'll try a bit more to get cs's scalafix it working on our source code. A full scalafix there takes a lot longer than for this repo.


The mill-scalafix repo runs with coursier's scalafix off of main (or, a branch off that with no changes).

  • scalafix --verbose --check logs info: Processing (1/34) etc., some [DisableSyntax....], and some example fixes (e.g. def procedure() {} -> def procedure(): Unit = {})
  • scalafix --verbose --check --diff-base main also logs the same
    • so, it looks like raw scalafix isn't using the diffing
    • This is on a branch off main, with no code changes.
  • scalafix --verbose --diff-base main also logs the same (and edits 3 files)

The codebase is fairly small here so I can't compare time/performance yet, but it looks like functionally the diff is having no effect.

james-s-w-clark avatar Dec 16 '22 09:12 james-s-w-clark