codeql New atm features rebased

This PR

This takes the new-atm-features branch (but excludes a few experimental commits near its top), rebases it on top of current main, and removes the obsolete features that have been replaced.

Aug 11 '22 08:08 kaeluka

Before we can merge this, we need to (at least) train/evaluate/release the models, and add a commit here that makes the branch depend on the new model.

Aug 11 '22 10:08 kaeluka

I just requested a review from @henrymercer - this is a pretty big change set. This is the branch we're aiming to release, as discussed yesterday evening.

We've been using most of it for many months now, which IMO means it'd be most important to take a look at the changes I've done today.

I've spotted one minor bug in one of the features when doing this, but I'd rather keep everything as stable as possible to minimize changes as much as possible (the bug also affects current main).

Aug 11 '22 11:08 kaeluka

Experiment came back, 34% slowdown. @tiferet, is this a blocker?

Aug 16 '22 06:08 kaeluka

Experiment came back, 34% slowdown. @tiferet, is this a blocker?

I think that's a question for @henrymercer: is this still within the acceptable time cutoff relative to a run without boosted queries?

Aug 16 '22 11:08 tiferet

The KR is 25%. I think we should spend some time optimizing the new features to keep us within this performance budget.

Aug 16 '22 11:08 henrymercer

Experiment came back, 34% slowdown. @tiferet, is this a blocker?

Is this a 34% slowdown relative to the unboosted version or relative to the current version of ATM?

Also, @kaeluka do you have ideas for how to optimize this? We should do that before proceeding with the new data generation, because we want to generate data from the final version of the PR exactly as we're going to merge it.

Aug 16 '22 12:08 tiferet

Is this a 34% slowdown relative to the unboosted version or relative to the current version of ATM?

Relative to the unboosted version :)

Also, @kaeluka do you have ideas for how to optimize this? We should do that before proceeding with the new data generation, because we want to generate data from the final version of the PR exactly as we're going to merge it.

Optimizing CodeQL performance is one area where I know next to nothing :( the best I can do is to look for "obvious" mistakes, but in my experience the obvious ones I'm able to recognize are also caught by QL-for-QL (and QL-for-QL is coming up with nothing for this PR). I'd be very eager to help out on the optimization, but I think I can't drive this.

Aug 16 '22 12:08 kaeluka

codeql codeql copied to clipboard

New atm features rebased

This PR

codeql
codeql copied to clipboard