
Benchmark multiple AutoML systems at once

Open PGijsbers opened this issue 4 years ago • 4 comments

When you want to run multiple AutoML systems, you currently have to call the script twice, e.g.:

python runbenchmark.py TPOT test test 
python runbenchmark.py auto-sklearn test test

It would be convenient to allow a single command to run:

  • all frameworks in a definition: python runbenchmark.py frameworks.yaml test test, and/or
  • multiple frameworks comma-separated: python runbenchmark.py TPOT,auto-sklearn test test, and/or
  • multiple frameworks from a text file: python runbenchmark.py frameworksubset.txt test test, with the text file containing either a single comma-separated line as above, or one framework per line.

The first option is convenient if you have your own framework definition from which you want to run all frameworks. The latter two are more convenient (and less error-prone) if you are only interested in running a subset of the default frameworks.yaml definition.
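The last two options could share one parsing step. A minimal sketch of how the framework argument might be expanded (the function name `parse_frameworks` is hypothetical, not an existing part of runbenchmark.py):

```python
import os

def parse_frameworks(arg):
    """Hypothetical helper: expand the framework CLI argument into a list.

    Accepts a comma-separated list ("TPOT,auto-sklearn") or a path to a
    text file whose contents are either one comma-separated line or one
    framework per line.
    """
    if arg.endswith('.txt') and os.path.isfile(arg):
        with open(arg) as f:
            # normalize commas to newlines so both file layouts are handled
            tokens = f.read().replace(',', '\n').splitlines()
    else:
        tokens = arg.split(',')
    return [t.strip() for t in tokens if t.strip()]
```

With this, `parse_frameworks("TPOT,auto-sklearn")` yields `["TPOT", "auto-sklearn"]`, and a single framework name passes through unchanged as a one-element list.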

PGijsbers avatar Oct 06 '20 16:10 PGijsbers

The more I think about this issue, the less I'm convinced about it.

First, to make this really useful, we need to implement it in such a way that all jobs (for all frameworks) are managed by the same queue; otherwise a simple loop like the one in runstable.sh is good enough, and I don't see the need to add complexity inside the app to support this.

for c in ${constraints[*]}; do
    for b in ${benchmarks[*]}; do
        for f in ${frameworks[*]}; do
#            echo "python runbenchmark.py $f $b $c -m $mode -p $parallel $extra_params"
            python runbenchmark.py $f $b $c -m $mode -p $parallel $extra_params
        done
    done
done

The problem with the loop is that each python process must complete before the next one starts, so we may be waiting on a few trailing jobs before being able to start many ($parallel) new ones. It's also possible to put each run in the background, but then more difficulties arise because the parallelism is no longer under control.
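One way to keep a single global cap on parallelism without touching the app would be a thin driver around the CLI. A rough sketch (the `make_command`/`run` helpers are illustrative, not part of the benchmark tool; a real version would hand each command to `subprocess.run`):

```python
import itertools
from concurrent.futures import ThreadPoolExecutor

frameworks = ["TPOT", "auto-sklearn"]
benchmarks = ["test"]
constraints = ["test"]
parallel = 2  # single global cap on concurrent runs

def make_command(framework, benchmark, constraint):
    # build the CLI invocation for one framework-benchmark-constraint combination
    return ["python", "runbenchmark.py", framework, benchmark, constraint]

def run(combo):
    cmd = make_command(*combo)
    # a real driver would call subprocess.run(cmd) here;
    # we return the command string for illustration only
    return " ".join(cmd)

with ThreadPoolExecutor(max_workers=parallel) as pool:
    # the executor starts a new combination as soon as a slot frees up,
    # so trailing jobs no longer block the whole next batch
    commands = list(pool.map(run, itertools.product(frameworks, benchmarks, constraints)))
```

This avoids the "wait for the whole batch" problem of the shell loop, but of course it still runs one `runbenchmark.py` process per combination rather than sharing one job queue inside the app.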

Now, if we decide to support only multiple frameworks, we face the same issue with multiple benchmarks and multiple constraints, as shown in the loop above. Supporting all of them would be very practical, but it also means adding a lot of complexity to the app...

Will try to figure out how this could be implemented without compromising the existing logic, but I'm afraid this can't be done for v2.

sebhrusen avatar Dec 10 '20 18:12 sebhrusen

I am not familiar enough with the job queuing to weigh in right now, but I am fine providing (approximate) support through the shell script and moving this out of scope for v2.

PGijsbers avatar Dec 11 '20 14:12 PGijsbers

I may have found a way to implement it just by adding a layer on top of the various Benchmark instances. The idea is to still create one instance per framework-benchmark-constraint combination, create all the jobs for each of them without starting any (or obtain them through a generator, let's see...), and have the layer on top merge and execute them. The advantage of this approach is that it shouldn't require serious changes to the current Benchmark implementations, where most of the logic lives.

I still can't promise it for v2 (I have a lot of work to do on H2O side), but I think it's the right way to approach this.

sebhrusen avatar Dec 11 '20 14:12 sebhrusen

Don't fret about pushing this into v2 👍 it's just a nice-to-have

PGijsbers avatar Dec 14 '20 09:12 PGijsbers