automlbenchmark

Define abstractions for framework integration

Open sebhrusen opened this issue 3 years ago • 2 comments

The goal is to provide an abstraction and default implementation(s) for the most common scenarios. This would also allow frameworks to support several versions easily. Finally, a more structured framework runner will simplify the integration effort and standardize support for extra features like the _save_artifacts param.

1st suggestion (incomplete, and will change):

class FrameworkRunner:
    def __init__(self, config, dataset): pass
    def prepare_data(self): pass
    def fit(self, …): pass
    def predict(self, …): pass
    def get_result(self): pass
    def save_artifacts(self): pass
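
To make the idea a bit more concrete, here is a minimal sketch of how such a base class with default implementations might look. Everything beyond the method names above is an assumption for illustration only: the elided fit/predict arguments are stood in for by a single `data` argument, `config` is treated as a plain dict, the `run` driver method is hypothetical, and `MajorityClassRunner` is a toy integration rather than a real framework.

```python
from abc import ABC, abstractmethod


class FrameworkRunner(ABC):
    """Hypothetical base class; method names follow the suggestion above."""

    def __init__(self, config, dataset):
        self.config = config      # assumed to be a plain dict here
        self.dataset = dataset
        self.result = None

    def prepare_data(self):
        # Default implementation: pass the dataset through unchanged.
        return self.dataset

    @abstractmethod
    def fit(self, data):          # the `data` argument is an assumption
        ...

    @abstractmethod
    def predict(self, data):      # the `data` argument is an assumption
        ...

    def get_result(self):
        return self.result

    def save_artifacts(self):
        # Default implementation: nothing to save.
        return []

    def run(self):
        # Assumed driver method, not part of the original suggestion.
        data = self.prepare_data()
        self.fit(data)
        self.result = self.predict(data)
        if self.config.get("_save_artifacts"):
            self.save_artifacts()
        return self.get_result()


class MajorityClassRunner(FrameworkRunner):
    """Toy integration: only fit and predict need to be written."""

    def fit(self, data):
        labels = data["y"]
        self.majority = max(set(labels), key=labels.count)

    def predict(self, data):
        return [self.majority] * len(data["y"])


runner = MajorityClassRunner({"_save_artifacts": []}, {"y": ["a", "b", "a"]})
predictions = runner.run()
```

With this shape, an integration only overrides the steps it cares about, and the benchmark could handle shared concerns such as artifact saving in one place.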

sebhrusen avatar Apr 06 '21 15:04 sebhrusen

Based on our discussion, we should include some type of recovery mode.

PGijsbers avatar Feb 21 '22 19:02 PGijsbers

With this refactor, it will also be easier to use the various "checkpoints" to store partial results and/or have more dynamic time cut-offs. For example, the one hour time limit could be more strictly enforced for just the fit call, while being (much) more lenient in phases after fit, as compared to having a single large budget for all phases combined. This will avoid both scenarios where EC2 instances live needlessly long because they hang in a fit call and those where results are incomplete merely because the predict part took longer.
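
A rough sketch of what per-phase budgets with checkpoints could look like, under stated assumptions: `run_phases` and `slow_fit` are hypothetical names, phases are plain callables, and a worker thread is used only to keep the example short. A thread cannot be forcibly stopped, so a real implementation would more likely run each phase in a subprocess that can be killed.

```python
import time
from concurrent.futures import ThreadPoolExecutor, TimeoutError as PhaseTimeout


def run_phases(phases):
    """Run (name, func, budget_seconds) phases in order.

    Returns (checkpoints, failed_phase). checkpoints holds the result of
    every phase that finished within its budget; failed_phase is the name
    of the phase that timed out, or None if all phases completed.
    """
    checkpoints = {}
    executor = ThreadPoolExecutor(max_workers=1)
    try:
        for name, func, budget in phases:
            future = executor.submit(func)
            try:
                # Each phase gets its own budget instead of one shared limit.
                checkpoints[name] = future.result(timeout=budget)
            except PhaseTimeout:
                # Stop here but keep the partial results collected so far.
                return checkpoints, name
        return checkpoints, None
    finally:
        # Do not block on a phase that is still running in the background.
        executor.shutdown(wait=False)


def slow_fit():
    time.sleep(0.5)  # stands in for a fit call that hangs
    return "model"


# Strict budget on fit, lenient budget on predict.
partial, failed = run_phases([
    ("fit", slow_fit, 0.05),
    ("predict", lambda: [0, 1], 5.0),
])
```

Here `failed` names the phase that exceeded its budget and `partial` keeps whatever completed before it, which is also the kind of state a recovery mode could resume from.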

PGijsbers avatar Jun 21 '23 11:06 PGijsbers