etna
[Draft] class `Auto` for automatic optimal model search
N.B. Blocked by #854, #853
🚀 Feature Request
Create the `etna.auto.Auto` class, which searches for the optimal solution in a defined config pool.
- The config pool can be extended.
- We use optuna for search orchestration.
- optuna can be parallelized via runners.
- `objective` is a decorator for passing additional arguments to the optuna objective.
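To illustrate the last bullet, here is a minimal sketch of how extra arguments could be bound to an optuna-style objective through a closure. It uses only the standard library: `FakeTrial` is a hypothetical stand-in for `optuna.trial.Trial`, `make_objective` is an illustrative name, and the precomputed per-fold metrics stand in for a real backtest run. None of these names come from the proposal itself.

```python
from statistics import mean, median
from typing import Callable, Dict, List, Optional, Sequence


class FakeTrial:
    """Hypothetical stand-in for optuna.trial.Trial (one method only)."""

    def __init__(self, choice: int) -> None:
        self._choice = choice

    def suggest_categorical(self, name: str, choices: Sequence[str]) -> str:
        # A real trial would let the sampler pick; here the pick is fixed.
        return choices[self._choice]


def make_objective(
    pool: Dict[str, List[float]],
    metric_aggregation: str = "mean",
    initializer: Optional[Callable[[], None]] = None,
    callback: Optional[Callable[[str, List[float]], None]] = None,
) -> Callable[[FakeTrial], float]:
    """Bind the extra arguments and return a function of the trial only."""
    aggregate = {"mean": mean, "median": median}[metric_aggregation]

    def _objective(trial: FakeTrial) -> float:
        if initializer is not None:
            initializer()  # e.g. set up loggers
        name = trial.suggest_categorical("pipeline", sorted(pool))
        fold_metrics = pool[name]  # assumption: stands in for a backtest
        if callback is not None:
            callback(name, fold_metrics)  # e.g. persist backtest results
        return aggregate(fold_metrics)

    return _objective
```

The returned function takes only the trial, which is the shape optuna expects from an objective, while the dataset, metric, and hooks stay captured in the closure.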
Workflow:
- Init `Auto` with defined parameters.
- Start `fit` with the chosen TSDataset; you should specify the number of trials or a timeout. Optionally, you can pass `initializer` (to init loggers, for example) or `callback` (to customize work with backtest results).
- You can call `stack_best` to get a stacking ensemble of the best pipelines, or just get the best pipeline.
- You can get all results with aggregated statistics via the `runs_result` method.
Proposal
from typing import Callable, Dict, List, Literal, Optional, Union

import optuna
import pandas as pd
from etna.datasets import TSDataset
from etna.ensembles import StackingEnsemble
from etna.metrics import Metric
from etna.pipeline import Pipeline

# Pool, Runner and LocalRunner are not defined in this proposal.


class Auto:
    def __init__(
        self,
        metric: Metric,
        metric_aggregation: Literal['mean', 'median'],
        backtest_params: Dict,
        experiment_folder: str,
        horizon: int,
        pool: Union[Pool, List[Pipeline]] = Pool.default,
        runner: Runner = LocalRunner,
        storage: Optional[optuna.storages.BaseStorage] = None,
    ):
        pass

    def fit(
        self,
        ts: TSDataset,
        timeout: float,
        initializer: Callable,
        callback: Callable,
        **optuna_kwargs,
    ) -> Pipeline:
        pass

    def stack_best(
        self,
        n_best: int = 5,
    ) -> StackingEnsemble:
        pass

    def runs_result(self) -> pd.DataFrame:
        # returns: | Pipeline | metrics | path |
        pass

    @staticmethod
    def objective(
        ts: TSDataset,
        metric: Metric,
        metric_aggregation: Literal['mean', 'median'],
        backtest_params: dict,
        callback: Optional[Callable] = None,
        initializer: Optional[Callable] = None,
    ) -> Callable[[optuna.trial.Trial], float]:
        """Return an optuna-like objective that runs the backtest and calls the `initializer` and `callback` functions."""
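As a rough illustration of what `runs_result` and `stack_best` might do with stored backtest metrics, the sketch below aggregates per-fold values with `mean` or `median` and keeps the `n_best` pipeline names. It is standard-library only; `summarize_runs` and `select_n_best` are hypothetical helper names, pipelines are represented by plain strings, and it assumes a lower metric value is better (as with SMAPE or MAE).

```python
from statistics import mean, median
from typing import Dict, List, Tuple


def summarize_runs(
    fold_metrics: Dict[str, List[float]],
    aggregation: str = "mean",
) -> List[Tuple[str, float]]:
    """Aggregate per-fold metrics per pipeline and sort best-first.

    Assumption: lower is better.
    """
    aggregate = {"mean": mean, "median": median}[aggregation]
    rows = [(name, aggregate(values)) for name, values in fold_metrics.items()]
    return sorted(rows, key=lambda row: row[1])


def select_n_best(rows: List[Tuple[str, float]], n_best: int = 5) -> List[str]:
    """Return the names of the top `n_best` pipelines from sorted rows."""
    return [name for name, _ in rows[:n_best]]


# Made-up example values, three backtest folds per pipeline.
runs = {
    "naive": [12.0, 13.0, 14.0],
    "catboost": [7.0, 8.0, 9.0],
    "prophet": [10.0, 11.0, 60.0],
}
best_by_mean = select_n_best(summarize_runs(runs, "mean"), n_best=2)
best_by_median = select_n_best(summarize_runs(runs, "median"), n_best=2)
```

With these made-up numbers the two aggregations disagree on the runner-up, because `median` discounts the single bad fold of `prophet`. That is the kind of difference the `metric_aggregation` parameter controls.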
Test cases
No response
Additional context
stateDiagram-v2
fit: Auto.fit
State2: Optuna start search from defined pool
State3: pipeline_1
State4: pipeline_n
State5: Optuna storage
fit --> State2
State2 --> State3
State2 --> State4
State3 --> State5
State4 --> State5
note right of State5
Metrics and configs for extended analysis
end note
stack_best: Auto.stack_best
runs_result: Auto.runs_result
note left of runs_result: returns table with fields | pipeline | metric1 | metric2 ... | path |
runs_result --> State5
stack_best --> State5
note left of stack_best
returns StackingEnsemble
end note
note left of fit
start greedy search with optuna with chosen runner
end note
sequenceDiagram
[*]->>+Auto: fit()
Note over Auto,Auto: create Optuna with storage
Note over Auto,Auto: we use sqlite storage by default
Auto->>Optuna: tune()
Note over Auto,Optuna: run self.Optuna.tune(..., runner=self.runner)
Optuna->>+Runner: __call__()
Note over Optuna,Runner: Optuna.study.optimize is executed in defined Runner environment with objective based on backtest
Runner-->>-Optuna: return None
Optuna-->>Auto: return optuna.study
Auto->>Auto: runs_result()
Note over Auto,Auto: filter out best result Pipeline
Auto-->>-[*]: return best Pipeline
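The call sequence in the diagram can be mimicked with a toy, standard-library-only sketch: a runner executes the search and returns `None`, results land in SQLite storage, and the best pipeline is read back from that storage. `ToyLocalRunner` and `toy_fit` are hypothetical names, each pool entry is a callable standing in for a backtest, and lower metric values are assumed to be better. This is not how optuna or the proposed `Runner` is implemented.

```python
import sqlite3
from typing import Callable, Dict


class ToyLocalRunner:
    """Hypothetical runner: executes the function in the current process."""

    def __call__(self, func: Callable[[], None]) -> None:
        func()


def toy_fit(
    pool: Dict[str, Callable[[], float]],
    runner: Callable[[Callable[[], None]], None],
    connection: sqlite3.Connection,
) -> str:
    """Run every pool entry through the runner and return the best name."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS trials (pipeline TEXT, metric REAL)"
    )

    def search() -> None:
        # Stands in for study.optimize: evaluate each entry, store the result.
        for name, evaluate in pool.items():
            connection.execute(
                "INSERT INTO trials VALUES (?, ?)", (name, evaluate())
            )

    runner(search)  # the runner returns None; results live in storage

    # Filter out the best result from storage (assumption: lower is better).
    row = connection.execute(
        "SELECT pipeline FROM trials ORDER BY metric ASC LIMIT 1"
    ).fetchone()
    return row[0]


conn = sqlite3.connect(":memory:")
toy_pool = {"pipeline_1": lambda: 0.4, "pipeline_n": lambda: 0.1}
best = toy_fit(toy_pool, ToyLocalRunner(), conn)
```

Keeping the results in storage rather than in the runner's return value is what lets the runner be swapped (local, parallel, remote) without changing how the best pipeline is selected.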