etna
`Dataset` in `etna.datasets`
🚀 Feature Request
import pathlib
from typing import List, Literal, Optional

from etna.datasets import TSDataset


class Dataset:
    # train and test are exposed as read-only properties below
    dataset_path: str  # URL or repository URL
    freq: str
    known_future: Optional[List["Feature"]]
    unknown_future: Optional[List["Feature"]]
    cache_path: Optional[pathlib.Path]
    metadata: dict
    tags: List[str]

    @property
    def train(self) -> TSDataset:
        pass

    @property
    def test(self) -> TSDataset:
        pass


class Feature:
    # N.B. these are just possible fields -- we should use a dict instead of classes
    name: str
    type: Literal["categorical", "numeric", "str"]


def m5_generation() -> Dataset:
    # code for preprocessing of the locally saved dataset
    ...


def load_dataset(name: str) -> Dataset:
    # calls the function that generates the Dataset
    ...
- [ ] Index folder with structure:
  etna/datasets/index/m4_monthly.json
  ...
  etna/datasets/index/m5.json
  Each file holds the JSON configuration for TSDataset creation; it could contain special params for data generation, URLs, and fields for `Dataset` init.
- [ ] We have a helper function which produces a `Dataset` using the information in the JSON.
- [ ] Caching datasets in jsonl #467 format (optional; in the first iteration we can download the dataset every time)
- [ ] We should add the current code-generated datasets and m5
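The index idea from the checklist could look roughly like the following. This is a sketch under stated assumptions: the JSON keys (`url`, `freq`, `tags`), the `DatasetSpec` class, and the `load_spec` helper are all hypothetical, and the URL is a placeholder. It only illustrates reading one `<name>.json` entry from an index folder, not building a real `TSDataset`.

```python
import json
import pathlib
import tempfile
from dataclasses import dataclass, field
from typing import List


@dataclass
class DatasetSpec:
    """Hypothetical subset of the fields stored in an index JSON file."""

    dataset_path: str
    freq: str
    tags: List[str] = field(default_factory=list)


def load_spec(name: str, index_dir: pathlib.Path) -> DatasetSpec:
    """Read `<index_dir>/<name>.json` and turn it into a DatasetSpec."""
    config_path = index_dir / f"{name}.json"
    if not config_path.exists():
        raise FileNotFoundError(f"No index entry for dataset {name!r}")
    with config_path.open(encoding="utf-8") as f:
        raw = json.load(f)
    return DatasetSpec(
        dataset_path=raw["url"],
        freq=raw["freq"],
        tags=raw.get("tags", []),
    )


def demo() -> DatasetSpec:
    """Write a throwaway index entry to a temp folder and load it back."""
    with tempfile.TemporaryDirectory() as tmp:
        index_dir = pathlib.Path(tmp)
        # Placeholder values, not taken from any real index file.
        entry = {"url": "https://example.com/m5.zip", "freq": "D", "tags": ["retail"]}
        (index_dir / "m5.json").write_text(json.dumps(entry), encoding="utf-8")
        return load_spec("m5", index_dir)
```

Keeping the per-dataset details in JSON means a new dataset can be described without touching the loading code, which is the main appeal of the index folder.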
Motivation
Proposal
Test cases
No response
Alternatives
No response
Additional context
#467
Checklist
- [ ] I discussed this issue with ETNA Team