Philip May
Hi @stefan-it I just wanted to bring your attention to the release of "our" _German colossal, cleaned Common Crawl corpus_: https://german-nlp-group.github.io/projects/gc4-corpus.html It is a massive (450 GB zipped) dataset based...
https://github.com/matusnovak/prometheus-smartctl/blob/3fbd7bbb06b466bfeb7c1eb99aec45a5951f1458/Dockerfile#L12
The `alternative` parameter of `scipy.stats.ttest_ind` is missing in some older SciPy versions.
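One way to cope with this is to check at runtime whether the installed `ttest_ind` accepts `alternative` and, if not, derive the one-sided p-value from the two-sided one. The following is only a sketch: `supports_kwarg`, `one_sided_p` and `ttest_compat` are hypothetical helper names, not part of SciPy or mltb, and the test function is passed in as an argument so the helpers themselves do not import SciPy.

```python
import inspect


def supports_kwarg(func, name):
    """Return True if `func` accepts a keyword argument called `name`."""
    try:
        params = inspect.signature(func).parameters
    except (TypeError, ValueError):
        # Signature cannot be introspected; assume the keyword is unsupported.
        return False
    if name in params:
        return True
    # A **kwargs parameter would also accept the keyword.
    return any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values())


def one_sided_p(statistic, p_two_sided, alternative):
    """Convert a two-sided t-test p-value into a one-sided p-value."""
    if alternative == "two-sided":
        return p_two_sided
    if alternative == "greater":
        return p_two_sided / 2 if statistic > 0 else 1 - p_two_sided / 2
    if alternative == "less":
        return p_two_sided / 2 if statistic < 0 else 1 - p_two_sided / 2
    raise ValueError(f"unknown alternative: {alternative!r}")


def ttest_compat(ttest_func, a, b, alternative="two-sided", **kwargs):
    """Call `ttest_func` with `alternative` if supported, else emulate it."""
    if supports_kwarg(ttest_func, "alternative"):
        result = ttest_func(a, b, alternative=alternative, **kwargs)
        return result[0], result[1]
    # Older versions: run the two-sided test and convert the p-value.
    result = ttest_func(a, b, **kwargs)
    statistic, p_two_sided = result[0], result[1]
    return statistic, one_sided_p(statistic, p_two_sided, alternative)
```

Usage would look like `ttest_compat(scipy.stats.ttest_ind, a, b, alternative="greater")`. The conversion relies on the symmetry of the t-distribution, so it should only be used for t-tests.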
We do not need to check for pruning here. So if the length of the current trial equals the length of the best trial, nothing needs to be done?
https://github.com/PhilipMay/mltb/blob/237d16d6aff35802b414151a353e3cb368b71f3d/mltb/omlflow.py#L94

See here: https://docs.python.org/3/howto/logging.html#logging-variable-data
https://github.com/PhilipMay/mltb/blob/45608a01533dd3b2b5e91ea03b0614eed5f4f6f3/mltb/omlflow.py#L63

```python
self._iter_metrics = {}
self._next_iter_num = 0
```

## ToDo

- [ ] eyeball the fix in production
https://github.com/PhilipMay/mltb/blob/cb7f9c97775d9cc8efe4cc3d2f7c8eab0479dea0/mltb/omlflow.py#L93