Dan Saattrup Nielsen issues

Results: 74 issues of Dan Saattrup Nielsen

A general sklearn dependency update

This is an attempt to collect all the sklearn dependency issues currently present. **Presort parameter** `The parameter 'presort' is deprecated and has no effect. It will be removed in v0.24....

Add MuMiN dataset to readme

Adds the [MuMiN dataset](https://mumin-dataset.github.io/) to the readme. (My editor also removed some trailing whitespace, I hope that's okay)

Conformal Quantile Regression was introduced in [Romano, Patterson & Candès](https://arxiv.org/abs/1905.03222) and is a variant of quantile regression which calibrates the prediction intervals, yielding narrower intervals, while preserving theoretical coverage guarantees....

enhancement
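The calibration step mentioned in the Conformal Quantile Regression entry above can be sketched roughly as follows. This is a minimal pure-Python illustration of the split-conformal correction as described in the cited paper, not the implementation proposed in the issue. The function names `cqr_correction` and `conformalize` are hypothetical, and the lower and upper quantile predictions are assumed to come from some already-fitted quantile regressor.

```python
import math


def cqr_correction(cal_lower, cal_upper, cal_targets, alpha):
    """Return the amount by which to widen (or shrink) the intervals.

    cal_lower / cal_upper: predicted lower and upper quantiles on a held-out
    calibration set (assumed to come from any quantile regressor).
    cal_targets: the true targets on that calibration set.
    alpha: the target miscoverage rate, e.g. 0.1 for 90% intervals.
    """
    # Conformity score: positive if the target falls outside the interval,
    # negative if it lies inside.
    scores = sorted(
        max(lo - y, y - hi)
        for lo, hi, y in zip(cal_lower, cal_upper, cal_targets)
    )
    n = len(scores)
    # Rank of the ceil((1 - alpha) * (n + 1))-th smallest score.
    rank = math.ceil((1 - alpha) * (n + 1))
    if rank > n:
        # Too few calibration points for this alpha: the interval is unbounded.
        return math.inf
    return scores[rank - 1]


def conformalize(lower, upper, correction):
    """Apply the correction to new predicted intervals."""
    return [(lo - correction, hi + correction) for lo, hi in zip(lower, upper)]


# Toy calibration data (made up for illustration only).
correction = cqr_correction(
    cal_lower=[0, 0, 0, 0, 0, 0, 0],
    cal_upper=[10, 10, 10, 10, 10, 10, 10],
    cal_targets=[5, 11, -2, 10, 3, 13, 0],
    alpha=0.25,
)
intervals = conformalize([0], [10], correction)
```

A negative correction shrinks the intervals, which happens when the uncalibrated quantile predictions are too conservative on the calibration set.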

IVAPs and CVAPs

Inductive Venn-Abers predictors (IVAPs) and Cross Venn-Abers predictors (CVAPs) were introduced in [Vovk, Petej & Fedorova (2015)](https://arxiv.org/abs/1511.00213), and can provide lower and upper bounds for probabilities in classification models....

enhancement

Unit tests

So far all tests are doctests, which double as explanatory examples. However, we also need unit tests that test the edge cases of the functions.

enhancement
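The distinction drawn in the "Unit tests" entry above might look like this in practice. It is only a sketch: `clip_probability` is a hypothetical helper invented for illustration, not a function from the repository. The doctest shows typical usage, while the `unittest` cases cover edge cases that would clutter a docstring.

```python
import unittest


def clip_probability(p):
    """Clip a probability to the closed interval [0, 1].

    The doctest below doubles as an explanatory example:

    >>> clip_probability(0.3)
    0.3
    """
    # NaN is the only float that is not equal to itself.
    if p != p:
        raise ValueError("probability must not be NaN")
    return min(1.0, max(0.0, p))


class ClipProbabilityEdgeCases(unittest.TestCase):
    """Edge cases that are better kept out of the docstring."""

    def test_below_zero_is_clipped_to_zero(self):
        self.assertEqual(clip_probability(-0.5), 0.0)

    def test_above_one_is_clipped_to_one(self):
        self.assertEqual(clip_probability(1.5), 1.0)

    def test_boundaries_are_unchanged(self):
        self.assertEqual(clip_probability(0.0), 0.0)
        self.assertEqual(clip_probability(1.0), 1.0)

    def test_nan_raises(self):
        with self.assertRaises(ValueError):
            clip_probability(float("nan"))
```

Assuming the code lives in an importable module, the doctest can be run with `python -m doctest` and the edge-case tests with `python -m unittest`.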

Add MuMiN dataset

This adds the [MuMiN dataset](https://mumin-dataset.github.io/) to the readme. (My editor also removed trailing whitespace, I hope that's okay)

Add MuMiN dataset to readme

Adds the [MuMiN dataset](https://mumin-dataset.github.io/) to the readme. (My editor also removed trailing whitespace, I hope that's alright)

Add bounds on dependency versions

### 🚀 The feature, motivation and pitch The current `pyproject.toml` file has [dependencies with no lower or upper bounds on them](https://github.com/allenai/OLMo/blob/3be4c1ec367213ff96ccc168ba2f7c27be6d5bc7/pyproject.toml#L15-L54), which means that package management systems that check for...

type/feature

[MODEL EVALUATION REQUEST] Large LLaMA-2 models

**Describe the model** Meta has pretrained and RLHF'd these models on primarily English data, but the data also includes many other languages. They should be benchmarked on all languages. All models can be found [on...

model evaluation request

large model (>7B)

[FEATURE REQUEST] Support seq-to-seq architectures

### 🚀 The feature, motivation and pitch Supporting these architectures would enable benchmarking popular models such as T5 and BART. Most of the benchmarking suite should be identical to the...

enhancement