Felix Schneider issues

Results: 5 issues of Felix Schneider

Inconsistencies between documentation and API

There are several issues with the documentation and the `.pyi` stub files: - The documentation does not mention decoders at all. - In the stub file for `Tokenizer`, all of...

No processors.Sequence

It would be good if there was a `processors.Sequence`, similar to `pre_tokenizers.Sequence`. Right now, if I want to make a Byte-level BPE tokenizer similar to Roberta, but with a different...

Expose `if final copy` switch to the user

In some cases, some information in the text should be redacted in the review copy but present in the final copy. It would be nice to have a...

Batched mapping does not raise an error if values for an existing column are empty

### Describe the bug Using `Dataset.map(fn, batched=True)` allows resizing the dataset by returning a dict of lists, all of which must be the same size. If they are not the...
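The same-size requirement described above can be illustrated with a small pure-Python sketch. It does not use the `datasets` library and is not how the library implements its check; `check_batch_lengths` and `split_words` are hypothetical names introduced only for this example.

```python
from typing import Any, Dict, List


def check_batch_lengths(batch: Dict[str, List[Any]]) -> int:
    """Return the shared column length, or raise if the columns disagree.

    Hypothetical helper for illustration; not part of the `datasets` API.
    """
    lengths = {name: len(values) for name, values in batch.items()}
    distinct = set(lengths.values())
    if len(distinct) > 1:
        # An empty column next to a non-empty one also ends up here.
        raise ValueError(f"Columns have mismatched lengths: {lengths}")
    # A batch with no columns is treated as zero rows.
    return distinct.pop() if distinct else 0


def split_words(batch: Dict[str, List[str]]) -> Dict[str, List[str]]:
    """Batched function that resizes the batch: one output row per word."""
    return {"word": [w for text in batch["text"] for w in text.split()]}


out = split_words({"text": ["a b", "c"]})
n_rows = check_batch_lengths(out)  # two input rows become three output rows
```

Under these assumptions, a returned batch such as `{"a": [1, 2], "b": []}` would be rejected with a `ValueError` rather than passed through silently.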

Add a GROUP BY operator

**Is your feature request related to a problem? Please describe.** Using batch mapping, we can easily split examples. However, we lack an appropriate option for merging them back together by...

enhancement
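
As a rough illustration of what merging split examples back together could look like, here is a minimal pure-Python sketch. It is not an existing `datasets` operator; the function name `group_by` and the column names `doc` and `sent` are assumptions made up for this example.

```python
from typing import Any, Dict, Hashable, List


def group_by(rows: List[Dict[str, Any]], key: str) -> List[Dict[str, Any]]:
    """Merge rows that share the same value of `key`.

    Every other column is collected into a list, in input order.
    Hypothetical helper for illustration; not a `datasets` API.
    """
    groups: Dict[Hashable, Dict[str, List[Any]]] = {}
    for row in rows:
        bucket = groups.setdefault(row[key], {})
        for name, value in row.items():
            if name != key:
                bucket.setdefault(name, []).append(value)
    # Groups come out in order of first appearance of each key value.
    return [{key: k, **cols} for k, cols in groups.items()]


rows = [
    {"doc": 1, "sent": "a"},
    {"doc": 2, "sent": "b"},
    {"doc": 1, "sent": "c"},
]
merged = group_by(rows, "doc")
```

Here the three sentence-level rows collapse into one row per `doc` value, with the `sent` values gathered into a list.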