Tobias Pitters issues

Results 35 issues of


                                            Tobias Pitters

FIX: REGR: setting numeric value in Categorical Series with enlargement raise internal error

- [x] closes #47677 - [x] [Tests added and passed](https://pandas.pydata.org/pandas-docs/dev/development/contributing_codebase.html#writing-tests) if fixing a bug or adding a new feature - [x] All [code checks passed](https://pandas.pydata.org/pandas-docs/dev/development/contributing_codebase.html#pre-commit). - [x] Added [type annotations](https://pandas.pydata.org/pandas-docs/dev/development/contributing_codebase.html#type-hints)...

Indexing

Categorical

REGR: Ignore eval inplace

- [x] closes #47449 - [x] [Tests added and passed](https://pandas.pydata.org/pandas-docs/dev/development/contributing_codebase.html#writing-tests) if fixing a bug or adding a new feature - [x] All [code checks passed](https://pandas.pydata.org/pandas-docs/dev/development/contributing_codebase.html#pre-commit). - [x] Added [type annotations](https://pandas.pydata.org/pandas-docs/dev/development/contributing_codebase.html#type-hints)...

expressions

Implementing eq method to compare Schemas/Columns

I am creating schemas dynamically (from ddl schemas). It would be great if I could build some test and just do ```python expected_schema == result_schema ``` Right now I am...

enhancement

help wanted

Warn fix optimizers warning

Replace the deprecated `inspect.getargspec` with `inspect.getfullargspec` as the first is deprecated in favor of the latter. See here: https://docs.python.org/3/library/inspect.html#inspect.getargspec

add alpaca gpt4 dataset

The inputs can be quite a lot of different versions of `no input`, therefore don't use the `input` column for that. In some cases the text in `input` is already...

Add instruction to reverse augmentation

We currently support reverse augmentation for the alpaca datasets. This proved to be not really helpful till now. As mentioned in section 5.1.1 of the [paper](https://arxiv.org/abs/2303.18223) we should probably generate...

good first issue

Feature/dataset entry

closes #2708 Add pydantic basemodel class (equivalent to dataclass but with stronger guarantees) to return from dolly dataset. Add the formatting functionality in the dataset entry class. This PR does...

Add dialogue data collator tests

Add dialogue data collator unit test. Things to note on this PR: - is it correct that we mask the last occurance of `` of the assistant? See the example...

Total score could synchronize faster

It seems like the score for the month and the total score are synchronized in different time intervals. Maybe we could do both aggregations in the same time interval. ![image](https://user-images.githubusercontent.com/31857876/231005321-1e5f61f5-93f4-4d93-9e64-d485dc907b98.png)

backend

nice-to-have

Special tokens in datasets

As it can be seen in [some of the answers](https://open-assistant.github.io/oasst-model-eval/?f=https%3A%2F%2Fraw.githubusercontent.com%2FOpen-Assistant%2Foasst-model-eval%2Fmain%2Fsampling_reports%2Foasst-sft%2F2023-04-07_OpenAssistant_llama-30b-sft-oa-alpaca-epoch-2_sampling_noprefix2.json%0Ahttps%3A%2F%2Fraw.githubusercontent.com%2FOpen-Assistant%2Foasst-model-eval%2Fmain%2Fsampling_reports%2Foasst-sft%2F2023-04-07_OpenAssistant_llama-30b-sft-oa-alpaca-epoch-4_sampling_noprefix2.json) our model outputs quite a number of tokens that are reserved for special purposes and should not appear in text....

data