Dirk Groeneveld issues

Results: 84 issues by Dirk Groeneveld

Comprehensive tests for step graphs

Test cases:
- [ ] Direct step dependency
- [ ] Step dependency through a list/dict/set/tuple
- [ ] Step dependency through a complex (`FromParams`) object
- [ ] Anonymous...

feature request

Custom truncation logic is really hard

I have a bunch of text pairs that I want to tokenize. Some of those texts are too long for the transformer, so I ask the tokenizer to truncate with...

Docs say you can pass token ids to `.encode()`, but it throws an exception when you do

I'm looking at these docs: https://huggingface.co/docs/transformers/main/en/main_classes/tokenizer#transformers.PreTrainedTokenizer.encode They say you can pass in token ids instead of a string. But when you try you get a `TypeError`: ``` In [2]: import...

Add Python 3.10 to the eval-hackathon branch

Beaker Executor should execute multiple Tango steps in one Beaker experiment

### 🚀 The feature, motivation and pitch

Beaker experiments have significant overhead. When we're running many small Tango steps, we can save some time by running multiple steps at once....

feature request

integration: beaker

Running a Beaker Executor job leaves loads of uncommitted datasets in the workspace

### 🐛 Describe the bug

Run the catwalk training job specified here: https://github.com/allenai/catwalk/commit/5ba019204b0ff36c1c4da7feab4515342e9d9ad2

Command line is `tango --settings experiments/train_all_the_things/tango.yml run experiments/train_all_the_things/train_all_the_things.jsonnet`. It will run for quite a while. Two jobs...

bug

integration: beaker

project/data

Better Checkpoint Management

What happens now
===

Our runs produce "checkpoint directories". You might have seen them. Checkpoint directories contain a bunch of debris from a run, including between 0 and n actual...
