yoyodyne
Small-vocabulary sequence-to-sequence generation with optional feature conditioning
Currently we create a vocabulary of all items in all datapaths specified to the training script. However, we may want to study how models perform when presented with unknown symbols. In...
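If we go this route, one option is to reserve an index for unknowns at construction time and fall back to it at lookup time. A minimal sketch follows; the class and method names (`SymbolMap`, `index_of`) are hypothetical, not yoyodyne's actual API.

```python
# Hypothetical UNK-aware symbol lookup; not yoyodyne's actual vocabulary class.
UNK = "<UNK>"


class SymbolMap:
    """Maps symbols to integer indices, reserving an index for unknowns."""

    def __init__(self, symbols):
        # Index 0 is reserved for the unknown symbol.
        self._index = {UNK: 0}
        for symbol in symbols:
            self._index.setdefault(symbol, len(self._index))

    def index_of(self, symbol: str) -> int:
        # Symbols unseen at training time fall back to the UNK index.
        return self._index.get(symbol, self._index[UNK])


symbol_map = SymbolMap(["a", "b", "c"])
assert symbol_map.index_of("z") == symbol_map.index_of(UNK)
```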
What are people's thoughts on adding preprocessing scripts to allow BPE-like tokenization of characters? Technically we already support this (just tokenize your input yourself and use a delineation function). But I wonder if...
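As a sketch of the "tokenize your input yourself" path, the HuggingFace `tokenizers` library (an assumption here; it is not a yoyodyne dependency) can learn BPE merges over characters and re-emit each word as space-delimited subwords, which can then be fed to the existing whitespace-splitting pipeline:

```python
# Hedged sketch: external BPE preprocessing feeding whitespace-delimited input.
from tokenizers import Tokenizer
from tokenizers.models import BPE
from tokenizers.trainers import BpeTrainer

tokenizer = Tokenizer(BPE(unk_token="[UNK]"))
trainer = BpeTrainer(vocab_size=200, special_tokens=["[UNK]"])
words = ["running", "runner", "jumping", "jumper"]
# With no pre-tokenizer, each whole string is treated as one word and
# merges are learned over its characters.
tokenizer.train_from_iterator(words, trainer)

for word in words:
    # Emits e.g. "runn ing" (the merges depend on the training data).
    print(" ".join(tokenizer.encode(word).tokens))
```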
Might as well set up an autoregressive decoder since T5 is on the docket. This shouldn't be too much of a hassle since the Transformer model works, but leaving as...
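For reference, the core of autoregressive decoding is just a greedy loop that feeds the growing prefix back in. A minimal sketch, where `decoder_step`, `BOS`, and `EOS` are hypothetical placeholders for whatever the real decoder exposes:

```python
# Hedged sketch of greedy autoregressive decoding; `decoder_step` is assumed
# to map (encoder_out, prefix) -> logits of shape (batch, time, vocab).
import torch

BOS, EOS, MAX_LENGTH = 1, 2, 128


def greedy_decode(decoder_step, encoder_out: torch.Tensor) -> list[int]:
    prefix = [BOS]
    for _ in range(MAX_LENGTH):
        logits = decoder_step(encoder_out, torch.tensor([prefix]))
        symbol = int(logits[0, -1].argmax())
        if symbol == EOS:
            break
        prefix.append(symbol)
    return prefix[1:]  # Strips BOS.
```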
TorchMetrics support is pretty reliable nowadays and makes distributed training less annoying (no more world sizes, yay!). It also syncs well with wandb logging and allows monitoring of training batch...
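The win is that the metric object owns the cross-process synchronization, so the module never touches world sizes. A hedged sketch of the usual Lightning pattern (the module body and `num_classes` are assumptions, not yoyodyne code):

```python
# Hedged sketch: TorchMetrics inside a LightningModule.
import pytorch_lightning as pl
import torch
import torchmetrics


class Seq2SeqModule(pl.LightningModule):
    def __init__(self, num_classes: int = 100):
        super().__init__()
        self.accuracy = torchmetrics.Accuracy(
            task="multiclass", num_classes=num_classes
        )

    def validation_step(self, batch, batch_idx):
        logits, target = self(batch)  # Assumed forward signature.
        self.accuracy(logits, target)
        # Passing the metric object to self.log lets Lightning handle
        # the distributed reduction and epoch-level accumulation.
        self.log("val_accuracy", self.accuracy, on_epoch=True)
```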
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/call.py", line 38, in _call_and_handle_interrupt
    return trainer_fn(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 650, in _fit_impl
    self._run(model, ckpt_path=self.ckpt_path)
  File "/usr/local/lib/python3.10/dist-packages/pytorch_lightning/trainer/trainer.py", line 1112, in _run
    results =...
For models where the features are concatenated to the source string, we now handle this in the collator. We simply add the source_token vocabulary length to each feature index in...
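A minimal sketch of that offsetting, with an assumed batch layout and a hypothetical `collate_fn` (not yoyodyne's actual collator):

```python
# Hedged sketch: shifting feature indices past the source vocabulary so a
# single embedding table can serve both index spaces without collisions.
import torch

SOURCE_VOCAB_SIZE = 50  # Hypothetical size of the source token vocabulary.


def collate_fn(source: torch.Tensor, features: torch.Tensor) -> torch.Tensor:
    shifted = features + SOURCE_VOCAB_SIZE
    return torch.cat([source, shifted], dim=1)


batch = collate_fn(torch.tensor([[3, 7, 9]]), torch.tensor([[0, 2]]))
# -> tensor([[ 3,  7,  9, 50, 52]])
```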
Transformer inference (i.e., with no teacher forcing) is slow. In practice I think people typically implement some kind of caching so that at each timestep, we do not need to...
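One common form of this is key/value caching: keep the keys and values from earlier timesteps and only run the newest position as a query. A hedged, single-layer sketch (real Transformer decoders cache per layer, and every name here is illustrative, not a yoyodyne internal):

```python
# Hedged sketch of key/value caching for incremental self-attention decoding.
import torch


class CachedSelfAttention(torch.nn.Module):
    def __init__(self, d_model: int, num_heads: int):
        super().__init__()
        self.attention = torch.nn.MultiheadAttention(
            d_model, num_heads, batch_first=True
        )
        self.keys = None  # Cached keys from earlier timesteps.
        self.values = None  # Cached values from earlier timesteps.

    def forward(self, new_step: torch.Tensor) -> torch.Tensor:
        # new_step: (batch, 1, d_model), the newest decoder position only.
        if self.keys is None:
            self.keys = new_step
            self.values = new_step
        else:
            self.keys = torch.cat([self.keys, new_step], dim=1)
            self.values = torch.cat([self.values, new_step], dim=1)
        # Only the newest position is used as the query, so attention
        # outputs for earlier timesteps are never recomputed.
        output, _ = self.attention(new_step, self.keys, self.values)
        return output
```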
I am working on a simple wrapper class which loads the model and yields predictions, but the interface in `predict.py` is somewhat unfriendly for this... it has nice abstractions but they're...
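The shape of the wrapper I have in mind is roughly the following; `Predictor` and the inference call are hypothetical stand-ins, not the actual yoyodyne interface:

```python
# Hedged sketch of a prediction wrapper around a Lightning checkpoint.
import pytorch_lightning as pl
import torch


class Predictor:
    """Loads a trained checkpoint and returns predictions for a batch."""

    def __init__(self, model_class: type[pl.LightningModule], checkpoint: str):
        self.model = model_class.load_from_checkpoint(checkpoint)
        self.model.eval()

    def __call__(self, batch):
        with torch.no_grad():
            # predict_step is the standard Lightning inference hook; what
            # it returns is model-specific.
            return self.model.predict_step(batch, batch_idx=0)
```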
With the decoupling of encoders and decoders, we have added a `Linear` encoder, which seems to just embed the inputs and pass them along. We should also add a `SelfAttention`...
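A hedged sketch of what such a `SelfAttention` encoder might look like, with assumed hyperparameters and module structure:

```python
# Hedged sketch: a single multi-head self-attention layer over the
# embedded source; not an actual yoyodyne module.
import torch


class SelfAttentionEncoder(torch.nn.Module):
    def __init__(self, vocab_size: int, d_model: int, num_heads: int = 4):
        super().__init__()
        self.embedding = torch.nn.Embedding(vocab_size, d_model)
        self.attention = torch.nn.MultiheadAttention(
            d_model, num_heads, batch_first=True
        )

    def forward(self, source: torch.Tensor) -> torch.Tensor:
        # source: (batch, length) integer symbol indices.
        embedded = self.embedding(source)
        # Each position attends over the whole source sequence.
        output, _ = self.attention(embedded, embedded, embedded)
        return output
```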
This is a notice of my plans to move the encoding methods (which take strings and make tensors) and decoding methods (which convert tensors back into strings) into the Index...
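A minimal sketch of an `Index` that owns both directions of the mapping; the method names (`encode`, `decode`) are assumptions about the planned design, not the current code:

```python
# Hedged sketch: an Index holding both string->tensor and tensor->string maps.
import torch


class Index:
    def __init__(self, symbols: list[str]):
        self._symbols = list(symbols)
        self._indices = {s: i for i, s in enumerate(self._symbols)}

    def encode(self, symbols: list[str]) -> torch.Tensor:
        """Symbols -> tensor of symbol indices."""
        return torch.tensor([self._indices[s] for s in symbols])

    def decode(self, tensor: torch.Tensor) -> list[str]:
        """Tensor of symbol indices -> symbols."""
        return [self._symbols[int(i)] for i in tensor]


index = Index(["a", "b", "c"])
assert index.decode(index.encode(["b", "a"])) == ["b", "a"]
```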