pfeatherstone
How do I "ignore the masked values"? The output of the loss function is a batch of numbers, i.e. if `ref` has shape [B,T] and `deg` has shape [B,T], then...
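One common way to do this (a sketch of the general pattern, not necessarily what this repo does; `masked_loss` is a hypothetical helper): compute the loss with `reduction='none'`, zero out the masked positions, and normalize by the number of valid elements.

```python
import torch
import torch.nn.functional as F

def masked_loss(logits, targets, mask):
    # logits: [B, T, C], targets: [B, T], mask: [B, T] with 1 = valid, 0 = padded
    per_token = F.cross_entropy(
        logits.transpose(1, 2), targets, reduction='none')  # [B, T]
    per_token = per_token * mask                 # masked positions contribute 0
    return per_token.sum() / mask.sum().clamp(min=1)

B, T, C = 2, 5, 10
logits = torch.randn(B, T, C)
targets = torch.randint(0, C, (B, T))
mask = torch.tensor([[1, 1, 1, 0, 0],
                     [1, 1, 1, 1, 1]], dtype=torch.float)
loss = masked_loss(logits, targets, mask)
```

Since masked positions are zeroed before the reduction, changing the labels under the mask leaves the loss unchanged.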
Pretty much. Basically nothing can be optional with tracing and ONNX, so both mems and mask must be specified.
I haven't been able to train my models yet, even with normal transformers, using larger context lengths (my weird TTS + STT system). CTC loss isn't converging at all. So...
You still need to pass mems in the forward pass when tracing. So I think the fix would be to pass mems full of zeros and make the code handle...
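A minimal sketch of that workaround (the `Block` module here is a toy stand-in for a transformer layer that takes XL-style memories, not the repo's actual class): trace with a zero-filled mems tensor so the traced graph always has a concrete input.

```python
import torch

class Block(torch.nn.Module):
    # toy stand-in: concatenates memories along the time axis,
    # projects, and returns only the positions belonging to x
    def __init__(self, dim=8):
        super().__init__()
        self.proj = torch.nn.Linear(dim, dim)

    def forward(self, x, mems):
        ctx = torch.cat([mems, x], dim=1)        # [B, M + T, D]
        return self.proj(ctx)[:, mems.shape[1]:]  # [B, T, D]

model = Block()
x = torch.randn(1, 4, 8)
zero_mems = torch.zeros(1, 2, 8)   # "mems full of zeros" instead of None
traced = torch.jit.trace(model, (x, zero_mems))
```

Whether attending to those zero positions is harmless is exactly the open question in this thread; masking them out would be the safer option.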
Then probably need some control flow in a torch.jit.script function which handles the different cases
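That control flow could look something like this (a hypothetical `with_optional_mems` function, just to illustrate the pattern): `torch.jit.script` supports `Optional[Tensor]` arguments and real branching, unlike tracing.

```python
import torch
from typing import Optional

@torch.jit.script
def with_optional_mems(x: torch.Tensor,
                       mems: Optional[torch.Tensor]) -> torch.Tensor:
    # scripted control flow: handle the mems-is-None case explicitly
    if mems is None:
        mems = torch.zeros(x.shape[0], 0, x.shape[2], dtype=x.dtype)
    return torch.cat([mems, x], dim=1)

x = torch.randn(2, 3, 4)
a = with_optional_mems(x, None)                  # no memories
b = with_optional_mems(x, torch.randn(2, 2, 4))  # two memory slots
```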
Gave this a go; it turns out that `torch.jit.trace()` doesn't accept `None` in `example_inputs`. So we cannot trace with `mems` not None and expect it to work when None, or vice...
Yeah I have. I could submit a PR, but I didn't know if passing zeros and allowing the model to attend to them was OK.
Ah yes I need to try. Can it wait till Tuesday?
In the memory replay backpropagation algorithm, the labels are partitioned in the same way as the logits. The loss is evaluated per block. For CTC that doesn't make sense since...
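For reference, this is why per-block evaluation doesn't fit: CTC marginalizes over all alignments between the *whole* logit sequence and an unaligned target sequence, so the targets can't be partitioned along with the logits. A standard full-sequence usage (toy shapes, assuming class 0 is the blank):

```python
import torch

B, T, C = 2, 50, 20   # batch, time steps, classes (0 = blank)
# nn.CTCLoss expects log-probs shaped [T, B, C]
log_probs = torch.randn(T, B, C).log_softmax(dim=-1)
targets = torch.randint(1, C, (B, 12))            # unaligned label sequences
input_lengths = torch.full((B,), T, dtype=torch.long)
target_lengths = torch.full((B,), 12, dtype=torch.long)

ctc = torch.nn.CTCLoss(blank=0)
loss = ctc(log_probs, targets, input_lengths, target_lengths)
```

Splitting `log_probs` into blocks would require knowing which target labels fall in each block, which is precisely the alignment CTC is supposed to discover.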
@lucidrains Or if we forget CTC, can you think of a way to make this work with unaligned targets?