Colin Raffel
Re: data storage and S3, maybe a better solution would be to use Git LFS (or one of the equivalents)? That way it won't ever go anywhere and we won't...
> I'm not sure - it looks like GitHub LFS is still under construction (I applied for the early access though, we'll see what happens...).

I have access. There are...
> Do you know if it's possible to get a download URL, like the zipball links for repos?

That's a good question. I'm pretty sure it doesn't, which limits its...
In order to use the module import functionality of seqio, importing the module needs to add the task you want to use to the task registry without calling any additional...
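To illustrate the point about import-time registration, here is a minimal sketch using a toy registry. Everything in it (`toy_registry`, `TASKS`, `import_from_source`, the module names) is hypothetical and stands in for the real thing; it is not seqio's API. It only shows why registration at module top level is visible after a bare import, while registration hidden inside a function is not.

```python
import sys
import types

# Hypothetical stand-in for a task registry (NOT seqio's API).
registry = types.ModuleType("toy_registry")
registry.TASKS = {}
sys.modules["toy_registry"] = registry

# Registers its task at module top level, so importing is enough.
GOOD_SRC = '''
import toy_registry
toy_registry.TASKS["my_task"] = "task definition"
'''

# Only registers when register() is called, so a bare import adds nothing.
BAD_SRC = '''
import toy_registry

def register():
    toy_registry.TASKS["lazy_task"] = "task definition"
'''


def import_from_source(name, source):
    """Simulate `import name` by executing the module body once."""
    module = types.ModuleType(name)
    exec(source, module.__dict__)
    sys.modules[name] = module
    return module


import_from_source("good_tasks", GOOD_SRC)
import_from_source("bad_tasks", BAD_SRC)

registered = sorted(registry.TASKS)
print(registered)
```

Only the task from the first module ends up in the registry, since the second one would need an extra function call after import.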
Hi, monotonic attention (and MoChA) produce the probability of attending to each of the encoder states *or* skipping all of the encoder states. As a result, the sum of the...
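A small numeric sketch of the "attend to one state *or* skip all of them" behaviour, assuming the simplest case: hard monotonic attention at the first output step, scanning the encoder states from the start. The function name and the example selection probabilities are made up for illustration.

```python
def monotonic_attention_probs(p_choose):
    """Return (attend, skip_all) for one left-to-right scan.

    p_choose[i] is the probability of stopping at encoder state i,
    given that the scan has reached it. attend[i] is the probability
    of attending to state i; skip_all is the probability that the
    scan passes every state without stopping.
    """
    attend = []
    p_not_stopped = 1.0  # probability the scan has reached state i
    for p in p_choose:
        attend.append(p_not_stopped * p)
        p_not_stopped *= 1.0 - p
    return attend, p_not_stopped


attend, skip_all = monotonic_attention_probs([0.1, 0.3, 0.2])
print(attend, skip_all, sum(attend) + skip_all)
```

In this sketch the attention probabilities alone add up to less than 1; the remaining mass is the probability of skipping every encoder state.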
I like this idea. I can't think of a way to allow this and still allow pre-computing the input, unless we put the input-to-hidden connections outside of the recurrence, but...
> FWIW, this is what I'm doing at the moment in my experiments, as it seemed more natural.

Do you have any example code?

> And I think your usage...
> What about extending the CustomRecurrentLayer to allow specifying an in_and_hid_to_hid network rather than separate in_to_hid and hid_to_hid networks which are added up?

Sure, we could do that, but it'd...
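For reference, a rough NumPy sketch of the two formulations being compared. This is not Lasagne code; the function names, weight shapes, and the `tanh` nonlinearity are assumptions chosen for illustration. It shows that when each network is a single dense layer, one joint network over the concatenated input and hidden state reproduces the "separate networks, added up" step.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hid = 3, 4

# Separate input-to-hidden and hidden-to-hidden weights (assumed dense layers).
W_in = rng.normal(size=(n_in, n_hid))
W_hid = rng.normal(size=(n_hid, n_hid))


def step_additive(x_t, h_prev):
    """in_to_hid and hid_to_hid outputs are added, then squashed."""
    return np.tanh(x_t @ W_in + h_prev @ W_hid)


def step_joint(x_t, h_prev, W_joint):
    """A single network applied to the concatenation [x_t, h_prev]."""
    return np.tanh(np.concatenate([x_t, h_prev]) @ W_joint)


# Stacking the two weight matrices gives an equivalent joint layer.
W_joint = np.concatenate([W_in, W_hid], axis=0)

x_t = rng.normal(size=n_in)
h_prev = rng.normal(size=n_hid)
h_additive = step_additive(x_t, h_prev)
h_joint = step_joint(x_t, h_prev, W_joint)
same = bool(np.allclose(h_additive, h_joint))
print(same)
```

The joint form is more general once the network has more than one layer. In the additive form, `x_t @ W_in` does not depend on the hidden state, so it can be computed for all time steps up front.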
I see. But why not just solve this using the RNN containers we've been talking about? I don't really see the need for this intermediate layer of granularity.
> Would anybody like to try posting a complete container-based example of a vanilla recurrent layer (LSTM and GRU distract too much for the start)?

I'd _like_ to, not sure...