Kyle Gorman
Okay, I made a change to `indexes.py` so it no longer crashes if there are no features. This does seem to work on my Polish tests, but validation accuracy performance...
Enabling this may also be resulting in slightly poorer performance on Polish with a pointer-generator LSTM, just from eyeballing...
> Ok I will do more testing. I also just remembered a potential issue: this removes any mechanism to get a pointer-generator without shared embeddings, which was previously the default....
> I trained on the Polish abstractness data with features (pointer_generator_lstm, lstm encoder, linear features encoder, hidden size 512, 1 layer, emb 128). > > Here are the first 3...
> To be clear, we did not previously default to sharing embeddings, we just ensured characters had the same index, but those indices pointed to separate matrices. I think this...
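For readers following the thread, the distinction drawn above (characters sharing an index versus actually sharing an embedding matrix) can be sketched roughly as follows. This is a toy illustration in plain Python, not the project's code: `make_table`, the vocabulary, and the table names are all hypothetical.

```python
import random


def make_table(vocab_size, dim, seed):
    """Builds a toy embedding table as a list of random vectors (hypothetical helper)."""
    rng = random.Random(seed)
    return [[rng.uniform(-1.0, 1.0) for _ in range(dim)] for _ in range(vocab_size)]


# One symbol-to-index map used on both the source and target side.
index = {"<pad>": 0, "a": 1, "b": 2}

# Case 1: same indices, separate matrices. "a" is row 1 in both tables,
# but the two rows are independent parameters.
source_table = make_table(len(index), dim=4, seed=0)
target_table = make_table(len(index), dim=4, seed=1)

# Case 2: shared embeddings. Both sides look up the very same table object.
shared_table = make_table(len(index), dim=4, seed=2)
shared_source = shared_table
shared_target = shared_table

i = index["a"]
separate_rows_equal = source_table[i] == target_table[i]
shared_rows_identical = shared_source[i] is shared_target[i]
```

In the first case the index agreement only makes copying by index well defined; in the second, an update to one side's row is an update to the other's.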
Sorry to keep jumping around with this, but I let this PR's version keep training on Polish and it went nan around 20 epochs. See if you can replicate? ```readonly...
Broader question: given that Wu & Cotterell didn't find an improvement for going from zeroth to first order, would we want to consider getting rid of the contextual form? I...
> This all basically looks fine to me, but I'll defer to Adam for final approval. > > I would recommend defining `1e7` as a constant and explaining what it...
> @kylebgorman Replaced the `math` with `log`. Found a few redundant uses of `inf` so just created a `defaults.INF` and `defaults.NEG_INF` set of arguments. While at it I also offloaded...
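A minimal sketch of what named infinity constants might look like, assuming a small defaults-style module. Only the names `INF` and `NEG_INF` come from the comment above; `mask_scores` and `softmax` are hypothetical helpers added for illustration, and nothing here is taken from the actual PR.

```python
import math

# Named constants in place of scattered literals (names from the thread).
INF = math.inf
NEG_INF = -math.inf


def mask_scores(scores, keep):
    """Replaces scores at positions where keep is False with NEG_INF (hypothetical helper)."""
    return [s if k else NEG_INF for s, k in zip(scores, keep)]


def softmax(scores):
    """Numerically stable softmax; assumes at least one finite score."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


probs = softmax(mask_scores([1.0, 2.0, 3.0], [True, False, True]))
```

Masked positions get probability exactly zero because `math.exp(-math.inf)` is `0.0`, which is one reason a named `NEG_INF` tends to be easier to reason about than a large finite literal.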
> @kylebgorman revert fly-by comment to width. Size should only refer to tensor.dim. Done.