John Giorgi

Results: 54 comments of John Giorgi

@ethch18 I implemented MLM in AllenNLP for my own project. Unfortunately it's highly coupled to some other code, but these might be useful: - Helper functions (mostly based on HF...
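As a rough illustration of the kind of MLM helper mentioned above (the function name, signature, and `-100` label convention are assumptions in the spirit of the Hugging Face masking utilities, not the actual code from that project):

```python
import random

def mask_tokens(token_ids, mask_id, mlm_probability=0.15, rng=None):
    """Hypothetical MLM helper: randomly replace a fraction of token ids
    with ``mask_id``. Returns the masked sequence plus labels, where
    unmasked positions get -100 (ignored by the loss) and masked positions
    keep the original id as the prediction target."""
    rng = rng or random.Random()
    masked, labels = [], []
    for tid in token_ids:
        if rng.random() < mlm_probability:
            masked.append(mask_id)
            labels.append(tid)   # model must recover the original token here
        else:
            masked.append(tid)
            labels.append(-100)  # position excluded from the MLM loss
    return masked, labels
```

A real implementation would also sometimes keep the original token or substitute a random one (the 80/10/10 split from the BERT paper); this sketch only shows the basic mask-and-label bookkeeping.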

> I think one of @dirkgr's points was that we don't want to add a reinitialized transformer to the cache. So maybe only run [`_model_cache[spec] = transformer`](https://github.com/allenai/allennlp/pull/5543/files#diff-52c88d0594aac08ef176c393bdd7cb32ba8f57c6cc7f9a39583e741100fa4309L184) when `reinit_modules is...
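The behavior being discussed — load as usual, but skip the cache entirely whenever modules are re-initialized, so a reinitialized transformer never pollutes `_model_cache` — can be sketched in plain Python. The names `get`, `loader`, and `_model_cache` here are illustrative stand-ins, not AllenNLP's actual `cached_transformers` API:

```python
# Illustrative sketch only: cache a loaded model unless reinit_modules is set.
_model_cache = {}

def get(model_name, loader, reinit_modules=None):
    """Return a (possibly cached) model. When ``reinit_modules`` is given,
    neither read from nor write to the cache, so reinitialized weights are
    never stored alongside pristine pretrained weights."""
    if reinit_modules is None and model_name in _model_cache:
        return _model_cache[model_name]
    model = loader(model_name)
    if reinit_modules is None:
        _model_cache[model_name] = model  # only cache the pristine model
    return model
```

Keying the cache on the model name alone works precisely because reinitialized variants are never stored; if they were, `reinit_modules` would have to be part of the cache key.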

@dirkgr I had originally added this functionality to `PretrainedTransformerEmbedder`, but then I figured it made sense to move it to `cached_transformers`, so any part of the library that loads a...

We (not affiliated with AllenNLP) are actually working on exactly that! Our implementation subclasses `FBetaMeasure` and has (almost) all the same unit tests. Still stuck on a couple. You can...
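For context on what a `FBetaMeasure` subclass ultimately computes, here is the F-beta score itself as a standalone function (this is just the standard formula, not the AllenNLP class or its bookkeeping over predictions):

```python
def fbeta(precision, recall, beta=1.0):
    """F-beta score: the weighted harmonic mean of precision and recall.
    beta > 1 weights recall more heavily; beta < 1 favours precision;
    beta = 1 gives the familiar F1 score."""
    if precision == 0.0 and recall == 0.0:
        return 0.0  # conventionally defined as 0 when both are 0
    b2 = beta ** 2
    return (1 + b2) * precision * recall / (b2 * precision + recall)
```

The real metric class additionally accumulates true/false positive counts across batches and supports per-class and averaged variants, which is where most of the unit-test surface lives.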

@annajung Sorry I am getting back to you so late! Are the decoder layers all RNNs? If so, I think it makes the most sense to use the last hidden vector output...
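The "last hidden vector" idea above — summarizing each sequence by the hidden state at its final (non-padding) timestep — can be shown with plain lists rather than tensors. This is a toy sketch; a real implementation would index into a padded tensor of RNN outputs using the sequence lengths:

```python
def last_hidden(hidden_states, lengths=None):
    """Pick the final hidden vector per sequence from a batch of RNN outputs.

    ``hidden_states``: list (batch) of lists (time) of hidden vectors.
    ``lengths``: optional true sequence lengths, so that padded timesteps
    at the end of shorter sequences are skipped."""
    out = []
    for i, states in enumerate(hidden_states):
        t = (lengths[i] if lengths is not None else len(states)) - 1
        out.append(states[t])
    return out
```

Passing `lengths` matters whenever sequences in a batch are padded: without it, the "last" vector for a short sequence would be a padding step, not its real final state.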

@annajung Looks good :) I left a comment! I think I originally thought the goal was to add a composable decoder to the copynet model, similar to what already exists...

@annajung What do you think? I would be game to try and implement it. I am excited to try out a transformer decoder layer within copynet for my own project....

Okay, I have made a decent amount of progress on this. I have replaced `self._decoder_cell` with `self.decoder`, which is a [SeqDecoder](https://github.com/allenai/allennlp-models/blob/f1de60fc5eab482dfd58d47ca160b497c69b4519/allennlp_models/generation/modules/seq_decoders/seq_decoder.py#L10) subclass. Now I am trying to figure out how...
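The shape of that refactor — swapping a hard-coded RNN cell for a pluggable decoder abstraction — can be sketched without any framework. All names below are stand-ins for illustration (the actual interface is the linked `SeqDecoder`):

```python
from abc import ABC, abstractmethod

class Decoder(ABC):
    """Stand-in for a SeqDecoder-style abstraction: anything that maps an
    encoded source plus the previously decoded tokens to output scores."""
    @abstractmethod
    def decode(self, encoded, prev_tokens):
        ...

class ToyDecoder(Decoder):
    """Toy implementation used only to demonstrate the composition; real
    subclasses would wrap an LSTM cell, a transformer layer, etc."""
    def decode(self, encoded, prev_tokens):
        # Arbitrary demo logic: shift each encoded value by the step count.
        return [e + len(prev_tokens) for e in encoded]

class Model:
    def __init__(self, decoder):
        self.decoder = decoder  # replaces a hard-coded self._decoder_cell

    def forward(self, encoded, prev_tokens):
        return self.decoder.decode(encoded, prev_tokens)
```

The payoff is that `Model` no longer cares what the decoding module is, so an RNN cell and a transformer decoder layer become interchangeable configuration choices.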

> @JohnGiorgi one thing to keep in mind is that we want to keep backwards compatibility. So a CopyNet model trained before with a `_decoder_cell` should still work. I see....
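One common way to keep pre-refactor checkpoints loadable after a rename like `_decoder_cell` → a nested decoder attribute is to remap parameter names when the state dict is loaded. The prefixes below are illustrative guesses, not CopyNet's actual attribute paths:

```python
def remap_old_keys(state_dict,
                   old_prefix="_decoder_cell.",
                   new_prefix="decoder._decoder_cell."):
    """Rewrite parameter names from a pre-refactor checkpoint so they match
    a new module layout. Operates on a plain dict, as torch state dicts do;
    the prefix values are hypothetical examples."""
    remapped = {}
    for key, value in state_dict.items():
        if key.startswith(old_prefix):
            key = new_prefix + key[len(old_prefix):]
        remapped[key] = value
    return remapped
```

With a remapping like this applied before `load_state_dict`, an old checkpoint's weights land on the renamed submodule and the model behaves exactly as before.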