Matt Watson
This is a fairly large training script, with quite a bit going on in the `train_step`. If this is a bug report, it would help to get a much more...
I'm not sure a truly dynamic pairing of vocabulary indices and learned embedding weights (especially in a smoothly differentiable way) would be a good fit for either StringLookup or the...
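To make the contrast concrete, here is a minimal sketch of how `StringLookup` is typically paired with an embedding layer. The pairing of index to embedding row is fixed up front; gradients flow only into the embedding weights, not into the lookup itself, which is why a truly dynamic, differentiable vocabulary mapping doesn't fit this layer. The toy vocabulary is an assumption for illustration.

```python
import tensorflow as tf

# StringLookup assigns each vocabulary term a fixed integer index
# (index 0 is reserved for OOV by default). The index -> embedding-row
# pairing is static; only the embedding weights are learned.
lookup = tf.keras.layers.StringLookup(vocabulary=["a", "b", "c"])
embedding = tf.keras.layers.Embedding(
    input_dim=lookup.vocabulary_size(),  # includes the OOV slot
    output_dim=4,
)

ids = lookup(tf.constant(["a", "c"]))  # fixed indices: [1, 3]
vectors = embedding(ids)               # gradients reach embedding rows only
```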
Thanks! This is great validation that we do indeed have a working solution here, without needing to return a complex object from our modeling APIs with every intermediate output. This...
@jbischof I think you might be misunderstanding this issue? At least going off of your question. > Wouldn't it be pretty heavyweight to always add this to the output? I...
One potential way this could work: rework the [split sentence script](https://github.com/keras-team/keras-nlp/blob/master/examples/tools/split_sentences.py) to go from a raw wikipedia dump and books text files to a set of sharded files with...
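The "raw text to sharded files" step could be sketched roughly as below. This is a hypothetical illustration, not the actual script: the round-robin strategy, file naming, and shard count are all assumptions.

```python
import os


def shard_sentences(sentences, output_dir, num_shards=4):
    """Hypothetical sketch: write sentences round-robin into num_shards
    plain-text files so downstream pretraining steps can stream them.
    File layout and names are assumptions, not the real script's output."""
    paths = [
        os.path.join(output_dir, f"shard-{i:05d}.txt") for i in range(num_shards)
    ]
    files = [open(p, "w") for p in paths]
    try:
        for i, sentence in enumerate(sentences):
            files[i % num_shards].write(sentence + "\n")
    finally:
        for f in files:
            f.close()
    return paths
```

A real version would also need to handle document boundaries, since next-sentence-style tasks care which document a sentence came from.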
Opened two more specific issues for this w.r.t. our BERT/RoBERTa models. - xla testing for bert - https://github.com/keras-team/keras-nlp/issues/325 - mixed precision for bert - https://github.com/keras-team/keras-nlp/issues/326
Thanks! This looks good to me at a high level. I see a lot of places where this and https://github.com/keras-team/keras-nlp/pull/361 are treading the same ground, so maybe we should merge...
Also just a note, after landing https://github.com/keras-team/keras-nlp/pull/361, I think we should move the GCP bucket directories so that they live in `keras-nlp/models/gpt2_base`, and not `keras-nlp/models/gpt2_base_webtext`. Edit: just noticed @abheesht17 already...
Thanks! I can bring this up with the team. In terms of desired behavior, I would be inclined to not allow floats, because we probably don't want `keras_nlp.tokenizers.ByteTokenizer(sequence_length=5.5)` to actually...
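One way to get that behavior is to validate with `operator.index()`, which accepts true integers (including NumPy-style int types) and raises for floats. This is a hypothetical helper for illustration, not the actual keras_nlp check:

```python
import operator


def validate_sequence_length(sequence_length):
    """Hypothetical validation sketch: accept anything implementing
    __index__ (ints, NumPy integers), reject floats like 5.5 rather
    than silently truncating them."""
    try:
        # operator.index() raises TypeError for floats, even 5.0.
        return operator.index(sequence_length)
    except TypeError:
        raise TypeError(
            f"`sequence_length` must be an integer, received: {sequence_length!r}"
        )
```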
Looking at tf.data's summary of performance optimizations, I would say largely yes (to preserving optimizations), with a few caveats: - We are not loading the data, so sharding and interleaving...
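A minimal sketch of the shape of pipeline this assumes: preprocessing applied via `map` with autotuned parallelism and a trailing `prefetch`, while file-level sharding and interleaving (which happen at load time) are out of scope. The dataset contents and the split-as-tokenization stand-in are placeholders.

```python
import tensorflow as tf

# Minimal sketch: preprocessing inside a tf.data pipeline. map() with
# num_parallel_calls=AUTOTUNE and a final prefetch() preserve the
# parallelism/pipelining optimizations; sharding and interleaving apply
# when loading files, which this sketch skips.
ds = tf.data.Dataset.from_tensor_slices(tf.constant(["a b c", "d e f"]))
ds = ds.map(
    lambda s: tf.strings.split(s),  # placeholder for real tokenization
    num_parallel_calls=tf.data.AUTOTUNE,
)
ds = ds.prefetch(tf.data.AUTOTUNE)
```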