Matt Watson

335 comments of Matt Watson

I actually think we have this! At least for MLM preprocessing. What is currently released:

- This layer -> https://keras.io/api/keras_nlp/preprocessing_layers/masked_lm_mask_generator/
- A guide showing it in action -> https://keras.io/guides/keras_nlp/transformer_pretraining/

What...
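For context, a minimal sketch of how that released layer can be used; the parameter values here are illustrative, see the linked docs for the full signature:

```python
import tensorflow as tf
import keras_nlp

# Randomly selects tokens for masked language modeling and replaces them
# with the mask token; returns the masked ids plus the positions and
# original ids to use as labels.
masker = keras_nlp.layers.MaskedLMMaskGenerator(
    vocabulary_size=10,
    mask_selection_rate=0.2,   # fraction of tokens to consider for masking
    mask_selection_length=5,   # max masked tokens per sequence
    mask_token_id=0,
)
output = masker(tf.constant([1, 2, 3, 4, 5]))
# output is a dict with "token_ids", "mask_positions", "mask_ids",
# and "mask_weights" entries.
```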

@abheesht17 is this still valid? If so, can we add more details, or close it if not?

Sorry for the delay! I've been on leave and am just getting back. Still thinking on this, not sure what to do... One potential option would be to expose a more robust...

@sampathweb as you are going to look at nightly publishing, do you want to take a look at this too? No strong feelings on what we do here.

Talked with @sampathweb; sounds like we will wait until we are ready to do this across all Keras projects, so that we use a similar publishing setup across the team. Will...

Probably an issue with generating the API symbols. It looks like you need to sync with the latest changes on master; then you can try running `./shell/api_gen.sh`.

@Mohamed-Ashraf273 is there a way that we can land this without the subgraph approach? We have a similar need in JAX at train time. Compilation times are much improved if...

We do have a prefill step; it's just not explicitly called out as such. See, for example:

https://github.com/keras-team/keras-hub/blob/25c9062c5f9a25fade16094cd21d2545125168ec/keras_hub/src/models/gemma/gemma_causal_lm.py#L250

We will prefill for the entire batch and sequence in one go....
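To illustrate the pattern (a loose sketch, not the actual keras-hub code; `step_fn` and its `(token_ids, cache, index)` signature are assumptions here):

```python
import numpy as np

def generate(step_fn, prompt_ids, max_length):
    # Prefill: a single forward pass over the entire prompt sequence
    # populates the key/value cache in one go.
    logits, cache = step_fn(prompt_ids, cache=None, index=0)
    tokens = list(prompt_ids)
    # Decode: after the prefill, generate one token at a time,
    # reusing (and updating) the cache at each step.
    for index in range(len(prompt_ids), max_length):
        next_token = int(np.argmax(logits[-1]))
        tokens.append(next_token)
        logits, cache = step_fn(np.array([next_token]), cache, index)
    return tokens
```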

@susnato this looks to me like an issue @chenmoneygithub filed for himself to work on! @chenmoneygithub one thing to keep in mind is our notion of test sizes. Currently we...
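For reference, the kind of convention being referred to; the marker and flag names below are assumptions for illustration, not necessarily the exact ones used in the repo:

```python
import pytest

# Larger tests are opted into explicitly so the default suite stays fast.
@pytest.mark.large  # hypothetical marker; enabled with e.g. `--run_large`
def test_saved_model_roundtrip():
    # Expensive checks (e.g. saving and reloading a model) live behind
    # the marker and are skipped in the default test run.
    ...
```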

> My feeling is that the point of accelerator testing is to test GPU compatibility, but now we are mixing GPU testing with large testing (e.g., saving).

Talked offline, but I...