Matt Watson
This is mostly implemented, but still needs a little work. I'll push code shortly.
This will probably have some failing tests, just seeing how it does. Need to add some unit tests for a few bugs I spotted in our converters still.
We should also keep the docstring for the method on the `Backbone` base class. And factor out all the error checking somehow. That way the per model code here could...
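A minimal sketch of what this could look like, with the docstring and shared error checking living on the base class so per-model subclasses only supply their own constructor. All class and method names here are hypothetical, not the actual KerasNLP API:

```python
# Hypothetical sketch: the docstring and validation live once on the base
# class; per-model code does not repeat them.
class Backbone:
    @classmethod
    def from_config(cls, config):
        """Instantiate a backbone from a config dict.

        Docstring is kept here on the `Backbone` base class, so every
        subclass inherits it.
        """
        cls._validate_config(config)  # factored-out error checking
        return cls(**config)

    @classmethod
    def _validate_config(cls, config):
        # Shared validation, written once instead of per model.
        if not isinstance(config, dict):
            raise TypeError(
                f"`config` must be a dict. Received: {config!r}"
            )


class HypotheticalModelBackbone(Backbone):
    # Per-model code shrinks to just the constructor.
    def __init__(self, hidden_dim=8):
        self.hidden_dim = hidden_dim
```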
Contributions are welcome here, but this is a fairly abstract problem that would need some scouting out first. We could try to leverage Keras' DataAdapter here, I'm not sure how...
@SamanehSaadat @divyashreepathihalli leaving you both assigned here so we can monitor this issue for comments and new contributors. I've pinned it to the top of our issue list (following Keras).
Looked at this a bit. I think #1861 will be an important precursor work to make implementing this reasonable. I also think we might want to consider starting on some...
It's unclear to me whether Jetstream supports tokenization beyond sentencepiece and gpt-style-bpe, see [this](https://github.com/google/maxtext/blob/5af84912f4d11f356ea9929950faa7c50b12ae85/MaxText/maxengine.py#L358-L363) for maxtext. This is something to look into.
The only thing I could see that we might be able to control from the Keras side is, `max_queue` in [create_file_writer](https://www.tensorflow.org/api_docs/python/tf/summary/create_file_writer). We could try setting that to a larger value...
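For reference, a sketch of what bumping that buffer would look like (the `max_queue` parameter is real, but the value of 100 and the log directory are just placeholders):

```python
import tensorflow as tf

# `max_queue` controls how many summaries are buffered in memory before
# being flushed to disk; the default is 10. A larger value trades memory
# for fewer flushes.
writer = tf.summary.create_file_writer("/tmp/logs", max_queue=100)

with writer.as_default():
    tf.summary.scalar("loss", 0.5, step=0)
writer.flush()
```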
@grasskin cc'ing you here as well in case you have more context.
Probably an issue with the `cudnn`-specific implementation on the tf backend, which is pretty dense. I will take a look.