Add StableLM-3B 4E1T to Keras Hub

Open Bond099 opened this issue 8 months ago • 5 comments

This PR adds the StableLM-3B 4E1T model to Keras Hub. However, numerical matching with the Hugging Face implementation is still in progress.

Bond099 avatar Mar 18 '25 18:03 Bond099

@divyashreepathihalli Here is a comparison of numerics with Hugging Face in Colab. The results match with an absolute tolerance of 1e-3, but they do not match when using 1e-5. Could you please take a look and suggest some improvements or explanations for this discrepancy?
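For illustration only, a minimal sketch of the kind of tolerance check described above. The numbers are made up and are not the outputs from the Colab; `max_abs_diff` and `all_close` are hypothetical helpers written in plain Python. In practice `np.allclose` or `numpy.testing.assert_allclose` would be the usual tools. Gaps of this size are often attributed to float32 rounding accumulating across layers, though that is an assumption here and not something the comparison confirms.

```python
# Made-up values standing in for logits from two implementations.
# They are NOT taken from the actual StableLM comparison.
reference = [0.12345, -1.50000, 2.71828, 0.00010]
candidate = [0.12371, -1.50042, 2.71790, 0.00031]


def max_abs_diff(a, b):
    """Largest element-wise absolute difference between two sequences."""
    return max(abs(x - y) for x, y in zip(a, b))


def all_close(a, b, atol):
    """True if every pair of elements differs by at most `atol`."""
    return all(abs(x - y) <= atol for x, y in zip(a, b))


print(f"max abs diff: {max_abs_diff(reference, candidate):.2e}")
for atol in (1e-3, 1e-5):
    print(f"atol={atol:g}: match={all_close(reference, candidate, atol)}")
```

Reporting the maximum absolute difference alongside the pass/fail result makes it easier to see how far the outputs are from the tighter tolerance.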

Bond099 avatar Mar 22 '25 18:03 Bond099

The numerics are good enough!

divyashreepathihalli avatar Apr 16 '25 05:04 divyashreepathihalli

@Bond099 let's sync this with the latest changes and make sure to run our format script. I'm not exactly sure why none of our CI is running, but I don't think it ran.

mattdangerw avatar May 29 '25 15:05 mattdangerw

Let's clean up the PR. Can we fix the following minor things?

  • Pull in master.
  • I still don't see tie_weights = False in the PR.
  • Run formatting, etc.
  • Let's wait for all the tests to run.

abheesht17 avatar Jun 16 '25 17:06 abheesht17

Looks like there are conflicts. Please pull in master and resolve the conflicts.

abheesht17 avatar Jun 17 '25 04:06 abheesht17

/gemini review

divyashreepathihalli avatar Jul 11 '25 00:07 divyashreepathihalli

/gemini review

divyashreepathihalli avatar Aug 25 '25 22:08 divyashreepathihalli

@Bond099, could you please resolve the review comments and update the PR? Thanks!

sachinprasadhs avatar Oct 15 '25 18:10 sachinprasadhs

This PR is stale because it has been open for 28 days with no activity. It will be closed in 28 days if no further activity occurs. Thank you.

github-actions[bot] avatar Nov 13 '25 02:11 github-actions[bot]