keras-nlp
Add StableLM-3B 4E1T to Keras Hub
This PR adds the StableLM-3B 4E1T model to Keras Hub. However, numerical matching with the Hugging Face implementation is still in progress.
@divyashreepathihalli Here is a comparison of numerics with Hugging Face in Colab. The results match with an absolute tolerance of 1e-3, but they do not match when using 1e-5. Could you please take a look and suggest some improvements or explanations for this discrepancy?
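For readers following the tolerance discussion, here is a minimal sketch of what "matches at 1e-3 but not at 1e-5" means in practice. It is not the code from the Colab: the values below are hypothetical stand-ins for flattened model outputs, and `max_abs_diff` and `matches_within` are illustrative helper names, not Keras Hub or Hugging Face APIs.

```python
def max_abs_diff(reference, candidate):
    """Return the largest element-wise absolute difference of two flat sequences."""
    if len(reference) != len(candidate):
        raise ValueError("outputs must have the same number of elements")
    return max(abs(r - c) for r, c in zip(reference, candidate))


def matches_within(reference, candidate, atol):
    """True if every element of candidate is within atol of reference."""
    return max_abs_diff(reference, candidate) <= atol


# Hypothetical values (assumption): stand-ins for flattened outputs of the
# two implementations, not numbers taken from the PR's Colab.
hf_output = [0.12345, -1.50000, 2.25000]
keras_output = [0.12349, -1.50021, 2.24990]

diff = max_abs_diff(hf_output, keras_output)
print(f"max abs diff: {diff:.2e}")
print("match at atol=1e-3:", matches_within(hf_output, keras_output, atol=1e-3))
print("match at atol=1e-5:", matches_within(hf_output, keras_output, atol=1e-5))
```

Reporting the maximum absolute difference alongside the pass/fail result shows how far the outputs are from the stricter tolerance, which is usually more informative than the boolean alone.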
The numerics are good enough!
@Bond099 let's sync this with the latest changes and make sure to run our format script. I'm not exactly sure why none of our CI is running, but I don't think it ran.
Let's clean up the PR. Can we fix the following minor things?
- Pull in master.
- I still don't see `tie_weights = False` in the PR.
- Run formatting, etc.
- Let's wait for all the tests to run.
Looks like there are conflicts. Please pull in master and resolve them.
/gemini review
/gemini review
@Bond099, could you please resolve the review comments and update the PR? Thanks!
This PR is stale because it has been open for 28 days with no activity. It will be closed in 28 days if no further activity occurs. Thank you.