[WIP] add ModernBERT

Open SauravMaheshkar opened this issue 6 months ago • 1 comments

Ref: #2027

SauravMaheshkar avatar May 17 '25 03:05 SauravMaheshkar

Hey folks 👋🏼, could I get some help with this failing test? It fails when using the mixed_float16 dtype policy.

🐕 ❯ pytest keras_hub/src/models/modernbert/modernbert_backbone_test.py
=================================================================================================== test session starts ===================================================================================================
platform darwin -- Python 3.11.11, pytest-8.3.5, pluggy-1.6.0 -- /Users/sauravmaheshkar/dev/keras-hub/.venv/bin/python3
cachedir: .pytest_cache
rootdir: /Users/sauravmaheshkar/dev/keras-hub
configfile: pyproject.toml
plugins: cov-6.1.1
collected 4 items                                                                                                                                                                                                         

keras_hub/src/models/modernbert/modernbert_backbone_test.py::TestCase::test_session <- .venv/lib/python3.11/site-packages/tensorflow/python/framework/test_util.py SKIPPED (Not a test.)                            [ 25%]
keras_hub/src/models/modernbert/modernbert_backbone_test.py::ModernBertBackboneTest::test_backbone_basics FAILED                                                                                                    [ 50%]
keras_hub/src/models/modernbert/modernbert_backbone_test.py::ModernBertBackboneTest::test_saved_model SKIPPED (need --run_large option to run)                                                                      [ 75%]
keras_hub/src/models/modernbert/modernbert_backbone_test.py::ModernBertBackboneTest::test_session <- .venv/lib/python3.11/site-packages/tensorflow/python/framework/test_util.py PASSED                             [100%]

======================================================================================================== FAILURES =========================================================================================================
_______________________________________________________________________________________ ModernBertBackboneTest.test_backbone_basics _______________________________________________________________________________________

self = <keras_hub.src.models.modernbert.modernbert_backbone_test.ModernBertBackboneTest testMethod=test_backbone_basics>

    def test_backbone_basics(self):
>       self.run_backbone_test(
            cls=ModernBertBackbone,
            init_kwargs=self.init_kwargs,
            input_data=self.input_data,
            expected_output_shape=(2, 5, 8),
        )

keras_hub/src/models/modernbert/modernbert_backbone_test.py:25: 
_ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _ _
keras_hub/src/tests/test_case.py:490: in run_backbone_test
    self.run_precision_test(cls, init_kwargs, input_data)
keras_hub/src/tests/test_case.py:355: in run_precision_test
    self.assertEqual(policy.compute_dtype, sublayer.compute_dtype)
keras_hub/src/tests/test_case.py:57: in assertEqual
    super().assertEqual(x1, x2, msg=msg)
E   AssertionError: 
E   - float16
E   + float32
-------------------------------------------------------------------------------------------------- Captured stdout call ---------------------------------------------------------------------------------------------------
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 298ms/step
1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step
================================================================================================= short test summary info =================================================================================================
FAILED keras_hub/src/models/modernbert/modernbert_backbone_test.py::ModernBertBackboneTest::test_backbone_basics - AssertionError: 
- float16
+ float32
========================================================================================= 1 failed, 1 passed, 2 skipped in 3.88s ==========================================================================================
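
For context on the assertion above: run_precision_test compares the policy's compute dtype with each sublayer's compute_dtype, so at least one sublayer is still computing in float32 under mixed_float16. The traceback does not say which one. A minimal sketch of a helper that would list the offending sublayers is below. It is an illustration, not part of keras-hub: find_dtype_mismatches and the layer_a/layer_b/layer_c stand-ins are hypothetical, and the only assumption is that each layer object exposes name and compute_dtype attributes.

```python
from types import SimpleNamespace


def find_dtype_mismatches(layers, expected_compute_dtype):
    """Return (name, compute_dtype) pairs for layers that differ from the expected dtype."""
    return [
        (layer.name, layer.compute_dtype)
        for layer in layers
        if layer.compute_dtype != expected_compute_dtype
    ]


# Hypothetical stand-in objects so the sketch runs without Keras installed.
# With a real model, pass its sublayers instead.
layers = [
    SimpleNamespace(name="layer_a", compute_dtype="float16"),
    SimpleNamespace(name="layer_b", compute_dtype="float32"),
    SimpleNamespace(name="layer_c", compute_dtype="float16"),
]

mismatches = find_dtype_mismatches(layers, "float16")
print(mismatches)
```

A sublayer that shows up here has often been constructed without the parent's dtype being forwarded to it, or with a hard-coded dtype, though that is a guess until the specific layer is identified.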

SauravMaheshkar avatar May 17 '25 15:05 SauravMaheshkar

@SauravMaheshkar - are you still working on this?

abheesht17 avatar Jul 10 '25 19:07 abheesht17

Hey, apologies, I got busy with work. Will address the comments now.

SauravMaheshkar avatar Jul 13 '25 12:07 SauravMaheshkar

This PR is stale because it has been open for 14 days with no activity. It will be closed if no further activity occurs. Thank you.

github-actions[bot] avatar Oct 04 '25 02:10 github-actions[bot]

This PR was closed because it has been inactive for 56 days. Please reopen if you'd like to work on this further.

github-actions[bot] avatar Nov 01 '25 02:11 github-actions[bot]