
Port sentence embedding example to keras-core

Open abuelnasr0 opened this issue 2 years ago • 5 comments

Port the sentence_embeddings_with_sbert example to keras_core, and change its title to "Semantic similarity and text clustering using S-RoBERTa with keras_nlp".

abuelnasr0 avatar Jul 30 '23 22:07 abuelnasr0

Hi @abuelnasr0, are you willing to port it to keras-core?

kanpuriyanawab avatar Jul 31 '23 11:07 kanpuriyanawab

Hi @shivance, yes, I would love to. I will open a pull request in keras-io as soon as possible.

abuelnasr0 avatar Jul 31 '23 12:07 abuelnasr0

Thanks @abuelnasr0 !

mattdangerw avatar Aug 01 '23 19:08 mattdangerw

Hi @mattdangerw, I need help with something. I have ported the example to keras-core, but the triplet objective function example fails with the tensorflow backend, while it works fine with the torch and jax backends. The error occurs when I try to fit the model. Here is a Colab link: https://colab.research.google.com/gist/abuelnasr0/8aef29478ad1b3204f1c7e2b52af5451/copy-of-sentence_embeddings_with_sbert.ipynb You can jump to the triplet objective function section after running the setup.

The error:

[/usr/local/lib/python3.10/dist-packages/keras_core/src/layers/input_spec.py](https://localhost:8080/#) in assert_input_compatibility(input_spec, inputs, layer_name)
    179             continue
    180 
--> 181         shape = backend.standardize_shape(x.shape)
    182         ndim = len(shape)
    183         # Check ndim.

[/usr/local/lib/python3.10/dist-packages/keras_core/src/backend/common/variables.py](https://localhost:8080/#) in standardize_shape(shape, allow_dynamic_batch_size, allow_all_dynamic)
    412         if not hasattr(shape, "__iter__"):
    413             raise ValueError(f"Cannot convert '{shape}' to a shape.")
--> 414         shape = tuple(shape)
    415 
    416     for i, e in enumerate(shape):

[/usr/local/lib/python3.10/dist-packages/tensorflow/python/framework/tensor_shape.py](https://localhost:8080/#) in __iter__(self)
    929     """Returns `self.dims` if the rank is known, otherwise raises ValueError."""
    930     if self._dims is None:
--> 931       raise ValueError("Cannot iterate over a shape with unknown rank.")
    932     else:
    933       if self._v2_behavior:

ValueError: Cannot iterate over a shape with unknown rank.

Some insight into the error: it is raised while asserting input compatibility between input_spec and the model's inputs. It occurs when keras-core tries to get the shape of the first input, which is the padding mask of the anchor sentence.

I modified the keras-core code to print some information about the error, and here is what I got:

input[0]: Tensor("data_2:0", dtype=bool)

input[0].shape:

input_spec[0]: InputSpec(shape=(None, 512), ndim=2)

You can find that code here: https://colab.research.google.com/gist/abuelnasr0/cfd681e3d87a99f357b76965ec0bcb98/sentence_embeddings_with_sbert.ipynb
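To make the failure mode in the traceback concrete, here is a minimal pure-Python sketch of it. It does not use TensorFlow or keras-core: `FakeTensorShape` and `check_rank` are hypothetical stand-ins, and `standardize_shape` is a simplified imitation of the two lines shown in the traceback, not the library's real code. It only illustrates that calling `tuple(shape)` fails as soon as the shape object has no known rank, before any comparison against `ndim=2` can happen.

```python
class FakeTensorShape:
    """Hypothetical stand-in for a backend shape whose rank may be unknown."""

    def __init__(self, dims=None):
        # dims=None models a tensor with unknown rank.
        self._dims = None if dims is None else tuple(dims)

    def __iter__(self):
        # Mirrors the behaviour visible in the traceback: iterating over
        # a shape with unknown rank raises ValueError.
        if self._dims is None:
            raise ValueError("Cannot iterate over a shape with unknown rank.")
        return iter(self._dims)


def standardize_shape(shape):
    # Simplified imitation of the check shown in the traceback.
    if not hasattr(shape, "__iter__"):
        raise ValueError(f"Cannot convert '{shape}' to a shape.")
    return tuple(shape)  # this is the call that fails for unknown rank


def check_rank(shape, expected_ndim):
    # Hypothetical helper: compare the rank against what an input spec expects.
    dims = standardize_shape(shape)
    if len(dims) != expected_ndim:
        raise ValueError(f"Expected ndim={expected_ndim}, got ndim={len(dims)}.")
    return dims


if __name__ == "__main__":
    # A shape with known rank passes the check.
    print(check_rank(FakeTensorShape((None, 512)), expected_ndim=2))

    # A shape with unknown rank fails before the rank can be compared.
    try:
        check_rank(FakeTensorShape(), expected_ndim=2)
    except ValueError as err:
        print(f"ValueError: {err}")
```

Under these assumptions, the check can only succeed if the input tensor carries at least a known rank by the time it reaches the model.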

abuelnasr0 avatar Aug 03 '23 17:08 abuelnasr0

@mattdangerw It works now. I changed the way I load and preprocess the data, and surprisingly it worked :D. I wasn't even trying to fix it; I was just experimenting. I will edit the text and open a pull request soon.

abuelnasr0 avatar Aug 03 '23 18:08 abuelnasr0