keras-nlp
                                
                        Modular Natural Language Processing workflows with Keras
**Is your feature request related to a problem? Please describe.** Currently, KerasHub lacks a pure Keras 3 implementation of a CRF layer. This forces users to rely on external libraries...
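The training loss of a CRF layer hinges on the log-partition function computed by the forward algorithm. As a backend-agnostic illustration of that computation (plain Python with made-up scores, not a Keras layer or KerasHub code):

```python
import math

def log_sum_exp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def crf_log_partition(emissions, transitions):
    """Forward algorithm: log of the sum of exp(score) over all tag paths.

    emissions:   [T][K] per-step unary scores
    transitions: [K][K] score of moving from tag i to tag j
    """
    K = len(emissions[0])
    alpha = list(emissions[0])  # log-scores of paths ending at each tag
    for t in range(1, len(emissions)):
        alpha = [
            log_sum_exp([alpha[i] + transitions[i][j] for i in range(K)])
            + emissions[t][j]
            for j in range(K)
        ]
    return log_sum_exp(alpha)

def path_score(tags, emissions, transitions):
    """Unnormalized score of one concrete tag sequence."""
    s = emissions[0][tags[0]]
    for t in range(1, len(tags)):
        s += transitions[tags[t - 1]][tags[t]] + emissions[t][tags[t]]
    return s

# Tiny illustrative example: 3 time steps, 2 tags.
emissions = [[0.5, 1.0], [0.2, -0.3], [1.1, 0.4]]
transitions = [[0.1, -0.2], [0.3, 0.0]]
print(crf_log_partition(emissions, transitions))
```

The negative log-likelihood of a gold tag sequence is then `crf_log_partition(...) - path_score(gold, ...)`, which is what the requested layer would minimize.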
I was writing a converter for the DeBERTa-v3 models, and while testing I noticed that these models are only provided in the `pytorch_model.bin` format, not in the `model.safetensors` format. DeBERTa...
https://github.com/ZhuiyiTechnology/roformer RoFormer is a BERT-like model. It adds the now widely used RoPE (rotary position embedding) on top of BERT. In fact, this was the first practical application of RoPE...
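RoPE's defining property is that the attention score between a rotated query and key depends only on their relative offset, not their absolute positions. A minimal pure-Python sketch (illustrative dimensions and values, not RoFormer's actual implementation):

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotary position embedding: rotate consecutive dimension pairs
    by a position-dependent angle theta_i = pos / base**(i/d)."""
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = pos / base ** (i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

q = [0.3, -1.2, 0.7, 0.5]
k = [1.0, 0.4, -0.6, 0.2]
# Same relative offset (2) at different absolute positions gives the
# same q.k score, because rotations compose: R(m)q . R(n)k = q . R(n-m)k.
print(dot(rope(q, 5), rope(k, 3)), dot(rope(q, 12), rope(k, 10)))
```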
**Is your feature request related to a problem? Please describe.** [Swin-UNETR](https://arxiv.org/abs/2201.01266) was originally designed for 3D medical image segmentation, using Swin Transformers for effective feature extraction. However, there is currently...
**Describe the bug** The variable `attention_scores` introduced at line 111 is always `None`. **To Reproduce** Since it is an internal variable, I copied the subclass CMHA into this script: https://colab.research.google.com/drive/1ZUS4mjDQktovKiJ8TQ7zYtm4PGjesXvG?usp=sharing...
```python
import os
os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"  # Chinese Hugging Face mirror
os.environ["KERAS_BACKEND"] = "torch"
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

model_name = 'NousResearch/Meta-Llama-3.1-8B'

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, AutoConfig
import keras

hf_model = AutoModelForCausalLM.from_pretrained(model_name, device_map="cuda:0", ...
```
**Describe the bug** Models downloaded through KerasHub fail to deserialize. **To Reproduce** It is not possible to reproduce this bug in Colab. I suspect there is some library version incompatibility;...
Hi, as Llama 3 is a popular model, it would be great if we could have a script that exports a Llama Keras checkpoint to HF. The code already exists...
Flash attention support has been added to Keras 3. https://github.com/keras-team/keras/blob/25d6d80a6ecd31f0da52c325cd16dbe4a29b7329/keras/src/layers/attention/multi_head_attention.py#L55 However, some of the models implemented in KerasHub override the `_compute_attention()` method, which contains the flash-attention enabling mechanism....
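The failure mode described above can be shown with a stripped-down stand-in (these classes are illustrative, not KerasHub's actual code): an override written before the base class gained a fast path silently bypasses that fast path.

```python
class BaseAttention:
    """Stand-in for a base layer whose _compute_attention gained a fast path."""
    def _compute_attention(self, q, k):
        if getattr(self, "use_flash", False):
            return "flash"   # fast path added later in the base class
        return "naive"

class CustomAttention(BaseAttention):
    # Override written before the fast path existed: it never checks
    # use_flash, so the base class's new behavior is unreachable.
    def _compute_attention(self, q, k):
        return "naive"

base, custom = BaseAttention(), CustomAttention()
base.use_flash = custom.use_flash = True
print(base._compute_attention(None, None))    # "flash"
print(custom._compute_attention(None, None))  # "naive" -- flash never triggers
```

This is why each KerasHub model that overrides the method needs its override updated (or removed) for flash attention to take effect.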
vocabulary size 6400

```python
text = "Are you OK? "
start = time.time()
for i in range(10):
    tokenizer.tokenize(text + str(i))
end = time.time()
print(end - start)
```

3.8366940021514893 seconds
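One-shot timings like the above fold in warm-up costs (lazy initialization, caches, first-call tracing). A small benchmarking helper that separates warm-up from measurement may give a fairer per-call figure; the `tokenize` stand-in below is `str.split`, not the real KerasHub tokenizer.

```python
import time

def bench(fn, inputs, warmup=2, repeats=10):
    """Mean seconds per call of fn over inputs, after warm-up runs
    that absorb one-time costs; uses time.perf_counter for timing."""
    for _ in range(warmup):
        for x in inputs:
            fn(x)
    start = time.perf_counter()
    for _ in range(repeats):
        for x in inputs:
            fn(x)
    return (time.perf_counter() - start) / (repeats * len(inputs))

# Stand-in workload; swap in tokenizer.tokenize to profile the real thing.
tokenize = str.split
texts = ["Are you OK? " + str(i) for i in range(10)]
per_call = bench(tokenize, texts)
print(f"{per_call:.2e} s/call")
```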