keras-nlp
                                
                        Modular Natural Language Processing workflows with Keras
**Is your feature request related to a problem? Please describe.** Currently, KerasHub lacks a pure Keras 3 implementation of a CRF layer. This forces users to rely on external libraries...
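The training loss of a CRF layer hinges on the log-partition function computed by the forward algorithm. As a backend-agnostic illustration of that computation (plain Python with made-up scores, not a Keras layer or KerasHub code):

```python
import math

def log_sum_exp(xs):
    """Numerically stable log(sum(exp(x) for x in xs))."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))

def crf_log_partition(emissions, transitions):
    """Forward algorithm: log of the sum of exp(score) over all tag paths.

    emissions:   [T][K] per-step unary scores
    transitions: [K][K] score of moving from tag i to tag j
    """
    K = len(emissions[0])
    alpha = list(emissions[0])  # log-scores of paths ending at each tag
    for t in range(1, len(emissions)):
        alpha = [
            log_sum_exp([alpha[i] + transitions[i][j] for i in range(K)])
            + emissions[t][j]
            for j in range(K)
        ]
    return log_sum_exp(alpha)

def path_score(tags, emissions, transitions):
    """Unnormalized score of one concrete tag sequence."""
    s = emissions[0][tags[0]]
    for t in range(1, len(tags)):
        s += transitions[tags[t - 1]][tags[t]] + emissions[t][tags[t]]
    return s

# Tiny illustrative example: 3 time steps, 2 tags.
emissions = [[0.5, 1.0], [0.2, -0.3], [1.1, 0.4]]
transitions = [[0.1, -0.2], [0.3, 0.0]]
print(crf_log_partition(emissions, transitions))
```

The negative log-likelihood of a gold tag sequence is then `crf_log_partition(...) - path_score(gold, ...)`, which is what the requested layer would minimize.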
I was writing a converter for the DeBERTa-v3 models, and while testing I noticed that these models are only provided in the `pytorch_model.bin` format, not in the `model.safetensors` format. DeBERTa...
https://github.com/ZhuiyiTechnology/roformer RoFormer is a BERT-like model. It adds the now widely used RoPE (rotary position embedding) on top of BERT. In fact, this was the first practical application of RoPE...
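RoPE's defining property is that the attention score between a rotated query and key depends only on their relative offset, not their absolute positions. A minimal pure-Python sketch (illustrative dimensions and values, not RoFormer's actual implementation):

```python
import math

def rope(vec, pos, base=10000.0):
    """Rotary position embedding: rotate consecutive dimension pairs
    by a position-dependent angle theta_i = pos / base**(i/d)."""
    d = len(vec)
    out = []
    for i in range(0, d, 2):
        theta = pos / base ** (i / d)
        c, s = math.cos(theta), math.sin(theta)
        x, y = vec[i], vec[i + 1]
        out += [x * c - y * s, x * s + y * c]
    return out

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

q = [0.3, -1.2, 0.7, 0.5]
k = [1.0, 0.4, -0.6, 0.2]
# Same relative offset (2) at different absolute positions gives the
# same q.k score, because rotations compose: R(m)q . R(n)k = q . R(n-m)k.
print(dot(rope(q, 5), rope(k, 3)), dot(rope(q, 12), rope(k, 10)))
```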
**Is your feature request related to a problem? Please describe.** [Swin-UNETR](https://arxiv.org/abs/2201.01266) was originally designed for 3D medical image segmentation, using Swin Transformers for effective feature extraction. However, there is currently...
**Describe the bug** The variable `attention_scores` introduced at line 111 is always `None`. **To Reproduce** Since it is an internal variable, I copied the subclass CMHA into this script: https://colab.research.google.com/drive/1ZUS4mjDQktovKiJ8TQ7zYtm4PGjesXvG?usp=sharing...
```python
import os
os.environ["HF_ENDPOINT"] = "https://hf-mirror.com"  # Chinese Hugging Face mirror
os.environ["KERAS_BACKEND"] = "torch"
os.environ["CUDA_VISIBLE_DEVICES"] = "1"

model_name = 'NousResearch/Meta-Llama-3.1-8B'

import torch
from transformers import AutoTokenizer, AutoModelForCausalLM, AutoConfig
import keras

hf_model = AutoModelForCausalLM.from_pretrained(model_name, device_map="cuda:0", ...
```
**Describe the bug** Models downloaded through KerasHub fail to deserialize. **To Reproduce** It is not possible to reproduce this bug in Colab. I suspect there is some library version incompatibility;...
Hi, as Llama 3 is a popular model, it would be great if we could have a script that exports a Llama Keras checkpoint to HF. The code already exists...
Flash attention support has been added to Keras 3. https://github.com/keras-team/keras/blob/25d6d80a6ecd31f0da52c325cd16dbe4a29b7329/keras/src/layers/attention/multi_head_attention.py#L55 However, some of the models implemented in KerasHub override the `_compute_attention()` method, which contains the flash-attention enabling mechanism....
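The failure mode described above can be shown with a stripped-down stand-in (these classes are illustrative, not KerasHub's actual code): an override written before the base class gained a fast path silently bypasses that fast path.

```python
class BaseAttention:
    """Stand-in for a base layer whose _compute_attention gained a fast path."""
    def _compute_attention(self, q, k):
        if getattr(self, "use_flash", False):
            return "flash"   # fast path added later in the base class
        return "naive"

class CustomAttention(BaseAttention):
    # Override written before the fast path existed: it never checks
    # use_flash, so the base class's new behavior is unreachable.
    def _compute_attention(self, q, k):
        return "naive"

base, custom = BaseAttention(), CustomAttention()
base.use_flash = custom.use_flash = True
print(base._compute_attention(None, None))    # "flash"
print(custom._compute_attention(None, None))  # "naive" -- flash never triggers
```

This is why each KerasHub model that overrides the method needs its override updated (or removed) for flash attention to take effect.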
vocabulary size 6400

```python
text = "Are you OK? "
start = time.time()
for i in range(10):
    tokenizer.tokenize(text + str(i))
end = time.time()
print(end - start)
```

3.8366940021514893 seconds
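One-shot timings like the above fold in warm-up costs (lazy initialization, caches, first-call tracing). A small benchmarking helper that separates warm-up from measurement may give a fairer per-call figure; the `tokenize` stand-in below is `str.split`, not the real KerasHub tokenizer.

```python
import time

def bench(fn, inputs, warmup=2, repeats=10):
    """Mean seconds per call of fn over inputs, after warm-up runs
    that absorb one-time costs; uses time.perf_counter for timing."""
    for _ in range(warmup):
        for x in inputs:
            fn(x)
    start = time.perf_counter()
    for _ in range(repeats):
        for x in inputs:
            fn(x)
    return (time.perf_counter() - start) / (repeats * len(inputs))

# Stand-in workload; swap in tokenizer.tokenize to profile the real thing.
tokenize = str.split
texts = ["Are you OK? " + str(i) for i in range(10)]
per_call = bench(tokenize, texts)
print(f"{per_call:.2e} s/call")
```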