
[New Model]: OpenELM-3B

Open Isotr0py opened this issue 10 months ago • 1 comments

The model to consider.

apple/OpenELM-3B

The closest model vllm already supports.

No response

What's your difficulty of supporting the model you want?

OpenELM models use a different number of attention heads in each layer, so PagedAttention would need a KV cache whose shape varies per layer:

  "num_kv_heads": [
    3,
    3,
    3,
    3,
    3,
    4,
    4,
    4,
    4,
    4,
    4,
    4,
    5,
    5,
    5,
    5
  ],
  "num_query_heads": [
    12,
    12,
    12,
    12,
    12,
    16,
    16,
    16,
    16,
    16,
    16,
    16,
    20,
    20,
    20,
    20
  ],
  "num_transformer_layers": 16,
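
To make the difficulty concrete, here is a rough sketch of how the size of one paged KV-cache block would differ from layer to layer under the config above. The head dimension (64), block size (16 tokens), and fp16 storage are assumptions for illustration and are not taken from the issue, and `kv_block_bytes` is a hypothetical helper, not vLLM's actual allocator.

```python
# Per-layer head counts copied from the config fragment above.
NUM_KV_HEADS = [3] * 5 + [4] * 7 + [5] * 4
NUM_QUERY_HEADS = [12] * 5 + [16] * 7 + [20] * 4

# Assumptions (not stated in the issue): head size, tokens per block, fp16.
HEAD_DIM = 64
BLOCK_SIZE = 16
DTYPE_BYTES = 2


def kv_block_bytes(num_kv_heads: int,
                   head_dim: int = HEAD_DIM,
                   block_size: int = BLOCK_SIZE,
                   dtype_bytes: int = DTYPE_BYTES) -> int:
    """Bytes for one cache block in one layer (keys and values together)."""
    return 2 * block_size * num_kv_heads * head_dim * dtype_bytes


# Block size in bytes for each of the 16 layers.
per_layer = [kv_block_bytes(h) for h in NUM_KV_HEADS]

# Query heads sharing each KV head (grouped-query attention ratio).
gqa_group_sizes = [q // kv for q, kv in zip(NUM_QUERY_HEADS, NUM_KV_HEADS)]

# A cache that assumes one shape for all layers has to pad to the largest.
padded_total = max(per_layer) * len(per_layer)
exact_total = sum(per_layer)

if __name__ == "__main__":
    for layer, (heads, size) in enumerate(zip(NUM_KV_HEADS, per_layer)):
        print(f"layer {layer:2d}: kv_heads={heads} block_bytes={size}")
    print("GQA group sizes:", sorted(set(gqa_group_sizes)))
    print("padded total:", padded_total, "exact total:", exact_total)
```

Under these assumptions a cache block comes in three different sizes across layers, so a cache that uses a single `num_kv_heads` for every layer would either pad each layer to the largest count or need per-layer shapes. The query-to-KV ratio stays constant across layers, so only the head count varies.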

Isotr0py, May 02 '24 04:05

I made this feature request here: https://github.com/vllm-project/vllm/discussions/4350

bks5881, May 03 '24 08:05

This issue was closed and marked as completed, but I don't see it referencing a PR, nor do I see the model listed as a supported model.

elatt, May 10 '24 20:05