Isotr0py issues

Results: 20 issues of Isotr0py

[New Model]: OpenELM-3B

### The model to consider. [apple/OpenELM-3B](https://huggingface.co/apple/OpenELM-3B) ### The closest model vllm already supports. _No response_ ### What's your difficulty of supporting the model you want? OpenELM models have a dynamic...

new model

Add netcdf4 to goci2 optional dependency in `pyproject.toml`

- This PR adds a `goci2` optional dependency with `netCDF4 >= 1.1.8`, because the GOCI2 reader uses `NetCDF4FileHandler`.

[Hardware][Intel] Add LoRA adapter support for CPU backend

This PR adds implementations of `bgmv` and `dispatch_bgmv_low_level` in PyTorch. They work on devices that do not meet the `compute capability >= 8.0` requirement to launch the `punica` kernel. ### Features -...

x86 CPU

[Model] Initialize Fuyu-8B support

Fixes #2262. This PR adds support for the [persimmon-8b](https://huggingface.co/adept/persimmon-8b-base) and [fuyu-8B](https://huggingface.co/adept/fuyu-8b) models. **Updated TODO:** - [x] Refactor to support new vision API -...

[Core] Support loading GGUF model

Related issue: #1002 **Features:** - This PR adds support for loading GGUF-format models - This PR will also add `gguf` to the requirements. -...

Fix missing methods for Fuyu

# What does this PR do? This PR adds missing methods in `modeling_fuyu.py`, such as `get_output_embeddings`. ## Before submitting - [ ] This PR fixes a typo or improves the...

Add chat_template for tokenizer extracted from GGUF model

# What does this PR do? - Currently, the tokenizer extracted from a GGUF model is missing a chat_template. - This PR adds the missing chat_template through a minor fix when extracting the tokenizer from GGUF...

🚨 Support dequantization for most GGML types

# What does this PR do? ### This PR is waiting for a `gguf` package version update and is still a work in progress. - This PR aims to add dequantization support for...

Quantization

[Bugfix][Kernel] Add `IQ1_M` quantization implementation to GGUF kernel

FIX #8321 - This PR adds the `IQ1_M` quantization implementation to the GGUF kernel, which is not included in...

[Bugfix] Fix InternVL2 inference with various num_patches

FIX #8361 FIX #8369 **TODO** - [x] Add test to cover these cases.

ready