
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

Results: 95 intel-extension-for-transformers issues

ValueError: Unsupported huggingface version: 4.34.1. You may need to upgrade your SDK version (pip install -U sagemaker) for newer huggingface versions. Supported huggingface version(s): 4.6.1, 4.10.2, 4.11.0, 4.12.3, 4.17.0, 4.26.0,...
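The error above comes from an exact-match whitelist of transformers versions inside an older SageMaker SDK. A stdlib-only sketch of that kind of check (the function and list names here are illustrative, not SageMaker's actual code):

```python
# Illustrative exact-match version whitelist, the kind of check that
# rejects transformers 4.34.1 until the SDK itself is upgraded.
# SUPPORTED_VERSIONS and validate_version are hypothetical names.
SUPPORTED_VERSIONS = ["4.6.1", "4.10.2", "4.11.0", "4.12.3", "4.17.0", "4.26.0"]

def validate_version(version: str) -> None:
    # Exact string membership: even a newer release fails validation,
    # which is why the fix is `pip install -U sagemaker`, not pinning.
    if version not in SUPPORTED_VERSIONS:
        raise ValueError(
            f"Unsupported huggingface version: {version}. "
            f"Supported huggingface version(s): {', '.join(SUPPORTED_VERSIONS)}"
        )

validate_version("4.26.0")  # passes silently
try:
    validate_version("4.34.1")
except ValueError as err:
    print(err)
```

Upgrading the SDK replaces the whitelist with one that includes the newer transformers release.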

```python
from transformers import AutoTokenizer, TextStreamer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM, WeightOnlyQuantConfig

model_name = "Intel/neural-chat-7b-v3-3"
# for int8, should set weight_dtype="int8"
config = WeightOnlyQuantConfig(compute_dtype="bf16", weight_dtype="int4")
prompt = "Once upon a time,...
```
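For context on what `weight_dtype="int4"` with `compute_dtype="bf16"` means, here is a stdlib-only sketch of symmetric 4-bit weight-only quantization: weights are stored as integers in [-8, 7] plus a scale, and dequantized back to floating point for compute. This illustrates the arithmetic only and is not the library's implementation:

```python
# Illustrative symmetric int4 weight-only quantization: store 4-bit
# integers plus a per-tensor scale; dequantize before compute.
def quantize_int4(weights):
    # Per-tensor scale maps the largest |w| onto the int4 limit 7.
    scale = max(abs(w) for w in weights) / 7.0
    q = [max(-8, min(7, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    # Reconstruct approximate floating-point weights for compute.
    return [v * scale for v in q]

w = [0.31, -0.54, 0.12, 0.70]
q, s = quantize_int4(w)
w_hat = dequantize(q, s)  # approximate reconstruction of w
print(q)      # small integers in [-8, 7]
print(w_hat)
```

The storage saving comes from the 4-bit integers; the quality/speed trade-off comes from the precision chosen for the dequantized compute (bf16 here).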

Does the LLaVA part work? https://github.com/intel/intel-extension-for-transformers/tree/main/intel_extension_for_transformers/transformers/modeling/llava_models If so, is it optimized for Intel devices, and are there any examples? Thanks for building this library. I have seen the...

Hi, I want to be able to stream the model output to targets other than stdout. The current streamer, TextStreamer, only works with stdout as I understand it. I tried using...
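On streaming to targets other than stdout: transformers also ships `TextIteratorStreamer`, which puts generated text on a queue instead of printing it. A stdlib-only sketch of that pattern (this class is a hypothetical stand-in, not the transformers implementation):

```python
import queue
import threading

class IteratorStreamer:
    """Collects generated text chunks on a queue so any consumer
    (not just stdout) can iterate over them as they arrive."""
    _DONE = object()  # sentinel marking end of generation

    def __init__(self):
        self._queue = queue.Queue()

    def put(self, text):
        # Called by the producer, e.g. the generation loop.
        self._queue.put(text)

    def end(self):
        # Called once generation finishes.
        self._queue.put(self._DONE)

    def __iter__(self):
        while True:
            item = self._queue.get()
            if item is self._DONE:
                return
            yield item

streamer = IteratorStreamer()

def fake_generate():
    # Stand-in for model.generate(..., streamer=streamer),
    # which would normally run in a background thread.
    for chunk in ["Once ", "upon ", "a ", "time"]:
        streamer.put(chunk)
    streamer.end()

threading.Thread(target=fake_generate).start()
collected = "".join(streamer)  # consume from any sink, not just stdout
print(collected)
```

With the real library, `model.generate` runs in a worker thread while the main thread iterates the streamer and forwards chunks to a websocket, file, or UI.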

Is there a specific version of the openai package that is aligned with the OpenAI-compatible interfaces offered by NeuralChat? I am currently testing with **1.12.0** but encountering a **422 Unprocessable...


### Problem Summary and status of similar tests

I am having trouble getting NeuralChat to work with my **Intel Data Center GPU Flex 170**. Below is my procedure with the...


## Type of Change
feature
API not changed

## Description
Support user management in backend server

## Expected Behavior & Potential Risk
Support user management in backend server

## How...


Running the short example found in the main doc: https://github.com/intel/intel-extension-for-transformers/blob/main/docs/qloracpu.md But it crashes here:

```
model = prepare_model_for_kbit_training(model, use_gradient_checkpointing = True)
  File "/home/xtof/nvme/envs/cpuqlora/lib/python3.9/site-packages/peft/utils/other.py", line 95, in prepare_model_for_kbit_training
    for name, param...
```

I am using an Arc A770 GPU on Windows 11.
1. I have installed WSL2
2. I have installed miniconda
3. I followed the instruction "pip install intel-extension-for-transformers"
4. Run the example...

## Type of Change
Add NeuralChat example
API not changed

## Description
Add Multi-Socket LLM inference example for NeuralChat. Related DeepSpeed PR: https://github.com/microsoft/DeepSpeed/pull/4750 (not merged yet)

## Expected Behavior &...
