
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms ⚡

Results: 95 intel-extension-for-transformers issues (sorted by recently updated)

## Type of Change

Enable lm-eval for llama.cpp models; API not changed.

## Description

Refer to the lm-eval [official code](https://github.com/EleutherAI/lm-evaluation-harness/blob/main/lm_eval/models/gguf.py) and [llama-cpp-python](https://github.com/abetlen/llama-cpp-python/tree/main).

### Improvements:
1. load llama.cpp model directly when...
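For context on what such a backend has to provide: lm-eval scores a model mainly through a `loglikelihood` contract, summing per-token log-probabilities of a continuation and reporting whether it was the greedy choice. A minimal, self-contained sketch of that contract, where a toy unigram table stands in for a real llama.cpp instance (all names here are illustrative, not the library's API):

```python
import math

# Hypothetical stand-in for a llama.cpp model: a unigram distribution.
UNIGRAM = {"the": 0.5, "cat": 0.3, "sat": 0.2}

def token_logprob(token):
    """Log-probability of a single token under the toy model."""
    return math.log(UNIGRAM[token])

def loglikelihood(context_tokens, continuation_tokens):
    """lm-eval-style scoring: return (sum of continuation log-probs,
    whether every continuation token was the greedy/argmax choice)."""
    total = sum(token_logprob(t) for t in continuation_tokens)
    greedy_token = max(UNIGRAM, key=UNIGRAM.get)
    is_greedy = all(t == greedy_token for t in continuation_tokens)
    return total, is_greedy
```

A real implementation would forward the prompt through llama.cpp and read per-token logprobs from the model output instead of a lookup table, but the accumulation logic is the same.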

requirements.txt for intel_extension_for_transformers/neural_chat/examples/helloworld

## Type of Change

Others: dependency required/checked using requirements.txt for intel_extension_for_transformers/neural_chat/examples/helloworld; no API changed.

## Description

Dependency required/check using requirements.txt for...

**backend w/ RAG** I already installed the requirements:
```
pip install -r ~/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/pipeline/plugins/retrieval/requirements.txt
```
But the RAG plugin still fails to initialize:
```
2024-05-10 04:32:14,308 - intel_extension_for_transformers.neural_chat.utils.error_utils - ERROR - neuralchat error: Generic...
```

Failed to call this API on my Ubuntu system; not sure what the issue is. ![image](https://github.com/intel/intel-extension-for-transformers/assets/81341556/dc0fd912-a9bf-4a4f-9afd-7e008ebe2a17) I suspect the dependency versions are not correct on my Ubuntu system. But there is...

Labels: NeuralChat, aitce

I tried starting with the conda install from the installation.md...
```
conda create -n hf
conda activate hf
conda install -c intel intel_extension_for_transformers
. . .
```
...but it's incomplete....

Hi, this may not be a NeuralChat issue, but it is quite difficult to pinpoint, hence I'm posting it here. I'm running the RAG mode on my data after...

```python
from transformers import TextIteratorStreamer, AutoTokenizer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM, RtnConfig

model_name = "./models/chatglm3-6b"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=RtnConfig(
        bits=4,
        compute_dtype="int8",
        weight_dtype="int4_fullrange",
        use_neural_speed=True,
    ),
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
history = [{"role": "system",...
```
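The `RtnConfig(bits=4, ...)` in the snippet above configures round-to-nearest (RTN) weight-only quantization. As a rough, self-contained illustration of the idea, here is a symmetric per-tensor RTN sketch; the library's actual kernels work per-group on packed int4 tensors, so this is only a conceptual model, not the implementation:

```python
def rtn_quantize(weights, bits=4):
    """Round-to-nearest symmetric quantization to a signed `bits`-wide range."""
    qmax = 2 ** (bits - 1) - 1            # 7 for signed 4-bit
    scale = max(abs(w) for w in weights) / qmax or 1.0
    # Scale each weight into the integer range, round, and clamp.
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def rtn_dequantize(q, scale):
    """Recover approximate float weights from the quantized values."""
    return [v * scale for v in q]
```

RTN is attractive because it needs no calibration data: each weight is mapped independently to its nearest representable value, trading some accuracy for a very cheap one-shot conversion.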

I tried to run the TTS (English and Multi Language Text-to-Speech) on my PC. https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/neural_chat/pipeline/plugins/audio/README.md It raised the `cannot import name 'WeightOnlyQuantizedLinear'` error shown below.
```shell
~/WorkSpace/TTS$ python eng-tts.py...
```

Bumps [torch](https://github.com/pytorch/pytorch) from 1.13.1 to 2.2.0. Release notes sourced from torch's releases: PyTorch 2.2 (FlashAttention-v2, AOTInductor). PyTorch 2.2 Release Notes: highlights, backwards incompatible changes, deprecations, new features, improvements, bug fixes...

Labels: dependencies, python

Bumps [langchain-community](https://github.com/langchain-ai/langchain) from 0.0.27 to 0.2.9. Release notes sourced from langchain-community's releases. langchain-community==0.2.9, changes since langchain-community==0.2.7: infra: add min version testing to PR test flow (#24358); community[patch]: Release 0.2.9 (#24453)...

Labels: dependencies, python