intel-extension-for-transformers icon indicating copy to clipboard operation
intel-extension-for-transformers copied to clipboard

⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡

Results 95 intel-extension-for-transformers issues
Sort by recently updated
recently updated
newest added
trafficstars

Loading saved model runs into following error It also takes a very long time to run and save quantized models. ``` 2024-03-21 08:48:58 [INFO] loading weights file models/4_bit_llama2-rtn/model.safetensors 2024-03-21 08:48:58...

1. Data augmentation: retrieval dataset construction, include (1) Context to Question and Mine Hard Negatives, (2) Context, Question to Ground Truth. 2. Retrieval evaluation: MRR (Mean reciprocal rank) and Hit...

NeuralChat

Ubuntu22.04 Python 3.11.9 Trying to install dependencies for NeuralChat: pip install -r requirements_cpu.txt Error: Using cached svgwrite-1.4.3-py3-none-any.whl (67 kB) Building wheels for collected packages: cchardet, lm_eval Building wheel for cchardet...

## Type of Change feature API changed ## Description Enable RAG's table extraction functionality for pdf Enable RAG's table summary functionality, with three modes to choose: [none, title, llm] ##...

NeuralChat

## Type of Change feature API not changed ## Description Add new customized chabot UI ## Expected Behavior & Potential Risk ![image](https://github.com/intel/intel-extension-for-transformers/assets/104267837/409e6bab-959c-4a64-9f47-25148e83aa6f) ![image](https://github.com/intel/intel-extension-for-transformers/assets/104267837/1c817e4b-db9d-4689-8f4e-f40e8ce61ac3) ## How has this PR been tested?...

NeuralChat

## Type of Change feature API not changed ## Description Support pptx format for RAG ## Expected Behavior & Potential Risk User can use pptx format file for RAG ##...

NeuralChat

## Type of Change feature ## Description do FP8 quantization using habana ## Expected Behavior & Potential Risk the expected behavior that triggered by this PR ## How has this...

habana

Hi itrex team, thanks for the great work! I've been experimenting with the Weight Only Quantization (WOQ) from ITREX, following the provided examples in [weightonlyquant.md#example-for-cpu-device](https://github.com/intel/intel-extension-for-transformers/blob/main/docs/weightonlyquant.md#example-for-cpu-device). The results are promising. Now...

## Type of Change feature API added: - /v1/assist/chat - /v1/assist/decode - /v1/assist/data_transfer ## Description Support Assisted Generation on Multi-nodes. The code framework is implemented. Details will be completed by...

draft

neuralchat already synced RESTful API with latest OpenAI protocol via 2e1c79d9b99db8bc004d67235fc6df51ca1d238e But neuralchat frontend don't have field to assign system prompt. **backend log** ``` INFO: 127.0.0.1:58004 - "POST /v1/chat/completions HTTP/1.1"...