
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms ⚡

Results: 95 intel-extension-for-transformers issues (sorted by recently updated)

## Type of Change

Enable lm-eval for llama.cpp models; API not changed.

## Description

Refer to the lm-eval [official code](https://github.com/EleutherAI/lm-evaluation-harness/blob/main/lm_eval/models/gguf.py) and [llama-cpp-python](https://github.com/abetlen/llama-cpp-python/tree/main).

### Improvements:
1. load llama.cpp model directly when...
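For context on what such a backend has to provide: lm-eval scores a model mainly through a `loglikelihood` contract, summing per-token log-probabilities of a continuation and reporting whether it was the greedy choice. A minimal, self-contained sketch of that contract, where a toy unigram table stands in for a real llama.cpp instance (all names here are illustrative, not the library's API):

```python
import math

# Hypothetical stand-in for a llama.cpp model: a unigram distribution.
UNIGRAM = {"the": 0.5, "cat": 0.3, "sat": 0.2}

def token_logprob(token):
    """Log-probability of a single token under the toy model."""
    return math.log(UNIGRAM[token])

def loglikelihood(context_tokens, continuation_tokens):
    """lm-eval-style scoring: return (sum of continuation log-probs,
    whether every continuation token was the greedy/argmax choice)."""
    total = sum(token_logprob(t) for t in continuation_tokens)
    greedy_token = max(UNIGRAM, key=UNIGRAM.get)
    is_greedy = all(t == greedy_token for t in continuation_tokens)
    return total, is_greedy
```

A real implementation would forward the prompt through llama.cpp and read per-token logprobs from the model output instead of a lookup table, but the accumulation logic is the same.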

requirements.txt for intel_extension_for_transformers/neural_chat/examples/helloworld

## Type of Change

Others: dependency required/checked using requirements.txt for intel_extension_for_transformers/neural_chat/examples/helloworld; no API changed.

## Description

Dependency required/check using requirements.txt for...

**backend w/ RAG** I already installed the requirements:
```
pip install -r ~/intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/pipeline/plugins/retrieval/requirements.txt
```
But the RAG plugin still fails to initialize:
```
2024-05-10 04:32:14,308 - intel_extension_for_transformers.neural_chat.utils.error_utils - ERROR - neuralchat error: Generic...
```

Failed to call this API on my Ubuntu system; not sure what the issue is. ![image](https://github.com/intel/intel-extension-for-transformers/assets/81341556/dc0fd912-a9bf-4a4f-9afd-7e008ebe2a17) I suspect the dependency versions are not correct on my Ubuntu system. But there is...

Labels: NeuralChat, aitce

I tried starting with the conda install from the installation.md...
```
conda create -n hf
conda activate hf
conda install -c intel intel_extension_for_transformers
. . .
```
...but it's incomplete....

Hi, this may not be a NeuralChat issue, but it is quite difficult to pinpoint, hence I'm posting it here. I'm running the RAG mode on my data after...

```python
from transformers import TextIteratorStreamer, AutoTokenizer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM, RtnConfig

model_name = "./models/chatglm3-6b"
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=RtnConfig(
        bits=4,
        compute_dtype="int8",
        weight_dtype="int4_fullrange",
        use_neural_speed=True,
    ),
    trust_remote_code=True,
)
tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)
history = [{"role": "system",...
```
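The `RtnConfig(bits=4, ...)` in the snippet above configures round-to-nearest (RTN) weight-only quantization. As a rough, self-contained illustration of the idea, here is a symmetric per-tensor RTN sketch; the library's actual kernels work per-group on packed int4 tensors, so this is only a conceptual model, not the implementation:

```python
def rtn_quantize(weights, bits=4):
    """Round-to-nearest symmetric quantization to a signed `bits`-wide range."""
    qmax = 2 ** (bits - 1) - 1            # 7 for signed 4-bit
    scale = max(abs(w) for w in weights) / qmax or 1.0
    # Scale each weight into the integer range, round, and clamp.
    q = [max(-qmax - 1, min(qmax, round(w / scale))) for w in weights]
    return q, scale

def rtn_dequantize(q, scale):
    """Recover approximate float weights from the quantized values."""
    return [v * scale for v in q]
```

RTN is attractive because it needs no calibration data: each weight is mapped independently to its nearest representable value, trading some accuracy for a very cheap one-shot conversion.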

I tried to run the TTS (English and Multi Language Text-to-Speech) on my PC. https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/neural_chat/pipeline/plugins/audio/README.md It raised the `cannot import name 'WeightOnlyQuantizedLinear'` error shown below.
```shell
~/WorkSpace/TTS$ python eng-tts.py...
```

Bumps [torch](https://github.com/pytorch/pytorch) from 1.13.1 to 2.2.0. Release notes sourced from torch's releases: PyTorch 2.2 (FlashAttention-v2, AOTInductor). PyTorch 2.2 Release Notes: highlights, backwards incompatible changes, deprecations, new features, improvements, bug fixes...

Labels: dependencies, python

Bumps [langchain-community](https://github.com/langchain-ai/langchain) from 0.0.27 to 0.2.9. Release notes sourced from langchain-community's releases. langchain-community==0.2.9, changes since langchain-community==0.2.7: infra: add min version testing to PR test flow (#24358); community[patch]: Release 0.2.9 (#24453)...

Labels: dependencies, python