intel-extension-for-transformers
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
## Type of Change Waiting for INC support to export the compressed model. ## Description Detailed description. JIRA ticket: xxx ## Expected Behavior & Potential Risk The expected behavior triggered by this...
I noticed that running a chat completion through neuralchat_server is very slow compared to loading the model with AutoModelForCausalLM and then calling generate (after applying the chat_template). - In both cases, I'm...
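When comparing the two paths, it helps to time them identically. Below is a minimal timing-harness sketch; the two `generate_via_*` functions are hypothetical stand-ins for the server request and the direct `model.generate` call, not the project's actual API:

```python
import time

def mean_seconds(fn, *args, warmup=1, runs=3):
    """Return the mean wall-clock time of fn(*args) over several runs."""
    for _ in range(warmup):          # warm caches before measuring
        fn(*args)
    start = time.perf_counter()
    for _ in range(runs):
        fn(*args)
    return (time.perf_counter() - start) / runs

# Hypothetical stand-ins: in the real comparison these would hit the
# neuralchat_server endpoint and call model.generate(...) respectively.
def generate_via_server(prompt):
    return prompt.upper()

def generate_via_automodel(prompt):
    return prompt.upper()

server_t = mean_seconds(generate_via_server, "hello")
direct_t = mean_seconds(generate_via_automodel, "hello")
print(f"server: {server_t:.6f}s  direct: {direct_t:.6f}s")
```

Using the same warmup and run count for both paths avoids attributing one-time startup cost (model load, server spin-up) to per-request latency.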
I am encountering an issue while attempting to load the LLaMA-3.2-3B model. The error message I receive is as follows: RuntimeError: "normal_kernel_cpu" not implemented for 'Char'. Could you please provide...
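The 'Char' in that message is how PyTorch prints the int8 dtype, so the error is consistent with a float-only random initializer (such as `Tensor.normal_`) being applied to an int8 tensor. That root cause is an assumption, but the message itself can be reproduced with a minimal sketch:

```python
import torch

# 'Char' is PyTorch's name for the int8 dtype in dispatch error messages.
t = torch.empty(4, dtype=torch.int8)
try:
    t.normal_()  # normal_ is only implemented for floating-point dtypes
except RuntimeError as err:
    print(err)   # e.g. "normal_kernel_cpu" not implemented for 'Char'
```

If this is the failure mode, the fix usually belongs in the loading path (initialize in a float dtype, then quantize) rather than in user code.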
I have installed **intel-extension-for-transformers** using `pip install intel-extension-for-transformers`, but when I ran a small script to check that it worked I got this error: Traceback (most recent call last): File "/home/nico/env/prov.py",...
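One common cause of import failures here is that the pip package name uses hyphens while the importable module name uses underscores. A quick check that works whether or not the package is installed:

```python
import importlib.util

# The pip package "intel-extension-for-transformers" installs a module
# named with underscores; importing a hyphenated name is a SyntaxError.
spec = importlib.util.find_spec("intel_extension_for_transformers")
print("installed" if spec is not None else "not installed")
```

If this prints "not installed", the package is missing from the active environment, which would point to a venv mismatch rather than a bug in the script.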
I have tried every method from the install instructions, but every attempt returns the same response: Looking in indexes: https://repo.huaweicloud.com/repository/pypi/simple/, https://download.pytorch.org/whl/cpu WARNING: Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None))...
Bumps [langchain](https://github.com/langchain-ai/langchain) from 0.1.11 to 0.2.10. Release notes Sourced from langchain's releases. langchain-text-splitters==0.2.4 Changes since langchain-text-splitters==0.2.2 text-splitters[patch]: Release 0.2.4 (#25979) text-splitters[patch]: Modified SpacyTextSplitter to fully keep whitespace when strip_whitespace is...
model.cpp: loading model from runtime_outs/ne_qwen2_q_autoround.bin The number of ne_parameters is wrong. init: n_vocab = 151936 init: n_embd = 1536 init: n_mult = 8960 init: n_head = 12 init: n_head_kv =...
!git clone https://github.com/intel/intel-extension-for-transformers.git !pip install intel-extension-for-transformers !pip install --upgrade neural_compressor==2.6 !python /content/intel-extension-for-transformers/examples/huggingface/pytorch/translation/quantization/run_translation.py \ --model_name_or_path Helsinki-NLP/opus-mt-en-ro \ --do_train \ --do_eval \ --source_lang en \ --target_lang ro \ --dataset_name wmt/wmt16 \ --dataset_config_name...