intel-extension-for-transformers
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms ⚡
Trying to build a project using the python3.10-alpine Docker image as a base. The project has intel-extension-for-transformers as a dependency, and I hit this error:
```
9.175 Collecting intel-extension-for-transformers==1.3.2
9.193 Downloading...
```
    from intel_extension_for_transformers.neural_chat import PipelineConfig
    from intel_extension_for_transformers.neural_chat import build_chatbot
    from intel_extension_for_transformers.neural_chat import plugins

    plugins.retrieval.enable = True
    plugins.retrieval.args["input_path"] = "./docs/"
    config = PipelineConfig(plugins=plugins)
    chatbot = build_chatbot(config)

When I run this code every time I add some...
I am trying to explore the backend server. After resolving dependency issues, I tried to start the server, but the system doesn't show any running backend server, nor do the logs help...
`stream` defaults to True, but the string output does not actually stream
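As context for the report above, here is a minimal, purely illustrative sketch (hypothetical, not the ITREX API) of what callers expect when streaming is enabled: an iterator that yields partial text chunks as they are produced, rather than one fully assembled string returned at the end.

```python
# Hypothetical illustration of streamed vs. non-streamed output.
# `fake_streaming_generate` is a made-up stand-in, not an ITREX function.
def fake_streaming_generate(text, chunk_size=4):
    """Yield `text` in small chunks, simulating token-by-token streaming."""
    for i in range(0, len(text), chunk_size):
        yield text[i:i + chunk_size]

chunks = list(fake_streaming_generate("Hello from the chatbot!"))
assert "".join(chunks) == "Hello from the chatbot!"  # same final text
assert len(chunks) > 1  # arrives in several pieces, not one blob
```

The reported bug is the opposite behavior: with `stream=True` the full string arrives in a single piece, as if `len(chunks) == 1`.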
Hey there! I'm trying to run llama3-8b-instruct with intel-extension-for-transformers. Here's my code:
```
from transformers import AutoTokenizer
from intel_extension_for_transformers.transformers import AutoModelForCausalLM
import torch

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"
tokenizer...
```
## Type of Change
Added a feature to use EAGLE (speculative sampling) with ITREX, as discussed with the ITREX team and Haim Barad from my team. Added an example script on how...
## Type of Change
documentation
API changed or not: no
## Description
Add streaming LLM doc
## Type of Change
Bug fix: use token latency instead of total inference time to measure performance.
## Description
The workshop notebook measures total inference time for performance instead...
New prompt format for llama3: https://llama.meta.com/docs/model-cards-and-prompt-formats/meta-llama-3/
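For reference, the Llama 3 prompt format linked above can be sketched by hand as below. In practice `tokenizer.apply_chat_template` renders this for you; the helper `build_llama3_prompt` here is a hypothetical name used only to show the literal special tokens.

```python
# Sketch of the Meta Llama 3 chat prompt format, assembled manually.
# `build_llama3_prompt` is illustrative; real code should prefer
# tokenizer.apply_chat_template from Hugging Face transformers.
def build_llama3_prompt(messages):
    """Render a list of {'role', 'content'} dicts into a Llama 3 prompt string."""
    prompt = "<|begin_of_text|>"
    for m in messages:
        # Each turn: role header, blank line, content, end-of-turn token.
        prompt += (
            f"<|start_header_id|>{m['role']}<|end_header_id|>\n\n"
            f"{m['content']}<|eot_id|>"
        )
    # Open an assistant header so the model generates the reply next.
    prompt += "<|start_header_id|>assistant<|end_header_id|>\n\n"
    return prompt

prompt = build_llama3_prompt([
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Hello!"},
])
assert prompt.startswith("<|begin_of_text|>")
assert prompt.count("<|eot_id|>") == 2
```

This format differs from Llama 2's `[INST]`-style prompts, which is why older prompt-building code needs updating for Llama 3 models.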
Adding an end-to-end finetuning and evaluation workflow for text generation using GLUE MNLI