intel-extension-for-transformers
⚡ Build your chatbot within minutes on your favorite device; offer SOTA compression techniques for LLMs; run LLMs efficiently on Intel Platforms⚡
## Type of Change update notebook ## Description update talkingbot pc notebook ## Expected Behavior & Potential Risk update talkingbot pc notebook ## How has this PR been tested? local...
## Type of Change feature ## Description H2O: Heavy-Hitter Oracle for Efficient Generative Inference of Large Language Models [paper](https://arxiv.org/pdf/2306.14048.pdf) NTD - [x] example - [ ] refactor code to same...
followed the guidelines mentioned here: https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/neural_chat/examples/deployment/talkingbot/server/backend/README.md **first error**: positional argument 'model_type' is missing, which is not given in example ``` TypeError Traceback (most recent call last) Cell In[17], line 7...
**followed the guidelines mentioned here:** https://github.com/intel/intel-extension-for-transformers/tree/main/intel_extension_for_transformers/neural_chat/ui/customized/talkingbot process failed during dependencies ``` npm WARN deprecated @types/[email protected]: This is a stub types definition. sass provides its own type definitions, so you do...
## Type of Change gaudi modeling used in itrex for int4 kv-cache support
Ipex pvc
## Type of Change feature ## Description detail description JIRA ticket: [https://jira.devtools.intel.com/browse/NLPTOOLKIU-1193]
Hi, I followed these instructions: [intel-extension-for-transformers/intel_extension_for_transformers/neural_chat/docs/notebooks/setup_text_chatbot_service_on_spr.ipynb at main · intel/intel-extension-for-transformers (github.com)](https://github.com/intel/intel-extension-for-transformers/blob/main/intel_extension_for_transformers/neural_chat/docs/notebooks/setup_text_chatbot_service_on_spr.ipynb) and wrote down a couple of issues with potential solutions, which you may want to consider implementing. I...
## Type of Change feature No API changed ## Description Removed fallback of lm_head op for WOQ ## Expected Behavior & Potential Risk Don't fallback lm_head when weight-only quantization. ##...
The flow in the RAG example does not work. I followed the instructions here: https://github.com/intel/intel-extension-for-transformers/tree/main/intel_extension_for_transformers/neural_chat/examples/quick_start/rag and got this message: [09:28] Tamir, Guy (itrex-rag)...