AmberX
I encountered the same "Native API failed" problem when the input is longer than 2000 tokens.
> Hi @AmberXu98
>
> Are you seeing the error `Native API returns: -5 (PI_ERROR_OUT_OF_RESOURCES)`, which means out of memory?
>
> Also, could you provide more information about your settings...
> Hi @AmberXu98
>
> We are reproducing with your prompt again to double confirm it in our environment.
>
> By the way, want to confirm is it the...
> > > Hi @AmberXu98
> > >
> > > We are reproducing with your prompt again to double confirm it in our environment.
> > >
> > > By the way, want to...
@hkvision @chtanch I am getting `Native API returns: -999 (Unknown PI error)`. How can I solve it? The bigdl-llm version on my machine is 2.5.0b20240219. Shorter inputs work, only longer inputs...
> @AmberXu98 to clarify, are you using [Qwen-7b-chat-int4](https://huggingface.co/Qwen/Qwen-7B-Chat-Int4)? Or the non-quantized models? E.g., [Qwen-7b](https://huggingface.co/Qwen/Qwen-7B) or [Qwen-7b-chat](https://huggingface.co/Qwen/Qwen-7B-Chat)?

@chtanch I am using the non-quantized model **Qwen-7B-Chat**, loaded with `load_in_4bit` like this: `model = AutoModelForCausalLM.from_pretrained('Qwen/Qwen-7B-Chat', load_in_4bit=True, optimize_model=True, trust_remote_code=True, use_cache=True, cpu_embedding=True).eval()`
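
Since the failure seems tied to input length (short prompts work, prompts over roughly 2000 tokens fail), it may help to record the exact token count of each prompt when reporting a reproduction. Below is a minimal sketch, not part of bigdl-llm. The helper names `count_tokens` and `is_long_prompt` are hypothetical, and the 2000 threshold is only taken from the report in this thread, not from any documented limit. It assumes a Hugging Face style tokenizer whose call returns a mapping with an `input_ids` list.

```python
# Hypothetical helpers for reporting prompt length in a bug reproduction.
# Assumption: `tokenizer` is callable on a string and returns a mapping
# with an "input_ids" list, as Hugging Face tokenizers do.

TOKEN_THRESHOLD = 2000  # assumption: taken from the "> 2000" report above


def count_tokens(tokenizer, prompt: str) -> int:
    """Return the number of input tokens the tokenizer produces for `prompt`."""
    return len(tokenizer(prompt)["input_ids"])


def is_long_prompt(tokenizer, prompt: str, threshold: int = TOKEN_THRESHOLD) -> bool:
    """Return True when the prompt is longer than `threshold` tokens."""
    return count_tokens(tokenizer, prompt) > threshold
```

With the tokenizer loaded for `Qwen/Qwen-7B-Chat`, calling `count_tokens(tokenizer, prompt)` before `generate` gives a number that can be posted alongside the error code, which makes it easier to compare a `-5` case with a `-999` case.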