Ruonan Wang
## Description

We need to align the functions in BaseForecaster and AutoformerForecaster so the two stay in sync.

### 1. Why the change?

https://github.com/intel-analytics/BigDL/issues/5834

### 2. User API changes

```
from bigdl.chronos.forecaster import AutoformerForecaster
forecaster =...
```
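Since the snippet above is truncated, here is a minimal sketch of how an AutoformerForecaster is typically constructed for context; the constructor arguments shown (`past_seq_len`, `label_len`, `freq`, and the rest) are illustrative assumptions, not the exact values from this PR:

```python
from bigdl.chronos.forecaster import AutoformerForecaster

# Constructor arguments are illustrative assumptions, not values from this PR.
forecaster = AutoformerForecaster(past_seq_len=24,      # lookback window
                                  future_seq_len=5,     # horizon to predict
                                  input_feature_num=2,
                                  output_feature_num=2,
                                  label_len=12,         # assumed decoder warm-start length
                                  freq="h")             # assumed hourly data
```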
## Improvements

There are two parts of InferenceOptimizer that need to be improved:

- Reduce the time cost under the default parameters #5740
- Improve the output of the optimize process...
## Background

Nano currently provides `optimize` and `get_best_model` in InferenceOptimizer so that users can get an accelerated model with the globally minimal latency. However, in actual use scenarios, when:...
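For reference, a minimal sketch of the flow described above, assuming the PyTorch `InferenceOptimizer` from `bigdl.nano` (parameter names such as `training_data` are from my reading of the Nano docs and may vary by version):

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from bigdl.nano.pytorch import InferenceOptimizer

model = torch.nn.Linear(8, 2).eval()
data = DataLoader(TensorDataset(torch.randn(32, 8), torch.randint(0, 2, (32,))),
                  batch_size=8)

opt = InferenceOptimizer()
# Benchmarks the available acceleration options (ipex, onnxruntime, openvino, ...)
opt.optimize(model=model, training_data=data)
opt.summary()  # prints the per-option latency table
# Picks the variant with the global minimum latency
best_model, option = opt.get_best_model()
print(option)
```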
In my test, mobilevit_xs takes 113 ms to process one image, which is much larger than the value reported in the paper.
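For context, this is roughly how such a per-image latency can be measured; a sketch assuming the `timm` implementation of mobilevit_xs and a 256x256 input, since the issue does not specify the exact setup:

```python
import time
import torch
import timm

model = timm.create_model("mobilevit_xs", pretrained=False).eval()
x = torch.randn(1, 3, 256, 256)  # assumed input resolution

with torch.no_grad():
    for _ in range(10):                      # warm-up iterations
        model(x)
    start = time.perf_counter()
    for _ in range(100):
        model(x)
    elapsed = time.perf_counter() - start

print(f"avg latency per image: {elapsed / 100 * 1000:.1f} ms")
```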
Hi, when I use IPEX quantization with INC, I run into a problem: the quantized model can't be loaded after saving. To save, I just call `quantized.save(path)`, and I get...
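For reference, the save/load round-trip as I understand the INC PyTorch flow; this is a sketch of the default (FX/eager) path, and the IPEX backend in the issue may save a TorchScript artifact instead, which is presumably where the load failure comes in. Exact module paths differ across INC versions:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset
from neural_compressor import PostTrainingQuantConfig
from neural_compressor.quantization import fit
from neural_compressor.utils.pytorch import load

fp32_model = torch.nn.Sequential(torch.nn.Linear(8, 8), torch.nn.ReLU()).eval()
calib = DataLoader(TensorDataset(torch.randn(16, 8)), batch_size=2)

quantized = fit(model=fp32_model,
                conf=PostTrainingQuantConfig(),   # default backend; the issue uses IPEX
                calib_dataloader=calib)
quantized.save("./saved")                         # writes the checkpoint directory

# INC's eager-mode loader rebuilds the quantized graph from the fp32 model
restored = load("./saved", fp32_model)
```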
### Describe the bug

I found that on a PVC GPU, FP16 output is non-deterministic: running the same LLM model twice gives different outputs, below...
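A minimal determinism check along the lines of this report might look as follows; the `ipex_llm.transformers` import path, the placeholder model path, and the greedy decoding settings are my assumptions, since the issue's actual model and prompt are elided:

```python
import torch
from transformers import AutoTokenizer
from ipex_llm.transformers import AutoModelForCausalLM  # assumed ipex-llm wrapper

model_path = "path/to/model"  # placeholder
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
model = AutoModelForCausalLM.from_pretrained(model_path,
                                             torch_dtype=torch.float16,
                                             trust_remote_code=True).to("xpu")

inputs = tokenizer("What is AI?", return_tensors="pt").to("xpu")
outs = [model.generate(**inputs, max_new_tokens=32, do_sample=False)
        for _ in range(2)]
# With greedy decoding the two runs should match exactly;
# the report says they differ in fp16 on PVC.
print(torch.equal(outs[0], outs[1]))
```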
## Description

This splits chatglm3's MLP and uses MLP fusion, which saves ~1 ms on MTL. However, quantized KV cache + MLP fusion changes the output on Arc...
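For intuition, the split presumably looks like the following; this is a conceptual sketch, not the actual ipex-llm kernel: chatglm3 packs the gate and up projections into one `dense_h_to_4h` weight, which can be split in two so a fused SwiGLU kernel can consume the halves directly:

```python
import torch
import torch.nn.functional as F

hidden, inter = 4096, 13696          # chatglm3-6b-like sizes (illustrative)
x = torch.randn(1, hidden)
w_packed = torch.randn(2 * inter, hidden)  # packed gate+up projection weight

# Unfused reference: one big matmul, then split the activation.
g, u = F.linear(x, w_packed).chunk(2, dim=-1)
ref_out = F.silu(g) * u

# "Split" form: separate gate/up weights feeding a fused SwiGLU-style op.
w_gate, w_up = w_packed.chunk(2, dim=0)
out = F.silu(F.linear(x, w_gate)) * F.linear(x, w_up)

assert torch.allclose(ref_out, out, atol=1e-5)
```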
### Describe the bug

I found that in a Jupyter notebook, `to('xpu')` makes the Jupyter kernel die.

### Notebook to reproduce

```python
import intel_extension_for_pytorch as ipex
from transformers import...
```
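Since the cells above are truncated, a minimal self-contained repro of the same pattern could be; the model is a placeholder, and the crash in the report happens on the `.to('xpu')` call:

```python
import torch
import intel_extension_for_pytorch as ipex  # registers the 'xpu' device

model = torch.nn.Linear(4, 4)
model.to("xpu")                    # reported to kill the Jupyter kernel
x = torch.randn(1, 4, device="xpu")
print(model(x))
```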
## Description

Update the llama.cpp troubleshooting documentation.

### 1. Why the change?

https://github.com/intel-analytics/ipex-llm/issues/10989

### 4. How to test?

- [ ] Document test
## Description

### 1. Why the change?

https://github.com/analytics-zoo/nano/issues/1316#issuecomment-2076658639

### 2. User API changes

```python
model = AutoModelForCausalLM.from_pretrained(model_path,
                                             load_in_low_bit='gguf_q4k_m',
                                             optimize_model=True,
                                             torch_dtype=torch.float16,
                                             trust_remote_code=True,
                                             use_cache=True)
```

```python
model = AutoModelForCausalLM.from_pretrained(model_path,
                                             load_in_low_bit='gguf_q4k_s',
                                             optimize_model=True,
                                             torch_dtype=torch.float16,...
```
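For completeness, a typical generation call with a model loaded this way might look as follows; a sketch in which the `ipex_llm.transformers` import path, the placeholder model path, and the prompt are assumptions, since the PR itself only shows the `from_pretrained` calls:

```python
import torch
from transformers import AutoTokenizer
from ipex_llm.transformers import AutoModelForCausalLM  # assumed import path

model_path = "path/to/model"  # placeholder
model = AutoModelForCausalLM.from_pretrained(model_path,
                                             load_in_low_bit='gguf_q4k_m',
                                             optimize_model=True,
                                             torch_dtype=torch.float16,
                                             trust_remote_code=True,
                                             use_cache=True)
tokenizer = AutoTokenizer.from_pretrained(model_path, trust_remote_code=True)
inputs = tokenizer("Once upon a time", return_tensors="pt")
print(tokenizer.decode(model.generate(**inputs, max_new_tokens=32)[0]))
```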