Luchang Li
Version 0.4.19 adds an optimization that merges consecutive Slice ops into a single Slice. I want to disable this pass but don't know which pass to disable. Generally, how can we know...
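I haven't found which component performs the merge. As a workaround sketch, assuming the merge comes from one of the onnxoptimizer passes that onnxsim invokes (that attribution and the pass naming are assumptions), one could run the optimizer passes manually and filter out the suspect one:

```
# Hypothetical workaround: run onnxoptimizer passes manually and skip any
# pass suspected of merging consecutive Slice nodes. Whether the merge lives
# in onnxoptimizer or inside onnxsim itself is an assumption.
import onnx
import onnxoptimizer

model = onnx.load("model.onnx")

all_passes = onnxoptimizer.get_available_passes()
print(all_passes)  # inspect the names to find the slice-related pass

# Keep every pass except those whose names mention "slice" (assumed naming).
kept = [p for p in all_passes if "slice" not in p]
optimized = onnxoptimizer.optimize(model, kept)
onnx.save(optimized, "model_opt.onnx")
```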
**Describe the bug** Sometimes an ONNX model's graph.input contains entries that duplicate graph.initializer, for unknown reasons. Under this condition, the onnxsim...
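As a possible workaround (a minimal sketch, not part of onnxsim itself), the duplicated entries can be stripped from graph.input before simplifying:

```
# Sketch: remove graph.input entries whose names also appear in
# graph.initializer, so the simplifier treats them as constants rather
# than runtime inputs.
import onnx

model = onnx.load("model.onnx")
init_names = {init.name for init in model.graph.initializer}

for inp in list(model.graph.input):
    if inp.name in init_names:
        model.graph.input.remove(inp)

onnx.save(model, "model_fixed.onnx")
```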
clCreateCommandQueueWithProperties does not support properties such as CL_QUEUE_PRIORITY_KHR and CL_QUEUE_THROTTLE_KHR; currently only CL_QUEUE_PROPERTIES is supported.
Replacing the existing computation with the approach below noticeably reduces the next_token computation cost; it replaces the original:
```
next_token_scores = self.apply_warp(next_token_scores)
probs = npsoftmax(next_token_scores.astype(np.float64), axis=1)
# Caution:
# *** ValueError: sum(pvals[:-1].astype(np.float64)) > 1.0. The pvals array is cast to 64-bit floating point prior to checking the...
```
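One way such a replacement could look (a sketch, not the author's exact code) is to sample the next token straight from the warped logits with the Gumbel-max trick, which avoids both the float64 softmax and np.random.multinomial's strict sum(pvals) check:

```
# Sketch: Gumbel-max sampling over (batch, vocab) logits. argmax(logits + g)
# with g ~ Gumbel(0, 1) draws from Categorical(softmax(logits)) without
# computing the softmax or normalizing probabilities explicitly.
import numpy as np

def sample_next_token(next_token_scores: np.ndarray) -> np.ndarray:
    gumbel = np.random.gumbel(size=next_token_scores.shape)
    return np.argmax(next_token_scores + gumbel, axis=1)
```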
I can't download the file "https://the-eye.eu/public/AI/pile/val.jsonl.zst" in get_calib_dataset, can we use other data to replace it? Thanks a lot.
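A minimal sketch of a replacement, assuming get_calib_dataset only needs raw text to tokenize into fixed-length blocks; here wikitext-2 from Hugging Face datasets stands in for the Pile validation split (the function name, sample count, and block size are assumptions):

```
from datasets import load_dataset

def get_calib_dataset_wikitext(tokenizer, n_samples=128, block_size=512):
    # Hypothetical stand-in for get_calib_dataset: use wikitext-2 instead of
    # the unreachable Pile val.jsonl.zst.
    ds = load_dataset("wikitext", "wikitext-2-raw-v1", split="train")
    text = "\n\n".join(t for t in ds["text"] if t.strip())
    ids = tokenizer(text, return_tensors="pt").input_ids[0]
    return [ids[i * block_size:(i + 1) * block_size].unsqueeze(0)
            for i in range(n_samples)]
```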
When using transformers version 4.35.2, I got this error, and a similar error when quantizing LLaMA: it seems you are using version
OmniQuant-main/models/int_falcon_layer.py", line 52, in __init__
    self.maybe_rotary = copy.deepcopy(org_module.maybe_rotary)
  File "local/lib/python3.10/site-packages/torch/nn/modules/module.py", line 1614, in __getattr__
    raise AttributeError("'{}' object has no attribute '{}'".format(
AttributeError: 'FalconAttention' object has no attribute 'maybe_rotary'
transformers version:...
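A possible guard (an assumption on my side, based on newer transformers releases no longer exposing FalconAttention.maybe_rotary) would be to copy the attribute only when it exists:

```
# Hypothetical patch around int_falcon_layer.py line 52: deep-copy
# maybe_rotary only when the wrapped module still has it, otherwise fall
# back to None.
import copy

def copy_maybe_rotary(org_module):
    if hasattr(org_module, "maybe_rotary"):
        return copy.deepcopy(org_module.maybe_rotary)
    return None
```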
I evaluated Qwen1.5-7B-Chat and two quantized versions on MMLU and found that the AWQ score is remarkably low, even worse than naive 4-bit. What could be going on? The float model scores 0.60 and the GPTQ version 0.59, but the AWQ version only reaches 0.45, while the naive version gets 0.589. GPTQ and AWQ quantized models: https://huggingface.co/Qwen/Qwen1.5-7B-Chat-AWQ https://huggingface.co/Qwen/Qwen1.5-7B-Chat-GPTQ-Int4
**Describe the bug** I use onnxsim-0.4.36 to simplify a model and get this error: Traceback (most recent call last): File...