AllentDan issues

Results: 21 issues of AllentDan

Remove first empty chunk for api_server

improvement

Log stats

Open http://xxxx:23333/metrics/ to view the metrics.

enhancement

[RFC] Refactor chat template and remove model name from engine config

## Motivation

- Decoupling dialogue templates from the inference engine.
- Reduce the barrier to adding new dialogue templates.
- Remove `model_name` from EngineConfig to avoid redundant specification.
- Support...

RFC

LMDeploy now supports accelerating DeepSeek VL models!!! :rocket:

[LMDeploy](https://github.com/InternLM/lmdeploy), as an AI deployment platform supporting multiple backend services, has always been committed to providing fast and stable AI model deployment services. Now, it supports accelerating the inference and...

BUG report on multiple threads execution

[This lambda expression](https://github.com/CoffeeBeforeArch/mmul/blob/c624ef730ef0b14ad040d0444e4c4af5f1e60fab/src/baseline/benchmark.cpp#L109) is pushed to the vector, but only some of the threads are actually executed.

Why use num_threads - 1

Hi, Nick. I am confused about why you used [num_threads - 1](https://github.com/CoffeeBeforeArch/mmul/blob/c624ef730ef0b14ad040d0444e4c4af5f1e60fab/src/baseline/benchmark.cpp#L106) instead of num_threads.

[Feature] Support VL model quantization

- [x] deepseek vl
- [x] llava
- [x] internvl
- [x] xcomposer (did not quant plora)
- [x] minigemini
- [x] yi
- [x] qwen
- [x] internvl-llava

enhancement

Check base64 image validation

#1602

Bug:P2
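
For illustration only, a check of the kind this title describes might decode the payload strictly and then compare its leading bytes against known image signatures. This is a minimal sketch under assumptions: the function name `detect_base64_image` and the signature table are hypothetical and are not taken from LMDeploy or from #1602.

```python
import base64
import binascii

# Magic-byte prefixes for a few common image formats.
# Illustrative only; this table is an assumption and is not exhaustive.
_IMAGE_SIGNATURES = {
    "png": b"\x89PNG\r\n\x1a\n",
    "jpeg": b"\xff\xd8\xff",
    "gif": b"GIF8",
}


def detect_base64_image(payload: str):
    """Return the detected image format, or None if the payload is invalid."""
    # Strip an optional data-URL prefix such as "data:image/png;base64,".
    if payload.startswith("data:"):
        _, sep, payload = payload.partition(",")
        if not sep:
            return None
    try:
        # validate=True rejects characters outside the base64 alphabet
        # instead of silently discarding them.
        raw = base64.b64decode(payload, validate=True)
    except (binascii.Error, ValueError):
        return None
    # Compare the decoded bytes against each known signature.
    for fmt, magic in _IMAGE_SIGNATURES.items():
        if raw.startswith(magic):
            return fmt
    return None
```

A caller could reject a request early when the function returns `None`, rather than passing undecodable data further down the pipeline.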

add internlm2 support

Add input validation