lightllm
LightLLM is a Python-based LLM (Large Language Model) inference and serving framework, notable for its lightweight design, easy scalability, and high-speed performance.
In the `HttpServerManager` class, when the service is configured for multi-node mode (`args.nnodes > 1`) and the node rank is greater than 0, the code binds a `zmq.PULL` socket to `tcp://*:{args.multinode_httpmanager_port}`. This means the port listens for connections on **all network interfaces**, potentially exposing it to untrusted networks. [lightllm/server/httpserver/manager.py](https://github.com/ModelTC/lightllm/blob/6234bd3bdf2c8876f953d833db71e4b0c7192a52/lightllm/server/httpserver/manager.py#L626)

```python
# In HttpServerManager.__init__:
if args.nnodes > 1:
    if args.node_rank == 0:...
```
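The distinction the report draws (wildcard vs. loopback bind) can be illustrated with a small sketch using Python's stdlib `socket` module. The helper name `bind_addr` is hypothetical; LightLLM itself uses pyzmq, but the underlying TCP bind semantics are the same.

```python
import socket

def bind_addr(host: str) -> str:
    """Bind a TCP socket to `host` on an ephemeral port and report the bound address."""
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.bind((host, 0))  # port 0 -> the OS picks a free port
    addr = s.getsockname()[0]
    s.close()
    return addr

# "0.0.0.0" (the wildcard, what zmq's "tcp://*" resolves to) accepts
# connections arriving on ANY network interface of the host.
print(bind_addr("0.0.0.0"))    # -> 0.0.0.0

# "127.0.0.1" restricts the listener to processes on the same machine,
# which is the usual mitigation when the peer nodes are co-located or
# reachable through a private interface.
print(bind_addr("127.0.0.1"))  # -> 127.0.0.1
```

With pyzmq the equivalent mitigation is to bind to a specific interface address (e.g. `socket.bind("tcp://127.0.0.1:port")`) instead of the wildcard `tcp://*:port`.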
Hello! We are SecurityReportBot, an automated security assistant. During our routine scan, we detected a potential vulnerability in your repository. However, we noticed that GitHub Security Reports are not enabled...
Something like GGUF 1-bit and 2-bit quantization?
I saw your code referring to PD disaggregation. Please tell me how to use it.
Some papers present the following perspective: " **_However, the following two issues can lead to low GPU utilization. First, the Decode stage of GPT requires frequent sequential computing...
I strictly followed the installation docs (https://lightllm-cn.readthedocs.io/en/latest/getting_started/installation.html#installation), and my GPU is an A800. Error:

```
python -m lightllm.server.api_server --model_dir ~/autodl-pub/models/llama-7b/
INFO 12-24 20:14:05 [cache_tensor_manager.py:17] USE_GPU_TENSOR_CACHE is On
ERROR 12-24 20:14:05 [_custom_ops.py:51] vllm...
```