sglang topics

MOSS-TTSD is a spoken dialogue generation model that enables expressive dialogue speech synthesis in both Chinese and English, supporting zero-shot multi-speaker voice cloning, and long-form speech ge...

OpenMOSS

finetune

large-language-models

sglang

speech-dialogue-generation

GPTQModel

902

Stars

130

Forks

902

Watchers

LLM quantization (compression) toolkit with hardware acceleration support for Nvidia CUDA, AMD ROCm, Intel XPU, and Intel/AMD/Apple CPUs via HF, vLLM, and SGLang.

ModelCloud

gptq

optimum

peft

quantization

kvcached

682

Stars

67

Forks

682

Watchers

Virtualized Elastic KV Cache for Dynamic GPU Sharing and Beyond

ovg-project

elastic-kvcache

gpu-mutiplexing

gpu-sharing

inference-engine

FlashTTS

557

Stars

72

Forks

557

Watchers

Provides high-quality Chinese speech synthesis and voice cloning services based on models such as SparkTTS and OrpheusTTS.

HuiResearch

flashtts

llamacpp-python

megatts3

orpheus-tts

SpecForge

500

Stars

112

Forks

500

Watchers

Train speculative decoding models effortlessly and port them smoothly to SGLang serving.

sgl-project

eagle

eagle3

fsdp

llm

InferenceMAX

383

Stars

55

Forks

383

Watchers

Open Source Continuous Inference Benchmarking - GB200 NVL72 vs MI355X vs B200 vs H200 vs MI325X & soon™ TPUv6e/v7/Trainium2/3/GB300 NVL72 - DeepSeek 670B MoE, GPTOSS

InferenceMAX

ai

amd

benchmark

cuda