Peter Pan
Links below:
- Slides for the intro in the TAG meeting: https://docs.google.com/presentation/d/1Gh-Y8t4QGrq4s2nYI5YR0rycVHjQcmEOtxB1Xqotqr8/edit?usp=sharing
- Recording of the meeting: https://www.youtube.com/watch?v=CrPr7TqEIrc&list=PL6wYrb-bYwC_iStEfVmBpLKDBqOy23nCA&index=44
- More discussion on the sandbox: https://github.com/cncf/sandbox/issues/49
Is this project still under maintenance?
An attempt to make life easier: https://github.com/vllm-project/vllm/pull/24177
@ShadowCrafter011 has already updated his PR (https://github.com/AUTOMATIC1111/stable-diffusion-webui/pull/16688), so I will refactor this one once it is merged.
> There is already a Docker image that has been maintained for over 1½ years. If there is anything in the main repo that causes these not to work, then...
> Thanks for the fix, but I am seeing another error:
>
> ```
> Traceback (most recent call last):
>   File "/usr/lib/python3.10/runpy.py", line 196, in _run_module_as_main
>     return _run_code(code,...
> ```
Woo, thank you @zhyncs! I just tried the new image `lmsysorg/sglang:v0.4.3.post2-cu125`, and the performance seems similar to 0.4.2 (on 16 x H20). When running-req = 1, the `gen throughput (token/s)` is no more...
@merrymercy, could you please take a look?
> I'm confused about the CI UT failures; they all seem irrelevant ...
>
> * `test_video_chat_completion` failure
> * performance threshold `test_mmlu` `assert metrics["score"] >= 0.5`...
I have two GPUs with 24 GB HBM each. At the beginning, I modified the code as below:

```diff
# inference/cli_demo.py
 elif generate_type == "t2v":
-    pipe = CogVideoXPipeline.from_pretrained(model_path, torch_dtype=dtype)
+    pipe = CogVideoXPipeline.from_pretrained(model_path, torch_dtype=dtype, device_map="balanced")
```
...