wentao
wentao
Hi,I find that you have successfully trained to complete the thundernet model based on voc2007. About the dataset, I want to know how do I prepare it ? train_path =...
There is a comparison of speed analysis before and after int8 quantization ?
Is there any demo provided for the cuda accelerated conversion example of yuv420sp to bgr? thanks!
Excuse me, if I need to deploy llama2 on Qualcomm Snapdragon chip through ExecuTorch and want to use NPU computing power as an inference computing unit, what do I need...
I would like to ask if nvcv::Tensor currently supports dimension splicing. For example, when I perform batch preprocessing here, different frame images will produce multiple nvcv::Tensor: nvcv::Tensor inputTensor1(2, {m_model_width, m_model_height},...
### Describe the feature Are there any plans to support accelerated training of the StableDiffusion3 algorithm model?
At present, I use 8x4090x24GiB resources for DiffSynth-Studio/examples/stepvideo/stepvideo_text_to_video_low_vram.py code reasoning. According to the readme, the video memory resource only needs 24GB, but the actual test found that more is needed....
### Your current environment My test environment is as follow: ``` Environment: 2x H100 with NVLink ``` My testing method: ``` # P instance UCX_TLS=cuda_ipc,cuda_copy,tcp \ LMCACHE_CONFIG_FILE=/work/vllm-0.8.5/examples/lmcache/disagg_prefill_lmcache_v1/configs/lmcache-prefiller-config.yaml \ LMCACHE_USE_EXPERIMENTAL=True \...
### Your current environment  ### How would you like to use vllm I want to run inference of a [specific model](put link here). I don't know how to integrate...
### Model Series Qwen3 ### What are the models used? qwen3-4b ### What is the scenario where the problem happened? performance qwen3-4b vs qwen2.5-7b-instruct ### Is this a known issue?...