felixslu

Results: 10 comments of felixslu

I wonder whether we could get a tuning script from your team, so that I can tune SD models by myself, such as the one linked below: https://github.com/mlc-ai/mlc-llm/commit/8aeb3dfe9ff07b04331cc0ed6fdc7c3ee384e382#diff-643d01e2455cf9344c3c81c40c42c8d6aad9cd7ad207aa72712c0b1556c2d014 mlc_llm/tuning.py

> Did you install rust and cargo?

Yes, I have installed rust and cargo with the commands below, and resolved this compile error. Thanks a lot! `1. apt install cargo 2. mkdir -p build...`

I have resolved it following the advice in the issue below, thanks! 1. cat dist/dolly-v2-3b-q3f16_0/params/mlc-chat-config.json 2. change “dolly” to “vicuna_v1.1” https://github.com/mlc-ai/mlc-llm/issues/257
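For reference, here is a minimal sketch of that edit in Python. It assumes the template name lives under a `conv_template` key in mlc-chat-config.json (the key name is an assumption; check your generated config before running this):

```python
import json
from pathlib import Path

# Path to the generated chat config (adjust to your model directory).
config_path = Path("dist/dolly-v2-3b-q3f16_0/params/mlc-chat-config.json")
config = json.loads(config_path.read_text())

# Assumption: the conversation template is stored under "conv_template".
# Swap the "dolly" template for "vicuna_v1.1", as suggested in issue #257.
if config.get("conv_template") == "dolly":
    config["conv_template"] = "vicuna_v1.1"

config_path.write_text(json.dumps(config, indent=2))
```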

Thanks for your advice! I will try this repo ([web-stable-diffusion](https://github.com/mlc-ai/web-stable-diffusion)). As far as I know, 3-4 bit quantization techniques (such as GPTQ) have not been used in the web-stable-diffusion project; maybe it could...

Hi, MLC team! I have tried to run this repo ([web-stable-diffusion](https://github.com/mlc-ai/web-stable-diffusion)), but failed. First, I tried to use the pre-auto-tuned schedule params in the log_db directory, but got errors (https://github.com/mlc-ai/web-stable-diffusion/issues/38). Then, I tried...

Hi, MLC team! Could you find time to update the stable diffusion log_db ([web-stable-diffusion](https://github.com/mlc-ai/web-stable-diffusion))? We cannot get auto-tuning params for the newly introduced purity flag. We get the error below...

> a friendlier error

@Shixiaowei02 **I have got this error in the recently released tensorrt-llm v0.9.0.** Please give me some advice on how to fix it, thanks! `python3 -c "import tensorrt_llm; print(tensorrt_llm.__version__)" Traceback...

> I think model builders should contribute their vision model work here.

In an ideal situation, it's the model builders' job! But sadly, their work may not focus on device, or they...

> `max_tokens_in_paged_kv_cache` is defined [here](https://github.com/triton-inference-server/tensorrtllm_backend/blob/main/all_models/inflight_batcher_llm/tensorrt_llm/config.pbtxt#L311-L316).
>
> `max_tokens_in_paged_kv_cache` and `kv_cache_free_gpu_mem_fraction` control the KV cache memory usage together. More details are described [here](https://github.com/triton-inference-server/tensorrtllm_backend/tree/main?tab=readme-ov-file#modify-the-model-configuration).
>
> If you don't set up the proper...
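For illustration, this is a sketch of how those two parameters appear in the Triton model's config.pbtxt, following the format of the file linked above; the values shown are placeholders, not recommendations:

```
parameters: {
  key: "max_tokens_in_paged_kv_cache"
  value: {
    string_value: "4096"  # cap on total tokens held in the paged KV cache
  }
}
parameters: {
  key: "kv_cache_free_gpu_mem_fraction"
  value: {
    string_value: "0.9"  # fraction of free GPU memory the KV cache may use
  }
}
```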

TVMError: Data types float32 and float16 must be equal for binary operators
[10:32:31] ~/tvm/tvm/src/relax/ir/block_builder.cc:64: Warning: BlockBuilder destroyed with remaining blocks!
[2023-07-13 10:32:31,987] torch._dynamo.convert_frame: [ERROR] WON'T CONVERT forward /root/miniconda3/envs/tvm-build/lib/python3.8/site-packages/transformers/models/clip/modeling_clip.py line 363...