Shushi Hong
Support Reverse Sequence quantization operation as part of #15148
Support ARG_MIN quantization operation as part of #15148
This PR adds support for MiniCPM-2B; the MoE version of MiniCPM will follow in later PRs. Demonstration: ``` tlopex@tlopex-OMEN-by-HP-Laptop-17-ck1xxx:~/mlc-llm$ mlc_llm chat dist/MiniCPM-2B-128k-q4f16_1-MLC --device "cuda:0" --overrides context_window_size=2048 --model-lib ./dist/libs/MiniCPM-2B-128k-q4f16_1-cuda.so [2024-08-06 00:01:19] INFO...
This PR updates the `use_qk_norm` option for Cohere-series models such as Command-R-Plus.
## Summary

This PR reorganizes the conda-related files and removes unused conda build infrastructure.

## Changes

### Moved Files

- `conda/build-environment.yaml` → `tests/conda/build-environment.yaml`
- `conda/condarc` → `tests/conda/condarc`

### Removed Files...