Tianqi Chen
After reading a bit more, I think the easiest way would be to interface with the [TVM C API](https://github.com/apache/tvm/blob/main/include/tvm/runtime/c_runtime_api.h) and work through https://dart.dev/guides/libraries/c-interop. The Rust module might serve as a reference: https://github.com/mlc-ai/mlc-llm/pull/1213
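To get a feel for the surface area that `dart:ffi` would need to bind, here is a minimal sketch that drives the same C entry points from Python's `ctypes` instead. The entry points come from `c_runtime_api.h`; the shared library name and the `runtime.ModuleLoadFromFile` global-function name are assumptions about your particular build.

```python
import ctypes

# Library name/path is an assumption; on Android it would be bundled with the app.
lib = ctypes.CDLL("libtvm_runtime.so")

lib.TVMGetLastError.restype = ctypes.c_char_p
lib.TVMFuncGetGlobal.argtypes = [ctypes.c_char_p, ctypes.POINTER(ctypes.c_void_p)]
lib.TVMFuncGetGlobal.restype = ctypes.c_int

handle = ctypes.c_void_p()
# Look up a packed function in the runtime's global registry by name.
if lib.TVMFuncGetGlobal(b"runtime.ModuleLoadFromFile", ctypes.byref(handle)) != 0:
    raise RuntimeError(lib.TVMGetLastError().decode())
if not handle:  # a NULL handle means the name is not registered
    raise RuntimeError("global function not found")
print("packed function handle:", handle.value)
```

A Dart binding would follow the same shape: open the library, declare the C signatures, and call through opaque handles, with `TVMFuncCall` doing the actual argument marshalling.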
Feel free to convert the model yourself.
This is now fixed in the latest APK.
Likely we need to further restrict the group sizes, and these devices' memory is too small for the Llama-style models.
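The memory point can be made concrete with a back-of-envelope estimate (illustrative arithmetic only, not measured numbers):

```python
def weight_mem_gb(n_params_billion: float, bits_per_weight: float) -> float:
    """Approximate GiB needed just for the quantized weights."""
    return n_params_billion * 1e9 * bits_per_weight / 8 / 1024**3

# A Llama-style 7B model at 4-bit still needs ~3.3 GiB for weights alone,
# before the KV cache and activations are counted.
print(f"{weight_mem_gb(7, 4):.1f} GiB")  # ~3.3
```

That is already beyond what many entry-level phones can dedicate to a single app, which is why smaller models are the practical choice on such devices.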
We do depend on CMake to support Android; we recommend a conda environment to enable such cases.
The latest Android SDK might help address the related issues; see https://llm.mlc.ai/docs/deploy/android.html
Thanks for the suggestions. Indeed, we have recently been moving towards encouraging JIT compilation to simplify our flow. Please check out some of the latest tutorials: https://llm.mlc.ai/docs/get_started/introduction.html
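For reference, a minimal sketch of the JIT flow roughly following that tutorial — the model is compiled on first use, with no explicit convert/compile step. The model id here is one of the prebuilt MLC weights on Hugging Face; substitute your own:

```python
from mlc_llm import MLCEngine

model = "HF://mlc-ai/Llama-3-8B-Instruct-q4f16_1-MLC"
engine = MLCEngine(model)  # JIT-compiles the model on first use

# OpenAI-style streaming chat completion.
for response in engine.chat.completions.create(
    messages=[{"role": "user", "content": "What is the meaning of life?"}],
    model=model,
    stream=True,
):
    for choice in response.choices:
        print(choice.delta.content, end="", flush=True)
print()
engine.terminate()
```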
#2295 should address this.
Please try out the latest command in https://llm.mlc.ai/docs/get_started/quick_start.html
This was due to the prefill_chunk_size setting; reducing it should help with the issue.
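As a sketch, you can lower the field directly in the model's `mlc-chat-config.json` (the path below is an assumption about your output layout; adjust it to where your converted model lives):

```python
import json

# Hypothetical path; point this at your model's mlc-chat-config.json.
cfg_path = "dist/Llama-2-7b-chat-hf-q4f16_1-MLC/mlc-chat-config.json"

with open(cfg_path) as f:
    cfg = json.load(f)
print("current prefill_chunk_size:", cfg["prefill_chunk_size"])

# Smaller chunks lower peak memory during prefill, at some prefill-speed cost.
cfg["prefill_chunk_size"] = 1024

with open(cfg_path, "w") as f:
    json.dump(cfg, f, indent=2)
```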