Tianqi Chen
cc @leofang for thoughts. I do think it has use cases where we only want to exchange the data structure while leaving the synchronization explicitly to the user.
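As a minimal illustration of what "exchange the data structure" means, here is a sketch using NumPy's DLPack support as a CPU stand-in (assumption: the actual discussion concerns device tensors, where the consumer would additionally have to synchronize the stream before reading):

```python
import numpy as np

# The producer creates an array; np.from_dlpack consumes its DLPack
# capsule, sharing the underlying buffer rather than copying it.
producer = np.arange(4, dtype=np.float32)
consumer = np.from_dlpack(producer)

# Only the tensor metadata and data pointer are exchanged; writes through
# one view are visible through the other. On a GPU, correctness of such a
# read would depend on the user synchronizing explicitly first.
producer[0] = 42.0
print(consumer[0])  # shares memory with producer
```

On CPU no synchronization is needed, which is why the zero-copy exchange alone already suffices here; the open question in the thread is what happens when that synchronization burden moves to the user on device.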
Unfortunately, doing so would mean the output won't align with the OpenAI protocol, so likely we cannot support such a case. Note that async streaming (between the worker and the client) is...
@tvm-bot rerun
This is great, subgroup shuffle can be useful for reduction operations. We did have warp shuffle support for the Metal backend, so maybe we can try to add codegen support for the WebGPU backend.
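For reference, the reduction pattern that warp/subgroup shuffles enable can be simulated on the host. This is a sketch only (real codegen would emit the backend's shuffle intrinsics; the simplification that out-of-range reads contribute zero is an assumption of this simulation, not hardware behavior):

```python
def shuffle_down_reduce(lane_values):
    """Simulate a subgroup sum reduction built from shuffle-down steps.

    lane_values: one value per lane; length must be a power of two.
    After log2(n) steps, lane 0 holds the sum of all lanes.
    """
    vals = list(lane_values)
    n = len(vals)
    offset = n // 2
    while offset > 0:
        # Each lane adds the value held by lane (i + offset).
        # Simplification: out-of-range reads contribute zero; only
        # lane 0's final value is meaningful, as in real kernels.
        vals = [v + (vals[i + offset] if i + offset < n else 0)
                for i, v in enumerate(vals)]
        offset //= 2
    return vals[0]

print(shuffle_down_reduce([1, 2, 3, 4]))  # → 10
```

The point of the pattern is that each step exchanges values directly between lanes, so a subgroup-wide sum needs only log2(subgroup_size) steps and no shared-memory round trips.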
This is a nice PR that would be good to land. cc @yongwww
Thanks @oskar-inceptron. If we want to go towards this directly, a better approach is to make Analyzer an Object, so we can use ObjectPtr for this.
We recommend starting from the default options, which we normally use, and `q4f16_1` to reduce memory.
https://llm.mlc.ai/docs/compilation/convert_weights.html contains a walkthrough guide.
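To illustrate why a `q4f16_1`-style scheme reduces memory: 4-bit weight quantization stores roughly 4 bits per parameter versus 16 bits for f16. The numbers below are rough, illustrative estimates for the weights alone (ignoring quantization-scale overheads, the KV cache, and other runtime buffers; the 7B parameter count is just an example):

```python
def weight_bytes_gib(num_params, bits_per_param):
    """Rough weight-memory estimate in GiB (illustrative only)."""
    return num_params * bits_per_param / 8 / 2**30

params = 7e9  # e.g. a 7B-parameter model
print(f"f16 weights:   {weight_bytes_gib(params, 16):.1f} GiB")
print(f"4-bit weights: {weight_bytes_gib(params, 4):.1f} GiB")
```

So for a 7B model the weights drop from roughly 13 GiB to roughly 3.3 GiB, which is what makes such models fit on memory-constrained devices.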
Hi @oglok, it seems you were using an older version that is now deprecated.
As of now we unfortunately don't have a container file, so building from source for Jetson may be needed.