Jian Chen comments


Results: 4 comments by Jian Chen

Can we integrate Flashinfer into gpt-fast?

Wow, cool! Is there an example of using torch.compile with flashinfer's BatchPrefillWithPagedKVCacheWrapper or the decode wrapper? Thanks!

Can we integrate Flashinfer into gpt-fast?

Thanks for the information! I have checked how to define custom operators, and I can successfully define single_prefill_with_kv_cache as shown below:

```python
import torch
import flashinfer

torch.library.define(
    "mylib::custom_func",
    "(Tensor q, Tensor...
```

Can we integrate Flashinfer into gpt-fast?

Thanks! I think I solved this problem by creating the wrapper before defining the custom operator and then continuing to use that same wrapper. But make the wrapper python project will be fine and...

apply_rope_inplace will cause graphbreak due to mutated inputs

Yeah, I have annotated that, but it still does not work. Exposing a non-in-place rope would be very helpful, thanks!