Textmony
Same on llama q0f16, Platform: Metal, iOS app:

```
Function vm.builtin.paged_attention_kv_cache_create_reduced(0: runtime.ShapeTuple, 1: int64_t, 2: int64_t, 3: int64_t, 4: int64_t, 5: int, 6: double, 7: double, 8: runtime.NDArray, 9: runtime.PackedFunc, ...
```
Hit the same issue with the latest code at commit 9998076153d5309ec87dc32c373e1759813ee84e for an iOS app with a customized llama model:

```
Function vm.builtin.paged_attention_kv_cache_create_reduced(0: runtime.ShapeTuple, 1: int64_t, 2: int64_t, 3: int64_t, 4: int64_t, ...
```