lxnlxnlxnlxnlxn


So is it currently only possible to run the "Static inference performance" part of the README (on the kvoff branch)?

I downloaded the Chinese-LLaMA-2-1.3B model from Hugging Face and then ran `test/model/test_llama2.py`, which failed with the following error:

```
root@gpu0:/lightllm/test/model# python test_llama2.py
python: /project/lib/Analysis/Allocation.cpp:40: std::pair mlir::triton::getCvtOrder(mlir::Attribute, mlir::Attribute): Assertion `!(srcMmaLayout && dstMmaLayout) && "Unexpected mma -> mma layout conversion"' failed.
F
======================================================================
FAIL: test_llama2_infer (__main__.TestLlama2Infer)
----------------------------------------------------------------------
Traceback...
```