zhoufengen

4 comments by zhoufengen

Thank you very much for your reply! All outputs must be streamed back, because if this agent is to provide services to others, long waiting times will cause people to...

me too!!! To debug try disable codegen fallback path via setting the env variable `export PYTORCH_NVFUSER_DISABLE=fallback` (Triggered internally at ../third_party/nvfuser/csrc/manager.cpp:335.) y_pred, att_cache, cnn_cache = self.llm.forward_chunk(lm_input, offset=0, required_cache_size=-1, att_cache=att_cache, cnn_cache=cnn_cache, Exception...
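
For anyone who wants to try the hint in that warning from Python rather than the shell, here is a minimal sketch. The variable name and value (`PYTORCH_NVFUSER_DISABLE=fallback`) come straight from the warning text; the helper name `disable_nvfuser_fallback` is made up for illustration. Setting the variable before `torch` is imported is an assumption on my part, chosen as the conservative option so the value is in place whenever nvfuser reads it.

```python
import os


def disable_nvfuser_fallback(env=None):
    """Set PYTORCH_NVFUSER_DISABLE=fallback in the given environment mapping.

    Hypothetical helper, not part of PyTorch. It defaults to os.environ,
    which mirrors `export PYTORCH_NVFUSER_DISABLE=fallback` for this process.
    """
    if env is None:
        env = os.environ
    env["PYTORCH_NVFUSER_DISABLE"] = "fallback"
    return env["PYTORCH_NVFUSER_DISABLE"]


# Call this at the very top of the script, before `import torch`
# (assumption: the variable should be set before nvfuser starts up).
disable_nvfuser_fallback()
```

With the fallback path disabled, the underlying codegen error should surface directly instead of being hidden behind the warning, which is what the message suggests for debugging.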