Siddharth Singh

Results: 1 comment by Siddharth Singh

If one wants to use per-layer CUDA graphs (`--cuda-graph-scope full` as of today in main), should `--cuda-graph-scope` be set to `attn mlp`? In that case, are we doubling the number of...