wushaoqiang
Results: 1 issue of wushaoqiang
code :
import numpy as np
import torch
from tutel.jit_kernels import sparse as jit_kernel
print(torch.__version__)
def moe_dispatch_bwd_gate():
    samples=2
    capacity=2
    hidden=2
    num_experts=1
    indices = [0,0]
    locations = [0,0]
    input = [0.4946,...
Label: invalid