Sunghyun Park
@ganler Thank you for the clarification. I misunderstood its definition and I agree that yours can differentiate these two. Then, how about this case? `Relu` has two consumers now and...
@ganler I see. It seems we are working from different assumptions about whether a fused kernel allows multiple outputs (potentially intermediate outputs). If I remember correctly, such...
Great catch! Does it only happen in the Relax world, especially in the cases from the first thread? This might be an issue on the Relay VM side as well and might...
Thank you for the great proposal, @vinx13! I also have a question. Since the TIR-level layout rewrite happens purely at the TIR level, IIUC, these two paths might produce different primfuncs. * P1: `topi.cuda.conv2d_NCHW`...
Hi, @ZihengJiang. I'm also glad to see that we are looking in the same direction. My collaborators and I developed an automated solution to handle such problems and observed some...
@comaniac, thank you for your input and I totally agree with your thoughts. I expect there would be more exciting opportunities in training since there are more interesting operations and...
Hi, all! Hope you are all doing well. Since our last discussion, I've worked on identifying more concrete challenges in the current pass infra and drafted an initial design to...
Thanks for the valuable input today! I would definitely appreciate more feedback if you have any. Feel free to leave more :) Some of the feedback from the meeting: 1. Use...
Hi, @ZihengJiang > * For the tuning pass with eval passes usage `T1(eval_passes=[H3, H4])`, will the H3 and H4 happen before or after T1? `eval_passes` will apply for candidate evaluation....
@hypercubestart, thank you for your input! > * In Relay, quantization workflow is split into 3 separate passes: QuantizeAnnotate, QuantizeCalibrate, QuantizeRealize. For QuantizeCalibrate, we need to represent required pre-passes (annotate)...