LibShalom
What's the main idea for irregular-shaped GEMM?
Hi! Nice work! I have a question about irregular-shaped GEMM: how can it be implemented more efficiently? What's the main idea?
Hello! We overlap packing with computation, handle edge cases efficiently, and use more efficient parallelization methods.
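For readers who want a concrete picture of "overlap packing and computing", here is a minimal C sketch under stated assumptions. It is not LibShalom's actual kernel (that is hand-tuned assembly); the names `kernel_fused` and `gemm_overlap`, the tile sizes `NR` and `MC`, and the row-major layout are all hypothetical and chosen for illustration only. The idea shown: the first row block that touches a panel of B copies it into a packed buffer inside the same loop that computes with it, and later row blocks reuse the packed copy, so there is no separate packing pass.

```c
#include <assert.h>
#include <stddef.h>

enum { NR = 4, MC = 2 };  /* hypothetical tile sizes, for illustration only */

/* C (m x NR) += A (m x k) * B panel (k x NR).
   If do_pack is nonzero, each row of the B panel is copied into `packed`
   right before it is used, so packing is fused into the compute loop.
   Otherwise the panel is read from `packed`, filled by an earlier call. */
static void kernel_fused(size_t m, size_t k,
                         const double *A, size_t lda,
                         const double *B, size_t ldb,
                         double *packed, int do_pack,
                         double *C, size_t ldc)
{
    for (size_t p = 0; p < k; ++p) {
        if (do_pack) {
            for (size_t j = 0; j < NR; ++j)
                packed[p * NR + j] = B[p * ldb + j];
        }
        const double *brow = &packed[p * NR];
        for (size_t i = 0; i < m; ++i) {
            double a = A[i * lda + p];
            for (size_t j = 0; j < NR; ++j)
                C[i * ldc + j] += a * brow[j];
        }
    }
}

/* Row-major C (M x N) += A (M x K) * B (K x N).
   `packed` must hold K * NR doubles. Assumes N is a multiple of NR. */
void gemm_overlap(size_t M, size_t N, size_t K,
                  const double *A, const double *B, double *C,
                  double *packed)
{
    assert(N % NR == 0);
    for (size_t j = 0; j < N; j += NR) {
        for (size_t i = 0; i < M; i += MC) {
            size_t mb = (M - i < MC) ? (M - i) : MC;
            /* Only the first row block of each column panel packs B. */
            kernel_fused(mb, K, A + i * K, K, B + j, N,
                         packed, i == 0, C + i * N + j, N);
        }
    }
}
```

In this sketch the cost of packing is hidden inside the first row block's computation instead of being paid as a separate pass over B.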
I have read your SC paper. It's great work! I have a question: how do you implement parallel packing of B (packB)? It seems that parallel packB would need barriers when the parallel mode is 2D, but I don't see any synchronization in your code.
Thank you for your interest! Each thread packs B into its own private buffer, and the packing overlaps with the computation, so no synchronization is needed for this operation.
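To illustrate why private packing removes the need for barriers, here is a hedged C sketch. It is an assumption-laden illustration, not LibShalom's code: the function name `gemm_private_packb`, the tile sizes `MB` and `NB`, the `KMAX` bound, and the use of an OpenMP `parallel for` over a 2D tile grid are all hypothetical. For simplicity it packs and then computes within each tile rather than fusing the two steps. Each tile owns its packed buffer and writes a disjoint block of C, so no thread ever waits on another thread's packed data.

```c
#include <assert.h>

enum { KMAX = 64, MB = 2, NB = 4 };  /* hypothetical limits and tile sizes */

/* Row-major C (M x N) += A (M x K) * B (K x N) over a 2D grid of tiles.
   Assumes K <= KMAX and N is a multiple of NB. */
void gemm_private_packb(int M, int N, int K,
                        const double *A, const double *B, double *C)
{
    assert(K <= KMAX && N % NB == 0);
    int mt = (M + MB - 1) / MB;  /* number of row tiles */
    int nt = N / NB;             /* number of column tiles */

    /* Without OpenMP enabled the pragma is ignored and the loop runs
       serially; the result is the same either way. */
    #pragma omp parallel for
    for (int t = 0; t < mt * nt; ++t) {
        int i0 = (t / nt) * MB;
        int j0 = (t % nt) * NB;
        int mb = (M - i0 < MB) ? (M - i0) : MB;

        /* Private buffer: declared inside the loop body, so every
           iteration (and therefore every thread) has its own copy. */
        double packed[KMAX * NB];
        for (int p = 0; p < K; ++p)
            for (int j = 0; j < NB; ++j)
                packed[p * NB + j] = B[p * N + j0 + j];

        /* Each tile updates a disjoint block of C, so there is no race. */
        for (int i = 0; i < mb; ++i)
            for (int p = 0; p < K; ++p) {
                double a = A[(i0 + i) * K + p];
                for (int j = 0; j < NB; ++j)
                    C[(i0 + i) * N + j0 + j] += a * packed[p * NB + j];
            }
    }
}
```

The trade-off in this sketch is that tiles sharing the same column panel (the same `j0`) each pack that panel of B again, trading redundant copies for the absence of synchronization.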
So you mean that because the overhead of packing B is amortized by the computation, private packing doesn't decrease overall performance, even though it makes threads pack B redundantly?
Yes, this may be improved in the future.