pykan
Is KAN 10X slower per step of training, or does it need 10X as many steps to converge?
Hi Ziming,
In Section 6 of your paper, you mentioned that KANs are practically 10X slower than MLPs. I am curious what you meant by it. Did you mean a KAN takes 10X as many steps to converge in comparison to an MLP with the same number of parameters, or that a KAN takes 10X as much wall-clock time to run a single step of training (forward, backward, and gradient update) in comparison with a same-parameter MLP?
One more question: if I understand correctly, a KAN has more parameters per neuron than an MLP. In your speed (or slowness) claim, you controlled for parameter count, not neuron count, correct?
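For concreteness, here is a rough sketch of how the parameter counts could be compared directly; the `KAN(width=..., grid=..., k=...)` arguments and the MLP layout below are assumptions and may need adjusting for your pykan version:

```python
import torch.nn as nn
from kan import KAN  # pykan

# Both models use the same neuron widths: 2 -> 5 -> 1.
# NOTE: grid=5, k=3 are assumed settings; defaults may differ across pykan versions.
kan = KAN(width=[2, 5, 1], grid=5, k=3)
mlp = nn.Sequential(nn.Linear(2, 5), nn.SiLU(), nn.Linear(5, 1))

def n_params(model):
    return sum(p.numel() for p in model.parameters())

# A KAN edge carries a spline (several coefficients), while an MLP edge is a single
# weight, so the KAN should report noticeably more parameters at equal neuron count.
print("KAN params:", n_params(kan))
print("MLP params:", n_params(mlp))
```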
Hi, I mean the latter: "a KAN takes 10X as much wall-clock time to run a single step of training (forward, backward, and gradient update) in comparison with a same-parameter MLP." Also, 10x should be taken as a typical value (for the scale of the problems reported in the paper); obviously, it depends a lot on other hyperparameters as well.
Yes, I did control for parameter count.
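For reference, here is a minimal sketch of one way to measure the per-step wall-clock comparison; the KAN constructor arguments, the MLP hidden width (chosen only to put parameter counts in the same ballpark), and the toy data are all assumptions, not the setup used in the paper:

```python
import time
import torch
import torch.nn as nn
from kan import KAN  # pykan

# Toy regression data; shapes are illustrative only.
x = torch.rand(1024, 2)
y = torch.rand(1024, 1)

kan = KAN(width=[2, 5, 1], grid=5, k=3)            # assumed settings
mlp = nn.Sequential(nn.Linear(2, 64), nn.SiLU(),   # hidden width picked so parameter
                    nn.Linear(64, 1))              # counts are roughly comparable; verify first

def seconds_per_step(model, steps=100):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    # One warm-up step so any lazy initialization is excluded from the timing.
    loss_fn(model(x), y).backward()
    opt.step()
    opt.zero_grad()
    t0 = time.perf_counter()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(model(x), y)   # forward
        loss.backward()               # backward
        opt.step()                    # gradient update
    return (time.perf_counter() - t0) / steps

print("KAN s/step:", seconds_per_step(kan))
print("MLP s/step:", seconds_per_step(mlp))
```

On a GPU, `torch.cuda.synchronize()` calls around the timers would be needed for the numbers to be meaningful.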
Thank you for the answers! If it's slower per step, then there is good hope of accelerating KAN by writing efficient kernels. The KAN operations look fairly hardware-friendly.