pykan
Is KAN 10X slower per step of training, or does it need 10X as many steps to converge?
Hi Ziming,
In Section 6 of your paper, you mentioned that KANs are practically 10X slower than MLPs. I am curious what you meant by it. Did you mean a KAN takes 10X as many steps to converge in comparison to an MLP with the same number of parameters, or that a KAN takes 10X as much wall-clock time to run a single step of training (forward, backward, and gradient update) in comparison with a same-parameter MLP?
One more question: if I understand correctly, a KAN has more parameters per neuron than an MLP. In your speed (or slowness) claim, you controlled for parameter count, not neuron count, correct?
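For concreteness, here is a rough sketch of how the parameter counts could be compared directly; the `KAN(width=..., grid=..., k=...)` arguments and the MLP layout below are assumptions and may need adjusting for your pykan version:

```python
import torch.nn as nn
from kan import KAN  # pykan

# Both models use the same neuron widths: 2 -> 5 -> 1.
# NOTE: grid=5, k=3 are assumed settings; defaults may differ across pykan versions.
kan = KAN(width=[2, 5, 1], grid=5, k=3)
mlp = nn.Sequential(nn.Linear(2, 5), nn.SiLU(), nn.Linear(5, 1))

def n_params(model):
    return sum(p.numel() for p in model.parameters())

# A KAN edge carries a spline (several coefficients), while an MLP edge is a single
# weight, so the KAN should report noticeably more parameters at equal neuron count.
print("KAN params:", n_params(kan))
print("MLP params:", n_params(mlp))
```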
Hi, I mean the latter: "a KAN takes 10X as much wall-clock time to run a single step of training (forward, backward, and gradient update) in comparison with a same-parameter MLP." Also, 10x should be taken as a typical value (for the scale of the problems reported in the paper); obviously, it depends a lot on other hyperparameters as well.
Yes, I did control for parameter count.
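For reference, here is a minimal sketch of one way to measure the per-step wall-clock comparison; the KAN constructor arguments, the MLP hidden width (chosen only to put parameter counts in the same ballpark), and the toy data are all assumptions, not the setup used in the paper:

```python
import time
import torch
import torch.nn as nn
from kan import KAN  # pykan

# Toy regression data; shapes are illustrative only.
x = torch.rand(1024, 2)
y = torch.rand(1024, 1)

kan = KAN(width=[2, 5, 1], grid=5, k=3)            # assumed settings
mlp = nn.Sequential(nn.Linear(2, 64), nn.SiLU(),   # hidden width picked so parameter
                    nn.Linear(64, 1))              # counts are roughly comparable; verify first

def seconds_per_step(model, steps=100):
    opt = torch.optim.Adam(model.parameters(), lr=1e-3)
    loss_fn = nn.MSELoss()
    # One warm-up step so any lazy initialization is excluded from the timing.
    loss_fn(model(x), y).backward()
    opt.step()
    opt.zero_grad()
    t0 = time.perf_counter()
    for _ in range(steps):
        opt.zero_grad()
        loss = loss_fn(model(x), y)   # forward
        loss.backward()               # backward
        opt.step()                    # gradient update
    return (time.perf_counter() - t0) / steps

print("KAN s/step:", seconds_per_step(kan))
print("MLP s/step:", seconds_per_step(mlp))
```

On a GPU, `torch.cuda.synchronize()` calls around the timers would be needed for the numbers to be meaningful.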
Thank you for the answers! If it's slower per step, then there is good hope of accelerating KAN by writing efficient kernels. The KAN operations look fairly hardware-friendly.