Optimize the grid dimensionality during KANLayer initialization to reduce memory/GPU usage significantly and greatly reduce the initialization time of KANLayer.

Open congyue1977 opened this issue 6 months ago • 0 comments

During KANLayer initialization, the B-spline knot vector is built from the grid_range parameter, so it is identical for every input dimension (in_dim). The grid tensor therefore stores the same row in_dim times. Setting the size of its first dimension to 1 is enough: subsequent calculations use tensor broadcasting automatically, and the grid update process is not affected.
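A minimal sketch of the broadcasting argument, written in NumPy so it stays self-contained (PyTorch follows the same broadcasting rules). The function `b_spline_basis`, the variable names, and the small sizes chosen here are assumptions for illustration, not pykan's actual code. It evaluates the B-spline basis once with a grid of shape `(in_dim, G+2k+1)` and once with a grid of shape `(1, G+2k+1)` and compares the results.

```python
import numpy as np

def b_spline_basis(x, grid, k):
    """Cox-de Boor recursion on a uniform knot grid.

    x:    (batch, in_dim) inputs
    grid: (in_dim, G + 2k + 1) or (1, G + 2k + 1) knots
    returns: (batch, in_dim, G + k) basis values
    """
    x = x[:, :, None]        # (batch, in_dim, 1)
    g = grid[None, :, :]     # (1, in_dim or 1, n_knots); broadcasts over in_dim
    # Degree-0 basis: indicator of the knot interval containing x.
    B = ((x >= g[:, :, :-1]) & (x < g[:, :, 1:])).astype(x.dtype)
    for p in range(1, k + 1):
        left = (x - g[:, :, :-(p + 1)]) / (g[:, :, p:-1] - g[:, :, :-(p + 1)])
        right = (g[:, :, p + 1:] - x) / (g[:, :, p + 1:] - g[:, :, 1:-p])
        B = left * B[:, :, :-1] + right * B[:, :, 1:]
    return B

# Hypothetical sizes chosen only for this illustration.
in_dim, G, k = 4, 5, 3
lo, hi = -1.0, 1.0                       # plays the role of grid_range
h = (hi - lo) / G
knots = lo + h * np.arange(-k, G + k + 1)  # G + 2k + 1 knots, extended by k on each side

shared_grid = knots[None, :]                 # shape (1, G + 2k + 1)
full_grid = np.tile(knots, (in_dim, 1))      # shape (in_dim, G + 2k + 1), redundant rows

rng = np.random.default_rng(0)
x = rng.uniform(lo, hi, size=(8, in_dim))

B_full = b_spline_basis(x, full_grid, k)
B_shared = b_spline_basis(x, shared_grid, k)
same = np.allclose(B_full, B_shared)
```

Both calls return an array of shape `(batch, in_dim, G+k)`, and `same` confirms that the single-row grid gives the same basis values as the redundant one.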

This optimization reduces CPU or GPU memory usage significantly. After optimization, each KANLayer saves (in_dim-1) * (G+2k+1) grid entries, where G+2k+1 is the number of knots in one row of the grid. If the network has depth N and every layer has the same input dimension, the total saving is N * (in_dim-1) * (G+2k+1) entries.
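As a worked example of the formula above, the following sketch sums the per-layer saving for a network given by its width list. The helper name `grid_entries_saved` is hypothetical, and the result is a count of grid entries, not bytes (multiply by the element size of the tensor dtype to get bytes).

```python
def grid_entries_saved(width, G, k):
    """Total grid entries saved when each layer stores one grid row instead of in_dim rows."""
    knots_per_row = G + 2 * k + 1
    # Each layer's in_dim is every entry of `width` except the last (the output size).
    return sum((in_dim - 1) * knots_per_row for in_dim in width[:-1])

saved = grid_entries_saved([4, 100, 100, 100, 1], G=100, k=3)
print(saved)  # 32100
```

Here the layers have input dimensions 4, 100, 100, and 100, so the saving is (3 + 99 + 99 + 99) * 107 = 32100 entries.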

The optimization also drastically reduces the initialization time of KANLayer. In a test with a large grid (G=100), k=3, and a KAN of width [4,100,100,100,1], it took nearly 30 s to start training on an Intel i9-12900K before the optimization. After the optimization, training starts in less than 1 s.

congyue1977 • Jul 28 '24 09:07