Help! There is a problem in cell 12 of hellokan.ipynb. Detail: RuntimeError: false INTERNAL ASSERT FAILED at "C:... please report a bug to PyTorch. torch.linalg.lstsq: (Batch element 0): Argument 6 has illegal value. Most certainly there is a bug in the implementation calling the backend library.

Open BinGnadenlos opened this issue 9 months ago • 7 comments

BinGnadenlos avatar May 13 '24 15:05 BinGnadenlos

maybe related to #179

KindXiaoming avatar May 14 '24 00:05 KindXiaoming

What CPU/GPU are you using to run PyTorch? I saw in the issues that someone claimed there might be a problem with Apple Silicon devices. #199

Stealeristaken avatar May 16 '24 17:05 Stealeristaken

maybe related to #179

Thanks for your reply, but I am ashamed to say that I still don't know how to solve it. As mentioned in #179, I tried setting model = model.prune(threshold=5e-2) instead of model = model.prune(), but a new problem occurred: RuntimeError: stack expects a non-empty TensorList

BinGnadenlos avatar May 20 '24 13:05 BinGnadenlos

What CPU/GPU are you using to run PyTorch? I saw in the issues that someone claimed there might be a problem with Apple Silicon devices. #199

Sorry for the late response. I use CUDA.

BinGnadenlos avatar May 20 '24 13:05 BinGnadenlos

Hi, in this case, it looks like the network failed to converge to a single hidden neuron and ended up with two duplicate neurons instead. You could try increasing lamb_entropy or lamb, or changing the random seed.

The error "RuntimeError: stack expects a non-empty TensorList" means the threshold is set so large that all neurons are pruned away.
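To make the threshold point concrete, here is a minimal sketch in plain Python. It is not pykan's actual pruning code: the function name `surviving_neurons`, the strict `>` comparison, and the score values are all made up for illustration. It only shows how a larger threshold can leave no neurons at all, which is the situation in which stacking the (now empty) list of kept tensors fails.

```python
def surviving_neurons(scores, threshold):
    """Return the indices of neurons whose score exceeds the threshold.

    Hypothetical helper for illustration; pykan's own criterion may differ.
    """
    return [i for i, s in enumerate(scores) if s > threshold]


# Made-up importance scores for five hidden neurons.
scores = [0.001, 0.03, 0.04, 0.002, 0.0005]

kept_small = surviving_neurons(scores, 1e-2)  # two neurons pass this threshold
kept_large = surviving_neurons(scores, 5e-2)  # no neuron passes this threshold

print("threshold=1e-2 keeps:", kept_small)
print("threshold=5e-2 keeps:", kept_large)

if not kept_large:
    # An empty selection is what leads to
    # "stack expects a non-empty TensorList" further downstream.
    print("All neurons would be pruned; try a smaller threshold.")
```

If the scores of a trained model are all below the chosen threshold, lowering the threshold (or retraining with different regularization or another seed, as suggested above) is the way to keep at least one neuron.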

KindXiaoming avatar May 21 '24 13:05 KindXiaoming

RuntimeError: false INTERNAL ASSERT FAILED at "C:\b\abs_6fueooay2f\croot\pytorch-select_1707342446212\work\aten\src\ATen\native\BatchLinearAlgebra.cpp":1540, please report a bug to PyTorch. torch.linalg.lstsq: (Batch element 0): Argument 6 has illegal value. Most certainly there is a bug in the implementation calling the backend library..

I have the same issue. I will try different datasets and report back.

danjdowling avatar May 21 '24 18:05 danjdowling

I received this same error when training on a dataset I had generated for classification. The inputs and outputs were so similar that the model would need very little training. I followed this process and was able to generate outputs until In [11]. https://github.com/KindXiaoming/pykan/blob/master/hellokan.ipynb

I have now used data that actually requires learning, and the issue has gone away. So this was possibly caused by the input data and the lack of learning required.

danjdowling avatar May 21 '24 22:05 danjdowling