pyHSICLasso
Use only lower half of the kernel matrices
Hi @myamada0321,
As we discussed, this patch implements a speed-up for HSIC Lasso. When vectorizing each kernel matrix, I now use only the lower triangle (diagonal included). For an n × n kernel matrix this stores n(n+1)/2 entries instead of n², so the implementation uses roughly half the memory. LARS might be faster as well, since it operates on shorter vectors.
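For readers following the thread, here is a minimal NumPy sketch of the idea, not the code in this patch. The helper name `vectorize_lower` and the toy matrix are hypothetical and chosen only for illustration.

```python
import numpy as np


def vectorize_lower(K):
    """Return the lower triangle of a square matrix (diagonal included) as a 1-D vector.

    Hypothetical helper for illustration; it is not part of pyHSICLasso.
    """
    rows, cols = np.tril_indices(K.shape[0])
    return K[rows, cols]


# Toy symmetric matrix standing in for a kernel (Gram) matrix.
n = 4
A = np.arange(n * n, dtype=float).reshape(n, n)
K = A + A.T

full = K.ravel()            # n * n entries
lower = vectorize_lower(K)  # n * (n + 1) / 2 entries

print(full.size, lower.size)
```

Because a kernel matrix is symmetric, the upper triangle carries no extra information, which is what makes dropping it possible.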
I had to adapt some tests as well. It seems this causes the reordering of some features, but roughly the same features are selected.
Let me know if you find any issues with this version.
Cheers, H.
I am not sure if this is important, but I observed that one specific feature (299) always disappears in the tests, i.e. it was selected before but is not anymore. I am not sure whether that reveals some unexpected behaviour; I will look into it.
Thank you for the update! Halving the memory use is nice.
> I had to adapt some tests as well. It seems this causes the reordering of some features, but roughly the same features are selected.
In principle, the results must be identical. Could you check the estimated parameters?
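One thing that may be worth checking here (an assumption, not verified against this patch): the inner product of two fully vectorized symmetric matrices counts every off-diagonal entry twice, whereas the inner product of their lower triangles counts it once. Scaling the off-diagonal entries by √2 restores the original inner product. The sketch below uses plain NumPy with hypothetical helper names and random toy matrices.

```python
import numpy as np


def vectorize_lower_weighted(K):
    """Lower triangle of K as a vector, with off-diagonal entries scaled by sqrt(2).

    Hypothetical helper: the scaling makes dot products of these vectors equal
    the Frobenius inner product of the full symmetric matrices.
    """
    rows, cols = np.tril_indices(K.shape[0])
    v = K[rows, cols].astype(float)
    v[rows != cols] *= np.sqrt(2.0)
    return v


rng = np.random.default_rng(0)
X = rng.standard_normal((5, 5))
Y = rng.standard_normal((5, 5))
K = X @ X.T  # toy symmetric matrices standing in for kernel matrices
L = Y @ Y.T

idx = np.tril_indices(5)
full = K.ravel() @ L.ravel()                                         # full vectorization
plain = K[idx] @ L[idx]                                              # unweighted lower triangle
weighted = vectorize_lower_weighted(K) @ vectorize_lower_weighted(L)  # sqrt(2)-weighted

print(full, plain, weighted)
```

If the solver relies on such inner products, the unweighted lower triangle changes the relative weight of diagonal and off-diagonal entries, which could shift the estimated parameters.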
Indeed, it seems the estimated parameters are not the same. For a toy example:
- Old run:
  [Image: Screenshot 2021-07-14 at 12 35 39]
- New run:
  [Image: Screenshot 2021-07-14 at 12 36 16]
I'll see if I can fix this eventually.