pyHSICLasso

Use only lower half of the kernel matrices

Open hclimente opened this issue 3 years ago • 3 comments

Hi @myamada0321,

As we discussed, this patch implements a speed-up for HSIC Lasso. When vectorizing the kernel matrices, I keep only the lower triangle (diagonal included), i.e. n(n+1)/2 entries instead of n² for an n×n matrix. Hence, this implementation uses about half the memory. LARS might be faster as well.
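To illustrate the idea, here is a minimal NumPy sketch of the half-vectorization. It is not the code from the patch: the helper name `vec_lower` and the toy Gaussian kernel are assumptions made for this example only.

```python
import numpy as np

def vec_lower(K):
    """Return the lower triangle of a square matrix (diagonal included) as a 1-D vector.

    Hypothetical helper for illustration; not part of pyHSICLasso.
    """
    rows, cols = np.tril_indices(K.shape[0])
    return K[rows, cols]

# Toy symmetric kernel matrix (Gaussian kernel on 5 one-dimensional samples).
rng = np.random.default_rng(0)
X = rng.standard_normal((5, 1))
K = np.exp(-0.5 * (X - X.T) ** 2)

full = K.flatten()   # n * n entries
half = vec_lower(K)  # n * (n + 1) / 2 entries

print(full.size, half.size)  # 25 15
```

Since the kernel matrix is symmetric, no information is lost: the full matrix can be rebuilt from the half vector.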

I had to adapt some tests as well. It seems this causes the reordering of some features, but roughly the same features are selected.

Let me know if you find any issues with this version.

Cheers, H.

hclimente, Jul 12 '21 18:07

I am not sure if this is important, but I observed that one specific feature (299) always disappears in the tests, i.e. it was selected before but is not anymore. I am not sure whether that reveals some unexpected behaviour; I will look into it.

hclimente, Jul 12 '21 18:07

Thank you for the update! Reducing to half memory is nice.

> I had to adapt some tests as well. It seems this causes the reordering of some features, but roughly the same features are selected.

In general, the results must be identical. Could you check the estimated parameter?
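One thing worth checking, sketched here under assumptions rather than taken from the patch: a Lasso solution depends on the vectorized kernels only through their inner products. For symmetric matrices, the inner product of the full vectorizations counts every off-diagonal entry twice, so plain lower-triangle vectors give different inner products. Weighting the off-diagonal entries by √2 restores them. Whether this accounts for any difference in this patch is not established here; the helper `vec_lower` and its `weight_offdiag` parameter are hypothetical.

```python
import numpy as np

def vec_lower(K, weight_offdiag=1.0):
    """Lower triangle (diagonal included) as a vector, optionally scaling off-diagonal entries.

    Hypothetical helper for illustration; not part of pyHSICLasso.
    """
    rows, cols = np.tril_indices(K.shape[0])
    v = K[rows, cols].astype(float)
    v[rows != cols] *= weight_offdiag
    return v

# Two random symmetric matrices standing in for kernel matrices.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 6))
A = A @ A.T
B = rng.standard_normal((6, 6))
B = B @ B.T

full_ip = A.ravel() @ B.ravel()                                   # inner product of full vectorizations
plain_ip = vec_lower(A) @ vec_lower(B)                            # plain lower triangle
scaled_ip = vec_lower(A, np.sqrt(2)) @ vec_lower(B, np.sqrt(2))   # off-diagonals weighted by sqrt(2)

print("plain matches full: ", np.isclose(full_ip, plain_ip))
print("scaled matches full:", np.isclose(full_ip, scaled_ip))
```

If the inner products between all vectorized kernels agree with the full version, the estimated parameters should agree as well, up to numerical precision.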

myamada0321, Jul 13 '21 02:07

Indeed, it seems the estimated parameters are not the same. For a toy example:

  • Old run:
Screenshot 2021-07-14 at 12 35 39
  • New run:
Screenshot 2021-07-14 at 12 36 16

I'll see if I can fix this eventually.

hclimente, Jul 14 '21 10:07