gudhi-devel icon indicating copy to clipboard operation
gudhi-devel copied to clipboard

[Persistence_representations - Persistence_heat_maps] create_Gaussian_filter ignores its second argument

Open VincentRouvreau opened this issue 5 years ago • 5 comments

#include <iostream>
#include <gudhi/Persistence_heat_maps.h>

int main(){
  int n=2;
  auto a=Gudhi::Persistence_representations::create_Gaussian_filter(n,.1);
  auto b=Gudhi::Persistence_representations::create_Gaussian_filter(n,10);
  for(int i=0;i<a.size();++i){
    for(int j=0;j<a[0].size();++j){
      std::cout << a[i][j] << ' ';
    }
    std::cout << '\n';
  }
  std::cout << '\n';
  for(int i=0;i<b.size();++i){
    for(int j=0;j<b[0].size();++j){
      std::cout << b[i][j] << ' ';
    }
    std::cout << '\n';
  }
}

prints

0.000106789 0.00214491 0.00583047 0.00214491 0.000106789
0.00214491 0.0430817 0.117108 0.0430817 0.00214491
0.00583047 0.117108 0.318333 0.117108 0.00583047
0.00214491 0.0430817 0.117108 0.0430817 0.00214491
0.000106789 0.00214491 0.00583047 0.00214491 0.000106789

0.000106789 0.00214491 0.00583047 0.00214491 0.000106789 
0.00214491 0.0430817 0.117108 0.0430817 0.00214491 
0.00583047 0.117108 0.318333 0.117108 0.00583047 
0.00214491 0.0430817 0.117108 0.0430817 0.00214491 
0.000106789 0.00214491 0.00583047 0.00214491 0.000106789 

i.e. the same thing whatever sigma we use. The code looks like it is using sigma, but it actually simplifies out (up to rounding errors of course).

VincentRouvreau avatar Mar 07 '19 20:03 VincentRouvreau

Is this bug still relevant ? Is it related to the mathematics involved (bad formula) or something else ?

Hind-M avatar Apr 27 '21 07:04 Hind-M

Yes, I don't believe the bug was fixed.

There are strange things about the formula. Computing sqrt then immediately squaring it back, the normalization factor 1/(pi*sigma²) (useless since there is a renormalization right after) doesn't match the usual 1/(sigma*sqrt(2pi)), it looks closer to its square but not quite either.

There are contradictory comments and I am not sure what is intended. From this comment

// we are computing the kernel mask to 2 standard deviations away from the center. We discretize it in a grid of a // size 2pixel_radius times 2pixel_radius.

it seems that pixel_radius is maybe also supposed to be used as the standard deviation of the Gaussian, up to some constant factor, in which case we should remove the confusing and useless second argument, document that, and check that the formula is consistent. Or if sigma is meant to be used, then we should fix the formula.

Paweł never commented on this, maybe @MathieuCarriere would have some idea there.

mglisse avatar Apr 27 '21 08:04 mglisse

Sorry for the late comment. I did not use this code a lot, but I seem to remember that pixel_radius was indeed the equivalent of sigma, while the actual sigma parameter is not really used. Moreover, I think there is some sort of approximation there: the kernel is rounded to 0 with a mask outside of pixel_radius (probably for faster computations), whereas the exact formula requires integration of the kernel on each pixel.

MathieuCarriere avatar May 23 '22 09:05 MathieuCarriere

Mentioning this also reminds me that in our current Python version (in gudhi.representations), we also do an approximation by assuming that the kernel is constant on each pixel (also for faster computations).

MathieuCarriere avatar May 23 '22 09:05 MathieuCarriere

Generally speaking, it would be nice to include such exact versions in Python (when there is no approximation going on).

MathieuCarriere avatar Oct 26 '23 09:10 MathieuCarriere