[Feature Request] scikit-learn compatible GPR module

Open yhtang opened this issue 5 years ago • 6 comments

Hi @pghysels, @liuyangzhuan, and @xiaoyeli,

Following up on our discussion, we would like to explore the possibility of subclassing GaussianProcessRegressor from scikit-learn. The definition of the regressor is here.

The idea is to let users provide a kernel that conforms to the interface of sklearn.gaussian_process.kernels.Kernel, and then you are free to iterate the kernel over a list of submatrices of the overall kernel matrix for compression.
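To make the idea concrete, here is a minimal sketch of evaluating a kernel on selected submatrices only. It assumes the kernel is a callable with the `kernel(X, Y)` convention that scikit-learn kernels follow; `rbf_kernel` is a NumPy stand-in and `kernel_block` is a hypothetical helper, neither taken from STRUMPACK.

```python
import numpy as np

def rbf_kernel(X, Y=None, length_scale=1.0):
    """Stand-in kernel (assumption): callable as k(X, Y), like sklearn kernels."""
    X = np.atleast_2d(X)
    Y = X if Y is None else np.atleast_2d(Y)
    # Squared Euclidean distance between every row of X and every row of Y.
    sq = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def kernel_block(kernel, X, rows, cols):
    """Hypothetical helper: evaluate only the (rows, cols) block of K."""
    return kernel(X[rows], X[cols])

rng = np.random.default_rng(0)
X = rng.standard_normal((8, 3))

# Two off-diagonal blocks, the kind a compression scheme would request.
blocks = [(np.arange(0, 4), np.arange(4, 8)),
          (np.arange(4, 8), np.arange(0, 4))]
sub = [kernel_block(rbf_kernel, X, r, c) for r, c in blocks]

# Reference: the full dense kernel matrix, built here only for comparison.
K_full = rbf_kernel(X)
```

Each block matches the corresponding slice of `K_full`, so a solver that only ever asks for blocks never needs the dense matrix.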

Let me know if there is anything I can help with here.

yhtang avatar Dec 03 '19 19:12 yhtang

Hi Yu-Hang,

Can you try STRUMPACKGaussianProcessRegressor in /global/cscratch1/sd/pghysels/STRUMPACK/install/include/python/STRUMPACKKernel.py on Cori?

      export PYTHONPATH=$PYTHONPATH:/global/cscratch1/sd/pghysels/STRUMPACK/install/include/python
      export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/global/cscratch1/sd/pghysels/STRUMPACK/install/lib
      export LD_LIBRARY_PATH=$LD_LIBRARY_PATH:/global/cscratch1/sd/pghysels/ButterflyPACK/build

It has __init__, fit and predict similar to GaussianProcessRegressor.
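For readers unfamiliar with that interface, the following is a rough sketch of the `__init__` / `fit` / `predict` shape. `DenseGPR` and `rbf` are hypothetical names, and the dense Cholesky solve is a stand-in for the compressed solve; this is not the STRUMPACKGaussianProcessRegressor implementation.

```python
import numpy as np

def rbf(X, Y=None, length_scale=1.0):
    """Stand-in kernel (assumption): callable as k(X) or k(X, Y)."""
    X = np.atleast_2d(X)
    Y = X if Y is None else np.atleast_2d(Y)
    sq = (np.sum(X**2, axis=1)[:, None]
          + np.sum(Y**2, axis=1)[None, :]
          - 2.0 * X @ Y.T)
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

class DenseGPR:
    """Hypothetical regressor exposing the same three methods."""

    def __init__(self, kernel, alpha=1e-10):
        self.kernel = kernel   # any callable k(X, Y)
        self.alpha = alpha     # value added to the diagonal of K

    def fit(self, X, y):
        self.X_train_ = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        K = self.kernel(self.X_train_)
        K[np.diag_indices_from(K)] += self.alpha
        # Dense Cholesky solve of (K + alpha*I) c = y; a compressed
        # solver would replace this step.
        L = np.linalg.cholesky(K)
        self.dual_coef_ = np.linalg.solve(L.T, np.linalg.solve(L, y))
        return self

    def predict(self, X):
        # Posterior mean: k(X*, X_train) @ c
        K_star = self.kernel(np.asarray(X, dtype=float), self.X_train_)
        return K_star @ self.dual_coef_

X = np.arange(6.0)[:, None]
y = np.sin(X).ravel()
model = DenseGPR(rbf, alpha=1e-8).fit(X, y)
pred = model.predict(X)
```

With a small `alpha`, the predicted mean at the training points reproduces `y` closely, which is a quick sanity check for any replacement solver.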

You can also install things yourself from https://github.com/pghysels/STRUMPACK (branch gpr) and https://github.com/liuyangzhuan/ButterflyPACK

Pieter

pghysels avatar Dec 10 '19 04:12 pghysels

@pghysels Thanks for the update! I'll try it on my workstation since I don't (yes I'm not joking) have an account on Cori yet. Will let you know as soon as possible!

yhtang avatar Dec 10 '19 08:12 yhtang

Ok. Let me know if you have issues with the installation. We might also need to tune some parameters; we can look into this together.

pghysels avatar Dec 10 '19 18:12 pghysels

@pghysels How can I get a .so file? Currently, I can only get a .a library file after compilation.

yhtang avatar Dec 11 '19 07:12 yhtang

Add -DBUILD_SHARED_LIBS=ON to the cmake command. Did you also build ButterflyPACK and enable support for that in STRUMPACK?

      -DTPL_ENABLE_BPACK=ON \
      -DTPL_BPACK_INCLUDE_DIRS="$BPACKHOME/SRC_DOUBLE/;$BPACKHOME/SRC_DOUBLECOMPLEX" \
      -DTPL_BPACK_LIBRARIES="-L$BPACKHOME/build/SRC_DOUBLE/ -ldbutterflypack -L$BPACKHOME/build/SRC_DOUBLECOMPLEX/ -lzbutterflypack" \

You can add

      -DSTRUMPACK_BUILD_TESTS=OFF \
      -DSTRUMPACK_C_INTERFACE=OFF \

to speed up compilation.

Pieter

pghysels avatar Dec 11 '19 18:12 pghysels

Hi @pghysels and @liuyangzhuan, when I tried to run the regressor using stock kernels from scikit-learn, I got an error that crashes the Python interpreter. Could you please take a look at the details in #26?

In addition to that, I've done some work to make the regressor (at least) syntactically compatible with generic kernels. However, I guess it will be more efficient if we can figure out #26 first.

yhtang avatar Dec 19 '19 08:12 yhtang