chemotools
Improve AirPLS and ArPLS performance - sparse matrix operations
Description
AirPLS (Adaptive Iteratively Reweighted Penalized Least Squares) and ArPLS (Asymmetrically Reweighted Penalized Least Squares) are powerful algorithms for removing complex non-linear baselines from spectral signals. However, their computational cost can be significant, especially when processing large numbers of spectra. Currently, we use the csc_matrix representation from scipy.sparse to optimize performance, but further improvements are needed.
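For orientation, the kind of computation being discussed can be sketched as follows. This is a minimal illustration of an ArPLS-style reweighting loop built on scipy.sparse, not the chemotools implementation; the function name `arpls_baseline` and the parameters `lam`, `ratio`, and `max_iter` are assumptions chosen for this example.

```python
import numpy as np
from scipy.sparse import diags
from scipy.sparse.linalg import spsolve
from scipy.special import expit


def arpls_baseline(y, lam=1e4, ratio=0.01, max_iter=50):
    """Illustrative ArPLS-style baseline estimate (hypothetical helper)."""
    n = y.shape[0]
    # Second-order difference operator D with shape (n - 2, n), stored as CSC.
    D = diags([1.0, -2.0, 1.0], [0, 1, 2], shape=(n - 2, n), format="csc")
    H = lam * (D.T @ D)  # smoothness penalty (pentadiagonal)
    w = np.ones(n)
    z = y.copy()
    for _ in range(max_iter):
        W = diags(w, 0, format="csc")
        # One sparse solve of (W + H) z = w * y per iteration.
        z = spsolve(W + H, w * y)
        d = y - z
        neg = d[d < 0]
        if neg.size < 2 or neg.std() == 0:
            break
        m, s = neg.mean(), neg.std()
        # Logistic weights where the signal lies above the baseline, 1 elsewhere.
        w_new = np.where(d >= 0, expit(-2.0 * (d - (2.0 * s - m)) / s), 1.0)
        if np.linalg.norm(w - w_new) / np.linalg.norm(w) < ratio:
            break
        w = w_new
    return z


# Usage on a synthetic spectrum: linear baseline plus one Gaussian peak.
x = np.linspace(0.0, 1.0, 200)
y = 2.0 + 3.0 * x + 5.0 * np.exp(-((x - 0.5) / 0.02) ** 2)
baseline = arpls_baseline(y)
corrected = y - baseline
```

Each iteration performs a full sparse solve, which is why the cost grows quickly when many spectra are processed one after another.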
Improvement Attempts
To improve performance, I tried just-in-time (JIT) compilation of some key functions using numba. However, numba does not support the csc_matrix type, so the code cannot be JIT compiled as it stands. To work around this, I looked for a numba-compatible representation of sparse matrices but could not find one. I therefore created my own, together with some functions for basic algebraic operations on them (code in a Gist). Unfortunately, this did not improve performance over the current implementation.
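One possible alternative representation, sketched here as an assumption rather than as the approach used in the Gist, is band storage. With a second-order difference penalty, the system solved in each reweighting iteration, (W + lam * D.T @ D) z = W y, is symmetric and pentadiagonal, so it fits in a dense array of shape (3, n). Plain NumPy arrays are a type numba supports, unlike csc_matrix. The helper names `penalty_bands` and `weighted_smooth` are hypothetical, and the sketch uses scipy.linalg.solveh_banded instead of numba.

```python
import numpy as np
from scipy.linalg import solveh_banded


def penalty_bands(n, lam):
    """Upper bands of lam * D.T @ D for the second-order difference D.

    Row 0 holds the second superdiagonal, row 1 the first superdiagonal and
    row 2 the main diagonal (the layout of solveh_banded with lower=False).
    """
    if n < 4:
        raise ValueError("this sketch assumes at least 4 points")
    ab = np.zeros((3, n))
    ab[2, :] = 6.0
    ab[2, [0, -1]] = 1.0
    ab[2, [1, -2]] = 5.0
    ab[1, 1:] = -4.0
    ab[1, [1, -1]] = -2.0
    ab[0, 2:] = 1.0
    return lam * ab


def weighted_smooth(y, w, lam=1e4):
    """Solve (W + lam * D.T @ D) z = w * y using band storage only."""
    ab = penalty_bands(y.shape[0], lam)
    ab[2, :] += w  # add the diagonal weight matrix W to the main diagonal
    return solveh_banded(ab, w * y, lower=False)


# Usage: one weighted smoothing step with unit weights.
y = np.sin(np.linspace(0.0, 3.0, 100)) + 2.0
z = weighted_smooth(y, np.ones_like(y), lam=100.0)
```

Whether this beats the current csc_matrix code would need to be benchmarked; the sketch only shows that the matrix structure allows a dense, array-based layout.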
Hacktoberfest Challenge
We invite open source developers to contribute to our project during Hacktoberfest. The goal is to improve the performance of both algorithms.
Here are some ideas to work on:
- Find a more efficient way to JIT compile the code using tools like numba.
- Investigate parallel or distributed computing techniques to speed up the processing of multiple spectra.
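As a starting point for the second idea, the sketch below maps a per-spectrum baseline function over the rows of a 2-D array with a thread pool from the standard library. The names `correct_spectra` and `linear_baseline` are hypothetical, and `linear_baseline` is only a stand-in for an AirPLS or ArPLS routine. Whether threads bring a real speedup depends on how much of the per-spectrum work releases the GIL; a process pool is the usual alternative when it does not.

```python
from concurrent.futures import ThreadPoolExecutor

import numpy as np


def linear_baseline(y):
    """Stand-in baseline: a straight-line fit (placeholder for AirPLS/ArPLS)."""
    x = np.arange(y.shape[0])
    return np.polyval(np.polyfit(x, y, 1), x)


def correct_spectra(X, baseline_fn, max_workers=4):
    """Subtract a per-spectrum baseline from each row of X using a thread pool."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # pool.map preserves row order, so baselines line up with X.
        baselines = list(pool.map(baseline_fn, X))
    return X - np.vstack(baselines)


# Usage: three spectra, each a sloped line plus one Gaussian peak.
x = np.arange(100.0)
X = np.vstack([0.1 * k * x + np.exp(-((x - 50.0) / 3.0) ** 2) for k in range(1, 4)])
corrected = correct_spectra(X, linear_baseline)
```

Because the spectra are independent, the same pattern carries over to other executors without changing the per-spectrum function.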
How to Contribute
Here are the contributing guidelines.
Contact
We can have the conversation in the Issue or the Discussion.
Resources
Here are some relevant resources and references for understanding the theory and implementation of the AirPLS and ArPLS algorithms:
- Paper on AirPLS: Z.-M. Zhang, S. Chen, and Y.-Z. Liang, Baseline correction using adaptive iteratively reweighted penalized least squares. Analyst 135 (5), 1138-1146 (2010).
- Paper on ArPLS: S.-J. Baek, A. Park, Y.-J. Ahn, and J. Choo, Baseline correction using asymmetrically reweighted penalized least squares smoothing.