OmicsPLS

Any special reason for n + max(nx, ny) in o2m2?

Open krassowski opened this issue 4 years ago • 1 comments

I just wanted to let you know that I was able to (roughly) reproduce Figure 12b from the Trygg and Wold (2003) paper using a slightly modified o2m2. It occurred to me that if I set the number of components in the first pass (A in the paper) to n rather than n + max(nx, ny), the algorithm matches my reading of the paper more closely, and the recreated figure is more similar to the original. The SVD version works almost as well (up to a sign flip).
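To make the difference concrete, here is a minimal pure-Python sketch of my reading of the first pass, not the OmicsPLS code itself. It assumes the first pass takes the leading k right singular vectors W of Y^T X, forms scores T = XW, and passes the residual E = X - TW^T on to the orthogonal step. The helper names (`leading_right_singular_vectors`, `first_pass_residual`), the random data, and the choice n = 1, nx = 2, ny = 1 are all made up for illustration, and a plain power iteration stands in for a proper SVD.

```python
import math
import random

def transpose(A):
    return [list(col) for col in zip(*A)]

def matmul(A, B):
    Bt = transpose(B)
    return [[sum(a * b for a, b in zip(row, col)) for col in Bt] for row in A]

def leading_right_singular_vectors(M, k, iters=300, seed=0):
    """Approximate the first k right singular vectors of M (as columns of a
    p x k matrix) by power iteration on M^T M, re-orthogonalising against
    the vectors already found. Hypothetical helper, not an OmicsPLS function."""
    rng = random.Random(seed)
    G = matmul(transpose(M), M)              # p x p Gram matrix
    p = len(G)
    vecs = []
    for _ in range(k):
        v = [rng.gauss(0, 1) for _ in range(p)]
        for _ in range(iters):
            v = [sum(g * x for g, x in zip(row, v)) for row in G]
            for u in vecs:                   # Gram-Schmidt against earlier vectors
                d = sum(a * b for a, b in zip(u, v))
                v = [a - d * b for a, b in zip(v, u)]
            norm = math.sqrt(sum(x * x for x in v))
            v = [x / norm for x in v]
        vecs.append(v)
    return transpose(vecs)

def first_pass_residual(X, Y, k):
    """Residual E = X - T W^T after a first pass with k components (assumed form)."""
    W = leading_right_singular_vectors(matmul(transpose(Y), X), k)
    T = matmul(X, W)
    TWt = matmul(T, transpose(W))
    return [[x - t for x, t in zip(rx, rt)] for rx, rt in zip(X, TWt)]

def frobenius(A):
    return math.sqrt(sum(x * x for row in A for x in row))

# Illustrative random data: 20 samples, 4 X-variables, 3 Y-variables.
rng = random.Random(1)
X = [[rng.gauss(0, 1) for _ in range(4)] for _ in range(20)]
Y = [[rng.gauss(0, 1) for _ in range(3)] for _ in range(20)]

n, nx, ny = 1, 2, 1
res_n = frobenius(first_pass_residual(X, Y, n))
res_big = frobenius(first_pass_residual(X, Y, n + max(nx, ny)))
print("residual norm with n components:              ", res_n)
print("residual norm with n + max(nx, ny) components:", res_big)
```

Because the weights are orthonormal and the larger set contains the smaller one, the residual handed to the orthogonal step can only shrink when the first pass uses more components. That would be one way the two variants end up with different orthogonal components, under the assumptions stated above.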

Result for n + max(nx, ny): [figure: 12a_reproduction_n+max(nx,ny)]

Result for n: [figure: 12b_reproduction_n]

Relevant code: https://github.com/selbouhaddani/OmicsPLS/blob/913c3e5ea403606e513bb888c195bdc7a3e0940a/R/OmicsPLS_o2m.R#L130-L134

Based on the comment ("larger principal subspace") I understand that there might be a reason for this modification, and I would be happy to learn more if you could point me to a reference. If you don't have anything at hand, please feel free to close this issue. I wanted to put this up somewhere so that another curious person (or future me) would not need to go through the debugging process again.

There is still some noise (which may be due to differences in the cross-validation splits or in the OSC filtering), and the y-axis scales differ (I tried autoscaling the data; it did not help). I could not find anything that would explain the differences, and it seems that not much more can be deduced from the original publication without access to their code.
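For clarity, by autoscaling I mean column-wise mean-centering followed by scaling to unit variance. A minimal sketch is below; the function name `autoscale`, the toy data, and the use of the sample standard deviation (n - 1 denominator) are assumptions for illustration, not what the paper or OmicsPLS necessarily does.

```python
import math

def autoscale(X):
    """Centre each column to mean 0 and scale it to unit sample variance.
    Constant columns are centred and left at zero."""
    n = len(X)
    scaled_cols = []
    for col in zip(*X):
        mean = sum(col) / n
        centered = [x - mean for x in col]
        sd = math.sqrt(sum(c * c for c in centered) / (n - 1))
        scaled_cols.append([c / sd if sd > 0 else 0.0 for c in centered])
    return [list(row) for row in zip(*scaled_cols)]

# Toy data: two variables on very different scales.
X = [[1.0, 10.0], [2.0, 30.0], [3.0, 20.0], [4.0, 40.0]]
Z = autoscale(X)
for row in Z:
    print(row)
```

After this step both columns have the same spread, so a remaining difference in y-axis scale would have to come from somewhere else, for example from how scores or loadings are normalised.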

Finally, thank you for all the recent improvements!

Best wishes, Michał

krassowski, Jul 30 '19 16:07