
[WIP] An implementation of discontiguous sampling of the SRerf variant; we call it MTORF(?)

Open adam2392 opened this issue 4 years ago • 2 comments

Summary

@ChesterHuynh and I were interested in extending the SRerf variant, which seems to work very well on low-sample image datasets, to low-sample multivariate time series (mts). A corresponding issue was created here: https://github.com/adam2392/SPORF/issues/1 to discuss and design how this might look. This PR addresses that issue and implements MTORF(?).

We would love some feedback, and ideally to get this merged so that we can "pip install" this variant.

Details of Implementation

Assuming that each mts sample is structured as (S x T), where S is the number of time series signals and T is the number of time points, MTORF essentially makes the patch sampling discontiguous along the row dimension (S), while keeping contiguous chunks in time (T).
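A rough sketch of the sampling idea in Python, for illustration only. This is not the C++ code in this PR: the function name `sample_mtorf_patch`, its parameters, the fixed patch sizes, and the row-major flattening of each (S x T) sample are all assumptions made for the example.

```python
import numpy as np


def sample_mtorf_patch(n_signals, n_times, patch_height, patch_width, rng):
    """Return the rows, columns, and flat feature indices of one patch.

    Hypothetical helper: rows (signals) are drawn with no adjacency
    constraint, while the time window is one contiguous chunk shared by
    all chosen rows.
    """
    # Discontiguous along S: any subset of rows, sorted for readability.
    rows = np.sort(rng.choice(n_signals, size=patch_height, replace=False))
    # Contiguous along T: pick a start so the window fits inside [0, T).
    start = rng.integers(0, n_times - patch_width + 1)
    cols = np.arange(start, start + patch_width)
    # Flat indices into the row-major vectorized (S * T,) sample (assumed layout).
    flat_idx = (rows[:, None] * n_times + cols[None, :]).ravel()
    return rows, cols, flat_idx


rng = np.random.default_rng(0)
rows, cols, flat_idx = sample_mtorf_patch(
    n_signals=6, n_times=1000, patch_height=3, patch_width=50, rng=rng
)
```

An SRerf-style patch would instead pick `rows` as a block of adjacent indices; the only change here is dropping that adjacency requirement along S.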

@ChesterHuynh did a C++ implementation in the attached code, and we have been running experiments with it to support some studies of ours. I summarize them below.

Studies to Back it up

  1. Simulation of a multivariate Gaussian with noisy samples in between: First, we did a simulation study that takes a 3-dim Gaussian and then generates 3 white noise signals. We generate ~1000 samples of each. Then we stack them as such:
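    import numpy as np

    rng = np.random.default_rng(0)
    n_samples = 1000
    # 3-dim Gaussian, shape (3, n_samples); zero mean and identity covariance are assumed here
    signal = rng.multivariate_normal(np.zeros(3), np.eye(3), size=n_samples).T
    # three white noise signals, each of shape (n_samples,)
    noise_1, noise_2, noise_3 = rng.standard_normal((3, n_samples))

    # interleave signal and noise rows; this is now a 6 x 1000 array
    noisy_signal = np.stack((signal[0], noise_1, signal[1], noise_2, signal[2], noise_3), axis=0)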

This was the result: image

This essentially demonstrates when MTORF vs SRerf is desirable, which motivated us to then proceed w/ some real data.
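One way to see why the interleaved layout separates the two variants is to enumerate which row subsets each sampling scheme can reach. The sketch below assumes the 6-row layout above (signal rows 0, 2, 4) and a patch height of 3, which is a choice made for the example and not a setting taken from the experiments.

```python
from itertools import combinations

signal_rows = {0, 2, 4}  # interleaved layout: signal, noise, signal, noise, ...
n_rows, height = 6, 3    # patch height of 3 is an assumption for illustration

# Contiguous row patches (SRerf-style): every block of adjacent rows.
contiguous = [set(range(r, r + height)) for r in range(n_rows - height + 1)]
# Discontiguous row patches (MTORF-style): every subset of rows.
discontiguous = [set(c) for c in combinations(range(n_rows), height)]

# Count patches that contain signal rows only (no noise rows).
pure_contiguous = sum(p <= signal_rows for p in contiguous)
pure_discontiguous = sum(p <= signal_rows for p in discontiguous)

print(f"contiguous: {pure_contiguous} of {len(contiguous)} patches are noise-free")
print(f"discontiguous: {pure_discontiguous} of {len(discontiguous)} patches are noise-free")
```

With this layout none of the 4 contiguous patches avoids the noise rows, while 1 of the 20 discontiguous patches covers exactly the three signal rows. This only counts reachable patches; it says nothing about the accuracy numbers in the figure.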

  2. Classification task for epilepsy: I used this variant when I set up an epilepsy outcome classification task based on the quantiles of features computed from iEEG data around a seizure onset. It was very helpful because I was able to utilize the fact that my input matrix was correlated in time, but I did not have to impose that each of the quantiles was correlated with its neighboring quantiles (SRerf vs MTORF). This example is a bit difficult to explain, so happy to add more details if desired.

  3. Motor decoding from iEEG data: Chester and I are currently working on a research project trying to decode motor movements (L, R, Up, Down) from iEEG signals. We hypothesize that only a subset of the iEEG data that we recorded is actually useful for decoding movement, and hence the MTORF variant is particularly useful.

Additional Information

Jesse helped me navigate where we might want to make the code change back in Feb 2020(?). Lol sorry for the delay in floating this back up. Jovo initially showed me the SRerf variant during I think a summer workshop he hosted. I prolly should do more tests comparing the different variants, but haven't found the time. I also briefly discussed things w/ Ronan and Hayden a long long time ago, so just trying to get this back on track :p.

Any critiques are appreciated.

adam2392, Dec 01 '20 17:12

Deploy preview for rerf failed.

Built with commit fcfabaa6d9626596437830eca6edc128f7d63cd9

https://app.netlify.com/sites/rerf/deploys/5fc67a8a27fa4f00074b09d8

netlify[bot], Dec 01 '20 17:12

Currently, some tests fail for me with the following:

    def test_urerf(projection_matrix):
        n_samples = 100
        n_classes = 2
        X, y = make_blobs(
            n_samples=n_samples, centers=n_classes, n_features=2, random_state=2 ** 4
        )
    
        clf = UnsupervisedRandomForest(projection_matrix=projection_matrix)
        clf.fit(X)
        sim_mat = clf.transform()
    
        assert np.array_equal(sim_mat.diagonal(), np.ones(n_samples))
    
        cluster = AgglomerativeClustering(n_clusters=n_classes).fit(sim_mat)
        predict_labels = cluster.fit_predict(sim_mat)
        score = adjusted_rand_score(y, predict_labels)
>       assert score > 0.9
E       assert 0.48526863084922006 > 0.9

Not sure if this is related to our changes tho.

adam2392, Dec 01 '20 17:12