aeon
[ENH] Make minirocket capable of taking unequal length collections
Part of #1699. This PR makes MiniRocket capable of handling unequal length collections and deprecates the MiniRocketMultivariateVariable class. The same change will be rolled out to the other convolution based transformers, which also gives the associated estimators the capability:unequal_length: True tag.
The main issue is that you cannot pass both a 3D numpy array (equal length) and a list of numpy arrays (np-list, for unequal length) to the same numba parameter when its type is fixed by the decorator signature. Two places that use numba functions have to be changed:
1. _fit_biases: this uses the series length internally, here:
_X = X[np.random.randint(n_cases)][channels_this_combination]
A = -_X # A = alpha * X = -X
G = _X + _X + _X # G = gamma * X = 3X
C_alpha = np.zeros(
    (n_channels_this_combination, n_timepoints), dtype=np.float32
)
So my solution is to split it into two functions, _fit_biases_numpy and _fit_biases_list. Currently the second is not a numba function, since I don't think you can easily pass a list of numpy arrays to numba (I could very well be wrong). It is not computationally intensive.
2. static _transform
This loops through each instance, transforming it. My solution is to take this loop out of numba and add a new function, _single_case_transform, to which we pass the case and the remaining arguments:
_X,
features,
n_channels,
n_timepoints,
n_dilations,
n_features_per_dilation,
dilations,
n_channels_per_combination,
channel_indices,
biases,
n_kernels,
indices,
An alternative would be to just remove the decorator typing (not sure if that works), or to have two separate private functions. I'll benchmark times, but at the moment it looks like this slows things down too much. I'll post graphs below.
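The two changes described above can be illustrated with a minimal sketch. It is not the code in this PR: `fit_biases_sketch`, `_single_case_ppv` and `transform_sketch` are hypothetical names, the channel sum is a stand-in for the real dilated convolution, and the `njit` fallback is only there so the example runs without numba installed. What it shows is the pattern: take the length from each case rather than from the collection, and keep the per-case work in a compiled function while the loop over cases stays in Python.

```python
import numpy as np

try:
    from numba import njit
except ImportError:  # keep the sketch runnable without numba
    def njit(func):
        return func


def fit_biases_sketch(X, quantiles, seed=0):
    """Toy bias fitting: quantiles of the channel sum of one random case."""
    rng = np.random.default_rng(seed)
    # len() gives n_cases for both a 3D array and a list of 2D arrays.
    case = np.asarray(X[rng.integers(len(X))], dtype=np.float32)
    # The length comes from the chosen case, not from the whole collection.
    n_channels, n_timepoints = case.shape
    C = np.zeros((n_channels, n_timepoints), dtype=np.float32)
    C[:] = case  # placeholder for the real convolution output
    return np.quantile(C.sum(axis=0), quantiles).astype(np.float32)


@njit
def _single_case_ppv(x, biases):
    """Proportion of time points where the channel sum exceeds each bias."""
    n_channels, n_timepoints = x.shape
    out = np.zeros(biases.shape[0], dtype=np.float32)
    for b in range(biases.shape[0]):
        positive = 0
        for t in range(n_timepoints):
            total = 0.0
            for c in range(n_channels):
                total += x[c, t]
            if total > biases[b]:
                positive += 1
        out[b] = positive / n_timepoints
    return out


def transform_sketch(X, biases):
    """Loop over cases in Python so each case may have its own length."""
    features = np.zeros((len(X), biases.shape[0]), dtype=np.float32)
    for i in range(len(X)):
        # Each call only ever sees a 2D array, so one compiled
        # specialisation serves equal and unequal length input.
        features[i] = _single_case_ppv(X[i].astype(np.float32), biases)
    return features
```

Both `transform_sketch(np.ones((3, 2, 10)), biases)` and `transform_sketch([np.ones((2, 5)), np.ones((2, 8))], biases)` go through the same compiled function. The cost of this layout is one Python-to-numba call per case, which is the overhead the benchmark needs to quantify.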
Thank you for contributing to aeon
I have added the following labels to this PR based on the title: [ $\color{#FEF1BE}{\textsf{enhancement}}$ ]. I would have added the following labels to this PR based on the changes made: [ $\color{#41A8F6}{\textsf{transformations}}$ ], however some package labels are already present.
The Checks tab will show the status of our automated tests. You can click on individual test runs in the tab or "Details" in the panel below to see more information if there is a failure.
If our pre-commit code quality check fails, any trivial fixes will automatically be pushed to your PR unless it is a draft.
Don't hesitate to ask questions on the aeon Slack channel if you have any.
PR CI actions
These checkboxes will add labels to enable/disable CI functionality for this PR. This may not take effect immediately, and a new commit may be required to run the new configuration.
- [ ] Run pre-commit checks for all files
- [ ] Run all pytest tests and configurations
- [ ] Run all notebook example tests
- [ ] Run numba-disabled codecov tests
- [ ] Stop automatic pre-commit fixes (always disabled for drafts)
Timing experiment for reference (main version):
def timing_experiment():
    import time

    import numpy as np

    # Import paths assume a recent aeon layout; adjust for your version.
    from aeon.testing.data_generation import (
        make_example_3d_numpy,
        make_example_3d_numpy_list,
    )
    from aeon.transformations.collection.convolution_based import (
        MiniRocket,
        MiniRocketMultivariateVariable,
    )

    # Build numba functions up front so compilation time is not measured
    X = np.random.random(size=(10, 1, 100))
    r = MiniRocket()
    r.fit_transform(X)
    r2 = MiniRocketMultivariateVariable()
    r2.fit_transform(X)
    for i in range(1000, 21000, 1000):
        # X1/X2: univariate, X3/X4: six channels; X2/X4 are unequal length
        X1 = make_example_3d_numpy(
            n_cases=i, n_channels=1, n_timepoints=500, return_y=False
        )
        X2 = make_example_3d_numpy_list(
            n_cases=i, n_channels=1, min_n_timepoints=450,
            max_n_timepoints=550, return_y=False
        )
        X3 = make_example_3d_numpy(
            n_cases=i, n_channels=6, n_timepoints=500, return_y=False
        )
        X4 = make_example_3d_numpy_list(
            n_cases=i, n_channels=6, min_n_timepoints=450,
            max_n_timepoints=550, return_y=False
        )
        start = time.time()
        r.fit_transform(X1)
        t1 = time.time() - start
        start = time.time()
        r2.fit_transform(X2)
        t2 = time.time() - start
        start = time.time()
        r2.fit_transform(X3)
        t3 = time.time() - start
        start = time.time()
        r2.fit_transform(X4)
        t4 = time.time() - start
        print(i, " ", t1, ",", t2, ",", t3, ",", t4)
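The script above times each configuration once with `time.time()`. A small helper along the following lines could reduce noise by repeating each call and keeping the fastest run. `best_of` is a hypothetical name and is not part of aeon or of this PR; it is a sketch of one common way to do this.

```python
import time


def best_of(func, *args, repeats=3):
    """Return the fastest wall-clock time of `repeats` calls to func(*args)."""
    best = float("inf")
    for _ in range(repeats):
        # perf_counter is monotonic and has higher resolution than time.time
        start = time.perf_counter()
        func(*args)
        best = min(best, time.perf_counter() - start)
    return best
```

With it, a measurement in the loop would read `t1 = best_of(r.fit_transform, X1)`, at the price of running each configuration `repeats` times.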
see #2351