Normalization of parameters and snapshots
Is your feature request related to a problem? Please describe.
I would like to normalize/scale my snapshots and/or parameters using ezyrb.
Describe the solution you'd like
import numpy as np
from ezyrb import Database
points = np.array([[1, 2],
                   [5, 6],
                   [9, 10]])
values = np.array([[0.0, 0.1, 0.2],
                   [0.3, 0.4, 0.5],
                   [0.6, 0.7, 0.8]])
db_train = Database(points, values)
db_train.normalise_parameters()
print(db_train.parameters_n)
# [[0. 0. ]
# [0.5 0.5 ]
# [1. 1. ]]
db_test = Database(np.array([[2.5, 2.5]]),
                   np.array([[0.3, 0.3, 0.3]]))
db_test.scaler_parameters = db_train.scaler_parameters
db_test.scale_down_parameters()
print(db_test.parameters_n)
# [[0.1875 0.0625]]
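For reference, the numbers in the example above correspond to column-wise min-max scaling, where the minimum and range are taken from the training parameters only and then reused for the test parameters. A minimal NumPy sketch of that arithmetic (independent of ezyrb; the variable names are only illustrative):

```python
import numpy as np

# Training and test parameters from the example above.
train = np.array([[1, 2], [5, 6], [9, 10]], dtype=float)
test = np.array([[2.5, 2.5]])

# "Fit": column-wise minimum and range come from the training set only.
p_min = train.min(axis=0)
p_range = train.max(axis=0) - p_min

# "Transform": the same statistics are applied to both sets.
train_n = (train - p_min) / p_range
test_n = (test - p_min) / p_range

print(train_n)
print(test_n)  # [[0.1875 0.0625]]
```

The important part is that the test set is scaled with the statistics of the training set, which is what copying scaler_parameters from db_train to db_test is meant to express.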
Describe alternatives you've considered
I have looked into the optional keyword arguments scaler_parameters and scaler_snapshots. They are used by calling their fit_transform method, but I did not find any definition of fit_transform or any description of how the scalers are supposed to be used.
I am happy to contribute if you think that fits in your library and is not already possible in some other way.
I figured out that this must be for the StandardScaler from scikit-learn. I think this could be better documented.
- In the database class documentation, scaler_parameters and scaler_snapshots should be removed, since they are no longer used. (In older versions, fit_transform was called every time parameters or snapshots were referenced, meaning the scaler may change, which can lead to unexpected behaviour...)
- In the DatabaseScaler class, the StandardScaler should probably be referenced. It would be great to have an example of how to use this plugin.
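To illustrate why calling fit_transform on every access is risky, here is a small sketch. SketchStandardScaler is a hypothetical stand-in that only follows the scikit-learn naming convention (fit, transform, fit_transform); it is not the ezyrb or scikit-learn code:

```python
import numpy as np


class SketchStandardScaler:
    """Hypothetical stand-in that mimics the fit/transform naming convention."""

    def fit(self, x):
        # Store column-wise statistics of the data passed in.
        self.mean_ = x.mean(axis=0)
        self.scale_ = x.std(axis=0)
        return self

    def transform(self, x):
        # Apply the stored statistics without changing them.
        return (x - self.mean_) / self.scale_

    def fit_transform(self, x):
        # Refit on x, then transform x.
        return self.fit(x).transform(x)


train = np.array([[0.0, 0.1, 0.2],
                  [0.3, 0.4, 0.5],
                  [0.6, 0.7, 0.8]])
other = train + 1.0

scaler = SketchStandardScaler()
scaler.fit_transform(train)
mean_after_train = scaler.mean_.copy()

# transform() leaves the fitted statistics untouched.
scaler.transform(other)
unchanged = np.allclose(scaler.mean_, mean_after_train)

# fit_transform() on different data silently refits the scaler.
scaler.fit_transform(other)
changed = not np.allclose(scaler.mean_, mean_after_train)

print(unchanged, changed)  # True True
```

So any code path that calls fit_transform on data other than the training set replaces the statistics that the training data was scaled with.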
I wondered if the scalers (scaler_parameters and scaler_snapshots) from the database should be removed elsewhere (e.g. in the ROM) as well? I think it was a good idea to separate the scaling from the ROM.
Is database.parameters (and .snapshots) considered to be deprecated now in favour of database.parameters_matrix?
Any thoughts, @ndem0? Do you have an example of how to use the scaler plugin?