rascaline icon indicating copy to clipboard operation
rascaline copied to clipboard

Transformer class for the property selection

Open hurricane642 opened this issue 3 years ago • 7 comments

Hello, everyone! This PR is the first step in adding the 'Transformer' class to rascaline. The 'Transformer' class makes it easy to create a representation matrix when using some other matrix as a reference. A classic use case is to create a TensorMap representation for a dataset, then perform transformations within that TensorMap (e.g., keys_to_features or keys_to_properties), and select the most useful features in the transformed TensorMap. The 'Transformer' allows a set of these features to be used to calculate a new TensorMap, thus saving computation time and maintaining a single I tried to explain the logic of the algorithm in the comments to the code, but if you have any questions - ask, I'll be happy to answer, as well as happy to receive any comments and suggestions.

IMPORTANT: We need to add to the code the ability to perform several transformations one by one. This will be the next step. Note that this branch is based on the selected_keys branch, which itself is still a PR. representation for all representations.

hurricane642 avatar Nov 24 '22 11:11 hurricane642

Here is a pre-built version of the code in this pull request: wheels.zip, you can install it locally by unzipping wheels.zip and using pip to install the file matching your system

github-actions[bot] avatar Nov 24 '22 11:11 github-actions[bot]

The documentation for this PR is (or will soon be) available on readthedocs: https://rascaline--137.org.readthedocs.build/en/137/

github-actions[bot] avatar Nov 24 '22 11:11 github-actions[bot]

I noticed the branch before seeing the PR and was wondering what this was. I think that transformer has too much baggage in ML to use it as a class that enacts a generic transformation of a tensor. We need a better name.

ceriottm avatar Dec 06 '22 14:12 ceriottm

I noticed the branch before seeing the PR and was wondering what this was. I think that transformer has too much baggage in ML to use it as a class that enacts a generic transformation of a tensor. We need a better name.

Yes, of course the name will be changed, it's a draft version of the name

hurricane642 avatar Dec 06 '22 19:12 hurricane642

You can also add skcosmo to the tox environments to be able to run the tests.

Luthaf avatar Dec 08 '22 11:12 Luthaf

You can also add skcosmo to the tox environments to be able to run the tests.

Ok, I added it, but it's still having trouble passing the tests. Is there anything else I should add?

hurricane642 avatar Dec 19 '22 19:12 hurricane642

You can skip tests if skcosmo is not installed (as long as it is still tested in the all-deps environement), see for example https://github.com/Luthaf/rascaline/blob/3a28ad0831f6f0bf6cb81defd26a47a0e25f228a/python/tests/systems/chemfiles.py#L17

Luthaf avatar Dec 20 '22 14:12 Luthaf