feature_engine

reduce running time of tests for feature selection module

Open solegalli opened this issue 2 years ago • 6 comments

solegalli avatar Sep 21 '22 16:09 solegalli

@solegalli I guess this relates to #592

If we are able to profile any class or class method, then I think it's trivial to profile the tests as well. Once the tests are profiled, we can go and optimize the slow ones.

What do you think? Should I pick this one as well?

Okroshiashvili avatar Feb 22 '23 19:02 Okroshiashvili

Hey @Okroshiashvili

It is somewhat related. If we make classes more efficient, then the tests will run faster. But I think here we could already gain a lot by decreasing the size of the datasets that we use in the tests.

Recursive feature elimination / addition are per se quite time consuming. Increasing the speed of the tests would help us make our dev work more efficient.

Having said this, you are more than welcome to take the 2 issues together!

solegalli avatar Feb 24 '23 09:02 solegalli

Okay, sounds good to me. I will handle this issue alongside the other one ☺️

Okroshiashvili avatar Feb 24 '23 10:02 Okroshiashvili

Hi @solegalli

So, as I mentioned in #592 we can use Pyinstrument to profile tests as well.

I've created a small function to profile the tests. Here it is:

from pathlib import Path

import pytest
from pyinstrument.profiler import Profiler


TESTS_ROOT = Path.cwd()


@pytest.fixture(autouse=True)
def auto_profile(request):
    PROFILE_ROOT = TESTS_ROOT / "profiles/test_profiles"
    profiler = Profiler()
    profiler.start()

    # Run the test
    yield

    profiler.stop()
    # parents=True so the intermediate "profiles" directory is created if it is missing
    PROFILE_ROOT.mkdir(parents=True, exist_ok=True)
    node_id = request.node.nodeid.replace("tests/", "").strip().split("/")
    if len(node_id) == 1:
        results_file = PROFILE_ROOT / f"{node_id[0].split('::')[-1]}.html"
    else:
        tp = "/".join(node_id[:-1])
        (PROFILE_ROOT / tp).mkdir(parents=True, exist_ok=True)
        results_file = PROFILE_ROOT / tp / f"{node_id[-1].split('::')[-1]}.html"
    with open(results_file, "w", encoding="utf-8") as f_html:
        f_html.write(profiler.output_html())

Put this function inside the conftest.py file in the tests directory and run the tests as usual. It will profile all the tests and create a directory of profile (HTML) files, one per test, laid out in the same directory hierarchy as the tests.
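One detail of the fixture above: the output file is named after the test function only, so two tests with the same name in different modules of one directory would write to the same HTML file. Below is a minimal sketch of an alternative path mapping that keeps the module name as a directory level. The helper name `profile_path` is hypothetical and not part of the original snippet, and the `tests/` prefix handling simply mirrors what the fixture already assumes.

```python
from pathlib import Path


def profile_path(root: Path, nodeid: str) -> Path:
    """Map a pytest node id to an HTML file path under ``root``.

    Hypothetical helper (not from the original snippet). Example node id:
    "tests/selection/test_rfe.py::TestRFE::test_fit[param]"
    """
    # Separate the module path from the class/function part of the node id.
    module, _, test_part = nodeid.partition("::")

    # Drop the leading "tests/" the same way the original fixture does.
    if module.startswith("tests/"):
        module = module[len("tests/"):]

    # Use the module name (without ".py") as a directory, so same-named
    # tests in different modules do not overwrite each other.
    module_dir = Path(module).with_suffix("")

    # Make the test part safe as a single file name: flatten class
    # separators and any "/" coming from parametrized ids.
    safe_name = test_part.replace("::", ".").replace("/", "_")

    return root / module_dir / f"{safe_name}.html"
```

Inside the fixture, `results_file = profile_path(PROFILE_ROOT, request.node.nodeid)` followed by `results_file.parent.mkdir(parents=True, exist_ok=True)` would replace the `if`/`else` branch.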

I don't recommend using this in any CI/CD flow at all, since it produces lots of HTML files. It is for internal use only.
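One way to keep the fixture out of CI runs is to make it opt-in. The sketch below is only an illustration under an assumption: the environment variable name `PROFILE_TESTS` and the helper `profiling_enabled` are made up here and are not part of feature_engine or Pyinstrument.

```python
import os


def profiling_enabled(environ=None) -> bool:
    """Return True only when profiling was explicitly requested.

    PROFILE_TESTS is a hypothetical variable name chosen for this sketch.
    ``environ`` can be passed explicitly to make the check easy to test.
    """
    env = os.environ if environ is None else environ
    # Treat a small set of common "truthy" strings as opt-in; anything
    # else, including an unset variable, leaves profiling off.
    return env.get("PROFILE_TESTS", "").strip().lower() in {"1", "true", "yes"}
```

With this helper, `auto_profile` could start with `if not profiling_enabled(): yield; return`, so a plain `pytest` run (for example in CI) writes no HTML files, while `PROFILE_TESTS=1 pytest` turns profiling on locally.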

With this in place, I will identify the slow tests and act accordingly: either reduce the mock data size or optimize the class behind that particular slow test.

Okroshiashvili avatar Feb 28 '23 08:02 Okroshiashvili

Sounds good! Thank you so much @Okroshiashvili

solegalli avatar Mar 01 '23 12:03 solegalli

You're welcome @solegalli

So, I'll push this issue forward, investigate the slow tests, and update you asap.

Okroshiashvili avatar Mar 01 '23 12:03 Okroshiashvili