
test_normalized_conditional_states in test_fock_measurement is exceptionally slow with tensorflow backend

Open co9olguy opened this issue 6 years ago • 8 comments

There is a test in tests/backend/test_fock_measurement.py which takes a long time to run, and crashes my computer when run locally. We should pinpoint which test this is and mark it for refactoring.

co9olguy avatar Aug 20 '19 22:08 co9olguy

@co9olguy Looks like the issue is related to RAM usage, as my computer starts to swap during those tests... still looking into this, but it appears that if you reduce the NUM_REPEATS parameter in the test significantly (it is currently 50), the tests run and pass without RAM issues. I believe the problem is with this test:

14.25s call     backend/test_fock_measurement.py::TestFockRepresentation::test_normalized_conditional_states[TFBackend-True]
12.96s call     backend/test_fock_measurement.py::TestFockRepresentation::test_normalized_conditional_states[TFBackend-False]

The RAM usage, however, is probably due to a bug in prepare_fock_state or measure_fock, which we should address as well (but that needs more profiling).
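
As a rough first pass at that profiling, something like the following could show whether process memory grows per repeat (a sketch only: it assumes psutil is installed, and the specific prepare/measure calls and cutoff values are placeholders borrowed from the fixture usage in this thread, not the actual test body):

import psutil

from strawberryfields.backends.tfbackend import TFBackend

proc = psutil.Process()
backend = TFBackend()
backend.begin_circuit(num_subsystems=2, cutoff_dim=6, pure=True)

for i in range(50):  # mirrors NUM_REPEATS in the test
    backend.reset()
    backend.prepare_fock_state(1, 0)
    backend.measure_fock([0])
    # RSS captures TensorFlow's native allocations as well as Python's, so
    # steady growth here would point at prepare_fock_state/measure_fock
    print("iteration {}: RSS = {:.1f} MB".format(i, proc.memory_info().rss / 1e6))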

zzzeid avatar Aug 21 '19 12:08 zzzeid

That's weird - I don't remember this happening in the past. Is this something we can trace to a specific PR or change in either the codebase or the tests?

josh146 avatar Aug 21 '19 14:08 josh146

but it appears that if you reduce the NUM_REPEATS parameter in the test significantly (currently it is 50) then the tests run and pass without RAM issues.

Oh, this could be an issue with TF not correctly clearing the graph between repeat runs? @co9olguy
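
If that's the cause, one quick thing to check (sketch only, assuming the TF 1.x API the backend used at the time) is whether explicitly clearing the default graph between repeats keeps memory flat:

import tensorflow as tf

NUM_REPEATS = 50  # value used in the test

for i in range(NUM_REPEATS):
    # build the circuit and run the measurement for this repeat, then
    # drop whatever ops have accumulated on the default graph
    tf.reset_default_graph()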

josh146 avatar Aug 21 '19 14:08 josh146

This is definitely not a new issue. I've observed it for months now, but only just got fed up enough to report it as an issue.

co9olguy avatar Aug 21 '19 16:08 co9olguy

After some testing, the issue with the slow test appears to be a memory leak specific to pytest while it executes test_normalized_conditional_states with the TensorFlow backend only. I.e., it is not related to the actual code or the actual test (executing the same commands outside pytest does not manifest the issue). It's possible that it's related to how the backend fixtures are set up, but it will take more digging to figure this out. I will have to look at it in more detail some other time.

Note, however, that the test is still pretty slow even aside from the memory leak; it just becomes extremely slow once it starts swapping.

zzzeid avatar Aug 21 '19 18:08 zzzeid

Thanks @zzzeid. It's enough to know which test was causing it and flag it for refactoring.

What likely happened is that someone designed a test to suit a particular backend, but forgot to account for the implementation/performance needs of the other backends.

co9olguy avatar Aug 21 '19 18:08 co9olguy

Memory leak specific to pytest while it executes test_normalized_conditional_states with the TensorFlow backend only.

I agree @zzzeid, my suspicion is that there is a backend initialization fixture which isn't being reset after evaluation, or is using the wrong scope.

It might be something that is easily fixed by modifying the fixture to include teardown, i.e.,

@pytest.fixture(params=backend_params)
def setup_backend(
    request, cutoff, pure, batch_size
):  # pylint: disable=redefined-outer-name
    """Parameterized fixture, used to automatically create a backend of certain number of modes.

    This fixture should only be used in backend tests, as it bypasses Engine and
    initializes the backend directly.

    Every test that uses this fixture, or a fixture that depends on it
    will be run three times, one for each backend.

    To explicitly mark a test to only work on certain backends, you may
    use the ``@pytest.mark.backends()`` fixture. For example, for a test that
    only works on the TF and Fock backends, use ``@pytest.mark.backends('tf', 'fock')``.
    """

    def _setup_backend(num_subsystems):
        """Factory function"""
        backend = request.param()
        backend.begin_circuit(
            num_subsystems=num_subsystems,
            cutoff_dim=cutoff,
            pure=pure,
            batch_size=batch_size,
        )
        yield backend
        backend.reset()

    return _setup_backend

(question: does yield work inside a fixture that returns a function? It might not.)
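
(For what it's worth: a yield inside the inner factory turns it into a generator, so the test would receive a generator object rather than a backend, and the teardown would never run. A variant that does work with pytest, sketched below using the usual "factories as fixtures" pattern (not necessarily the fix for the leak), records the created backends and resets them after the outer fixture's yield:)

@pytest.fixture(params=backend_params)
def setup_backend(
    request, cutoff, pure, batch_size
):  # pylint: disable=redefined-outer-name
    """Parameterized fixture returning a backend factory, with teardown."""
    created = []

    def _setup_backend(num_subsystems):
        """Factory function"""
        backend = request.param()
        backend.begin_circuit(
            num_subsystems=num_subsystems,
            cutoff_dim=cutoff,
            pure=pure,
            batch_size=batch_size,
        )
        created.append(backend)
        return backend

    # yield the factory itself; pytest resumes here after the test for teardown
    yield _setup_backend

    for backend in created:
        backend.reset()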

josh146 avatar Aug 22 '19 04:08 josh146

There might be a simpler solution: just call setup_backend within the loop (though I'd like to remove looping wherever possible).

co9olguy avatar Aug 22 '19 14:08 co9olguy