elastix icon indicating copy to clipboard operation
elastix copied to clipboard

ENH: Add ElastixFilter.UpdateInParallel GoogleTest unit test

Open N-Dekker opened this issue 4 years ago • 8 comments

Originally based on a code snippet by Konstantinos Ntatsis (@ntatsisk) at pull request #389 ("ENH: Make ElastixMain database creation + loading components thread safe"): https://github.com/SuperElastix/elastix/pull/389#issuecomment-759744106

This unit test is expected to pass successfully only when the elastix component database is indeed thread-safe.

N-Dekker avatar Jan 21 '21 09:01 N-Dekker

Hi Konstantinos (@ntatsisk), as you see, I did try to make some progress 😃 You may have a look! However, I still get those messages, when running the ElastixFilter.UpdateInParallel unit test, saying:

User Error 1001: omp_set_num_threads should only be called in serial regions

I don't really like them. The following example code also produces such messages:

#include <omp.h>
#include <iostream>
#include <string>

int main()
{
#pragma omp parallel for
	for (int i = 0; i < 8; ++i)
	{
		omp_set_num_threads(2);
		std::cout << (std::to_string(i) + '\n');
	}
}

So it really means that the spawned threads are trying to change the number of threads! I don't think that's OK! (Update: it appears to happen here: https://github.com/SuperElastix/elastix/blob/5.0.1/Common/CostFunctions/itkAdvancedImageToImageMetric.hxx#L96 )

Moreover, ElastixFilter.UpdateInParallel appears to produce some output that the non-parallel version ElastixFilter.UpdateSerially does not do!

Running main() from Modules\ThirdParty\GoogleTest\src\itkgoogletest\googletest\src\gtest_main.cc Note: Google Test filter = Update [==========] Running 2 tests from 1 test suite. [----------] Global test environment set-up. [----------] 2 tests from ElastixFilter [ RUN ] ElastixFilter.UpdateSerially [ OK ] ElastixFilter.UpdateSerially (1029 ms) [ RUN ] ElastixFilter.UpdateInParallel User Error 1001: omp_set_num_threads should only be called in serial regions User Error 1001: omp_set_num_threads should only be called in serial regions User Error 1001: omp_set_num_threads should only be called in serial regions -101.914380113.7312-101.2443100111.332-101.5814800139.653-101.244310-101.01088111.430300116.243-101.0641200123.265-1-1001.58148-101.010880-101.20941018.398291.06412018.295540133.9830117.1061-101.20941015.62525-102.0254700137.949-102.025470-101.904600128.061-102.5729100148.585-101.9046014.18544111.3771-102.57291014.52256-101.002580078.7878-101.00258013.05009-101.145330065.4204-101.195970068.465-101.14533015.59068-101.232670069.7134-101.19597012.40803-101.23267012.52056-101.056990072.8765-101.05699012.15936[ OK ] ElastixFilter.UpdateInParallel (762 ms) [----------] 2 tests from ElastixFilter (1791 ms total)

[----------] Global test environment tear-down [==========] 2 tests from 1 test suite ran. (1797 ms total) [ PASSED ] 2 tests.

D:\elastix\bin\Debug\ElastixLibGTest.exe (process 14476) exited with code 0.

Do you see the same? Do you have a clue where those numbers like -101.914380113.7312 could come from?

Note: In order to run this test, which is part of ElastixLibGTest, you may need to enable GoogleTest unit testing, by elastix CMake flag ELASTIX_USE_GTEST.

N-Dekker avatar Feb 05 '21 16:02 N-Dekker

For the record, the ElastixFilter.UpdateInParallel unit test just passed successfully on most platforms. It only failed on windows-2019 "(push)", by a SegFault:

2021-02-05T15:57:25.7554843Z 158: [ OK ] ElastixFilter.Translation (13 ms) 2021-02-05T15:57:25.7555614Z 158: [ RUN ] ElastixFilter.UpdateSerially 2021-02-05T15:57:25.8166473Z 158: [ OK ] ElastixFilter.UpdateSerially (79 ms) 2021-02-05T15:57:25.8167401Z 158: [ RUN ] ElastixFilter.UpdateInParallel 2021-02-05T15:57:25.8457472Z 158: 10.214815-102.09975002.33936-102.09975010.187635-103.87517001.12653-103.87517010.106812 2021-02-05T15:57:25.8481882Z 158/158 Test # 158: ElastixLibGTest_test ....................................................***Exception: SegFault 0.16 sec

N-Dekker avatar Feb 05 '21 16:02 N-Dekker

@ntatsisk It appears essential, when you have an OpenMP for-loop in your "client code", calling ElastixFilter.Update in parallel, you should have built Elastix with ELASTIX_USE_OPENMP = OFF! So no OpenMP inside Elastix, when you use it at the client side!

Does that solve your issue?

PS As you can see, I tried adding mutex locking to the xout classes, https://github.com/SuperElastix/elastix/pull/393/commits/0c84c5be8cf30adb971204ff8635f921260b3dbf but that does not seem necessary when you already do LogToConsoleOff(), LogToFileOff(), and (WriteFinalTransformParameters "false").

N-Dekker avatar Feb 05 '21 23:02 N-Dekker

Hi @N-Dekker, I did a quick test with the branch using ELASTIX_USE_OPENMP = OFF and it passes the gtests. However, when I run my code I get in the console a wall of numbers like the ones you shared and, at some point, the code crashes. I have set LogToConsoleOff(), LogToFileOff(), and (WriteFinalTransformParameters "false"). I am using windows btw. I will get back to you, when I start checking the source code.

ntatsisk avatar Feb 06 '21 13:02 ntatsisk

Update: when you do LogToConsoleOff(), LogToFileOff(), and (WriteFinalTransformParameters "false"), my suggested code change to allow sharing xoutManager between threads (this pull request, commit https://github.com/SuperElastix/elastix/pull/393/commits/38c0ab074889ae4f4b548bcd4d8f1228bded4c94) does not seem necessary either. (The aim of that commit is to avoid multiple parallel xoutSetup calls and xout cleanups. But if you don't have any output, that commit seems unnecessary.)

N-Dekker avatar Feb 06 '21 14:02 N-Dekker

I made a pull request that seems to work: https://github.com/SuperElastix/elastix/pull/406. I tested it on my application and it runs with no issues but I haven't tried the gtests.

Bonus: it even works with LogToConsoleOn() (although output is mixed in the console), LogToFileOn() (if you supply variable filename all logs are intact) and without the need of specifying (WriteFinalTransformParameters "false"). Let me know what you think, @N-Dekker !

ntatsisk avatar Feb 07 '21 22:02 ntatsisk

I think maybe the UpdateInParallel unit test should be skipped when ELASTIX_USE_OPENMP = ON, in order to avoid that error which said:

User Error 1001: omp_set_num_threads should only be called in serial regions

N-Dekker avatar Feb 18 '21 16:02 N-Dekker

Hi @N-Dekker, that could be an option. However, I also see that currently, all the #pragma omps of the main code (11 in total) are deactivated by forcing either ITK multithreading or single-threading. There are only 3 active but they belong to tests. Some examples in the main code:

https://github.com/SuperElastix/elastix/blob/b65034e96d01eb93b82290058f9d6b44dbe5d9ef/Components/Optimizers/StandardStochasticGradientDescent/itkStochasticGradientDescentOptimizer.cxx#L215

https://github.com/SuperElastix/elastix/blob/b65034e96d01eb93b82290058f9d6b44dbe5d9ef/Components/Metrics/AdvancedMattesMutualInformation/itkParzenWindowMutualInformationImageToImageMetric.hxx#L588

Is there a plan to re-introduce OpenMP or is it going to be deprecated/removed completely soon?

ntatsisk avatar Feb 19 '21 10:02 ntatsisk