Crash in pfft_plan_dft
The attached program for a 2D c2c transform with a 1D parallel decomposition crashes. I have encountered this on several computers and with slightly different versions of PFFT; the last test was done with 1.0.6-alpha. In this form it crashes with PFFT_PRESERVE_INPUT. When I use an in-place transform, it also crashes with PFFT_DESTROY_INPUT (see the sketch after the program listing below).
> mpicc -Wall -std=c99 pfft_crash.c -lpfft -lfftw3 -lfftw3_mpi -g
> mpirun -n 12 valgrind ./a.out
==13960== Invalid read of size 8
==13960== at 0x5092B69: fftw_plan_destroy_internal (in /usr/lib64/libfftw3.so.3.4.4)
==13960== by 0x5472453: ??? (in /usr/lib64/libfftw3_mpi.so.3.4.4)
==13960== by 0x5094172: ??? (in /usr/lib64/libfftw3.so.3.4.4)
==13960== by 0x5162969: ??? (in /usr/lib64/libfftw3.so.3.4.4)
==13960== by 0x5162B63: fftw_mkapiplan (in /usr/lib64/libfftw3.so.3.4.4)
==13960== by 0x546ED5B: fftw_mpi_plan_many_transpose (in /usr/lib64/libfftw3_mpi.so.3.4.4)
==13960== by 0x4E4DF2F: pfft_plan_global_transp (in /usr/local/lib64/libpfft.so.0.0.0)
==13960== by 0x4E41B03: pfft_plan_partrafo_transposed (in /usr/local/lib64/libpfft.so.0.0.0)
==13960== by 0x4E47536: pfft_plan_partrafo (in /usr/local/lib64/libpfft.so.0.0.0)
==13960== by 0x4E5472F: pfft_plan_many_dft (in /usr/local/lib64/libpfft.so.0.0.0)
==13960== by 0x4E540AB: pfft_plan_dft (in /usr/local/lib64/libpfft.so.0.0.0)
==13960== by 0x400E66: main (pfft_crash.c:44)
==13960== Address 0xb98c240 is 0 bytes inside a block of size 168 free'd
==13960== at 0x4C2A37C: free (in /usr/lib64/valgrind/vgpreload_memcheck-amd64-linux.so)
==13960== by 0x5471EFF: fftw_mpi_mkplans_posttranspose (in /usr/lib64/libfftw3_mpi.so.3.4.4)
==13960== by 0x54720C1: ??? (in /usr/lib64/libfftw3_mpi.so.3.4.4)
==13960== by 0x5094172: ??? (in /usr/lib64/libfftw3.so.3.4.4)
==13960== by 0x5162969: ??? (in /usr/lib64/libfftw3.so.3.4.4)
==13960== by 0x5162B63: fftw_mkapiplan (in /usr/lib64/libfftw3.so.3.4.4)
program:
#include <complex.h>
#include <pfft.h>

int main(int argc, char **argv)
{
  int np[1];
  ptrdiff_t n[2];
  ptrdiff_t alloc_local;
  ptrdiff_t local_ni[2], local_i_start[2];
  ptrdiff_t local_no[2], local_o_start[2];
  double err;
  pfft_complex *in, *out;
  pfft_plan plan_forw=NULL, plan_back=NULL;
  MPI_Comm comm_cart_1d;

  /* Set size of FFT and process mesh */
  n[0] = 200; n[1] = 200;
  np[0] = 12;

  /* Initialize MPI and PFFT */
  MPI_Init(&argc, &argv);
  pfft_init();

  /* Create one-dimensional process grid of size np[0], if possible */
  if( pfft_create_procmesh(1, MPI_COMM_WORLD, np, &comm_cart_1d) ){
    pfft_fprintf(MPI_COMM_WORLD, stderr, "Error: This test file only works with %d processes.\n", np[0]);
    MPI_Finalize();
    return 1;
  }

  /* Get parameters of data distribution */
  alloc_local = pfft_local_size_dft(2, n, comm_cart_1d, PFFT_TRANSPOSED_NONE,
      local_ni, local_i_start, local_no, local_o_start);

  /* Allocate memory */
  in  = pfft_alloc_complex(alloc_local);
  out = pfft_alloc_complex(alloc_local);

  /* Plan parallel forward FFT */
  plan_forw = pfft_plan_dft(
      2, n, in, out, comm_cart_1d, PFFT_FORWARD,
      PFFT_TRANSPOSED_NONE | PFFT_ESTIMATE | PFFT_PRESERVE_INPUT);

  /* Plan parallel backward FFT */
  plan_back = pfft_plan_dft(
      2, n, out, in, comm_cart_1d, PFFT_BACKWARD,
      PFFT_TRANSPOSED_NONE | PFFT_ESTIMATE | PFFT_PRESERVE_INPUT);

  /* Initialize input with random numbers */
  pfft_init_input_complex(2, n, local_ni, local_i_start, in);

  /* Execute parallel forward FFT */
  pfft_execute(plan_forw);

  /* Clear the old input */
  pfft_clear_input_complex(2, n, local_ni, local_i_start, in);

  /* Execute parallel backward FFT */
  pfft_execute(plan_back);

  /* Scale data */
  for(ptrdiff_t l=0; l < local_ni[0] * local_ni[1]; l++)
    in[l] /= (n[0]*n[1]);

  /* Print error of back transformed data */
  err = pfft_check_output_complex(2, n, local_ni, local_i_start, in, comm_cart_1d);
  pfft_printf(comm_cart_1d, "Error after one forward and backward trafo of size n=(%td, %td):\n", n[0], n[1]);
  pfft_printf(comm_cart_1d, "maxerror = %6.2e;\n", err);

  /* Free memory and finalize */
  pfft_destroy_plan(plan_forw);
  pfft_destroy_plan(plan_back);
  MPI_Comm_free(&comm_cart_1d);
  pfft_free(in); pfft_free(out);
  MPI_Finalize();
  return 0;
}
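For reference, the in-place variant mentioned above looks roughly like this (a sketch reusing the variables from the program; the exact flag combination I tried may have differed slightly):

  /* In-place sketch: the same buffer serves as input and output, so
     PFFT_PRESERVE_INPUT is not meaningful here; this also crashed
     with PFFT_DESTROY_INPUT. */
  plan_forw = pfft_plan_dft(
      2, n, in, in, comm_cart_1d, PFFT_FORWARD,
      PFFT_TRANSPOSED_NONE | PFFT_ESTIMATE | PFFT_DESTROY_INPUT);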
Just to add that an equivalent program in pure FFTW3 runs properly. I am not sure whether this is a regression, but I think my program used to work with this transform before.
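For comparison, a minimal sketch of such a pure FFTW3-MPI program (not the exact code that was tested; it assumes the same 200 x 200 in-place c2c transform on FFTW's slab decomposition):

#include <complex.h>
#include <fftw3-mpi.h>

int main(int argc, char **argv)
{
  const ptrdiff_t n0 = 200, n1 = 200;
  ptrdiff_t alloc_local, local_n0, local_0_start;

  MPI_Init(&argc, &argv);
  fftw_mpi_init();

  /* FFTW-MPI distributes the first dimension as contiguous slabs */
  alloc_local = fftw_mpi_local_size_2d(n0, n1, MPI_COMM_WORLD,
      &local_n0, &local_0_start);
  fftw_complex *data = fftw_alloc_complex(alloc_local);

  fftw_plan plan_forw = fftw_mpi_plan_dft_2d(n0, n1, data, data,
      MPI_COMM_WORLD, FFTW_FORWARD, FFTW_ESTIMATE);
  fftw_plan plan_back = fftw_mpi_plan_dft_2d(n0, n1, data, data,
      MPI_COMM_WORLD, FFTW_BACKWARD, FFTW_ESTIMATE);

  /* Fill the local slab with arbitrary test values */
  for (ptrdiff_t l = 0; l < local_n0 * n1; l++)
    data[l] = (double) l;

  fftw_execute(plan_forw);
  fftw_execute(plan_back);

  fftw_destroy_plan(plan_forw);
  fftw_destroy_plan(plan_back);
  fftw_free(data);
  MPI_Finalize();
  return 0;
}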
Thanks for the report. I checked it with the current master branch and it did not crash. Did you install FFTW-3.3.4 with my install scripts (including my performance patches), or did you use the plain FFTW-3.3.4? It would be great if you could test it with the current master or at least version 1.0.8. Does it also crash with fewer processes?
I used stock FFTW 3.3.4. One of the computers has pfft commit cd4619163026e8d14b8b1e097aa214ae9311f52c. It doesn't crash with fewer processes.
Now I got the same segfault in FFTW's internal plan free with the non-patched version. It looks like a bug in FFTW-MPI. I have to dig deeper into the code to locate the problem.
This segfault was caused by a double free within FFTW-MPI. I fixed it in commit 9831bbd of the official FFTW repository on GitHub. The bugfix is already included in FFTW-3.3.5. Please verify whether the problem is solved for you with this version of FFTW.
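One quick way to confirm which FFTW build is actually linked at runtime is the fftw_version string constant exported by the library; a minimal sketch:

#include <stdio.h>
#include <fftw3.h>

int main(void)
{
  /* fftw_version is a string constant exported by FFTW,
     e.g. "fftw-3.3.5"; it reports the version actually linked. */
  printf("%s\n", fftw_version);
  return 0;
}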
This is the link to the bugfix.