ompi

Out-of-memory error with mpirun as grid points increase

Open atanuchaudhury opened this issue 1 year ago • 0 comments

As I increase the number of grid points in the model, the run fails with an out-of-memory error. Is there a solution? The full output is given below:

[MpiManager] Sucessfully initialized, numThreads=1
[ThreadPool] Sucessfully initialized, numThreads=1
[Directories] Directory ./tmp/ created.
[Directories] Directory ./tmp/imageData/ created.
[Directories] Directory ./tmp/imageData/data/ created.
[Directories] Directory ./tmp/vtkData/ created.
[Directories] Directory ./tmp/vtkData/data/ created.
[Directories] Directory ./tmp/gnuplotData/ created.
[Directories] Directory ./tmp/gnuplotData/data/ created.
[UnitConverter] ----------------- UnitConverter information -----------------
[UnitConverter] -- Parameters:
[UnitConverter] Resolution: N= 30
[UnitConverter] Lattice velocity: latticeU= 0.09
[UnitConverter] Lattice relaxation frequency: omega= 1.88679
[UnitConverter] Lattice relaxation time: tau= 0.53
[UnitConverter] Characteristical length(m): charL= 0.1
[UnitConverter] Characteristical speed(m/s): charU= 0.2
[UnitConverter] Phys. kinematic viscosity(m^2/s): charNu= 7.40741e-05
[UnitConverter] Phys. density(kg/m^d): charRho= 1
[UnitConverter] Characteristical pressure(N/m^2): charPressure= 0
[UnitConverter] Mach number: machNumber= 0.155885
[UnitConverter] Reynolds number: reynoldsNumber= 270
[UnitConverter] Knudsen number: knudsenNumber= 0.00057735
[UnitConverter]
[UnitConverter] -- Conversion factors:
[UnitConverter] Voxel length(m): physDeltaX= 0.00333333
[UnitConverter] Time step(s): physDeltaT= 0.0015
[UnitConverter] Velocity factor(m/s): physVelocity= 2.22222
[UnitConverter] Density factor(kg/m^3): physDensity= 1
[UnitConverter] Mass factor(kg): physMass= 3.7037e-08
[UnitConverter] Viscosity factor(m^2/s): physViscosity= 0.00740741
[UnitConverter] Force factor(N): physForce= 5.48697e-05
[UnitConverter] Pressure factor(N/m^2): physPressure= 4.93827
[UnitConverter] -------------------------------------------------------------
[prepareGeometry] Prepare Geometry ...
[SuperGeometry3D] cleaned 0 outer boundary voxel(s)
[SuperGeometry3D] cleaned 0 outer boundary voxel(s)
[SuperGeometryStatistics3D] updated
[SuperGeometry3D] the model is correct!
[CuboidGeometry3D] ---Cuboid Stucture Statistics---
[CuboidGeometry3D] Number of Cuboids: 1
[CuboidGeometry3D] Delta (min): 0.00333333
[CuboidGeometry3D] (max): 0.00333333
[CuboidGeometry3D] Ratio (min): 0.213049
[CuboidGeometry3D] (max): 3.75625
[CuboidGeometry3D] Nodes (min): 577729280
[CuboidGeometry3D] (max): 577729280
[CuboidGeometry3D] Weight (min): 577729280
[CuboidGeometry3D] (max): 577729280
[CuboidGeometry3D] --------------------------------
[SuperGeometryStatistics3D] materialNumber=0; count=189120; minPhysR=(1.95417,1.95417,-0.0025); maxPhysR=(2.04417,2.04417,1.06083)
[SuperGeometryStatistics3D] materialNumber=1; count=572370220; minPhysR=(-0.0025,0.000833333,0.000833333); maxPhysR=(4.9975,3.9975,1.06083)
[SuperGeometryStatistics3D] materialNumber=2; count=4552588; minPhysR=(-0.0025,-0.0025,-0.0025); maxPhysR=(5.00083,4.00083,1.06083)
[SuperGeometryStatistics3D] materialNumber=3; count=198864; minPhysR=(-0.0025,0.000833333,0.000833333); maxPhysR=(-0.0025,3.9975,1.0575)
[SuperGeometryStatistics3D] materialNumber=4; count=381600; minPhysR=(5.00083,0.000833333,0.000833333); maxPhysR=(5.00083,3.9975,1.0575)
[SuperGeometryStatistics3D] materialNumber=5; count=36888; minPhysR=(1.95083,1.95083,0.000833333); maxPhysR=(2.0475,2.0475,1.0575)
[SuperGeometryStatistics3D] countTotal[1e6]=577.729
[prepareGeometry] Prepare Geometry ... OK
terminate called after throwing an instance of 'std::runtime_error'
what(): out of memory
[acmt-gpu:209572] *** Process received signal ***
[acmt-gpu:209572] Signal: Aborted (6)
[acmt-gpu:209572] Signal code: (-6)
[acmt-gpu:209572] [ 0] /lib/x86_64-linux-gnu/libc.so.6(+0x42520)[0x7f688f419520]
[acmt-gpu:209572] [ 1] /lib/x86_64-linux-gnu/libc.so.6(pthread_kill+0x12c)[0x7f688f46d9fc]
[acmt-gpu:209572] [ 2] /lib/x86_64-linux-gnu/libc.so.6(raise+0x16)[0x7f688f419476]
[acmt-gpu:209572] [ 3] /lib/x86_64-linux-gnu/libc.so.6(abort+0xd3)[0x7f688f3ff7f3]
[acmt-gpu:209572] [ 4] /lib/x86_64-linux-gnu/libstdc++.so.6(+0xa2b9e)[0x7f6890e76b9e]
[acmt-gpu:209572] [ 5] /lib/x86_64-linux-gnu/libstdc++.so.6(+0xae20c)[0x7f6890e8220c]
[acmt-gpu:209572] [ 6] /lib/x86_64-linux-gnu/libstdc++.so.6(+0xae277)[0x7f6890e82277]
[acmt-gpu:209572] [ 7] /lib/x86_64-linux-gnu/libstdc++.so.6(+0xae4d8)[0x7f6890e824d8]
[acmt-gpu:209572] [ 8] /home/achaudhury/olb-1.7r1/olb-1.7r0/examples/laminar/cylinder3d/libolbcuda.so(+0xdb53c)[0x7f689402353c]
[acmt-gpu:209572] [ 9] /home/achaudhury/olb-1.7r1/olb-1.7r0/examples/laminar/cylinder3d/libolbcuda.so(_ZN3olb3gpu4cuda12CyclicColumnIdEC1Em+0x12c)[0x7f689409191c]
[acmt-gpu:209572] [10] ./cylinder3d[0x5faa5c]
[acmt-gpu:209572] [11] ./cylinder3d[0x5168c2]
[acmt-gpu:209572] [12] ./cylinder3d[0x4fe5f9]
[acmt-gpu:209572] [13] ./cylinder3d[0x44b075]
[acmt-gpu:209572] [14] ./cylinder3d[0x50d7f2]
[acmt-gpu:209572] [15] ./cylinder3d[0x4a8e6a]
[acmt-gpu:209572] [16] ./cylinder3d[0x44a405]
[acmt-gpu:209572] [17] /lib/x86_64-linux-gnu/libc.so.6(+0x29d90)[0x7f688f400d90]
[acmt-gpu:209572] [18] /lib/x86_64-linux-gnu/libc.so.6(__libc_start_main+0x80)[0x7f688f400e40]
[acmt-gpu:209572] [19] ./cylinder3d[0x444b25]
[acmt-gpu:209572] *** End of error message ***

Primary job terminated normally, but 1 process returned a non-zero exit code. Per user-direction, the job has been aborted.


mpirun noticed that process rank 0 with PID 209572 on node acmt-gpu exited on signal 6 (Aborted).
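
For scale, here is a rough back-of-the-envelope estimate of the memory the lattice populations alone might need for the node count reported above (577729280 nodes in a single cuboid). It is only a sketch under stated assumptions: a D3Q19 velocity set (not shown in the log) and 8-byte double-precision values (frame [9] of the backtrace demangles to a `CyclicColumn<double>` constructor in the CUDA backend). The function name `population_bytes` is made up for this illustration and is not part of OpenLB.

```python
# Rough lower bound on the memory needed for lattice populations only.
# Assumptions (not confirmed by the log): D3Q19 descriptor, i.e. q = 19
# populations per node; double precision, i.e. 8 bytes per value.
# Other per-node fields and buffers are ignored, so real usage is higher.

def population_bytes(nodes: int, q: int = 19, bytes_per_value: int = 8) -> int:
    """Bytes needed to store q population values for every lattice node."""
    return nodes * q * bytes_per_value


if __name__ == "__main__":
    nodes = 577_729_280  # "Nodes (min)/(max)" from the CuboidGeometry3D statistics
    total = population_bytes(nodes)
    print(f"{total} bytes, about {total / 2**30:.1f} GiB")
```

Under these assumptions the populations alone come to about 87.8 GB. Since the log reports `Number of Cuboids: 1` and mpirun mentions only rank 0, the whole lattice appears to be allocated by one process, which would be consistent with the allocation failing on a single device. Lowering the resolution or splitting the domain across more ranks and devices would reduce the per-device requirement proportionally.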

atanuchaudhury, Oct 04 '24 12:10