HPAT.jl

Cannot process file larger than memory size

Open rafaelcarv opened this issue 9 years ago • 1 comment

Hello,

I created a 119 GB CSV file and a 50 GB HDF5 file (6.6 billion instances); they were generated with the generate_1d_array.jl program from the HPAT generate folder.

First I tried with the following settings: 1 Google Compute Engine VM with Ubuntu 16.04, 52 GB memory, 2 TB storage, 8 vCPUs.

The error occurred, but not with a smaller file (2 billion instances, created with the same program).

I thought it was a configuration error, or maybe a conflict with the installed MPI (the instance had 2 different versions of MPI installed).

So I created another VM: 1 Google Compute Engine VM with Ubuntu 14.04, 52 GB memory, 1 TB storage, 8 vCPUs.

The same error occurred with the same parameters.

I am executing with the following line: mpirun -np 8 julia .julia/v0.5/HPAT/examples/1D_sum.jl --file=1D_large.hdf5

error.txt

rafaelcarv avatar Oct 27 '16 01:10 rafaelcarv

HPAT doesn't support out-of-core computation currently, so the data has to fit in the cluster's memory. I recommend using more nodes.
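As a rough illustration (a back-of-the-envelope sketch, not HPAT code; it assumes Float64 elements, the 6.6 billion instances and 52 GB-per-node figures reported above, and an assumed 2x headroom factor for working copies), you can estimate how many nodes are needed for the data to fit in aggregate memory:

```julia
# Back-of-the-envelope sizing sketch (assumptions: Float64 elements,
# 6.6 billion instances, 52 GB of memory per node, ~2x working-space headroom).
n_elements   = 6_600_000_000
bytes_raw    = n_elements * sizeof(Float64)   # ≈ 49 GiB of raw data
headroom     = 2.0                            # assumed factor for copies/temporaries
mem_per_node = 52 * 2^30

nodes_needed = ceil(Int, bytes_raw * headroom / mem_per_node)
println("raw data ≈ ", round(bytes_raw / 2^30, digits=1), " GiB, ",
        "≈ ", nodes_needed, " node(s) needed with ", headroom, "x headroom")
```

With numbers like these, a single 52 GB node has essentially no room left over once the data is loaded, which matches the failure on one VM.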

Also, if the installed MPI I/O is compiled with 32-bit integers, the number of elements per core in each I/O operation cannot exceed about 2.1 billion. Using more nodes solves this issue too.
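A quick way to check this limit (a sketch; it assumes the elements are split evenly across ranks and read in a single collective call, using the counts from this issue):

```julia
# Sketch: check whether one collective read per rank would exceed a 32-bit MPI count.
# Assumptions: even split across ranks, one read call per rank.
n_elements = 6_600_000_000
n_ranks    = 8

per_rank = cld(n_elements, n_ranks)   # elements each rank reads
if per_rank > typemax(Int32)          # 2_147_483_647, the 32-bit count limit
    println("per-rank count ", per_rank, " overflows a 32-bit MPI count; use more ranks")
else
    println("per-rank count ", per_rank, " fits in a 32-bit MPI count")
end
```

Increasing the number of ranks shrinks the per-rank count, which is why adding nodes also avoids the 32-bit count problem.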

ehsantn avatar Oct 27 '16 02:10 ehsantn