ndarray-npy
Cannot save large files.
Thank you for providing us with a great crate. I searched around for a way to save a file in .npy format and found ndarray-npy. When I tried to save a file of about 70 GB, I got this error.
error

```
memory allocation of 73603432908 bytes failed
/var/spool/uge/at163/job_scripts/12220153: line 8: 46483 Aborted
```
code

```rust
use ndarray::Array3;
use ndarray_npy::write_npy;

fn main() {
    let a: Array3<f32> = Array3::zeros((~~~)); // about 70 GB
    // do something
    write_npy(
        &(dir.to_string() + &fname + "_input_C.npy"),
        &features.input_c,
    )
    .unwrap(); // error
}
```
Based on the line where the error occurs, I suspect that additional memory is being allocated when saving the file. Are there any more memory-efficient functions than the one I used?
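(For reference: ndarray-npy also exposes a writer-based API via the `WriteNpyExt` trait, which takes an explicit writer instead of a path. It writes the same bytes, so it is not expected to change the memory profile, but it shows the alternative entry point. A minimal sketch, where the small shape and output path are hypothetical stand-ins; check the docs for your version:)

```rust
use std::fs::File;
use std::io::BufWriter;

use ndarray::Array3;
use ndarray_npy::WriteNpyExt;

fn main() -> Result<(), Box<dyn std::error::Error>> {
    // A hypothetical small array standing in for the real ~70 GB one.
    let a: Array3<f32> = Array3::zeros((2, 3, 4));
    // Pass an explicit writer instead of a path.
    let file = File::create("out.npy")?;
    a.write_npy(BufWriter::new(file))?;
    Ok(())
}
```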
This is a fun problem. My machine can't even create an array that big in memory. I'm surprised that you're seeing this, though. `write_npy` shouldn't be allocating much memory, and especially not ~70 GB as indicated by the error message. More information would be helpful to diagnose the issue:
- Does the array have standard layout, Fortran layout, or another layout? (In other words, what are the results of `features.input_c.is_standard_layout()` and `features.input_c.view().reversed_axes().is_standard_layout()`?)

- What specific versions of `ndarray`, `ndarray-npy`, and Rust are you using? (You can determine this by searching for `name = "ndarray"` and `name = "ndarray-npy"` in your `Cargo.lock`, and calling `rustc --version`.)

- What happens when you run this program, which just allocates ~70 GB and writes it to a file?

  ```rust
  use std::fs::File;
  use std::io::Write;

  fn main() -> Result<(), Box<dyn std::error::Error>> {
      let mut file = File::create("test")?;
      // Allocate 73.6 GB of data.
      let num_bytes = 73603432908;
      let mut data = vec![0u8; num_bytes];
      // Let's make at least a couple of the elements nonzero.
      data[3] = 42;
      data[num_bytes - 10] = 5;
      file.write_all(&data)?;
      println!("success");
      Ok(())
  }
  ```

- Are you sure that the allocation failure is occurring in the `write_npy` call, and not somewhere else (e.g. when first allocating the array or when performing an arithmetic operation on it which allocates another array)?
I apologize for the delay in replying. The supercomputer I am currently using is undergoing maintenance, so it is difficult to answer the above questions. I will try to answer the first two.

- `features.input_c.is_standard_layout()` -> `true`
- Versions: `ndarray` -> 0.14, `ndarray-npy` -> 0.7.1
Okay, for `Array3<f32>` in standard layout, the relevant portions of the code are:

- the `write_npy` function
- the `is_standard_layout` portion of the `write_npy` method implementation for `ArrayBase`
- the `write_slice` method implementation for `f32`
Basically, this consists of checking the layout of the array (which for `Array3` should perform no allocations), writing the `.npy` header (which performs a few small allocations), getting the array data as a `&[f32]` slice via `as_slice_memory_order`, and then casting the contiguous slice of data from `&[f32]` to `&[u8]` and calling `write_all` on the writer (i.e. the `File` in this case). The only place where I could potentially see a 70 GB allocation occurring is if `std::fs::File`'s implementation of `write_all` makes a copy of the 70 GB slice of data in memory for some reason, but that seems unlikely, and I'd consider it a bug in `std::fs::File` rather than `ndarray-npy`.
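To make the key step concrete, here is a minimal sketch of the kind of zero-copy cast described above (not the actual ndarray-npy source, and it assumes the bytes should be written in the platform's native endianness):

```rust
use std::fs::File;
use std::io::Write;

/// A sketch of the zero-copy write step: reinterpret the contiguous
/// `&[f32]` data as `&[u8]` without allocating a second buffer.
fn write_f32_slice(file: &mut File, data: &[f32]) -> std::io::Result<()> {
    // The byte slice borrows the same memory as `data`, so no copy of
    // the ~70 GB of data is made here.
    let bytes: &[u8] = unsafe {
        std::slice::from_raw_parts(
            data.as_ptr() as *const u8,
            data.len() * std::mem::size_of::<f32>(),
        )
    };
    file.write_all(bytes)
}

fn main() -> std::io::Result<()> {
    let mut file = File::create("test.bin")?;
    write_f32_slice(&mut file, &[1.0, 2.0, 3.0])
}
```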
So, I think it's unlikely that this is a bug in `ndarray-npy`. When the supercomputer is operational again, I'd suggest trying the sample code I provided in my previous comment to see if just allocating a large amount of data and writing it to a file is problematic. If that succeeds without errors, then I'd suggest trying to narrow down where the allocation is occurring. (Perhaps the simplest approach would be to step through the code in a debugger and see where the program crashes. Alternatively, you could try replacing the global allocator with something that provides more information, as sketched below, or you could add logging messages between each line of code in the area where you think the allocation might be occurring.)

In your initial comment, you seemed somewhat unsure that the allocation is actually in `ndarray-npy`. My guess is that it's somewhere else in the code. If you're able to provide the code, I could help look for where an allocation might be occurring, but otherwise, I'm not sure there's much I can do to help.
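As one way to get that extra information, here is a minimal sketch of a global allocator wrapper that reports unusually large allocation requests before forwarding them to the system allocator (the 1 GiB threshold is an arbitrary choice for illustration):

```rust
use std::alloc::{GlobalAlloc, Layout, System};

/// A sketch of a wrapper around the system allocator that reports
/// unusually large allocation requests.
struct LoggingAlloc;

unsafe impl GlobalAlloc for LoggingAlloc {
    unsafe fn alloc(&self, layout: Layout) -> *mut u8 {
        // Report requests over 1 GiB before forwarding them. Formatting an
        // integer to stderr does not normally allocate, which avoids
        // recursing into this allocator.
        if layout.size() > (1 << 30) {
            eprintln!("large allocation requested: {} bytes", layout.size());
        }
        System.alloc(layout)
    }

    unsafe fn dealloc(&self, ptr: *mut u8, layout: Layout) {
        System.dealloc(ptr, layout)
    }
}

#[global_allocator]
static GLOBAL: LoggingAlloc = LoggingAlloc;

fn main() {
    // This ~2 GiB allocation should be reported by the logging allocator.
    let big = vec![0u8; 2 << 30];
    println!("allocated {} bytes", big.len());
}
```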
Out of curiosity, have you been able to diagnose the issue? Another tool which may be useful is Heaptrack.