True lazy reading to open EMD files

Open ZanettaPM opened this issue 7 years ago • 8 comments

Currently, lazy reading of EDX spectra in EMD files is not truly lazy. The data is stored in a compressed format in the file. HyperSpy reads the compressed data into memory and decompresses it lazily.

This leads to error messages when the number of frames is large.

Implementing true lazy reading would make it possible to work with big datasets (e.g. maps with a very large number of frames).
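To make the distinction concrete, here is a toy sketch of the two strategies. It is not the EMD format or HyperSpy's reader: the file layout and the names `write_frames`, `EagerCompressedFrames` and `TrulyLazyFrames` are hypothetical, made up for illustration, and it only uses the Python standard library. The first class mimics the behaviour described above (whole compressed stream in memory, decompression on demand); the second reads each frame from disk only when it is requested.

```python
import os
import tempfile
import zlib


def write_frames(path, frames):
    """Write zlib-compressed frames back to back; return (offset, length) per frame."""
    index = []
    with open(path, "wb") as f:
        for frame in frames:
            blob = zlib.compress(frame)
            index.append((f.tell(), len(blob)))
            f.write(blob)
    return index


class EagerCompressedFrames:
    """Hypothetical: loads the whole compressed file, decompresses per frame on demand."""

    def __init__(self, path, index):
        with open(path, "rb") as f:
            self._blob = f.read()  # memory use grows with the file size
        self._index = index

    def resident_bytes(self):
        return len(self._blob)

    def frame(self, i):
        offset, length = self._index[i]
        return zlib.decompress(self._blob[offset:offset + length])


class TrulyLazyFrames:
    """Hypothetical: keeps only the index, reads each frame from disk when requested."""

    def __init__(self, path, index):
        self._path = path
        self._index = index

    def resident_bytes(self):
        return 0  # no frame data is held between calls

    def frame(self, i):
        offset, length = self._index[i]
        with open(self._path, "rb") as f:
            f.seek(offset)
            return zlib.decompress(f.read(length))


def demo():
    """Build a small file and compare both readers frame by frame."""
    frames = [bytes([i]) * 1000 for i in range(50)]
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "frames.bin")
        index = write_frames(path, frames)
        eager = EagerCompressedFrames(path, index)
        lazy = TrulyLazyFrames(path, index)
        same = all(
            eager.frame(i) == lazy.frame(i) == frames[i]
            for i in range(len(frames))
        )
        return same, eager.resident_bytes(), lazy.resident_bytes(), os.path.getsize(path)
```

Both readers return identical frames; the difference is that the first one holds the full compressed stream in memory for as long as it lives, which is what becomes a problem when the number of frames is large.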

ZanettaPM avatar May 29 '18 11:05 ZanettaPM

This turned out to be partially a bug, see hyperspy/hyperspy#2008.

@ZanettaPM, hyperspy/hyperspy#2008 does not implement true lazy reading, but is it enough for your purposes?

francisco-dlp avatar Jul 13 '18 11:07 francisco-dlp

I don't know, I have to check this. But the dask array might be too large if the number of frames is high, right?

ZanettaPM avatar Jul 13 '18 11:07 ZanettaPM

Yes, it keeps the stream array in memory. I've just opened a dataset with 40 GB of data (uncompressed) whose stream array only took 1 GB of memory. Performing a sum over the whole navigation axes took 20 s.

francisco-dlp avatar Jul 13 '18 11:07 francisco-dlp

Wow, so it should be enough for me!

ZanettaPM avatar Jul 13 '18 11:07 ZanettaPM

Could you test it?

francisco-dlp avatar Jul 13 '18 11:07 francisco-dlp

Not right now, but I'll let you know. I'll try this ASAP!

ZanettaPM avatar Jul 13 '18 11:07 ZanettaPM

Reopening because hyperspy/hyperspy#2008 only partially fixes this as it doesn't implement true lazy loading.

francisco-dlp avatar Jul 18 '18 13:07 francisco-dlp

The situation should improve further (without fully fixing the issue) with hyperspy/hyperspy#2012

francisco-dlp avatar Jul 19 '18 07:07 francisco-dlp