op2 reader optimisation ideas: mmap and asyncio io

Open EmanueleCannizzaro opened this issue 5 years ago • 5 comments

Steven,

I have a use case where I need to process the data stored in multiple op2 files.

For a large file, reading the op2 contents can be quite slow.

How can my workflow be optimised?

Thank you. Emanuele

EmanueleCannizzaro avatar Jan 27 '20 08:01 EmanueleCannizzaro

You can certainly limit result outputs if you don't need everything. Beyond that, I'm not sure. You're not giving me much information about your problem.

I wouldn't call OP2 reading slow though. You can definitely manage 500 MB/sec for large OP2s. You need to use the right SORT method though.
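As a rough sketch of limiting the result outputs: this assumes pyNastran's `OP2.set_results()` and `OP2.read_op2()` behave as in recent releases. The helper names `as_name_list` and `read_selected_results` are made up for illustration, and the valid result names depend on your pyNastran version (`model.get_all_results()` should list them).

```python
def as_name_list(result_names):
    """Accept one result name or an iterable of names; return a list."""
    if isinstance(result_names, str):
        return [result_names]
    return list(result_names)


def read_selected_results(op2_path, result_names):
    """Read only the requested result tables from an OP2 file."""
    # Imported lazily so this module can be imported without pyNastran.
    from pyNastran.op2.op2 import OP2

    model = OP2(debug=False)
    # Only the listed results are read and stored; the rest are skipped.
    model.set_results(as_name_list(result_names))
    # build_dataframe=False keeps the results as numpy arrays (no pandas).
    model.read_op2(str(op2_path), build_dataframe=False)
    return model
```

Usage would be something like `read_selected_results("model.op2", "displacements")`, where the file name and the result name are placeholders.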

SteveDoyle2 avatar Jan 27 '20 08:01 SteveDoyle2

Hi Steven,

Hope you're doing well! You mentioned the "SORT" method to Emanuele. Can you tell me more about it? Is that method about extracting only a restricted list of results from an OP2?

Today we load all the OP2 results into memory and then work with them. Is that the right approach? Do you have a better one?

Thanks and have a nice day ;0)

Pascal Lopez

Pascal3100 avatar Jan 27 '20 10:01 Pascal3100

Apologies if I sound aggressive. I think pyNastran is a great piece of software! 👍 I have benchmarked a 380 MB op2 file, and reading it into pandas DataFrames took about 40 s on a powerful virtual machine with a large amount of RAM. I would like to reduce the time by an order of magnitude. Do you have any suggestions? Thank you, Emanuele
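To put those numbers in context, here is a quick back-of-the-envelope calculation. It only uses the figures quoted in this thread (380 MB, 40 s, and the 500 MB/sec mentioned earlier). Note that the 40 s includes building the pandas DataFrames, so it is not a pure read time.

```python
def throughput_mb_per_s(size_mb, seconds):
    """Effective throughput in MB/s for a file of size_mb read in seconds."""
    return size_mb / seconds


FILE_MB = 380.0          # benchmarked file size
MEASURED_S = 40.0        # measured time, including DataFrame construction
QUOTED_MB_PER_S = 500.0  # read rate quoted earlier in the thread

measured_rate = throughput_mb_per_s(FILE_MB, MEASURED_S)
target_s = MEASURED_S / 10.0                    # "an order of magnitude" faster
target_rate = throughput_mb_per_s(FILE_MB, target_s)
best_case_s = FILE_MB / QUOTED_MB_PER_S         # time at the quoted rate

print(f"measured: {measured_rate:.1f} MB/s")
print(f"target:   {target_rate:.1f} MB/s ({target_s:.1f} s)")
print(f"at quoted rate: {best_case_s:.2f} s")
```

So the measured rate is roughly 9.5 MB/s, the target is roughly 95 MB/s, and both are well below the quoted read rate.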

EmanueleCannizzaro avatar Jan 27 '20 20:01 EmanueleCannizzaro

No worries.

One obvious thing you can do is not use pandas. It's shockingly inefficient in both memory and speed unless you're doing stats. NumPy is very good, and the data structures that are used stay as close to the source as possible. I have to take some liberties with the pandas support.

If you're going to be repeatedly reading the same files, you can save it to HDF5. That makes reloading fast.
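A minimal sketch of that caching idea: it assumes pyNastran's `export_hdf5_filename()` and `load_hdf5_filename()` methods on `OP2` (which need `h5py` installed). The helpers `hdf5_cache_path`, `cache_is_fresh` and `load_results` are hypothetical names, and keeping the `.h5` file next to the op2 is just one possible layout.

```python
from pathlib import Path


def hdf5_cache_path(op2_path):
    """Return the sibling .h5 path used as a cache for an OP2 file."""
    return Path(op2_path).with_suffix(".h5")


def cache_is_fresh(op2_path, h5_path):
    """True if the cache exists and is at least as new as the OP2 file."""
    op2_path, h5_path = Path(op2_path), Path(h5_path)
    if not h5_path.exists():
        return False
    return h5_path.stat().st_mtime >= op2_path.stat().st_mtime


def load_results(op2_path):
    """Load from the HDF5 cache when possible, else read the OP2 and cache it."""
    # Imported lazily so the path helpers work without pyNastran installed.
    from pyNastran.op2.op2 import OP2

    model = OP2(debug=False)
    h5_path = hdf5_cache_path(op2_path)
    if cache_is_fresh(op2_path, h5_path):
        model.load_hdf5_filename(str(h5_path))
    else:
        model.read_op2(str(op2_path))
        model.export_hdf5_filename(str(h5_path))
    return model
```

The first call pays the full OP2 read plus the export; later calls only reload the HDF5 file, unless the op2 has been modified since.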

SteveDoyle2 avatar Jan 28 '20 06:01 SteveDoyle2

Hello Emanuele,

If you want to read OP2 files written by Siemens NX Nastran, you can speed up reading by defining the following parameters:

[image: screenshot of the parameters, not preserved]

Regards, Andreas

BaurAndreas avatar Jul 22 '20 09:07 BaurAndreas