pyNastran
op2 reader optimisation ideas: mmap and asyncio io
Steven,
I have a use case where I need to process the data stored in multiple op2 files.
For a large file, reading the op2 contents can be quite slow.
How can my workflow be optimised?
Thank you. Emanuele
You can certainly limit result outputs if you don't need everything. Beyond that, I'm not sure. You're not giving me much information about your problem.
I wouldn't call OP2 reading slow though. You can definitely manage 500 MB/sec for large OP2s. You need to use the right SORT method though.
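As a rough illustration of limiting result outputs, here is a minimal sketch. It assumes pyNastran's `OP2.set_results` and `OP2.read_op2` behave as they do in recent releases; the helper name `read_selected_results` and the default result name are made up for the example, so check the names against your installed version.

```python
def read_selected_results(op2_path, result_names=("displacements",)):
    """Read only the named result tables from an OP2 file.

    Hypothetical helper, not part of pyNastran.
    """
    # Imported lazily so this module can be imported without pyNastran.
    from pyNastran.op2.op2 import OP2

    model = OP2(debug=False)
    # Restrict parsing to the requested tables; everything else is skipped.
    model.set_results(list(result_names))
    # Keep the native NumPy-backed result objects; do not build DataFrames.
    model.read_op2(str(op2_path), build_dataframe=False)
    return model
```

Usage would look like `model = read_selected_results("run.op2", ["displacements"])`, where `run.op2` is a placeholder file name.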
Hi Steven,
Hope you're doing well! You mentioned a "SORT" method to Emanuele. Can you tell me more about it? Is that method about extracting only a restricted list of results from an OP2?
Today we load all the OP2 results into memory and then work with them. Is that the right approach? Do you have a better one?
thanks and have a nice day ;0)
Pascal Lopez
On Mon, 27 Jan 2020 at 09:38, Steven Doyle wrote:
You can certainly limit result outputs if you don't need everything. Beyond that, I'm not sure. You're not giving me much information about your problem.
I wouldn't call OP2 reading slow though. You can definitely manage 500 MB/sec for large OP2s. You need to use the right SORT method though.
Apologies if I sound aggressive. I think pyNastran is a great piece of software! 👍
I have benchmarked a 380 MB op2 file, and reading it into pandas DataFrames took about 40 s on a powerful virtual machine with a large amount of RAM. I would like to reduce the time by an order of magnitude. Do you have any suggestions?
Thank you,
Emanuele
No worries.
One obvious thing you can do is not use pandas. It's shockingly inefficient in both memory and speed unless you're doing stats. NumPy is very good, and the data structures that are used are as close to the source as possible. I have to take some liberties with the pandas support.
If you're going to be repeatedly reading the same files, you can save it to HDF5. That makes reloading fast.
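A minimal sketch of the HDF5 idea follows. It assumes the `OP2.export_hdf5_filename` and `OP2.load_hdf5_filename` methods available in recent pyNastran releases (which need `h5py` installed); the helpers `hdf5_cache_path` and `load_with_cache` are hypothetical names for this example, and writing the cache file next to the op2 is just one possible convention.

```python
from pathlib import Path


def hdf5_cache_path(op2_path):
    """Return the cache file path: same location, '.h5' suffix (an assumption)."""
    return Path(op2_path).with_suffix(".h5")


def load_with_cache(op2_path):
    """Load an OP2, reusing an HDF5 copy when it is at least as new as the op2."""
    # Imported lazily so this module can be imported without pyNastran.
    from pyNastran.op2.op2 import OP2

    op2_path = Path(op2_path)
    h5_path = hdf5_cache_path(op2_path)
    model = OP2(debug=False)

    if h5_path.exists() and h5_path.stat().st_mtime >= op2_path.stat().st_mtime:
        # Fast path: reload the previously exported HDF5 file.
        model.load_hdf5_filename(str(h5_path))
    else:
        # Slow path: parse the op2 once, skipping pandas, then save the cache.
        model.read_op2(str(op2_path), build_dataframe=False)
        model.export_hdf5_filename(str(h5_path))
    return model
```

The first call pays the full parsing cost; later calls on an unchanged file should only pay for the HDF5 load. Whether that reaches an order of magnitude depends on the file and has not been measured here.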
Hello Emanuele,
if you want to read op2-files written by Siemens NX Nastran, you can speed up reading by defining the following parameters:
Regards Andreas