python-fitparse icon indicating copy to clipboard operation
python-fitparse copied to clipboard

Remove RecordBase to speedup processing

Open xmedeko opened this issue 6 years ago • 7 comments

Fix #55 The second commit limits the data processor getattr.

xmedeko avatar Mar 07 '18 12:03 xmedeko

Looks decent. Thanks for submitting. Need more time to review more intimately. What sort of performance gains do you observe.

dtcooper avatar Mar 07 '18 20:03 dtcooper

The first commit is about 10-15%, the second commit about 5%.

The second commit moved caching to the instance of processor, so, if someone uses a new processor every time and have small FITs, may have some performance degradation, maybe. Anyway, maybe the scrubbing is to complicated and may be simplified? Do not know.

xmedeko avatar Mar 07 '18 21:03 xmedeko

I've added BaseType.size caching to the first commit, which is kind of micro-optimization.

xmedeko avatar Mar 09 '18 08:03 xmedeko

Update: I've run benchmarks on my Win32 Intel i5 2.5GHz machine previously. I've tried it on our Linux x64 (Ubuntu Xenial) Intel Xeon 2.6GHz server, and it's 5x faster. (all running Python 3.5.2). IMHO it cannot be caused just by the faster processor, there must be some performance problem with Python on Windows or on x32 arch.

Anyway, removing RecordBase brings more flexibility to class constructor, e.g. see #58.

xmedeko avatar Mar 16 '18 10:03 xmedeko

Just about the commit: FitFileDataProcessor cache methods not just method names I thing the data_processor should be removed from the FitFile completely. It may be designed as a wrapper around DataMessage, so a user may plugin into the parsing process externally. Something like:

with FitFile(...) as ffile:
    pr = FitFileDataProcessor(ffile)
    for m in pr.get_messages(...):
        # m is processed

This way, we do not need any static caching, it's up to the user.

xmedeko avatar Apr 04 '18 07:04 xmedeko

@pR0Ps thanks for CR. Meanwhile I've got mroe insight into the problem with the processors a bit. So I would discard the commit about the processors, make a issue for that where we can debate it more in depth.

xmedeko avatar Apr 25 '18 12:04 xmedeko

If your solution to the processors is the one you suggested above, I wouldn't recommend it. Making things "up to the user" is just pushing the problem along. If most users are going to want the values out of the FitFile object as standard SI units (likely), then it makes sense for that to be the default. The common case should be easy, and the advanced case (get raw values) should be possible. This is the way it is currently set up now.

pR0Ps avatar Apr 25 '18 12:04 pR0Ps