asammdf
Performance benchmark
Hi Daniel, I would be curious to see a comparison, run in your benchmark environment, with the following: https://github.com/ratal/mdfr
Hello Aymeric, I will have a look in the next few days.
Benchmark environment
- Python 3.10.8 (tags/v3.10.8:aaaf517, Oct 11 2022, 16:50:30) [MSC v.1933 64 bit (AMD64)]
- Windows-10-10.0.22621-SP0
- AMD64 Family 25 Model 80 Stepping 0, AuthenticAMD
- numpy 1.23.1
- 15GB installed RAM
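For anyone who wants to report a comparable environment, here is a minimal sketch of how such a list could be produced with the standard library. The function name `describe_environment` is hypothetical and is not taken from the asammdf benchmark script; installed RAM is left out because reading it portably needs a third-party package.

```python
import platform
import sys
from importlib import metadata


def describe_environment():
    """Collect interpreter, OS, CPU and numpy version as bullet strings."""
    try:
        # Look up the installed numpy version without importing numpy itself.
        numpy_version = metadata.version("numpy")
    except metadata.PackageNotFoundError:
        numpy_version = "not installed"
    return [
        # sys.version can span two lines on some platforms, so flatten it.
        "- Python " + sys.version.replace("\n", " "),
        "- " + platform.platform(),
        "- " + platform.processor(),
        "- numpy " + numpy_version,
    ]


env = describe_environment()
print("\n".join(env))
```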
Files used for benchmark:
- mdf version 3.10
  - 167 MB file size
  - 183 groups
  - 36424 channels
- mdf version 4.00
  - 183 MB file size
  - 183 groups
  - 36424 channels
| Open file | Time [ms] | RAM [MB] |
|---|---|---|
| asammdf 7.4.0.dev9 mdfv3 | 358 | 221 |
| mdfr 0.4.1 mdfv3 | 250 | 202 |
| asammdf 7.4.0.dev9 mdfv4 | 455 | 234 |
| mdfr 0.4.1 mdfv4 | 225 | 247 |
| Save file | Time [ms] | RAM [MB] |
|---|---|---|
| asammdf 7.4.0.dev9 mdfv3 | 361 | 381 |
| mdfr 0.4.1 mdfv3 | 275 | 336 |
| asammdf 7.4.0.dev9 mdfv4 | 898 | 400 |
| mdfr 0.4.1 mdfv4 | 126 | 328 |
| Get all channels (36424 calls) | Time [ms] | RAM [MB] |
|---|---|---|
| asammdf 7.4.0.dev9 mdfv3 | 1923 | 383 |
| mdfr 0.4.1 mdfv3 | 0 | 209 |
| asammdf 7.4.0.dev9 mdfv4 | 3934 | 399 |
| mdfr 0.4.1 mdfv4 | 0 | 256 |
I guess that in mdfr all the data is loaded into RAM when the file is opened.
Thanks for investigating, Daniel. The API is different from mdfreader's. What you measured might be metadata parsing only. To load data into memory, you need to call load_channels_data_in_memory(channel_name) or load_all_channels_data_in_memory().

By my estimates, performance should be similar to or worse than asammdf's; there is room for improvement, since it is not really optimised yet. For instance, I have not fully committed to the choice of arrow2 and polars yet. Also, I think the performance gains should come in the long term from processing with polars: the target use case is geared more towards big data.
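To illustrate why a "get" step can show 0 ms when a reader parses only metadata on open, here is a minimal, self-contained sketch. `LazyReader` is a hypothetical stand-in, not the mdfr or asammdf API, and `measure` is an assumed helper. It uses `tracemalloc`, which tracks Python-level allocations only, so its numbers are not comparable to the process RAM figures in the tables above.

```python
import time
import tracemalloc


class LazyReader:
    """Hypothetical reader: metadata on open, samples only on request."""

    def __init__(self, channel_count):
        # "Opening" records channel names only; no sample data is read.
        self.names = [f"ch{i}" for i in range(channel_count)]
        self.data = {}

    def load_all(self):
        # Explicit and potentially expensive step that materialises samples.
        for name in self.names:
            self.data[name] = [0.0] * 100

    def get(self, name):
        # Cheap dictionary lookup; returns None if data was never loaded.
        return self.data.get(name)


def measure(func):
    """Return (result, elapsed_ms, peak_kib) for a single call of func."""
    tracemalloc.start()
    start = time.perf_counter()
    result = func()
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return result, elapsed_ms, peak / 1024.0


reader, open_ms, _ = measure(lambda: LazyReader(1000))
before, get_before_ms, _ = measure(lambda: [reader.get(n) for n in reader.names])
_, load_ms, load_kib = measure(reader.load_all)
after, get_after_ms, _ = measure(lambda: [reader.get(n) for n in reader.names])

print(f"open: {open_ms:.2f} ms, get before load: {get_before_ms:.2f} ms")
print(f"load: {load_ms:.2f} ms, get after load: {get_after_ms:.2f} ms")
```

In this sketch the cost sits in the explicit load step, so a fair comparison would time "open plus load" for the lazy reader against whatever the other library does when channels are requested.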