gcpy
Feature request: Add a script to scrape timing info from benchmark simulation log files
Name and Institution (Required)
Name: Bob Yantosca Institution: Harvard + GCST
Confirm you have reviewed the following documentation
New GCPy feature or discussion
Currently, we have to copy GEOS-Chem Classic and GCHP timing information from log files into a spreadsheet. It would be great if we could have a script to scrape this information and put it into a table.
The script could take an existing table file as an optional input argument and append to it. It could also accept either a single log or a list of logs from which to create a new table.
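As a rough illustration of the append-or-create behavior described above, here is a minimal sketch using only the standard library. The function name `update_timing_table`, the CSV format, and the one-row-per-log layout are all assumptions made for this example; none of them come from GCPy.

```python
import csv
from pathlib import Path


def update_timing_table(table_path, new_rows):
    """Append rows to a CSV timing table, creating it if it does not exist.

    new_rows is a list of dicts, one per log file, for example
    {"log": "run1.log", "HEMCO": 2084.061}. (Hypothetical layout.)
    """
    table_path = Path(table_path)

    # Read the existing table, if there is one
    rows = []
    if table_path.exists():
        with table_path.open(newline="") as f:
            rows = list(csv.DictReader(f))
    rows.extend(new_rows)

    # Build the union of all column names, in first-seen order,
    # so that logs with extra timers do not break the table
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)

    # Rewrite the whole table; missing timers are left blank
    with table_path.open("w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=columns, restval="")
        writer.writeheader()
        writer.writerows(rows)
    return rows
```

Calling the function with a path that does not exist yet creates a new table; calling it again with the same path appends to it. Note that values read back from an existing table are strings, since CSV carries no type information.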
GCHP log file info looks like this:
GEOS-Chem Classic timers info looks like this:
===============================================================================
G E O S - C H E M   T I M E R S
Timer name                DD-hh:mm:ss.SSS  Total Seconds
-------------------------------------------------------------------------------
GEOS-Chem               : 00-06:17:48.512      22668.512
HEMCO                   : 00-00:34:44.061       2084.061
All chemistry           : 00-02:38:10.404       9490.405
=> Gas-phase chem       : 00-01:32:16.579       5536.579
=> Photolysis           : 00-00:12:25.899        745.899
=> Aerosol chem         : 00-00:49:34.497       2974.498
=> Linearized chem      : 00-00:00:29.576         29.577
Transport               : 00-00:27:46.755       1666.755
Convection              : 00-00:41:14.616       2474.616
Boundary layer mixing   : 00-00:50:08.453       3008.453
Dry deposition          : 00-00:00:51.259         51.259
Wet deposition          : 00-00:16:32.521        992.521
Diagnostics             : 00-00:38:41.675       2321.675
Unit conversions        : 00-00:31:20.192       1880.193
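A timers block like the one above could be scraped with a regular expression. The sketch below is only an illustration: the pattern is inferred from this single sample rather than from the GEOS-Chem source, and the name `parse_gcc_timers` is hypothetical.

```python
import re

# Matches lines such as "=> Gas-phase chem : 00-01:32:16.579 5536.579".
# Assumption: every timer line has the form "name : DD-hh:mm:ss.SSS seconds",
# optionally prefixed by "=>" for sub-timers.
TIMER_LINE = re.compile(
    r"^\s*(?:=>\s*)?(?P<name>.+?)\s*:\s+"
    r"(?P<clock>\d+-\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<seconds>\d+(?:\.\d+)?)\s*$"
)


def parse_gcc_timers(lines):
    """Return {timer name: total seconds} from GEOS-Chem Classic log lines."""
    timers = {}
    in_block = False
    for line in lines:
        # The banner marks the start of the timers section
        if "G E O S - C H E M" in line and "T I M E R S" in line:
            in_block = True
            continue
        if in_block:
            match = TIMER_LINE.match(line)
            if match:
                timers[match["name"]] = float(match["seconds"])
    return timers


sample = """\
G E O S - C H E M   T I M E R S
Timer name                DD-hh:mm:ss.SSS  Total Seconds
GEOS-Chem               : 00-06:17:48.512      22668.512
=> Gas-phase chem       : 00-01:32:16.579       5536.579
"""
print(parse_gcc_timers(sample.splitlines()))
```

The column header and separator lines are skipped simply because they do not match the pattern. To process a real log, pass an open file object instead of `sample.splitlines()`.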
Looking for volunteers!
For GEOS-Chem Classic, timing information is also saved to a JSON file, so it would be easy to parse that with Python.
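The layout of that JSON file is not shown in this thread, so the sketch below deliberately avoids assuming one: it walks whatever nesting it finds and collects every numeric leaf, keyed by its path. The names `flatten_numeric` and `read_timers_json` are hypothetical and used only for illustration.

```python
import json


def flatten_numeric(node, prefix=""):
    """Collect numeric leaves of nested JSON objects as {"a/b/c": value}.

    Strings, booleans, nulls, and lists are ignored in this sketch.
    """
    found = {}
    if isinstance(node, dict):
        for key, value in node.items():
            path = f"{prefix}/{key}" if prefix else key
            found.update(flatten_numeric(value, path))
    elif isinstance(node, (int, float)) and not isinstance(node, bool):
        found[prefix] = float(node)
    return found


def read_timers_json(path):
    """Load a JSON file and return its numeric leaves as a flat dict."""
    with open(path) as f:
        return flatten_numeric(json.load(f))
```

Once the real key names are known, the flat dictionary can be filtered down to the timers of interest.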
The GCHP log file also breaks each gridded component down into further timing detail. The section that includes GEOS-Chem looks like this (the example is from a transport tracers simulation, which is why the chemistry time is so low):
Min, mean, and max are included to show the range of times across CPUs. Inclusive means the time includes any subroutines called, while exclusive means it does not (i.e., the timer is stopped for the sub-processes).
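The inclusive/exclusive relationship can be illustrated with a toy example: a timer's exclusive time is its inclusive time minus the inclusive times of the timers it calls directly, assuming those child timers do not overlap. The timer names and numbers below are made up for illustration and are not GCHP output.

```python
def exclusive_times(name, tree, inclusive):
    """Return {timer: exclusive seconds} for `name` and everything below it.

    tree maps a timer to the list of timers it calls directly;
    inclusive maps a timer to its inclusive time in seconds.
    """
    children = tree.get(name, [])
    # Subtract the time spent in direct children from this timer's total
    result = {name: inclusive[name] - sum(inclusive[c] for c in children)}
    for child in children:
        result.update(exclusive_times(child, tree, inclusive))
    return result


# Made-up numbers for illustration only
tree = {"Run": ["Chemistry", "Transport"], "Chemistry": ["Photolysis"]}
inclusive = {"Run": 100.0, "Chemistry": 40.0, "Transport": 50.0, "Photolysis": 15.0}
print(exclusive_times("Run", tree, inclusive))
```

Here `Run` has an inclusive time of 100 s but an exclusive time of only 10 s, because 90 s are spent inside `Chemistry` and `Transport`. A timer with no children has identical inclusive and exclusive times.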
See PR #319 for sample output. I will also try to scrape the timings of the GCHP component.
This has now been completed, so we can close this issue.