Feature request: Add a script to scrape timing info from benchmark simulation log files

Open yantosca opened this issue 1 year ago • 3 comments

Name and Institution (Required)

Name: Bob Yantosca Institution: Harvard + GCST

New GCPy feature or discussion

Currently, we have to copy GEOS-Chem Classic and GCHP timing information from log files into a spreadsheet. It would be great if we could have a script to scrape this information and put it into a table.

The script could take an existing table file as an optional input argument and append to it. It could also accept either a single log or a list of logs from which to create a new table.
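A rough sketch of what that command-line interface might look like, assuming the table is stored as CSV. The names `build_parser`, `append_rows`, and the `--table` option are hypothetical and only illustrate the "one or more logs, optionally append to an existing table" idea:

```python
import argparse
import csv
import os


def build_parser():
    """Command-line interface: one or more logs, plus an optional table file."""
    parser = argparse.ArgumentParser(
        description="Scrape timing info from benchmark log files (sketch)."
    )
    # One log or a list of logs
    parser.add_argument("logs", nargs="+", help="log file(s) to scrape")
    # Hypothetical option: existing table (CSV assumed) to append to
    parser.add_argument("--table", default=None, help="CSV table to append to")
    return parser


def append_rows(table_path, rows, fieldnames):
    """Append rows (dicts) to a CSV table; write the header only for a new file."""
    is_new = not os.path.exists(table_path) or os.path.getsize(table_path) == 0
    with open(table_path, "a", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames)
        if is_new:
            writer.writeheader()
        writer.writerows(rows)
```

Calling `append_rows` twice on the same path keeps a single header row, so the same function could cover both the "new table" and the "append to existing table" cases.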

GCHP log file info looks like this:

[Image: GCHP log file timing output]

GEOS-Chem Classic timers info looks like this:

===============================================================================
G E O S - C H E M   T I M E R S
 
  Timer name                       DD-hh:mm:ss.SSS     Total Seconds
-------------------------------------------------------------------------------
  GEOS-Chem                     :  00-06:17:48.512         22668.512
  HEMCO                         :  00-00:34:44.061          2084.061
  All chemistry                 :  00-02:38:10.404          9490.405
  => Gas-phase chem             :  00-01:32:16.579          5536.579
  => Photolysis                 :  00-00:12:25.899           745.899
  => Aerosol chem               :  00-00:49:34.497          2974.498
  => Linearized chem            :  00-00:00:29.576            29.577
  Transport                     :  00-00:27:46.755          1666.755
  Convection                    :  00-00:41:14.616          2474.616
  Boundary layer mixing         :  00-00:50:08.453          3008.453
  Dry deposition                :  00-00:00:51.259            51.259
  Wet deposition                :  00-00:16:32.521           992.521
  Diagnostics                   :  00-00:38:41.675          2321.675
  Unit conversions              :  00-00:31:20.192          1880.193
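As a starting point, the timers section above could probably be scraped with a regular expression. This is only a minimal sketch that assumes the log layout matches the sample shown here; `parse_classic_timers` and `TIMER_LINE` are hypothetical names, not existing GCPy functions:

```python
import re

# Matches timer lines such as
#   "  => Gas-phase chem             :  00-01:32:16.579          5536.579"
# (layout assumed from the sample output in this issue)
TIMER_LINE = re.compile(
    r"^\s*(?:=>\s*)?(?P<name>.+?)\s*:\s+"
    r"(?P<clock>\d+-\d{2}:\d{2}:\d{2}\.\d+)\s+"
    r"(?P<seconds>\d+(?:\.\d+)?)\s*$"
)


def parse_classic_timers(text):
    """Return {timer name: total seconds} from GEOS-Chem Classic log text."""
    timers = {}
    in_section = False
    for line in text.splitlines():
        # Detect the section header regardless of the spacing between letters
        if "".join(line.split()) == "GEOS-CHEMTIMERS":
            in_section = True
            continue
        if not in_section:
            continue
        match = TIMER_LINE.match(line)
        if match:
            timers[match.group("name")] = float(match.group("seconds"))
    return timers
```

The leading `=>` on sub-timers is dropped, so "Gas-phase chem" and "All chemistry" end up as sibling keys. The sketch keeps scanning to the end of the text once the header is found, so a real script may want to stop at the end of the timers section.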

Looking for volunteers!

yantosca, Apr 12 '24 17:04

For GEOS-Chem Classic, timing information is also saved to a JSON file, so it would be easy to parse that with Python.
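A minimal sketch of the JSON route. The exact layout of the timers JSON file is not shown in this thread, so the code below makes no assumption beyond "a JSON object, possibly nested, with numeric leaf values"; `flatten_numeric` and `read_timers_json` are hypothetical helper names:

```python
import json


def flatten_numeric(obj, prefix=""):
    """Flatten a nested dict into {"outer/inner": value} for numeric leaves."""
    flat = {}
    for key, value in obj.items():
        path = f"{prefix}/{key}" if prefix else key
        if isinstance(value, dict):
            flat.update(flatten_numeric(value, path))
        elif isinstance(value, (int, float)) and not isinstance(value, bool):
            flat[path] = float(value)
    return flat


def read_timers_json(path):
    """Load a timers JSON file (top-level object assumed) and flatten it."""
    with open(path, encoding="utf-8") as f:
        return flatten_numeric(json.load(f))
```

Non-numeric leaves (for example, formatted time strings) are skipped, which keeps the result ready to be written as one table row per log.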

yantosca, Apr 12 '24 19:04

The GCHP log file also breaks down each gridded component into further timing detail. The section that includes GEOS-Chem looks like this (the example is from a transport tracers simulation, which is why the chemistry time is so low):

[Screenshot: GCHP timing section that includes GEOS-Chem]

Min, mean, and max are included to show the range of times across CPUs. Inclusive means the time includes subroutines called, while exclusive means it does not (e.g. the timer is stopped for the sub-processes).
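To make the terminology concrete, here is a small sketch of how the two kinds of numbers relate, using the usual profiler convention that exclusive time is inclusive time minus the inclusive time of the direct sub-timers. The function names are hypothetical, and this is not taken from the GCHP source:

```python
from statistics import mean


def exclusive_time(inclusive, child_inclusive_times):
    """Exclusive time: inclusive time minus time spent in direct sub-timers."""
    return inclusive - sum(child_inclusive_times)


def summarize_across_cpus(times):
    """Min, mean, and max of one timer's values across CPUs."""
    return {"min": min(times), "mean": mean(times), "max": max(times)}
```

For instance, a timer with 10 s inclusive time whose two sub-timers account for 3 s and 2 s would have 5 s of exclusive time.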

lizziel, Apr 16 '24 15:04

See PR #319 for sample output. I will also try to scrape the timings of the GCHP component.

yantosca, May 06 '24 14:05

This has now been completed, so we can close this issue.

yantosca, May 29 '24 17:05