earth2studio icon indicating copy to clipboard operation
earth2studio copied to clipboard

Add TimeWindow temporal wrapper and CorrDiffCMIP6 diagnostic model

Open gertln opened this issue 1 month ago • 3 comments

Earth2Studio Pull Request

Description

  1. TimeWindow data wrapper - A generic temporal data wrapper that fetches data at multiple time offsets
  2. CorrDiffCMIP6 model wrapper - A CanESM5 CMIP6-specific variant of CorrDiff
  3. diagnostic_from_data() - New workflow for applying diagnostic models directly to data without requiring a prognostic model

Checklist

  • [x] I am familiar with the Contributing Guidelines.
  • [x] New or existing tests cover these changes.
  • [x] The documentation is up to date with these changes.
  • [ ] The CHANGELOG.md is up to date with these changes.
  • [ ] An issue is linked to this pull request.
  • [ ] Assess and address Greptile feedback (AI code review bot for guidance; use discretion, addressing all feedback is not required).

Dependencies

gertln avatar Nov 18 '25 18:11 gertln

Disclaimer: This is AI-generated, please review response for accuracy

Greptile Overview

Greptile Summary

This PR adds three major features for CMIP6 climate model data processing:

  • TimeWindow data wrapper - Generic temporal data wrapper that fetches data at multiple time offsets, concatenates them along the variable dimension with suffixes, and provides flexible ordering options. Implementation is clean with comprehensive error handling and async support.

  • CorrDiffCMIP6 model - CMIP6-specific variant of CorrDiff that adds time-dependent features (solar zenith angle, hour of day), relaxed grid validation for Gaussian grids, and specialized preprocessing. Includes workaround for training bug where hour-of-day values were placed in the cosine-latitude position.

  • diagnostic_from_data workflow - New workflow function in run.py for applying diagnostic models directly to data without prognostic models. Note: Previous developer feedback suggests adding functions to run.py may be an anti-pattern for this package - consider confirming this is acceptable.

The implementation is thorough with extensive test coverage. Previous issues (missing TimeWindow file, debug print statements) have been addressed in subsequent commits.

Confidence Score: 4/5

  • Safe to merge with minor architectural clarification needed
  • Score reflects solid implementation quality with comprehensive tests, but slight uncertainty around the run.py addition given previous maintainer feedback about it being an anti-pattern. The core functionality (TimeWindow and CorrDiffCMIP6) is well-implemented.
  • Review earth2studio/run.py - confirm that adding diagnostic_from_data() workflow aligns with package architecture principles

Important Files Changed

File Analysis

Filename Score Overview
earth2studio/data/time_window.py 5/5 New TimeWindow data wrapper with comprehensive temporal offset handling, proper error handling, and flexible variable ordering options
earth2studio/models/dx/corrdiff.py 4/5 Major refactoring: adds CorrDiffCMIP6 variant with time features, improved grid validation, configurable sigma parameters, and TimeWindow integration
earth2studio/run.py 3/5 Added diagnostic_from_data workflow for applying diagnostic models directly to data without prognostic models

greptile-apps[bot] avatar Nov 18 '25 18:11 greptile-apps[bot]

@greptile-apps

gertln avatar Nov 18 '25 20:11 gertln

@greptile-apps rescan

gertln avatar Nov 19 '25 17:11 gertln

@CharlelieLrt you're correct, please do not add to the run.py. depending on what the workflow is, maybe this should be added as an example. Regardless, I think it would be best to push that to another pr for discussion and just focus on getting the wrapper added.

+1 it would be nice to see a code snippet in the PR that runs the model and also plots results. Please see other model PRs from me or Oliver for examples. Good for reference in the future

Thanks guys!

NickGeneva avatar Dec 04 '25 11:12 NickGeneva