Add TimeWindow temporal wrapper and CorrDiffCMIP6 diagnostic model
Earth2Studio Pull Request
Description
- TimeWindow data wrapper - A generic temporal data wrapper that fetches data at multiple time offsets
- CorrDiffCMIP6 model wrapper - A CanESM5 CMIP6-specific variant of CorrDiff
- diagnostic_from_data() - New workflow for applying diagnostic models directly to data without requiring a prognostic model
Checklist
- [x] I am familiar with the Contributing Guidelines.
- [x] New or existing tests cover these changes.
- [x] The documentation is up to date with these changes.
- [ ] The CHANGELOG.md is up to date with these changes.
- [ ] An issue is linked to this pull request.
- [ ] Assess and address Greptile feedback (AI code review bot for guidance; use discretion, addressing all feedback is not required).
Dependencies
Disclaimer: This is AI-generated, please review response for accuracy
Greptile Overview
Greptile Summary
This PR adds three major features for CMIP6 climate model data processing:
-
TimeWindow data wrapper - Generic temporal data wrapper that fetches data at multiple time offsets, concatenates them along the variable dimension with suffixes, and provides flexible ordering options. Implementation is clean with comprehensive error handling and async support.
-
CorrDiffCMIP6 model - CMIP6-specific variant of CorrDiff that adds time-dependent features (solar zenith angle, hour of day), relaxed grid validation for Gaussian grids, and specialized preprocessing. Includes workaround for training bug where hour-of-day values were placed in the cosine-latitude position.
-
diagnostic_from_data workflow - New workflow function in
run.pyfor applying diagnostic models directly to data without prognostic models. Note: Previous developer feedback suggests adding functions torun.pymay be an anti-pattern for this package - consider confirming this is acceptable.
The implementation is thorough with extensive test coverage. Previous issues (missing TimeWindow file, debug print statements) have been addressed in subsequent commits.
Confidence Score: 4/5
- Safe to merge with minor architectural clarification needed
- Score reflects solid implementation quality with comprehensive tests, but slight uncertainty around the
run.pyaddition given previous maintainer feedback about it being an anti-pattern. The core functionality (TimeWindow and CorrDiffCMIP6) is well-implemented. - Review
earth2studio/run.py- confirm that addingdiagnostic_from_data()workflow aligns with package architecture principles
Important Files Changed
File Analysis
| Filename | Score | Overview |
|---|---|---|
| earth2studio/data/time_window.py | 5/5 | New TimeWindow data wrapper with comprehensive temporal offset handling, proper error handling, and flexible variable ordering options |
| earth2studio/models/dx/corrdiff.py | 4/5 | Major refactoring: adds CorrDiffCMIP6 variant with time features, improved grid validation, configurable sigma parameters, and TimeWindow integration |
| earth2studio/run.py | 3/5 | Added diagnostic_from_data workflow for applying diagnostic models directly to data without prognostic models |
@greptile-apps
@greptile-apps rescan
@CharlelieLrt you're correct, please do not add to the run.py. depending on what the workflow is, maybe this should be added as an example. Regardless, I think it would be best to push that to another pr for discussion and just focus on getting the wrapper added.
+1 it would be nice to see a code snippet in the PR that runs the model and also plots results. Please see other model PRs from me or Oliver for examples. Good for reference in the future
Thanks guys!