
Add simultaneous member and init resampling to HindcastEnsemble.bootstrap with xbootstrap integration

Open Copilot opened this issue 5 months ago • 2 comments

This PR implements the ability to resample both "member" and "init" dimensions simultaneously in HindcastEnsemble.bootstrap(), addressing the feature request in the original issue. The implementation now integrates with the xbootstrap library for efficient nested block bootstrap resampling.

Problem

Previously, HindcastEnsemble.bootstrap() only supported resampling along a single dimension at a time:

# Only single dimension resampling was possible
HindcastEnsemble.bootstrap(resample_dim="member", ...)
HindcastEnsemble.bootstrap(resample_dim="init", ...)

Users requested the ability to resample both dimensions simultaneously for more comprehensive bootstrap analysis.

Solution

The resample_dim parameter now accepts both single strings and lists of strings, enabling simultaneous multi-dimensional resampling:

# New: simultaneous resampling across multiple dimensions
HindcastEnsemble.bootstrap(
    resample_dim=["member", "init"],  # Resample both dimensions
    metric="crps",
    comparison="m2o",
    iterations=500,
    alignment="same_inits",
)
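As a rough illustration of what resampling along both dimensions can mean, the sketch below draws initializations with replacement and then draws members with replacement on top of that, the order suggested in the issue comments. It uses plain Python lists instead of xarray objects. The helper name `resample_init_then_member` and the toy data are hypothetical, and applying the same member draw to every init is an assumption, not a statement about what the PR does.

```python
import random


def resample_init_then_member(data, rng):
    """Resample a 2-D table data[init][member] with replacement along both axes.

    Hypothetical helper for illustration only; not part of climpred.
    """
    n_init = len(data)
    n_member = len(data[0])
    # Step 1: draw initializations with replacement.
    init_idx = [rng.randrange(n_init) for _ in range(n_init)]
    # Step 2: draw members with replacement, applied on top of the resampled inits.
    # Assumption: the same member draw is reused for every init.
    member_idx = [rng.randrange(n_member) for _ in range(n_member)]
    return [[data[i][m] for m in member_idx] for i in init_idx]


rng = random.Random(0)
# Toy data: 4 inits x 3 members, encoded so that value = 10 * init + member.
toy = [[10 * i + m for m in range(3)] for i in range(4)]
resampled = resample_init_then_member(toy, rng)
```

The result has the same shape as the input, but both the rows (inits) and the columns (members) may now contain repeats.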

Implementation Details

  1. Enhanced Type Support: Updated method signatures to accept Union[str, List[str]] for the resample_dim parameter

  2. xbootstrap Integration: Added optional integration with the xbootstrap library for efficient nested circular block bootstrap resampling

    • Uses xb.block_bootstrap() for multi-dimensional resampling when available
    • Graceful fallback to sequential method when xbootstrap is not installed
    • Provides significant performance improvements for multi-dimensional cases
  3. Smart Routing Logic: Modified bootstrap logic to detect multi-dimensional resampling and automatically choose the most efficient method:

    • Prefers xbootstrap for multi-dimensional resampling when available
    • Falls back to loop-based sequential resampling otherwise
    • Routes through appropriate functions for maximum compatibility
  4. Backward Compatibility: All existing single-dimension calls continue to work unchanged with no API modifications required

  5. Comprehensive Documentation: Updated docstrings with examples and clear descriptions of the new functionality, including xbootstrap integration notes
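The type handling and routing described in the list above could look roughly like the following. This is a minimal sketch, not the PR's code: `normalize_resample_dim`, `xbootstrap_available`, and `choose_method` are hypothetical names, and the returned strings merely label the three code paths (single dimension, xbootstrap, sequential fallback).

```python
import importlib.util
from typing import List, Optional, Union


def normalize_resample_dim(resample_dim: Union[str, List[str]]) -> List[str]:
    """Accept a single dimension name or a list of names; always return a list."""
    if isinstance(resample_dim, str):
        return [resample_dim]
    return list(resample_dim)


def xbootstrap_available() -> bool:
    """Check whether the optional xbootstrap package can be imported."""
    return importlib.util.find_spec("xbootstrap") is not None


def choose_method(
    resample_dim: Union[str, List[str]],
    have_xbootstrap: Optional[bool] = None,
) -> str:
    """Pick a resampling path (hypothetical labels, for illustration only)."""
    dims = normalize_resample_dim(resample_dim)
    if have_xbootstrap is None:
        have_xbootstrap = xbootstrap_available()
    if len(dims) == 1:
        return "single"  # existing single-dimension behaviour, unchanged
    if have_xbootstrap:
        return "xbootstrap"  # preferred path for several dimensions
    return "sequential"  # loop-based fallback when xbootstrap is missing
```

Passing `have_xbootstrap` explicitly keeps the routing decision testable without installing or uninstalling the optional dependency.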

Files Changed

  • src/climpred/bootstrap.py: Added xbootstrap integration, _resample_multiple_dims_xbootstrap() function, resample_skill_xbootstrap() method, and updated existing functions
  • src/climpred/classes.py: Updated method signatures, bootstrap logic with xbootstrap routing, and documentation
  • src/climpred/tests/test_bootstrap.py: Added comprehensive tests for multi-dimensional resampling including xbootstrap-specific tests

Testing

Added test coverage for:

  • Multi-dimensional resampling function behavior with both sequential and xbootstrap methods
  • Bootstrap integration with list of resample dimensions using both approaches
  • Proper handling of xbootstrap availability with skip conditions
  • Verification that single- and multi-dimensional resampling produce different results, as expected
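A skip condition for the optional dependency might be sketched as follows. The example uses the standard-library `unittest` module so that it runs without extra packages; in a pytest suite, `pytest.importorskip("xbootstrap")` serves the same purpose. The test class and the `resample` helper are hypothetical stand-ins, not the tests added in this PR.

```python
import importlib.util
import random
import unittest

HAS_XBOOTSTRAP = importlib.util.find_spec("xbootstrap") is not None


def resample(values, rng):
    """Draw len(values) items with replacement (hypothetical stand-in)."""
    return [rng.choice(values) for _ in values]


class TestResampling(unittest.TestCase):
    def test_resample_keeps_length_and_values(self):
        values = [1, 2, 3, 4]
        out = resample(values, random.Random(0))
        self.assertEqual(len(out), len(values))
        self.assertTrue(set(out) <= set(values))

    @unittest.skipUnless(HAS_XBOOTSTRAP, "xbootstrap not installed")
    def test_xbootstrap_path(self):
        # A real test would exercise the xbootstrap-backed code path here.
        self.assertTrue(HAS_XBOOTSTRAP)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestResampling)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

When xbootstrap is absent, the second test is reported as skipped and does not count as a failure.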

Key Benefits

  • Performance: Significant improvement for multi-dimensional resampling when using xbootstrap
  • Statistical Rigor: Proper nested circular block bootstrap methodology via xbootstrap
  • Flexibility: Supports both simple sequential resampling and advanced block bootstrap approaches
  • Robustness: Graceful degradation when xbootstrap is not available
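For readers unfamiliar with the term, a circular block bootstrap resamples contiguous blocks of indices and lets a block wrap around from the end of the series to the start. The sketch below shows the general idea for one dimension in plain Python. It is a generic illustration with a hypothetical function name, not xbootstrap's implementation, and it does not cover the nested multi-dimensional case.

```python
import random


def circular_block_indices(n, block_size, rng):
    """Return n resampled indices built from wrapped-around blocks.

    Hypothetical helper illustrating a generic circular block bootstrap.
    """
    indices = []
    while len(indices) < n:
        start = rng.randrange(n)  # any position may start a block
        # Wrap past the end of the series back to the beginning.
        indices.extend((start + k) % n for k in range(block_size))
    return indices[:n]  # trim the last block to the original length


idx = circular_block_indices(n=10, block_size=3, rng=random.Random(42))
```

Within each block the indices stay consecutive (modulo the series length), which preserves short-range dependence such as autocorrelation between neighbouring initializations.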

This enhancement provides users with more flexible and efficient bootstrap analysis capabilities while maintaining full backward compatibility with existing code.

Original prompt

This section details the original issue you should resolve

<issue_title>climpred.classes.HindcastEnsemble.bootstrap dim member and init simultaneously</issue_title> <issue_description>Could the HindcastEnsemble.bootstrap resample for "member" and "init" in dim simultaneously? And how?</issue_description>

Comments on the Issue (you are @copilot in this section)

@aaronspring Thank you for considering climpred and raising this issue @zmli-666.

Bootstrapping / resampling along both dimensions simultaneously isn't implemented (yet). It could be implemented by doing first the init resampling and then internally a member resampling on top of that.

PR welcome 😃 </comment_new> <comment_new>@aaronspring

Could explore https://github.com/dougiesquire/xbootstrap</comment_new> <comment_new>@aaronspring played a bit around with xbootstrap: https://gist.github.com/aaronspring/471e70f787aef6689825182e794421fb</comment_new>
Fixes pangeo-data/climpred#763


Oct 05 '25 11:10 Copilot

@copilot try with xbootstrap

Oct 06 '25 14:10 aaronspring

@copilot try with xbootstrap

Implemented xbootstrap integration for efficient multi-dimensional resampling! The system now:

  1. Automatically detects if xbootstrap is available
  2. Uses xb.block_bootstrap() for multi-dimensional resampling when possible
  3. Falls back gracefully to the original method if xbootstrap is not installed
  4. Maintains full backward compatibility - no API changes needed

Key benefits:

  • Much more efficient than the loop-based approach
  • Proper nested circular block bootstrap methodology
  • No breaking changes to existing code

The same API call now automatically leverages xbootstrap:

HindcastEnsemble.bootstrap(resample_dim=["member", "init"], ...)

Commit 629738d implements the xbootstrap integration as suggested.

Oct 06 '25 14:10 Copilot