feat: implement per-test timeout limits for slow test investigation

Open aaronsteers opened this issue 2 months ago • 6 comments

Summary

This PR implements per-test timeout limits using the existing pytest-timeout plugin to address Windows CI timeout issues. The changes include:

  • CI timeout configuration: Added --timeout=300 (5 minutes per test) and --session-timeout=3600 (1 hour total) to all pytest jobs
  • New poe tasks: Added timeout-controlled test execution tasks for different scenarios
  • Slow test analysis: Created an executable script to identify and analyze tests marked with @pytest.mark.slow
  • Documentation: Added comprehensive timeout configuration guide

The implementation uses the pytest-timeout plugin (v2.4.0), which is already installed in PyAirbyte, to provide granular timeout control that should prevent the 1-hour CI timeouts currently experienced on Windows.

Review & Testing Checklist for Human

  • [ ] Verify timeout values are appropriate: Test that 300s per-test limit doesn't cause legitimate tests to fail unexpectedly (especially integration tests that may need longer execution time)
  • [ ] Test new poe tasks locally: Run poetry run poe test-integration-timeout and poetry run poe test-with-short-timeout to ensure they work correctly and respect timeout limits
  • [ ] Confirm scope alignment: The implementation applies timeouts to ALL CI jobs (Linux + Windows), not just Windows as originally discussed - verify this aligns with requirements
  • [ ] Validate session timeout: Ensure 3600s (1 hour) session timeout allows full test suite completion on Ubuntu while preventing Windows hangs

Notes

⚠️ Scope Change: The original plan targeted Windows-only timeout limits, but this implementation applies timeouts to all CI matrix jobs. This was necessary because the pytest job in the workflow doesn't have separate Windows/Linux conditional steps.

Test Analysis: The slow test analysis script identifies 11 integration tests marked as @pytest.mark.slow, primarily involving the source-faker connector with multiple cache types and write strategies. These tests run at scales of 200-300 records and are parametrized across DuckDB, Postgres, BigQuery, and Snowflake caches.

Timeout Strategy:

  • Per-test: 300s (down from 600s global timeout)
  • Session: 3600s (1 hour maximum)
  • Integration tests can use test-integration-timeout task with 180s limit for faster feedback
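The limits above can be sketched as a small helper that assembles the pytest command line. This is an illustration under stated assumptions, not the PR's workflow or poe task definitions: the helper name, the constant names, and the tests/integration_tests path are hypothetical, and whether the integration task also sets a session timeout is assumed. Only the --timeout and --session-timeout flags and the 300s, 3600s, and 180s values come from the description.

```python
from typing import List, Optional

# Values quoted in the PR description; the names are hypothetical.
PER_TEST_TIMEOUT_S = 300
SESSION_TIMEOUT_S = 3600
INTEGRATION_TIMEOUT_S = 180


def build_pytest_args(
    per_test: int = PER_TEST_TIMEOUT_S,
    session: Optional[int] = SESSION_TIMEOUT_S,
    extra: Optional[List[str]] = None,
) -> List[str]:
    """Return a pytest argument list applying pytest-timeout limits."""
    if per_test <= 0:
        raise ValueError("per-test timeout must be positive")
    if session is not None and session < per_test:
        raise ValueError("session timeout must not be shorter than the per-test timeout")
    args = ["pytest", f"--timeout={per_test}"]
    if session is not None:
        args.append(f"--session-timeout={session}")
    args.extend(extra or [])
    return args


# Default CI limits: 300s per test, 3600s per session.
ci_args = build_pytest_args()

# Faster feedback for integration tests; the path is an assumed placeholder.
integration_args = build_pytest_args(
    per_test=INTEGRATION_TIMEOUT_S,
    extra=["tests/integration_tests"],
)

print(" ".join(ci_args))
print(" ".join(integration_args))
```

Keeping the numbers in one place makes it easier to check the relationship between them: with a 300s per-test limit, a 3600s session leaves room for at most twelve tests that each run to their limit before the session itself is cut off.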

Requested by AJ Steers (@aaronsteers) in Devin session: https://app.devin.ai/sessions/2cd62f25e51440258413a76d0e0763c6

Summary by CodeRabbit

  • New Features

    • Added a tool to report and summarize slow tests.
  • Tests

    • Enforced per-test and overall session timeouts in CI to reduce hangs and improve reliability.
  • Documentation

    • Added a guide explaining test timeout configuration, strategies, and usage examples.
  • Chores

    • Added convenient tasks/commands to run unit, integration, and slow-test analyses with predefined timeouts and duration reporting.

aaronsteers • Sep 09 '25 21:09