Fix driver behavior on TruncateError handling - stop retrying on permanent failures
Description
This PR fixes the issue where the Python driver incorrectly retries TRUNCATE_ERROR (0x1003), which should not be retried as it indicates a permanent failure rather than a transient error.
Problem
-
TruncateErrorwas grouped with transient errors (OverloadedErrorMessage,IsBootstrappingErrorMessage,ServerError) that triggerretry_policy.on_request_error() - The default retry policy returns
RETRY_NEXT_HOST, causing unnecessary retries on other hosts - When a table is dropped during truncate, users would see a confusing "unconfigured table" error from validation on the second retry instead of the original truncate error
Solution
- Modified error handling to keep
TruncateErrorin the error tuple with similar request errors for better code organization - Added a specific check within the error handling block to bypass retry logic for
TruncateError -
TruncateErrornow immediately rethrows without retry, sharing logging and metrics code with other request errors - Updated integration tests to verify
TruncateErroris not retried - Added comprehensive unit tests to document expected behavior
Note: Investigation confirmed that server error messages are already properly included in TruncateError exceptions via the base ErrorMessage class's __str__ method, so no changes were needed for that aspect.
Changes Made
-
cassandra/cluster.py: Refactored
TruncateErrorhandling for better readability while preventing retry logic-
TruncateErrorremains in the error tuple (line 4779) to show its relationship to other request errors - Added specific check (lines 4784-4786) to bypass retry for
TruncateError - Reduced code duplication by sharing logging and metrics handling
-
-
tests/integration/simulacron/test_policies.py: Updated tests to verify no retry behavior for
TruncateError -
tests/unit/test_truncate_error.py: Added unit tests for
TruncateErrormessage handling and error code verification
Testing
- ✅ Unit tests pass for
TruncateErrorbehavior - ✅ Exception handling tests pass
- ✅ CodeQL security analysis passed with no vulnerabilities
- ✅ Manual verification confirms
TruncateErrorno longer triggers retries - ✅ Type checking logic verified to work correctly after refactoring
Pre-review checklist
- [x] I have split my patch into logically separate commits.
- [x] All commit messages clearly explain what they change and why.
- [x] I added relevant tests for new features and bug fixes.
- [x] All commits compile, pass static checks and pass test.
- [x] PR description sums up the changes and reasons why they should be introduced.
- [ ] I have provided docstrings for the public items that I want to introduce.
- [ ] I have adjusted the documentation in
./docs/source/.
Original prompt
This section details on the original issue you should resolve
<issue_title>Driver retries TRUNCATE_ERROR (0x1003), but shouldn't</issue_title> <issue_description>The
truncate_statementclass converts any error happened during truncate table execution toTRUNCATE_ERROR. The driver callsretry_policy.on_request_errorforTruncateErrorwhich returns RETRY_NEXT_HOST, this doesn't make much sense. For example, when a table is dropped during truncate table the user will seeunconfigured table <table name>-- the validation triggered on the second retry, instead of the original error.Another problem is that
TruncateErrorPython class ignores the error message sent by the server -- it always returns unhelpful 'Error during truncate'.</issue_description>Comments on the Issue (you are @copilot in this section)
- Fixes scylladb/python-driver#600
💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.
@copilot - please complete the work here.
@copilot - please complete the work here.
Work completed! The issue has been fixed in commit 9222b2e.
Summary of changes:
- Fixed retry behavior: TruncateError is no longer retried on other hosts, preventing users from seeing confusing secondary errors instead of the original truncate error
- Added test coverage: Updated integration tests and added unit tests to verify the new behavior
- Verified message handling: Confirmed that server error messages are properly included in TruncateError exceptions
The driver now correctly handles TruncateError as a permanent failure that should not be retried.