CI Failure (`cloud_storage::download_exception (NotFound)`) in `CloudRetentionTest.test_cloud_retention`
https://buildkite.com/redpanda/vtools/builds/11824
Module: rptest.tests.cloud_retention_test
Class: CloudRetentionTest
Method: test_cloud_retention
Arguments: {
"cloud_storage_type": 1,
"max_consume_rate_mb": null
}
test_id: CloudRetentionTest.test_cloud_retention
status: FAIL
run time: 472.215 seconds
<BadLogLines nodes=ip-172-31-12-189(1) example="ERROR 2024-02-12 12:19:23,003 [shard 2:fetc] kafka - fetch.cc:1171 - unknown exception thrown: cloud_storage::download_exception (NotFound)">
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 184, in _do_run
    data = self.run_test()
  File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 269, in run_test
    return self.test_context.function(self.test)
  File "/usr/local/lib/python3.10/dist-packages/ducktape/mark/_mark.py", line 481, in wrapper
    return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
  File "/home/ubuntu/redpanda/tests/rptest/services/cluster.py", line 173, in wrapped
    redpanda.raise_on_bad_logs(
  File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 1412, in raise_on_bad_logs
    raise BadLogLines(bad_lines)
rptest.services.utils.BadLogLines: <BadLogLines nodes=ip-172-31-12-189(1) example="ERROR 2024-02-12 12:19:23,003 [shard 2:fetc] kafka - fetch.cc:1171 - unknown exception thrown: cloud_storage::download_exception (NotFound)">
* https://buildkite.com/redpanda/vtools/builds/11875
* https://buildkite.com/redpanda/vtools/builds/11880
* https://buildkite.com/redpanda/vtools/builds/11881
* https://buildkite.com/redpanda/vtools/builds/11886
* https://buildkite.com/redpanda/vtools/builds/11892
* https://buildkite.com/redpanda/vtools/builds/11891
* https://buildkite.com/redpanda/vtools/builds/11895
* https://buildkite.com/redpanda/vtools/builds/11901
* https://buildkite.com/redpanda/vtools/builds/11902
* https://buildkite.com/redpanda/vtools/builds/11905
* https://buildkite.com/redpanda/vtools/builds/11911
* https://buildkite.com/redpanda/vtools/builds/11910
* https://buildkite.com/redpanda/vtools/builds/11929
* https://buildkite.com/redpanda/vtools/builds/11937
* https://buildkite.com/redpanda/vtools/builds/11938
* https://buildkite.com/redpanda/vtools/builds/11944
* https://buildkite.com/redpanda/vtools/builds/11952
* https://buildkite.com/redpanda/vtools/builds/11951
* https://buildkite.com/redpanda/vtools/builds/11957
* https://buildkite.com/redpanda/vtools/builds/11962
* https://buildkite.com/redpanda/vtools/builds/11963
* https://buildkite.com/redpanda/vtools/builds/11967
* https://buildkite.com/redpanda/vtools/builds/11971
The line that is printed as part of this CI report is
rptest.services.utils.BadLogLines: <BadLogLines nodes=ip-172-31-12-189(1) example="ERROR 2024-02-12 12:19:23,003 [shard 2:fetc] kafka - fetch.cc:1171 - unknown exception thrown: cloud_storage::download_exception (NotFound)">
But that log line was removed by this commit, which in particular changed how exceptions from outside the fetch handler bubble up:
commit a09d160db0f74034cdf47ae5204afbe5a7218cad
Author: Brandon Allard <[email protected]>
Date: Thu Feb 8 22:41:52 2024 -0500
kafka: rethrow on unknown exceptions in fetch handler
Exceptions from outside the Kafka handler context can bubble up to the
catch present in the handler. This seems to be the way some subsystems
communicate issues with the requests. There is no current listing of
what exceptions subsystems may throw, if/how to recover from these
exceptions, or if a fetch should end as a result of the exception. Hence
for the time being the fetch impl will revert to the behavior of further
bubbling up unknown exceptions to outside of the handler context.
src/v/kafka/server/handlers/fetch.cc | 30 ++++++++++++++++++++----------
1 file changed, 20 insertions(+), 10 deletions(-)
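For readers less familiar with the pattern the commit message describes, here is a rough Python analogy of "recover from what you understand, rethrow everything else". It is only a sketch of the control flow, not the actual C++ in `fetch.cc`: `DownloadNotFound`, `FetchTimeout`, and `handle_fetch` are hypothetical names made up for illustration.

```python
class DownloadNotFound(Exception):
    """Hypothetical stand-in for cloud_storage::download_exception (NotFound)."""


class FetchTimeout(Exception):
    """Hypothetical stand-in for an error the handler knows how to recover from."""


def handle_fetch(read_partition):
    """Return records, recovering only from errors the handler understands."""
    try:
        return read_partition()
    except FetchTimeout:
        # Known and recoverable: answer with an empty result.
        return []
    except Exception:
        # Unknown origin: do not log at ERROR level or guess at recovery here;
        # let the exception bubble up to whoever called the handler.
        raise


def missing_segment():
    # Simulates a subsystem outside the handler signalling a problem.
    raise DownloadNotFound("NotFound")


def timed_out():
    raise FetchTimeout()
```

With this shape, `handle_fetch(timed_out)` returns an empty list, while `handle_fetch(missing_segment)` raises `DownloadNotFound` to the caller instead of emitting an ERROR log line inside the handler.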
Indeed, that commit was marked as resolving a nearly identical CI failure report.
So, this can be considered a duplicate of https://github.com/redpanda-data/redpanda/issues/16532 and was resolved by https://github.com/redpanda-data/redpanda/pull/16554.
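As background on why a single ERROR line fails the whole test: the traceback shows `raise_on_bad_logs` raising `BadLogLines` after the test body finishes. The snippet below is a minimal sketch of that kind of check, assuming it simply scans node logs for ERROR-level lines that are not allow-listed. It is not the rptest implementation; the function bodies, `find_bad_lines`, and the `allow_list` parameter are assumptions for illustration.

```python
import re

# Assumption: a "bad" line is one that starts with the ERROR level tag.
ERROR_RE = re.compile(r"^ERROR\b")


class BadLogLines(Exception):
    """Illustrative exception carrying the offending log lines."""

    def __init__(self, lines):
        super().__init__(f"{len(lines)} bad log line(s), e.g. {lines[0]!r}")
        self.lines = lines


def find_bad_lines(log_lines, allow_list=()):
    """Return ERROR lines that match none of the allow-listed regex patterns."""
    allowed = [re.compile(pattern) for pattern in allow_list]
    return [
        line
        for line in log_lines
        if ERROR_RE.match(line) and not any(p.search(line) for p in allowed)
    ]


def raise_on_bad_logs(log_lines, allow_list=()):
    """Fail the run if any non-allow-listed ERROR line is present."""
    bad = find_bad_lines(log_lines, allow_list)
    if bad:
        raise BadLogLines(bad)
```

Under this sketch, the `fetch.cc:1171` line quoted above would be flagged unless a pattern such as `r"download_exception \(NotFound\)"` were allow-listed, which is why removing the log statement in the commit also removes the CI failure.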