CI Failure (Internal object storage scrub detected fatal anomalies) in `ShadowIndexingWhileBusyTest.test_create_or_delete_topics_while_busy`
https://buildkite.com/redpanda/vtools/builds/10650 https://buildkite.com/redpanda/vtools/builds/10846
Module: rptest.tests.e2e_shadow_indexing_test
Class: ShadowIndexingWhileBusyTest
Method: test_create_or_delete_topics_while_busy
Arguments: {
"short_retention": true,
"cloud_storage_type": 1
}
test_id: ShadowIndexingWhileBusyTest.test_create_or_delete_topics_while_busy
status: FAIL
run time: 789.483 seconds
RuntimeError("Internal object storage scrub detected fatal anomalies: [{'ns': 'kafka', 'topic': 'topic-zmilrsrinp', 'partition': 36, 'revision_id': 51, 'missing_segments': ['22e19e71/kafka/topic-zmilrsrinp/36_51/5185-5226-20977506-1-v1.log.1', '36b4b4a9/kafka/topic-zmilrsrinp/36_51/5145-5184-20977360-1-v1.log.1', '2526f118/kafka/topic-zmilrsrinp/36_51/4945-4984-20977360-1-v1.log.1'], 'last_complete_scrub_at': 1700085196023}, {'ns': 'kafka', 'topic': 'topic-zmilrsrinp', 'partition': 12, 'revision_id': 51, 'missing_segments': ['f3d2be80/kafka/topic-zmilrsrinp/12_51/2896-2937-20977639-1-v1.log.1', '65c66587/kafka/topic-zmilrsrinp/12_51/3104-3147-20977918-1-v1.log.1', 'db7677ea/kafka/topic-zmilrsrinp/12_51/3778-3819-20977639-1-v1.log.1', 'ee6d0524/kafka/topic-zmilrsrinp/12_51/4617-4656-20977360-1-v1.log.1', '2ec0f06f/kafka/topic-zmilrsrinp/12_51/3694-3735-20977639-1-v1.log.1', 'eae7d342/kafka/topic-zmilrsrinp/12_51/1131-1174-20977918-1-v1.log.1', 'f5d03a8f/kafka/topic-zmilrsrinp/12_51/84-125-20977637-1-v1.log.1', 'a976720f/kafka/topic-zmilrsrinp/12_51/1469-1510-20977639-1-v1.log.1', 'cc3e11ca/kafka/topic-zmilrsrinp/12_51/1677-1718-20977639-1-v1.log.1', '76567a90/kafka/topic-zmilrsrinp/12_51/1931-1972-20977639-1-v1.log.1', '224cbf80/kafka/topic-zmilrsrinp/12_51/3610-3651-20977639-1-v1.log.1', 'b71440c6/kafka/topic-zmilrsrinp/12_51/168-209-20977638-1-v1.log.1', 'be8707b2/kafka/topic-zmilrsrinp/12_51/5189-5228-20977360-1-v1.log.1', 'de7fd964/kafka/topic-zmilrsrinp/12_51/4657-4700-20977918-1-v1.log.1', 'd58bb90a/kafka/topic-zmilrsrinp/12_51/2141-2182-20977639-1-v1.log.1', '1871d1ac/kafka/topic-zmilrsrinp/12_51/2225-2266-20977639-1-v1.log.1', 'a6252834/kafka/topic-zmilrsrinp/12_51/2477-2518-20977639-1-v1.log.1', 'b7a7001d/kafka/topic-zmilrsrinp/12_51/755-796-20977638-1-v1.log.1', '10a2a9c3/kafka/topic-zmilrsrinp/12_51/4869-4908-20977360-1-v1.log.1', '90725443/kafka/topic-zmilrsrinp/12_51/1385-1426-20977639-1-v1.log.1', 
'8090ce86/kafka/topic-zmilrsrinp/12_51/4155-4196-20977639-1-v1.log.1', 'bbc81a72/kafka/topic-zmilrsrinp/12_51/921-964-20977916-1-v1.log.1', 'bdc82106/kafka/topic-zmilrsrinp/12_51/2938-2979-20977639-1-v1.log.1', '42d4d859/kafka/topic-zmilrsrinp/12_51/5149-5188-20977360-1-v1.log.1', '9809514d/kafka/topic-zmilrsrinp/12_51/3400-3441-20977639-1-v1.log.1', '2348add4/kafka/topic-zmilrsrinp/12_51/5069-5108-20977360-1-v1.log.1', '435ff1fb/kafka/topic-zmilrsrinp/12_51/4239-4280-20977639-1-v1.log.1', '1d16b05d/kafka/topic-zmilrsrinp/12_51/2686-2727-20977639-1-v1.log.1', 'ce7b3ccc/kafka/topic-zmilrsrinp/12_51/4071-4112-20977639-1-v1.log.1', '06e502f3/kafka/topic-zmilrsrinp/12_51/419-460-20977638-1-v1.log.1', '3904e663/kafka/topic-zmilrsrinp/12_51/1427-1468-20977639-1-v1.log.1', '90d342ae/kafka/topic-zmilrsrinp/12_51/4029-4070-20977639-1-v1.log.1', '960cc049/kafka/topic-zmilrsrinp/12_51/545-586-20977638-1-v1.log.1', 'bc059ce3/kafka/topic-zmilrsrinp/12_51/3232-3272-20977562-1-v1.log.1', '4f3cad8a/kafka/topic-zmilrsrinp/12_51/4909-4948-20977360-1-v1.log.1', '6c0f4d7a/kafka/topic-zmilrsrinp/12_51/965-1004-20977360-1-v1.log.1', '20ac88bf/kafka/topic-zmilrsrinp/12_51/1343-1384-20977639-1-v1.log.1', '41adbf3b/kafka/topic-zmilrsrinp/12_51/2183-2224-20977639-1-v1.log.1', '8e47f47e/kafka/topic-zmilrsrinp/12_51/4742-4784-20977716-1-v1.log.1', 'cee530ee/kafka/topic-zmilrsrinp/12_51/3148-3189-20977639-1-v1.log.1', 'bbf8b90e/kafka/topic-zmilrsrinp/12_51/332-376-20978056-1-v1.log.1', '61ee93ad/kafka/topic-zmilrsrinp/12_51/1889-1930-20977639-1-v1.log.1', 'fbc3c6e0/kafka/topic-zmilrsrinp/12_51/671-712-20977638-1-v1.log.1', '86127870/kafka/topic-zmilrsrinp/12_51/2770-2811-20977639-1-v1.log.1', 'c8daa35c/kafka/topic-zmilrsrinp/12_51/2267-2308-20977639-1-v1.log.1', 'f294929e/kafka/topic-zmilrsrinp/12_51/3358-3399-20977639-1-v1.log.1', '103d2365/kafka/topic-zmilrsrinp/12_51/1089-1130-20977639-1-v1.log.1', '854f1683/kafka/topic-zmilrsrinp/12_51/1763-1802-20977360-1-v1.log.1', 
'faffe22f/kafka/topic-zmilrsrinp/12_51/4197-4238-20977639-1-v1.log.1', '676b902a/kafka/topic-zmilrsrinp/12_51/2728-2769-20977639-1-v1.log.1', '87fa5103/kafka/topic-zmilrsrinp/12_51/3820-3859-20977360-1-v1.log.1', '07f8253d/kafka/topic-zmilrsrinp/12_51/879-920-20977638-1-v1.log.1', '599d62e5/kafka/topic-zmilrsrinp/12_51/4785-4826-20977639-1-v1.log.1', 'e3ef074b/kafka/topic-zmilrsrinp/12_51/2519-2560-20977639-1-v1.log.1', 'e2ab6da8/kafka/topic-zmilrsrinp/12_51/4989-5028-20977360-1-v1.log.1', '97abc74f/kafka/topic-zmilrsrinp/12_51/210-251-20977638-1-v1.log.1', '667482f9/kafka/topic-zmilrsrinp/12_51/501-544-20977916-1-v1.log.1', 'ae5f95a8/kafka/topic-zmilrsrinp/12_51/4575-4616-20977639-1-v1.log.1', 'ea8d3d36/kafka/topic-zmilrsrinp/12_51/126-167-20977637-1-v1.log.1', 'ad77531c/kafka/topic-zmilrsrinp/12_51/377-418-20977638-1-v1.log.1', '982be0f8/kafka/topic-zmilrsrinp/12_51/3652-3693-20977639-1-v1.log.1', '1d95ce80/kafka/topic-zmilrsrinp/12_51/292-331-20977360-1-v1.log.1', '83cd2c3a/kafka/topic-zmilrsrinp/12_51/4701-4741-20977562-1-v1.log.1', 'acf5d5b4/kafka/topic-zmilrsrinp/12_51/3736-3777-20977639-1-v1.log.1', '2d65cc47/kafka/topic-zmilrsrinp/12_51/252-291-20977360-1-v1.log.1', '0d23e49c/kafka/topic-zmilrsrinp/12_51/1049-1088-20977360-1-v1.log.1', 'd6b44ae8/kafka/topic-zmilrsrinp/12_51/1803-1846-20977918-1-v1.log.1', '1257b97e/kafka/topic-zmilrsrinp/12_51/2601-2643-20977780-1-v1.log.1', '2b03c1b8/kafka/topic-zmilrsrinp/12_51/2435-2476-20977639-1-v1.log.1', '38d22c9d/kafka/topic-zmilrsrinp/12_51/797-838-20977638-1-v1.log.1', '14d29aeb/kafka/topic-zmilrsrinp/12_51/2980-3019-20977360-1-v1.log.1', '887aa9e2/kafka/topic-zmilrsrinp/12_51/3568-3609-20977639-1-v1.log.1', '63ea855a/kafka/topic-zmilrsrinp/12_51/0-41-20977591-1-v1.log.1', '2b51ff73/kafka/topic-zmilrsrinp/12_51/1553-1594-20977639-1-v1.log.1', '893fbe03/kafka/topic-zmilrsrinp/12_51/587-628-20977638-1-v1.log.1', '575847a3/kafka/topic-zmilrsrinp/12_51/461-500-20977360-1-v1.log.1', 
'3d07166f/kafka/topic-zmilrsrinp/12_51/1719-1762-20977918-1-v1.log.1', 'd082fbca/kafka/topic-zmilrsrinp/12_51/3526-3567-20977639-1-v1.log.1', '29aa36a0/kafka/topic-zmilrsrinp/12_51/1847-1888-20977639-1-v1.log.1', '46a527cf/kafka/topic-zmilrsrinp/12_51/1175-1216-20977639-1-v1.log.1', 'c9e3f156/kafka/topic-zmilrsrinp/12_51/3316-3357-20977639-1-v1.log.1', '40840bfa/kafka/topic-zmilrsrinp/12_51/4281-4322-20977639-1-v1.log.1', '9dda12a4/kafka/topic-zmilrsrinp/12_51/1511-1552-20977639-1-v1.log.1', '037912f0/kafka/topic-zmilrsrinp/12_51/3273-3315-20977716-1-v1.log.1', '32a85d88/kafka/topic-zmilrsrinp/12_51/4448-4490-20977655-1-v1.log.1', '953cac9f/kafka/topic-zmilrsrinp/12_51/3860-3901-20977639-1-v1.log.1', '04c1d40e/kafka/topic-zmilrsrinp/12_51/4827-4868-20977645-1-v1.log.1', 'bfed2bb0/kafka/topic-zmilrsrinp/12_51/2644-2685-20977639-1-v1.log.1', 'd88bdb45/kafka/topic-zmilrsrinp/12_51/3987-4028-20977639-1-v1.log.1', '24b6668b/kafka/topic-zmilrsrinp/12_51/4533-4574-20977639-1-v1.log.1', 'c03786bd/kafka/topic-zmilrsrinp/12_51/2812-2853-20977639-1-v1.log.1', '67566786/kafka/topic-zmilrsrinp/12_51/2057-2098-20977639-1-v1.log.1', '8345c737/kafka/topic-zmilrsrinp/12_51/2015-2056-20977639-1-v1.log.1', '45ffb451/kafka/topic-zmilrsrinp/12_51/42-83-20977636-1-v1.log.1', '6a9477c8/kafka/topic-zmilrsrinp/12_51/2351-2392-20977639-1-v1.log.1', 'f8b00efd/kafka/topic-zmilrsrinp/12_51/629-670-20977638-1-v1.log.1', 'db00cb10/kafka/topic-zmilrsrinp/12_51/3484-3525-20977639-1-v1.log.1', '8f7ccf6e/kafka/topic-zmilrsrinp/12_51/4113-4154-20977639-1-v1.log.1', '94350b98/kafka/topic-zmilrsrinp/12_51/2099-2140-20977639-1-v1.log.1', 'ad1e3da7/kafka/topic-zmilrsrinp/12_51/3064-3103-20977360-1-v1.log.1', '37175c4d/kafka/topic-zmilrsrinp/12_51/2854-2895-20977639-1-v1.log.1', 'eb4cf7de/kafka/topic-zmilrsrinp/12_51/3945-3986-20977639-1-v1.log.1', '09db2d4a/kafka/topic-zmilrsrinp/12_51/4491-4532-20977639-1-v1.log.1', '330a0846/kafka/topic-zmilrsrinp/12_51/2393-2434-20977639-1-v1.log.1', 
'7fa89a5d/kafka/topic-zmilrsrinp/12_51/5029-5068-20977360-1-v1.log.1', '6deec357/kafka/topic-zmilrsrinp/12_51/2561-2600-20977360-1-v1.log.1', 'c72193af/kafka/topic-zmilrsrinp/12_51/1005-1048-20977916-1-v1.log.1', '8614f857/kafka/topic-zmilrsrinp/12_51/2309-2350-20977639-1-v1.log.1', '26705615/kafka/topic-zmilrsrinp/12_51/3902-3944-20977841-1-v1.log.1', '1d657d6c/kafka/topic-zmilrsrinp/12_51/5229-5268-20977360-1-v1.log.1', '3a0a21de/kafka/topic-zmilrsrinp/12_51/1301-1342-20977639-1-v1.log.1', 'c3d9107a/kafka/topic-zmilrsrinp/12_51/4365-4404-20977360-1-v1.log.1', 'a5aa7ba0/kafka/topic-zmilrsrinp/12_51/3442-3483-20977639-1-v1.log.1', '50d95089/kafka/topic-zmilrsrinp/12_51/1973-2014-20977639-1-v1.log.1', '5987db7d/kafka/topic-zmilrsrinp/12_51/4323-4364-20977639-1-v1.log.1', '9fd7ca1c/kafka/topic-zmilrsrinp/12_51/4949-4988-20977360-1-v1.log.1', 'f82f2a9a/kafka/topic-zmilrsrinp/12_51/839-878-20977360-1-v1.log.1', 'df34d9ae/kafka/topic-zmilrsrinp/12_51/4405-4447-20977841-1-v1.log.1', '81cfe030/kafka/topic-zmilrsrinp/12_51/713-754-20977638-1-v1.log.1', '2babec87/kafka/topic-zmilrsrinp/12_51/1259-1300-20977639-1-v1.log.1', '8a47cf0f/kafka/topic-zmilrsrinp/12_51/1637-1676-20977360-1-v1.log.1', '55aa25ec/kafka/topic-zmilrsrinp/12_51/1217-1258-20977639-1-v1.log.1', 'f3451177/kafka/topic-zmilrsrinp/12_51/1595-1636-20977639-1-v1.log.1', 'fb7c057b/kafka/topic-zmilrsrinp/12_51/3020-3063-20977918-1-v1.log.1'], 'last_complete_scrub_at': 1700085220006}]")
Traceback (most recent call last):
File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 184, in _do_run
data = self.run_test()
File "/usr/local/lib/python3.10/dist-packages/ducktape/tests/runner_client.py", line 269, in run_test
return self.test_context.function(self.test)
File "/usr/local/lib/python3.10/dist-packages/ducktape/mark/_mark.py", line 481, in wrapper
return functools.partial(f, *args, **kwargs)(*w_args, **w_kwargs)
File "/home/ubuntu/redpanda/tests/rptest/services/cluster.py", line 159, in wrapped
self.redpanda.maybe_do_internal_scrub()
File "/home/ubuntu/redpanda/tests/rptest/services/redpanda.py", line 3894, in maybe_do_internal_scrub
raise RuntimeError(
RuntimeError: Internal object storage scrub detected fatal anomalies: [{'ns': 'kafka', 'topic': 'topic-zmilrsrinp', 'partition': 36, 'revision_id': 51, 'missing_segments': ['22e19e71/kafka/topic-zmilrsrinp/36_51/5185-5226-20977506-1-v1.log.1', '36b4b4a9/kafka/topic-zmilrsrinp/36_51/5145-5184-20977360-1-v1.log.1', '2526f118/kafka/topic-zmilrsrinp/36_51/4945-4984-20977360-1-v1.log.1'], 'last_complete_scrub_at': 1700085196023}, {'ns': 'kafka', 'topic': 'topic-zmilrsrinp', 'partition': 12, 'revision_id': 51, 'missing_segments': ['f3d2be80/kafka/topic-zmilrsrinp/12_51/2896-2937-20977639-1-v1.log.1', '65c66587/kafka/topic-zmilrsrinp/12_51/3104-3147-20977918-1-v1.log.1', 'db7677ea/kafka/topic-zmilrsrinp/12_51/3778-3819-20977639-1-v1.log.1', 'ee6d0524/kafka/topic-zmilrsrinp/12_51/4617-4656-20977360-1-v1.log.1', '2ec0f06f/kafka/topic-zmilrsrinp/12_51/3694-3735-20977639-1-v1.log.1', 'eae7d342/kafka/topic-zmilrsrinp/12_51/1131-1174-20977918-1-v1.log.1', 'f5d03a8f/kafka/topic-zmilrsrinp/12_51/84-125-20977637-1-v1.log.1', 'a976720f/kafka/topic-zmilrsrinp/12_51/1469-1510-20977639-1-v1.log.1', 'cc3e11ca/kafka/topic-zmilrsrinp/12_51/1677-1718-20977639-1-v1.log.1', '76567a90/kafka/topic-zmilrsrinp/12_51/1931-1972-20977639-1-v1.log.1', '224cbf80/kafka/topic-zmilrsrinp/12_51/3610-3651-20977639-1-v1.log.1', 'b71440c6/kafka/topic-zmilrsrinp/12_51/168-209-20977638-1-v1.log.1', 'be8707b2/kafka/topic-zmilrsrinp/12_51/5189-5228-20977360-1-v1.log.1', 'de7fd964/kafka/topic-zmilrsrinp/12_51/4657-4700-20977918-1-v1.log.1', 'd58bb90a/kafka/topic-zmilrsrinp/12_51/2141-2182-20977639-1-v1.log.1', '1871d1ac/kafka/topic-zmilrsrinp/12_51/2225-2266-20977639-1-v1.log.1', 'a6252834/kafka/topic-zmilrsrinp/12_51/2477-2518-20977639-1-v1.log.1', 'b7a7001d/kafka/topic-zmilrsrinp/12_51/755-796-20977638-1-v1.log.1', '10a2a9c3/kafka/topic-zmilrsrinp/12_51/4869-4908-20977360-1-v1.log.1', '90725443/kafka/topic-zmilrsrinp/12_51/1385-1426-20977639-1-v1.log.1', 
'8090ce86/kafka/topic-zmilrsrinp/12_51/4155-4196-20977639-1-v1.log.1', 'bbc81a72/kafka/topic-zmilrsrinp/12_51/921-964-20977916-1-v1.log.1', 'bdc82106/kafka/topic-zmilrsrinp/12_51/2938-2979-20977639-1-v1.log.1', '42d4d859/kafka/topic-zmilrsrinp/12_51/5149-5188-20977360-1-v1.log.1', '9809514d/kafka/topic-zmilrsrinp/12_51/3400-3441-20977639-1-v1.log.1', '2348add4/kafka/topic-zmilrsrinp/12_51/5069-5108-20977360-1-v1.log.1', '435ff1fb/kafka/topic-zmilrsrinp/12_51/4239-4280-20977639-1-v1.log.1', '1d16b05d/kafka/topic-zmilrsrinp/12_51/2686-2727-20977639-1-v1.log.1', 'ce7b3ccc/kafka/topic-zmilrsrinp/12_51/4071-4112-20977639-1-v1.log.1', '06e502f3/kafka/topic-zmilrsrinp/12_51/419-460-20977638-1-v1.log.1', '3904e663/kafka/topic-zmilrsrinp/12_51/1427-1468-20977639-1-v1.log.1', '90d342ae/kafka/topic-zmilrsrinp/12_51/4029-4070-20977639-1-v1.log.1', '960cc049/kafka/topic-zmilrsrinp/12_51/545-586-20977638-1-v1.log.1', 'bc059ce3/kafka/topic-zmilrsrinp/12_51/3232-3272-20977562-1-v1.log.1', '4f3cad8a/kafka/topic-zmilrsrinp/12_51/4909-4948-20977360-1-v1.log.1', '6c0f4d7a/kafka/topic-zmilrsrinp/12_51/965-1004-20977360-1-v1.log.1', '20ac88bf/kafka/topic-zmilrsrinp/12_51/1343-1384-20977639-1-v1.log.1', '41adbf3b/kafka/topic-zmilrsrinp/12_51/2183-2224-20977639-1-v1.log.1', '8e47f47e/kafka/topic-zmilrsrinp/12_51/4742-4784-20977716-1-v1.log.1', 'cee530ee/kafka/topic-zmilrsrinp/12_51/3148-3189-20977639-1-v1.log.1', 'bbf8b90e/kafka/topic-zmilrsrinp/12_51/332-376-20978056-1-v1.log.1', '61ee93ad/kafka/topic-zmilrsrinp/12_51/1889-1930-20977639-1-v1.log.1', 'fbc3c6e0/kafka/topic-zmilrsrinp/12_51/671-712-20977638-1-v1.log.1', '86127870/kafka/topic-zmilrsrinp/12_51/2770-2811-20977639-1-v1.log.1', 'c8daa35c/kafka/topic-zmilrsrinp/12_51/2267-2308-20977639-1-v1.log.1', 'f294929e/kafka/topic-zmilrsrinp/12_51/3358-3399-20977639-1-v1.log.1', '103d2365/kafka/topic-zmilrsrinp/12_51/1089-1130-20977639-1-v1.log.1', '854f1683/kafka/topic-zmilrsrinp/12_51/1763-1802-20977360-1-v1.log.1', 
'faffe22f/kafka/topic-zmilrsrinp/12_51/4197-4238-20977639-1-v1.log.1', '676b902a/kafka/topic-zmilrsrinp/12_51/2728-2769-20977639-1-v1.log.1', '87fa5103/kafka/topic-zmilrsrinp/12_51/3820-3859-20977360-1-v1.log.1', '07f8253d/kafka/topic-zmilrsrinp/12_51/879-920-20977638-1-v1.log.1', '599d62e5/kafka/topic-zmilrsrinp/12_51/4785-4826-20977639-1-v1.log.1', 'e3ef074b/kafka/topic-zmilrsrinp/12_51/2519-2560-20977639-1-v1.log.1', 'e2ab6da8/kafka/topic-zmilrsrinp/12_51/4989-5028-20977360-1-v1.log.1', '97abc74f/kafka/topic-zmilrsrinp/12_51/210-251-20977638-1-v1.log.1', '667482f9/kafka/topic-zmilrsrinp/12_51/501-544-20977916-1-v1.log.1', 'ae5f95a8/kafka/topic-zmilrsrinp/12_51/4575-4616-20977639-1-v1.log.1', 'ea8d3d36/kafka/topic-zmilrsrinp/12_51/126-167-20977637-1-v1.log.1', 'ad77531c/kafka/topic-zmilrsrinp/12_51/377-418-20977638-1-v1.log.1', '982be0f8/kafka/topic-zmilrsrinp/12_51/3652-3693-20977639-1-v1.log.1', '1d95ce80/kafka/topic-zmilrsrinp/12_51/292-331-20977360-1-v1.log.1', '83cd2c3a/kafka/topic-zmilrsrinp/12_51/4701-4741-20977562-1-v1.log.1', 'acf5d5b4/kafka/topic-zmilrsrinp/12_51/3736-3777-20977639-1-v1.log.1', '2d65cc47/kafka/topic-zmilrsrinp/12_51/252-291-20977360-1-v1.log.1', '0d23e49c/kafka/topic-zmilrsrinp/12_51/1049-1088-20977360-1-v1.log.1', 'd6b44ae8/kafka/topic-zmilrsrinp/12_51/1803-1846-20977918-1-v1.log.1', '1257b97e/kafka/topic-zmilrsrinp/12_51/2601-2643-20977780-1-v1.log.1', '2b03c1b8/kafka/topic-zmilrsrinp/12_51/2435-2476-20977639-1-v1.log.1', '38d22c9d/kafka/topic-zmilrsrinp/12_51/797-838-20977638-1-v1.log.1', '14d29aeb/kafka/topic-zmilrsrinp/12_51/2980-3019-20977360-1-v1.log.1', '887aa9e2/kafka/topic-zmilrsrinp/12_51/3568-3609-20977639-1-v1.log.1', '63ea855a/kafka/topic-zmilrsrinp/12_51/0-41-20977591-1-v1.log.1', '2b51ff73/kafka/topic-zmilrsrinp/12_51/1553-1594-20977639-1-v1.log.1', '893fbe03/kafka/topic-zmilrsrinp/12_51/587-628-20977638-1-v1.log.1', '575847a3/kafka/topic-zmilrsrinp/12_51/461-500-20977360-1-v1.log.1', 
'3d07166f/kafka/topic-zmilrsrinp/12_51/1719-1762-20977918-1-v1.log.1', 'd082fbca/kafka/topic-zmilrsrinp/12_51/3526-3567-20977639-1-v1.log.1', '29aa36a0/kafka/topic-zmilrsrinp/12_51/1847-1888-20977639-1-v1.log.1', '46a527cf/kafka/topic-zmilrsrinp/12_51/1175-1216-20977639-1-v1.log.1', 'c9e3f156/kafka/topic-zmilrsrinp/12_51/3316-3357-20977639-1-v1.log.1', '40840bfa/kafka/topic-zmilrsrinp/12_51/4281-4322-20977639-1-v1.log.1', '9dda12a4/kafka/topic-zmilrsrinp/12_51/1511-1552-20977639-1-v1.log.1', '037912f0/kafka/topic-zmilrsrinp/12_51/3273-3315-20977716-1-v1.log.1', '32a85d88/kafka/topic-zmilrsrinp/12_51/4448-4490-20977655-1-v1.log.1', '953cac9f/kafka/topic-zmilrsrinp/12_51/3860-3901-20977639-1-v1.log.1', '04c1d40e/kafka/topic-zmilrsrinp/12_51/4827-4868-20977645-1-v1.log.1', 'bfed2bb0/kafka/topic-zmilrsrinp/12_51/2644-2685-20977639-1-v1.log.1', 'd88bdb45/kafka/topic-zmilrsrinp/12_51/3987-4028-20977639-1-v1.log.1', '24b6668b/kafka/topic-zmilrsrinp/12_51/4533-4574-20977639-1-v1.log.1', 'c03786bd/kafka/topic-zmilrsrinp/12_51/2812-2853-20977639-1-v1.log.1', '67566786/kafka/topic-zmilrsrinp/12_51/2057-2098-20977639-1-v1.log.1', '8345c737/kafka/topic-zmilrsrinp/12_51/2015-2056-20977639-1-v1.log.1', '45ffb451/kafka/topic-zmilrsrinp/12_51/42-83-20977636-1-v1.log.1', '6a9477c8/kafka/topic-zmilrsrinp/12_51/2351-2392-20977639-1-v1.log.1', 'f8b00efd/kafka/topic-zmilrsrinp/12_51/629-670-20977638-1-v1.log.1', 'db00cb10/kafka/topic-zmilrsrinp/12_51/3484-3525-20977639-1-v1.log.1', '8f7ccf6e/kafka/topic-zmilrsrinp/12_51/4113-4154-20977639-1-v1.log.1', '94350b98/kafka/topic-zmilrsrinp/12_51/2099-2140-20977639-1-v1.log.1', 'ad1e3da7/kafka/topic-zmilrsrinp/12_51/3064-3103-20977360-1-v1.log.1', '37175c4d/kafka/topic-zmilrsrinp/12_51/2854-2895-20977639-1-v1.log.1', 'eb4cf7de/kafka/topic-zmilrsrinp/12_51/3945-3986-20977639-1-v1.log.1', '09db2d4a/kafka/topic-zmilrsrinp/12_51/4491-4532-20977639-1-v1.log.1', '330a0846/kafka/topic-zmilrsrinp/12_51/2393-2434-20977639-1-v1.log.1', 
'7fa89a5d/kafka/topic-zmilrsrinp/12_51/5029-5068-20977360-1-v1.log.1', '6deec357/kafka/topic-zmilrsrinp/12_51/2561-2600-20977360-1-v1.log.1', 'c72193af/kafka/topic-zmilrsrinp/12_51/1005-1048-20977916-1-v1.log.1', '8614f857/kafka/topic-zmilrsrinp/12_51/2309-2350-20977639-1-v1.log.1', '26705615/kafka/topic-zmilrsrinp/12_51/3902-3944-20977841-1-v1.log.1', '1d657d6c/kafka/topic-zmilrsrinp/12_51/5229-5268-20977360-1-v1.log.1', '3a0a21de/kafka/topic-zmilrsrinp/12_51/1301-1342-20977639-1-v1.log.1', 'c3d9107a/kafka/topic-zmilrsrinp/12_51/4365-4404-20977360-1-v1.log.1', 'a5aa7ba0/kafka/topic-zmilrsrinp/12_51/3442-3483-20977639-1-v1.log.1', '50d95089/kafka/topic-zmilrsrinp/12_51/1973-2014-20977639-1-v1.log.1', '5987db7d/kafka/topic-zmilrsrinp/12_51/4323-4364-20977639-1-v1.log.1', '9fd7ca1c/kafka/topic-zmilrsrinp/12_51/4949-4988-20977360-1-v1.log.1', 'f82f2a9a/kafka/topic-zmilrsrinp/12_51/839-878-20977360-1-v1.log.1', 'df34d9ae/kafka/topic-zmilrsrinp/12_51/4405-4447-20977841-1-v1.log.1', '81cfe030/kafka/topic-zmilrsrinp/12_51/713-754-20977638-1-v1.log.1', '2babec87/kafka/topic-zmilrsrinp/12_51/1259-1300-20977639-1-v1.log.1', '8a47cf0f/kafka/topic-zmilrsrinp/12_51/1637-1676-20977360-1-v1.log.1', '55aa25ec/kafka/topic-zmilrsrinp/12_51/1217-1258-20977639-1-v1.log.1', 'f3451177/kafka/topic-zmilrsrinp/12_51/1595-1636-20977639-1-v1.log.1', 'fb7c057b/kafka/topic-zmilrsrinp/12_51/3020-3063-20977918-1-v1.log.1'], 'last_complete_scrub_at': 1700085220006}]
JIRA Link: CORE-1603
This seems to be a problem with deleting segments. The segments may have been only partially deleted (deletion happens in batches). Since not all of them were deleted, the manifest was not updated. The anomaly arises because the manifest still points to segments that were already deleted:
Log of segments being deleted:
INFO 2023-11-15 21:53:05,409 [shard 3:au ] archival - [fiber111 kafka/topic-zmilrsrinp/12] - ntp_archiver_service.cc:2670 - Deleting segment from cloud storage: {"63ea855a/kafka/topic-zmilrsrinp/12_51/0-41-20977591-1-v1.log.1"}
INFO 2023-11-15 21:53:05,409 [shard 3:au ] archival - [fiber111 kafka/topic-zmilrsrinp/12] - ntp_archiver_service.cc:2670 - Deleting segment from cloud storage: {"45ffb451/kafka/topic-zmilrsrinp/12_51/42-83-20977636-1-v1.log.1"}
...
Soon after, the NTP archiver service logs that the delete failed:
INFO 2023-11-15 21:53:29,983 [shard 3:au ] archival - [fiber111 kafka/topic-zmilrsrinp/12] - ntp_archiver_service.cc:2718 - Failed to delete all selected segments from cloud storage. Will retry on the next housekeeping run.
Presumably many of the segments have actually been deleted by this point, but we do not update the manifest.
At some point after this, the scrub seems to have run and detected the missing segments:
WARN 2023-11-15 21:53:35,332 [shard 3:au ] cloud_storage - [fiber113~0~1|1|299970ms] - remote.cc:943 - HeadObject from {panda-bucket-789abc7a-8400-11ee-b7d0-35661a636e44}, {key_not_found}, segment at {"63ea855a/kafka/topic-zmilrsrinp/12_51/0-41-20977591-1-v1.log.1"} not available
...
INFO 2023-11-15 21:53:40,006 [shard 3:au ] archival - [fiber113 kafka/topic-zmilrsrinp/12] - scrubber.cc:139 - Scrub which started at {nullopt} finished at {nullopt} with status {full} and detected {missing_partition_manifest: false, missing_spillover_manifests: 0, missing_segments: 124, segment_metadata_anomalies: 0} and used 127 quota
The reason outlined in the comment above seems unlikely, because GC only works on segments that are already in the replaced list or below the manifest start offset, so the scrubber should not end up looking for those segments in the bucket.
One possibility is that GC collects and removes all segments below the manifest start offset, whereas the scrub starts at the beginning of the manifest's segment collection. Perhaps the scrub is starting earlier than it should and needs to start at the manifest start offset.
DEBUG 2023-11-15 21:53:05,211 [shard 3:main] cluster - ntp: {kafka/topic-zmilrsrinp/12} - archival_metadata_stm.cc:1313 - Updating start offset, current value 4827, update 5269
DEBUG 2023-11-15 21:53:05,211 [shard 3:main] cluster - ntp: {kafka/topic-zmilrsrinp/12} - archival_metadata_stm.cc:1320 - Start offset updated to 5269
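A quick way to check this hypothesis against the failure output is to compare the offset range encoded in each missing segment's path with the new start offset (5269 for partition 12). The sketch below is only illustrative: it assumes the object key layout is `<hash>/<ns>/<topic>/<partition>_<revision>/<base>-<last>-<size>-<term>-v1.log.<archiver_term>`, which is inferred from the names in the error above rather than taken from the Redpanda source, and the helper names are hypothetical.

```python
import re

# Assumed key layout (inferred from the paths in the error message):
# <hash>/<ns>/<topic>/<partition>_<revision>/<base>-<last>-<size>-<term>-v1.log.<archiver_term>
SEGMENT_RE = re.compile(
    r"^[0-9a-f]+/[^/]+/[^/]+/\d+_\d+/"
    r"(?P<base>\d+)-(?P<last>\d+)-\d+-\d+-v1\.log\.\d+$"
)


def offset_range(path):
    """Return (base_offset, last_offset) parsed from a segment object key."""
    m = SEGMENT_RE.match(path)
    if m is None:
        raise ValueError(f"unrecognised segment path: {path}")
    return int(m["base"]), int(m["last"])


def split_by_start_offset(paths, start_offset):
    """Split paths into those entirely below start_offset and the rest."""
    below, at_or_above = [], []
    for p in paths:
        _, last = offset_range(p)
        (below if last < start_offset else at_or_above).append(p)
    return below, at_or_above


# Lowest and highest partition 12 segments from the error message above.
sample = [
    "63ea855a/kafka/topic-zmilrsrinp/12_51/0-41-20977591-1-v1.log.1",
    "1d657d6c/kafka/topic-zmilrsrinp/12_51/5229-5268-20977360-1-v1.log.1",
]
below, rest = split_by_start_offset(sample, start_offset=5269)
print(len(below), len(rest))
```

If every reported segment lands in `below`, the anomalies are all for data that retention had already moved past, which would be consistent with the scrub starting too early.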
*https://buildkite.com/redpanda/vtools/builds/11071
*https://buildkite.com/redpanda/vtools/builds/11190
This seems correct, and the latest failures show that the missing segments are before the start offset but still in the segment_set of the manifest. So it is technically not problematic, but there is some form of race condition between cleaning up the data and updating the manifest. Marking sev/medium, also because of the frequency.
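The ordering problem described here can be reproduced with a toy model. This is a sketch under stated assumptions, not Redpanda's implementation: `ToyManifest`, `scrub`, and the object keys are all hypothetical. The model advances the start offset, deletes the objects below it, and runs a scrub before the stale entries are dropped from the segment set.

```python
from dataclasses import dataclass, field


@dataclass
class ToyManifest:
    """Hypothetical stand-in for a partition manifest."""
    start_offset: int
    # base_offset -> (last_offset, object key)
    segments: dict = field(default_factory=dict)


def scrub(manifest, bucket, from_start_offset):
    """Return keys listed in the manifest but absent from the bucket."""
    missing = []
    for _base, (last, key) in sorted(manifest.segments.items()):
        # Optionally skip segments that lie entirely below the start offset.
        if from_start_offset and last < manifest.start_offset:
            continue
        if key not in bucket:
            missing.append(key)
    return missing


manifest = ToyManifest(
    start_offset=0,
    segments={0: (41, "seg-0-41"), 42: (83, "seg-42-83"), 84: (125, "seg-84-125")},
)
bucket = {"seg-0-41", "seg-42-83", "seg-84-125"}

# Step 1: retention advances the start offset.
manifest.start_offset = 84
# Step 2: GC deletes the objects that are now below the start offset.
for _base, (last, key) in manifest.segments.items():
    if last < manifest.start_offset:
        bucket.discard(key)
# Step 3 (removing those entries from the segment set) has not happened yet.

naive = scrub(manifest, bucket, from_start_offset=False)
bounded = scrub(manifest, bucket, from_start_offset=True)
print(naive, bounded)
```

In this model the scrub that walks the whole segment set reports the two deleted segments as missing, while the scrub bounded by the start offset reports nothing.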
*https://buildkite.com/redpanda/vtools/builds/11393
*https://buildkite.com/redpanda/vtools/builds/11721
*https://buildkite.com/redpanda/vtools/builds/11929
*https://buildkite.com/redpanda/vtools/builds/12030
*https://buildkite.com/redpanda/vtools/builds/12172#018e1a81-7a47-474d-8711-c70337e96497
*https://buildkite.com/redpanda/vtools/builds/12472
@abhijat should we leave this open or are you going to rework this anyway and we should disable/redo this test?
*https://buildkite.com/redpanda/vtools/builds/13369
*https://buildkite.com/redpanda/vtools/builds/13519
*https://buildkite.com/redpanda/vtools/builds/13970
*https://buildkite.com/redpanda/vtools/builds/14129
Closing older-bot-filed CI issues as we transition to a more reliable system.