WARN: Failed to make upload candidate

Open travisdowns opened this issue 2 years ago • 6 comments

Version & Environment

Redpanda version: v23.1.1

What went wrong?

The following warning shows in the log:

WARN  2023-03-16 18:45:31,586 [shard  7] archival - [fiber47 kafka/topic1/216] - ntp_archiver_service.cc:1489 - Failed to make upload candidate

What should have happened instead?

No warnings, unless they stem from external causes that we can't avoid.

JIRA Link: CORE-1214

travisdowns avatar Mar 16 '23 21:03 travisdowns

Also noticed this today on the long running test cluster (it only got upgraded to 23.1 a day or so ago).

No sign of it having an impact (uploads appeared to proceed eventually). Evgeny suggested that it could happen when the retention code races with the upload code, but it has also been seen on topics that had infinite retention.

jcsp avatar Mar 16 '23 21:03 jcsp

Sounds like something we should target for next minor, then?

piyushredpanda avatar Mar 16 '23 21:03 piyushredpanda

This has started to come up in automated tests.

FAIL test: KgoVerifierWithSiTestLargeSegments.test_si_with_timeboxed.cloud_storage_type=CloudStorageType.S3 (1/2 runs) failure at 2023-05-04T17:16:24.037Z: <BadLogLines nodes=ip-172-31-9-183(2) example="ERROR 2023-05-04 10:40:27,356 [shard 1] archival - [fiber67 kafka/topic-rklshggkhy/83] - ntp_archiver_service.cc:2060 - Failed to make upload candidate with correct size, expected {source segment offsets: {term:2, base_offset:2, committed_offset:114, dirty_offset:114}, exposed_name: {2-2-v1.log}, starting_offset: 2, file_offset: 0, content_length: 109016389, final_offset: 114, final_file_offset: 109016389, term: 2, source names: {/var/lib/redpanda/data/kafka/topic-rklshggkhy/83_18/2-2-v1.log}}, actual {is_compacted: false, size_bytes: 25003650, base_offset: 90, committed_offset: 114, base_timestamp: {timestamp: 1683196755282}, max_timestamp: {timestamp: 1683196762533}, delta_offset: 6, ntp_revision: 18, archiver_term: 2, segment_term: 2, delta_offset_end: 6, sname_format: {v3}, metadata_size_hint: 0}"> on (amd64, VM) in job https://buildkite.com/redpanda/vtools/builds/7377#0187e5cb-c7ac-45a5-a783-89ebbe7df193

FAIL test: CloudRetentionTest.test_cloud_retention.max_consume_rate_mb=20.cloud_storage_type=CloudStorageType.S3 (1/31 runs) failure at 2023-05-05T04:45:27.129Z: <BadLogLines nodes=ip-172-31-4-15(1) example="ERROR 2023-05-05 00:13:10,861 [shard 1] archival - [fiber29 kafka/si_test_topic/3] - ntp_archiver_service.cc:2060 - Failed to make upload candidate with correct size, expected {source segment offsets: {term:2, base_offset:2325, committed_offset:2411, dirty_offset:2411}, exposed_name: {2325-2-v1.log}, starting_offset: 2325, file_offset: 0, content_length: 0, final_offset: -9223372036854775808, final_file_offset: 0, term: 2, source names: {/var/lib/redpanda/data/kafka/si_test_topic/3_18/2325-2-v1.log}}, actual {is_compacted: false, size_bytes: 9970066, base_offset: 2245, committed_offset: 2324, base_timestamp: {timestamp: 1683245577314}, max_timestamp: {timestamp: 1683245579310}, delta_offset: 65, ntp_revision: 18, archiver_term: 2, segment_term: 1, delta_offset_end: 66, sname_format: {v3}, metadata_size_hint: 0}"> on (arm64, VM) in job https://buildkite.com/redpanda/vtools/builds/7385#0187e85f-7419-4984-9d70-bec9324a05ac
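
To compare the sizes in these failures without reading the whole line, here is a minimal sketch that pulls the expected `content_length` and the actual `size_bytes` out of a log line. The helper name `size_mismatch` and the regex are assumptions made for illustration; they are not part of Redpanda or its test tooling, and they rely only on the field layout visible in the two examples above.

```python
import re

# Hypothetical helper, not part of Redpanda. It assumes the log line has the
# "expected {... content_length: N ...}, actual {... size_bytes: M ...}" layout
# seen in the failures quoted above.
SIZE_RE = re.compile(
    r"expected \{.*?content_length: (?P<expected>\d+).*?"
    r"actual \{.*?size_bytes: (?P<actual>\d+)",
    re.DOTALL,
)


def size_mismatch(line):
    """Return (expected, actual, expected - actual), or None if no match."""
    m = SIZE_RE.search(line)
    if m is None:
        return None
    expected = int(m.group("expected"))
    actual = int(m.group("actual"))
    return expected, actual, expected - actual
```

Applied to the first failure, this yields an expected size of 109016389 bytes against an actual size of 25003650 bytes. In the second failure the expected `content_length` is 0 while the actual `size_bytes` is 9970066.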

jcsp avatar May 05 '23 08:05 jcsp

Created https://github.com/redpanda-data/redpanda/issues/10583 so that it is easier for PR authors to find and refer to the issue.

ztlpn avatar May 05 '23 12:05 ztlpn

To me it looks like the issue we are seeing in the mentioned tests and the one in the issue title are two different problems. The one that started happening in the tests is related to the size mismatch, whereas the problem seen in the PoC is more general: it hits the case where there is no upload candidate at all.
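
When triaging logs, the two cases described above can be told apart by message text alone. The following is a rough sketch under that assumption; the function `classify` and its labels are hypothetical names chosen for illustration and match only the two message strings quoted in this issue.

```python
def classify(line):
    """Label a log line as one of the two failure kinds, or None.

    Hypothetical triage helper: it matches only the message strings that
    appear in this issue, and checks the longer one first because the
    shorter message is a prefix of it.
    """
    if "Failed to make upload candidate with correct size" in line:
        return "size_mismatch"
    if "Failed to make upload candidate" in line:
        return "no_candidate"
    return None
```

With this split, the WARN line from the original report is labeled `no_candidate`, and the ERROR lines from the automated tests are labeled `size_mismatch`.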

mmaslankaprv avatar Jun 06 '23 14:06 mmaslankaprv

This issue hasn't seen activity in 3 months. If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in two weeks.

github-actions[bot] avatar Aug 20 '24 06:08 github-actions[bot]

This issue hasn't seen activity in 3 months. If you want to keep it open, post a comment or remove the stale label – otherwise this will be closed in two weeks.

github-actions[bot] avatar Dec 20 '24 06:12 github-actions[bot]

This issue was closed due to lack of activity. Feel free to reopen if it's still relevant.

github-actions[bot] avatar Jan 06 '25 06:01 github-actions[bot]