Require serde_fields for all serde structs
In effect, this PR modifies serde to require using serde_fields for all structs EXCEPT those implementing both serde_read and serde_write. In these cases envelope_to_tuple won't generally participate in overload resolution
Also adds serde_fields() to several existing structs, including those generated automatically for fuzz-testing.
Why do this? This PR provides a motivating example. Where possible, we try to avoid compat-breaking serde changes by appending new fields to the end of the binary format. The implicit coupling between aggregate field ordering and binary format makes it easy to introduce subtle serialization bugs or unnecessary compat breaks.
Backports Required
- [x] none - not a bug fix
- [ ] none - this is a backport
- [ ] none - issue does not exist in previous branches
- [ ] none - papercut/not impactful enough to backport
- [ ] v24.2.x
- [ ] v24.1.x
- [ ] v23.3.x
Release Notes
- none
/ci-repeat 1 release
/ci-repeat 1
new failures in https://buildkite.com/redpanda/redpanda/builds/52635#01912f5f-6009-4b03-aec1-ed4996c0a79a:
"rptest.tests.recovery_mode_test.RecoveryModeTest.test_rolling_restart"
"rptest.tests.shadow_indexing_compacted_topic_test.ShadowIndexingCompactedTopicTest.test_upload.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.S3.1.path.test_case=.TS_Read==True.AdjacentSegmentMergerReupload==True"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_many_partitions.cloud_storage_type=CloudStorageType.ABS.check_mode=check_manifest_and_segment_metadata"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_many_partitions.cloud_storage_type=CloudStorageType.S3.check_mode=check_manifest_existence"
new failures in https://buildkite.com/redpanda/redpanda/builds/52635#01912f5f-6004-42f7-9508-1976d10b810b:
"rptest.tests.shadow_indexing_compacted_topic_test.ShadowIndexingCompactedTopicTest.test_upload.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_many_partitions.cloud_storage_type=CloudStorageType.ABS.check_mode=check_manifest_existence"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_many_partitions.cloud_storage_type=CloudStorageType.S3.check_mode=no_check"
"rptest.tests.adjacent_segment_merging_test.AdjacentSegmentMergingTest.test_reupload_of_local_segments.acks=-1.cloud_storage_type=CloudStorageType.ABS"
new failures in https://buildkite.com/redpanda/redpanda/builds/52635#01912f5f-6006-4028-9934-40514b596708:
"rptest.tests.cloud_storage_usage_test.CloudStorageUsageTest.test_cloud_storage_usage_reporting"
"rptest.tests.follower_fetching_test.FollowerFetchingTest.test_follower_fetching_with_maintenance_mode"
"rptest.tests.e2e_topic_recovery_test.EndToEndTopicRecovery.test_restore_with_config_batches.num_messages=2.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.ABS.2.virtual_host.test_case=.TS_Read==True.AdjacentSegmentMergerReupload==True"
"rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.S3.1.virtual_host.test_case=.TS_Read==True.AdjacentSegmentMergerReupload==True"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_many_partitions.cloud_storage_type=CloudStorageType.ABS.check_mode=no_check"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_prevent_recovery.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_time_based_retention.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.adjacent_segment_merging_test.AdjacentSegmentMergingTest.test_reupload_of_local_segments.acks=-1.cloud_storage_type=CloudStorageType.S3"
new failures in https://buildkite.com/redpanda/redpanda/builds/52635#01912f5f-6007-482b-a8bc-53eb021338c9:
"rptest.tests.e2e_shadow_indexing_test.EndToEndThrottlingTest.test_throttling.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.e2e_topic_recovery_test.EndToEndTopicRecovery.test_restore_with_config_batches.num_messages=2.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_many_partitions.cloud_storage_type=CloudStorageType.S3.check_mode=check_manifest_and_segment_metadata"
"rptest.tests.adjacent_segment_merging_test.AdjacentSegmentMergingTest.test_reupload_of_local_segments.acks=1.cloud_storage_type=CloudStorageType.ABS"
new failures in https://buildkite.com/redpanda/redpanda/builds/52635#01912f60-be5f-4e6c-a136-374e8e806250:
"rptest.tests.node_pool_migration_test.NodePoolMigrationTest.test_migrating_redpanda_nodes_to_new_pool.balancing_mode=off.test_mode=TestMode.FAST_MOVES.cleanup_policy=compact"
"rptest.tests.node_pool_migration_test.NodePoolMigrationTest.test_migrating_redpanda_nodes_to_new_pool.balancing_mode=node_add.test_mode=TestMode.NO_TIRED_STORAGE.cleanup_policy=compact"
"rptest.tests.node_pool_migration_test.NodePoolMigrationTest.test_migrating_redpanda_nodes_to_new_pool.balancing_mode=off.test_mode=TestMode.TIRED_STORAGE.cleanup_policy=compact"
"rptest.tests.read_replica_e2e_test.ReadReplicasUpgradeTest.test_upgrades.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.scaling_up_test.ScalingUpTest.test_moves_with_local_retention.use_topic_property=True"
"rptest.tests.simple_e2e_test.SimpleEndToEndTest.test_relaxed_acks.write_caching=True.disable_batch_cache=False"
"rptest.tests.cloud_storage_usage_test.CloudStorageUsageTest.test_cloud_storage_usage_reporting_with_partition_moves"
"rptest.tests.e2e_shadow_indexing_test.EndToEndShadowIndexingTestCompactedTopic.test_compacting_during_leadership_transfer.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.partition_movement_test.SIPartitionMovementTest.test_shadow_indexing.num_to_upgrade=2.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.partition_movement_test.SIPartitionMovementTest.test_cross_shard.num_to_upgrade=2.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.e2e_shadow_indexing_test.EndToEndThrottlingTest.test_throttling.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_many_partitions.cloud_storage_type=CloudStorageType.ABS.check_mode=no_check"
"rptest.tests.adjacent_segment_merging_test.AdjacentSegmentMergingTest.test_reupload_of_local_segments.acks=1.cloud_storage_type=CloudStorageType.ABS"
new failures in https://buildkite.com/redpanda/redpanda/builds/52635#01912f60-be5b-41f7-8d04-7012a56d0c28:
"rptest.tests.node_pool_migration_test.NodePoolMigrationTest.test_migrating_redpanda_nodes_to_new_pool.balancing_mode=node_add.test_mode=TestMode.FAST_MOVES.cleanup_policy=compact"
"rptest.tests.node_pool_migration_test.NodePoolMigrationTest.test_migrating_redpanda_nodes_to_new_pool.balancing_mode=node_add.test_mode=TestMode.TIRED_STORAGE.cleanup_policy=compact"
"rptest.tests.node_pool_migration_test.NodePoolMigrationTest.test_migrating_redpanda_nodes_to_new_pool.balancing_mode=off.test_mode=TestMode.NO_TIRED_STORAGE.cleanup_policy=compact"
"rptest.tests.partition_balancer_test.PartitionBalancerTest.test_nodes_with_reclaimable_space"
"rptest.tests.e2e_shadow_indexing_test.EndToEndShadowIndexingTestCompactedTopic.test_write.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.multi_restarts_with_archival_test.MultiRestartTest.test_recovery_after_multiple_restarts.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.partition_movement_test.SIPartitionMovementTest.test_cross_shard.num_to_upgrade=0.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.partition_movement_test.SIPartitionMovementTest.test_shadow_indexing.num_to_upgrade=0.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.adjacent_segment_merging_test.AdjacentSegmentMergingTest.test_reupload_of_local_segments.acks=-1.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.e2e_topic_recovery_test.EndToEndTopicRecovery.test_restore_with_config_batches.num_messages=2.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.shadow_indexing_compacted_topic_test.ShadowIndexingCompactedTopicTest.test_upload.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_many_partitions.cloud_storage_type=CloudStorageType.S3.check_mode=check_manifest_existence"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_many_partitions.cloud_storage_type=CloudStorageType.ABS.check_mode=check_manifest_and_segment_metadata"
new failures in https://buildkite.com/redpanda/redpanda/builds/52635#01912f60-be61-40f3-ba20-106bfde29737:
"rptest.tests.node_pool_migration_test.NodePoolMigrationTest.test_migrating_redpanda_nodes_to_new_pool.balancing_mode=off.test_mode=TestMode.FAST_MOVES.cleanup_policy=compact.delete"
"rptest.tests.node_pool_migration_test.NodePoolMigrationTest.test_migrating_redpanda_nodes_to_new_pool.balancing_mode=node_add.test_mode=TestMode.NO_TIRED_STORAGE.cleanup_policy=compact.delete"
"rptest.tests.node_pool_migration_test.NodePoolMigrationTest.test_migrating_redpanda_nodes_to_new_pool.balancing_mode=off.test_mode=TestMode.TIRED_STORAGE.cleanup_policy=compact.delete"
"rptest.tests.e2e_shadow_indexing_test.EndToEndShadowIndexingTestCompactedTopic.test_compacting_during_leadership_transfer.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.follower_fetching_test.FollowerFetchingTest.test_follower_fetching_with_maintenance_mode"
"rptest.tests.partition_movement_test.SIPartitionMovementTest.test_shadow_indexing.num_to_upgrade=2.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.partition_movement_test.SIPartitionMovementTest.test_cross_shard.num_to_upgrade=2.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.e2e_topic_recovery_test.EndToEndTopicRecovery.test_restore_with_config_batches.num_messages=2.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.shadow_indexing_compacted_topic_test.ShadowIndexingCompactedTopicTest.test_upload.cloud_storage_type=CloudStorageType.ABS"
"rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.S3.1.path.test_case=.TS_Read==True.AdjacentSegmentMergerReupload==True"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_many_partitions.cloud_storage_type=CloudStorageType.S3.check_mode=check_manifest_and_segment_metadata"
"rptest.tests.adjacent_segment_merging_test.AdjacentSegmentMergingTest.test_reupload_of_local_segments.acks=1.cloud_storage_type=CloudStorageType.S3"
new failures in https://buildkite.com/redpanda/redpanda/builds/52635#01912f60-be5d-4e61-ab1b-2ff1bf9730d0:
"rptest.tests.node_pool_migration_test.NodePoolMigrationTest.test_migrating_redpanda_nodes_to_new_pool.balancing_mode=node_add.test_mode=TestMode.TIRED_STORAGE.cleanup_policy=compact.delete"
"rptest.tests.node_pool_migration_test.NodePoolMigrationTest.test_migrating_redpanda_nodes_to_new_pool.balancing_mode=node_add.test_mode=TestMode.FAST_MOVES.cleanup_policy=compact.delete"
"rptest.tests.node_pool_migration_test.NodePoolMigrationTest.test_migrating_redpanda_nodes_to_new_pool.balancing_mode=off.test_mode=TestMode.NO_TIRED_STORAGE.cleanup_policy=compact.delete"
"rptest.tests.cloud_storage_usage_test.CloudStorageUsageTest.test_cloud_storage_usage_reporting"
"rptest.tests.e2e_shadow_indexing_test.EndToEndShadowIndexingTestCompactedTopic.test_write.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.scaling_up_test.ScalingUpTest.test_moves_with_local_retention.use_topic_property=False"
"rptest.tests.multi_restarts_with_archival_test.MultiRestartTest.test_recovery_after_multiple_restarts.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.partition_movement_test.SIPartitionMovementTest.test_shadow_indexing.num_to_upgrade=0.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.partition_movement_test.SIPartitionMovementTest.test_cross_shard.num_to_upgrade=0.cloud_storage_type=CloudStorageType.S3"
"rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.ABS.2.virtual_host.test_case=.TS_Read==True.AdjacentSegmentMergerReupload==True"
"rptest.tests.tiered_storage_model_test.TieredStorageTest.test_tiered_storage.cloud_storage_type_and_url_style=.CloudStorageType.S3.1.virtual_host.test_case=.TS_Read==True.AdjacentSegmentMergerReupload==True"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_many_partitions.cloud_storage_type=CloudStorageType.ABS.check_mode=check_manifest_existence"
"rptest.tests.topic_recovery_test.TopicRecoveryTest.test_many_partitions.cloud_storage_type=CloudStorageType.S3.check_mode=no_check"
"rptest.tests.adjacent_segment_merging_test.AdjacentSegmentMergingTest.test_reupload_of_local_segments.acks=-1.cloud_storage_type=CloudStorageType.S3"
new failures in https://buildkite.com/redpanda/redpanda/builds/52977#0191550a-eb87-4aad-8a64-26539e006c3b:
"rptest.tests.leadership_transfer_test.AutomaticLeadershipBalancingTest.test_automatic_rebalance"
new failures in https://buildkite.com/redpanda/redpanda/builds/53329#0191773e-a962-47fa-9b1e-dc702a9434ca:
"rptest.tests.quota_management_test.QuotaManagementTest.test_describe_default"
"rptest.tests.self_test_test.SelfTestTest.test_self_test_mixed_node_controller_lower_version"
new failures in https://buildkite.com/redpanda/redpanda/builds/53329#0191773e-a964-490d-a17c-8ae15bc6540d:
"rptest.tests.quota_management_test.QuotaManagementTest.test_kafka_configs"
"rptest.tests.self_test_test.SelfTestTest.test_self_test_node_crash"
new failures in https://buildkite.com/redpanda/redpanda/builds/53329#0191773e-a966-4924-9ae6-4786ca3455f2:
"rptest.tests.rpk_cluster_quota_test.RpkClusterQuotaTest.test_import_describe_quotas"
new failures in https://buildkite.com/redpanda/redpanda/builds/53329#0191773e-a968-4cf0-aa45-90e910853caf:
"rptest.tests.quota_management_test.QuotaManagementTest.test_describe_any"
"rptest.tests.quota_management_test.QuotaManagementTest.test_describe_name"
new failures in https://buildkite.com/redpanda/redpanda/builds/53329#01917740-42b8-4b8a-a8b2-f49bc0c88379:
"rptest.tests.rpk_cluster_quota_test.RpkClusterQuotaTest.test_import_describe_quotas"
new failures in https://buildkite.com/redpanda/redpanda/builds/53329#01917740-42b9-4778-9f6c-14e23bce22e6:
"rptest.tests.quota_management_test.QuotaManagementTest.test_describe_any"
"rptest.tests.quota_management_test.QuotaManagementTest.test_describe_name"
"rptest.tests.self_test_test.SelfTestTest.test_self_test_mixed_node_controller_lower_version"
new failures in https://buildkite.com/redpanda/redpanda/builds/53329#01917740-42b4-47ec-b55d-ccc8baf8d67d:
"rptest.tests.quota_management_test.QuotaManagementTest.test_describe_default"
"rptest.tests.self_test_test.SelfTestTest.test_self_test_node_crash"
new failures in https://buildkite.com/redpanda/redpanda/builds/53329#01917740-42b6-4ab8-b9c1-f9a75988966e:
"rptest.tests.quota_management_test.QuotaManagementTest.test_kafka_configs"
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/52635#01912f5f-6004-42f7-9508-1976d10b810b
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/52635#01912f5f-6007-482b-a8bc-53eb021338c9
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/52635#01912f60-be5b-41f7-8d04-7012a56d0c28
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/52780#01914326-9a3d-4dd4-91df-5f62e843d081
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/53329#0191773e-a966-4924-9ae6-4786ca3455f2
ducktape was retried in https://buildkite.com/redpanda/redpanda/builds/53329#01917740-42b6-4ab8-b9c1-f9a75988966e
/ci-repeat 1
/ci-repeat 1
/ci-repeat 1
CI Failure:
- https://redpandadata.atlassian.net/browse/CORE-4243 (known, unrelated)
I wonder if we should go further and require serde_fields except for custom implementations of serde_read and serde_write. so even for single field structs?
Ha, yeah, I've been avoiding flipping this to ready for exactly that reason. Leaning towards yes; better to require it up front vs. spewing template mess on the next append.
This is awesome, +1.
/ci-repeat 1
/ci-repeat 1 release
force push to rebase dev and fix a merge conflict
/ci-repeat 1
CI Failures:
-
debug build
-
quota_management_test.QuotaManagementTest.test_describe_default(~~passes locally, needs issue?~~) -
self_test_test.SelfTestTest.test_self_test_mixed_node_controller_lower_version(~~transient error?~~)
-
-
release build
-
quota_management_test.QuotaManagementTest.test_describe_default(~~passes locally, needs issue?~~) -
self_test_test.SelfTestTest.test_self_test_node_crash(~~passes locally, needs issue?~~)
-
~~None of these seems serde related, but I'm not finding open issues for any of these...~~
self_test_test.SelfTestTest.test_self_test_mixed_node_controller_lower_version
From the logs, is a serde error.
self_test_test.SelfTestTest.test_self_test_mixed_node_controller_lower_versionFrom the logs, is a
serdeerror.
oh? i haven't looked at the broker logs yet... passes consistently on my dev box so I assumed transient. will take a look
ah, I have the wrong branch checked out lol. disregard https://github.com/redpanda-data/redpanda/pull/22782#issuecomment-2303531940. Thanks @WillemKauf!
force push to fix an integration issue and a couple omissions.
@felixguendling sup?
The general idea LGTM. Make sure to check that all necessary fields are (de)serialized when doing a code review (before this check was not necessary with "automatic mode" aka aggregate to tuple).
CI Failure:
-
storage_e2e_rpfixturepasses locally. flake?
storage_e2e_rpfixture passes locally. flake?
Yeah, previously flakey, there was a revert PR that should fix this that got merged this AM. Re-running.
@oleiman do you think we should add a static assert if it doesn't already exist that serde_fields xor serde_read/write are used so we don't have to think about ambiguity if both are present?
@oleiman do you think we should add a static assert if it doesn't already exist that serde_fields xor serde_read/write are used so we don't have to think about ambiguity if both are present?
Seems like a good idea. I think in some cases we use serde_fields on the write path with explicit reader for compat reasons, but that stuff can be rejiggered as well.
Anyway, I'm going to throw this PR into draft mode and split it up. Thanks @dotnwat & @WillemKauf for feedback 🙂
superseded by https://github.com/redpanda-data/redpanda/pull/23126 et al.