redpanda icon indicating copy to clipboard operation
redpanda copied to clipboard

Tolerate partition deallocation invariant failures from delete_topic_cmd

Open dlex opened this issue 2 years ago • 1 comments

A controller log may possibly end up with the commands sequence that would deallocate partitions from a node when the node does not have partitions in the domain. This change relaxes vassert to log warning in deallocation scenarios that directly result from delete_topic_cmd and prevents decreasing pratition counts below zero.

May be a workarond for #7343.

Backports Required

  • [ ] none - not a bug fix
  • [ ] none - issue does not exist in previous branches
  • [ ] none - papercut/not impactful enough to backport
  • [x] v22.3.x
  • [x] v22.2.x
  • [ ] v22.1.x

UX Changes

  • none

Release Notes

  • Automatic recovery from certain controller states causing deletion of non-exiting partitions while replaying

dlex avatar Nov 18 '22 22:11 dlex

@dlex - do you have a theory how the log could end up in such a state? Could it be that that the partition in-memory state gets inconsistent with the log causing multiple deletion commands to be accepted (because it checks the in-memory state to see if an action makes sense).

I'm a bit worried about relaxing an assert that guards against a clearly invalid state, though I do agree in this case it might be harmless.

I think it would be nice to get a second set of eyes on this, perhaps @jcsp , @dotnwat or @mmaslankaprv could oblige.

travisdowns avatar Nov 18 '22 23:11 travisdowns

I would suggest to first find an underlying issue that caused the allocated partition count to go negative. The whole system should be strongly consistent and other checks should prevent allocated partition count to go negative. I would suggest not to merge this workaround.

mmaslankaprv avatar Nov 21 '22 08:11 mmaslankaprv

Hmm. So we know we have a bug, but we haven't found the source.

From the way #7343 manifested immediately on upgrade (and not on partition creation/deletion), it sounds like there was some content written by an earlier version of redpanda that violated the expectations of our latest code. The question is whether the writer was wrong (in which case we need this tolerant apply() logic), or whether we're doing something wrong while applying updates (in which case we can fix it there without making apply() tolerant).

jcsp avatar Nov 21 '22 08:11 jcsp

After taking a fresh look at the problem, a highly likely cause for the problem has been found: #7406. Closing this one as not needed, and also per @mmaslankaprv suggestion.

or whether we're doing something wrong while applying updates (in which case we can fix it there without making apply() tolerant)

Exactly this.

dlex avatar Nov 21 '22 18:11 dlex