
allocating sequence numbers should be transactional with writing blocks

Open snarfed opened this issue 1 year ago • 12 comments

From https://github.com/snarfed/bridgy-fed/issues/1130:

I suspect this is because we're not allocating sequence numbers and writing blocks in the same transaction. If we allocate a seq for the new repo's initial commit, then another commit allocates the next seq and writes its commit, and our firehose reads it before we write ours, our firehose will skip our commit entirely. And I'm more likely to trigger that when I make this kind of manual change locally, outside GCP, because of the network distance.

Ugh.

snarfed avatar Oct 22 '24 23:10 snarfed
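
For context, a minimal sketch of what "in the same transaction" could look like with google.cloud.ndb. The `Seq` and `Block` models and the `commit_with_seq` helper are made-up names for illustration, not arroba's actual classes:

```python
from google.cloud import ndb


class Seq(ndb.Model):
    """Monotonic counter; a single entity for the whole firehose (hypothetical)."""
    next = ndb.IntegerProperty(default=1)


class Block(ndb.Model):
    """A commit block plus the seq it was assigned (hypothetical)."""
    seq = ndb.IntegerProperty()
    data = ndb.BlobProperty()


@ndb.transactional()
def commit_with_seq(data):
    """Allocates the next seq and writes the block in one datastore transaction.

    If the transaction retries or fails, the counter bump and the block roll
    back together, so the firehose never sees an allocated seq with no block.
    """
    counter = Seq.get_by_id('firehose') or Seq(id='firehose')
    seq = counter.next
    counter.next += 1
    ndb.put_multi([counter, Block(seq=seq, data=data)])
    return seq
```

(Calls like this need to run inside an `ndb.Client().context()`.)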

Another layer of defense here would be: if our firehose sees a missed sequence number, wait for a while, but not indefinitely, to see if it shows up before continuing on.

snarfed avatar Nov 02 '24 18:11 snarfed
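
A rough sketch of that defense, with assumed helpers (`load_events`, `emit`) and an assumed 60s window; not arroba's actual subscription code:

```python
import time

WINDOW = 60  # seconds to wait for a missing seq before skipping it (assumed)


def emit_in_order(load_events, emit, start_seq=1):
    """Emits events in seq order, waiting up to WINDOW for gaps to fill in."""
    next_seq = start_seq
    gap_deadline = None  # set when we first notice that next_seq is missing

    while True:
        events = {e.seq: e for e in load_events(since=next_seq)}

        # emit everything consecutive from next_seq
        while next_seq in events:
            emit(events[next_seq])
            next_seq += 1
            gap_deadline = None

        if events and min(events) > next_seq:
            # a later seq exists but next_seq doesn't (yet): wait, then give up
            gap_deadline = gap_deadline or time.time() + WINDOW
            if time.time() >= gap_deadline:
                next_seq = min(events)
                gap_deadline = None
                continue

        time.sleep(1)  # poll for new events
```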

Started on this in the allocate-seq-transactional branch, above, but it's not pretty, and not functional yet. I may try the query idea in https://github.com/snarfed/arroba/issues/34#issuecomment-2453065787 instead.

snarfed avatar Nov 03 '24 16:11 snarfed

Another option here: drop the storage API abstraction and fully tie this library to the GCP Datastore. Not ideal, but it would make this refactoring much easier.

snarfed avatar Nov 04 '24 01:11 snarfed

Yet another option: actually refactor the Repo and Storage APIs to let storage implementations finalize commits - notably, generate their CIDs and signatures - all at once, so they can put it all inside a single transaction.

Wouldn't be backward compatible, but we're not 1.0 yet, so that's still reasonable.

snarfed avatar Nov 04 '24 01:11 snarfed
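
The shape of that refactor might look something like the sketch below; `apply_writes` and the steps in its docstring are illustrative, not the current Repo/Storage API:

```python
class Storage:
    """Abstract storage; concrete implementations own the whole commit."""

    def apply_writes(self, repo, writes):
        """Finalizes and persists a commit for `writes` atomically.

        A datastore-backed implementation would, inside a single transaction:
          1. load the repo's current head
          2. build the new MST and commit object, sign it, and generate CIDs
          3. allocate the next sequence number
          4. write the blocks, the new head, and the seq
        """
        raise NotImplementedError()
```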

> Another layer of defense here would be: if our firehose sees a missed sequence number, wait for a while, but not indefinitely, to see if it shows up before continuing on.

I did this, commit above. Seems ok so far, and we're definitely hitting it, averaging once every few min or so. 🤞🤞🤞

snarfed avatar Nov 05 '24 05:11 snarfed

Leaving this open to track combining allocating seqs and writing blocks into the same tx, but that seems lower priority if the query fix makes sure we still emit every seq.

snarfed avatar Nov 05 '24 05:11 snarfed

sorry for suddenly appearing here, but does that workaround allow me to maybe have my account's metadata filled in?

if you recall, i had an issue where my avatar, banner, username weren't going through: https://bsky.app/profile/did:plc:iykvw5xeefafbsai4ygi6dk3

derspyy avatar Nov 20 '24 12:11 derspyy

I used the refresh button on https://fed.brid.gy/ap/@[email protected] earlier, but the profile still didn't update. There might be something else still stuck in this case.

Tamschi avatar Nov 20 '24 17:11 Tamschi

Sorry @derspyy! The Bluesky team is working on handling the huge growth they've seen over the last week or two. They're making some upgrades and changes to their relay, and it's currently behind on ingesting commits from federated PDSes like BF. Hopefully that will clear up soon. https://github.com/bluesky-social/atproto/discussions/3036

snarfed avatar Nov 21 '24 20:11 snarfed

Back to looking at this as part of trying to reduce datastore lookups in https://github.com/snarfed/bridgy-fed/issues/1149. (gets inside transactions always bypass cache and hit the datastore.) Current idea is:

  1. Merge Repo.format_commit and apply_commit
  2. Maybe switch from datastore tx to memcache lease for serializing commits per repo?

snarfed avatar Jan 28 '25 22:01 snarfed
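
A minimal sketch of option 2, using pymemcache's `add` as a short-lived per-repo lease; the key format, expiry, and polling interval are all assumptions:

```python
import time

from pymemcache.client.base import Client

memcache = Client(('localhost', 11211))


def with_commit_lease(did, fn, expire=10):
    """Runs fn() while holding a per-repo lease, so commits to one repo serialize."""
    key = f'commit-lease-{did}'

    # add() only succeeds if the key doesn't already exist, so it works as a lock
    while not memcache.add(key, b'1', expire=expire, noreply=False):
        time.sleep(0.1)

    try:
        return fn()
    finally:
        memcache.delete(key)
```

The expiry guards against a crashed worker holding the lease forever, at the cost of a possible overlapping commit if one ever runs longer than the expiry.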

I also wonder if this might be the root cause of https://github.com/snarfed/bridgy-fed/issues/1595

snarfed avatar Jan 28 '25 22:01 snarfed

^ Narrator: it wasn't.

snarfed avatar Mar 21 '25 02:03 snarfed

Doing this should stop us from occasionally skipping sequence numbers and then needing to wait for them in the firehose server, which should largely get rid of the delays we see there, ie the green "Emitting" line in this chart:

[chart: ATProto firehose serving delay, with the green "Emitting" line]

snarfed avatar Jun 03 '25 13:06 snarfed

I tried switching our datastore concurrency mode from PESSIMISTIC (ie locking) to OPTIMISTIC (optimistic concurrency/MVCC) just now. We see a fair amount of contention on ATProto repo and sequence number writes, so I figured optimistic would make our contention and retries much worse, but instead it made them much better. Datastore write failures due to contention have basically disappeared, and our ATProto firehose serving delay is way down. I still don't have good intuition for why, but I won't argue.

[charts: datastore contention errors and ATProto firehose serving delay]

snarfed avatar Sep 23 '25 15:09 snarfed

Unfortunately adding sequence number allocation into the commit transaction brought back the contention, with a vengeance, as expected. Commits were failing at roughly .5qps, and that's after 10 retries. Pessimistic mode at ~12:53, optimistic at ~13:23. Ugh.

[chart: commit failures due to datastore contention]

snarfed avatar Sep 25 '25 20:09 snarfed
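
One likely reason, roughly sketched below with hypothetical models (not arroba's): with seq allocation inside the commit transaction, every commit across every repo updates the same counter entity, so commits contend with each other globally instead of only per repo.

```python
from google.cloud import ndb


class RepoHead(ndb.Model):   # hypothetical: one entity per repo
    head = ndb.StringProperty()

class Seq(ndb.Model):        # hypothetical: one counter for the whole firehose
    next = ndb.IntegerProperty(default=1)

class Block(ndb.Model):      # hypothetical: a commit block plus its seq
    seq = ndb.IntegerProperty()
    data = ndb.BlobProperty()


@ndb.transactional(retries=10)
def commit(did, new_head, blocks):
    head = RepoHead.get_by_id(did) or RepoHead(id=did)
    # ^ contended only with other commits to this repo
    counter = Seq.get_by_id('firehose') or Seq(id='firehose')
    # ^ contended with every commit to every repo: a write hotspot
    for block in blocks:
        block.seq = counter.next
        counter.next += 1
    head.head = new_head
    ndb.put_multi([head, counter] + blocks)
```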

I'm reconsidering the premise here. We emit 1-2qps of ATProto events right now, and growing. Almost all are commits, which are relatively heavy datastore transactions. Do we really need to serialize all datastore writes by sequence number? We currently wait up to 5m before skipping a sequence number, which handles out-of-order datastore writes. Maybe that's enough?

snarfed avatar Sep 26 '25 16:09 snarfed

I'm going to call it here. Allocating sequence numbers transactionally with commits causes too much contention. Commits are too heavy and long-running. I think the 5m buffer for reordering is enough.

snarfed avatar Sep 27 '25 14:09 snarfed