Flakiness in test_sharding_split_smoke

Open jcsp opened this issue 1 year ago • 2 comments

Example: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-6805/7957035402/index.html#suites/140824de6e814b5b1ae2b622c3f67840/6cd46f9911ed5b0f

In that run, the compute hook (local version, using neon_local Endpoint) is hanging, causing migration to time out.

jcsp avatar Feb 19 '24 13:02 jcsp

Having fixed spurious reconciles, the logs are cleaner in this failure: https://neon-github-public-dev.s3.amazonaws.com/reports/pr-6828/7964818040/index.html#/testresult/dd6a70cde8671f8

In this example we see an error configuring the compute node, but we only give it 2 seconds to complete, because the earlier part of the migration (switching the origin to stale mode) took 8 seconds. So we may simply not be giving it enough time, and the apparent failure of compute configuration may just be the compute notification taking a while. That said, I'm still a bit concerned that compute configuration in a neon_local environment isn't sub-second: it shouldn't be slow.
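To illustrate the timing issue, here is a minimal sketch of how a single shared deadline leaves little time for a late step. It is not the storage controller's code: `Deadline`, `FakeClock`, and the 10-second total (inferred from 8 s + 2 s) are assumptions made for illustration only.

```python
import time


class Deadline:
    """Hypothetical helper: one time budget shared by every step of an operation."""

    def __init__(self, total_seconds, clock=time.monotonic):
        self._clock = clock
        self._expires_at = clock() + total_seconds

    def remaining(self):
        # Seconds left before the whole operation times out (never negative).
        return max(0.0, self._expires_at - self._clock())


class FakeClock:
    """Hypothetical manual clock so the example runs instantly."""

    def __init__(self):
        self.now = 0.0

    def __call__(self):
        return self.now

    def advance(self, seconds):
        self.now += seconds


clock = FakeClock()
migration = Deadline(10.0, clock=clock)  # assumed total budget for the migration

clock.advance(8.0)  # the "switch origin to stale mode" step takes 8 seconds
compute_notify_budget = migration.remaining()  # only 2 seconds are left
print(f"compute notification gets {compute_notify_budget:.1f}s")
```

With a shared budget like this, a slow early step shrinks the time available to the later compute notification, so a timeout there does not by itself show that compute configuration is broken.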

jcsp avatar Feb 20 '24 09:02 jcsp

Zero flakes of this in the last 24h -- I suspect that #6814 made it much rarer.

jcsp avatar Feb 21 '24 12:02 jcsp

Recently it started to occur again:

Screenshot_20240321_175543

edit: #7201

arpad-m avatar Mar 21 '24 16:03 arpad-m