readyset icon indicating copy to clipboard operation
readyset copied to clipboard

Replication adapter timeout when trying to connect to domain

Open readyset-railway opened this issue 2 years ago • 5 comments

Summary

When ReadySet is loading with a deployment that has been already snapshotted, an error message appears about the replication domains.

ERROR replicators: Error in replication, will retry after timeout error=Error during RPC (replication_offsets): Could not find domain 10.0.0 on worker timeout_sec=1

Description

This occurs when the internet connection is slow or when there is no internet connection.

Expected behavior

ReadySet should start without the error messages.

Actual behavior

The error message pops up for about 30 seconds until the domain name gets resolved.

Steps to reproduce

  1. Disconnect from the internet or use a limited connection.
  2. Connect to a new upstream database and start ReadySet and wait until the snapshotting is finished.
  3. Restart ReadySet and the error messages will pop up.

ReadySet version

Git : a56cc1ba52724ee7e4200ac3add9e06c17ca2730

Upstream DB type and version

MySQL 8.0.33

Instance Details

Logs

2023-06-13T16:27:32.472861Z  WARN replicators::noria_adapter: Restarting adapter after error encountered error=Error during RPC (replication_offsets): Could not find domain 14.0.0 on worker
2023-06-13T16:27:32.472874Z ERROR replicators: Error in replication, will retry after timeout error=Error during RPC (replication_offsets): Could not find domain 14.0.0 on worker timeout_sec=1
2023-06-13T16:27:32.472894Z DEBUG Connection{peer=Client}: h2::codec::framed_write: send frame=GoAway { error_code: NO_ERROR, last_stream_id: StreamId(0) }
2023-06-13T16:27:32.472966Z DEBUG Connection{peer=Client}: h2::proto::connection: Connection::poll; connection error error=GoAway(b"", NO_ERROR, Library)
2023-06-13T16:27:32.473025Z DEBUG Connection{peer=Server}: h2::codec::framed_read: received frame=GoAway { error_code: NO_ERROR, last_stream_id: StreamId(0) }
2023-06-13T16:27:33.473839Z DEBUG tower::buffer::worker: service.ready=true processing request
2023-06-13T16:27:33.473991Z DEBUG hyper::client::connect::http: connecting to 127.0.0.1:6033
2023-06-13T16:27:33.474566Z DEBUG hyper::client::connect::http: connected to 127.0.0.1:6033
2023-06-13T16:27:33.474642Z DEBUG h2::client: binding client connection
2023-06-13T16:27:33.474691Z DEBUG h2::client: client connection bound
2023-06-13T16:27:33.474706Z DEBUG h2::codec::framed_write: send frame=Settings { flags: (0x0), enable_push: 0, initial_window_size: 2097152, max_frame_size: 16384 }
2023-06-13T16:27:33.474799Z DEBUG hyper::client::pool: pooling idle connection for ("http", 127.0.0.1:6033)
2023-06-13T16:27:33.474799Z DEBUG hyper::proto::h1::conn: parse error (invalid HTTP version parsed) with 24 bytes
2023-06-13T16:27:33.474850Z DEBUG hyper::proto::h1::dispatch: read_head error: invalid HTTP version parsed (found HTTP2 preface)
2023-06-13T16:27:33.474873Z DEBUG h2::codec::framed_write: send frame=Settings { flags: (0x0), initial_window_size: 1048576, max_frame_size: 16384 }
2023-06-13T16:27:33.474886Z DEBUG Connection{peer=Client}: h2::codec::framed_write: send frame=WindowUpdate { stream_id: StreamId(0), size_increment: 5177345 }
2023-06-13T16:27:33.474918Z DEBUG Connection{peer=Client}: h2::codec::framed_write: send frame=Headers { stream_id: StreamId(1), flags: (0x5: END_HEADERS | END_STREAM) }
2023-06-13T16:27:33.474930Z DEBUG Connection{peer=Server}: h2::codec::framed_write: send frame=WindowUpdate { stream_id: StreamId(0), size_increment: 983041 }
2023-06-13T16:27:33.475003Z DEBUG Connection{peer=Client}: h2::codec::framed_read: received frame=Settings { flags: (0x0), initial_window_size: 1048576, max_frame_size: 16384 }
2023-06-13T16:27:33.475027Z DEBUG Connection{peer=Client}: h2::codec::framed_write: send frame=Settings { flags: (0x1: ACK) }
2023-06-13T16:27:33.475041Z DEBUG Connection{peer=Client}: h2::codec::framed_read: received frame=WindowUpdate { stream_id: StreamId(0), size_increment: 983041 }
2023-06-13T16:27:33.475100Z DEBUG Connection{peer=Server}: h2::codec::framed_read: received frame=Settings { flags: (0x0), enable_push: 0, initial_window_size: 2097152, max_frame_size: 16384 }
2023-06-13T16:27:33.475255Z DEBUG Connection{peer=Server}: h2::codec::framed_write: send frame=Settings { flags: (0x1: ACK) }
2023-06-13T16:27:33.475281Z DEBUG Connection{peer=Server}: h2::codec::framed_read: received frame=WindowUpdate { stream_id: StreamId(0), size_increment: 5177345 }
2023-06-13T16:27:33.475306Z DEBUG Connection{peer=Server}: h2::codec::framed_read: received frame=Headers { stream_id: StreamId(1), flags: (0x5: END_HEADERS | END_STREAM) }
2023-06-13T16:27:33.475327Z DEBUG Connection{peer=Server}: h2::codec::framed_read: received frame=Settings { flags: (0x1: ACK) }
2023-06-13T16:27:33.475337Z DEBUG Connection{peer=Server}: h2::proto::settings: received settings ACK; applying Settings { flags: (0x0), initial_window_size: 1048576, max_frame_size: 16384 }
2023-06-13T16:27:33.475498Z DEBUG readyset_server::controller::inner: received external HTTP request method=POST path=/replication_offsets
2023-06-13T16:27:33.475610Z DEBUG Connection{peer=Server}: h2::codec::framed_write: send frame=Headers { stream_id: StreamId(1), flags: (0x4: END_HEADERS) }
2023-06-13T16:27:33.475639Z DEBUG Connection{peer=Server}: h2::codec::framed_write: send frame=Data { stream_id: StreamId(1), flags: (0x1: END_STREAM) }
2023-06-13T16:27:33.475714Z DEBUG Connection{peer=Client}: h2::codec::framed_read: received frame=Settings { flags: (0x1: ACK) }
2023-06-13T16:27:33.475729Z DEBUG Connection{peer=Client}: h2::proto::settings: received settings ACK; applying Settings { flags: (0x0), enable_push: 0, initial_window_size: 2097152, max_frame_size: 16384 }
2023-06-13T16:27:33.475749Z DEBUG Connection{peer=Client}: h2::codec::framed_read: received frame=Headers { stream_id: StreamId(1), flags: (0x4: END_HEADERS) }
2023-06-13T16:27:33.475763Z DEBUG Connection{peer=Client}: h2::codec::framed_read: received frame=Data { stream_id: StreamId(1), flags: (0x1: END_STREAM) }
2023-06-13T16:27:33.476004Z  WARN replicators::noria_adapter: Restarting adapter after error encountered error=Error during RPC (replication_offsets): Could not find domain 47.0.0 on worker
2023-06-13T16:27:33.476036Z ERROR replicators: Error in replication, will retry after timeout error=Error during RPC (replication_offsets): Could not find domain 47.0.0 on worker timeout_sec=1
2023-06-13T16:27:33.476072Z DEBUG Connection{peer=Client}: h2::codec::framed_write: send frame=GoAway { error_code: NO_ERROR, last_stream_id: StreamId(0) }
2023-06-13T16:27:33.476088Z DEBUG Connection{peer=Client}: h2::proto::connection: Connection::poll; connection error error=GoAway(b"", NO_ERROR, Library)
2023-06-13T16:27:33.476201Z DEBUG Connection{peer=Server}: h2::codec::framed_read: received frame=GoAway { error_code: NO_ERROR, last_stream_id: StreamId(0) }
2023-06-13T16:27:34.476590Z DEBUG tower::buffer::worker: service.ready=true processing request
2023-06-13T16:27:34.476727Z DEBUG hyper::client::connect::http: connecting to 127.0.0.1:6033
2023-06-13T16:27:34.477337Z DEBUG hyper::client::connect::http: connected to 127.0.0.1:6033
2023-06-13T16:27:34.477391Z DEBUG h2::client: binding client connection
2023-06-13T16:27:34.477432Z DEBUG h2::client: client connection bound
2023-06-13T16:27:34.477453Z DEBUG h2::codec::framed_write: send frame=Settings { flags: (0x0), enable_push: 0, initial_window_size: 2097152, max_frame_size: 16384 }
2023-06-13T16:27:34.477549Z DEBUG hyper::proto::h1::conn: parse error (invalid HTTP version parsed) with 24 bytes
2023-06-13T16:27:34.477577Z DEBUG hyper::proto::h1::dispatch: read_head error: invalid HTTP version parsed (found HTTP2 preface)
2023-06-13T16:27:34.477576Z DEBUG hyper::client::pool: pooling idle connection for ("http", 127.0.0.1:6033)
2023-06-13T16:27:34.477631Z DEBUG Connection{peer=Client}: h2::codec::framed_write: send frame=WindowUpdate { stream_id: StreamId(0), size_increment: 5177345 }
2023-06-13T16:27:34.477612Z DEBUG h2::codec::framed_write: send frame=Settings { flags: (0x0), initial_window_size: 1048576, max_frame_size: 16384 }
2023-06-13T16:27:34.477745Z DEBUG Connection{peer=Server}: h2::codec::framed_write: send frame=WindowUpdate { stream_id: StreamId(0), size_increment: 983041 }
2023-06-13T16:27:34.477759Z DEBUG Connection{peer=Client}: h2::codec::framed_write: send frame=Headers { stream_id: StreamId(1), flags: (0x5: END_HEADERS | END_STREAM) }
2023-06-13T16:27:34.477850Z DEBUG Connection{peer=Server}: h2::codec::framed_read: received frame=Settings { flags: (0x0), enable_push: 0, initial_window_size: 2097152, max_frame_size: 16384 }
2023-06-13T16:27:34.477865Z DEBUG Connection{peer=Server}: h2::codec::framed_write: send frame=Settings { flags: (0x1: ACK) }
2023-06-13T16:27:34.477877Z DEBUG Connection{peer=Server}: h2::codec::framed_read: received frame=WindowUpdate { stream_id: StreamId(0), size_increment: 5177345 }
2023-06-13T16:27:34.477878Z DEBUG Connection{peer=Client}: h2::codec::framed_read: received frame=Settings { flags: (0x0), initial_window_size: 1048576, max_frame_size: 16384 }
2023-06-13T16:27:34.478017Z DEBUG Connection{peer=Client}: h2::codec::framed_write: send frame=Settings { flags: (0x1: ACK) }
2023-06-13T16:27:34.478042Z DEBUG Connection{peer=Client}: h2::codec::framed_read: received frame=WindowUpdate { stream_id: StreamId(0), size_increment: 983041 }
2023-06-13T16:27:34.477902Z DEBUG Connection{peer=Server}: h2::codec::framed_read: received frame=Headers { stream_id: StreamId(1), flags: (0x5: END_HEADERS | END_STREAM) }
2023-06-13T16:27:34.478268Z DEBUG Connection{peer=Server}: h2::codec::framed_read: received frame=Settings { flags: (0x1: ACK) }
2023-06-13T16:27:34.478294Z DEBUG Connection{peer=Server}: h2::proto::settings: received settings ACK; applying Settings { flags: (0x0), initial_window_size: 1048576, max_frame_size: 16384 }
2023-06-13T16:27:34.478305Z DEBUG Connection{peer=Client}: h2::codec::framed_read: received frame=Settings { flags: (0x1: ACK) }
2023-06-13T16:27:34.478322Z DEBUG Connection{peer=Client}: h2::proto::settings: received settings ACK; applying Settings { flags: (0x0), enable_push: 0, initial_window_size: 2097152, max_frame_size: 16384 }
2023-06-13T16:27:34.478310Z DEBUG readyset_server::controller::inner: received external HTTP request method=POST path=/replication_offsets
2023-06-13T16:27:34.478461Z DEBUG Connection{peer=Server}: h2::codec::framed_write: send frame=Headers { stream_id: StreamId(1), flags: (0x4: END_HEADERS) }
2023-06-13T16:27:34.478494Z DEBUG Connection{peer=Server}: h2::codec::framed_write: send frame=Data { stream_id: StreamId(1), flags: (0x1: END_STREAM) }
2023-06-13T16:27:34.478632Z DEBUG Connection{peer=Client}: h2::codec::framed_read: received frame=Headers { stream_id: StreamId(1), flags: (0x4: END_HEADERS) }
2023-06-13T16:27:34.478662Z DEBUG Connection{peer=Client}: h2::codec::framed_read: received frame=Data { stream_id: StreamId(1), flags: (0x1: END_STREAM) }
2023-06-13T16:27:34.478712Z  WARN replicators::noria_adapter: Restarting adapter after error encountered error=Error during RPC (replication_offsets): Could not find domain 25.0.0 on worker
2023-06-13T16:27:34.478735Z ERROR replicators: Error in replication, will retry after timeout error=Error during RPC (replication_offsets): Could not find domain 25.0.0 on worker timeout_sec=1
2023-06-13T16:27:34.478771Z DEBUG Connection{peer=Client}: h2::codec::framed_write: send frame=GoAway { error_code: NO_ERROR, last_stream_id: StreamId(0) }
2023-06-13T16:27:34.478785Z DEBUG Connection{peer=Client}: h2::proto::connection: Connection::poll; connection error error=GoAway(b"", NO_ERROR, Library)
2023-06-13T16:27:34.478914Z DEBUG Connection{peer=Server}: h2::codec::framed_read: received frame=GoAway { error_code: NO_ERROR, last_stream_id: StreamId(0) }
2023-06-13T16:27:35.479694Z DEBUG tower::buffer::worker: service.ready=true processing request
2023-06-13T16:27:35.479805Z DEBUG hyper::client::connect::http: connecting to 127.0.0.1:6033
2023-06-13T16:27:35.480397Z DEBUG hyper::client::connect::http: connected to 127.0.0.1:6033
2023-06-13T16:27:35.480459Z DEBUG h2::client: binding client connection
2023-06-13T16:27:35.480509Z DEBUG h2::client: client connection bound
2023-06-13T16:27:35.480533Z DEBUG h2::codec::framed_write: send frame=Settings { flags: (0x0), enable_push: 0, initial_window_size: 2097152, max_frame_size: 16384 }
2023-06-13T16:27:35.480605Z DEBUG hyper::proto::h1::conn: parse error (invalid HTTP version parsed) with 24 bytes
2023-06-13T16:27:35.480671Z DEBUG hyper::proto::h1::dispatch: read_head error: invalid HTTP version parsed (found HTTP2 preface)
2023-06-13T16:27:35.480687Z DEBUG Connection{peer=Client}: h2::codec::framed_write: send frame=WindowUpdate { stream_id: StreamId(0), size_increment: 5177345 }
2023-06-13T16:27:35.480705Z DEBUG h2::codec::framed_write: send frame=Settings { flags: (0x0), initial_window_size: 1048576, max_frame_size: 16384 }
2023-06-13T16:27:35.480618Z DEBUG hyper::client::pool: pooling idle connection for ("http", 127.0.0.1:6033)
2023-06-13T16:27:35.480780Z DEBUG Connection{peer=Server}: h2::codec::framed_write: send frame=WindowUpdate { stream_id: StreamId(0), size_increment: 983041 }
2023-06-13T16:27:35.480804Z DEBUG Connection{peer=Client}: h2::codec::framed_write: send frame=Headers { stream_id: StreamId(1), flags: (0x5: END_HEADERS | END_STREAM) }
2023-06-13T16:27:35.480836Z DEBUG Connection{peer=Server}: h2::codec::framed_read: received frame=Settings { flags: (0x0), enable_push: 0, initial_window_size: 2097152, max_frame_size: 16384 }
2023-06-13T16:27:35.480852Z DEBUG Connection{peer=Server}: h2::codec::framed_write: send frame=Settings { flags: (0x1: ACK) }
2023-06-13T16:27:35.480877Z DEBUG Connection{peer=Client}: h2::codec::framed_read: received frame=Settings { flags: (0x0), initial_window_size: 1048576, max_frame_size: 16384 }
2023-06-13T16:27:35.480882Z DEBUG Connection{peer=Server}: h2::codec::framed_read: received frame=WindowUpdate { stream_id: StreamId(0), size_increment: 5177345 }
2023-06-13T16:27:35.480900Z DEBUG Connection{peer=Client}: h2::codec::framed_write: send frame=Settings { flags: (0x1: ACK) }
2023-06-13T16:27:35.480935Z DEBUG Connection{peer=Client}: h2::codec::framed_read: received frame=WindowUpdate { stream_id: StreamId(0), size_increment: 983041 }
2023-06-13T16:27:35.480938Z DEBUG Connection{peer=Server}: h2::codec::framed_read: received frame=Headers { stream_id: StreamId(1), flags: (0x5: END_HEADERS | END_STREAM) }
2023-06-13T16:27:35.481152Z DEBUG readyset_server::controller::inner: received external HTTP request method=POST path=/replication_offsets
2023-06-13T16:27:35.481148Z DEBUG Connection{peer=Client}: h2::codec::framed_read: received frame=Settings { flags: (0x1: ACK) }
2023-06-13T16:27:35.481167Z DEBUG Connection{peer=Server}: h2::codec::framed_read: received frame=Settings { flags: (0x1: ACK) }
2023-06-13T16:27:35.481230Z DEBUG Connection{peer=Client}: h2::proto::settings: received settings ACK; applying Settings { flags: (0x0), enable_push: 0, initial_window_size: 2097152, max_frame_size: 16384 }
2023-06-13T16:27:35.481261Z DEBUG Connection{peer=Server}: h2::proto::settings: received settings ACK; applying Settings { flags: (0x0), initial_window_size: 1048576, max_frame_size: 16384 }
2023-06-13T16:27:35.481326Z DEBUG Connection{peer=Server}: h2::codec::framed_write: send frame=Headers { stream_id: StreamId(1), flags: (0x4: END_HEADERS) }
2023-06-13T16:27:35.481459Z DEBUG Connection{peer=Server}: h2::codec::framed_write: send frame=Data { stream_id: StreamId(1), flags: (0x1: END_STREAM) }
2023-06-13T16:27:35.481582Z DEBUG Connection{peer=Client}: h2::codec::framed_read: received frame=Headers { stream_id: StreamId(1), flags: (0x4: END_HEADERS) }
2023-06-13T16:27:35.481608Z DEBUG Connection{peer=Client}: h2::codec::framed_read: received frame=Data { stream_id: StreamId(1), flags: (0x1: END_STREAM) }

readyset-railway avatar Aug 30 '23 16:08 readyset-railway

The ticket description says "The error message pops up for about 30 seconds until the domain name gets resolved" but I think this indicates a misunderstanding, because the "domain" in question is not a DNS domain name. In ReadySet a "domain" is a single-threaded job that runs a small group of dataflow nodes.

It could simply be that sometimes there's a delay before all the domains are started up, and that replication correctly waits until the domains are all running before it begins replicating.

That said, I don't have deep knowledge around the startup code, so while my best guess is that this error is normal and doesn't indicate any deeper problems, I would like a quick second opinion. I think @grfn and @frannoriega have both spent time working in the startup code recently IIRC, so maybe one of them can confirm whether this is expected behavior.

readyset-railway avatar Aug 30 '23 16:08 readyset-railway

@nickelization - can you triage this issue? Is the error message innocuous or is there more to it?

readyset-railway avatar Aug 30 '23 16:08 readyset-railway

Pushing this out of the current cycle as we will not be able to get to this.

gvsg-rs avatar Sep 21 '23 03:09 gvsg-rs

Have an idea of how to suppress these logs that I had while doing other development, will take a crack at it.

lukoktonos avatar Nov 09 '23 20:11 lukoktonos