N2N, Kafka: Oura stops logging and pushing messages after a couple minutes
When running Oura with the Kafka sink (or the Terminal sink) and connecting via N2N (see the configuration below), we stop receiving both logs and Kafka messages after roughly 20-30 minutes. It could be that the node stops responding because it is somehow "exhausted," but I'm not sure that's what is happening. Until that point, Oura behaves fine.
The config we are using:
[source]
type = "N2N"
address = ["Tcp", "preview-node.world.dev.cardano.org:30002"]
magic = "preview"
[source.mapper]
include_block_end_events = false
include_transaction_details = true
include_transaction_end_events = true
include_block_cbor = false
include_byron_ebb = false
[sink]
type = "Kafka"
brokers = ["0.0.0.0:9092"]
topic = "cardano-events-preview"
Logs
If I run it like this:
RUST_LOG=info ./support/oura daemon --config ./support/oura-configuration.testnet.toml
I get the following logs:
... many similar messages above ...
[2023-09-08T11:46:10Z INFO oura::sources::n2n::run] rollback buffer state, size: 0, oldest: None, latest: None
[2023-09-08T11:47:23Z INFO oura::sources::n2n::run] rolling forward to point (27517643, 5a0d142d0f1411192c4297e5a5197940714ab33b3025c64676e08cc2db4d3f18)
[2023-09-08T11:47:23Z INFO oura::sources::n2n::run] rollback buffer state, size: 0, oldest: None, latest: None
[2023-09-08T11:48:14Z INFO oura::sources::n2n::run] rolling forward to point (27517694, 153fcf27c79530675fc153269ee6bc6e015123b86f8d2c2addcd71a6f1db735c)
[2023-09-08T11:48:14Z INFO oura::sources::n2n::run] rollback buffer state, size: 0, oldest: None, latest: None
[2023-09-08T11:48:18Z INFO oura::sources::n2n::run] rolling forward to point (27517698, 7108ddb4fc775c5929f7f7daa1272c5da44af6f1f58bfa668f76bd0d59bb85ef)
[2023-09-08T11:48:18Z INFO oura::sources::n2n::run] rollback buffer state, size: 0, oldest: None, latest: None
[2023-09-08T11:48:24Z INFO oura::sources::n2n::run] rolling forward to point (27517704, a533472feaf9e659390641899cbc7038fa88d0d4011d1761d74bce3bb2c89d26)
[2023-09-08T11:48:24Z INFO oura::sources::n2n::run] rollback buffer state, size: 0, oldest: None, latest: None
[2023-09-08T11:48:36Z INFO oura::sources::n2n::run] rolling forward to point (27517716, f978ffa72d5be3701597f7460ee253513aae2c30c14bd224ccafb8ad4c69163c)
[2023-09-08T11:48:36Z INFO oura::sources::n2n::run] rollback buffer state, size: 0, oldest: None, latest: None
[2023-09-08T11:48:45Z INFO oura::sources::n2n::run] rolling forward to point (27517725, 9cb7901fbb62ff974e6bb775ef3a32b9731831c8ce8eb3e571b12fad0a9efbef)
[2023-09-08T11:48:45Z INFO oura::sources::n2n::run] rollback buffer state, size: 0, oldest: None, latest: None
Nothing out of the ordinary here.
In this instance, the total count of messages produced to our Kafka topic was 1772, and it never increased after that. Restarting Oura gets it going again, but of course that isn't ideal.
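For reference, one simple way to re-check that the topic count really stalls is to consume the whole topic and count the messages; a sketch, assuming kcat (formerly kafkacat) is installed:

# read the topic from the beginning, exit at the end, count lines
kcat -b 0.0.0.0:9092 -t cardano-events-preview -C -e -q | wc -l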
We experience the same thing when running with the Terminal sink, so it doesn't appear to be Kafka-related.
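For that test the sink section was swapped to the plain Terminal sink (source and mapper sections unchanged), roughly:

[sink]
type = "Terminal"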
@emiflake thank you for reporting!
Could you please provide the version of Oura you're using? We'll try to reproduce.
I'm on version 1.8.1, which I got from the releases page. I can try updating, but I assume that if it's not a known issue, it probably hasn't been fixed.
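For what it's worth, the binary reports the same thing directly; assuming the usual clap-generated version flag:

./support/oura --version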
@emiflake could you run it in debug mode with the environment variables below and send me the logs? I couldn't reproduce the problem. Could you also check whether this occurs on mainnet?
export RUST_LOG=debug
export RUST_BACKTRACE=1
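That is, the same invocation as before, just with the extra environment set:

RUST_LOG=debug RUST_BACKTRACE=1 ./support/oura daemon --config ./support/oura-configuration.testnet.toml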
I don't seem to get anything out of the ordinary. It sort of just dies after these logs:
[2023-09-15T14:01:50Z DEBUG pallas_miniprotocols::chainsync::client] requesting next block
[2023-09-15T14:01:50Z DEBUG pallas_multiplexer::bearers] parsed inbound msg protocol=2 timestamp=1882363499 length=2
[2023-09-15T14:01:50Z DEBUG pallas_multiplexer::bearers] parsed inbound msg protocol=3 timestamp=1882363550 length=874
[2023-09-15T14:01:50Z DEBUG oura::sources::n2n::run] blockfetch succeeded: (28130510, 19aac054136f66e04b3dc755c709f9093765ab00c99cf56172c5d186ef58a122)
[2023-09-15T14:01:50Z DEBUG oura::sinks::kafka::run] pushed event to kafka: Event { context: EventContext { block_hash: Some("19aac054136f66e04b3dc755c709f9093765ab00c99cf56172c5d186ef58a122"), block_number: Some(1226309), slot: Some(28130510), timestamp: Some(1694786510), tx_idx: None, tx_hash: None, input_idx: None, output_idx: None, output_address: None, certificate_idx: None }, data: Block(BlockRecord { era: Babbage, epoch: Some(65), epoch_slot: Some(50510), body_size: 4, issuer_vkey: "6185c9f8503460d4941abc0a947226f449c68a6836570d8895f6982cbd7a2ba6", vrf_vkey: "0d567155045b9c3fc2d6e85907193236a1fc2379b6a7039a246119b2c84c34ef", tx_count: 0, slot: 28130510, hash: "19aac054136f66e04b3dc755c709f9093765ab00c99cf56172c5d186ef58a122", number: 1226309, previous_hash: "b3aa4b1e4f763f96fa5a91e0b15ac081722e86dc35c44baba773a99b8c502d0b", cbor_hex: None, transactions: None }), fingerprint: None }
[2023-09-15T14:01:55Z DEBUG pallas_multiplexer::bearers] parsed inbound msg protocol=2 timestamp=1887275704 length=911
[2023-09-15T14:01:55Z INFO oura::sources::n2n::run] rolling forward to point (28130515, 0d062f8610aa244c2ca6df05bcfca483acbdfa11e9b057eb95a58f605ed9e373)
[2023-09-15T14:01:55Z DEBUG oura::sources::n2n::run] found 1 points with required min depth
[2023-09-15T14:01:55Z DEBUG oura::sources::n2n::run] requesting block fetch for point (28130515, 0d062f8610aa244c2ca6df05bcfca483acbdfa11e9b057eb95a58f605ed9e373)
[2023-09-15T14:01:55Z INFO oura::sources::n2n::run] rollback buffer state, size: 0, oldest: None, latest: None
[2023-09-15T14:01:55Z DEBUG pallas_miniprotocols::chainsync::client] requesting next block
[2023-09-15T14:01:55Z DEBUG pallas_multiplexer::bearers] parsed inbound msg protocol=3 timestamp=1887319507 length=874
[2023-09-15T14:01:55Z DEBUG pallas_multiplexer::bearers] parsed inbound msg protocol=2 timestamp=1887319972 length=2
[2023-09-15T14:01:55Z DEBUG oura::sources::n2n::run] blockfetch succeeded: (28130515, 0d062f8610aa244c2ca6df05bcfca483acbdfa11e9b057eb95a58f605ed9e373)
[2023-09-15T14:01:55Z DEBUG oura::sinks::kafka::run] pushed event to kafka: Event { context: EventContext { block_hash: Some("0d062f8610aa244c2ca6df05bcfca483acbdfa11e9b057eb95a58f605ed9e373"), block_number: Some(1226310), slot: Some(28130515), timestamp: Some(1694786515), tx_idx: None, tx_hash: None, input_idx: None, output_idx: None, output_address: None, certificate_idx: None }, data: Block(BlockRecord { era: Babbage, epoch: Some(65), epoch_slot: Some(50515), body_size: 4, issuer_vkey: "7352fdc8c4cbc11b7624539509e4b3d50e71098d74ef37621c87ecff5dc41403", vrf_vkey: "8986663f6c6fa476f05b5447d8796afc29649a402c7b333b00859f38458f4b2b", tx_count: 0, slot: 28130515, hash: "0d062f8610aa244c2ca6df05bcfca483acbdfa11e9b057eb95a58f605ed9e373", number: 1226310, previous_hash: "19aac054136f66e04b3dc755c709f9093765ab00c99cf56172c5d186ef58a122", cbor_hex: None, transactions: None }), fingerprint: None }
The rest of the logs don't show anything special either, although they are very verbose. And yes, this does happen on mainnet as well.