mongodb_exporter

Add getReplicationInfo to available metrics

Open caiorcferreira opened this issue 3 years ago • 11 comments

The easiest way to track oplog size and oplog window is through db.getReplicationInfo().

These are critical metrics that the project is missing right now.

caiorcferreira, May 14 '21 18:05

Hello,

Would you like to create a Jira ticket with more detailed information? You can do that at our project's Jira and also, if you want to provide a fix and need help to start, just ping me and I'll be glad to help you.

Thanks. Regards

percona-csalguero, May 17 '21 18:05

Hi @percona-csalguero,

I was setting up the project locally in order to implement this feature, but ran into some problems with the tests.

Steps:

  1. Run make test-cluster
  2. Run make test

Output of the tests:

goroutine 356 [IO wait]:
internal/poll.runtime_pollWait(0xd42c900, 0x77, 0xc00001c180)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/runtime/netpoll.go:203 +0x55
internal/poll.(*pollDesc).wait(0xc00038c798, 0x77, 0x4d0f500, 0xc000286ba0, 0xc00038c780)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/internal/poll/fd_poll_runtime.go:87 +0x45
internal/poll.(*pollDesc).waitWrite(...)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/internal/poll/fd_poll_runtime.go:96
internal/poll.(*FD).WaitWrite(...)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/internal/poll/fd_unix.go:498
net.(*netFD).connect(0xc00038c780, 0x4d0f520, 0xc000286ba0, 0x0, 0x0, 0x4cff020, 0xc00009c3e0, 0x0, 0x0, 0x0, ...)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:152 +0x257
net.(*netFD).dial(0xc00038c780, 0x4d0f520, 0xc000286ba0, 0x4d14c20, 0x0, 0x4d14c20, 0xc0004c5440, 0x0, 0x1, 0xc00079a390)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/sock_posix.go:149 +0xff
net.socket(0x4d0f520, 0xc000286ba0, 0x4a2e7e7, 0x3, 0x2, 0x1, 0x0, 0x0, 0x4d14c20, 0x0, ...)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/sock_posix.go:70 +0x1c0
net.internetSocket(0x4d0f520, 0xc000286ba0, 0x4a2e7e7, 0x3, 0x4d14c20, 0x0, 0x4d14c20, 0xc0004c5440, 0x1, 0x0, ...)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/ipsock_posix.go:141 +0x141
net.(*sysDialer).doDialTCP(0xc00038c700, 0x4d0f520, 0xc000286ba0, 0x0, 0xc0004c5440, 0x49454a0, 0x5349530, 0x0)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/tcpsock_posix.go:65 +0xc2
net.(*sysDialer).dialTCP(0xc00038c700, 0x4d0f520, 0xc000286ba0, 0x0, 0xc0004c5440, 0x4069df0, 0xc00079a5a0, 0xb691af3c)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/tcpsock_posix.go:61 +0xd7
net.(*sysDialer).dialSingle(0xc00038c700, 0x4d0f520, 0xc000286ba0, 0x4d04fe0, 0xc0004c5440, 0x0, 0x0, 0x0, 0x0)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/dial.go:581 +0x60a
net.(*sysDialer).dialSerial(0xc00038c700, 0x4d0f520, 0xc000286ba0, 0xc00009bd30, 0x1, 0x1, 0x0, 0x0, 0x0, 0x0)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/dial.go:549 +0x14f
net.(*Dialer).DialContext(0xc000286b40, 0x4d0f4a0, 0xc000645b40, 0x4a2e7e7, 0x3, 0xc00003caf0, 0x10, 0x0, 0x0, 0x0, ...)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/dial.go:426 +0x6d8
go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*connection).connect(0xc00038b180, 0x4d0f4a0, 0xc000645b40)
	/Users/caioferreira/workspace/b2w/persistencia/mongodb_exporter/vendor/go.mongodb.org/mongo-driver/x/mongo/driver/topology/connection.go:136 +0x242
go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*Server).setupHeartbeatConnection(0xc0004648c0, 0x0, 0x0)
	/Users/caioferreira/workspace/b2w/persistencia/mongodb_exporter/vendor/go.mongodb.org/mongo-driver/x/mongo/driver/topology/server.go:590 +0x115
go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*Server).check(0xc0004648c0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, 0x0, ...)
	/Users/caioferreira/workspace/b2w/persistencia/mongodb_exporter/vendor/go.mongodb.org/mongo-driver/x/mongo/driver/topology/server.go:637 +0x92a
go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*Server).update(0xc0004648c0)
	/Users/caioferreira/workspace/b2w/persistencia/mongodb_exporter/vendor/go.mongodb.org/mongo-driver/x/mongo/driver/topology/server.go:493 +0x350
created by go.mongodb.org/mongo-driver/x/mongo/driver/topology.(*Server).Connect
	/Users/caioferreira/workspace/b2w/persistencia/mongodb_exporter/vendor/go.mongodb.org/mongo-driver/x/mongo/driver/topology/server.go:198 +0x201

goroutine 494 [select]:
net.(*netFD).connect.func2(0x4d0f520, 0xc000572480, 0xc0001a7f80, 0xc0005719e0, 0xc000571980)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:129 +0xba
created by net.(*netFD).connect
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:128 +0x22f

goroutine 493 [select]:
net.(*netFD).connect.func2(0x4d0f520, 0xc0005723c0, 0xc0001a7e80, 0xc000571740, 0xc0005716e0)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:129 +0xba
created by net.(*netFD).connect
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:128 +0x22f

goroutine 601 [select]:
net.(*netFD).connect.func2(0x4d0f520, 0xc000164f60, 0xc000334a00, 0xc000400ea0, 0xc000400e40)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:129 +0xba
created by net.(*netFD).connect
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:128 +0x22f

goroutine 510 [select]:
net.(*netFD).connect.func2(0x4d0f520, 0xc000286420, 0xc000453e80, 0xc0000ff380, 0xc0000ff320)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:129 +0xba
created by net.(*netFD).connect
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:128 +0x22f

goroutine 521 [select]:
net.(*netFD).connect.func2(0x4d0f520, 0xc000164a20, 0xc000334300, 0xc000181ce0, 0xc000181c80)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:129 +0xba
created by net.(*netFD).connect
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:128 +0x22f

goroutine 531 [select]:
net.(*netFD).connect.func2(0x4d0f520, 0xc000572840, 0xc00022e480, 0xc00023c660, 0xc00023c600)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:129 +0xba
created by net.(*netFD).connect
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:128 +0x22f

goroutine 505 [select]:
net.(*netFD).connect.func2(0x4d0f520, 0xc000286000, 0xc000453080, 0xc0000fe720, 0xc0000fe6c0)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:129 +0xba
created by net.(*netFD).connect
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:128 +0x22f

goroutine 530 [select]:
net.(*netFD).connect.func2(0x4d0f520, 0xc000572780, 0xc00022e380, 0xc00023c3c0, 0xc00023c360)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:129 +0xba
created by net.(*netFD).connect
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:128 +0x22f

goroutine 478 [select]:
net.(*netFD).connect.func2(0x4d0f520, 0xc0005e2ae0, 0xc0005f4100, 0xc0005eeb40, 0xc0005eeae0)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:129 +0xba
created by net.(*netFD).connect
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:128 +0x22f

goroutine 516 [select]:
net.(*netFD).connect.func2(0x4d0f520, 0xc0005e28a0, 0xc0005c1e00, 0xc000181620, 0xc0001815c0)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:129 +0xba
created by net.(*netFD).connect
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:128 +0x22f

goroutine 533 [select]:
net.(*netFD).connect.func2(0x4d0f520, 0xc000572900, 0xc00022e580, 0xc00023c900, 0xc00023c8a0)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:129 +0xba
created by net.(*netFD).connect
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:128 +0x22f

goroutine 545 [select]:
net.(*netFD).connect.func2(0x4d0f520, 0xc000572d80, 0xc00022eb80, 0xc00023d7a0, 0xc00023d740)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:129 +0xba
created by net.(*netFD).connect
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:128 +0x22f

goroutine 564 [select]:
net.(*netFD).connect.func2(0x4d0f520, 0xc0005e2d20, 0xc0005f4400, 0xc0005ef200, 0xc0005ef1a0)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:129 +0xba
created by net.(*netFD).connect
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:128 +0x22f

goroutine 509 [select]:
net.(*netFD).connect.func2(0x4d0f520, 0xc000286300, 0xc000453d80, 0xc0000ff0e0, 0xc0000ff080)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:129 +0xba
created by net.(*netFD).connect
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:128 +0x22f

goroutine 480 [select]:
net.(*netFD).connect.func2(0x4d0f520, 0xc0005e2ba0, 0xc0005f4200, 0xc0005eed80, 0xc0005eed20)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:129 +0xba
created by net.(*netFD).connect
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:128 +0x22f

goroutine 511 [select]:
net.(*netFD).connect.func2(0x4d0f520, 0xc0002865a0, 0xc000453f80, 0xc0000ff620, 0xc0000ff5c0)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:129 +0xba
created by net.(*netFD).connect
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:128 +0x22f

goroutine 546 [select]:
net.(*netFD).connect.func2(0x4d0f520, 0xc000286660, 0xc00038c080, 0xc0000ff860, 0xc0000ff800)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:129 +0xba
created by net.(*netFD).connect
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:128 +0x22f

goroutine 506 [select]:
net.(*netFD).connect.func2(0x4d0f520, 0xc0002860c0, 0xc000453a80, 0xc0000fe960, 0xc0000fe900)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:129 +0xba
created by net.(*netFD).connect
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:128 +0x22f

goroutine 518 [select]:
net.(*netFD).connect.func2(0x4d0f520, 0xc000164900, 0xc000334200, 0xc000181aa0, 0xc000181a40)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:129 +0xba
created by net.(*netFD).connect
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:128 +0x22f

goroutine 534 [select]:
net.(*netFD).connect.func2(0x4d0f520, 0xc0005729c0, 0xc00022e680, 0xc00023cba0, 0xc00023cb40)
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:129 +0xba
created by net.(*netFD).connect
	/Users/caioferreira/.asdf/installs/golang/1.14.6/go/src/net/fd_unix.go:128 +0x22f
FAIL	github.com/percona/mongodb_exporter/exporter	32.199s
?   	github.com/percona/mongodb_exporter/internal/tu	[no test files]
FAIL

This is logged repeatedly, so I believe it is caused by some loop. Do you have any clues?

Env:

  • Go v1.14.6

caiorcferreira, May 18 '21 16:05

Hello. Could you try with a newer Go version and with modules enabled? I cannot reproduce the errors:

go test -v -timeout 30s ./...
=== RUN   TestBuildExporter
time="2021-05-21T09:49:27-03:00" level=debug msg="Compatible mode: true"
time="2021-05-21T09:49:27-03:00" level=debug msg="Connection URI: mongodb://usr:[email protected]/"
--- PASS: TestBuildExporter (0.00s)
PASS
ok      github.com/percona/mongodb_exporter     0.008s
=== RUN   TestCollStatsCollector
--- PASS: TestCollStatsCollector (0.07s)
=== RUN   TestDebug
--- PASS: TestDebug (0.00s)
=== RUN   TestDiagnosticDataCollector
--- PASS: TestDiagnosticDataCollector (0.03s)
=== RUN   TestAllDiagnosticDataCollectorMetrics
--- PASS: TestAllDiagnosticDataCollectorMetrics (0.05s)
=== RUN   TestConnect
=== RUN   TestConnect/Connect_without_SSL
=== RUN   TestConnect/Test_per-request_connection
=== RUN   TestConnect/Test_global_connection
--- PASS: TestConnect (0.49s)
    --- PASS: TestConnect/Connect_without_SSL (0.01s)
    --- PASS: TestConnect/Test_per-request_connection (0.27s)
    --- PASS: TestConnect/Test_global_connection (0.21s)
=== RUN   TestGeneralCollector
time="2021-05-21T09:49:27-03:00" level=error msg="error while checking mongodb connection: client is disconnected. mongo_up is set to 0"
--- PASS: TestGeneralCollector (0.00s)
=== RUN   TestIndexStatsCollector
--- PASS: TestIndexStatsCollector (0.12s)
=== RUN   TestSanitize
=== RUN   TestSanitize/With_building
=== RUN   TestSanitize/Without_building
--- PASS: TestSanitize (0.00s)
    --- PASS: TestSanitize/With_building (0.00s)
    --- PASS: TestSanitize/Without_building (0.00s)
=== RUN   TestMetricName
--- PASS: TestMetricName (0.00s)
=== RUN   TestPrometeusize
--- PASS: TestPrometeusize (0.00s)
=== RUN   TestMakeRawMetric
--- PASS: TestMakeRawMetric (0.00s)
=== RUN   TestRawToCompatibleRawMetric
--- PASS: TestRawToCompatibleRawMetric (0.00s)
=== RUN   TestReplsetStatusCollector
--- PASS: TestReplsetStatusCollector (0.00s)
=== RUN   TestReplsetStatusCollectorNoSharding
--- PASS: TestReplsetStatusCollectorNoSharding (0.00s)
=== RUN   TestSecondaryLag
    secondary_lag_test.go:58: This is failing in GitHub actions. Cannot make secondary to lag behind
--- SKIP: TestSecondaryLag (0.00s)
=== RUN   TestServerStatusDataCollector
--- PASS: TestServerStatusDataCollector (0.02s)
=== RUN   TestTopologyLabels
--- PASS: TestTopologyLabels (0.00s)
=== RUN   TestWalkTo
--- PASS: TestWalkTo (0.00s)
=== RUN   TestMakeLockMetric
--- PASS: TestMakeLockMetric (0.00s)
=== RUN   TestAddLocksMetrics
--- PASS: TestAddLocksMetrics (0.00s)
=== RUN   TestSumMetrics
=== RUN   TestSumMetrics/timeAcquire
=== RUN   TestSumMetrics/timeAcquire#01
--- PASS: TestSumMetrics (0.00s)
    --- PASS: TestSumMetrics/timeAcquire (0.00s)
    --- PASS: TestSumMetrics/timeAcquire#01 (0.00s)
=== RUN   TestCreateOldMetricFromNew
--- PASS: TestCreateOldMetricFromNew (0.00s)
PASS
ok      github.com/percona/mongodb_exporter/exporter    0.825s
?       github.com/percona/mongodb_exporter/internal/tu [no test files]

percona-csalguero, May 21 '21 12:05

Any update on this? I'm interested in these metrics.

Thanks

Iliyass, Jul 16 '21 15:07

Isn't this already captured by the getDiagnosticData collector under local.oplog.rs.stats (aliased as oplog_stats)? See https://github.com/percona/mongodb_exporter/blob/v0.34.0/exporter/testdata/get_diagnostic_data.json#L3-L20

daniel-shuy, Oct 12 '22 06:10

@daniel-shuy, I don't think so. For instance, tFirst and tLast from db.getReplicationInfo() do not seem to be captured anywhere. I ended up here while trying to calculate the oplog window, which is pretty much the same context as the OP.
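For context, the shell helper derives the window from the timestamps of the oldest and newest oplog entries (tFirst and tLast). A minimal sketch of that arithmetic in Go, assuming the seconds component of each entry's ts field has already been read from local.oplog.rs; the function name oplogWindow and the sample values are hypothetical and not part of the exporter:

```go
package main

import (
	"fmt"
	"time"
)

// oplogWindow returns the time span covered by the oplog, given the
// seconds component of the oldest (firstSec) and newest (lastSec)
// oplog entry timestamps. Hypothetical helper, not exporter code.
func oplogWindow(firstSec, lastSec uint32) time.Duration {
	if lastSec < firstSec {
		// Guard against inconsistent input instead of underflowing.
		return 0
	}
	return time.Duration(lastSec-firstSec) * time.Second
}

func main() {
	// Illustrative values only: two timestamps one day apart.
	w := oplogWindow(1665700000, 1665786400)
	fmt.Printf("oplog window: %s (%.2f hours)\n", w, w.Hours())
}
```

An exporter would presumably publish the result as a gauge in seconds, but how that would be wired into a collector is not settled in this thread.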

jeffersongirao, Oct 14 '22 13:10

@jeffersongirao If I'm not mistaken, local.oplog.rs.stats_start is the tFirst and local.oplog.rs.stats_end is the tLast.

Maybe someone at Percona can kindly confirm if this is true or not?

daniel-shuy, Oct 14 '22 14:10

@daniel-shuy thanks for pointing that out, but unfortunately those seem to be different fields. The timestamp values do not correspond to the head and tail entries of the oplog, neither in the exporter nor at the source.

jeffersongirao, Oct 19 '22 07:10

Happy to help with this issue, but I need some time.

vineelyalamarthy, Oct 26 '22 07:10

As a workaround, to estimate the oplog window, I divided the size of the local database by the rate of replication network bytes (available in server status). This gives a rough estimate of the time replication takes to fill the oplog.rs collection.
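The arithmetic behind this estimate can be sketched in Go as follows. This is only an illustration under stated assumptions: the function name estimateOplogWindow and the sample numbers are hypothetical, and the two inputs are assumed to have been obtained elsewhere (for example from existing exporter metrics).

```go
package main

import (
	"fmt"
	"time"
)

// estimateOplogWindow approximates how long replication traffic takes
// to fill the oplog: size in bytes divided by a byte rate per second.
// It reports false when the rate is not positive, because the estimate
// is undefined in that case. Hypothetical helper, not exporter code.
func estimateOplogWindow(sizeBytes, bytesPerSecond float64) (time.Duration, bool) {
	if bytesPerSecond <= 0 || sizeBytes < 0 {
		return 0, false
	}
	seconds := sizeBytes / bytesPerSecond
	return time.Duration(seconds * float64(time.Second)), true
}

func main() {
	// Illustrative values only: 50 GiB of data at 1 MiB/s.
	size := 50.0 * 1024 * 1024 * 1024
	rate := 1.0 * 1024 * 1024
	if w, ok := estimateOplogWindow(size, rate); ok {
		fmt.Printf("estimated oplog window: %s\n", w)
	}
}
```

Because the rate fluctuates with write load, the result is only as stable as the interval over which the rate is averaged.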

I would still prefer to have the getReplicationInfo metrics.

Which project would be best suited for creating a ticket?

bvalente, Mar 08 '24 15:03