use last second copy rate to project ETA
A Pull Request should be associated with an Issue.
We wish to have discussions in Issues. A single issue may be targeted by multiple PRs. If you're offering a new feature or fixing anything, we'd like to know beforehand in Issues, and potentially we'll be able to point development in a particular direction.
Related issue: https://github.com/github/gh-ost/issues/1204
Further notes in https://github.com/github/gh-ost/blob/master/.github/CONTRIBUTING.md Thank you! We are open to PRs, but please understand if for technical reasons we are unable to accept each and any PR
Description
Currently, the ETA that gh-ost uses is based on a projection that averages over the whole run time. Because (with a b+tree) the insert rate will slow down as the table gets larger, this can lead to inaccurate projections.
This uses a different approach, which is to project from the number of rows copied in the last second. It has the downside that the estimate can tend to "jump around" a bit, but sometimes this is helpful. I played around with changing the sample average to 10s etc, but it looks like 1s seems to be long enough to balance estimate stability.
Here is a sample output showing the time bouncing around a little:
~/go/src/github.com/morgo/gh-ost/go/cmd/gh-ost more-accurate-eta go run . \
--max-load=Threads_running=25 \
--critical-load=Threads_running=1000 \
--chunk-size=1000 \
--user="msandbox" \
--password="msandbox" \
--host="127.0.0.1" \
--port=8031 \
--allow-on-master \
--database="test" \
--table="customer" \
--initially-drop-old-table \
--ok-to-drop-table \
--execute --initially-drop-ghost-table --alter="engine=innodb"
[2022/12/14 14:28:21] [info] binlogsyncer.go:148 create BinlogSyncer with config {99999 mysql 127.0.0.1 8031 msandbox false false <nil> false UTC true 0 0s 0s 0 false false 0 <nil>}
[2022/12/14 14:28:21] [info] binlogsyncer.go:374 begin to sync binlog from position (binlog.000199, 157)
[2022/12/14 14:28:21] [info] binlogsyncer.go:791 rotate to (binlog.000199, 157)
# Migrating `test`.`customer`; Ghost table is `test`.`_customer_gho`
# Migrating mtocker-macbookpro.local:8031; inspecting mtocker-macbookpro.local:8031; executing on mtocker-macbookpro.local
# Migration started at Wed Dec 14 14:28:21 -0700 2022
# chunk-size: 1000; max-lag-millis: 1500ms; dml-batch-size: 10; max-load: Threads_running=25; critical-load: Threads_running=1000; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# Serving on unix socket: /tmp/gh-ost.test.customer.sock
Copy: 0/2881784 0.0%; Applied: 0; Backlog: 0/1000; Time: 0s(total), 0s(copy); streamer: binlog.000199:3566; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: N/A
Copy: 0/2881784 0.0%; Applied: 0; Backlog: 0/1000; Time: 1s(total), 1s(copy); streamer: binlog.000199:8040; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: N/A
Copy: 84000/2881784 2.9%; Applied: 0; Backlog: 0/1000; Time: 2s(total), 2s(copy); streamer: binlog.000199:47281787; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 1m6s
Copy: 168000/2881784 5.8%; Applied: 0; Backlog: 0/1000; Time: 3s(total), 3s(copy); streamer: binlog.000199:94566987; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 32s
Copy: 245000/2881784 8.5%; Applied: 0; Backlog: 0/1000; Time: 4s(total), 4s(copy); streamer: binlog.000199:137949978; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 34s
Copy: 323000/2881784 11.2%; Applied: 0; Backlog: 0/1000; Time: 5s(total), 5s(copy); streamer: binlog.000199:181313206; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 32s
Copy: 402000/2881784 13.9%; Applied: 0; Backlog: 0/1000; Time: 6s(total), 6s(copy); streamer: binlog.000199:226333018; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 31s
Copy: 476000/2881784 16.5%; Applied: 0; Backlog: 0/1000; Time: 7s(total), 7s(copy); streamer: binlog.000199:268011561; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 32s
Copy: 555000/2881784 19.3%; Applied: 0; Backlog: 0/1000; Time: 8s(total), 8s(copy); streamer: binlog.000199:312501660; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 29s
Copy: 615000/2881784 21.3%; Applied: 0; Backlog: 0/1000; Time: 9s(total), 9s(copy); streamer: binlog.000199:346275029; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 37s
Copy: 669000/2881784 23.2%; Applied: 0; Backlog: 0/1000; Time: 10s(total), 10s(copy); streamer: binlog.000199:376674938; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 40s
Copy: 726000/2881784 25.2%; Applied: 0; Backlog: 0/1000; Time: 11s(total), 11s(copy); streamer: binlog.000199:408728243; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 37s
Copy: 783000/2881784 27.2%; Applied: 0; Backlog: 0/1000; Time: 12s(total), 12s(copy); streamer: binlog.000199:440842291; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 36s
Copy: 846000/2881784 29.4%; Applied: 0; Backlog: 0/1000; Time: 13s(total), 13s(copy); streamer: binlog.000199:476296347; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 32s
Copy: 910000/2881784 31.6%; Applied: 0; Backlog: 0/1000; Time: 14s(total), 14s(copy); streamer: binlog.000199:512338908; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 30s
Copy: 969000/2881784 33.6%; Applied: 0; Backlog: 0/1000; Time: 15s(total), 15s(copy); streamer: binlog.000199:545572110; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 32s
Copy: 1026000/2881784 35.6%; Applied: 0; Backlog: 0/1000; Time: 16s(total), 16s(copy); streamer: binlog.000199:577675650; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 32s
Copy: 1087000/2881784 37.7%; Applied: 0; Backlog: 0/1000; Time: 17s(total), 17s(copy); streamer: binlog.000199:612023226; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 29s
Copy: 1148000/2881784 39.8%; Applied: 0; Backlog: 0/1000; Time: 18s(total), 18s(copy); streamer: binlog.000199:646342977; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 28s
Copy: 1204000/2881784 41.8%; Applied: 0; Backlog: 0/1000; Time: 19s(total), 19s(copy); streamer: binlog.000199:677881385; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 29s
Copy: 1265000/2881784 43.9%; Applied: 0; Backlog: 0/1000; Time: 20s(total), 20s(copy); streamer: binlog.000199:712235510; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 26s
Copy: 1335000/2881784 46.3%; Applied: 0; Backlog: 0/1000; Time: 21s(total), 21s(copy); streamer: binlog.000199:751664902; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 22s
Copy: 1403000/2881784 48.7%; Applied: 0; Backlog: 0/1000; Time: 22s(total), 22s(copy); streamer: binlog.000199:789461333; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 21s
Copy: 1464000/2881784 50.8%; Applied: 0; Backlog: 0/1000; Time: 23s(total), 23s(copy); streamer: binlog.000199:824289714; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 23s
Copy: 1527000/2881784 53.0%; Applied: 0; Backlog: 0/1000; Time: 24s(total), 24s(copy); streamer: binlog.000199:859468766; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 21s
Copy: 1592000/2881784 55.2%; Applied: 0; Backlog: 0/1000; Time: 25s(total), 25s(copy); streamer: binlog.000199:896372430; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 19s
Copy: 1662000/2881784 57.7%; Applied: 0; Backlog: 0/1000; Time: 26s(total), 26s(copy); streamer: binlog.000199:935798318; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 17s
Copy: 1725000/2881784 59.9%; Applied: 0; Backlog: 0/1000; Time: 27s(total), 27s(copy); streamer: binlog.000199:971271951; Lag: 0.11s, HeartbeatLag: 0.11s, State: migrating; ETA: 18s
Copy: 1793000/2881784 62.2%; Applied: 0; Backlog: 0/1000; Time: 28s(total), 28s(copy); streamer: binlog.000199:1009545478; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 16s
Copy: 1856000/2881784 64.4%; Applied: 0; Backlog: 0/1000; Time: 29s(total), 29s(copy); streamer: binlog.000199:1044945554; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 16s
[2022/12/14 14:28:51] [info] binlogsyncer.go:791 rotate to (binlog.000200, 4)
[2022/12/14 14:28:51] [info] binlogsyncer.go:791 rotate to (binlog.000200, 4)
Copy: 1920000/2881784 66.6%; Applied: 0; Backlog: 0/1000; Time: 30s(total), 30s(copy); streamer: binlog.000200:6753737; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 15s
Copy: 1977000/2881784 68.6%; Applied: 0; Backlog: 0/1000; Time: 31s(total), 31s(copy); streamer: binlog.000200:38877488; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 15s
Copy: 2032000/2881784 70.5%; Applied: 0; Backlog: 0/1000; Time: 32s(total), 32s(copy); streamer: binlog.000200:69863585; Lag: 0.11s, HeartbeatLag: 0.11s, State: migrating; ETA: 15s
Copy: 2093000/2881784 72.6%; Applied: 0; Backlog: 0/1000; Time: 33s(total), 33s(copy); streamer: binlog.000200:104128743; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 12s
Copy: 2158000/2881784 74.9%; Applied: 0; Backlog: 0/1000; Time: 34s(total), 34s(copy); streamer: binlog.000200:140822753; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 11s
Copy: 2220000/2881784 77.0%; Applied: 0; Backlog: 0/1000; Time: 35s(total), 35s(copy); streamer: binlog.000200:175718354; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 10s
Copy: 2280000/2881784 79.1%; Applied: 0; Backlog: 0/1000; Time: 36s(total), 36s(copy); streamer: binlog.000200:209458784; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 10s
Copy: 2342000/2881784 81.3%; Applied: 0; Backlog: 0/1000; Time: 37s(total), 37s(copy); streamer: binlog.000200:244384714; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 8s
Copy: 2405000/2881784 83.5%; Applied: 0; Backlog: 0/1000; Time: 38s(total), 38s(copy); streamer: binlog.000200:279865282; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 7s
Copy: 2469000/2881784 85.7%; Applied: 0; Backlog: 0/1000; Time: 39s(total), 39s(copy); streamer: binlog.000200:315425854; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 6s
Copy: 2529000/2881784 87.8%; Applied: 0; Backlog: 0/1000; Time: 40s(total), 40s(copy); streamer: binlog.000200:349673270; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 5s
Copy: 2598000/2881784 90.2%; Applied: 0; Backlog: 0/1000; Time: 41s(total), 41s(copy); streamer: binlog.000200:388534212; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 4s
Copy: 2661000/2881784 92.3%; Applied: 0; Backlog: 0/1000; Time: 42s(total), 42s(copy); streamer: binlog.000200:423824648; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 3s
Copy: 2718000/2881784 94.3%; Applied: 0; Backlog: 0/1000; Time: 43s(total), 43s(copy); streamer: binlog.000200:456090786; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 2s
Copy: 2775000/2881784 96.3%; Applied: 0; Backlog: 0/1000; Time: 44s(total), 44s(copy); streamer: binlog.000200:488195145; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: 1s
Copy: 2833000/2881784 98.3%; Applied: 0; Backlog: 0/1000; Time: 45s(total), 45s(copy); streamer: binlog.000200:520841157; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: due
Copy: 2894000/2881784 100.4%; Applied: 0; Backlog: 0/1000; Time: 46s(total), 46s(copy); streamer: binlog.000200:555202612; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: due
Copy: 2952000/2881784 102.4%; Applied: 0; Backlog: 0/1000; Time: 47s(total), 47s(copy); streamer: binlog.000200:587420664; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: due
Copy: 3000000/3000000 100.0%; Applied: 0; Backlog: 0/1000; Time: 47s(total), 47s(copy); streamer: binlog.000200:614436979; Lag: 0.01s, HeartbeatLag: 0.03s, State: migrating; ETA: due
Copy: 3000000/3000000 100.0%; Applied: 0; Backlog: 1/1000; Time: 48s(total), 47s(copy); streamer: binlog.000200:614902121; Lag: 0.01s, HeartbeatLag: 0.01s, State: migrating; ETA: due
# Migrating `test`.`customer`; Ghost table is `test`.`_customer_gho`
# Migrating mtocker-macbookpro.local:8031; inspecting mtocker-macbookpro.local:8031; executing on mtocker-macbookpro.local
# Migration started at Wed Dec 14 14:28:21 -0700 2022
# chunk-size: 1000; max-lag-millis: 1500ms; dml-batch-size: 10; max-load: Threads_running=25; critical-load: Threads_running=1000; nice-ratio: 0.000000
# throttle-additional-flag-file: /tmp/gh-ost.throttle
# Serving on unix socket: /tmp/gh-ost.test.customer.sock
Copy: 3000000/3000000 100.0%; Applied: 0; Backlog: 0/1000; Time: 48s(total), 47s(copy); streamer: binlog.000200:614905425; Lag: 0.01s, HeartbeatLag: 0.03s, State: migrating; ETA: due
[2022/12/14 14:29:10] [info] binlogsyncer.go:180 syncer is closing...
[2022/12/14 14:29:10] [info] binlogsyncer.go:864 kill last connection id 11
[2022/12/14 14:29:10] [info] binlogsyncer.go:210 syncer is closed
# Done
In case this PR introduced Go code changes:
- [x] contributed code is using same conventions as original code
- [x]
script/cibuildreturns with no formatting errors, build errors or unit test errors.