database: add pebbledb support
Overview
This PR introduces PebbleDB as a new database engine and an alternative to LevelDB. https://github.com/btcsuite/btcd/issues/2024 https://github.com/btcsuite/btcd/issues/1339 To support multiple database engines (PebbleDB and LevelDB), a database engine interface has been introduced and wired into ffldb.
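As a rough sketch of the shape of that abstraction (the method and type names here are hypothetical; the actual definitions live under database/engine and may differ):

```go
// Hypothetical sketch of the backend-agnostic engine interface that lets
// ffldb run on top of either LevelDB or PebbleDB; the real definitions
// under database/engine may differ.
package engine

type Engine interface {
	Get(key []byte) ([]byte, error)
	Put(key, value []byte) error
	Delete(key []byte) error
	NewIterator(prefix []byte) Iterator
	Snapshot() (Snapshot, error)
	Close() error
}

// Iterator walks keys under a prefix in lexicographic order.
type Iterator interface {
	Next() bool
	Key() []byte
	Value() []byte
	Release()
}

// Snapshot is a read-only, point-in-time view of the database.
type Snapshot interface {
	Get(key []byte) ([]byte, error)
	Release()
}
```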
How to run
```
./btcd --dbtype=pebbledb
```
Caution
This update requires Go 1.22+. https://github.com/btcsuite/btcd/issues/2306
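For reference, that corresponds to the go directive in go.mod, roughly:

```
module github.com/btcsuite/btcd

go 1.22
```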
This is a big change and I see that you're still making changes here. Do ping me once things are ready for review!
Thank you @kcalvinalvin, it is ready for review! By the way, bumping the go.mod version is also a big change; it might be better to handle that in a separate pull request.
Pull Request Test Coverage Report for Build 13717690125
Details
- 552 of 689 (80.12%) changed or added relevant lines in 15 files are covered.
- 11 unchanged lines in 6 files lost coverage.
- Overall coverage increased (+0.3%) to 55.564%
| Changes Missing Coverage | Covered Lines | Changed/Added Lines | % |
|---|---|---|---|
| database/ffldb/ldbtreapiter.go | 1 | 2 | 50.0% |
| database/engine/pebbledb/pebbledb.go | 47 | 49 | 95.92% |
| database/engine/testsuite.go | 148 | 150 | 98.67% |
| database/ffldb/driver.go | 23 | 25 | 92.0% |
| database/ffldb/db.go | 40 | 45 | 88.89% |
| database/ffldb/dbcache.go | 19 | 24 | 79.17% |
| database/engine/pebbledb/snapshot.go | 35 | 42 | 83.33% |
| database/engine/pebbledb/transaction.go | 16 | 23 | 69.57% |
| database/engine/pebbledb/iterator.go | 17 | 26 | 65.38% |
| database/engine/iterator.go | 143 | 240 | 59.58% |
| Total: | 552 | 689 | 80.12% |
| Files with Coverage Reduction | New Missed Lines | % |
|---|---|---|
| database/ffldb/ldbtreapiter.go | 1 | 86.67% |
| mempool/mempool.go | 1 | 66.67% |
| btcutil/gcs/gcs.go | 2 | 80.95% |
| database/ffldb/db.go | 2 | 90.61% |
| txscript/taproot.go | 2 | 95.98% |
| database/ffldb/dbcache.go | 3 | 77.2% |
| Total: | 11 | |
| Totals | |
|---|---|
| Change from base Build 13687806397: | 0.3% |
| Covered Lines: | 30411 |
| Relevant Lines: | 54731 |
💛 - Coveralls
@Roasbeef @yyforyongyu @kcalvinalvin Can you check if there's anything I should update on the PR? Seems like it’s been stuck.
Just found out that batch writing in pebbledb only supports up to 4GB, as shown in this test:
```go
package ffldb_test

import (
	"math"
	"testing"

	"github.com/cockroachdb/pebble"
)

func TestPebbleDBBatchWriteFail(t *testing.T) {
	tmpDir := t.TempDir()
	db, err := pebble.Open(tmpDir, nil)
	if err != nil {
		t.Fatal(err)
	}
	// A single key of math.MaxUint32 bytes (~4GB) is enough to
	// blow past pebble's batch size limit.
	k := make([]byte, math.MaxUint32)
	v := [1]byte{}
	batch := db.NewBatch()
	err = batch.Set(k[:], v[:], nil)
	if err != nil {
		t.Fatal(err)
	}
	if err := batch.Commit(nil); err != nil {
		t.Fatal(err)
	}
}
```
This does make it tricky to have everything written atomically.
EDIT:
Seems like we should be bypassing the internal memtable for pebbledb and just writing straight to the tables on disk. https://github.com/cockroachdb/pebble/issues/702#issuecomment-632055748
> Just found out that batch writing in pebbledb only supports up to 4GB, as shown in this test:
Indeed, it would be a problem if a single batch exceeds 4GB, but I’m not sure that actually happens in practice. The batches I’ve looked at typically had ~10MB of keys and ~100MB of values (see this). Maybe in some cases a utxoCache flush could exceed the limit (with --utxocachemaxsize=4000).
> Seems like we should be bypassing the internal memtable for pebbledb and just writing straight to the tables on disk.
It could be a great alternative for large flushes: writing directly via SSTable ingest avoids the 4GB batch limit and keeps memory usage low. A hybrid approach might work: use a batch for regular commits and Ingest for utxoCache flushes.
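A rough sketch of that hybrid, assuming pebble's Batch and Ingest APIs; flush, buildSSTable, flushThreshold, and the entries map are hypothetical names for illustration, not btcd or pebble identifiers:

```go
package main

import (
	"errors"

	"github.com/cockroachdb/pebble"
)

// flushThreshold is an illustrative cutoff chosen to stay well under
// pebble's 4GB batch limit.
const flushThreshold = 1 << 30 // 1 GiB

// flush writes entries through a normal atomic batch when the payload is
// small, and falls back to SSTable ingestion for oversized flushes.
func flush(db *pebble.DB, entries map[string][]byte, totalSize int) error {
	if totalSize < flushThreshold {
		b := db.NewBatch()
		for k, v := range entries {
			if err := b.Set([]byte(k), v, nil); err != nil {
				return err
			}
		}
		// The whole batch is applied atomically.
		return b.Commit(pebble.Sync)
	}

	// Oversized flushes bypass the memtable: write the entries (in
	// sorted key order) to an SSTable file and ingest it directly,
	// sidestepping the 4GB batch limit and keeping memory usage flat.
	path, err := buildSSTable(entries)
	if err != nil {
		return err
	}
	return db.Ingest([]string{path})
}

// buildSSTable is a stub: it would write entries to a pebble-format
// SSTable via the sstable package, whose writer API varies by version.
func buildSSTable(entries map[string][]byte) (string, error) {
	return "", errors.New("not implemented in this sketch")
}
```

If I read pebble's docs right, Ingest applies the supplied files atomically, so a large flush done this way would still land as a unit.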
@kcalvinalvin It seems that no single batch exceeds 1GB of key/values while syncing Bitcoin mainnet, even with --utxocachemaxsize=2000. Could you share the circumstances under which you encountered a 4GB flush? (In fact, flushing 4GB of data from memory to disk at once is something to avoid.)
> @kcalvinalvin It seems that no single batch exceeds 1GB of key/values while syncing Bitcoin mainnet, even with --utxocachemaxsize=2000. Could you share the circumstances under which you encountered a 4GB flush? (In fact, flushing 4GB of data from memory to disk at once is something to avoid.)
The current UTXO set size is ~12GB. Since on disk we use varints, it's going to be bigger in memory. Because of this, it's reasonable for someone doing IBD to set --utxocachemaxsize=16000.
Bitcoin Core also dropped the limit. https://github.com/bitcoin/bitcoin/issues/28249
@kcalvinalvin It seems reasonable to take a two-track approach. It might be a good idea to call ingest() in PebbleDB when processing a batch larger than 4GB.
We could also start pebbledb support with a limit on the utxoCacheSize (< 4000), and then remove the limit once ingest() is supported.
As we all know, Pebble won't fully replace LevelDB. It's a good way to start with experimental usage and gradually improve it.
> As we all know, Pebble won't fully replace LevelDB. It's a good way to start with experimental usage and gradually improve it.
Some of the code here may be of interest. https://github.com/utreexo/utreexod/pull/325
@kcalvinalvin What do you think about setting a 4GB limit for maxUtxoCacheSize when using PebbleDB? We could consider this as an experimental approach for PebbleDB.
```go
// Don't allow utxoCacheMaxSize greater than 4GB (4096 MiB) when using
// PebbleDB. This is due to the batch size limitation of PebbleDB.
if cfg.DbType == ffldb.PebbleDB && cfg.UtxoCacheMaxSizeMiB > 4096 {
	str := "%s: The utxocachemaxsize option may not be greater " +
		"than 4096 MiB for the PebbleDB database backend -- " +
		"parsed [%d]"
	err := fmt.Errorf(str, funcName, cfg.UtxoCacheMaxSizeMiB)
	fmt.Fprintln(os.Stderr, err)
	fmt.Fprintln(os.Stderr, usageMessage)
	return nil, nil, err
}
```
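With a check like this in place, `./btcd --dbtype=pebbledb --utxocachemaxsize=4096` would still start, while any larger value would be rejected at startup with the error above.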