goleveldb

OpenFile really slow/large number of .ldb files

Open wingedpig opened this issue 8 years ago • 4 comments

We've been using goleveldb in production for 3 years now, and it's been great. But we're seeing an issue now with OpenFile() taking 1.5 minutes to open a small database. The database only contains < 300 items (email messages), yet there are around 9300 .ldb files. Most of the .ldb files appear old. Another database in the same application (used to store recipients) behaves as expected, with only a couple of .ldb files and very quick startup times. The code to manage both databases is identical.

The emails are only stored for a couple of days and then deleted, so there is constant turnover. And some of the emails can be up to 20MB in size.

Should we try CompactRange() to get things back to normal? Could this be an issue because of the large record sizes?

wingedpig avatar Feb 06 '17 20:02 wingedpig

There seems to be an issue with compaction: it is unable to effectively compact data with high turnover. This is not trivial to fix, so for now you can try calling CompactRange() periodically as a workaround.

syndtr avatar Mar 02 '17 03:03 syndtr

I think I have a simple reproduction of this behavior. I just decided to give goleveldb a shot and tested it with this code:

package lvldb

import (
	"fmt"
	"testing"

	"github.com/syndtr/goleveldb/leveldb"
	"github.com/syndtr/goleveldb/leveldb/util"
)

func TestWriteRead(t *testing.T) {
	db, err := leveldb.OpenFile("lvl.db", nil)
	if err != nil {
		// Stop here: db is nil if the open failed.
		t.Fatal(err)
	}
	defer db.Close()
	p(db, []byte("foo-key1"), []byte("value1"))
	p(db, []byte("foo-key2"), []byte("value2"))
	p(db, []byte("foo-key3"), []byte("value3"))
	p(db, []byte("foo-key4"), []byte("value4"))
	p(db, []byte("foo-key5"), []byte("value5"))
	p(db, []byte("foo-key6"), []byte("value6"))
	p(db, []byte("foo-key7"), []byte("value7"))
	p(db, []byte("foo-key8"), []byte("value8"))
	p(db, []byte("foo-key9"), []byte("value9"))
	p(db, []byte("sfoo-key8"), []byte("value8"))
	p(db, []byte("sfoo-key9"), []byte("value9"))

	iter := db.NewIterator(util.BytesPrefix([]byte("foo-key")), nil)
	for iter.Next() {
		// Use key/value.
		fmt.Println(string(iter.Key()), string(iter.Value()))
	}
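	iter.Release()
	// Surface any error the iterator hit while scanning.
	if err := iter.Error(); err != nil {
		t.Error(err)
	}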
}

// p writes one key/value pair and prints any error.
func p(db *leveldb.DB, k, v []byte) {
	err := db.Put(k, v, nil)
	if err != nil {
		fmt.Println(err)
	}
}

I ran TestWriteRead over and over, and it never stops creating new .ldb files even though there is no new data to write. In my case I stopped when the file names reached 000159.ldb. I hope this helps to identify the bug.

artvel avatar Jun 23 '17 15:06 artvel

Has this been resolved?

AidenZuk avatar Sep 26 '19 08:09 AidenZuk

If I am not mistaken, this is a problem inherited from the C++ version.

See https://github.com/google/leveldb/issues/783

pedrinimm avatar Nov 05 '20 09:11 pedrinimm