In-memory read-only database implementation

Open jdevelop opened this issue 5 years ago • 28 comments

I need to "embed" some key/value data into my Go app (packr/rice etc.). The database will be used read-only, for prefix lookups.

bbolt requires an *os.File instance to work with. I wonder if it is possible to generalize this to something more abstract, so that I can supply a []byte instead?

As a workaround I create a file in /tmp upon startup if it doesn't exist, then copy the content into that file and then supply this file to bbolt.Open, but that is something I'd rather try to avoid.

jdevelop avatar Jul 07 '20 14:07 jdevelop

How about a more lightweight solution, like https://github.com/google/btree? Is it important for your use case to keep the data embedded as a serialized buffer?
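For read-only prefix lookups, even a sorted slice with binary search may be enough. The sketch below uses only the standard library rather than the btree package mentioned above; the `entry` type, `prefixScan`, and the sample keys are made up for illustration.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

type entry struct{ key, value string }

// prefixScan returns every entry whose key starts with prefix.
// entries must be sorted by key in ascending byte order.
func prefixScan(entries []entry, prefix string) []entry {
	// Binary search for the first key that is >= prefix.
	start := sort.Search(len(entries), func(i int) bool {
		return entries[i].key >= prefix
	})
	// Matching keys are contiguous in sorted order, so walk until
	// the prefix stops matching.
	end := start
	for end < len(entries) && strings.HasPrefix(entries[end].key, prefix) {
		end++
	}
	return entries[start:end]
}

func main() {
	entries := []entry{
		{"acme/router-1", "supported"},
		{"acme/router-2", "supported"},
		{"acme/switch-9", "experimental"},
		{"zeta/modem-3", "supported"},
	}
	for _, e := range prefixScan(entries, "acme/router") {
		fmt.Println(e.key, e.value)
	}
}
```

The trade-off is that the slice has to be built (or decoded) at startup, which is exactly the step the original poster says may not be feasible in some circumstances.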

ptabor avatar Jul 07 '20 15:07 ptabor

@ptabor thanks for the quick response. Yes, for my use-case I need to embed the list of hardware I can work with into the executable. Building it upon startup might not be feasible under some circumstances. It would be ideal if I can just point to some byte buffer and pretend that it is Bolt database. Range / prefix queries are quite useful to have.

jdevelop avatar Jul 07 '20 16:07 jdevelop

> As a workaround I create a file in /tmp upon startup if it doesn't exist, then copy the content into that file and then supply this file to bbolt.Open, but that is something I'd rather try to avoid.

That is a fine idea. It would also help speed up testing. You can probably find some existing solutions for mocking the os.File API to begin with.

xiang90 avatar Jul 07 '20 23:07 xiang90

It would be nice to have the openFile func(string, int, os.FileMode) (*os.File, error) function return something like io.ReadWriteSeeker instead of *os.File, which is a pointer to a struct. That way it could easily be configured to work with any random-access storage type (files, byte slices, etc.).

jdevelop avatar Jul 08 '20 12:07 jdevelop

My idea was to use something like the following:

import (
	"os"

	"github.com/spf13/afero"
	bolt "go.etcd.io/bbolt"
)

// ...
	fs := afero.NewMemMapFs()
	options := bolt.Options{
		OpenFile: func(name string, flag int, perm os.FileMode) (*os.File, error) {
			f, err := fs.OpenFile(name, flag, perm)
			if err != nil {
				return nil, err
			}
			// Does not compile: f is an afero.File, not an *os.File.
			return f, nil
		},
	}
// ...

But the thing is that afero uses an interface instead of a real *os.File:

type File interface {
	io.Closer
	io.Reader
	io.ReaderAt
	io.Seeker
	io.Writer
	io.WriterAt

	Name() string
	Readdir(count int) ([]os.FileInfo, error)
	Readdirnames(n int) ([]string, error)
	Stat() (os.FileInfo, error)
	Sync() error
	Truncate(size int64) error
	WriteString(s string) (ret int, err error)
}

So I was wondering whether bbolt could switch to the same interface. It seems that *os.File already implements every method of this interface.

UPDATE: this is not actually easily possible, because bbolt uses Fd() function of the File and does some very specific os-dependent operations using the real file descriptor...

denisvmedia avatar Nov 16 '20 21:11 denisvmedia

It would be pretty simple to add support for MAP_ANONYMOUS (ie mmap not backed by a file).

Without changing the interfaces, this could be achieved by passing nil for *os.File and then putting guard statements in the appropriate places where the db.file.* methods are called.

In the case of a nil file, the correct FD to use is -1 and the flags should be syscall.MAP_PRIVATE | syscall.MAP_ANON. Other than that it should "just work" including calls to Mmap, Madvise etc, obviously Truncate and Sync wouldn't apply here.

https://man7.org/linux/man-pages/man2/mmap.2.html

@denisvmedia could you possibly replace afero with a call to shm_open?

missinglink avatar Dec 15 '20 22:12 missinglink

@missinglink thanks for your suggestion. It can work, but unfortunately it is not a universal/cross-platform solution.

denisvmedia avatar Dec 15 '20 22:12 denisvmedia

Agh, that's true, shm_open isn't portable. It would be nice to add support for in-memory databases, as SQLite does when the string ':memory:' is passed as the file path.

It would also make testing a lot easier, eg https://github.com/etcd-io/bbolt/blob/master/cmd/bbolt/main_test#L265-L277 could be replaced with code which doesn't touch the file system.

missinglink avatar Dec 15 '20 22:12 missinglink

Is this something people are interested in? I can open a PR to add it.

missinglink avatar Oct 02 '21 23:10 missinglink

@missinglink definitely, I still need this feature to use an in-memory data store. Please go ahead :)

jdevelop avatar Oct 03 '21 01:10 jdevelop

Okay I'll look into drafting a PR, it will need a bit of discussion about the specifics but I believe the use of MAP_ANONYMOUS will make the feature pretty much interoperable with the existing file-attached one without too many code changes.

I'm not familiar with Windows, I know it can work but will need additional testing there by someone with a Windows computer.

Worth mentioning that the title of this issue is a bit of a misnomer, as an in-memory database would always need to be writable due to its ephemeral nature (if you don't write to it, there is nothing to read).

missinglink avatar Oct 03 '21 16:10 missinglink

> I'm not familiar with Windows, I know it can work but will need additional testing there by someone with a Windows computer.

Actually it's possible to test directly on github using github actions. They support linux (ubuntu), mac and windows.

denisvmedia avatar Oct 03 '21 18:10 denisvmedia

+1 for making OpenFile work with something more generic like afero's File so it's possible to use bolt with their in-memory option

KastenMike avatar Jan 12 '22 12:01 KastenMike

What about providing a higher level interface to separate database level from filesystem level concerns?

This is certainly more work, but the result would be cleaner. Something like:

  • OpenFile.
  • CreateLockFile.
  • ReadLockFile.
  • etc...

And leaving the lower level OS specifics to concrete implementations. This would allow an in memory implementation without having to provide low level primitives such as Fd(). The higher level should only care about opening, updating and reading files and locking / unlocking them.

Caveat: I haven't actually tried this approach yet, so it might be infeasible.
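One way to picture such a split is sketched below. All names here (`store`, `memStore`) are hypothetical and do not exist in bbolt; the interface covers only reading, writing, and locking, and the in-memory implementation uses a mutex in place of file locks, so it never needs a file descriptor.

```go
package main

import (
	"errors"
	"fmt"
	"io"
	"sync"
)

// store is a hypothetical database-level abstraction. None of these
// names exist in bbolt.
type store interface {
	io.ReaderAt
	io.WriterAt
	Lock()   // exclusive lock; a file-based version might use flock
	Unlock() // releases the lock taken by Lock
}

// memStore keeps all bytes on the heap.
type memStore struct {
	mu   sync.Mutex
	data []byte
}

func (m *memStore) Lock()   { m.mu.Lock() }
func (m *memStore) Unlock() { m.mu.Unlock() }

// ReadAt copies from the buffer and reports io.EOF on a short read.
func (m *memStore) ReadAt(p []byte, off int64) (int, error) {
	if off < 0 {
		return 0, errors.New("negative offset")
	}
	if off >= int64(len(m.data)) {
		return 0, io.EOF
	}
	n := copy(p, m.data[off:])
	if n < len(p) {
		return n, io.EOF
	}
	return n, nil
}

// WriteAt grows the buffer with zero bytes when writing past the end.
func (m *memStore) WriteAt(p []byte, off int64) (int, error) {
	if off < 0 {
		return 0, errors.New("negative offset")
	}
	if end := int(off) + len(p); end > len(m.data) {
		m.data = append(m.data, make([]byte, end-len(m.data))...)
	}
	return copy(m.data[off:], p), nil
}

func main() {
	var s store = &memStore{}
	s.Lock()
	defer s.Unlock()

	if _, err := s.WriteAt([]byte("meta"), 16); err != nil {
		panic(err)
	}
	buf := make([]byte, 4)
	if _, err := s.ReadAt(buf, 16); err != nil {
		panic(err)
	}
	fmt.Println(string(buf))
}
```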

mbrt avatar Jan 26 '22 08:01 mbrt

I had another look at this today and made some progress with a new interface called Backing.

The concrete implementation can be either FileBacking or MemoryBacking. These correspond to the two mmap modes MAP_FILE and MAP_ANONYMOUS.

It's not as simple as I would have hoped simply because I think MAP_ANONYMOUS wasn't considered during the initial development, so adding it now without changing method signatures is tricky, but not impossible.

I also had a quick look at afero, it's not suitable as a replacement since it simply stores the data in a go map on the heap rather than using the mmap syscall API: https://github.com/spf13/afero/blob/master/memmap.go#L32

missinglink avatar Jan 26 '22 13:01 missinglink

type backing interface {
	Fd() int
	Fsync() error
	Truncate(int64) error
	Sync() error
	Flock(bool, time.Duration) error
	File() *os.File
	Open(string, os.FileMode, *Options) error
	Close() error
	ShouldInit() (bool, error)
}

type memoryBacking struct{}

func (mb memoryBacking) Fd() int                                        { return -1 }
func (mb memoryBacking) Fsync() error                                   { return nil }
func (mb memoryBacking) Truncate(size int64) error                      { return nil }
func (mb memoryBacking) Sync() error                                    { return nil }
func (mb memoryBacking) Flock(e bool, t time.Duration) error            { return nil }
func (mb memoryBacking) File() *os.File                                 { return nil }
func (mb memoryBacking) Open(p string, m os.FileMode, o *Options) error { return nil }
func (mb memoryBacking) Close() error                                   { return nil }
func (mb memoryBacking) ShouldInit() (bool, error)                      { return true, nil }

type fileBacking struct {
	db   *DB
	file *os.File
}

func (fb fileBacking) Fd() int                             { return int(fb.file.Fd()) }
func (fb fileBacking) Fsync() error                        { return fdatasync(fb.db) }
func (fb fileBacking) Truncate(size int64) error           { return fb.file.Truncate(size) }
func (fb fileBacking) Sync() error                         { return fb.file.Sync() }
func (fb fileBacking) Flock(e bool, t time.Duration) error { return flock(fb.db, e, t) }
func (fb fileBacking) File() *os.File                      { return fb.file }

plus implementations of <Open> and <Close> which are more verbose

missinglink avatar Jan 26 '22 13:01 missinglink

Well, one of the use cases for a potential in-memory storage is unit tests. For those we would probably not need any syscall...

denisvmedia avatar Jan 26 '22 13:01 denisvmedia

> Well, one of the use cases for in-memory storage is unit tests. For those we would probably not need any syscall...

Yeah that's true, but people familiar with the mmap syscall API would assume that you could map a segment of memory larger than available RAM and that the OS would transparently handle this for you, this isn't the case for a solution where the bytes are held in heap memory in-process.

Particularly for testing it would be better to use the same storage engine since otherwise we wouldn't be able to guarantee that the behaviour which was tested would be exactly the same using the native mmap API in production.

missinglink avatar Jan 26 '22 13:01 missinglink

> Particularly for testing it would be better to use the same storage engine since otherwise we wouldn't be able to guarantee that the behaviour which was tested would be exactly the same using the native mmap API in production.

In general, yes, but if the app relies on (in other words trusts) the behavior of BoltDB, it probably can assume it won't be 100% accurate in the tests. Anyway, I get your point, and your arguments make sense as well.

Another concern here would be - is it actually cross-platform? Will it run on Windows in your proposal?

denisvmedia avatar Jan 26 '22 14:01 denisvmedia

> Another concern here would be - is it actually cross-platform? Will it run on Windows in your proposal?

Yes, the Windows equivalent of mmap is called MapViewOfFile and has an equivalent mode to MAP_ANONYMOUS.

missinglink avatar Jan 26 '22 14:01 missinglink

> Particularly for testing it would be better to use the same storage engine since otherwise we wouldn't be able to guarantee that the behaviour which was tested would be exactly the same using the native mmap API in production.

Yeah true. My only concern would be that the interface could be overly restrictive / unnecessarily complicated for non-test use cases. For example, I was toying with the idea of using cloud storage as a backing store [I know it sounds like a stupid idea :)]. The mmap solution wouldn't work in that case and also seems unnatural for the OP's use case.

I'm also worried about the leaky abstraction of Fd(), where you return -1 and expect the caller to know what to do about it. It's oftentimes a bad sign.

Anyway, just my 2 cents.

mbrt avatar Jan 26 '22 17:01 mbrt

Agh, I see, there are two different approaches here.

I think you're talking about providing a storage adapter pattern which would allow persistence to arbitrary 'file-like things'.

What I'm talking about is just extending the existing implementation to allow different flags and therefore not require the database be backed by an actual file.

The -1 convention for the fd is inherited from Unix IIRC although it doesn't fit in a uintptr, so maybe I'm wrong about that, I forget where I read it, in a man page somewhere.

The thing about general-purpose storage engines is that their backing needs to be a 'block device' or imitate one, so it needs to read/write 4 KB page blocks.

The mmap API is ideal for this since it transparently caches pages in the filesystem cache, meaning that pages are often in RAM rather than on disk, which is a huge performance benefit that isn't available when using other storage engines.

There are some b-tree operations that can be quite heavy on the pager such as balance/merge operations, I suspect these would be very slow using something like HTTP range requests and PUT requests.

I think the two concerns are actually better implemented at different levels. The mmap block storage adapter can be modified to support anonymous backings while separately the concept of a block storage adapter can also be converted to an adapter pattern.

Sounds like a lot of work, and considering how hard it is to get a typo fix merged on this repo I don't see it being worth the effort 🤷‍♂️

missinglink avatar Jan 26 '22 18:01 missinglink

> I think the two concerns are actually better implemented at different levels.

Fair point.

mbrt avatar Jan 26 '22 20:01 mbrt

I found this commit which seems to implement exactly this functionality:

https://github.com/boltdb/bolt/commit/ed31a3bd0058b926bb17cae9293a8af6e6f1c066

missinglink avatar Mar 05 '22 17:03 missinglink

#320 is adapted from commit https://github.com/boltdb/bolt/commit/ed31a3bd0058b926bb17cae9293a8af6e6f1c066 and adds a simulation test; it may need more tests.

This project needs CI 😭

0x0177b11f avatar Mar 08 '22 10:03 0x0177b11f

> Would be nice to have that openFile func(string, int, os.FileMode) (*os.File, error) function return something like io.ReadWriteSeeker instead of *os.File which is a pointer to a struct. That way it can be easily configured to work with any random-access storage type ( files, byte slices etc )

This would be great. I am interested in this so I can create an overlay filesystem that encrypts the database before writing it to the underlying filesystem.

benma avatar May 02 '23 01:05 benma

Is anyone assigned to this?

Elbehery avatar Dec 19 '23 14:12 Elbehery

I would like to see this feature as well mainly for speeding up tests. If no one else is willing to take on the task I could attempt it, though IDK much about writing in-memory stuff.

SaintWish avatar Jan 31 '24 02:01 SaintWish