ergo icon indicating copy to clipboard operation
ergo copied to clipboard

alternative datastores (for history and HA)

Open slingamn opened this issue 6 years ago • 10 comments

  • We need to support a highly available database (to store application data) for #343
  • We need to support a database that can store logs for #348

These do not have to be the same database. We should continue to support buntdb for application data, so as to have a batteries-included way of spinning up a single server. So we should have some kind of abstraction layer covering the database --- probably at the macroscopic operation level ("persist this channel"), but maybe at a lower level ("store this value under this key").

Some things to think about:

  1. To what extent are we relying on ACID and the ability to iterate over keys (buntdb features that may not be available in another KV store)?
  2. Should we think about a more document-oriented approach (e.g., each channel is a single JSON blob)?
  3. Should we revisit the relationship between the datastore and its on-heap caches? Right now, some read-only operations (e.g., logging into an account) read from the datastore. Others (changing nicks) are covered by on-heap caches. (Right now, joining a new channel does a datastore lookup to see if the channel is registered; we should probably add an on-heap cache with all the registered channel names.)
  4. MySQL has a pure-Go driver: https://github.com/go-sql-driver/mysql
  5. So does Redis: https://github.com/go-redis/redis

slingamn avatar Feb 07 '19 21:02 slingamn

  1. We rely on ACID pretty heavily, and right now at least iterating over keys is important in a few areas. I'd be open to trying different ways of handling these operations though, maybe? Prolly worth thinking about this hard as it relates to #343
  2. Totally for a more document-oriented approach. goshubnc, the project I made after this one, uses document-based db storage and it's a lot easier to work with.
  3. I think this is mostly two questions: What needs to be persisted? (if persistence is required, db for sure). I wouldn't be against some more heap caches, but if we do wanna approach #343 then we'll need to think carefully about it.

Redis and MySQL both sound cool. Which one we go with for the primary store is just personal preference it feels like. One thing I really like about bunt is the general interface, feels nice and slick and simple to work with.

DanielOaks avatar Feb 07 '19 22:02 DanielOaks

Cassandra has a table abstraction (so it's possible to iterate over every row of a table) and a Go driver: https://github.com/gocql/gocql

slingamn avatar Feb 08 '19 08:02 slingamn

Cassandra can do compare-and-swap for documents via "lightweight transactions": https://docs.datastax.com/en/cql/3.3/cql/cql_using/useInsertLWT.html

Here's some skepticism about whether Cassandra can be operated successfully with small numbers of nodes: https://stackoverflow.com/a/27779449

although it actually sounds to me like it's perfectly viable to do, e.g., a 2-node cluster with "RF=2 (all data replicated), write CL=2 and read CL =1 [...] if a node is down, you can only read but not write"

slingamn avatar Feb 10 '19 20:02 slingamn

Cassandra may not be a good fit for repeated update of documents (updates are implemented as setting a "tombstone" flag to invalidate the old row, then creating a new one; compaction runs later to clean up the old rows): http://cassandra.apache.org/doc/4.0/operating/compaction.html

slingamn avatar Mar 12 '19 07:03 slingamn

Update: I'm leaning away from Cassandra for this because of reports that it scales down poorly (as in, it's hard to run a Cassandra node with less than a gigabyte of RAM available).

slingamn avatar Dec 16 '19 20:12 slingamn

If we do anything here, we're doing MySQL (same as history, although we would allow the use of a separate logical database). But this is no longer a very high priority because it no longer blocks HA (buntdb can be made HA using k8s volumes).

slingamn avatar Mar 02 '20 21:03 slingamn

Just wanted to put forward that sqlite would probably be a good alternative to mysql, its a lot-less resource intensive, and sqlite3 driver for go already exists.

sqlite has a similiar structure and tools exist to convert mysql dumps to sqlite already, so existing users could easily convert existing databases to sqlite.

EndlessEden avatar Aug 08 '24 06:08 EndlessEden

For ~20 years I've been running ircd-irc2 on 1 host for ~1 channel for ~5 people, no chanserv/nickserv, and supybot (now limnoria) purely to make a local text/plain log of what was said. I'm just trying ergochat today. If sqlite was a supported backend, I would definitely at least try it. Rolling out MariaDB just for persistent history feels overkill and stays in the "too hard basket" for me.

trentbuck avatar Sep 22 '25 19:09 trentbuck

I would consider a sqlite backend that is compiled out in the default build (the main problem with sqlite is that it requires cgo).

slingamn avatar Sep 26 '25 15:09 slingamn

I would consider a sqlite backend that is compiled out in the default build (the main problem with sqlite is that it requires cgo).

I've been using this package, modernc.org/sqlite, to add SQLite to my programs. It's a CGo-free port of SQLite3.

It's slightly slower than the original C code, but it works perfectly in my projects and is compatible with the standard Go SQL package.

It might be worth checking.

crgimenes avatar Sep 27 '25 08:09 crgimenes