[Feature] Support Postgres as DB backend
Stash stores all data in a SQLite database. Although SQLite is great and is the best option for 90% or more of Stash users, supporting external DB servers such as Postgres would be great.
Postgres:
- Is more reliable
- Has better tooling for operations (incl. backups and restore)
- Can be hosted on a separate node (better for K8s users)
- Doesn't fail spectacularly if the data is on a NFS/SMB drive like SQLite
- Can offer better performance.
At a quick glance, it appears that supporting Postgres shouldn't be too complex in stash. We should be able to add that as an option (alternative to SQLite).
Only thing that is an open question is what to do with the built-in backup feature of stash, since it's currently implemented so it does a VACUUM into a separate SQLite database. This is obviously not an option. My suggestion would be to disable built-in backups when using a DB server, since backups should be performed in the server.
I'd throw bounty cash at this if it also included mysql, I get that postgres is more popular right now.
IRT Backups, why not a giant JSON dump? JSON would be SQL agnostic.
@trollboy you mean like the Export task?
@MrX292 yeah. Exporting/backing up to JSON would be RDMS agnostic.
After investigating the main hindrance seems to be the custom functions registered to sqlite. https://github.com/stashapp/stash/blob/7086109d7879b64529578e3e16dcafbb8b57e842/pkg/sqlite/driver.go#L27-L39 To do this in postgres you would need to register a binary to the service with said functions, which could be possible but would require some careful considerations about how to do it.
It might also require some deep dives into specific sqlite and postgres macros like current date etc.
EDIT2: So after some investigation we need:
- Alternative mitigations (AUTOINCREMENT/integer, datetime => timestamp, etc. needs replacing)
- Create a Postgresql binary that registers the missing commands (not too hard, but i dont know that the project wants to maintain that).
- Specify database type and connection string in stash config.
EDIT3: So i added a bunch of switches for db type, i think the strategy currently is to create premigrations for creating tables such that i can select what type of index it is pr db type ("integer not null primary key autoincrement", vs "serial not null primary key")
Couldn't this be handled by using an abstraction layer?
Doesn't fail spectacularly if the data is on a NFS/SMB drive like SQLite
Just happened to me on my Unraid server. Wishing I'd stored in the dbms I'm already running on the same server :(
I would also throw a small bounty, like maybe 100+ usd, for support for using either MySQL or Postgres as the backend. I wouldn't need those settings exposed in the settings UI, hand-editing config.yml is fine, or automating the setup of the binary service mentioned above, basic instructions (community-sourced maybe?) for how that might be done and user has to set it up themselves.
I know it would require abstracting parts of the system that currently aren't abstracted like the regex, phash_distance, etc functions mentioned above, so it wouldn't be trivial... but the performance gains for very large databases would be awesome..
I've never done bounties before so I'm not sure how that works, but I give small amounts to all kinds of things.. $60/mo patreon budget, monthly donations to Wikipedia, Internet Archive...
I have create some preliminary work on a branch here: https://github.com/NodudeWasTaken/stash/tree/postgres-support Its NOT FUNCTIONAL DONT USE IT.
- There are many bugs, mostly around the fact that postgres doesn't support unnamed parameters like ?, but needs $1, $2 etc. (alot of rewriting of SQL is needed). goqu fixes this automatically, but there is alot of manually written SQL that needs to be changed.
- I removed lastinsertid since postgres doesnt support it, so we use "RETURNING id" instead.
- Im gonna need some abstractions, especially around the database commands like backup etc.
- There is some bug around covers which i haven't grasped yet. (FIXED) After investigating it seems to stem from some change i did that no longer automatically converts empty string to null, and it relies on null.
- In Sqlite int is a variable length of upto 8 bytes, but in postgres its fixed 4 bytes, so some things need to be bigint (or numeric maybe).
- I used bytea instead of blob (because blob doesnt exist in postgres), but some fingerprints look like bigint's and gives an sql error, that needs to be fixed.
- I removed the selection of distinct id's, as any "values" you mess with (order by etc.) need to be selected in postgres.
- The missing custom functions are still missing.
- I was lazy with the migrations, so we have one big one and it bugs out on first boot saying "found 2 migrations, expected 68" or something.
- There is some weird bug with transactions which i hotfixed by disabling exclusive transactions for postgres.
- We have some functionality around writeable db connections and "disableForeignKeys" which i dont yet know how to do in postgres.
There are probably more bugs which i haven't found. Any help would be appreciated 😊