rusqlite_migration
rusqlite_migration copied to clipboard
Suggestion: make `fn Migrations::max_schema_version` public
fn Migrations::max_schema_version(&self) is a currently private method that returns the number of available migrations wrapped in SchemaVersion.
I would like access to this value in my own code. I'm using Migrations::from_directory() and include_dir! to construct my Migrations, so getting the count myself is non-trivial AFAICT.
My use case is to work out whether any migrations will be run by fn Migrations::to_latest() before running it, and if so to take a backup first.
I created a simple PR for this: #171.
Thanks for opening this issue and this MR, I'll try to take a deeper look this weekend.
I think that your use case makes sense and is within the scope of what this library should enable.
That’s said, I’m unsure exposing this method is the most ergonomic option for users. The user needs quite a good mental model of how the crate works. What if instead we exposed either:
- a
needs_to_runboolean method onMigrationsthat tells the user whether any migration needs to run? It would be simpler in that users wouldn’t need to compare the version currently applied with the number of migrations, but it could lead to suboptimal uses of the API like this:
This is suboptimal because, as you know,if migrations.needs_to_run() { migrations.to_latest() }to_latestalready performs this check. - a way to run a hook before executing any migration, just like we have a hook at the level of an individual migration. It could something look like this:
I think the additional safety here is that it's much harder to forget to apply the backup hook in some code paths, if the migrations were applied in multiple code paths. And we could run the hook with the other tests we perform on migrations inlet mut migrations = Migrations::new(vec![ // ... ]); migrations.pre_hook(|tx| // Perform the backup, using tx as the connection to the database (maybe with the VACUUM INTO pragma). ); fn main() { // open a connection migrations.to_latest(&mut conn) }Migrations.validate(), although that could be a bit of double edged sword.
As a side note, even if we follow one of the options above, we could still make Migrations::max_schema_version public, if there is a distinct and compelling use case for it.
What do you think @fluffysquirrels @czocher?
From what I understand, the main idea here is to be able to tell if there are any unapplied migrations i.e. the schema version is below the number of defined migrations.
In my opinion exposing max_schema_version is an okay idea for some use cases, but:
- The name is rather non-specific - doesn't immediately say what it does. Changing it would be necessary. Maybe something like
target_schema_version? - Exposing this is just a way to check for "unapplied" migrations with extra steps. Maybe having an additional method for that exact use-case would be better, i.e.
are_there_any_unapplied_migrations() -> bool(just a placeholder name of course ;))
What you propose with migration hooks, seems like something general and possibly useful, but I'd separate that to a new task/functionality.
Hi both, thanks for taking a look!
Re naming:
Migrations already has to_latest, so perhaps max_schema_version could become latest_schema_version or latest_version? enum SchemaVersion includes 2 variants -- NoneSet and Outside(_) -- that don't make sense for latest_version, so perhaps it should just return a usize?
You both suggest a method that returns a bool to skip a step for user, which I think is a good plan.
My suggestions for a name:
is_migration_requiredis_latest
@cljoly, yes, people may redundantly check is_latest, but the documentation can call out that this is not required.
Re migration hooks: because this particular use case is so simple to implement yourself (1 if statement likely run once per database loaded) and I think a closure is possibly more complicated for the user (due to possible extra lifetimes, see examples below), I don't think it's justified to add hooks to do this.
let mut conn = _;
let mut backup_state = _;
let migrations = _;
// With bool fn, with or without state
if migrations.is_latest(&mut conn)? {
backup(&mut conn, &mut backup_state)?;
migrations.to_latest(&mut conn)?;
}
// With closure hook, no state
migrations.add_pre_hook(|conn: &mut rusqlite::Connection|
backup(conn));
migrations.to_latest(&mut conn)?;
// With closure hook, callback needs extra shared state,
// `Migrations` is stored and re-used
// (including if `AsyncMigrations` is used!).
let backup_state = Arc::new(Mutex::new(backup_state));
migrations.add_pre_hook({
let backup_state = backup_state.clone();
move |conn: &mut Connection| {
backup(conn, &mut backup_state.lock())
}
});
migrations.to_latest(&mut conn)?;
I just pushed a few more commits to #171 with these ideas.
Let me know what you think!
Thanks both for your comments and the updated PR.
On the three options
After reading your comments, I think we should only expose the method returning the boolean (it covers nearly all use cases we could think of) and not one returning a usize. We should try to keep a simple API and having too many methods to chose from detracts from that goal. I agree that the closure:
- is more complicated to use (with the lifetimes) than just an
if, and - can be done later anyway.
On returning a usize instead of the internal SchemaVersion enum
I think it makes sense, a usize is much easier to grasp for a user. Even if we don’t introduce such a method at this time, or ever, it’s a good call out.
On the the name
I like is_latest better, it’s a nice symetry with to_latest and I think this looks nicer
if migrations.is_latest(&mut conn)? {
backup(&mut conn, &mut backup_state)?;
migrations.to_latest(&mut conn)?;
}
than this:
if migrations.is_migration_required(&mut conn)? {
backup(&mut conn, &mut backup_state)?;
migrations.to_latest(&mut conn)?;
}
Maybe writing a plural noun and a singular verb is a bit awkward though, so maybe something like
if migrations.all_applied(&mut conn)? {
backup(&mut conn, &mut backup_state)?;
migrations.to_latest(&mut conn)?;
}
would feel more natural?
Misc.
From what I understand, the main idea here is to be able to tell if there are any unapplied migrations i.e. the schema version is below the number of defined migrations.
Yes
yes, people may redundantly check is_latest, but the documentation can call out that this is not required.
Agreed
Digression: Avoiding the double comparison
Maybe that’s too complicated an API and not worth the cost, but when we do this:
if migrations.all_applied(&mut conn)? {
backup(&mut conn, &mut backup_state)?;
migrations.to_latest(&mut conn)?;
}
if all migrations were not applied, we effectively query the database twice to get the number of migrations already applied (the user version field). It’s admittedly a small cost, and we could reduce it by storing a prepared statement.
If we wanted to avoid it though, perhaps we could have an API similar to the Entry API for hash maps (btree maps have a similar API). It could look something like this:
if let Some(unapplied_migrations) = migrations.unapplied_migrations(&mut conn)? {
backup(&mut conn, &mut backup_state)?;
unapplied_migrations.apply()
}
unapplied_migrations would return a new struct UnappliedMigrations, holding the list of migrations that need applying. And then apply() would not check again the number in the database, it would apply the migrations straight.
Obviously, there is still a possibility that another process changes the user version and then migrations would be applied based on incorrect assumptions. If we opened a transaction or something to avoid that, we would likely constraint the backup process too much.
In terms of next steps, if I’m right when I think we have enough consensus here to implement the boolean method, we can discuss the suggestions I made in my last review of the PR, refine it a bit and release an alpha version of the library. The alpha will enable:
- you @fluffysquirrels to use the new API in your project
- others to use the API and give feedback
- allow us to break compatibility if required to get to a better API.
@fluffysquirrels do you still have time to work on this?
I'd like a method to get the latest schema version as a simple integer. My use case is to have an individual aggregated latest schema SQL for fresh initialization, so that we don't need chaining ALTER TABLE from ground up. The schema SQL also serve as a documentation. Otherwise, each table and columns are scattered over multiple up.sql and is hard to track by human.
@oxalica in my code I have a unit test that runs all migrations then writes out the full schema to a SQL file, which I then save in version control. That does for me what I think you're looking for.
I'm away from my laptop so I don't recall the implementation, but I think it's just using a built-in.
I think fetching the maximum version as an integer is likely to be useful further down the road for someone. Because of the implementation of the crate (using the user version value, an integer) I doubt we'll be moving away from an integer version any time soon without a complete rewrite and hence a major version change.
I think that your use case makes sense and is within the scope of what this library should enable.
That’s said, I’m unsure exposing this method is the most ergonomic option for users. The user needs quite a good mental model of how the crate works. What if instead we exposed either:
- a
needs_to_runboolean method onMigrationsthat tells the user whether any migration needs to run? It would be simpler in that users wouldn’t need to compare the version currently applied with the number of migrations, but it could lead to suboptimal uses of the API like this:This is suboptimal because, as you know,if migrations.needs_to_run() { migrations.to_latest() }to_latestalready performs this check.- a way to run a hook before executing any migration, just like we have a hook at the level of an individual migration. It could something look like this:
I think the additional safety here is that it's much harder to forget to apply the backup hook in some code paths, if the migrations were applied in multiple code paths. And we could run the hook with the other tests we perform on migrations inlet mut migrations = Migrations::new(vec![ // ... ]); migrations.pre_hook(|tx| // Perform the backup, using tx as the connection to the database (maybe with the VACUUM INTO pragma). ); fn main() { // open a connection migrations.to_latest(&mut conn) }Migrations.validate(), although that could be a bit of double edged sword.As a side note, even if we follow one of the options above, we could still make
Migrations::max_schema_versionpublic, if there is a distinct and compelling use case for it.What do you think @fluffysquirrels @czocher?
I also need access to max_schema_version. My use-cases is that I want to make transparent that migrations are pending, to go ve the user the choice of backing up the database manually. I would be fine with needs_to_run, but prefer to see the version difference somehow.
Using the hooks will not allow me to achieve my goal, so I don't think that's really a good option.
- The name is rather non-specific - doesn't immediately say what it does. Changing it would be necessary. Maybe something like
target_schema_version?
I think Max ist just fine, as it tells the dev what's the highest version you can target, but you may target a lower version than the max. Thus, I believe, target is less ideal than max.
Thanks for sharing your use case @danielniccoli! I have updated the review on https://github.com/cljoly/rusqlite_migration/pull/171. I think that PR requires a little bit of polishing and we can merge it. I’ll take a look when I get the chance but that might take a while.
I think that PR requires a little bit of polishing and we can merge it. I’ll take a look when I get the chance but that might take a while
I came back to it eventually, with another idea: https://github.com/cljoly/rusqlite_migration/pull/248. Please feel free to share feedback on the PR.
This is now released in beta (v2.0.0-beta.1), could you all please test and provide feedback (even just “it meets my needs” 😄) in the discussion for that release?