redb
Include table stats in `DatabaseStats`
I'm auditing space usage in the `ord` index, and having information about individual tables would be super useful. In particular, the total size of each table in bytes, and how many keys it holds.
Also, perhaps `DatabaseStats` could check for space leaks, i.e. check that the number of pages reachable from current tables equals the number of allocated pages, and include a statistic reporting the result. The ordinals index is now a whopping 117G (last I checked it was 50G), so I'm curious if there's some kind of leak there.
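A minimal sketch of the leak check described above, assuming the allocated pages and the parent-to-child page links are already available as plain collections. `PageId`, `leaked_pages`, and the map layout are hypothetical and do not reflect redb's internals; the idea is only to walk every page reachable from the table roots and report allocated pages that were never visited.

```rust
use std::collections::{HashMap, HashSet, VecDeque};

/// Hypothetical page identifier; redb's real page numbering is not modeled here.
type PageId = u64;

/// Returns, in ascending order, the allocated pages that cannot be reached
/// from any of the given table roots.
fn leaked_pages(
    allocated: &HashSet<PageId>,
    roots: &[PageId],
    children: &HashMap<PageId, Vec<PageId>>,
) -> Vec<PageId> {
    let mut reachable: HashSet<PageId> = HashSet::new();
    let mut queue: VecDeque<PageId> = roots.iter().copied().collect();

    // Breadth-first walk over the page graph, starting at the table roots.
    while let Some(page) = queue.pop_front() {
        if !reachable.insert(page) {
            continue; // already visited
        }
        if let Some(kids) = children.get(&page) {
            queue.extend(kids.iter().copied());
        }
    }

    // Anything allocated but never visited is a candidate leak.
    let mut leaked: Vec<PageId> = allocated.difference(&reachable).copied().collect();
    leaked.sort_unstable();
    leaked
}
```

An empty result means every allocated page is accounted for; a non-empty one lists the pages worth investigating.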
Would a separate method on `Table` to retrieve just that table's stats work for you? I'd rather not add it to `DatabaseStats`, because then its size in memory would be unbounded.
Ya, a leak check makes sense. I've been planning to add a `fsck()` method that would also verify checksum integrity.
> Would a separate method on `Table` to retrieve just that table's stats work for you? I'd rather not add it to `DatabaseStats`, because then its size in memory would be unbounded.
Sort of! The issue is then, I don't have a way of getting information on all tables in the database. I know what tables I think are in the database, and I can open them one by one and get stats about them, but I also want to see if there are any tables in the database I don't know about, and get information on those.
`ReadTransaction::list_tables` exists, but it returns strings, which can't be used to open tables. So I could see if there were other tables in the database that I didn't have definitions for, but I couldn't get definitions for them to open them and get the stats.
> Ya, a leak check makes sense. I've been planning to add a `fsck()` method that would also verify checksum integrity.
That would be rad. A fsck that reported any problems/leaks it encountered would be sweet.
Hmm, ya I see. How would you end up with a table that you didn't know about? It's not exactly pretty, but to find out the type of a table you can just open it with two random types and you'll get an error message telling you the types: https://github.com/cberner/redb/blob/754e3616827404283305cbdbb3154be6071f12bf/src/tree_store/table_tree.rs#L344
I think it could actually happen pretty easily. For example, we release a version which changes the name of a table, but forget to remove the table with the old name from the database.
One idea is to have a separate `Database::table_stats` function which returns either a `Vec<TableStats>` or an `Iterator<Item = TableStats>`.
Hmm, ok lemme think this over. Perhaps I can change `list_tables` to return structs that have the name, and also allow you to fetch the stats.
Alternately, what if I change `delete_table` to take a `&str` instead of needing a `TableDefinition`? That way you could list all the tables, diff against the expected tables, and delete any unexpected ones without needing to know their type.
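The "list, then diff against the expected tables" step could be sketched as below. This is only an illustration on plain strings: `unexpected_tables` is a hypothetical helper, the table names in the usage note are made up, and no redb calls are involved. It assumes the names have already been collected from something like `list_tables`.

```rust
use std::collections::BTreeSet;

/// Returns the names that were found in the database but are not in the
/// expected list, sorted and de-duplicated.
fn unexpected_tables<'a>(found: &'a [String], expected: &[&str]) -> Vec<&'a str> {
    let expected: BTreeSet<&str> = expected.iter().copied().collect();

    let mut extra: Vec<&'a str> = found
        .iter()
        .map(String::as_str)
        .filter(|name| !expected.contains(*name))
        .collect();

    extra.sort_unstable();
    extra.dedup();
    extra
}
```

For example, if the database reports `heights`, `ordinals`, and `ordinals_old` while the application only knows about the first two, the helper returns `ordinals_old`, which is the renamed-but-not-removed case discussed in this thread.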
> Hmm, ok lemme think this over. Perhaps I can change `list_tables` to return structs that have the name, and also allow you to fetch the stats.
That sounds good.
> Alternately, what if I change `delete_table` to take a `&str` instead of needing a `TableDefinition`? That way you could list all the tables, diff against the expected tables, and delete any unexpected ones without needing to know their type.
That would be useful for cleaning up unexpected tables, but if we have unexpected tables, I'd also like to have stats (type, size, etc.) to try to figure out what they are and what they contain, instead of just blindly deleting them.
@casey I'm finally getting around to this issue. Do you still need this API? I fixed the second issue (https://github.com/cberner/redb/pull/542), and I could add an API like `open_table_untyped()` which takes an `impl TableHandle`. That would return an `UntypedTable` instead of a `Table`, which would only let you call APIs (like `.stats()`) that don't need to know the K & V types.
Documenting my proposed API, but am going to close this for now.
```rust
struct UntypedTable;

impl UntypedTable {
    pub fn stats(&self) -> Result<TableStats> {
        ...
    }
}

impl ReadOnlyTransaction {
    pub fn open_table_untyped(&self, handle: impl TableHandle) -> Result<UntypedTable> {
        ...
    }
}
```
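To make the shape of that proposal concrete, here is a self-contained mock, not redb code. Everything in it is a stand-in: the `TableHandle` trait is reduced to a name lookup, `MockTransaction` keeps stats in an in-memory map, the `TableStats` fields are invented, and errors are plain strings. It only illustrates how a name-only handle could yield an object that exposes stats without knowing the key and value types.

```rust
use std::collections::BTreeMap;

/// Invented stand-in for per-table statistics.
#[derive(Debug, Clone, Copy, PartialEq)]
struct TableStats {
    entries: u64,
    stored_bytes: u64,
}

/// Stand-in for a handle that identifies a table by name only.
trait TableHandle {
    fn name(&self) -> &str;
}

impl TableHandle for &str {
    fn name(&self) -> &str {
        self
    }
}

/// Exposes only operations that do not depend on the K and V types.
struct UntypedTable {
    stats: TableStats,
}

impl UntypedTable {
    fn stats(&self) -> Result<TableStats, String> {
        Ok(self.stats)
    }
}

/// In-memory mock of a read transaction.
struct MockTransaction {
    tables: BTreeMap<String, TableStats>,
}

impl MockTransaction {
    fn open_table_untyped(&self, handle: impl TableHandle) -> Result<UntypedTable, String> {
        self.tables
            .get(handle.name())
            .map(|stats| UntypedTable { stats: *stats })
            .ok_or_else(|| format!("no table named {}", handle.name()))
    }
}
```

With this shape, a caller could loop over the listed table names, open each one untyped, and print its stats, including tables it has no definition for.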
Sorry for not following up on this! Yah, this would be great. I just updated the `ord index info` command to return table info, and it's pretty heinous.
Also, the `TableStats` type isn't showing up in the docs. For example, here the `TableStats` type is showing up in grey. Kind of odd. I think maybe the type is `pub` but it isn't exported from the crate.
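The "`pub` but not exported" situation can be reproduced in a few lines of plain Rust. This is a generic illustration, not redb's actual module layout: the `database` and `stats` modules and the `entries` and `stored_bytes` fields are all made up. A `pub` struct inside a private module can appear in a public function signature, yet outside code cannot name the type unless it is re-exported.

```rust
mod database {
    // Private module: its items are unreachable by path from outside
    // `database` unless they are re-exported.
    mod stats {
        #[derive(Debug, PartialEq)]
        pub struct TableStats {
            pub entries: u64,
            pub stored_bytes: u64,
        }
    }

    // Uncommenting this re-export makes the type nameable as
    // `database::TableStats` from outside the module.
    // pub use stats::TableStats;

    /// Public function whose return type lives in the private module.
    pub fn table_stats() -> stats::TableStats {
        stats::TableStats {
            entries: 3,
            stored_bytes: 120,
        }
    }
}
```

Callers can still use the value returned by `database::table_stats()` and read its public fields, but writing `database::stats::TableStats` outside the module fails to compile because `stats` is private. A missing re-export of this kind would be consistent with the type showing up unlinked in generated docs.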
Np! Can you give the linked PR a try?