sqlite-diffable
Ability to round-trip binary data
e.g. for the binary numbits column in the .coverage SQLite database generated by coveragepy.
Those currently end up represented like this:
[4, 1, "b'\\xfe\\xff\\xfd{\\xe0\\x02\\x10\\x00W}o\\xdb{\\xef}o\\xef\\xbd\\xf7\\x92\\xe8\\x00\\x00\\xca\\t\\xe0\\xfb\\xdf\\x07y\\xdb\\xbe\\xf3\\x97s\\xd7\\xd8\\xeb\\x06\\xd9Y\\x16A\\x17\\xe6\\x02\\x02 @\\x08\\x10\\x00\\xbcH\\xc1$@\\xf7}?\\x01\\x04 \\x00\\x00\\x00\\x00\\x04%\\x00\\x04\\x00\\x00\\x00\\x00\\x00<\\x17H\\x00\\x00\\x12 \\xe9\\xc8\\x08\\x00\\x00\\x00\\x00\\x00\\x00@\\x00\\x00\\x00\\xd4M\\xb5\\x18\\x00w\\xd7\\xdd\\xdd\\xb6m\\xba\\xa9\\xe0\\xa7\\xf3Z\\x82\\xfbN\\x00\\x00\\x00\\x00\\x00\\x00\\x00\\x02\\x00\\x00\\x00\\x00\\x00$`\\x00\\x04'"]
Once I implement the load command (#3) these will be a problem, because they won't round-trip correctly.
I need some kind of special-case syntax for storing binary values such that they can be round-tripped properly.
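To make the problem and one possible special-case syntax concrete, here is a minimal sketch. The `$hex` marker key and the `encode_value` / `decode_value` helpers are hypothetical names made up for illustration; they are not part of sqlite-diffable. The sketch also assumes the current output comes from a `str()` fallback on values JSON cannot encode.

```python
import json

HEX_TAG = "$hex"  # hypothetical marker key, chosen only for this sketch


def encode_value(value):
    """Wrap bytes in a tagged object so JSON can carry them losslessly."""
    if isinstance(value, bytes):
        return {HEX_TAG: value.hex()}
    return value


def decode_value(value):
    """Reverse encode_value: turn a tagged object back into bytes."""
    if isinstance(value, dict) and set(value) == {HEX_TAG}:
        return bytes.fromhex(value[HEX_TAG])
    return value


row = (4, 1, b"\xfe\xff\xfd{\xe0")

# Tagged encoding: the blob survives a dump/load cycle.
line = json.dumps([encode_value(v) for v in row])
restored = tuple(decode_value(v) for v in json.loads(line))
assert restored == row

# For contrast, a str() fallback (assumed here) loads back as text, not bytes.
lossy = json.loads(json.dumps([4, 1, str(row[2])]))
assert isinstance(lossy[2], str)
```

A tagged object keeps each row self-describing, at the cost of slightly noisier diffs for binary columns.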
The .metadata.json file may be the place to do this. Right now the accompanying line_bits.metadata.json for the above table looks like this:
{
"name": "line_bits",
"columns": [
"file_id",
"context_id",
"numbits"
],
"schema": "CREATE TABLE line_bits (\n -- If recording lines, a row per context per file executed.\n -- All of the line numbers for that file/context are in one numbits.\n file_id integer, -- foreign key to `file`.\n context_id integer, -- foreign key to `context`.\n numbits blob, -- see the numbits functions in coverage.numbits\n foreign key (file_id) references file (id),\n foreign key (context_id) references context (id),\n unique (file_id, context_id)\n)"
}
I could use this to say "the third column is binary, so treat it as such" somehow.
Maybe columns could store type information:
"columns": [
["file_id", "integer"],
["context_id", "integer"],
["numbits", "blob"]
]
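A rough sketch of how typed columns could drive the encoding, assuming blobs are written as hex strings. `dump_row` and `load_row` are hypothetical helpers, not existing sqlite-diffable functions, and the metadata shape follows the proposal above rather than any released format.

```python
import json

# Metadata in the proposed shape: each column is a [name, type] pair.
metadata = {
    "name": "line_bits",
    "columns": [
        ["file_id", "integer"],
        ["context_id", "integer"],
        ["numbits", "blob"],
    ],
}


def dump_row(row, columns):
    """Serialize one row to a JSON line, hex-encoding bytes in blob columns."""
    encoded = [
        value.hex() if col_type == "blob" and isinstance(value, bytes) else value
        for value, (_name, col_type) in zip(row, columns)
    ]
    return json.dumps(encoded)


def load_row(line, columns):
    """Parse a JSON line, decoding hex strings found in blob columns."""
    return tuple(
        bytes.fromhex(value) if col_type == "blob" and isinstance(value, str) else value
        for value, (_name, col_type) in zip(json.loads(line), columns)
    )


row = (1, 1, b"\x0e")
line = dump_row(row, metadata["columns"])
assert load_row(line, metadata["columns"]) == row
```

One limitation of this approach: SQLite types are per value, not per column, so a text value stored in a column declared as blob would be ambiguous on load. A per-value marker avoids that at the cost of more verbose rows.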
Here's how sqlite3 .coverage .dump outputs this data:
INSERT INTO line_bits VALUES(1,1,X'0e');
INSERT INTO line_bits VALUES(2,1,X'5a');
INSERT INTO line_bits VALUES(3,1,X'36218410420821841042');
INSERT INTO line_bits VALUES(4,1,X'fefffd7be0021000577d6fdb7bef7d6fefbdf792e80000ca09e0fbdf0779dbbef39773d7d8eb06d959164117e602022040081000bc48c12440f77d3f010420000000000425000400000000003c174800001220e9c80800000000000040000000d44db5180077d7ddddb66dbaa9e0a7f35a82fb4e0000000000000002000000000024600004');
I can accompany this with a parametrized test that covers all of the other SQLite types as well.
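As a starting point for such a test, the case table might look like the sketch below. Since the load command does not exist yet, `roundtrip` here only goes through an in-memory SQLite database using the standard library; the dump and load steps would slot in between the insert and the read-back. The same table could be fed to `pytest.mark.parametrize`.

```python
import sqlite3

# One case per SQLite storage class: (expected typeof() result, Python value).
CASES = [
    ("integer", 42),
    ("real", 3.5),
    ("text", "hello"),
    ("blob", b"\x0e\x00\xff"),
    ("null", None),
]


def roundtrip(value):
    """Store a value in an untyped column and read it back with its type."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE t (v)")
    conn.execute("INSERT INTO t VALUES (?)", (value,))
    stored, type_name = conn.execute("SELECT v, typeof(v) FROM t").fetchone()
    conn.close()
    return stored, type_name


for expected_type, value in CASES:
    stored, type_name = roundtrip(value)
    # The value must come back equal and with the same Python type.
    assert stored == value and type(stored) is type(value)
    assert type_name == expected_type
```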
Hello @simonw -- I love this project, thanks for making it happen. Is this issue essentially tracking the attempt to properly make a load command, especially to handle dump-and-load of binary data stored in sqlite?
My main usage goal: enable more-efficient git a) storage and b) diff-ability of sqlite databases. (Yes, git-based.)
Additional questions:
- Are there alternatives to sqlite-diffable other than https://stackoverflow.com/a/21789167/605356 ?
- Is there anything I can do to help implement a load command?
Additional reference (for my sake): https://news.ycombinator.com/item?id=25004913
FYI, the following is my environment's data after installing sqlite-diffable today:
$ sqlite-diffable --version
sqlite-diffable, version 0.2.1
$
$ sqlite-diffable --help
Usage: sqlite-diffable [OPTIONS] COMMAND [ARGS]...
Tools for dumping/loading a SQLite database to diffable directory structure
Options:
--version Show the version and exit.
--help Show this message and exit.
Commands:
dump
$
$ sw_vers
ProductName: Mac OS X
ProductVersion: 10.15.7
BuildVersion: 19H1713
$
$ date
Wed Feb 23 21:41:32 CST 2022
$
$ sqlite-diffable load
Usage: sqlite-diffable [OPTIONS] COMMAND [ARGS]...
Try 'sqlite-diffable --help' for help.
Error: No such command 'load'.
$
Checking in - any update(s) on this topic/issue/discussion? ( @simonw )