zuckdb.zig
zuckdb.zig copied to clipboard
A DuckDB driver for Zig
Zig driver for DuckDB.
Quick Example
const db = try zuckdb.DB.init(allocator, "/tmp/db.duck", .{});
defer db.deinit();
var conn = try db.conn();
defer conn.deinit();
// for insert/update/delete returns the # changed rows
// returns 0 for other statements
_ = try conn.exec("create table users(id int)", .{});
var rows = try conn.query("select * from users", .{});
defer rows.deinit();
while (try rows.next()) |row| {
// get the 0th column of the current row
const id = row.get(i32, 0);
std.debug.print("The id is: {d}", .{id});
}
Any non-primitive value that you get from the row
are valid only until the next call to next
or deinit
.
Install
This library is tested with DuckDB 1.0.0.
- Add into
dependencies
atbuild.zig.zon
:
.dependencies = .{
...
.zuckdb = .{
.url = "git+https://github.com/karlseguin/zuckdb.zig#master",
.hash = "{{ actual_hash string, remove this line before 'zig build' to get actual hash }}",
},
},
-
Download the libduckdb from the DuckDB download page.
-
Place the
duckdb.h
file and thelibduckdb.so
(linux) orlibduckdb.dylib
(mac) in your project'slib
folder. -
Add this in
build.zig
:
const zuckdb = b.dependency("zuckdb", .{
.target = target,
.optimize = optimize,
}).module("zuckdb");
// tell zuckdb.zig where to find the duckdb.h file
zuckdb.addIncludePath(LazyPath.relative("lib/"));
// Your app's program
const exe = b.addExecutable(.{
.name = "run",
.root_source_file = .{ .path = "test.zig" },
.target = target,
.optimize = optimize,
});
exe.root_module.addImport("zuckdb", zuckdb);
// link to libduckdb
exe.linkSystemLibrary("duckdb");
// tell the linker where to find libduckdb.so (linux) or libduckdb.dylib (macos)
exe.addLibraryPath(LazyPath.relative("lib/"));
// add other imports
Static Linking
It's also possible to statically link DuckDB. In order to do this, you must build DuckDB yourself, in order to compile it using Zig C++ and using the bundle-library target
git clone -b 1.0.0 --single-branch https://github.com/duckdb/duckdb.git
cd duckdb
export CXX="zig c++"
DUCKDB_EXTENSIONS='json' make bundle-library
When this finished (it will take several minutes), you can copy build/release/libduckdb_bundle.a
and src/include/duckdb.h
to your project's lib
folder. Rename libduckdb_bundle.a
to libduckdb.a
.
Finally, Add the following to your build.zig
:
exe.linkSystemLibrary("duckdb");
exe.linkSystemLibrary("stdc++");
exe.addLibraryPath(LazyPath.relative("lib/"));
DB
The DB
is used to initialize the database, open connections and, optionally, create a connection pool.
init
Creates or opens the database.
// can use the special path ":memory:" for an in-memory database
const db = try DB.init(allocator, "/tmp/db.duckdb", .{});
defer db.deinit();
The 3rd parameter is for options. The available options, with their default, are:
-
access_mode
- Sets theaccess_mode
DuckDB configuration. Defaults to.automatic
. Valid options are:.automatic
,.read_only
or.read_write
. -
enable_external_access
- Sets theenable_external_access
DuckDB configuration. Defaults totrue
.
initWithErr
Same as init
, but takes a 4th output parameter. On open failure, the output parameter will be set to the error message. This parameter must be freed if set.
var open_err: ?[]u8 = null;
const db = DB.initWithErr(allocator, "/does/not/exist", .{}, &open_err) catch |err| {
if (err == error.OpenDB) {
defer allocator.free(open_err.?);
std.debug.print("DB open: {}", .{open_err.?});
}
return err;
};
deinit
Closes the database.
conn
Returns a new connection object.
var conn = try db.conn();
defer conn.deinit();
...
pool
Initializes a pool of connections to the DB.
var pool = try db.pool(.{.size = 2});
// the pool owns the `db`, so pool.deinit will call `db.deinit`.
defer pool.deinit();
var conn = try pool.acquire();
defer pool.release(conn);
The pool
method takes an options parameter:
-
size: usize
- The number of connections to keep in the pool. Defaults to5
-
timeout: u64
- The time, in milliseconds, to wait for a connetion to be available when callingpool.acquire()
. Defaults to10_000
. -
on_connection: ?*const fn(conn: *Conn) anyerror!void
- The function to call when the pool first establishes the connection. Defaults tonull
. -
on_first_connection: ?*const fn(conn: *Conn) anyerror!void
- The function to call on the first connection opened by the pool. Defaults tonull
.
Conn
query
Use conn.query(sql, args) !Rows
to query the database and return a zuckdb.Rows
which can be iterated. You must call deinit
on the returned rows.
var rows = try conn.query("select * from users where power > $1", .{9000});
defer rows.deinit();
while (try rows.next()) |row| {
// ...
}
exec
conn.exec(sql, args) !usize
is a wrapper around query
which returns the number of affected rows for insert, updates or deletes.
row
conn.row(sql, args) !?OwningRow
is a wrapper around query
which returns a single optional row. You must call deinit
on the returned row:
var row = (try conn.query("select * from users where id = $1", .{22})) orelse return null;;
defer row.deinit();
// ...
begin/commit/rollback
The conn.begin()
, conn.commit()
and conn.rollback()
calls are wrappers around exec
, e.g.: conn.exec("begin", .{})
.
prepare
conn.prepare(sql, opts) !Stmt
prepares the given SQL and returns a zuckdb.Stmt
. For one-off queries, you should prefer using query
, exec
or row
which wrap prepare
and then call stmt.bind(values)
and finally stmt.execute()
. Getting an explicit Stmt is useful when executing the same statement multiple times with different values.
Values for opts are:
-
auto_release: bool
- This defaults to and should usually be kept asfalse
. Whentrue
, the statement is automatically discarded (deinit
) after the result of its first execution is complete. If you're going to set this totrue
, you might as well useconn.exec
,conn.query
orconn.row
instead of getting an explicit statement.
err
If a method of conn
returns error.DuckDBError
, conn.err
will be set:
var rows = conn.query("....", .{}) catch |err| {
if (err == error.DuckDBError) {
if (conn.err) |derr| {
std.log.err("DuckDB {s}\n", .{derr});
}
}
return err;
}
In the above snippet, it's possible to skip the if (err == error.DuckDBError)
check, but in that case conn.err could be set from some previous command (conn.err is always reset when acquired from the pool).
release
conn.release()
will release the connection back to the pool. This does nothing if the connection did not come from the pool (i.e. pool.acquire()
). This is the same as calling pool.release(conn)
.
Rows
The rows
returned from conn.query
exposes the following methods:
-
count()
- the number of rows in the result -
changed()
- the number of updated/deleted/inserted rows -
columnName(i: usize)
- the column name at positioni
in a result -
deinit()
- must be called to free resources associated with the result -
next() !?Row
- returns the next row
The most important method on rows
is next()
which is used to iterate the results. next()
is a typical Zig iterator and returns a ?Row
which will be null when no more rows exist to be iterated.
Row
get
Row
exposes a get(T, index) T
function. This function trusts you! If you ask for an i32
the library will crash if the column is not an int4
. Similarly, if the value can be null, you must use the optional type, e.g. ?i32
.
The supported types for get
, are:
-
[]u8
, -
[]const u8
-
i8
-
i16
-
i32
-
i64
-
i128
-
u8
-
u16
-
u32
-
u64
-
f32
-
f64
-
bool
-
zuckdb.Date
-
zuckdb.Time
-
zuckdb.Interval
-
zuckdb.UUID
-
zudkdb.Enum
Optional version of the above are all supported and must be used if it's possible the value is null.
String values and enums are only valid until the next call to next()
or deinit
. You must dupe the values if you want them to outlive the row.
list
Row
exposes a list
method which behaves similar to get
but returns a zuckdb.List(T)
.
const row = (try conn.row("select [1, 32, 99, null, -4]::int[]", .{})) orelse unreachable;
defer row.deinit();
const list = row.list(?i32, 0).?;
try t.expectEqual(5, list.len);
try t.expectEqual(1, list.get(0).?);
try t.expectEqual(32, list.get(1).?);
try t.expectEqual(99, list.get(2).?);
try t.expectEqual(null, list.get(3));
try t.expectEqual(-4, list.get(4).?);
list()
always returns a nullable, i.e. ?zuckdb.List(T)
. Besides the len
field, get
is used on the provided list to return a value at a specific index. row.list(T, col).get(idx)
works with any of the types supported by row.get(col)
.
a List(T)
also has a alloc(allocator: Allocator) ![]T
method. This will allocate a []T
and fill it with the list values. It is the caller's responsibility to free the returned slice.
Alternatively, fill(into: []T) void
can be used used to populate into
with items from the list. This will fill @min(into.len, list.len)
values.
zuckdb.Enum
The zuckdb.Enum
is a special type which exposes two functions: raw() [*c]const u8
and rowCache() ![]const u8
.
raw()
directly returns the DuckDB enum string value. If you want to turn this into a []const u8
, you'll need to wrap it in std.mem.span
. The value returned by raw()
is only valid until the next iteration.
rowCache()
takes the result of raw()
, and dupes it, giving ownership to the Rows. Thus, the string returned by rowCache()
outlives the current row iteration and is valid until rows.deinit()
is called. Essentially, it is an interned string representation of the enum value (which DuckDB internally represents as an integer).
Pool
The zuckdb.Pool
is a thread-safe connection pool:
const db = try zuckdb.DB.init(allocator, "/tmp/duckdb.zig.test", .{});
var pool = db.pool(.{
.size = 2,
.on_connection = &connInit,
.on_first_connection = &poolInit,
});
defer pool.deinit();
var conn = try pool.acquire();
defer conn.release();
The Pool takes ownership of the DB object, thus db.deinit
does not need to be called The on_connection
and on_first_connection
are optional callbacks. They both have the same signature:
?*const fn(conn: *Conn) anyerror!void
If both are specific, the first initialized connection will first be passed to on_first_connection
and then to on_connection
.
newConn() !Conn
Besides using acquire()
to get a !*Conn
from the pool, it's possible to create a new connection detached from the pool using pool.newConn()
. This is the same as calling db.conn()
but, on the pool. Again, this connection will not be part of the pool and release()
should not be called on it (but deinit()
should).
exec/query/row
The pool.exec
, pool.query
, pool.queryWithState
, pool.row
and pool.rowWithState
are convenience functions which behave like their Conn
counterparts.
pool.exec
is the same as:
var conn = try pool.acquire();
defer conn.release();
return conn.exec(sql, args);
pool.query
, pool.queryWithState
, pool.row
and pool.rowWithState
are similar, except the connection is automatically released back to the pool when the rows.deinit()
or row.deinit()
is called.s
Stmt
The zuckdb.Stmt
encapsulates a prepared statement. It is generated by calling conn.prepare([]const u8, opts)
.
deinit() !void
Deinitializes the statement.
clearBindings(stmt: *const Stmt) !void
Clears any previous bound values.
bind(stmt: *const Stmt, values: anytype) !void
Binds the values tuple to the statement.
bindValue(stmt: *const Stmt, value: anytype, index: usize) !void
Binds the specific value to the specified position.
exec(stmt: *const Stmt) !usize
Like conn.exec
, this executes the statement returning the number of affected rows. Should not be used with a query that returns results.
query(stmt: *const Stmt) !Rows
Like conn.query
, executes a result and returns the rows.
Query Optimizations
In very tight loops, performance might be improved by providing a stack-based state for the query logic to use. The query
and row
functions all have a WithState
alternative, e.g.: queryWithState
. These functions take 1 additional "query state" parameter:
var state = zuckdb.StaticState(2){};
var rows = try conn.queryWithState(SQL, .{ARGS}, &state);
// use rows normally
The value passed to zuckdb.StaticState
is the number of columns returned by the query. The state
must remain valid until rows.deinit()
is called.
Appender
The fastest way to insert a large amount of data is to use the appender:
// the first parameter is the schema, or null to use the default schema
var appender = try conn.appender(null, "my_table");
defer appender.deinit();
for (...) {
try appender.appendRow(.{"over", 9001, true});
}
// The appender auto-flushes, but it should be called once at the end.
try appender.flush();
The order of the values used in appendRow
is the order of the columns as they are defined in the table (e.g. the order that describe $table
returns).
Appender per-column append
The appender.appendRow
function depends on the fact that you have comptime knowledge of the underlying table. If you are dealing with dynamic (e.g. user-defined) schemas, that won't always be the case. Instead, use the more explicit beginRow()
, appendValue()
and endRow()
methods.
for (...) {
appender.beginRow();
try appender.appendValue("over", 0);
try appender.appendValue(9001, 1);
try appender.appendValue(true, 2);
try appender.endRow();
}
try appender.flush();
The appendRow()
method internally calls beginRow(),
appendValue()and
endRow()`.
Appender Type Support
The appender writes directly to the underlying storage and thus cannot leverage default column values. appendRow
asserts that the # of values matches the number of columns. However, when using the explicit beginRow
+ appendValue
+ endRow
, you must make sure to append a value for each column, else the behavior is undefined.
Enums aren't supporting, due to limitations in the DuckDB C API.
Decimals are supported, but be careful! When appending a float, the value will truncated to the decimal place specified by the scale of the column (i.e. a decimal(8, 3) will have the float truncated with 3 decimal places). When appending an int, the library assumes that you have already converted the decimal to the DuckDB internal representation. While surprising, this provides callers with precise control.
When dealing with ints, floats and decimals, appending a single value tends to be flexible. In other words, you can append an i64
to a tinyint
column, so long as the value fits (i.e. there's a runtime check). However, when dealing with lists (e.g. integer[]
), the exact type is required. Thus, only a []u16
can be bound to a usmallint[]
column. decimal[]
can bind to a []i64
, []f32
or []f64
.
List columns support null values, and thus can be bound to either a []const T
or a []const ?T
.
Appender Error
If any of the appender methods return an error, you can see if the optional appender.err
has an error description. This is a ?[]const u8
field. On error, you should not assume that this value is set, there are error cases where DuckDB doesn't provide an error description.