Bindings for Elixir
I'd like to bundle sqlite-vss as an Elixir package and distribute it on hex. Ideally Elixir developers should be able to run:
mix deps.get sqlite-vss
Existing Language Binding notes
The Python/Node.js bindings (and the future Ruby binding I'm working on) all do similar things:
- Bundle pre-compiled loadable extension files (.dylib/.so/.dll) inside the language package (the .wheel file for Python, the bundled .tar.gz package for Node.js, and the .gem file for Ruby).
- Distribute different versions of the package for different platforms. For Python this means different wheels per platform, in Node different optionalDependencies, and in Ruby it's building different .gem files with gem bundle --platform x86_64-darwin etc.
- The language code itself is typically just a single loadable_path function that returns the path to the compiled extension. Also maybe a load() function that wraps a sqlite3 db connection and just calls db.load_extension(loadable_path()), like the Python API.
Open questions
- How do we publish different platforms for the same package? Can we publish a single sqlite-vss package that has different "wheels" for all our platforms, or do we need to publish separate sqlite-vss-macos-x86_64/sqlite-vss-linux-x86_64 packages for every platform?
- Can we just stuff the loadable extensions in def project? This seems to work for me:
def project do
  [
    app: :hello_world,
    version: "0.1.0",
    elixir: "~> 1.12",
    start_permanent: Mix.env() == :prod,
    deps: deps(),
    files: ["priv/*.dylib"]
  ]
end
Specifically the files: ["priv/*.dylib"] part. How can we make it work for multiple platforms?
cc @trodrigu
For more context: the exqlite lib, similar to the Python lib, can load extensions like this:
alias Exqlite.Basic
{:ok, conn} = Basic.open("db.sqlite3")
:ok = Basic.enable_load_extension(conn)
# load the regexp extension - https://github.com/nalgeon/sqlean/blob/main/docs/re.md
Basic.load_extension(conn, ExSqlean.path_for("re"))
To make this work with Ecto we have to do something like this:
config :sqlite_init, SqliteInit.Repo,
  database: Path.expand("../sqlite_init_dev.db", Path.dirname(__ENV__.file)),
  pool_size: 5,
  after_connect: fn _conn ->
    [db_conn] = Process.get(:"$callers")
    db_connection_state = :sys.get_state(db_conn)
    conn = db_connection_state.mod_state.state
    IO.inspect(conn, label: "conn")
    :ok = Exqlite.Basic.enable_load_extension(conn)
    Exqlite.Basic.load_extension(conn, ExLitedb.path_for("re"))
  end,
  stacktrace: true,
  show_sensitive_data_on_connection_error: true
@asg017 Are we feeling that loading the extension into Ecto (the de facto Elixir ORM) is outside the scope of what the lib should do?
priv is great for storing the compiled files. Exsqlean takes this approach.
Instead of listing the files in the project definition, it uses the helper :code.priv
The one thing we could change from that library, though, is to drop the "GET" request it makes for the compiled files and bundle them in the package instead.
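As a rough illustration of the priv-based lookup described here, a helper could resolve a bundled file like this. This is only a sketch: the module name, the function name and the file layout are hypothetical, and I am assuming that ":code.priv" refers to Erlang's :code.priv_dir/1.

```elixir
defmodule LoadablePathSketch do
  @moduledoc """
  Hypothetical sketch: resolves a loadable extension that is bundled
  under an application's priv/ directory. Names are illustrative only.
  """

  # Returns {:ok, path} for `name` inside the priv/ directory of `app`.
  # :code.priv_dir/1 returns {:error, :bad_name} when the application
  # is not known to the code server.
  def loadable_path(app, name) when is_atom(app) and is_binary(name) do
    case :code.priv_dir(app) do
      {:error, :bad_name} -> {:error, {:unknown_app, app}}
      priv_dir -> {:ok, Path.join(to_string(priv_dir), name)}
    end
  end
end
```

A package could then expose something like LoadablePathSketch.loadable_path(:sqlite_vss, "vss0") and pass the result to the connection's load_extension call.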
Thanks for the notes @trodrigu ! Some thoughts:
- The :code.priv approach in Exsqlean looks perfect.
- It looks like Exsqlean bundles all platform extensions under priv/: darwin-amd64, darwin-arm64, linux-amd64, windows-amd64 and windows-win32. I don't think this will be ideal for sqlite-vss: each compiled extension is around 4MB, so I'd like to have separate Elixir packages for each platform, if possible.
- A question related to the above point: is it possible to create platform-specific versions of a Hex package? In Python/pip we have a single sqlite-vss PyPI package that downloads different platform-specific wheels when someone pip installs, and I'm wondering if Elixir has a similar pattern. Ruby has something similar: the sqlite3 gem has different "versions" for different platforms, like sqlite3-1.6.2-x86_64-linux.gem, sqlite3-1.6.2-x86_64-darwin.gem, etc.
As far as next steps: if you put up a PR that just gets the skeleton of what the mix.exs and the library Elixir code look like, I can pick that up and figure out all the priv/ bundling. It'll be done in GitHub Actions (so nothing like the Makefile in ex_sqlean), so you can just assume that the platform-specific loadable extensions will be bundled in on publish.
For practicality, I may copy the Elixir code and test it with sqlite-hello at first: each release of sqlite-vss takes about 20 minutes to build and release, but sqlite-hello is much more slimmed down and takes ~15 seconds. And every typo/fix needs to be its own separate release, so it takes a few tries to get right.
Let me know what you think!
I love this plan! ❤️ I'm posting a couple of places to see if this is possible.
Hey @asg017 !
I wanted to send you a reference to the Elixir Slack link where I’m asking about packages and architectures. Feel free to join at elixir-lang.slack.com. The gist of it is that the recommended pattern is to go with downloadable binaries and use certfile validation and checksum validation. Elixir even has a special library called Rustler Precompiled which makes this process simpler for Rust crates. An example of the checksum file (which is present in the package) is at https://preview.hex.pm/preview/explorer/0.5.7/show/checksum-Elixir.Explorer.PolarsBackend.Native.exs. The bundled package route would require users to add some code to the deps() function in the mix.exs file which could be a little unwieldy. I wanted to get your opinion on this as we may be all good to go with downloadable binaries.
Thanks for the notes @trodrigu! The download at install-time + certfile verification approach sounds good to me.
So I imagine we'll have some checksum.exs file that looks like this:
%{
"sqlite-vss-elixir-macos-x86_64.tar.gz" => "sha256:d36805150e...2c3d25c7e",
"sqlite-vss-elixir-macos-aarch.tar.gz" => "sha256:50383f244c...50383f244c",
"sqlite-vss-elixir-linux-x86_64.tar.gz" => "sha256:3e61118a...8443b34d"
}
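The keys in a map like this encode a platform target, so the installer has to work out which key applies to the current machine. One possible way to do that is sketched below. The module and function names are hypothetical, the archive naming simply follows the example keys above (including the "aarch" suffix), and only the platforms listed there are handled.

```elixir
defmodule TargetSketch do
  # Maps an OS type (shaped like the result of :os.type/0) and an
  # architecture string (shaped like the result of
  # :erlang.system_info(:system_architecture)) to an archive name.
  def archive_name(os_type, arch) when is_binary(arch) do
    with {:ok, os} <- os_name(os_type),
         {:ok, cpu} <- cpu_name(arch) do
      {:ok, "sqlite-vss-elixir-#{os}-#{cpu}.tar.gz"}
    end
  end

  # Convenience wrapper that inspects the running VM.
  def current do
    arch = :erlang.system_info(:system_architecture) |> to_string()
    archive_name(:os.type(), arch)
  end

  defp os_name({:unix, :darwin}), do: {:ok, "macos"}
  defp os_name({:unix, :linux}), do: {:ok, "linux"}
  defp os_name(other), do: {:error, {:unsupported_os, other}}

  defp cpu_name(arch) do
    cond do
      String.starts_with?(arch, "x86_64") -> {:ok, "x86_64"}
      String.starts_with?(arch, ["aarch64", "arm64"]) -> {:ok, "aarch"}
      true -> {:error, {:unsupported_arch, arch}}
    end
  end
end
```

Keeping archive_name/2 pure (it takes the OS and architecture as arguments) makes the mapping easy to test without running on each platform.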
Some questions:
- When exactly does the "GET" request to download these packages happen? Is it when a user adds sqlite_vss to their deps() and runs mix hex.install, or at runtime when sqlite_vss is run for the first time?
- Likewise, does the checksum check also happen at install-time?
- For the actual GET request, does that require any additional dependencies, or is there one built into elixir we can use? And is certfile a builtin package?
If you're able to get a skeleton Elixir package PR up, don't worry about the checksums.exs file; I think we can generate that at release-time in a GitHub Actions runner. But I'm not sure what the mix.exs or the lib/sqlite_vss.ex files should look like.
When exactly does the "GET" request to download these packages happen? Is it when a user adds sqlite_vss to their deps() and runs mix hex.install, or at runtime when sqlite_vss is run for the first time?
The former, at install time: a user will run a mix sqlite_vss.install command.
Likewise, does the checksum check also happen at install-time?
Exactly! The approach would emulate the Rustler precompiled project.
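For illustration, an install-time checksum check against a map shaped like the checksum example above might look like the sketch below. It is not how Rustler Precompiled does it internally; the module and function names are hypothetical, and it assumes every entry uses the "sha256:<hex>" format from the example.

```elixir
defmodule ChecksumSketch do
  # Verifies `contents` (the downloaded archive as a binary) against the
  # "sha256:<hex>" entry stored under `file_name` in `checksums`.
  def verify(checksums, file_name, contents)
      when is_map(checksums) and is_binary(contents) do
    with {:ok, "sha256:" <> expected} <- Map.fetch(checksums, file_name) do
      actual = :crypto.hash(:sha256, contents) |> Base.encode16(case: :lower)

      if actual == String.downcase(expected) do
        :ok
      else
        {:error, :checksum_mismatch}
      end
    else
      # The file has no entry in the checksum map.
      :error -> {:error, :unknown_file}
      # An entry exists but does not use the sha256 prefix.
      {:ok, _other} -> {:error, :unsupported_algorithm}
    end
  end
end
```

The map itself could be read from the checksum file with Code.eval_file/1, which returns a {value, bindings} tuple.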
For the actual GET request, does that require any additional dependencies, or is there one built into elixir we can use? And is certfile a builtin package?
The approach will use :httpc, which ships with Erlang/OTP (in the inets application) and is therefore available from Elixir without extra dependencies. For the cacertfile we can use the :castore library, which makes keeping your CA store up to date easier.
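A minimal sketch of such a download with :httpc is shown below. The module and function names are hypothetical, and the CA bundle path is passed in as an argument so the sketch has no dependencies; with castore it would presumably be CAStore.file_path().

```elixir
defmodule DownloadSketch do
  # Builds TLS options for :httpc that verify the server certificate
  # against the CA bundle at `cacertfile`.
  def ssl_options(cacertfile) when is_binary(cacertfile) do
    [
      verify: :verify_peer,
      cacertfile: String.to_charlist(cacertfile),
      customize_hostname_check: [
        match_fun: :public_key.pkix_verify_hostname_match_fun(:https)
      ]
    ]
  end

  # Downloads `url` and returns {:ok, body} on HTTP 200.
  def fetch(url, cacertfile) when is_binary(url) do
    # :httpc lives in :inets and needs :ssl for https URLs.
    {:ok, _} = Application.ensure_all_started(:inets)
    {:ok, _} = Application.ensure_all_started(:ssl)

    request = {String.to_charlist(url), []}
    http_options = [ssl: ssl_options(cacertfile)]
    options = [body_format: :binary]

    case :httpc.request(:get, request, http_options, options) do
      {:ok, {{_version, 200, _reason}, _headers, body}} -> {:ok, body}
      {:ok, {{_version, status, _reason}, _headers, _body}} -> {:error, {:http_status, status}}
      {:error, reason} -> {:error, reason}
    end
  end
end
```

Passing verify: :verify_peer explicitly matters here, since older OTP releases do not verify server certificates by default.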
For reference https://github.com/asg017/sqlite-vss/pull/33
Just wanted to point this PR out ... I actually committed this back to exqlite specifically to enable usage of this lib in a more convenient way. The after_connect stuff is not needed, just a load_extensions key at the repo or global config. If the package directly exposes a path to the embedded libs, this becomes as simple as adding the hex package, and adding the load_extensions block with the helper as the path. See this comment for an example. I was originally thinking about shipping a hex package for exactly the same purpose.
Haven't been through the full comment thread here yet, apologies if I missed something.
@bkono Thanks for pointing this out! load_extensions is the way to go and I'll be refactoring my production app to use that API. @asg017 I wasn't sure if we wanted to keep this as bare bones as possible and not include a reference to Ecto in the README?
@trodrigu we can include an Ecto reference/code snippet in the README!
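A README snippet along these lines might look like the following. It is only a sketch: the :my_app application, MyApp.Repo and the database path are placeholders, the two SqliteVss functions are the names mentioned later in this thread, and putting this in config/runtime.exs is my assumption (so that the dependency is already compiled when the config is evaluated). Listing vector0 before vss0 reflects my understanding of the load order sqlite-vss expects.

```elixir
# config/runtime.exs (assumed location; names below are placeholders)
import Config

config :my_app, MyApp.Repo,
  database: "my_app_dev.db",
  # exqlite loads each listed extension when it opens a connection,
  # so no after_connect callback is needed.
  load_extensions: [
    SqliteVss.loadable_path_vector0(),
    SqliteVss.loadable_path_vss0()
  ]
```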
kk @trodrigu #33 is merged and sqlite_vss is now published, can you let me know if it works as expected?
A few more lingering questions I have:
- Does it make sense to move def install() from lib/sqlite_vss.ex to mix/tasks/sqlite_vss.install.ex? I'd ideally like to keep sqlite_vss.ex as small as possible and only contain def loadable_path_vector0() and def loadable_path_vss0(), but I'm not sure if the rest has to be in sqlite_vss.ex.
- Do you think it's worth doing the sqlite-vss-checksum.exs checksum checks at install-time?
Does it make sense to move def install() from lib/sqlite_vss.ex to mix/tasks/sqlite_vss.install.ex? I'd ideally like to keep sqlite_vss.ex as small as possible and only contain def loadable_path_vector0() and def loadable_path_vss0(), but I'm not sure if the rest has to be in sqlite_vss.ex
Yes, this would be a worthy refactor. It makes sense to move the install-related functions into the mix task file.
Do you think it's worth doing the sqlite-vss-checksum.exs checksum checks at install-time?
After reading through elixir_make I think it should be easy to do this actually. Another worthy refactor.
The published sqlite_vss package gets installed successfully in my production app on fly.io. Great work!