sqlx icon indicating copy to clipboard operation
sqlx copied to clipboard

Support Snowflake

Open jkoudys opened this issue 3 years ago • 4 comments

Snowflake is getting huge - hit market cap ~$90B and getting popular fast. Might be the best time for rust to become well-known for supporting it as a DB option. They're well documented and they have a lot of good staff who've already built drivers on a few languages: https://docs.snowflake.com/en/user-guide/conns-drivers.html

The big things you don't get out of box with a raw ODBC driver, mainly connection pooling and futures, mean adding snowflake as another possible DB type. They stick pretty close to ANSI SQL:1999 afaik, and I can help get a test acct available.

Any suggestions or warnings before I start work on this?

jkoudys avatar Jan 16 '21 00:01 jkoudys

Proprietary DBs present a huge maintenance hazard for us as far as the open-source half of SQLx goes. I don't even see a free FOSS license offering on their site.

I can help get a test acct available.

Even with a test account, that would presumably limit it to individual developers or at most Launchbadge personnel only. And how long would it last? Forever? What if it stops working? Is it usage-limited? Who would we contact when we have problems?

On top of that, it doesn't look like their protocol is publicly documented, so we'd have to basically port one of their existing connector libraries (although fortunately those are open source). What happens if they decide to make a breaking change to their protocol? They can just silently update their connector libraries and don't have to tell anyone.

These are all reasons why we're moving to a dual-license model. We could certainly discuss integrating Snowflake into our premium offering where we can justify paying for our own account for testing, but I don't believe we'll be able to accept external contributions for those, for liability reasons.

abonander avatar Jan 16 '21 01:01 abonander

ODBC support would also help in calling snowflake without introducing proprietary licenced API support.

From #116 ,

  • SQLx wants to remain as pure Rust as possible. There are obvious concessions we have to make to support Oracle and SQLite. However, where possible, we aim to be pure, safe Rust.

I've only found an FFI for odbc so far ( https://github.com/pacman82/odbc-sys ) -- is this a big reason why the sqlx mission of pure-rust makes ODBC so hard? Realistically how big an effort is it to reimplement odbc as a rust lib w/o the ffi?

jkoudys avatar Jan 16 '21 02:01 jkoudys

I am interested in communicating with a Snowflake database from a Rust application. I'm curious what the status of this effort is.

mbuscemi avatar Jun 02 '22 15:06 mbuscemi

I have been using Snowflake at a company who uses it for their Data Warehouse for ~2 years now and when I started I wrote an integration to get it working with Elixir.

Since then, their connectors have improved drastically and their official connectors are pretty good and you can read through their code and it's understandable.

Snowflake is very similar to Postgres, but has way more functions designed for data analysis, and designed for large datasets. It works pretty well.

Technical details:

  1. All connectors communicate over HTTP to Snowflake using their internal REST API. This doesn't break or randomly, as the official connectors use this. In my time using it with Elixir, I've had 0 breaking changes.
  2. When logging into Snowflake, you can set your application to be PythonConnector to get back Arrow Streaming Files, or can set it as Javascript to get back JSON (not related, their Javascript connector does not support Arrow yet). 2a. Logging in gives you a token, which lasts for 10 minutes by default.
  3. When performing queries, they give you back the first resultset as base64 encoded Arrow/JSON, these chunk sizes range from 100kb-~50mb uncompressed, they are gzipped from the server. 3a. The rust crate Arrow2 has support for this, which I've used to parse results. In fact, I'm using it to power the new Snowflake client for Elixir I'm working on to deserialise from Arrow (it's much faster than using JSON to get results back, almost 2x as fast from my benchmarks).

IMHO, this work should be done as a crate users could include which would tie in with this one. Once it's ready and battle tested in production, discussion could be made about putting it behind a feature flag or something in this repo, if the maintainers find it worthwhile, but IMHO staying as a seperate crate might be worthwhile.

joshuataylor avatar Jun 26 '22 07:06 joshuataylor

I came to this package looking for a package to connect to Snowflake. I totally understand the concerns and why it hasn't been adopted here.

I'm new to Rust so take this with a grain of salt but I'm wondering if there is a future for rust+sql that looks like golang's database/sql package. It is effectively a general purpose SQL interface that can be used along side db specific drivers. They have a design goal doc here that lays out the reasons for doing it. The goals aren't unlike the goals of this package but since the dbs are written directly into sqlx it feels like the addition of other dbs is enough of a lift that it might not get done. Or like mentioned above they will be separate crates -- likely leading to diverging interfaces.

That kind leaves the folks wanting to do data + analytics in rust stuck with the less mature package that shows up in google snowflake-connector or maybe using the rust odbc crate (maybe that is fine?).

Mostly just curious what people think. This was spurred on by some recent chatter/hype around Data Engineering + Rust.

jsnb avatar Jul 04 '23 18:07 jsnb

I wrote an adapter for Snowflake a couple of years ago in Rust, mainly as a POC, and Rust had AMAZING performance. This is mostly down to the arrow2 Rust library.

joshuataylor avatar Jul 05 '23 03:07 joshuataylor