Docs: Add comparison between Trustfall and SQL

Open obi1kenobi opened this issue 11 months ago • 0 comments

We should help prospective users decide whether Trustfall is for them or not. This should be in the FAQ and perhaps could make a good blog post topic as well.

TL;DR:

  • If your data is already in a SQL database, use SQL.
  • If you need to query a data format or schema that may change over time, Trustfall is really good at that. For example, cargo-semver-checks uses Trustfall to query a JSON format that has breaking changes around once a month on average, yet its Trustfall queries don't change at all.
  • If your queries get data from a remote API that is rate-limited or charges per request, Trustfall is really good at that. Its queries are lazily evaluated, so you'll only pay for what you use. This is easier and cheaper than building ETL pipelines, where you'd pay for running all the API calls regardless of whether your queries use the results or not.
  • Trustfall doesn't (yet) natively support datetimes; the currently recommended workaround is to encode them as strings. If this won't work for your workload, use SQL.
  • Trustfall doesn't (yet) natively support ORDER BY, though some schemas may allow setting ordering in your query. The best way to impose a global order on results is to run a query to completion, then sort the results (e.g. by deriving PartialOrd, Ord on your query result struct, then collecting the results iterator and sorting it). If this won't work for your workload, use SQL instead.
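The collect-then-sort approach from the last bullet might look roughly like the sketch below. It does not use Trustfall's API: `QueryResult`, its fields, and `run_query` are hypothetical stand-ins for a result struct and the results iterator. It also assumes datetimes are encoded as fixed-width ISO 8601 UTC strings, in which case plain string ordering matches chronological ordering.

```rust
// Hypothetical result struct; field names are illustrative, not from a real schema.
// Derived `Ord` compares fields in declaration order: `published_at`, then `title`.
#[derive(Debug, Clone, PartialEq, Eq, PartialOrd, Ord)]
struct QueryResult {
    // Assumption: datetime encoded as a fixed-width ISO 8601 UTC string,
    // so lexicographic order equals chronological order.
    published_at: String,
    title: String,
}

// Hypothetical stand-in for the lazy iterator a query run would hand back.
fn run_query() -> impl Iterator<Item = QueryResult> {
    vec![
        ("2024-02-28T21:02:00Z", "b"),
        ("2023-11-05T08:30:00Z", "c"),
        ("2024-02-28T21:02:00Z", "a"),
    ]
    .into_iter()
    .map(|(ts, title)| QueryResult {
        published_at: ts.to_string(),
        title: title.to_string(),
    })
}

// Run the query to completion, then impose a global order on the results.
fn sorted_results() -> Vec<QueryResult> {
    let mut results: Vec<QueryResult> = run_query().collect();
    results.sort();
    results
}

fn main() {
    for r in sorted_results() {
        println!("{} {}", r.published_at, r.title);
    }
}
```

Note that this materializes every result before sorting, so it gives up lazy evaluation for that query. If the result set is too large to hold in memory, that is one of the cases where SQL is the better fit.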

In some cases, you may want to use SQL together with Trustfall. For example, it can be useful to combine SQLite with Trustfall to cache the results of complex transformations or expensive API calls. As another example, if your dataset is split between a SQL database and a series of files in S3, Trustfall could be used to run federated (cross-datasource) queries across SQL and S3. In both cases, Trustfall is the query engine and SQL is a storage system that helps run portions of those queries as needed.
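As a rough illustration of the caching idea, the sketch below puts a cache in front of an expensive data source and pulls results lazily. Everything in it is an assumption made for illustration: `RemoteApi`, `CachedApi`, and `demo` are hypothetical names, a `HashMap` stands in for a SQLite table, and none of it is Trustfall's actual API.

```rust
use std::collections::HashMap;

// Hypothetical stand-in for a rate-limited or pay-per-request API.
// It counts calls so the cost of a query is visible.
struct RemoteApi {
    calls: usize,
}

impl RemoteApi {
    fn fetch(&mut self, id: u32) -> String {
        self.calls += 1;
        format!("record-{}", id)
    }
}

// Cache in front of the API. Assumption: a HashMap stands in for a SQLite table.
struct CachedApi {
    api: RemoteApi,
    cache: HashMap<u32, String>,
}

impl CachedApi {
    fn new() -> Self {
        CachedApi { api: RemoteApi { calls: 0 }, cache: HashMap::new() }
    }

    // Return the cached value if present; otherwise pay for one API call and store it.
    fn get(&mut self, id: u32) -> String {
        if let Some(hit) = self.cache.get(&id) {
            return hit.clone();
        }
        let value = self.api.fetch(id);
        self.cache.insert(id, value.clone());
        value
    }
}

// Pull only the first four results. `map` is lazy, so ids after the fourth
// are never looked up, and the repeated id is served from the cache.
fn demo() -> (Vec<String>, usize) {
    let mut source = CachedApi::new();
    let ids = [1, 2, 1, 3, 2, 4, 5];
    let results: Vec<String> = ids.iter().map(|&id| source.get(id)).take(4).collect();
    (results, source.api.calls)
}

fn main() {
    let (results, calls) = demo();
    println!("{:?} fetched with {} API calls", results, calls);
}
```

In this sketch the query engine decides which lookups are needed, and the storage layer only answers the lookups it is asked for. That mirrors the division of labor described above, with SQL as the storage system rather than the query language.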

obi1kenobi • Feb 28 '24 21:02