docs(marketing): homepage > comparison to other tools
Suggestions for updates to the homepage:
- [ ] Update comparison to SQLAlchemy
- ~~[ ] Add blaze / dask~~
SQLAlchemy is one of the backends that Ibis can use to compile expressions:
https://github.com/ibis-project/ibis/blob/master/ibis/sql/alchemy.py
Current: http://ibis-project.org/
Why not use SQLAlchemy? SQLAlchemy is very convenient as an ORM (Object Relational Mapper), providing a Python interface to SQL databases. But SQLAlchemy is focussed on access to the data, and not to perform analytics on it. And it is mostly limited to conventional SQL databases, and doesn't support big data platforms or specialized analytical tools.
Suggested:
SQLAlchemy is an ORM (Object Relational Mapper) which provides a Python interface to SQL databases. Ibis also provides a Python interface to SQL databases. SQLAlchemy is one backend that Ibis can use to compile Python expressions to SQL expressions. Ibis is one of a number of analytics tools built atop SQLAlchemy. Ibis also has backends to support a number of non-SQL databases.
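To illustrate what "compile Python expressions to SQL expressions" means, here is a toy sketch using only the standard library. `Column`, `Predicate`, and `compile_select` are hypothetical names made up for this example; this is not the Ibis or SQLAlchemy API, and both libraries do far more (type checking, dialects, joins, and so on).

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class Predicate:
    """A compiled WHERE fragment plus its bound parameters (hypothetical helper)."""
    sql: str
    params: tuple


@dataclass(frozen=True)
class Column:
    """A named column; comparing it builds SQL instead of evaluating in Python."""
    name: str

    def __gt__(self, other):
        # Defer the comparison: emit a parameterized SQL fragment.
        return Predicate(f"{self.name} > ?", (other,))


def compile_select(table, columns, where):
    """Render a SELECT statement from the expression objects.

    Identifiers are interpolated directly, so this toy version must only
    be used with trusted table and column names.
    """
    cols = ", ".join(c.name for c in columns)
    return f"SELECT {cols} FROM {table} WHERE {where.sql}", where.params


# Small in-memory database to run the compiled query against.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (name TEXT, value INTEGER)")
conn.executemany(
    "INSERT INTO events VALUES (?, ?)", [("a", 1), ("b", 5), ("c", 9)]
)

name, value = Column("name"), Column("value")
sql, params = compile_select("events", [name, value], value > 3)
rows = conn.execute(sql, params).fetchall()
print(sql, params, rows)
```

The point is only that `value > 3` never runs as a Python comparison; it produces a description of the query that a backend then executes.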
A different final paragraph on the first page of the docs might be more welcoming.
A comparison of Ibis with Blaze + Dask would also be useful: https://blaze.pydata.org/
The Blaze ecosystem is a set of libraries that help users store, describe, query and process data. It is composed of the following core projects:
- Blaze: An interface to query data on different storage systems
- Dask: Parallel computing through task scheduling and blocked algorithms
- Datashape: A data description language
- DyND: A C++ library for dynamic, multidimensional arrays
- Odo: Data migration between different storage systems
@westurner Blaze, DyND, Odo, and Datashape have long been abandonware :-<
A few interesting tools to compare to in 2023:
DataFrame APIs:
- Pandas
- Dask
- Polars
- PySpark
- Modin
- Vaex
- ...
Libraries that may draw comparisons but are not DataFrame APIs:
- SQLAlchemy
- sqlglot
- CuDF
- https://docs.rapids.ai/api/cudf/stable/user_guide/pandas-comparison.html
- Pandas does PyArrow Now: https://arrow.apache.org/docs/python/pandas.html
Dataclasses are, or could be, like DataFrames, though they don't use columnar storage.
FWIW, converting dataclasses to Arrow/pandas takes less RAM than list(map(tuple, dataclasses_list)):
https://pypi.org/project/pandas-dataclasses/
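A rough way to see the row-versus-columnar memory difference with only the standard library. This is a sketch, not a benchmark of Arrow or pandas: it uses `array.array` as a stand-in for a columnar buffer, the `Reading` class is hypothetical, and `sys.getsizeof` only approximates real memory use. It also uses `dataclasses.astuple` to build the tuples, since a plain dataclass instance is not iterable.

```python
import sys
from array import array
from dataclasses import astuple, dataclass


@dataclass
class Reading:
    """Hypothetical record type used only for this comparison."""
    sensor_id: int
    value: float


readings = [Reading(i, i * 0.5) for i in range(1000)]

# Row-oriented: one tuple per record, each holding boxed Python objects.
rows = [astuple(r) for r in readings]
row_bytes = sys.getsizeof(rows) + sum(
    sys.getsizeof(t) + sum(sys.getsizeof(x) for x in t) for t in rows
)

# Column-oriented: one typed, contiguous buffer per field.
columns = {
    "sensor_id": array("q", (r.sensor_id for r in readings)),  # signed 64-bit ints
    "value": array("d", (r.value for r in readings)),          # 64-bit floats
}
column_bytes = sum(sys.getsizeof(col) for col in columns.values())

print(f"rows: ~{row_bytes} bytes, columns: ~{column_bytes} bytes")
```

The columnar layout avoids the per-object and per-tuple overhead, which is the same reason Arrow-backed storage tends to be more compact than lists of tuples.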
Since Ibis wraps computational engines, it doesn't really make sense to compare it to a bunch of different engines. We've added a "Why Ibis" page in #5958 that covers the points we think should be included in the docs. Closing.