azure-cosmosdb-spark
Improve push down predicates
Make better use of DocumentDB's native capabilities (e.g. aggregations, ORDER BY, LIMIT, etc.) so a more optimized dataset is returned to Apache Spark. For example:
- For cumulative aggregations, DocumentDB should return those aggregations and Spark should then aggregate accordingly.
- For ORDER BY, DocumentDB can return the data ordered
- For LIMIT, the connector should execute a TOP query to DocumentDB
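
As a rough illustration of the push-downs listed above, the sketch below shows how a connector might fold filters, ORDER BY, and LIMIT into a single DocumentDB SQL query, and how Spark-side code might combine per-partition aggregates. It is not the connector's actual code: `build_query`, `merge_partial_counts`, their parameters, and the collection alias `c` are all hypothetical names chosen for the example, and reading "aggregate accordingly" as "sum the partial results" is an assumption.

```python
from typing import Iterable, Optional, Sequence


def build_query(
    filters: Sequence[str] = (),
    order_by: Optional[str] = None,
    descending: bool = False,
    limit: Optional[int] = None,
    alias: str = "c",  # hypothetical collection alias
) -> str:
    """Fold pushed-down operations into one DocumentDB SQL query string."""
    # Per the list above, a Spark LIMIT n is expressed as TOP n.
    top = f"TOP {int(limit)} " if limit is not None else ""
    query = f"SELECT {top}* FROM {alias}"

    # Filter predicates are AND-ed together in the WHERE clause.
    if filters:
        query += " WHERE " + " AND ".join(f"({f})" for f in filters)

    # ORDER BY is pushed down so the data comes back already sorted.
    if order_by is not None:
        direction = "DESC" if descending else "ASC"
        query += f" ORDER BY {alias}.{order_by} {direction}"

    return query


def merge_partial_counts(partials: Iterable[int]) -> int:
    """Combine per-partition counts into a final count.

    Assumption: each partition answers something like
    SELECT VALUE COUNT(1) FROM c, and the caller sums the results.
    """
    return sum(partials)
```

With these hypothetical helpers, `build_query(filters=["c.age > 21"], order_by="name", limit=10)` yields `SELECT TOP 10 * FROM c WHERE (c.age > 21) ORDER BY c.name ASC`, so only ten already-sorted documents would need to be shipped to Spark.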