azure-cosmosdb-spark

Improve push down predicates

Open dennyglee opened this issue 7 years ago • 0 comments

Make better use of DocumentDB's native query capabilities (e.g. aggregations, ORDER BY, LIMIT) so that a smaller, already-optimized dataset is returned to Apache Spark. For example:

  • For cumulative aggregations, DocumentDB should compute and return the aggregates, and Spark should then combine those results accordingly.
  • For ORDER BY, DocumentDB can return the data already ordered.
  • For LIMIT, the connector should execute a TOP query against DocumentDB.
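
To make the three cases above concrete, here is a minimal sketch of what the pushed-down queries could look like. It is not the connector's code: `build_pushdown_query` and `merge_partial_aggregates` are hypothetical helpers, the collection alias `c` and the field names in the usage note are made up, and the idea that DocumentDB hands back one partial aggregate per partition for Spark to combine is an assumption for illustration.

```python
from typing import Optional, Sequence


def build_pushdown_query(
    filters: Sequence[str] = (),
    order_by: Optional[str] = None,
    descending: bool = False,
    limit: Optional[int] = None,
    aggregate: Optional[str] = None,
) -> str:
    """Build a DocumentDB SQL string that carries the pushed-down clauses.

    Hypothetical helper, not part of azure-cosmosdb-spark.
    """
    if aggregate is not None and (order_by is not None or limit is not None):
        # Keep the sketch simple: an aggregate query returns a single value,
        # so ordering or limiting it is not meaningful here.
        raise ValueError("aggregate cannot be combined with order_by or limit")
    if limit is not None and limit < 0:
        raise ValueError("limit must be non-negative")

    # LIMIT n in Spark becomes TOP n in DocumentDB SQL.
    top = f"TOP {limit} " if limit is not None else ""
    # Aggregates are projected with VALUE so a scalar is returned.
    projection = f"VALUE {aggregate}" if aggregate is not None else "*"

    query = f"SELECT {top}{projection} FROM c"
    if filters:
        query += " WHERE " + " AND ".join(f"({f})" for f in filters)
    if order_by is not None:
        query += f" ORDER BY {order_by}" + (" DESC" if descending else "")
    return query


def merge_partial_aggregates(kind: str, partials: Sequence[float]) -> float:
    """Combine per-partition partial aggregates on the Spark side.

    Assumes each partition returned one partial result. AVG is deliberately
    unsupported: averages of averages are wrong unless counts are carried too.
    """
    kind = kind.upper()
    if kind in ("COUNT", "SUM"):
        return sum(partials)
    if kind in ("MIN", "MAX"):
        if not partials:
            raise ValueError(f"{kind} needs at least one partial result")
        return min(partials) if kind == "MIN" else max(partials)
    raise ValueError(f"unsupported aggregate: {kind}")
```

With these helpers, `build_pushdown_query(limit=10, order_by="c.ts", descending=True)` yields `SELECT TOP 10 * FROM c ORDER BY c.ts DESC`, and `merge_partial_aggregates("COUNT", [3, 4, 5])` yields `12`. The point of the sketch is that DocumentDB does the ordering, truncation, and per-partition aggregation, so Spark only receives the reduced result.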

dennyglee, Mar 11 '17 01:03