azure-cosmosdb-spark
Using Cosmos DB collections as Spark lookup tables

Open: bharadwaj221 opened this issue 4 years ago • 1 comment

Hi,

I am writing a Spark Streaming application in which I want to filter the data that is not in Cosmos DB. The column I am matching on is set as the id column in Cosmos DB. Which of the following would be the better and faster option?

  1. Load the entire Cosmos DB collection into a DataFrame and do a left anti-join

(or)

  2. Do a lookup query in Cosmos DB for each item in my streaming data.
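
To make the two options concrete, here is a minimal plain-Python sketch of what each one computes. It is not Spark or Cosmos DB code: the `collection` dict is a hypothetical in-memory stand-in for the Cosmos DB collection, `anti_join_filter`, `lookup_filter`, and `exists_in_collection` are illustrative names, and it assumes "filter" means keeping only the records whose id is not yet in the collection.

```python
from typing import Callable, Dict, Iterable, List, Set

Record = Dict[str, str]


def anti_join_filter(batch: Iterable[Record], known_ids: Set[str]) -> List[Record]:
    """Option 1: load all ids once, then keep records whose id is absent."""
    return [rec for rec in batch if rec["id"] not in known_ids]


def lookup_filter(batch: Iterable[Record], exists: Callable[[str], bool]) -> List[Record]:
    """Option 2: issue one lookup per record and keep those that are not found."""
    return [rec for rec in batch if not exists(rec["id"])]


# Hypothetical stand-in for the Cosmos DB collection, keyed by id.
collection: Dict[str, Record] = {"a": {"id": "a"}, "b": {"id": "b"}}

lookup_calls = 0  # counts how many per-item lookups option 2 performs


def exists_in_collection(item_id: str) -> bool:
    """Simulated point lookup by id; a real one would be a network round trip."""
    global lookup_calls
    lookup_calls += 1
    return item_id in collection


# One micro-batch of streaming records (made-up sample data).
batch: List[Record] = [{"id": "a"}, {"id": "c"}, {"id": "d"}]

option1 = anti_join_filter(batch, set(collection))   # one bulk load of ids
option2 = lookup_filter(batch, exists_in_collection)  # one lookup per record
```

Both options return the same records. The difference is cost: option 1 pays for reading the whole collection, while option 2 pays for one lookup per streaming record. In Spark, option 1 corresponds to a join with `how="left_anti"` on the id column.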

bharadwaj221, May 22 '20 12:05

Hi, please clarify which is the best way to achieve the above.

bharadwaj221, Jun 02 '20 12:06