azure-cosmosdb-spark
Using CosmosDB collections as Spark lookup tables
Hi,
I am writing a Spark streaming application in which I want to keep only the records whose key is not already in CosmosDB. I have set this key column as the id column in CosmosDB. Which of the following would be the better and faster option:
- Load the entire Cosmos DB into a df and do a left anti-join
(or)
- Do a lookup query in Cosmos DB for each item in my streaming data.
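To make the two options concrete, here is a minimal sketch in plain Python. It does not use Spark or the connector: a dict stands in for the CosmosDB collection, and all names and data are hypothetical, chosen only to illustrate the two access patterns.

```python
# Hypothetical stand-in for the CosmosDB collection, keyed by id.
cosmos_collection = {"a": {"id": "a"}, "c": {"id": "c"}}

# One micro-batch of streaming records (illustrative data only).
batch = [{"id": "a", "v": 1}, {"id": "b", "v": 2}, {"id": "d", "v": 3}]


def anti_join_full_load(batch, collection):
    """Option 1: read every id once, then keep batch rows with no match."""
    known_ids = set(collection.keys())  # one full scan of the collection
    return [row for row in batch if row["id"] not in known_ids]


def anti_join_point_lookups(batch, collection):
    """Option 2: issue one lookup by id for each streaming record."""
    return [row for row in batch if collection.get(row["id"]) is None]


result_full = anti_join_full_load(batch, cosmos_collection)
result_lookup = anti_join_point_lookups(batch, cosmos_collection)
```

Both functions return the same rows. In this sketch the work in option 1 grows with the size of the collection, while in option 2 it grows with the number of records in each batch.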
Please clarify which is the best way to achieve the above.