Session Token Merge
Changes
- Added a new method to merge session tokens for customers who want to keep track of their own session tokens.
- Added a new API for converting a logical partition key to a feed range.
- Added a new API for checking whether a feed range is a subset of another feed range.
APIs
Container.py
def merge_session_tokens(feed_ranges_to_session_tokens: List, target_feed_range: str) -> str
def feed_range_for_logical_partition(pk: PartitionKey) -> str
def is_feed_range_subset(parent_feed_range: str, child_feed_range: str) -> bool
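To illustrate what the subset check is meant to answer, here is a minimal sketch. It assumes a feed range can be modelled as a half-open interval over effective partition key values; the `Interval` type, its integer bounds, and the helper names are hypothetical and are not the `str` representation used by the APIs above.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    """Hypothetical stand-in for a feed range: the half-open interval [start, end)."""
    start: int
    end: int


def is_subset(parent: Interval, child: Interval) -> bool:
    # The child is a subset when it lies entirely within the parent's bounds.
    return parent.start <= child.start and child.end <= parent.end


def overlaps(a: Interval, b: Interval) -> bool:
    # Two half-open intervals overlap when each one starts before the other ends.
    return a.start < b.end and b.start < a.end
```

Under this model, a logical partition's feed range would be a narrow interval contained in exactly one of the container's feed ranges.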
Samples
# This would be happening through different clients
feed_ranges_and_session_tokens = []
for doc in docs_to_create:
    container.create_item(doc)
    # the feed range returned in the request context will correspond to the logical partition key
    feed_range = container.client_connection.last_response_headers["request-context"]["feed-range"]
    session_token = container.client_connection.last_response_headers["request-context"]["session-token"]
    feed_ranges_and_session_tokens.append((feed_range, session_token))
# Note that, for these examples, the list of feed ranges and session tokens would be aggregated
# from different clients.
# Each example below gets the most up-to-date session token for a target_feed_range
# ---------------------1. using logical partition key ---------------------------------------------------
# could also use the one stored from the responses headers
for logical_pk in logical_pks:
    target_feed_range = container.feed_range_for_logical_partition(logical_pk)
    updated_session_token = container.merge_session_tokens(feed_ranges_and_session_tokens, target_feed_range)
# ---------------------2. using arbitrary feed range ----------------------------------------------------
container_feed_ranges = container.read_feed_ranges()
for target_feed_range in container_feed_ranges:
    updated_session_token = container.merge_session_tokens(feed_ranges_and_session_tokens, target_feed_range)
# ---------------------3. using physical partitions -----------------------------------------------------
target_feed_range = container.feed_range_for_logical_partition(logical_pk)
updated_session_token = container.merge_session_tokens(feed_ranges_and_session_tokens, target_feed_range)
# ------------------------------------------------------------------------------------------------------
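The samples above all reduce to the same idea: from the collected (feed range, session token) pairs, keep the tokens relevant to the target feed range and retain the most advanced one per partition key range. The following is only a rough sketch of that idea, not the SDK's implementation. It assumes feed ranges are half-open `(start, end)` integer tuples (hypothetical), uses the session token format from the Glossary, and treats the token with the larger GlobalLSN as newer while ignoring the version number and per-region LSNs. The function names are made up for illustration.

```python
from typing import Dict, List, Tuple

# Hypothetical model: a feed range is a half-open (start, end) interval of ints.
FeedRange = Tuple[int, int]


def overlaps(a: FeedRange, b: FeedRange) -> bool:
    # Half-open intervals overlap when each starts before the other ends.
    return a[0] < b[1] and b[0] < a[1]


def global_lsn(token: str) -> int:
    # Token shape: "PKRangeId:VersionNumber#GlobalLSN#RegionId=LocalLSN..."
    return int(token.split(":", 1)[1].split("#")[1])


def latest_tokens_for(pairs: List[Tuple[FeedRange, str]], target: FeedRange) -> str:
    newest: Dict[str, str] = {}  # pk range id -> most advanced token seen so far
    for feed_range, compound_token in pairs:
        if not overlaps(feed_range, target):
            continue  # this token says nothing about the target range
        for token in compound_token.split(","):
            pk_range_id = token.split(":", 1)[0]
            current = newest.get(pk_range_id)
            # Assumption: a larger GlobalLSN means a more recent token.
            if current is None or global_lsn(token) > global_lsn(current):
                newest[pk_range_id] = token
    # Return a compound session token, ordered by pk range id for determinism.
    return ",".join(newest[key] for key in sorted(newest))
```

A real merge would also need to account for version numbers, per-region LSNs, and partition splits and merges, which this sketch deliberately leaves out.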
Implementation
Glossary
Session Token Format: PKRangeId:VersionNumber#GlobalLSN#RegionId1=LocalLSN1#RegionId2=LocalLSN2...
Compound session token: Comma-separated session tokens
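For readers unfamiliar with the format, here is a small parsing sketch that follows the layout given in the Glossary. The `ParsedToken` type and both function names are hypothetical helpers for illustration and are not part of the SDK.

```python
from typing import Dict, List, NamedTuple


class ParsedToken(NamedTuple):
    pk_range_id: str
    version: int
    global_lsn: int
    local_lsns: Dict[str, int]  # region id -> local LSN


def parse_session_token(token: str) -> ParsedToken:
    # Layout: "PKRangeId:VersionNumber#GlobalLSN#RegionId1=LocalLSN1#RegionId2=LocalLSN2..."
    pk_range_id, vector = token.split(":", 1)
    parts = vector.split("#")
    local_lsns: Dict[str, int] = {}
    for segment in parts[2:]:
        region_id, local_lsn = segment.split("=")
        local_lsns[region_id] = int(local_lsn)
    return ParsedToken(pk_range_id, int(parts[0]), int(parts[1]), local_lsns)


def parse_compound_session_token(compound: str) -> List[ParsedToken]:
    # A compound session token is a comma-separated list of session tokens.
    return [parse_session_token(t) for t in compound.split(",") if t]
```

For example, `parse_session_token("0:1#100#1=20#2=5")` yields pk range id `"0"`, version `1`, global LSN `100`, and local LSNs `{"1": 20, "2": 5}`.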
Of the above operations, can you flag which ones involve reads/writes to the actual container and which are local? We are curious about the metadata read properties (and latency) of the example flows.
With respect to the artificial feed ranges case, from the API it appears that this is a client-only change. Why is it out of scope?
@nickcoai The following are the operations and whether they require metadata calls. None of them makes a metadata call on every invocation, because the relevant information is cached. Latency for these flows should be low, as most invocations will not require a metadata call.
def get_updated_session_token(feed_ranges_to_session_tokens: List, target_feed_range: str) -> str - Requires no metadata calls.
def feed_range_for_logical_partition(pk: PartitionKey) -> FeedRange - May make metadata calls for the collection properties, but these are cached.
def is_feed_range_subset(parent_feed_range: str, child_feed_range: str) -> bool - Currently requires no metadata calls, but the feed range implementation is being worked on in parallel, so this could change. Will update the PR accordingly.
def read_feed_ranges(num_of_ranges: int) -> List - Sometimes requires metadata calls because it relies on the pkrange cache.
For the artificial feed ranges case, artificial feed ranges can easily have negative side effects (not necessarily for session token merge, but when they are used as a scoping filter for query/change feed) when the service cannot apply them effectively. Minimizing the surface area therefore also minimizes the risk of sending customers down a route that results in issues later. This is why we left it out of scope for this PR.
/azp run python - cosmos - tests
Azure Pipelines successfully started running 1 pipeline(s).
/azp run python - cosmos - ci
Azure Pipelines successfully started running 1 pipeline(s).