review ICache<T> and IDistributedCache
We have two different uses of caching. Perhaps there's some cleanup/consolidation that we could do.
Also worth considering: different data has different sensitivity.
Quick review here -- ICache<T> is generally used for caching strongly-typed data returned from stores, and IDistributedCache is used for more temp/operational/volatile data such as replay caches, request params, and throttling services.
I could see consolidation by adding strongly-typed helpers to IDistributedCache, but we'd also need a way to create named instances to avoid key conflicts between different types of objects. In essence, this is why we decided we needed ICache<T> in the first place. Maybe the solution (if the problem is the need for consolidation) is to add an ICache<T> implementation based on IDistributedCache.
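For illustration only, the "strongly-typed helpers" idea could look something like the sketch below: extension methods over IDistributedCache that namespace keys per type, which stands in for the "named instances" needed to avoid key conflicts. None of these names exist in IdentityServer today, and the System.Text.Json serialization is an assumption.

using System;
using System.Text.Json;
using System.Threading.Tasks;
using Microsoft.Extensions.Caching.Distributed;

public static class TypedDistributedCacheExtensions
{
    // The per-type key prefix plays the role of a "named instance",
    // preventing key conflicts between different cached types.
    private static string Key<T>(string key) => $"{typeof(T).FullName}:{key}";

    public static async Task<T?> GetAsync<T>(this IDistributedCache cache, string key)
        where T : class
    {
        var json = await cache.GetStringAsync(Key<T>(key));
        return json == null ? null : JsonSerializer.Deserialize<T>(json);
    }

    public static Task SetAsync<T>(this IDistributedCache cache, string key, T item, TimeSpan expiration)
        where T : class
    {
        var options = new DistributedCacheEntryOptions { AbsoluteExpirationRelativeToNow = expiration };
        return cache.SetStringAsync(Key<T>(key), JsonSerializer.Serialize(item), options);
    }
}

With something like this, calling code could do cache.GetAsync<Client>(clientId) and cache.SetAsync(clientId, client, TimeSpan.FromMinutes(5)) without colliding with other cached types.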
@josephdecock any input here as to suggested direction? IOW, do we see that people want some improvement here? or maybe it's just more to document what's used where?
Anyway, we should discuss.
For reference if useful (not sure that it is): https://devblogs.microsoft.com/dotnet/caching-abstraction-improvements-in-aspnetcore/
I don't hear much confusion about the two interfaces. Whenever production deployments/scaling comes up, I always mention that IDistributedCache might be necessary, and our docs on how we use it have seen some recent improvements. I think we could add a bit of clarification to them to point out how it is distinct from the ICache.
I also don't think there would be a big enough benefit to justify making a breaking change here.
@brockallen we would like to see an improvement here. We are using the provided Caching Stores in a multi-regional architecture. The current caching setup has a few disadvantages for our use case.
The default implementation of ICache<T> uses IMemoryCache. This could lead to differences between regions. A provided implementation of ICache<T> using IDistributedCache could solve this issue.
A client may change, but the Caching Stores have no API to invalidate a cache entry. We can only guess at (or copy) the keys the stores are using.
Is there already a solution to these issues I may have missed?
It is not trivial to implement. However, IdentityServer is interface-driven. In our case, it's quite a complex k8s deployment with multiple pods in different regions, and a memory cache is simply not a solution. If you implement ICache<T> you can use whatever you want under it. For DI, just register your distributed implementation:
builder.Services.AddTransient(typeof(ICache<>), typeof(WhateverCache<>));
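To make that registration concrete, here is a minimal sketch of what the WhateverCache<T> above might look like when backed by IDistributedCache. It assumes the ICache<T> surface of recent Duende versions (GetAsync, SetAsync, GetOrAddAsync, RemoveAsync), uses System.Text.Json for serialization, and omits the concurrency control (single-flight locking) that Duende's DefaultCache<T> provides.

using System;
using System.Text.Json;
using System.Threading.Tasks;
using Duende.IdentityServer.Services;
using Microsoft.Extensions.Caching.Distributed;

public class WhateverCache<T> : ICache<T> where T : class
{
    private readonly IDistributedCache _inner;

    public WhateverCache(IDistributedCache inner) => _inner = inner;

    // Prefix keys with the type name so different T's don't collide.
    private static string Key(string key) => $"{typeof(T).FullName}:{key}";

    public async Task<T?> GetAsync(string key)
    {
        var json = await _inner.GetStringAsync(Key(key));
        return json == null ? null : JsonSerializer.Deserialize<T>(json);
    }

    public async Task SetAsync(string key, T item, TimeSpan expiration)
    {
        var options = new DistributedCacheEntryOptions { AbsoluteExpirationRelativeToNow = expiration };
        await _inner.SetStringAsync(Key(key), JsonSerializer.Serialize(item), options);
    }

    public async Task<T?> GetOrAddAsync(string key, TimeSpan duration, Func<Task<T>> get)
    {
        // Note: unlike DefaultCache<T>, this does not prevent concurrent callers
        // from invoking 'get' for the same key at the same time.
        var item = await GetAsync(key);
        if (item == null)
        {
            item = await get();
            if (item != null)
            {
                await SetAsync(key, item, duration);
            }
        }
        return item;
    }

    public Task RemoveAsync(string key) => _inner.RemoveAsync(Key(key));
}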
However, it is not well documented and requires a lot of digging around in the code. @josephdecock @brockallen
Ok so then we can consider additional docs here. Thanks!
I've noticed an issue with caching in IdentityServer: it seems to be wildly inefficient. I don't know that I fully understand all of the angles here, or whether we oofed our implementation, but I was able to dramatically improve the caching performance through a few changes based on things I observed in IDS.
Here's what I figured out.
TLDR;
The .NET distributed cache does not support getting multiple keys in one call, whereas Redis (via StackExchange.Redis) does. Retrieving multiple keys/values in one call should be an option. By slightly modifying the code I was able to dramatically improve performance.
We had calls that went from 2-3 seconds to <500ms. This was on a token exchange process that called the token validation method, which would make 10+ calls to Redis for each resource, totaling up to 70 calls to Redis.
Also, I'm happy to throw up a pull request to demonstrate all of the changes I made. It's not a lot, but it leaves ICache open to implementing a single call that retrieves multiple keys/values. I know not all caching services support this.
Problem
Now, this happens for multiple resources; this is just one of them, so it adds up substantially.
If we look at the code in question, we're passing a list of scope names but iteratively loading them from the cache, one by one.
Proposed Solution
I think ICache should keep its existing method but also gain another method that accepts multiple keys (or the existing call could take a list of keys). Microsoft's IDistributedCache does not support that, so extending ICache is the easier win. For caching services that don't support multi-key retrieval in a single call, the implementation could iterate through the list or retrieve the keys/values concurrently (see the fallback sketch below the proposed signature).
ICache.cs
https://github.com/DuendeSoftware/IdentityServer/blob/df2b00bb905905be27a25a3c00397c5020165ef6/src/IdentityServer/Services/ICache.cs#L16
Add or modify...
/// <summary>
/// Gets the cached items for a set of keys.
/// </summary>
/// <param name="keys">The cache keys to retrieve.</param>
/// <returns>A dictionary of the cached items keyed by cache key; keys with no cached value are omitted.</returns>
Task<Dictionary<string, T>> GetAsync(List<string> keys);
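For cache backends without multi-key support (the fallback case mentioned above), a hypothetical implementation of this new method could simply fan out the existing single-key lookups concurrently, for example:

// Hypothetical fallback: retrieve the keys concurrently via the existing
// single-key GetAsync, and return only the hits.
public async Task<Dictionary<string, T>> GetAsync(List<string> keys)
{
    var results = await Task.WhenAll(
        keys.Select(async key => (key, item: await GetAsync(key))));

    return results
        .Where(r => r.item != null)
        .ToDictionary(r => r.key, r => r.item!);
}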
Our Redis Cache implementation looks like this
public class RedisCache<T> : ICache<T> where T : class
{
    // These dependencies were not shown in the original snippet and are assumed:
    // a StackExchange.Redis IDatabase, a custom serializer, a logger, and an
    // ActivitySource wrapper (OTel.Application).
    private readonly IDatabase Db;
    private readonly ISerializer _serializer;
    private readonly ILogger<RedisCache<T>> _logger;
    private static readonly Type type = typeof(T);

    public RedisCache(IDatabase db, ISerializer serializer, ILogger<RedisCache<T>> logger)
    {
        Db = db;
        _serializer = serializer;
        _logger = logger;
    }

    public async Task<Dictionary<string, T>> GetAsync(List<string> keys)
    {
        using var activity = OTel.Application.StartActivity("RedisCache.GetAsync");
        try
        {
            var items = new Dictionary<string, T>();

            // Prefix each key with the cached type's name to avoid conflicts between types.
            var redisKeys = keys.Select(x => new RedisKey(type.ToString() + x)).ToArray();

            // Redis supports retrieving multiple keys in one call; other caches may not,
            // but that can be accommodated in their implementations.
            var redisValues = await Db.StringGetAsync(redisKeys);

            for (var i = 0; i < redisValues.Length; i++)
            {
                if (redisValues[i].HasValue)
                {
                    items.Add(keys[i], _serializer.Deserialize<T>(redisValues[i]));
                }
            }
            return items;
        }
        catch (Exception ex)
        {
            // Don't crash the application when the Redis cache is not available; log the
            // error and continue, so the value is retrieved from the database instead.
            _logger.LogError("Unable to retrieve {type} with keys {keys} from Redis cache because of connection error {redisConnectionException}.", type.ToString(), keys, ex.Message);
            activity?.RecordException(ex);
        }

        _logger.LogDebug("Missed {type} with keys {keys} in Redis cache.", type.ToString(), keys);
        // Return an empty dictionary rather than null so callers can safely iterate.
        return new Dictionary<string, T>();
    }
}
IDS: CachingResourceStore.cs
This code in IDS was modified to take a list of scopeNames as opposed to iteratively retrieving each individual item. This is just one example, because there are multiple functions like this; doing these lookups individually adds an enormous amount of latency.
public async Task<IEnumerable<ApiResource>> FindApiResourcesByScopeNameAsync(IEnumerable<string> scopeNames)
{
    using var activity = Tracing.StoreActivitySource.StartActivity("CachingResourceStore.FindApiResourcesByScopeName");
    activity?.SetTag(Tracing.Properties.ScopeNames, scopeNames.ToSpaceSeparatedString());

    var apiResourceNames = new HashSet<string>();
    var uncachedScopes = scopeNames.ToList();

    // NOTE: this returns 10 Redis keys/values in one call as opposed to 10 separate calls.
    var items = await _apiResourceNames.GetAsync(scopeNames.ToList());

    // cached items
    foreach (KeyValuePair<string, ApiResourceNames> item in items)
    {
        foreach (var name in item.Value.Names)
        {
            apiResourceNames.Add(name);
        }
        uncachedScopes.Remove(item.Key);
    }

    if (uncachedScopes.Any())
    {
        // ... (the remainder of the method, which loads the uncached scopes from the
        // inner store and adds them to the cache, was omitted in the original post)
    }
}
Outcome
Note: the latency here is significantly lower since I'm running Redis locally, so please disregard the absolute numbers; the point is that making fewer calls is far more efficient.
@mrjamiebowman, generally we would recommend that you use an in memory implementation of ICache for your configuration data. In most cases, changes to configuration data are rare and not particularly time-sensitive (it's okay if you have to wait a few minutes for the cache to expire), and inconsistencies across nodes typically can be tolerated. Usually we don't find that there's a need for real-time updates to these systems, and the extra overhead of making requests to redis isn't gaining much.
I'm curious why it is that you're caching the configuration data in redis in the first place. I'm also wondering if you were able to make the changes that you're describing within your implementation of IdentityServer.
I believe I already said this, but having an API to invalidate cache entries would be very useful. I'm also doing that by hand, which obviously is not good.
@josephdecock I'm still working on this, but you can step through the caching code. I wanted to go ahead and share: DB seeding works, the cache bootstraps, caching works, and I still need to make sure App Insights connects. We could really use something like this, and I would love to contribute to Duende IdentityServer if at all possible. I absolutely love IdentityServer and want to see this tool grow!
Customized IDS with Cache (Branch: main-cache) https://github.com/mrjamiebowman/IdentityServer/tree/main-cache
IDS Cache Implementation https://github.com/mrjamiebowman/IdentityServer-Cache-Improvements
git clone --recurse-submodules git@github.com:mrjamiebowman/IdentityServer-Cache-Improvements.git
@josephdecock oh, so we want to use KeyDB, which is a drop-in replacement for Redis and substantially faster. We are looking to make our IdentityServer as fast as humanly/machinely possible. We have SLAs, and we are also using Open Policy Agent (OPA) along with Azure APIM Gateway. Latency is a real problem for us, and we found that by using Redis and modifying the code we were able to dramatically improve performance. We don't want to use in-memory; our whole system uses caching and we want everything to be consistent. One of the challenges here is that the EF code in Duende IdentityServer uses split queries (AsSplitQuery(), to deal with cartesian explosion, I assume), but that breaks each call apart. So we were able to make it significantly faster by using Redis and modifying the ICache implementation in IDS, and now we want to use KeyDB. This is extremely important to us; we need this to be insanely fast. We have microservices and multiple customer IDPs, and we also rolled out our own Admin/APIs. It's all connected for us.
A couple of other things I would add: we are a multi-tenant company where we provide our clients' authentication to our services. We deeply customize our system, and IdentityServer is the only product capable of this. We have token exchanges, and our customers have tons of scopes; that's why we are seeing this issue. We also operate in Azure.
@josephdecock This was just brought up, but one of the reasons a memory cache won't work for us is that when we do a deployment we have to bust the cache, and if we use in-memory then we have to restart all of those pods in Kubernetes. We have a very large implementation of IDS: we have 500 customers whose IDPs we interface with, with custom roles and claims.
@mrjamiebowman and @hugoqribeiro - thanks for your feedback on this issue!
Our current plans are that, while we might revisit the design of the cache in a future major release, for the short term (including the upcoming 7.0 release), we're not going to make changes, since that would probably involve rather significant breaking changes.
For the short-to-medium term, our advice is to do one of two things:
- Use the in-memory cache and accept inconsistencies. Our opinion is that for most use cases, "this is fine, actually". Our experience is that generally configuration changes tolerate eventual consistency quite well. If you add new scopes or clients or other configuration items, the caches will miss, and those new items will be retrieved and added to cache. Most other configuration changes that involve an active client already need to be done in a pretty carefully controlled way. You're going to be coordinating with that client application so that you don't disrupt them (probably you would add new configuration, wait for them to start using the new configuration, and then remove old configuration if necessary). The addition of a few minutes of cache latency doesn't change that process significantly.
- If you really do need consistent caches (perhaps there's a self-service UI for making configuration changes, and you can't convey the eventual consistency in the UI?) we recommend that you implement a custom store. A custom store gives you the ability to optimize for your particular environment in a way that IdentityServer can't - since we're abstracted from the particular database engine that you're using. It also allows you to eschew our cache abstractions entirely, and provide your own caching implementation within your custom store.
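To illustrate the second option, here is a rough sketch of a custom client store that does its own distributed caching, bypassing IdentityServer's cache abstractions entirely. The ICustomClientDb type and the five-minute TTL are placeholders, and serializing the Client model with System.Text.Json is an assumption.

using System;
using System.Text.Json;
using System.Threading.Tasks;
using Duende.IdentityServer.Models;
using Duende.IdentityServer.Stores;
using Microsoft.Extensions.Caching.Distributed;

// Placeholder abstraction for your own data access layer.
public interface ICustomClientDb
{
    Task<Client?> FindClientAsync(string clientId);
}

public class CachingCustomClientStore : IClientStore
{
    private readonly IDistributedCache _cache;
    private readonly ICustomClientDb _db;

    public CachingCustomClientStore(IDistributedCache cache, ICustomClientDb db)
    {
        _cache = cache;
        _db = db;
    }

    public async Task<Client?> FindClientByIdAsync(string clientId)
    {
        // Check the distributed cache first.
        var json = await _cache.GetStringAsync($"client:{clientId}");
        if (json != null)
        {
            return JsonSerializer.Deserialize<Client>(json);
        }

        // Fall back to the database and populate the cache on a hit.
        var client = await _db.FindClientAsync(clientId);
        if (client != null)
        {
            await _cache.SetStringAsync($"client:{clientId}", JsonSerializer.Serialize(client),
                new DistributedCacheEntryOptions { AbsoluteExpirationRelativeToNow = TimeSpan.FromMinutes(5) });
        }
        return client;
    }
}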
Since the internal Duende cache is already quite sophisticated, we are using it as-is in our environment (active-active, geo-redundant in Azure). We added cache invalidation with Redis pub/sub based on entity keys (using DefaultCache<T>.GetKey; having it public instead of protected would have been nice): each instance performing write requests publishes the respective entity key, the other instances subscribe to that event and clear it from their local memory cache, and on subsequent read requests the entry gets re-added to the cache through the normal Duende caching mechanism.
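For anyone wanting to copy this approach, a minimal sketch of the pub/sub wiring might look like the following. It assumes StackExchange.Redis, an ICache<T> with RemoveAsync (available in recent Duende versions, which applies the same internal key scheme as DefaultCache<T>), and a made-up channel name.

using System.Threading.Tasks;
using Duende.IdentityServer.Services;
using StackExchange.Redis;

public class CacheInvalidator<T> where T : class
{
    private const string Channel = "cache-invalidation"; // assumed channel name
    private readonly ISubscriber _subscriber;
    private readonly ICache<T> _cache;

    public CacheInvalidator(IConnectionMultiplexer redis, ICache<T> cache)
    {
        _subscriber = redis.GetSubscriber();
        _cache = cache;
    }

    // The instance performing the write publishes the entity key...
    public Task PublishAsync(string entityKey) =>
        _subscriber.PublishAsync(RedisChannel.Literal(Channel), entityKey);

    // ...and every instance subscribes and evicts that entry from its local cache,
    // so the next read re-populates it through the normal Duende caching path.
    public Task SubscribeAsync() =>
        _subscriber.SubscribeAsync(RedisChannel.Literal(Channel),
            (_, entityKey) => _cache.RemoveAsync(entityKey.ToString()));
}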