Thread-Safe Auth Token Store Using ConcurrentDictionary and AsyncLazy

Recently I needed to implement a caching store for JWTs in an ASP.NET Core service I’m working on. There isn’t really a ready-made library available for this that I could find, and most (if not all) of the solutions I ran across relied on thread locking, had inherent bottlenecks, didn’t play well with async/await, or just felt clunky. I did, however, run across some resources that sparked an idea, and I decided to see if I could tie them all together into a different solution.

Adding Auth Tokens via a DelegatingHandler

Usually when you work with Bearer token-style auth with REST APIs in .NET, you will want to add that token to your request via a DelegatingHandler. Combined with using IHttpClientFactory to build HttpClient instances for typed API clients, you can automatically provide auth tokens for all calls made via that HttpClient.

A typical handler could look like the following:
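
(A sketch; the handler name, the "MyApi" token name, and the AccessToken property on TokenResponse are illustrative assumptions.)

using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading;
using System.Threading.Tasks;

public class AuthTokenHandler : DelegatingHandler
{
    private readonly ITokenStore _tokenStore;

    public AuthTokenHandler(ITokenStore tokenStore)
    {
        _tokenStore = tokenStore;
    }

    protected override async Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
    {
        // Get the (possibly cached) token and attach it as a Bearer header.
        var token = await _tokenStore.GetTokenAsync("MyApi");
        request.Headers.Authorization =
            new AuthenticationHeaderValue("Bearer", token.AccessToken);

        return await base.SendAsync(request, cancellationToken);
    }
}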

Note the ITokenStore provided via this handler’s constructor. I’ll cover that more in the next section. To configure an HttpClient to use this handler, we need to add this to Startup.cs:
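
(A sketch; the typed client IMyApiClient/MyApiClient and the base address are illustrative assumptions.)

services.AddTransient<AuthTokenHandler>();

services.AddHttpClient<IMyApiClient, MyApiClient>(client =>
    {
        client.BaseAddress = new Uri("https://api.example.com/");
    })
    .AddHttpMessageHandler<AuthTokenHandler>();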

A Naive Approach to a Token Store

One simple way to implement ITokenStore would be to, somewhat ironically, not store any tokens at all. We would just request a new token each time GetTokenAsync() is called, getting a TokenResponse back from ITokenService.GetTokenAsync() (all shown below).
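
A minimal sketch, assuming an ITokenService that performs the actual token request:

public interface ITokenStore
{
    Task<TokenResponse> GetTokenAsync(string tokenName);
}

public class NaiveTokenStore : ITokenStore
{
    private readonly ITokenService _tokenService;

    public NaiveTokenStore(ITokenService tokenService)
    {
        _tokenService = tokenService;
    }

    // No caching at all: every call results in a fresh token request.
    public Task<TokenResponse> GetTokenAsync(string tokenName)
        => _tokenService.GetTokenAsync(tokenName);
}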

It would work, but we’d be taking the performance cost of the token retrieval call every time we make a call to the downstream API. At this point, NaiveTokenStore isn’t providing much value over just using the instance of ITokenService directly. In order to make our system more performant, and to reduce the load and number of requests on the token server, we would want to introduce some sort of caching for our tokens.

Challenges With Caching

Other than all the standard challenges typically discussed with caching (lifetimes, revocation, etc.), we have one major problem we will need to consider: we can end up with a race condition if we handle multiple requests that need the same token (all using the same downstream API, for example), and one has not yet been retrieved. Unless mitigated, this could result in multiple calls to retrieve a new token at the same time. While we would be reducing overall load thanks to caching, we would still have bursts of unnecessary activity at service start and when a token expires.

Depending on the typical load of our service, this problem could range from “not a big deal” to “we just simulated a denial of service attack against the token server.” Or, if the token server is provided by a third party, this might even result in us getting rate limited. We should also consider these calls expensive (in terms of time as well as in using up socket handles) and limit them to only what’s needed.

You might have encountered similar problems when writing multi-threaded code before and think “well, I can use a lock to make sure only one of these requests is attempting to get a token at a time.” There are a few reasons why, in this scenario, this isn’t a good approach.

Why You Shouldn’t Use Locks or Blocking Code

In general, when working with async/await, it is important for any relatively long-running code to be asynchronous. Synchronous code within an async method prevents the scheduler from switching to other tasks while that work is being processed, hogging threads and resources. So that’s one strike.

Beyond that, there’s a limitation here enforced by the compiler. While we can actually use lock within an async method, we can’t await while inside of a lock block. This Stack Overflow answer by Eric Lippert¹ covers it fairly succinctly, and it becomes clear if you think about how await works.

Say we have some code like this:

public async Task OuterMethodAsync()
{
    lock(someLock)
    {
        DoSynchronousThing();
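        // This is not allowed: compiler error CS1996, "Cannot await in the body of a lock statement"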
        await DoAsynchronousThingAsync();
    }
}

When we await an async method, control actually leaves the currently running context, then later re-enters once the awaited task has completed. This means that we would be returning to the caller here while still inside the lock. This opens up a whole host of bad possibilities. As put by Eric Lippert:

Awaiting inside a lock is a recipe for producing deadlocks.

I’m sure you can see why: arbitrary code runs between the time the await returns control to the caller and the method resumes. That arbitrary code could be taking out locks that produce lock ordering inversions, and therefore deadlocks.

Worse, the code could resume on another thread (in advanced scenarios; normally you pick up again on the thread that did the await, but not necessarily) in which case the unlock would be unlocking a lock on a different thread than the thread that took out the lock. Is that a good idea? No.

So now we have strike two: this isn’t even allowed.

But what if we really wanted to still use a lock? One: don’t do this. But, for the sake of argument, let’s say we decided to eschew using await altogether and just get the .Result off our ITokenService.GetTokenAsync() call, or use some other way to force it to be synchronous. What would happen?

We’ve now returned to strike one from above. Even worse, this long-running code is now happening within a lock, meaning any other simultaneous requests are also synchronously blocked while waiting for their own chance to check for an existing token and/or generate a new one, since you can’t await inside a lock. Even in a modestly busy system, this is a recipe for exhausting the thread pool and deadlocking our system. This is a Very Bad Idea™. Strike three.

Now, one possible option that keeps a similar idea and flow to lock could be to use SemaphoreSlim and WaitAsync to limit access in an awaitable way. However, we’d still be limited to a single caller at a time within the semaphore, giving us somewhat of a bottleneck; we’d still need some amount of double-checking on the presence of the token and/or whether it’s expired; and the result ends up fairly ugly and error prone. A sketch of that approach (ignoring per-tokenName storage for brevity) might look like this:
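
private readonly SemaphoreSlim _mutex = new SemaphoreSlim(1, 1);
private TokenResponse _cachedToken;

public async Task<TokenResponse> GetTokenAsync(string tokenName)
{
    // Check, wait, then check again: only one caller at a time may enter
    // the semaphore and potentially request a new token.
    if (_cachedToken == null || _cachedToken.IsExpired)
    {
        await _mutex.WaitAsync();
        try
        {
            if (_cachedToken == null || _cachedToken.IsExpired)
            {
                _cachedToken = await _tokenService.GetTokenAsync(tokenName);
            }
        }
        finally
        {
            _mutex.Release();
        }
    }

    return _cachedToken;
}

So I tried to come up with a different solution that better meets our criteria.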

ConcurrentDictionary and Lazy

In my searching, I ran across an interesting post by Andrew Lock covering a pattern used by the ASP.NET Core team that uses Lazy<T> to keep the GetOrAdd method of ConcurrentDictionary from running its value factory more than once. It’s a really good and in-depth post, and I highly recommend reading it.

Essentially, GetOrAdd has an overload that accepts a valueFactory function that’s used to generate a new value for the dictionary if one could not be found. It handles thread safety using locking, but it does not lock while running the valueFactory. This is intentional, as otherwise the valueFactory could end up unexpectedly (to other callers) blocking all threads trying to access the values in the dictionary. The lock happens when writing the resulting value to the dictionary, which ensures that the lock is only kept for a very short time and that all callers get the same value. What this means is that valueFactory could get run multiple times, even though only one result actually ends up in the dictionary.
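
As a quick illustration (ExpensiveLookup is a hypothetical placeholder):

var cache = new ConcurrentDictionary<string, string>();

// If two threads race on the same key, each may run the valueFactory,
// but only one result is published to the dictionary, and both callers
// get back that single stored value.
var a = cache.GetOrAdd("key", k => ExpensiveLookup(k));
var b = cache.GetOrAdd("key", k => ExpensiveLookup(k));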

This has negative effects if the function for valueFactory is ultimately expensive to run. For the ASP.NET Core team, this can definitely be true for cases such as setting up middleware for the request pipeline. And for our case, as we have already covered, we need to consider generating a token to be an expensive operation. Running valueFactory multiple times would result in a lot of wasted computation.

The ASP.NET Core team got around this problem by using a Lazy<T> as the actual value stored in the dictionary. If you haven’t encountered Lazy<T> and lazy initialization before: Lazy<T> provides a way to defer the initialization of an expensive resource. You provide the Lazy instance with an initializer that doesn’t run until the Value property is accessed for the first time.

By using Lazy<T> as the value that gets stored in ConcurrentDictionary, this means that the code to create the expensive resource (the middleware pipeline, our token result from the token request, etc.) is not actually run when valueFactory runs. It’s a neat trick, because creating instances of Lazy itself is fairly cheap in terms of memory and CPU usage. So, at worst, some extra instances of Lazy<T> get created, but only one of those gets added to the dictionary, and all accessors of the dictionary end up using a single generated instance of Lazy<T>, which will only be evaluated once (by the first caller to access .Value). The unused instances will get garbage collected in fairly short order.
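
In code, the pattern looks roughly like this (ExpensiveThing and CreateExpensiveThing are hypothetical):

var cache = new ConcurrentDictionary<string, Lazy<ExpensiveThing>>();

// GetOrAdd may create extra Lazy wrappers under contention, but only one
// gets stored; the expensive work runs once, on first access to .Value.
ExpensiveThing thing = cache
    .GetOrAdd("key", k => new Lazy<ExpensiveThing>(() => CreateExpensiveThing(k)))
    .Value;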

This however doesn’t completely solve our problem, because while Lazy<T> defers execution of the initializer, it is still synchronous. Enter AsyncLazy<T>.

Adding AsyncLazy (and its Limitations)

There unfortunately is not currently a built-in AsyncLazy<T> type in .NET, though there has been some discussion around it. There are a few community solutions and some posts discussing the concept, with Stephen Toub² providing one such implementation.
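
His implementation is essentially a Lazy<Task<T>> that pushes the factory onto the thread pool, along these lines:

using System;
using System.Runtime.CompilerServices;
using System.Threading.Tasks;

public class AsyncLazy<T> : Lazy<Task<T>>
{
    // Run a synchronous factory on the thread pool.
    public AsyncLazy(Func<T> valueFactory)
        : base(() => Task.Factory.StartNew(valueFactory)) { }

    // Run an async factory on the thread pool, unwrapping the inner task.
    public AsyncLazy(Func<Task<T>> taskFactory)
        : base(() => Task.Factory.StartNew(() => taskFactory()).Unwrap()) { }

    // Lets callers write "await myAsyncLazy" directly.
    public TaskAwaiter<T> GetAwaiter() => Value.GetAwaiter();
}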

Like much of the discussion so far, this has some caveats. His solution always runs the initializer delegates on a thread pool thread, rather than trying to use the original synchronization context. This has implications for some scenarios, especially any case where one might be running on a UI thread. For our use case, this should mostly be okay. Keep in mind, though, that by doing so we won’t have access to things like the current HttpContext while executing within our AsyncLazy initializer, since we’ll be on a different thread context than the originating request. For generating tokens, this should be fine.

(Note that this is also what keeps this solution from being more general purpose. Other community/NuGet library solutions for AsyncLazy<T> make better attempts at maintaining the same synchronization context, so you can try investigating those solutions if that is something you need.)

What AsyncLazy provides for us is the ability to await the result of what is returned from GetOrAdd on the ConcurrentDictionary, thus not blocking the request thread and allowing the scheduler to operate as intended. With our implementation of AsyncLazy in hand, we can now start building our intended solution.

(Also note that this provides an implementation of GetAwaiter(), which allows us to directly await the AsyncLazy, rather than having to access its .Value property. Read more about that in Stephen’s post linked above.)

Putting it All Together

As mentioned before, the key here will be using GetOrAdd on the ConcurrentDictionary every time we try to access a cached auth token. In doing so, we let ConcurrentDictionary handle the thread synchronization and most of the race conditions for us. If more than one request thread determines we need to generate a token at once, the worst case is that we create extra instances of AsyncLazy, but with only one of those instances actually getting evaluated in the end. Our token request won’t start until the first thread actually tries to await that instance.
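
Here’s a sketch of what CachedTokenStore might look like, reconstructed to match the walkthrough below:

using System.Collections.Concurrent;
using System.Threading.Tasks;

public class CachedTokenStore : ITokenStore
{
    private readonly ITokenService _tokenService;
    private readonly ConcurrentDictionary<string, AsyncLazy<TokenResponse>> _tokenCache =
        new ConcurrentDictionary<string, AsyncLazy<TokenResponse>>();

    public CachedTokenStore(ITokenService tokenService)
    {
        _tokenService = tokenService;
    }

    public async Task<TokenResponse> GetTokenAsync(string tokenName)
    {
        var tokenEntry = _tokenCache.GetOrAdd(tokenName, TokenCacheFactory);
        var token = await tokenEntry;

        if (token.IsExpired)
        {
            // Eagerly evict the expired entry so other requests don't
            // receive it, then have GetOrAdd publish a fresh one.
            _tokenCache.TryRemove(tokenName, out _);
            tokenEntry = _tokenCache.GetOrAdd(tokenName, TokenCacheFactory);
        }

        // Await once more in case a new entry was created above; for an
        // already-evaluated entry this completes essentially immediately.
        token = await tokenEntry;
        return token;
    }

    // The valueFactory for the dictionary: cheap to create, and the token
    // request doesn't start until the AsyncLazy is first awaited.
    private AsyncLazy<TokenResponse> TokenCacheFactory(string tokenName) =>
        new AsyncLazy<TokenResponse>(() => _tokenService.GetTokenAsync(tokenName));
}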

To walk through our GetTokenAsync() implementation, we start by calling _tokenCache.GetOrAdd(). At this point, if no entry exists yet for tokenName, a new one will be added. After that, we have to await the token entry to get the result out of the AsyncLazy (or to kick off the token request, if we’re the first caller to evaluate it).

After that, we have to check if the token is expired. If it is, we’ll need to request a new one. We eagerly call TryRemove here before generating the new token so other threads won’t get the expired token in the intervening time, and also so we can use GetOrAdd again with the intent of generating a new token.

There is a minor race condition here, in that a second request could hit the first GetOrAdd after our current request called TryRemove but before it calls the next GetOrAdd inside the token expiration check. This is okay though, because we do the same thing in both scenarios to generate a new token. So this still falls under the “might make additional instances of AsyncLazy, but only one of them will be evaluated” scenario and thus won’t cause any problems.

After we’re finished evaluating whether the token is expired, we actually need to await the cache entry one more time, in case a new one did get created by the second GetOrAdd. If we’re working with an existing token, the await should have negligible impact on accessing the value stored in the AsyncLazy. At this point, we should have a token, and we can return the token value to the caller. This results in some fairly concise code that doesn’t have to explicitly worry about thread synchronization mechanics.

All we need to do now is update our Startup.cs to use CachedTokenStore and we’re all set.
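
Something like this, registered as a singleton so all requests share the same cache:

services.AddSingleton<ITokenStore, CachedTokenStore>();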

One More Thing

I’ve left out some of the finer details, edge cases, and exception handling that you would of course need to account for in a production solution, but I do want to cover one particular scenario. Currently, if our async token retrieval fails, that failed result will remain as the entry in our store. This is because we’ve already added the corresponding AsyncLazy to our store for the given tokenName. This means that we would get an exception every time we await this failed entry.

So, in the case of failure, we need to make sure to remove the entry from the store, so the next request will again attempt to get a fresh token. To accomplish this, we need to make the following changes to our CachedTokenStore:
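
A sketch of what those changes might look like:

private async Task<TokenResponse> GetTokenEntryFromAsyncLazy(
    string tokenName, AsyncLazy<TokenResponse> tokenEntry)
{
    try
    {
        return await tokenEntry;
    }
    catch
    {
        // The token request faulted; evict the bad entry so the next
        // caller's GetOrAdd triggers a fresh attempt, then rethrow.
        _tokenCache.TryRemove(tokenName, out _);
        throw;
    }
}

// ...and within GetTokenAsync, each "await tokenEntry" becomes:
token = await GetTokenEntryFromAsyncLazy(tokenName, tokenEntry);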

By using GetTokenEntryFromAsyncLazy() each time we need to await the cache entry, rather than awaiting it directly, we can ensure that we remove a bad entry if one is present. The next request that calls GetOrAdd will end up running TokenCacheFactory (our valueFactory for the dictionary) and request a new token.

Other Improvements to Consider

There are several improvements that could be made to this pattern. For example, it’s generally a good idea to include some kind of buffer when considering whether the token is expired. This would help alleviate race conditions where the token was valid when the DelegatingHandler executed, but expired before the actual API call was made, causing an authentication failure with the downstream API. It also helps for long-running or multi-step processes, so you don’t end up with scenarios where the token was valid for the early steps but not for the later ones. In this case, we would just change CachedTokenStore to consider a (preferably configurable) buffer TimeSpan against the token’s expiration date, rather than directly checking token.IsExpired.
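
For instance (ExpiresAt and _expirationBuffer here are hypothetical names, not from the code above):

// Consider the token expired slightly before it actually is.
bool isEffectivelyExpired =
    token.ExpiresAt <= DateTimeOffset.UtcNow.Add(_expirationBuffer);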

If your use case calls for it, you might even consider updating the tokens via a background process, so specific requests aren’t burdened with the occasional additional time of token retrieval. The use of AsyncLazy may not be as useful in this case, since there would be no need for the requests to await the get token call. But I’ll leave that as an exercise for the reader. 😉

  1. Eric Lippert was formerly on the C# compiler team and knows a thing or two about the internals of how async/await work. 

  2. Stephen Toub is a Partner Engineer at Microsoft working on .NET. Also see him on GitHub.