Testing an SDK for Async/Await SynchronizationContext Deadlocks

Overview

The purpose of this post is to explain how to write unit and/or integration tests that ensure you don’t have synchronization context deadlocks in an SDK. A very detailed explanation of the problem and the solution can be found here. However, before explaining how to write the tests, I will give a brief summary of the problem and the solutions. Then we’ll get into how to test to make sure that the solution is correctly implemented now and won’t regress in the future.

The Problem

Deadlocks are one of the most common pitfalls for API developers doing asynchronous development with the .Net TPL. Most commonly, they are caused by SDK consumers using your asynchronous SDK in a synchronous manner. For example:

public ActionResult Index()
{
    // Note to consumers: Where possible DON'T DO THIS.  Just make your MVC action async, it works MUCH better.
    var data = api.SomeActionAsync().Result;

    return View(data);
}

The example above is typically a problem because MVC runs actions inside a SynchronizationContext. The MVC synchronization context prevents more than one thread from operating within the context simultaneously. The process flow works as follows:

  1. Thread A runs the action above, and requests “SomeActionAsync”
  2. Thread A blocks waiting on “Result” for “SomeActionAsync”
  3. Thread B begins processing the work for “SomeActionAsync”
  4. At some point, Thread B attempts to synchronize onto the SynchronizationContext from the MVC action, and is blocked waiting for Thread A to release it.
  5. We have a deadlock!

So why does Step 4 above happen? I know I didn’t write any code that requested that the MVC SynchronizationContext be used! Well, if you are using the async/await programming model, you’re doing so without even knowing it.

public async Task<SomeResult> SomeActionAsync()
{
    // do some work

    var temp = await obj.SomeOtherActionAsync();

    // do more work

    return result;
}

By default, the await above captures the current SynchronizationContext and tries to resume the rest of the method on it. This is actually pretty important for writing async MVC actions: when an async method is awaited in the action, we really want the remainder of the action to run within the SynchronizationContext once it completes. But within our SDK, we probably don’t want that to happen because it usually doesn’t have any value to us.

The Solution

There are two solutions to this problem: the good one and the bad one.

The Bad Solution: Require that SDK consumers use the SDK asynchronously

I don’t like this solution because it’s difficult to ensure that consumers use it correctly. It’s also a barrier to SDK use. I really believe that consumers should be given the option to consume the SDK how they like, even if it’s a bad way to consume it. The TPL provides a .Result call, so where possible we should make it work.

public async Task<ActionResult> Index()
{
    var data = await api.SomeActionAsync();

    return View(data);
}

An important note for SDK consumers, though. For you, this is the Good Solution. You should always use asynchronous API calls in an asynchronous manner whenever possible. This is only a Bad Solution if the SDK developer is assuming that you always do this.

The Good Solution: Fix the problem on the SDK side

Thankfully, the TPL provides us with a simple workaround in the SDK, using ConfigureAwait(false). Calling this method on a task before awaiting it tells the await not to resume on the captured SynchronizationContext.

public async Task<SomeResult> SomeActionAsync()
{
    // do some work

    var temp = await obj.SomeOtherActionAsync().ConfigureAwait(false);

    // do more work

    return result;
}
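To see the difference without a real MVC host, here is a minimal sketch. QueueingSynchronizationContext and DeadlockDemo are hypothetical names made up for this illustration: the context simply queues continuations for an owning thread that never gets to run them, roughly like the blocked Thread A above. The blocking wait uses a timeout so the demo reports the stall instead of hanging.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a UI or classic ASP.NET context: continuations are
// queued for the owning thread, which in this sketch never runs them because
// it is blocked waiting on the task.
sealed class QueueingSynchronizationContext : SynchronizationContext
{
    private readonly ConcurrentQueue<Action> _pending = new ConcurrentQueue<Action>();

    public override void Post(SendOrPostCallback d, object state) =>
        _pending.Enqueue(() => d(state));
}

static class DeadlockDemo
{
    // Stand-ins for an SDK method written without and with ConfigureAwait(false).
    public static async Task PlainAwait() => await Task.Delay(50);
    public static async Task WithConfigureAwait() => await Task.Delay(50).ConfigureAwait(false);

    // Blocks on the task the way .Result would, but gives up after a timeout.
    // Returns true if the task finished while the calling thread was blocked.
    public static bool CompletesWhileBlocked(Func<Task> sdkCall)
    {
        var previous = SynchronizationContext.Current;
        SynchronizationContext.SetSynchronizationContext(new QueueingSynchronizationContext());
        try
        {
            return sdkCall().Wait(TimeSpan.FromSeconds(2));
        }
        finally
        {
            SynchronizationContext.SetSynchronizationContext(previous);
        }
    }

    static void Main()
    {
        Console.WriteLine("plain await completes: " + CompletesWhileBlocked(PlainAwait));
        Console.WriteLine("ConfigureAwait(false) completes: " + CompletesWhileBlocked(WithConfigureAwait));
    }
}
```

With the plain await, the continuation sits in the queue and the wait times out. With ConfigureAwait(false), the continuation runs on a thread pool thread and the task completes while the caller is still blocked.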

Most Importantly, The Test

The problem with the Good Solution is that it requires you to place a lot of ConfigureAwait(false) calls throughout the SDK. This can be cumbersome, and the calls are easy to forget, though there is a ReSharper plugin, ConfigureAwaitHelper, to help.

Any good SDK comes with a battery of unit and integration tests. So the trick is to add tests to the SDK that will ensure that we don’t forget any ConfigureAwait(false) calls. So how do we add a test or tests that ensure we called ConfigureAwait(false)?

The trick is understanding how the SynchronizationContext works. Anytime it is used, a call will be made to either its Send or Post method. So, all we need to do is make a mock and ensure that these methods never get called:

// Requires: using System.Threading; using Moq; using NUnit.Framework;
[Test]
public void Test_SomeActionNoDeadlock()
{
    // Arrange
    var context = new Mock<SynchronizationContext>
    {
        CallBase = true
    };

    // Do other arrange actions here, including creating "api", the SDK object under test

    var previousContext = SynchronizationContext.Current;
    SynchronizationContext.SetSynchronizationContext(context.Object);
    try
    {
        // Act
        api.SomeActionAsync().Wait();

        // Assert

        // If the method is incorrectly awaiting on the current SynchronizationContext,
        // we will see calls to Post or Send on the mock

        context.Verify(m => m.Post(It.IsAny<SendOrPostCallback>(), It.IsAny<object>()), Times.Never);
        context.Verify(m => m.Send(It.IsAny<SendOrPostCallback>(), It.IsAny<object>()), Times.Never);
    }
    finally
    {
        // Restore whatever context the test runner had installed
        SynchronizationContext.SetSynchronizationContext(previousContext);
    }
}

This example uses NUnit and Moq, but it should work just as well with other testing frameworks. Now we have a way to guarantee that ConfigureAwait(false) was used throughout the SDK method, so long as we get 100% test coverage through the logical paths in the method.
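If the SDK has many async methods, the arrange and cleanup steps can be pulled into a small helper so each test becomes a one-liner. This is only a sketch: SyncContextAssert and NeverUsesContext are hypothetical names, not part of NUnit or Moq, and the helper just wraps the same Moq calls used in the test above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;
using Moq;

// Hypothetical helper (not part of NUnit or Moq): runs an SDK call under a mocked
// SynchronizationContext and fails if any continuation was posted or sent to it.
public static class SyncContextAssert
{
    public static void NeverUsesContext(Func<Task> sdkCall)
    {
        var context = new Mock<SynchronizationContext> { CallBase = true };
        var previous = SynchronizationContext.Current;

        SynchronizationContext.SetSynchronizationContext(context.Object);
        try
        {
            // The call must start while the mock is the current context,
            // because the context is captured at each await.
            sdkCall().Wait();

            context.Verify(m => m.Post(It.IsAny<SendOrPostCallback>(), It.IsAny<object>()), Times.Never);
            context.Verify(m => m.Send(It.IsAny<SendOrPostCallback>(), It.IsAny<object>()), Times.Never);
        }
        finally
        {
            // Put back whatever context the test runner had installed.
            SynchronizationContext.SetSynchronizationContext(previous);
        }
    }
}
```

A test body then reduces to something like SyncContextAssert.NeverUsesContext(() => api.SomeActionAsync()). One caveat: an await on a task that has already completed continues synchronously and never touches the context, so a missing ConfigureAwait(false) goes unnoticed. If the SDK's dependencies are faked in the test, have the fakes return tasks that complete later (for example Task.Delay or Task.Run) rather than Task.FromResult.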

Of course, you may ask “Why do I need this?  I just looked at the code, I’m always calling ConfigureAwait(false)!” The answer is preventing regressions. You might have remembered today, but next month when you’re making a change it’s very easy to forget. This test is your fallback plan in case you make a mistake in the future.

Query NoSQL From .Net Using Couchbase and LINQ

Historically, one of the biggest hurdles dealing with NoSQL solutions has been querying the data.  In particular, the learning curve for new developers trained in SQL has been very steep.

Couchbase recently helped to address this limitation with the release of N1QL (a SQL variant) in Couchbase Server 4.0.  N1QL vastly expands upon both the power of Couchbase as well as the ease with which developers can query the data.  Because the language is so similar to SQL, developers can quickly make the transition.  As an example, at CenterEdge we recently hired a new developer, and we had him up and working with N1QL queries within a day.

Now, Couchbase has announced the release of their Couchbase LINQ Provider for .Net.  The provider is available as a NuGet package which adds support to the existing SDK.

For .Net developers, using the LINQ provider for Couchbase makes the learning curve even shallower.  With just a few exceptions, developers trained to write queries using LINQ with NHibernate or Entity Framework can make the transition without learning any new syntax.  Plus, due to Couchbase’s JSON document data structure, the LINQ provider adds support for a variety of N1QL features that aren’t possible with more traditional table-based data stores.

So, if you’re a .Net developer interested in NoSQL and big data, I strongly encourage you to check it out.  It’s a big step in bringing the performance and scalability of NoSQL to the masses without the headaches.

Disclaimer:  I’m extra excited about this release because it is an open source project I was able to contribute to.  You can see the source code on GitHub.

Couchbase and N1QL Security

As a developer at CenterEdge Software, I’ve had a lot of cause to use Couchbase as our NoSQL database platform over the last few years.  I’ve gotten really excited about the potential of the new Couchbase query language in Couchbase 4.0, called N1QL.  So excited that I’ve spent a lot of time contributing to the Linq2Couchbase library, which allows developers to use LINQ to transparently create N1QL queries.

In doing work with N1QL, I quickly realized that it may have some of the same security concerns as SQL.  In particular, N1QL injection could be a new surface area for attack in Couchbase 4.0.  That’s what I call the N1QL equivalent of SQL injection.  I found that while the risks are lower in N1QL than in SQL, there are still some areas that need to be addressed by application developers using Couchbase.

As a result, I did some research and recently wrote a guest post on N1QL security for Couchbase users.  It researches possible N1QL injection security concerns, then goes into how to protect your applications when using N1QL.

http://blog.couchbase.com/2015/september/couchbase-and-n1ql-security-centeredgesoftware

Windows Domain Account Lockout Mystery

In addition to development, I sometimes get saddled with some domain administration.  We recently encountered a strange mystery, where a user’s account was being locked out every day as soon as they booted up their computer.  They hadn’t even tried to log in yet, but their account was being magically locked out.

After lots of research, all of the obvious solutions were excluded.  We finally tracked it down by turning on Kerberos logging on the client computer (http://support.microsoft.com/kb/262177).  We then found Event ID 14, stating “The password stored in Credential Manager is invalid”.  But there were no passwords stored in the Credential Manager!

At this point, we found this very helpful forum discussion that explains it: http://social.technet.microsoft.com/Forums/windows/en-US/e1ef04fa-6aea-47fe-9392-45929239bd68/securitykerberos-event-id-14-credential-manager-causes-system-to-login-to-network-with-invalid?forum=w7itprosecurity.

Apparently, your user account credentials can get saved to the SYSTEM (a.k.a. local computer) account on the computer.  Once there, you can’t access them through any normal UI to remove them.  We think this probably had something to do with our RADIUS auth on the WiFi network, but we’re not sure.  Fortunately, the instructions in the post were spot on.

Download PsExec.exe from http://technet.microsoft.com/en-us/sysinternals/bb897553.aspx and copy it to C:\Windows\System32 .

From a command prompt run:    psexec -i -s -d cmd.exe

From the new DOS window run:  rundll32 keymgr.dll,KRShowKeyMgr

The only additional note I would add is that you need to run the command prompt as an Administrator, if you have UAC enabled.

Correcting MVC 3 EditorFor Template Field Names When Using Collections

So, I recently ran into a problem with ASP.NET MVC 3 and editor templates when dealing with models that contain collections. If you handle the collection directly in the view, it works fine:

@For i as integer = 0 To Model.Locations.Count-1
    Dim index As Integer = i

    @<div>
        @Html.EditorFor(Function(model) model.Locations(index).Name)
        @Html.EditorFor(Function(model) model.Locations(index).Description)
    </div>
Next

In the above example, the fields will receive the names “Locations[0].Name” and “Locations[0].Description”, and so on for each field. This is correct, and will result in the model being parsed correctly by the DefaultModelBinder if it’s a parameter on your POST action.

However, what if you want to make an editor template that always gets used for a specific collection type?  Let’s say, for example, that the collection in the model is of type LocationCollection, and you make an editor template named LocationCollection:

@ModelType LocationCollection
@For i as integer = 0 To Model.Count-1
    Dim index As Integer = i
    @<div>
        @Html.EditorFor(Function(model) model(index).Name)
        @Html.EditorFor(Function(model) model(index).Description)
    </div>
Next

Then you reference it in your view like this:

@Html.EditorFor(Function(model) model.Locations)

In this case, the fields will actually have incorrect names: they will be named “Locations.[0].Name” and “Locations.[0].Description”.  Notice the extra period before the array index specifier.  With these field names, the model binder won’t recognize the fields when they come back in the post.

The solution to this problem is a little cumbersome, due to the way the field names are built.  First, the prefix is passed down to the editor template in ViewData.TemplateInfo.HtmlFieldPrefix.  However, this prefix is passed down WITHOUT the period.  The period is added as a delimiter by the ViewData.TemplateInfo.GetFullHtmlFieldName function, which is used by all of the editor helpers, like Html.TextBox.

This means that we actually can’t fix it in the editor template shown above.  Instead, we need TWO editor templates.  The first one for LocationCollection, and then a second one for Location:

@* Editor template: LocationCollection *@
@ModelType LocationCollection
@For i as integer = 0 To Model.Count-1
    Dim index As Integer = i
    @Html.EditorFor(Function(model) model(index))
Next

@* Editor template: Location *@
@ModelType Location
@Code
    ViewData.TemplateInfo.FixCollectionPrefix()
End Code
<div>
    @Html.EditorFor(Function(model) model.Name)
    @Html.EditorFor(Function(model) model.Description)
</div>

By breaking this up into two editor templates, we can now correct the prefix in the second template.  The first editor template receives an HtmlFieldPrefix of “Locations”, so we can’t do anything with that.  However, the Location editor template receives an HtmlFieldPrefix of “Locations.[0]”, so all we need to do is remove the extra period and the problem is solved.  As you can see in my example above, I’m calling a method on TemplateInfo, FixCollectionPrefix.  This is a simple extension method that corrects the prefix, which you just need to implement somewhere in an imported namespace:

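Imports System.Runtime.CompilerServices
Imports System.Text.RegularExpressions
Imports System.Web.Mvc

Public Module TemplateInfoExtensions ' module name is arbitrary
    <Extension()>
    Public Sub FixCollectionPrefix(templateInfo As TemplateInfo)
        Dim prefix As String = templateInfo.HtmlFieldPrefix
        If Not String.IsNullOrEmpty(prefix) Then
            ' Remove the stray period before a trailing index: "Locations.[0]" -> "Locations[0]"
            templateInfo.HtmlFieldPrefix = Regex.Replace(prefix, "\.(\[\d+\])$", "$1")
        End If
    End Sub
End Module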
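To check what the replacement does to a few prefixes, here is a small standalone sketch in C#. PrefixDemo and its Fix method are hypothetical names that operate on plain strings rather than on TemplateInfo, and this sketch escapes the period in the pattern so that only a literal "." in front of a trailing index is removed.

```csharp
using System;
using System.Text.RegularExpressions;

static class PrefixDemo
{
    // Removes a period that sits directly before a trailing "[n]" index.
    public static string Fix(string prefix) =>
        string.IsNullOrEmpty(prefix)
            ? prefix
            : Regex.Replace(prefix, @"\.(\[\d+\])$", "$1");

    static void Main()
    {
        Console.WriteLine(Fix("Locations.[0]"));         // Locations[0]
        Console.WriteLine(Fix("Order.Locations.[12]"));  // Order.Locations[12]
        Console.WriteLine(Fix("Locations[0]"));          // already correct, unchanged
        Console.WriteLine(Fix("Locations"));             // no index, unchanged
    }
}
```

An unescaped "." would match any character, so a prefix that is already correct, such as “Locations[0]”, would lose the character in front of the index.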

And that’s all there is to it. I certainly hope that the MVC team fixes this problem internally for their next release, but until then we’ll just have to keep working around it.