Rebuild All Couchbase N1QL Indexes After Restore


When restoring a Couchbase cluster from a backup, the restore utility is kind enough to recreate the N1QL indexes for you.  To improve speed and efficiency, the indexes are only created; they are not built automatically.  Before they can be used, you must execute a build command such as this:

BUILD INDEX ON BucketName (IndexName1, IndexName2, IndexName3)

It is important that this query be issued as a single command for all indexes on a bucket.  This allows the indexes to be built together, resulting in only one read of the data from the cluster while building multiple indexes.
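
As a rough illustration of what a single combined statement looks like, here is a minimal sketch that composes one BUILD INDEX command from a list of names. The helper name build_index_statement is hypothetical and not part of any Couchbase tooling; it only formats text and does not talk to the cluster.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not a Couchbase tool): compose one BUILD INDEX
# statement for a bucket from any number of index names.
build_index_statement() {
  local bucket="$1"
  shift
  local names=""
  local name
  for name in "$@"; do
    # Quote each name with backticks; add ", " once the list is non-empty
    names="${names:+$names, }\`$name\`"
  done
  printf 'BUILD INDEX ON `%s` (%s)\n' "$bucket" "$names"
}

build_index_statement "BucketName" "IndexName1" "IndexName2" "IndexName3"
```

The backticks are N1QL identifier escapes, which matter when a bucket or index name contains characters such as a hyphen.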

The Problem

Unfortunately, N1QL doesn’t currently offer a wildcard option, so there is no quick way to rebuild all indexes without typing all of their names.  If you’re trying to script your environments for development or QA this can be particularly problematic, as the list of indexes may not be constant. It could also be a problem when creating scripts for a disaster recovery plan.

The Solution

If you’re running on Linux (you should be for production clusters), the solution is to use this script:



# Query node to run against (example value; 8093 is the default query service port)
QUERY_HOST=http://localhost:8093

for i in "BucketName1" "BucketName2" "BucketName3"
do
  # Build every deferred index on the bucket with a single BUILD INDEX statement
  /opt/couchbase/bin/cbq -e $QUERY_HOST -s="$( \
    echo "BUILD INDEX ON $i (\`$( \
      /opt/couchbase/bin/cbq -e $QUERY_HOST -q=true -s="SELECT name FROM system:indexes where keyspace_id = '$i' AND state = 'deferred'" | \
        sed -n -e '/{/,$p' | \
        jq -r '.results[].name' | \
        sed ':a;/.*/{N;s/\n/\`,\`/;ba}')\`)")"

  # Wait for completion
  until [ `/opt/couchbase/bin/cbq -e $QUERY_HOST -q=true -s="SELECT COUNT(*) as unbuilt FROM system:indexes WHERE keyspace_id = '$i' AND state <> 'online'" | sed -n -e '/{/,$p' | jq -r '.results[].unbuilt'` -eq 0 ];
  do
    sleep 5
  done
done

Replace the QUERY_HOST parameter as needed to point to a query node, and replace the BucketName values with the names of the buckets where indexes must be built.  It will process each bucket one at a time, waiting for the indexes to be built before continuing to the next bucket.
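
The densest part of the script is the step that turns the list of index names, one per line, into a backtick-quoted, comma-separated list. Here is a minimal sketch of that step on its own. It uses paste instead of the looping sed expression in the script, and join_index_names is a hypothetical name chosen for illustration.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read index names from stdin, one per line, and
# print them as a single line of backtick-quoted, comma-separated names.
join_index_names() {
  # Wrap each line in backticks, then join all lines with commas
  sed -e 's/^/`/' -e 's/$/`/' | paste -sd, -
}

printf 'IndexName1\nIndexName2\nIndexName3\n' | join_index_names
```

Unlike this sketch, the script adds the outermost pair of backticks separately around the joined list, which is why its sed expression only inserts the separators between names.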

The only dependency is the jq utility, which is a command line JSON parser. On Ubuntu, this can be installed via:

sudo apt-get install -y jq

The script isn’t pretty, but it gets the job done. Hopefully N1QL will get a wildcard BUILD INDEX command in the future.

Note: Revised 9/15/2016 to better strip header information from the query output before the JSON object. Previously it only stripped the first line; now it strips all lines until it encounters the curly brace starting the JSON result.
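
To see what the header-stripping step does in isolation, here is a minimal sketch that runs the same sed expression against made-up input. The banner lines are placeholders, not real cbq output, and strip_cbq_header is a hypothetical name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print from the first line containing "{" through
# the end of the input, dropping everything before it.
strip_cbq_header() {
  sed -n -e '/{/,$p'
}

# Placeholder input: two banner lines followed by a JSON result
sample_output='banner line 1
banner line 2
{
  "results": [ { "name": "IndexName1" } ]
}'

printf '%s\n' "$sample_output" | strip_cbq_header
```

Because the range starts at the first line containing a curly brace, any number of banner lines can precede the JSON without breaking the jq step that follows.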

Couchbase Server and Windows 10 Anniversary Edition Problems

The Problem

Recently, I ran into some problems with my Couchbase Server 4.5 installation on my Windows 10 development box. The memcached process would crash over and over again with an error code 255.

After doing some research (and getting some assistance, thanks @ingenthr), I determined it’s a known bug in Couchbase Server exposed by the recent release of Windows 10 Anniversary Edition. Apparently, Couchbase Server uses a third-party library which incorrectly uses some private Windows APIs for memory allocation. The Windows 10 Anniversary Edition update removed these API calls, causing the crashes. The bug report is filed with Couchbase as MB-20519.

The Workaround

The only known direct workaround is to uninstall the Windows 10 Anniversary Update. Personally, I don’t find this to be a very good solution. Additionally, based on the bug report, I’m not optimistic about a quick fix from Couchbase. It seems like there’s a lot of work involved, and it understandably isn’t urgent because Windows is only supported for development, not production.

I decided instead to play with Docker, and I was very pleasantly surprised at how easy it was to use Docker to get Couchbase Server running on a Windows box. It only took me a few minutes.

  1. Be sure that Hyper-V is installed on your machine via “Turn Windows features on or off” in Control Panel
  2. Install Docker for Windows (I used the Stable Channel)
  3. Start Docker (I did this as the last step of the installation)
  4. Right click the Docker icon in your system tray (next to the clock), and open Settings.  Go to Shared Drives, and share your C drive.  This will require your Windows password.
  5. Open PowerShell and run this command to make a data folder:

    mkdir $env:userprofile\Couchbase

  6. Then run this command to start up the Docker container:

    docker run -d --name db -p 8091-8094:8091-8094 -p 11207:11207 -p 11210-11211:11210-11211 -p 18091-18093:18091-18093 -v $env:userprofile/Couchbase:/opt/couchbase/var couchbase

  7. Once complete, open http://localhost:8091/ to complete server configuration


This configuration will always create the Docker container with the latest version of Couchbase Server, currently 4.5.  Command line arguments can be used to alter this; see the Docker pages for Couchbase for more information.

This configuration puts all Couchbase data in your C:\Users\myusername\Couchbase folder.  If you remove the Docker container and recreate it, it will start up with your configuration and data intact.  If you want to start from scratch, delete this folder before recreating the Docker container.

There are a few compatibility requirements for this solution:

  1. Hyper-V is incompatible with VirtualBox. If you are using VirtualBox, you should use a different solution.
  2. The client and management ports used by Couchbase must be available on your local machine.
  3. This setup only supports running a single Couchbase node, because additional nodes would contend for the same network ports.

Testing an SDK for Async/Await SynchronizationContext Deadlocks


The purpose of this post is to explain how to write unit and/or integration tests that ensure you don’t have synchronization context deadlocks in an SDK. A very detailed explanation of the problem and the solution can be found here. However, before explaining how to write the tests, I will give a brief summary of the problem and the solutions. Then we’ll get into how to test to make sure that the solution is correctly implemented now and won’t regress in the future.

The Problem

One of the common API developer pitfalls in asynchronous development with the .Net TPL is deadlocks. Most commonly, this is caused by SDK consumers using your asynchronous SDK in a synchronous manner. For example:

public ActionResult Index()
{
    // Note to consumers: Where possible DON'T DO THIS.  Just make your MVC action async, it works MUCH better.
    var data = api.SomeActionAsync().Result;

    return View(data);
}

The example above is typically a problem because MVC runs actions inside a SynchronizationContext. The MVC synchronization context prevents more than one thread from operating within the context simultaneously. The process flow works as follows:

  1. Thread A runs the action above, and requests “SomeActionAsync”
  2. Thread A blocks waiting on “Result” for “SomeActionAsync”
  3. Thread B begins processing the work for “SomeActionAsync”
  4. At some point, Thread B attempts to synchronize onto the SynchronizationContext from the MVC action, and is blocked waiting for Thread A to release it.
  5. We have a deadlock!

So why does Step 4 above happen? I know I didn’t write any code that requested that the MVC SynchronizationContext be used! Well, if you are using the async/await programming model, you’re doing so without even knowing it.

public async Task<SomeResult> SomeActionAsync()
{
    // do some work

    var temp = await obj.SomeOtherActionAsync();

    // do more work

    return result;
}

The await above automatically tries to synchronize with the SynchronizationContext. This is actually pretty important for writing async MVC actions. When an async method is awaited in the action, once it’s complete we really want the remainder of the action to run within the SynchronizationContext. But within our SDK, we probably don’t want that to happen because it usually doesn’t have any value to us.

The Solution

There are two solutions to this problem: the good one and the bad one.

The Bad Solution: Require that SDK consumers use the SDK asynchronously

I don’t like this solution because it’s difficult to ensure that consumers use it correctly. It’s also a barrier to SDK use. I really believe that consumers should be given the option to consume the SDK how they like, even if it’s a bad way to consume it. The TPL provides a .Result call, so where possible we should make it work.

public async Task<ActionResult> Index()
{
    var data = await api.SomeActionAsync();

    return View(data);
}

An important note for SDK consumers, though. For you, this is the Good Solution. You should always use asynchronous API calls in an asynchronous manner whenever possible. This is only a Bad Solution if the SDK developer is assuming that you always do this.

The Good Solution: Fix the problem on the SDK side

Thankfully, the TPL provides us with a simple workaround in the SDK, using ConfigureAwait(false). Calling this method on a task before awaiting it tells the await not to resume on the captured SynchronizationContext.

public async Task<SomeResult> SomeActionAsync()
{
    // do some work

    var temp = await obj.SomeOtherActionAsync().ConfigureAwait(false);

    // do more work

    return result;
}

Most Importantly, The Test

The problem with the Good Solution is that it requires you to place a lot of ConfigureAwait(false) calls throughout the SDK. This can be cumbersome, and the calls are easy to forget, though there is a ReSharper plugin to help: ConfigureAwaitHelper.

Any good SDK comes with a battery of unit and integration tests. So the trick is to add tests to the SDK that will ensure that we don’t forget any ConfigureAwait(false) calls. So how do we add a test or tests that ensure we called ConfigureAwait(false)?

The trick is understanding how the SynchronizationContext works. Anytime it is used, a call will be made to either its Send or Post method. So, all we need to do is make a mock and ensure that these methods never get called:

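[Test]
public void Test_SomeActionNoDeadlock()
{
    // Arrange
    var context = new Mock<SynchronizationContext>
    {
        CallBase = true
    };

    // Install the mock as the current context so any captured await goes through it
    SynchronizationContext.SetSynchronizationContext(context.Object);

    try
    {
        // Do other arrange actions here

        // Act (api is the SDK object under test, as in the earlier examples)
        api.SomeActionAsync().Wait();

        // Assert

        // If the method is incorrectly awaiting on the current SynchronizationContext
        // We will see calls to Post or Send on the mock

        context.Verify(m => m.Post(It.IsAny<SendOrPostCallback>(), It.IsAny<object>()), Times.Never);
        context.Verify(m => m.Send(It.IsAny<SendOrPostCallback>(), It.IsAny<object>()), Times.Never);
    }
    finally
    {
        // Restore the default so other tests are not affected
        SynchronizationContext.SetSynchronizationContext(null);
    }
}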

This example uses NUnit and Moq, but it should work just as well with other testing frameworks. Now we have a way to guarantee that ConfigureAwait(false) was used throughout the SDK method, so long as we get 100% test coverage through the logical paths in the method.

Of course, you may ask “Why do I need this?  I just looked at the code, I’m always calling ConfigureAwait(false)!” The answer is preventing regressions. You might have remembered today, but next month when you’re making a change it’s very easy to forget. This test is your fallback plan in case you make a mistake in the future.

Query NoSQL From .Net Using Couchbase and LINQ

Historically, one of the biggest hurdles dealing with NoSQL solutions has been querying the data.  In particular, the learning curve for new developers trained in SQL has been very steep.

Couchbase recently helped to address this limitation with the release of N1QL (a SQL variant) in Couchbase Server 4.0.  N1QL vastly expands upon both the power of Couchbase as well as the ease with which developers can query the data.  Because the language is so similar to SQL, developers can quickly make the transition.  As an example, at CenterEdge we recently hired a new developer, and we had him up and working with N1QL queries within a day.

Now, Couchbase has announced the release of their Couchbase LINQ Provider for .Net.  The provider is available as a NuGet package which adds support to the existing SDK.

For .Net developers, using the LINQ provider for Couchbase makes the learning curve even shallower.  With just a few exceptions, developers trained to write queries using LINQ with NHibernate or Entity Framework can make the transition without learning any new syntax.  Plus, due to Couchbase’s JSON document data structure, the LINQ provider adds support for a variety of N1QL features that aren’t possible with more traditional table-based data stores.

So, if you’re a .Net developer interested in NoSQL and big data, I strongly encourage you to check it out.  It’s a big step in bringing the performance and scalability of NoSQL to the masses without the headaches.

Disclaimer:  I’m extra excited about this release because it is an open source project I was able to contribute to.  You can see the source code on GitHub.

Couchbase and N1QL Security

As a developer at CenterEdge Software, I’ve had a lot of cause to use Couchbase as our NoSQL database platform over the last few years.  I’ve gotten really excited about the potential of the new Couchbase query language in Couchbase 4.0, called N1QL.  So excited that I’ve spent a lot of time contributing to the Linq2Couchbase library, which allows developers to use LINQ to transparently create N1QL queries.

In doing work with N1QL, I quickly realized that it may have some of the same security concerns as SQL.  In particular, N1QL injection, which is what I call the N1QL equivalent of SQL injection, could be a new surface area for attack in Couchbase 4.0.  I found that while the risks are lower in N1QL than in SQL, there are still some areas that need to be addressed by application developers using Couchbase.

As a result, I did some research and recently wrote a guest post on N1QL security for Couchbase users.  It researches possible N1QL injection security concerns, then goes into how to protect your applications when using N1QL.