Multithreaded code is easy. And very hard. It’s easy when you learn a few patterns and perfect your use of them. And then it’s wickedly difficult anytime you get confident and stray outside of those bounds and try to do something awesome. This seems to happen a few times per year.

I recently had a deadlock in my Paint.NET 4.0 code that I’m still untangling*. I have a custom interop layer for handling Direct2D, DirectWrite, UI Animation Manager, and Windows Imaging Component. There are 3 systems that constantly headbutt into each other:

  1. RCW caching. “RCW” stands for Runtime Callable Wrapper, where “runtime” refers to “.NET Runtime.” These is a managed object that holds onto a COM object reference, and provides proxy methods and properties. Whenever a COM object passes over into .NET land, a wrapper must be created. However, in order to maintain object identity, you need to cache these. This way, you can use Object.ReferenceEquals(a, b) to see if ‘a’ and ‘b’ refer to the same COM object. This RCW cache became especially important when I added support for the UI Animation Manager. It raises a lot of events, and you need to be able to know which object is raising the event in order to determine what to do. That isn’t possible if every ‘sender’ object is a brand new RCW.
  2. Deferred resource initialization and caching. In Direct2D, when you create a “solid color brush” (for example), you’re allocating a resource on a specific render target. You can think of this as a “physical” resource. However, managing the lifetime of these can get a bit tedious. You have two choices. The first is to create the resource, draw with it, and then immediately delete it. This results in simple, clear code, but it’s not very good for performance. So, you want to reuse these resources across your OnPaint() calls, and you need to dispose them whenever the render target needs to be recreated. This isn’t difficult, but multiplied across a lot of code, you end up with a lot of bug hazards. So, I came up with a system whereby I can just create a brush descriptor, which implements the same ISolidColorBrush managed interface, and pass it to the same methods on the Direct2D render target wrapper which take “physical” resources. Before the real drawing calls are invoked (e.g., ID2D1RenderTarget::FillRectangle(brush, rect)), it first checks to see if it’s a deferred resource and whether the physical counterpoint is in the cache.
  3. Change notification for deferred resources. Some deferred resources must encapsulate other deferred resources. For instance, a LinearGradientBrush contains a GradientStopCollection. In my deferred resource system you can make changes to the deferred resource (“Color” has a property setter, etc.), whereas the hardware resources are read-only (as per Direct2D’s definition). When you make a change to a GradientStopCollection, it invalidates itself and must also cause its containing LinearGradientBrush (if any) to be invalidated. In order to simplify object lifetime management and avoid memory leaks, this uses resource IDs and weak references. These events go through an event router, e.g. “I’m resource #15, please send a NotifyChanged event to my subscribers.” as well as, “I’m resource #16 and I want to know whenever resource #15 is changed or disposed.” The router then has a list of weak references to resources which it notifies.

All of these caches use reader/writer locks. Usually a resource has already been created, so we can just grab the read lock and be done with it. The problem I ran into, once I started to use Direct2D across multiple threads, was that my “GetPhysicalResource” method in the resource cache looked something like this:

private ReaderWriterLockSlim rwLock;
private Dictionary<ResourceDescriptor, Resource> cache;

Resource GetPhysicalResource(ResourceDescriptor key)
{
    this.rwLock.EnterReadLock();

    try
    {
        Resource resource;
        if (this.cache.TryGetValue(key, out resource))
            return resource;
    }

    finally
    {
        this.rwLock.ExitReadLock();
    }

    this.rwLock.EnterWriteLock();

    try
    {
        Resource resource;
        if (this.cache.TryGetValue(key, out resource))
            return resource;
        else
        {
            resource = key.CreatePhysicalResource(this); // ruh roh
            this.cache.Add(key, resource);
            return resource;
        }
    }
   
    finally
    {
        this.rwLock.ExitWriteLock();
    }
}

I’ve highlighted the problematic piece of code (well, I tried to). The reason this line of code is bad is because it’s invoking a callback method while holding a lock (it’s a virtual method). The method it’s calling may need to create RCWs, or create other physical resources, etc. in order to finish the request. When this is happening on multiple threads we wind up with these 3 systems all trying to enter various locks in various orders. Woops. And, I might add, duh.

It took me awhile to figure out what was going on, and it was a face palm moment when I finally did: I had just committed a Multithreading 101 fallacy. It was just a simple deadlock, and nothing nefarious such as a bug in ReaderWriterLockSlim, or a solar flare, or a glitch hiding somewhere in ~250,000 lines of code. Nope, it was right there, plain as day, and easy to understand. Sometimes it really is the simplest answer possible. It’s like the guy who can’t get his car to start, and is trying and trying and trying, but he just can’t get it to start and can’t figure out why. Then his 8-year old daughter comes outside and says something simple like, “Maybe it’s out of gas!” And it is. Derp de derp.

Clearly, we don’t want to invoke the callback method while holding the lock. But, we must update the cache! Tossing ‘null’ in there as a placeholder isn’t really going to work well, as we’ll just have 2 problems at that point (lots of extra synchronization, and messy error handling).

.NET 4.0 introduces a class called Lazy<T>. You give it a Func<T> in the constructor, and it provides a Value property for you to retrieve the result of executing that delegate. However, it won’t execute the delegate until the Value property’s getter is called for the first time.

Let’s use it:

private ReaderWriterLockSlim rwLock;
private Dictionary<ResourceDescriptor, Lazy<Resource>> cache;

Resource GetPhysicalResource(ResourceDescriptor key)
{
    this.rwLock.EnterReadLock();

    Lazy<Resource> lazyResource;
    try
    {
        this.cache.TryGetValue(key, out lazyResource);
    }

    finally
    {
        this.rwLock.ExitReadLock();
    }

    if (lazyResource != null)
        return lazyResource.Value;

    this.rwLock.EnterWriteLock();

    try
    {
        if (!this.cache.TryGetValue(key, out lazyResource))
        {
            lazyResource = new Lazy<Resource>(() => key.CreatePhysicalResource(this)); // better!
            this.cache.Add(key, lazyResource);
        }
    }
   
    finally
    {
        this.rwLock.ExitWriteLock();
    }

    if (lazyResource!= null)
        return lazyResource.Value;
}

This solves the problem. Locks are only used when working with the cache itself, and never while doing other work. The callback is always called outside of holding a lock, and there’s no hijinks involved in checking to see whether something is null (as a placeholder), and no need for complicated synchronization if you retrieve the resource from the Dictionary while it’s still being created by another thread (Lazy<T>.Value will block other threads). Error handling is also simple. If the value generator throws an exception, well then we’ll crash (ok ok, my error handling is actually a little more interesting than that – you don’t want a “broken” resource to remain in the cache, for instance.)

There is one bit of weirdness with this system. If thread 1 creates the Lazy<T> and adds it to the cache, it’s possible that thread 2 will retrieve it from the cache and use Lazy<T>.Value before thread 1 gets to it. So, you may end up with misattribution for an error in case your Func<T> throws an exception. However, this seems minor – all requests for the same resource descriptor can be treated as equivalent, so it really doesn’t matter which thread gets the exception. (Ok, I’m simplifying, because this depends heavily on context. I’ve made the executive decision to live with thread misattribution of exceptions instead of deadlocks.)

This code is, of course, abbreviated. I’m actually using a custom class I wrote called LazyResult<T, TArg> because I don’t want to allocate a new anonymous delegate every time I need to create a resource. It uses a Func<TArg, T> instead of a Func<T>, and you provide a TArg to its constructor. With this class I’m able to allocate one delegate per cache, and each LazyResult<T, TArg> takes that same delegate but with a different “TArg” (either the ResourceDescriptor for the resource cache, or a pointer to the COM object for the RCW cache). This cuts the object allocations in half. There is a bit more error handling involved, and I’m also using WeakReferences to LazyResult.

Anyway, I found this useful. Hopefully someone else will too.

* mostly because I haven’t had much time to work on it :(

About these ads