The theme of Paint.NET v3.5 is … performance

I sat down to write some notes before starting this blog entry, and I wound up with two full pages in OneNote on the 1920×1200 monitor it was sitting on. The more I work on it, the more excited I am about the Paint.NET v3.5 release. It isn’t one that introduces a lot of really cool or big-ticket features, but the list of small improvements is really adding up. I’ve been able to do a lot of research and prototyping in esoteric areas of multithreading and concurrency, and have gained both more mastery and more fear of these topics.

Performance work in Paint.NET v3.5 has wound up focusing on 3 areas:

  1. Scaling up. As everyone’s been saying for years, the future is increasingly multithreaded. My newest CPU upgrade leaves me with 8 threads in Task Manager (Intel Core i7 overclocked to 3.8GHz). A lot of research and work has gone into making sure that Paint.NET continues to scale with more threads, and that I have better tools for safely and correctly implementing this across more of the application.

    “A high-end 64-bit Intel Core i7 desktop should run Paint.NET very fast.”

  2. Scaling down. Those $300 netbooks that are taking everyone by storm only run about as fast as what I was using 7 years ago (Pentium 4 at 2.0–2.5 GHz). Clearly, classic optimization strategies are important as well: trimming cycles, removing or deferring code execution, and optimizing repainting.

    “A brand-new netbook with an Atom processor should run Paint.NET comfortably.”

  3. Reducing memory usage. I guess this goes with scaling down. I made a bet a long time ago that 64-bit would slowly take care of the way I was allocating memory, which simplified development work but has had the consequence of consuming vast amounts of virtual address space. I was wrong: 32-bit will be here for a long time, especially since most of those hot-like-pancakes $300 netbooks are not 64-bit capable. This is currently my top reliability issue, as running out of memory causes Paint.NET to crash.

    “It’s not all yours.”

I’ve had to split this discussion over several blog entries because otherwise it was too long and even I would have fallen asleep reading it. I’ll summarize the results here though:

  • Images open much faster, especially on single-core/single-thread systems. Actually, I already wrote about this, so go read that first 🙂
  • I ordered and assembled my own Atom-based mini-desktop (“nettop”), in order to keep myself honest as I was working on my Core 2 Quad QX6700 2.67 GHz monster and subsequently as I upgraded to a Core i7 920 2.66GHz overclocked to 3.8 GHz.
  • The selection renderer has been completely rewritten. No more dancing ants and no more GDI+ means much lower CPU usage and better performance with multiple CPU cores.
  • Much better CPU scaling for the image composition rendering pipeline using LINQ-esque functional programming and deferred execution techniques.
  • A rewritten “render cache” has resulted in an average of 30-50% less memory usage when opening multiple images, especially those with just a single layer (PNG, JPEG). This means fewer out-of-memory crashes and the ability to open more images at once.

Paint.NET v3.5 is a stepping stone towards a hopefully epic v4.0. I’m slowly rebuilding the application from the inside out, and it takes a lot of time to do the necessary research and development. About 2 years ago, right around the time I was preparing to release Paint.NET v3.0, I had this nagging feeling in the back of my head that said basically “ur doin’ it wrong”. My document model was wrong, my application model was brittle, and I just couldn’t implement really cool features without using up a ton of memory. I also couldn’t provide features like scripting or a better extensibility model (plugins) in a manner that was both safe and powerful.

However, I didn’t really know how to solve all of this at a scale lower than the 50,000-foot view. Since then I’ve been slowly piecing together the tools and knowledge that I’ll need to create the best version of Paint.NET ever – one that’s great both outside (for users) and inside (for developers).

Now, if you’ll excuse me, I’ve got to stop breaking things and start fixing them so that I can push out an alpha release.

Mid-January Progress Update on Paint.NET v3.5

I think it’s best to quote a private message between Ed Harvey and me on the forums:

I’ve got to stop breaking things before I start fixing them …

Paint.NET v3.5 is turning out to be more work than I originally anticipated! What started out as a “simple” rewrite of the selection rendering system has turned into a major refactor of large portions of the code base. I’ve done a wholesale adoption of WPF’s mathematics primitives such as Point, Rect, Int32Rect, Vector, Size, and Matrix. These types do a better job and are more consistent than GDI+’s Point, PointF, Rectangle, RectangleF, Matrix, etc. (I’m still befuddled as to why System.Drawing.Drawing2D.Matrix, which is six floats and 24 bytes, needs a Dispose() method. Give me a struct please.)

The goal is to make sure that the entire data flow from the selection tools to the selection renderer is as performant as possible. Right now rendering performance does not compare favorably to Paint.NET v3.36, but it’s steadily improving and there are still a lot of tricks up my sleeve.

Speaking of WPF, I’m not using it for the UI, although I’ve been learning a lot more about it. I’m starting to come up with devious and evil plans for how I can use it a lot more in the future. I’m also realizing that a lot of the current codebase is doing things “the very hard way”, and that certain ideas implemented across multiple files and tens of lines of code can often be expressed in just 1 or 2 lines of XAML.

Oh, but I am using WPF for the About dialog. It was a good exercise and learning experience 🙂

I fixed the “can’t move a small selection” bug. The problem was that the tools were getting truncated mouse coordinates: the Move tool only received integers describing the mouse position in image coordinates. Even if you were zoomed in so far that your 2×2 pixel selection filled the whole monitor, you still couldn’t move the selection around in an intuitive way. The mouse input system for tools now uses double-precision floating point throughout instead of integers.
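To make the truncation problem concrete, here is a small sketch. It is not the actual Paint.NET input code: the MouseCoordinates class, its two conversion helpers, and the 3200% zoom level are all made up for illustration. It converts two nearby screen positions into image coordinates, once with integers and once with doubles.

```csharp
using System;

// Hypothetical helper, for illustration only (not Paint.NET's real API).
public static class MouseCoordinates
{
    // Integer path: the fractional part of the image coordinate is thrown away.
    public static int ToImageInt(int screenX, double zoom)
    {
        return (int)(screenX / zoom);
    }

    // Double path: sub-pixel positions survive the conversion.
    public static double ToImageDouble(int screenX, double zoom)
    {
        return screenX / zoom;
    }

    public static void Main()
    {
        double zoom = 32.0;          // assumed zoom level: 3200%
        int startX = 320;            // mouse-down position, in screen pixels
        int endX = 336;              // mouse dragged 16 screen pixels to the right

        // 320 / 32 = 10 and 336 / 32 = 10.5, so the integer delta is 0
        // while the double delta is half an image pixel.
        int intDelta = ToImageInt(endX, zoom) - ToImageInt(startX, zoom);
        double doubleDelta = ToImageDouble(endX, zoom) - ToImageDouble(startX, zoom);

        Console.WriteLine("integer delta: " + intDelta);
        Console.WriteLine("double delta:  " + doubleDelta);
    }
}
```

With integers, the 16-pixel drag collapses to no movement at all in image space, which is why the selection appeared stuck until the mouse crossed a whole image pixel.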

Tablet PC “ink” and “pressure” support is out. It was implemented in a very bizarre way and was seriously preventing further progress and bug fixes. I haven’t had any hardware to test this on for at least 3 years, so it has always been a best-faith feature. Hopefully it will make a comeback.

I’m itching to release something to the public. Maybe I should start putting up daily/weekly builds on the forum, even if just to get more testing done on the install and update code path. I’ve got a small private crowd of testers on the forum, and they’re a big help, but some fresh eyes would be useful.

I’ve also finished what I hope is my last round of “edits” or “drafts” on Paint.NET’s functional and asynchronous programming models. They both revolve around a base type called Result<T>, which is an implementation of the “either” monad specialized for values and errors. Here’s a simplified version:

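// Simplified outline: constructors and member bodies are omitted.
public class Result<T>
{
    public T Value { get; }                 // the value, when IsValue is true
    public bool IsValue { get; }
    public Exception Error { get; }         // the captured exception, when IsError is true
    public bool IsError { get; }
    public bool NeedsObservation { get; }   // an error is held that nobody has looked at yet
    public void Observe() { /* marks the error as observed */ }
}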

You see, it’s always bugged me (more so recently) that in C# every method signature implicitly has an “I might throw an exception” tag on it. To borrow some C++ syntax:

public delegate TRet Func<T1, TRet>(T1 arg1) throw(…); // gee golly, I might throw! or not!

There’s no way to specify “nothrow” and have the compiler statically enforce it. Because of this, every asynchronous programming model I’ve seen has its own special way of communicating things like success, aborted, canceled, or that an exception was thrown. The documentation never seems to be clear what happens if your callback throws an exception in its guest environment. It’s such a shame. Instead, let’s start with Func.Eval which helps us to normalize the situation:

public static class Func
{
    // Runs f and captures either its return value or the exception it threw.
    public static Result<TRet> Eval<TRet>(Func<TRet> f)
    {
        TRet value;

        try
        {
            value = f();
        }
        catch (Exception ex)
        {
            return Result.NewError<TRet>(ex);
        }

        return Result.New<TRet>(value);
    }
}

In order to support “Eval” for Action delegates, Result<T> actually derives from a base Result class, which omits the Value and IsValue properties.
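Here’s a rough sketch of how that split between Result and Result<T> might look. It is not the real Paint.NET code: the names Result, Result<T>, Func.Eval, Result.New and Result.NewError come from the snippets above, but the member bodies, the NewVoid factory, and the constructor shapes are assumptions made for the sake of a compilable example. Observation tracking is left out entirely.

```csharp
using System;

// Base type: knows only about success vs. error, so it also fits Action delegates.
public class Result
{
    private readonly Exception error;

    protected Result(Exception error) { this.error = error; }

    public Exception Error { get { return error; } }
    public bool IsError { get { return error != null; } }

    // New/NewError are named in the post; NewVoid is a hypothetical addition.
    public static Result NewVoid() { return new Result(null); }
    public static Result NewError(Exception ex) { return new Result(ex); }
    public static Result<T> New<T>(T value) { return new Result<T>(value, null); }
    public static Result<T> NewError<T>(Exception ex) { return new Result<T>(default(T), ex); }
}

// Derived type: adds the Value and IsValue members.
public sealed class Result<T> : Result
{
    private readonly T value;

    internal Result(T value, Exception error) : base(error) { this.value = value; }

    public bool IsValue { get { return !IsError; } }

    public T Value
    {
        get
        {
            if (IsError) throw new InvalidOperationException("Result holds an error.", Error);
            return value;
        }
    }
}

public static class Func
{
    // Action overload: there is no value to capture, only success or failure.
    public static Result Eval(System.Action f)
    {
        try { f(); }
        catch (Exception ex) { return Result.NewError(ex); }
        return Result.NewVoid();
    }

    // Func overload: captures the return value or the exception.
    public static Result<TRet> Eval<TRet>(System.Func<TRet> f)
    {
        TRet value;
        try { value = f(); }
        catch (Exception ex) { return Result.NewError<TRet>(ex); }
        return Result.New<TRet>(value);
    }
}

public static class Program
{
    public static void Main()
    {
        // Explicitly typed delegates keep overload resolution unambiguous.
        System.Func<int> parse = () => int.Parse("not a number");
        Result<int> parsed = Func.Eval(parse);
        Console.WriteLine("parse failed: " + parsed.IsError);

        System.Action log = () => Console.WriteLine("side effect only");
        Result done = Func.Eval(log);
        Console.WriteLine("action failed: " + done.IsError);
    }
}
```

The delegates in Main are typed explicitly on purpose: a lambda whose body is a method call can convert to both Action and Func<T>, which is exactly the kind of overload-resolution friction this design has to live with.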

If a Result contains an error, then it must be observed. Put simply, you must either call Observe() or access the property getter for the Error property. Otherwise, once the Result instance is finalized by the garbage collector it will throw an exception, crash, and then burn. This ensures that no exceptions get lost or eaten. Also, when creating a Result that contains an error, the current stack trace is captured. This has already helped me a lot in debugging!
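The “must be observed” rule can be sketched with a finalizer. This is only an illustration of the idea, not the real implementation: ErrorResult is a hypothetical stand-in for the error-carrying half of Result, and the use of Environment.StackTrace and GC.SuppressFinalize is my assumption about one reasonable way to do it.

```csharp
using System;

// Hypothetical stand-in for a Result that holds an error.
public sealed class ErrorResult
{
    private readonly Exception error;
    private readonly string creationStackTrace;
    private bool observed;

    public ErrorResult(Exception error)
    {
        if (error == null) throw new ArgumentNullException("error");
        this.error = error;
        // Capture where the error result was created, to help debugging later.
        this.creationStackTrace = Environment.StackTrace;
    }

    public bool NeedsObservation { get { return !observed; } }
    public string CreationStackTrace { get { return creationStackTrace; } }

    // Reading the error counts as observing it.
    public Exception Error
    {
        get { Observe(); return error; }
    }

    public void Observe()
    {
        observed = true;
        GC.SuppressFinalize(this);   // nothing left for the finalizer to check
    }

    ~ErrorResult()
    {
        // The error field is null only if the constructor itself threw.
        if (error != null && !observed)
        {
            // An unhandled exception on the finalizer thread takes the process down.
            throw new InvalidOperationException(
                "An error was never observed. Created at:" + Environment.NewLine + creationStackTrace,
                error);
        }
    }
}

public static class Program
{
    public static void Main()
    {
        ErrorResult result = new ErrorResult(new InvalidOperationException("boom"));
        Console.WriteLine("needs observation: " + result.NeedsObservation);

        Exception ex = result.Error;   // observing via the getter
        Console.WriteLine("observed: " + ex.Message);
        Console.WriteLine("needs observation: " + result.NeedsObservation);
    }
}
```

Because the check runs in the finalizer, the crash happens whenever the garbage collector gets around to it rather than at the point of the mistake, which is why capturing the creation stack trace matters so much.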

This whole system snakes through a few namespaces and DLLs, and has undergone several waves of refactoring. I’m actually using all of it, so any clumsiness or impedance mismatch with the method overload resolution in the compiler is quickly caught and dealt with.

Oh, I mentioned that this ties into asynchronous programming as well. I’ll go into that in more detail later, but I’ve now got a very natural programming model for continuation-passing style which 1) makes it trivial to write sequences of code that “hop” between threads (think loading/computing in the background and updating the UI in the foreground), 2) doesn’t require you to use any locks or mutexes (although the implementation uses them internally), and 3) is almost as natural to use as synchronous code à la Func.Eval().

It’s also served as the basis for what is now a trivial implementation of iterative tasks. I mentioned these briefly in an older blog post. It’s a clever hack that many people have developed independently whereby you “yield” instructions to a dispatcher to perform things like switching to another thread or doing efficient waits on other objects. Combine this with a data / task parallel library like what’s coming in .NET 4.0, and we’ve finally graduated to the toddler stage of concurrent programming.
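For anyone who hasn’t seen the iterator trick, here is a deliberately tiny sketch of it. None of this is Paint.NET code: the Instruction enum, MiniDispatcher, and IterativeTaskDemo are hypothetical names, and this toy dispatcher blocks the calling thread while the background segment runs, whereas a real one would return to the UI message loop and resume the iterator later.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Instructions that an iterative task can "yield" to the dispatcher.
public enum Instruction { SwitchToBackground, SwitchToForeground }

public static class MiniDispatcher
{
    // Drives the iterator; each MoveNext() runs the code up to the next yield.
    public static void Run(IEnumerable<Instruction> task)
    {
        using (IEnumerator<Instruction> steps = task.GetEnumerator())
        {
            bool more = steps.MoveNext();   // first segment runs on the caller's thread
            while (more)
            {
                if (steps.Current == Instruction.SwitchToBackground)
                {
                    // Run the next segment on a separate thread.
                    bool workerMore = false;
                    Thread worker = new Thread(() => { workerMore = steps.MoveNext(); });
                    worker.Start();
                    worker.Join();          // simplification: a real dispatcher would not block
                    more = workerMore;
                }
                else
                {
                    // Run the next segment back on the caller's ("foreground") thread.
                    more = steps.MoveNext();
                }
            }
        }
    }
}

public static class IterativeTaskDemo
{
    // Reads like straight-line code, but each segment runs where the task asked.
    public static IEnumerable<Instruction> Sample(List<int> threadIds)
    {
        threadIds.Add(Thread.CurrentThread.ManagedThreadId);   // foreground
        yield return Instruction.SwitchToBackground;

        threadIds.Add(Thread.CurrentThread.ManagedThreadId);   // background: "load/compute"
        yield return Instruction.SwitchToForeground;

        threadIds.Add(Thread.CurrentThread.ManagedThreadId);   // foreground: "update UI"
    }

    public static void Main()
    {
        List<int> threadIds = new List<int>();
        MiniDispatcher.Run(Sample(threadIds));
        foreach (int id in threadIds)
        {
            Console.WriteLine("segment ran on managed thread " + id);
        }
    }
}
```

The appeal is that the task body stays a single method with ordinary local variables and control flow, while the compiler-generated state machine does the bookkeeping between segments.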