Optimizing .PDN Save Times

Every once in a while you get an idea for an optimization that is so simple and so effective that you wonder why you didn’t think of it much earlier. As a case in point, someone recently posted to the forum that their image was taking a long time to save. It was an image composed of 20 layers at 1280×720 pixels, and 20 seconds was deemed to be too long. The guy had a brand new AMD Phenom II X4 960T 3.0GHz gizmo with 4 CPU cores, plenty of RAM, and an SSD. It was obviously bottlenecked on the CPU, because Task Manager was pegged at 100% while saving.

Please wait forever

The only advice I could give for the latest public release of Paint.NET (v3.5.10) was that if you need it to save quicker, either 1) use fewer layers, or 2) get a faster CPU with more cores. This is very technical advice, and in some circles that works just fine, especially if it’s practical and economical, or if you depend on saving time in order to make money (saving 5 minutes every day is often worth $100 to buy more RAM). Using fewer layers isn’t a good solution though, and replacing the CPU usually isn’t very practical either, especially if it’s already a brand new upgrade.

It got me thinking, though, that Paint.NET can still do better. If you’re working with a lot of layers, then chances are that each layer is mostly blank space: lots of transparent pixels with an RGBA value of #00000000 and a simple integer value of 0. So why not find a way to short-circuit that? Paint.NET does not yet use a “tiled” memory management scheme for layer data, but there was still a very good, very simple, and very effective solution just begging to be implemented.

PDN images contain a header, a bunch of structural information (# of layers, etc.), and then a whole lot of raw pixel data compressed using GZIP. It isn’t one long GZIP stream, however. Each layer is broken up into 256 KB chunks which are compressed separately. This does increase the file size, but not by much (a percent or two), and it lets me compress each chunk in its own thread, which makes saving a much faster ordeal than it otherwise would be. Save and load times can drop almost linearly with the number of CPU cores.
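
For the curious, here is a minimal sketch in C# of what that chunked, parallel compression might look like. This is illustrative only, not Paint.NET’s actual code, and the class, method, and constant names are made up:

```csharp
// Illustrative sketch only (not Paint.NET's source): compress a layer's raw
// pixel buffer as independent 256 KB GZIP chunks, one chunk per iteration,
// so the work spreads across CPU cores.
using System;
using System.IO;
using System.IO.Compression;
using System.Threading.Tasks;

static class ChunkedCompressor
{
    const int ChunkSize = 256 * 1024; // 256 KB of raw pixel bytes per chunk

    // Returns one compressed buffer per chunk. Chunks are compressed in
    // parallel, which is why save time can scale with the number of cores.
    public static byte[][] CompressLayer(byte[] pixelData)
    {
        int chunkCount = (pixelData.Length + ChunkSize - 1) / ChunkSize;
        var compressed = new byte[chunkCount][];

        Parallel.For(0, chunkCount, i =>
        {
            int offset = i * ChunkSize;
            int length = Math.Min(ChunkSize, pixelData.Length - offset);

            using var output = new MemoryStream();
            using (var gzip = new GZipStream(output, CompressionLevel.Optimal, leaveOpen: true))
            {
                gzip.Write(pixelData, offset, length);
            }
            compressed[i] = output.ToArray();
        });

        return compressed;
    }
}
```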

So yesterday, in the 4.0 code base, I improved this scenario further. For regions of the image that contain “homogenous values,” which would be a 256 KB chunk of all the same color, I compress it once and then cache the result. The next time I find a region that has the same homogenous color pattern, I skip the compression altogether and emit the result from the previous time I compressed it. If a region isn’t “homogenous” then that’s discovered pretty quickly, and therefore it doesn’t waste much CPU time.
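
Again purely as an illustration (the real implementation lives in the 4.0 code base and isn’t shown here), a sketch of that fast path: detect whether a 256 KB chunk is one solid 32-bit value and, if so, reuse a cached copy of its compressed output. The cache type and helper names below are assumptions:

```csharp
// Illustrative sketch of the fast path described above (not the real code):
// if every 32-bit pixel in a 256 KB chunk is the same value, compress it once,
// cache the GZIP output keyed by that value, and reuse it for later chunks.
using System;
using System.Collections.Concurrent;

static class HomogenousChunkCache
{
    // Maps a solid pixel value (e.g. 0x00000000 for fully transparent)
    // to the compressed bytes produced the first time it was seen.
    static readonly ConcurrentDictionary<uint, byte[]> cache = new();

    // Returns the chunk's single pixel value if it is homogenous, else null.
    // Non-homogenous chunks usually bail out after a few pixels.
    static uint? GetHomogenousValue(ReadOnlySpan<uint> pixels)
    {
        if (pixels.IsEmpty)
        {
            return null;
        }
        uint first = pixels[0];
        foreach (uint p in pixels)
        {
            if (p != first)
            {
                return null;
            }
        }
        return first;
    }

    // gzipCompress stands in for whatever routine actually compresses a chunk.
    public static byte[] CompressChunk(uint[] pixels, Func<uint[], byte[]> gzipCompress)
    {
        uint? value = GetHomogenousValue(pixels);
        if (value.HasValue)
        {
            // Compress the first such chunk, then reuse the cached result.
            return cache.GetOrAdd(value.Value, _ => gzipCompress(pixels));
        }
        return gzipCompress(pixels);
    }
}
```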

(These 256KB regions do not have any geometric shape associated with them, such as being a 128×128 tile. A 256KB chunk in an 800×600 image would be the first 81 rows, and then most of the 82nd row. The next chunk would start off near the end of the 82nd row, and then include almost the next 81 rows of pixel data.)
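
(Worked out in code, assuming 4 bytes per pixel as the description above implies; this is just arithmetic, not anything from the PDN code base:)

```csharp
// Worked arithmetic for the example above, assuming 4 bytes per pixel:
// an 800-pixel row is 3,200 bytes and a 256 KB chunk is 262,144 bytes,
// so chunk 0 covers the first 81 full rows plus most of the 82nd.
using System;

const int BytesPerRow = 800 * 4;      // 3,200 bytes per 800-pixel row
const int ChunkSize = 256 * 1024;     // 262,144 bytes per chunk

int firstRow = 0 / BytesPerRow;               // 0  (row 1, 1-based)
int lastRow = (ChunkSize - 1) / BytesPerRow;  // 81 (row 82, 1-based)
Console.WriteLine($"Chunk 0 spans rows {firstRow + 1} through {lastRow + 1}");
```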

The result? About 50% faster save times! The first time you save the image will take the longest. After that the cache will be “warmed up” and saving is even faster, resulting in up to about a 90% reduction in save time.

This is a great optimization: there’s no change to the PDN format, so images saved in 4.0 are still completely compatible with earlier versions or with other applications that have done the grunt work to support PDN (I think GIMP does nowadays). It’s just faster.

Anyway, it’s been a while since I posted. Fear not, work is still progressing on 4.0. Silence is just an indicator that I’ve been really busy, not a sign of abandonment.

Ed Harvey has contributed a few really cool features that have been popular requests, such as the ability to use the Color Picker tool to sample all layers, and to sample the average from a 3×3, 5×5, 11×11, 33×33, or 55×55 region (apparently that’s what Photoshop does). He also added a few enhancements to the color wheel so you can use Ctrl, Alt, or Shift to constrain the hue, saturation, or value. I’ve finally added drag-and-drop to the Layers window for moving layers around, along with spiffy animations and transitions (it uses the same code as the newer image thumbnail list at the top of the window). I even added a setting to choose between the Blue theme (from 3.5) and a new Light/white theme (which is the new default).

I’ve decided to cut some features in the interest of being able to finish and ship 4.0 this year, such as a plugin manager, but there’s always another train coming (4.1, 4.2, etc.), so I wouldn’t hold a memorial service just yet.

28 thoughts on “Optimizing .PDN Save Times”

    • Martin Pollard says:

      Correction: already implemented. The most current plugin pack for IrfanView includes Paint.NET support (read only). 🙂

      What still surprises me is that both GIMP and the much-lauded Photoshop *still* don’t support indexed PNG files. How a free program like Paint.NET can implement full support for them while an app costing many hundreds of dollars doesn’t is beyond me. (I’m very grateful it does, too. I’m working on Android themes, and PDN’s ability to handle the various indexed PNGs found in Android ROMs, themes, and apps is invaluable.)

      • Tom says:

        Paint.NET uses the PNG functionality that is built into Windows.

        The GIMP suffers from the perennial problem of open-source projects that do not have substantial corporate backing: Only the sexy stuff gets worked on. File formats are not sexy, so they take a back seat.

  1. Alex Kamsteeg says:

    You said the first save will be the slowest, because you need to cache the gzip stream of the empty block for the first time. But this stream is always the same, so couldn’t you pre-generate/hardcode it? I really don’t know whether it would be a significant improvement, but it would save the user from generating the stream once per process lifetime.

    • Rick Brewster says:

      The user won’t need to worry about a thing, they’ll just be happier with faster save times. I did think about whether there was a good way to skip even the “cold” cache state by just embedding the knowledge of which GZIP byte sequences corresponded to what I’ve already implemented, and I still may go that route. But I would do that mainly to avoid the possibility of the cache’s memory usage growing unbounded over time and then needing to write an eviction policy, which *could* become very un-fun for multithreading reasons and which adds a bunch of complexity to something that hasn’t proven yet that it needs it. I’m trying to focus on shipping 4.0, which means not rat-holing on perfecting every single algorithm. There’s always time for things like that later, e.g. 4.0.1 or whatever, if it becomes necessary, and until then I will use the extra time for doing other things (like other 4.0 features, drinking beer, and sleeping). For now I’m happy with this simple, even quaint, and almost free, performance improvement.
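
      (A rough sketch of the pre-generation idea being discussed here, purely hypothetical and not something Paint.NET actually does:)

      ```csharp
      // Hypothetical sketch: compress an all-zero 256 KB chunk once per process
      // (or bake the resulting bytes into the binary), so even a "cold" first
      // save can reuse it instead of compressing the transparent chunks again.
      using System.IO;
      using System.IO.Compression;

      static class PrecomputedEmptyChunk
      {
          const int ChunkSize = 256 * 1024;

          // Generated once, lazily, the first time it is touched.
          public static readonly byte[] GzippedZeros = Compress(new byte[ChunkSize]);

          static byte[] Compress(byte[] data)
          {
              using var output = new MemoryStream();
              using (var gzip = new GZipStream(output, CompressionLevel.Optimal, leaveOpen: true))
              {
                  gzip.Write(data, 0, data.Length);
              }
              return output.ToArray();
          }
      }
      ```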

  2. Schniefelus says:

    Rick, are you planning to release a metro-style-version of paint.net for Windows 8?
    Will there be a beta-release of paint.net 4.0?

    • Rick Brewster says:

      Once 4.0 ships I’ll be looking at what to do next. That may or may not include Metro. I’d like to support it, but from what I gather you can’t yet share UI code between Metro and non-Metro apps. Any direction I go in would require substantial rewriting to move away from WinForms, but once that hurdle is jumped I do absolutely need to be able to write the UI code *once* (and then Metro vs desktop would be styling and layout differences). I simply do not have the capacity to manage multiple platforms worth of code deltas, and this is also the primary reason I do not release Paint.NET for any other platform (Mac, Linux, etc.).

  3. Olivier says:

    He probably meant Rick and the add-ins developers. PDN is a killer app, and becomes a super killer app with these plug-ins.

  4. alex says:

    If a layer does not change between saves, is it still rescanned and compressed, or is the gzipped version kept in RAM? My tests to that effect are not conclusive; maybe I need a bigger picture to compare save times.

    • Rick Brewster says:

      No image data is kept in memory. It’s the result of what a 256KB block of the same value (that is, 65536 pixels of the same color) looks like after compression that is stored in memory. It’s kind of hard to describe. The PDN file itself will always be the same, it’ll just save faster since it already knows what to do when it finds a 256KB block that’s identical to one it already compressed.

      • Rimas says:

        Do you plan to store such a cache on disk? IMO, the next logical step would be to generate such compressed chunks once and then use them indefinitely.

        • Rick Brewster says:

          No. Once you start persisting a cache it becomes vulnerable to corruption and attack. Suddenly an optimization that took 2 hours from experiment to implementation becomes a 4-week-long security nightmare. It is intended as a simple opportunistic cache, nothing more, nothing less.

      • alex says:

        There’s a misunderstanding.

        Your blog post was clear, I understood what the optimization does.

        You say that caching blocks of the same color reduces save time because it avoids compression. I was wondering whether caching the whole compressed layer wouldn’t be a better idea. Because let’s face it, most people save every couple of edits, and those edits usually happen on a single layer at a time, leaving the others unchanged.

        What I’m talking about is the actual behavior. Do you rescan and re-compress every layer, even the ones that didn’t change, every time I click Save?

        • Rick Brewster says:

          It’s just an opportunistic optimization. There isn’t much complexity to it, and that’s kind of the point. I only wanted to spend a few spare hours on it; anything further would be better spent on features. I won’t be spending weeks architecting some perfect caching solution for the existing file format when I’d rather use a better format in the future anyway (e.g. 128×128 tiles which already keep track of whether they are homogenous).

  5. Jimbo says:

    I agree with Alex. This is a good optimisation per layer, but why not cache each layer too (with the relevant chunks), if it has no changes?

    If you were being really groovy, you could cache layers that don’t get a lot of attention on a background thread.

    But I do agree that PDN should not cache to the file system; it’s slow!

    • Rick Brewster says:

      Yes, these are all definitely excellent ideas. But for now I’ve just got to spend my time on other things which will let me ship 4.0 this year (I’ll be rather disappointed if it slips another year to 2013). Any time that I spend trying to optimize this takes away from that. I’m not saying it won’t happen, just that it won’t happen right now. I’ll probably wait until I upgrade the PDN file format and internal Document/BitmapLayer types, at which point they can do all sorts of smart internal bookkeeping to make those optimizations really easy and safe. The optimization described here was just a very simple opportunity to provide some relief in this area without adding complexity elsewhere, so I figured why not 🙂

  6. Adam Zey (@guspaz) says:

    I guess it’s not a terribly practical suggestion, since it would potentially require an on-disk format change, but an option to disable compression in pdn files could provide significant speedups in some use cases. 20 layers of 1280×720 at 24-bit is about 52MB (I’m sure there’s other data going in there, such as undo history), which could take as little as a tenth of a second to write to disk on a modern SSD; if the content is photographic, gzip may not provide much space savings, so you could spend a big chunk of time compressing when you could just spew it out to disk much faster.

    It’s personal preference, but I’d tend to prefer source files for stuff like pdn to be uncompressed (unless there would be a performance improvement), and I’ll 7zip it myself if I have to send it to somebody.

      • Adam Zey (@guspaz) says:

        Could this be something that could be implemented in OpenCL, then? If you’re compressing 256KB chunks independently, the massively wide nature of a typical GPU would seem well suited to working on compressing all chunks simultaneously. Although I guess it would probably not be worth the effort involved just to save a bit of time saving. It could be helpful in a future tile-based system, though, where an operation could be done on the GPU on all tiles simultaneously.
