College Adventures in Open Source: Fraginator edition

Back in 2000, I had just finished my freshman year of college at Washington State University and was back at the best summer job you could possibly imagine: sales associate at the local OfficeMax (office supplies store). Now, by “best” I really mean “boring” but it wasn’t a bad job, especially for a 19 year old (at least I wasn’t flipping burgers, right?). Selling printer ink and paper wasn’t the best use of my time, but maybe there’s an analogy with Einstein working at the patent office (we can all dream).

Anyway, to pass the time I decided to see if I could write a fully professional-quality Windows utility from start to finish in just a few weeks. I had recently read an article by Mark Russinovich about the new Windows 2000 defragmentation APIs, and it described a Contig command-line utility that would defragment a single file. (Back then I remember there being some source code for it, but I can't find it now.) Using that information as a basis, I decided to write my own disk defragmenter, because clearly a college sophomore knows how to do this stuff. It would have a GUI version ("Fraginator") and a command-line version (unfrag.exe), full CHM documentation, and a mascot. I wrote it in some mangled form of Visual C++ 6.0 with STL, if I remember correctly.

"Defrag, baby!" The Fraginator was not joking around with your data (I was a bit goofier with my coding back then). The picture is actually stolen from Something Awful's satirical Jeff K, from one of his comics where he thinks he's a buff superhero or something (warning: those links may not be safe for work, or even safe for life, but they are at least occasionally hilarious). Hopefully Lowtax won't come after me for that.

I finished the project up, put it up with source code (GPL) on my podunk of a website (I think it was hosted at zipcon.net, a small Seattle ISP), and mostly just forgot about it…

… until last week when I searched around for it and discovered that the ReactOS guys had started to include it in the applications distro about, oh, 6 years ago. They had even converted it to Unicode and translated it to a dozen languages or so. I thought, hey that’s great, someone actually likes it and is maybe even using it! I certainly was not expecting to find anything of the sort.

I was browsing through their “new” source code tree for it and immediately found a relic in the dialog resource file, clearly from a goofier moment:

I think I had this label control in there for layout purposes only, as it clearly doesn’t show up in the UI when you boot it up. But wait, it gets better. They haven’t removed this (not a bad thing), and in fact they’ve translated it.

So there you go. “I am a monkey, here me eeK” in Norwegian (I think). If you scout around with a web search you should be able to find a bunch of other translations. The French one is probably the most romantic pick-up line you’ll ever find, and there’s no need to thank me for your next success at the bars.

The last timestamp on the Fraginator.exe sitting on my hard drive is from 2003, and it doesn't seem to work on Windows 7 anymore, unless you use it on a FAT32 USB stick. I doubt it'll even compile in Visual Studio 2010. Oh well. :) I'm glad the ReactOS guys are getting use and enjoyment out of it. If you want the source, you're better off checking out their copy of it. I don't know if I even have a ZIP of it lying around anymore, and they've made decent improvements since then anyway.

Optimizing .PDN Save Times

Every once in a while you get an idea for an optimization that is so simple and so effective that you wonder why you didn't think of it much earlier. As a case in point, someone recently posted to the forum that their image was taking a long time to save. It was an image composed of 20 layers at 1280×720 pixels, and 20 seconds was deemed to be too long. The guy had a brand new AMD Phenom II X4 960T 3.0GHz gizmo with 4 CPU cores, plenty of RAM, and an SSD. It was obviously bottlenecked on the CPU, because Task Manager was pegged at 100% while saving.

Please wait forever

The only advice I could give for the latest public release of Paint.NET (v3.5.10) was that if you need it to save quicker, either 1) use fewer layers, or 2) get a faster CPU with more cores. This is blunt technical advice, and in some circles it works just fine, especially if it's practical and economical, or if you depend on saving time in order to make money (saving 5 minutes every day is often worth $100 to buy more RAM). Using fewer layers isn't a good solution though, and replacing the CPU usually isn't very practical either, especially if it's already a brand new upgrade.

It got me thinking, though, that Paint.NET can still do better. If you're working with a lot of layers, then chances are that each layer is mostly blank space: lots of transparent pixels with an RGBA value of #00000000, which is a simple integer value of 0. So why not find a way to short-circuit that? Paint.NET does not yet use a "tiled" memory management scheme for layer data, but there was still a very good, very simple, and very effective solution just begging to be implemented.

PDN images contain a header, a bunch of structural information (# of layers, etc.), and then a whole lot of raw pixel data compressed using GZIP. It isn't one long GZIP stream, however. Each layer is broken up into 256 KB chunks which are compressed separately. This does increase the file size, but not by much (a percent or two), and it lets me compress each chunk in its own thread, which makes saving much faster than it otherwise would be. Save and load times drop almost linearly with the number of CPU cores.
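As a sketch of that scheme (in Python rather than Paint.NET's actual C# code, with made-up names like `compress_layer`), splitting a layer's raw pixel data into 256 KB chunks and compressing them in parallel might look like this. Note that CPython's GIL limits how much real parallelism the thread pool gets for CPU-bound compression; the .NET threads in Paint.NET don't have that constraint.

```python
import gzip
from concurrent.futures import ThreadPoolExecutor

CHUNK_SIZE = 256 * 1024  # each chunk is compressed independently


def compress_chunk(chunk: bytes) -> bytes:
    return gzip.compress(chunk)


def compress_layer(pixel_data: bytes) -> list:
    # Split the raw pixel data into 256 KB chunks and compress each
    # one separately, so the chunks can be handled on different threads.
    chunks = [pixel_data[i:i + CHUNK_SIZE]
              for i in range(0, len(pixel_data), CHUNK_SIZE)]
    with ThreadPoolExecutor() as pool:
        return list(pool.map(compress_chunk, chunks))
```

Because every chunk is its own GZIP stream, a reader can decompress them independently too, which is what makes loading parallel as well.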

So yesterday, in the 4.0 code base, I optimized this scenario further. For regions of the image that contain "homogeneous values," meaning a 256 KB chunk of all the same color, I compress the chunk once and then cache the result. The next time I find a region with the same homogeneous color pattern, I skip the compression altogether and emit the result from the previous time I compressed it. If a region isn't "homogeneous," that's discoverable pretty quickly, so it doesn't waste much CPU time.
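A minimal sketch of that cache, again in Python with hypothetical names (the real implementation is in Paint.NET's C# code): detect a chunk made of one repeated 32-bit RGBA value, compress it once, and reuse the compressed bytes on every later hit.

```python
import gzip

CHUNK_SIZE = 256 * 1024  # chunks of raw 32-bit RGBA pixel data


def is_homogeneous(chunk: bytes) -> bool:
    # Compare every 4-byte pixel to the first one. The generator lets
    # all() bail out at the first mismatch, so non-homogeneous chunks
    # are rejected after only a few comparisons.
    first = chunk[:4]
    return all(chunk[i:i + 4] == first for i in range(0, len(chunk), 4))


def compress_chunks_cached(chunks, cache=None):
    # cache maps a chunk's repeated pixel value -> its compressed bytes
    if cache is None:
        cache = {}
    out = []
    for chunk in chunks:
        if len(chunk) == CHUNK_SIZE and is_homogeneous(chunk):
            key = chunk[:4]
            if key not in cache:
                cache[key] = gzip.compress(chunk)
            out.append(cache[key])  # cache hit: skip compression entirely
        else:
            out.append(gzip.compress(chunk))
    return out
```

For a mostly-transparent multi-layer image, most chunks are all #00000000, so after the first such chunk every subsequent one is a dictionary lookup instead of a GZIP pass.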

(These 256 KB regions do not have any geometric shape associated with them, such as being a 128×128 tile. A 256 KB chunk in an 800×600 image would be the first 81 rows, and then most of the 82nd row. The next chunk would start off near the end of the 82nd row, and then include almost the next 81 rows of pixel data.)

The result? About 50% faster save times! The first time you save the image will take the longest. After that the cache will be “warmed up” and saving is even faster, resulting in up to about a 90% reduction in save time.

This is a great optimization: there’s no change to the PDN format, so images saved in 4.0 are still completely compatible with earlier versions or with other applications that have done the grunt work to support PDN (I think GIMP does nowadays). It’s just faster.

Anyway, it's been a while since I posted. Fear not, work is still progressing on 4.0. Silence is just an indicator that I've been really busy, not abandonment.

Ed Harvey has contributed a few really cool features that have been popular requests, such as the ability to use the Color Picker tool to sample all layers, and to sample the average from a 3×3, 5×5, 11×11, 33×33, or 55×55 region (apparently that's what Photoshop does). He also added a few enhancements to the color wheel so you can use Ctrl, Alt, or Shift to constrain the hue, saturation, or value. I've finally added drag-and-drop to the Layers window for moving layers around, along with spiffy animations and transitions (it uses the same code as the newer image thumbnail list at the top of the window). I even added a setting to choose between the Blue theme (from 3.5) and a new Light/white theme (which is the new default). I've decided to cut some features in the interest of being able to finish and ship 4.0 this year, such as a plugin manager, but there's always another train coming (4.1, 4.2, etc.), so I wouldn't hold a memorial service just yet.