Dealing With Legacy, or The Story of Paint.NET’s Bug Database

As a computer scientist and software developer, I have been conditioned to live in fear, or at least gross distaste, of legacy. That is to say, legacy code, legacy systems, legacy process/procedures, etc. As a self-proclaimed cowboy rock star egotastic developer, anytime I see code that wasn’t written by myself (or a select group of those whom I deem as “peers”), my first instinct is to proclaim that it’s horribly written and that I should redo it. Likewise, the same goes for legacy systems and process/procedures. “Oh gosh, we’d be so much more efficient without all this process in place!” … “Oh gosh, our server is running on an OS that’s 4 years old, we need to upgrade it!” … “Oh gosh, oh gosh!”

Sound familiar? It should if you’re a software developer on a project with more people assigned to it than you can count on two fingers. I’ve found that one important part of career and professional growth in software development is learning how to deal with legacy, and to rein in the instinct to immediately and bombastically proclaim it all as a bunch of crap. Sometimes dealing with legacy means refactoring utility code while you’re fixing a bug that is dependent on it (“I’m in this part of the code anyway, why not?”). Other times that means checking out half the source code tree to make a tiny change in every single file that has positive consequences in areas of performance, stability, maintainability, or whatever. On rare occasions it really does mean rewriting all the code, or redefining the process by which you’re doing development.

Oftentimes, however, I’ve learned that you should just leave it alone.

If you have something in place that works, and you don’t have plans to change it, then guess what … it’s probably ok to leave it alone! Even if the code is nasty, or the implementation uses 5 year old design patterns, or if there’s a new intern who really really wants to prove him or herself. For example, I would argue that Notepad works. It does its job, it does it well, and there’s really no need to rewrite it to use C# or WPF or sock puppets or anything. It’s a stable legacy system. (That’s not to say there isn’t a need or a market for something more, such as Notepad++ or Notepad2.)

Now, you might wonder what in the world this has to do with Paint.NET’s bug database. Well, before I start, you should first answer the following question: What do you think I’m using for Paint.NET defect/bug tracking?

Ok, seriously, ask yourself that question. Give yourself at least 10 seconds to come up with a few different possibilities, both low- and high-tech. Pieces of paper stapled to the wall? Sourceforge? A custom WinForms or ASP.NET app? Notepad? Excel? The back of my hand and a marker?

Here’s the answer and the story:

Paint.NET’s bug database is Bugzilla 2.16.3 running on Apache in Red Hat Linux 8.0. It was originally hosted on a Pentium II named “elbonian3” in some dark corner of the EECS department of Washington State University. It was set up by (I think) Chris Crosetto during Fall semester of 2003 (~August) for the Computer Science 422 class (“Software Testing”) that Jack Hagemeister was running. The purpose was to inspect the source code of some client software that Gene Apperson (Microsoft alumnus, Diligent founder) was developing, and to then file bugs on it. I was in this class and filed many bugs on the code, although I don’t know if Gene ever fixed the bugs or if his project was ever finished. Paint.NET started the following semester, and version 1.0 was written from start to finish in 15 weeks without any official bug tracking system. I think Brandon had an Excel file but nobody else really had access. I do not recommend doing that, by the way; that’s just the way things went down.

For the Fall 2004 semester, I talked with Jack and we set up a senior design project to ramp Paint.NET up to version 2.0. We got three students to work on it: Tom Jackson, Michael Kelsey, and Craig Taylor. The first thing we did was to have me finish up my work on version 1.1, and have the three students do bug filing and write updated documentation (haha, suckers!). As part of this, they managed to get the new semester’s CS422 students to do a lot of testing. To do this they picked up and ran with the “elbonian3” server that had been set up the year before. It worked great: students would file bugs, I would get e-mails from the server, and we could respond to each bug appropriately. The three students learned a lot about bug triaging, as well as release and defect management. Next up, they did their development work to push Paint.NET up to version 2.0 which was released right before Christmas. Then Paint.NET was Slashdotted. It was amazing, and comedic: the attention caused enough network traffic to bring down the entire EECS network for 2 days.

For about the next year and a half, I continued to use this server remotely from 300 miles away to do Paint.NET’s defect tracking. However, in early- to mid-2006 it was clear that it was no longer appropriate to house this software at EECS. The network administrator was overworked (as they always are!) and the server was often down (for whatever reason), and Paint.NET was not really a “WSU project” anymore. The admin wanted to retire the server, and said he could send me the contents of the hard drive as a VMWare virtual hard drive file. I said that would be great, and the next day I started downloading a 2 GB tarball over HTTP.

Once I had downloaded and extracted the file, my next job was to install the free Windows version of VMWare Player on my server box in a dark corner of my house. This was no problem; it booted right up and spat out a login screen. But, you see, there was a major problem: the server still thought it was in Pullman on the eecs.wsu.edu domain. I managed to figure out what IP address the server had obtained, but whenever I tried to access it Apache would respond with headers pointing my browser back to the old eecs.wsu.edu address. Argh! Try as I might, I couldn’t figure out how to reconfigure Apache not to do this.

I’m sure some magical combination of grep and sed and other 3-to-4 letter combinations could fix it quick, but I found a faster-for-me solution: I cheated. I edited my hosts file in Windows so that it would map elbonian3’s network name to the local IP address of the virtual machine. Then, when I typed the IP address into my browser, it would get redirected to elbonian3 which would then get redirected right back to the IP address in a way that was completely transparent to the browser. Success! I had a working bug server again!
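The hosts-file trick boils down to one mapping line. Here is a minimal sketch of it, assuming a made-up VM address of 192.168.1.50 and assuming the server's full name was elbonian3.eecs.wsu.edu (neither value is recorded in this post). It only builds the text and never touches a real hosts file.

```python
def add_hosts_entry(hosts_text: str, ip: str, hostname: str) -> str:
    """Return hosts-file text with `hostname` mapped to `ip`.

    Any existing line that already names `hostname` is dropped first, so the
    result contains exactly one entry for it. (Simplification: a line that
    lists several names is dropped whole.)
    """
    kept = []
    for line in hosts_text.splitlines():
        # Ignore trailing comments before looking at the fields.
        fields = line.split("#", 1)[0].split()
        # fields[0] is the address; the rest are names for that address.
        if len(fields) >= 2 and hostname in fields[1:]:
            continue
        kept.append(line)
    kept.append(f"{ip}\t{hostname}")
    return "\n".join(kept) + "\n"


# Hypothetical values: the VM's local IP and the server's old network name.
VM_IP = "192.168.1.50"
OLD_NAME = "elbonian3.eecs.wsu.edu"

print(add_hosts_entry("127.0.0.1\tlocalhost\n", VM_IP, OLD_NAME))
```

On Windows the hosts file normally lives at C:\Windows\System32\drivers\etc\hosts, and editing it requires administrator rights.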

On a quick side note, about a month after I set this up I found that my IP range was being blacklisted by spamhaus or something. The reason was that sendmail was still active on my virtual server. Bugzilla kept trying to blast e-mail notifications about bug changes through an EECS mail server, which for obvious reasons rejected them. I’m a dumb clumsy oaf when it comes to Linux administration so I had to find a clever way to disable sendmail … I think in the end I may have just removed execute permissions on the appropriate binary. Bugzilla still claims to be sending out e-mails in its UI, but none of them actually get sent.
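For the curious, clearing the execute bits is a one-liner with chmod. The sketch below does the same thing in Python on a throwaway file. The real location of the sendmail binary on that server is not something I can state here, so no real path appears in it.

```python
import os
import stat
import tempfile


def remove_execute_bits(path: str) -> int:
    """Clear the user/group/other execute bits on `path`; return the new mode."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    new_mode = mode & ~(stat.S_IXUSR | stat.S_IXGRP | stat.S_IXOTH)
    os.chmod(path, new_mode)
    return new_mode


# Demonstrate on a temporary file standing in for the binary.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    demo_path = tmp.name
os.chmod(demo_path, 0o755)
print(oct(remove_execute_bits(demo_path)))
os.remove(demo_path)
```

With the execute bits gone, anything that tries to launch the program fails, which would be consistent with Bugzilla still believing it sends mail while nothing actually leaves the machine.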

Anyway. A few weeks ago I decided to retire that server, as it wasn’t really doing much other than hosting this virtual machine and a CVSNT server. Why pay for that electricity if I can get it to run on my main workstation anyway? These were two things I figured could be done much faster if I just ran them locally (CVSNT is slow for Paint.NET’s repository on a 100 Mbit connection). There was another problem though: I use Virtual PC so that I can host copies of Windows XP and Vista in various configurations (high DPI, weird visual themes, etc.) in order to do testing for Paint.NET. I had VMWare up and running successfully, but the moment I tried to run Virtual PC the whole system hung and my awesome techno music started skipping. Bleh!

I managed to find a utility to convert the VMWare “VMDK” hard drive file to a Virtual PC “VHD” file. It’s called, appropriately, Vmdk2Vhd. And now I have my Bugzilla running in Virtual PC on my desktop system.

So there you have it. I use an old version of Bugzilla in a virtualized Linux installation which still thinks it’s sitting in a dusty corner of my old university that’s over 300 miles away.

And it works.

And past that, who cares? Right now, I sure don’t. Legacy isn’t all bad, as long as it works and you don’t have to maintain or reconfigure it all that often. I’m basically the sole developer on Paint.NET right now so my requirements are pretty slim for things like bug tracking and source code control (Linus thinks I’m “ugly and stupid” for using CVS … but hey, you gotta admire his passion).

By the way, the quote in the Bugzilla screenshot above reads: “Using exceptions for bounds checking is like driving a car … it’s cheaper to just do a little extra work and stay in-bounds than it is to crash off the side of the freeway and say, ‘Oh the insurance can pay for that.’”

Successful Freeware Tip #3: Release Often to Keep People Interested

Prior to Paint.NET v2.6, I was releasing at a pace that covered several months. 2.0 to 2.1 was about 5 months, and forward to 2.5 was another 6 months. V2.6 was done in 3 months, and 3.0 took about 11 months. However, between 2.6 and 3.0 I released 2.61, 2.62, 2.63, 2.64, 2.70, and 2.72 – at a pace of roughly one every 4-6 weeks. Once 3.0 came out, I released 3.01, 3.05, 3.07, and 3.08 at a similarly fast pace, followed by v3.10 just shy of 3 months later.

There were several reasons for all the small “.01” releases from the 2.6 codebase. One was that I needed to put out some important bug fixes to decrease support costs (“costs” meaning “my time” and “user happiness”). If you can just fix common bugs, or work around common user mistakes, then they are no longer bugs or mistakes: it just works! Fewer e-mails for you, more working software for the masses. It’s a win-win.

The second reason was that I wanted to see what would happen if I released more often. I knew that 3.0 was going to take a while to develop, and I wanted to keep online attention devoted towards Paint.NET. People forget about things quickly, and I wanted to avoid seeing any comments like, “Whatever happened to Paint.NET?”

So I forked the code and continued to release off the 2.6 tree. My theory was that if I didn’t release every so often, the media and online community would slowly forget about Paint.NET. I don’t really do any marketing or advertising to fill that niche (it wouldn’t be profitable). My theory continued such that if I did release often (4-6 weeks is about right), even with just a +.01 bugfix-only release, that the media and online community would be consistently reminded about Paint.NET. Users would hear more about it, which breeds familiarity, which coaxes more and more people to download and try it out. This means more users sticking with the program, more people donating, more traffic to my website, and generally just more awesomeness all around. Oh, and quick releases means users get bug fixes and features quicker too 😉

Here’s the great thing: this theory is proving itself to be correct. Here are some facts that back it up:

  • Nearly 80% of the 3-year-old forum’s activity is from this calendar year. Part of this I believe can also be attributed to having placed a link to the forum in the application’s Help menu. (Comparing # of posts from December 31st, 2006 to today’s count.)
  • Almost 60% of all Paint.NET downloads have occurred during this calendar year. (Comparing download counts for December 31st, 2006 versus today at BetaNews, historically the primary download mirror.)
  • Revenue via donations is about 2x what it was before 3.0 was released. (This is comparing February 2007 through August 2007, with August 2006 through December 2006).
  • Revenue via AdSense is about 10x what it used to be. (This is comparing May 2007 through September 2007, with August 2006 through December 2006, but excluding November 2006 because there was some glitch that nuked my earnings for a week or so).

After the v3.0 release things really started picking up in the donations and AdSense departments. This is of course partly due to the fact that v3.0 simply rocked compared to previous versions and reached a critical mass of features and press coverage, and I started getting more traffic.

The 3-month gap between 3.08 and 3.10 confirmed my theory that frequent releases create a positive feedback cycle for earnings and avoid media forgetfulness. Before the 3.10 release, it was quite apparent that donations and AdSense were losing their steam from the 3.07 and 3.08 releases. Check out this graph showing my relative daily AdSense earnings:

The amount of traffic coming to the site is mostly constant, but it appears like AdSense was just, I don’t know, getting bored with the site or something. Now that 3.10 is out the door, AdSense is continuing on a slow upward trend, even a full month after the release. Maybe AdSense has a bias against stale sites? I even saw a very happy spike on Sunday which set an all-time 1-day record. AdSense is definitely a strange, strange beast.

The graph for donations is very different and always shows huge spikes for about 10 days following a release. I do not have a graph prepared because of the difference in how PayPal provides its data: I have to do some crazy Excel programming to get it to work :(

The base theory is that every time I release, I get a short-term spike in traffic after which it settles down to a level that is slightly higher than what the average was before.

Considering Direct Ad Sales

One thing that John Chow has given advice on is selling ads directly. I’ve been thinking about this a bit the last few days and I think it may be time for me to try it out. I certainly have the traffic for it! The getpaint.net website gets over 1 million “page impressions” per month as reported by Google Analytics. The index page does around 400,000, and the download page sits at just under 500,000. Right now I’m using Google AdSense and it is doing very well for itself, at least in absolute terms (never say “no” to free money, right?).

John Chow’s advice says to take the amount you’re earning with AdSense and double it for when you try selling ads directly. The hypothesis is that Google is sharing revenue at a rate of about 50%. His other general advice, to diversify your income, is something I’ve implemented as well – albeit by implementing Search that is still provided by Google, and by moving the Help content online and adding AdSense to it (together they added enough to my Paint.NET earnings to buy a Blu-ray player!).
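The doubling rule is simple arithmetic: if AdSense passes along about half of what advertisers pay, then advertisers are paying about twice what you earn. A tiny sketch, assuming the 50% share holds and using a made-up monthly figure (no real earnings are quoted here):

```python
def direct_ad_price(adsense_earnings: float, publisher_share: float = 0.5) -> float:
    """Estimate total advertiser spend, given the publisher's cut.

    If AdSense passes on `publisher_share` of what advertisers spend, their
    total spend is earnings / share. At a 50% share that is exactly double.
    """
    if not 0 < publisher_share <= 1:
        raise ValueError("publisher_share must be in (0, 1]")
    return adsense_earnings / publisher_share


# Hypothetical figure for illustration only.
print(direct_ad_price(1000.0))  # 2000.0
```

The share is a parameter because the 50% figure is only a hypothesis; a different share changes the multiplier accordingly.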

This could help to significantly diversify my income sources and reduce my reliance on AdSense, which in August accounted for 80% of Paint.NET revenue. It’s not that I dislike AdSense, and I bet the feeling is mutual. I also don’t think I will be banned, but it’s still a significant risk factor: just ask Henry and Wilson about the time they lost out on $200,000 (although they seem to have broken some of the AdSense TOS, such as the rule against having more than 1 account).

I also have advertising space available in the Paint.NET installer. You know when you install the program and it says “Please wait, optimizing…” and there’s a little banner that says “Please donate!” along with one for the download mirror (“This update brought to you by BetaNews”) or for searchpaint.net? I bet I could sell that as advertising space as well. It reaches hundreds of thousands of users per month and is on screen for a good chunk of time.

The only thing I’m wary of is that John Chow also suggests that you create an Advertisers page that lists your rates directly instead of saying, “please e-mail us for a quote.” But hey, if I have to disclose revenue to get a huge increase in it, it might just be worth it. John Chow does it every month and when he posts earnings just shy of $18,000 for August, people are stunned and inspired (or, shocked and awed?).

I would probably sweeten the deal by allowing the advertiser (or advertisers, I don’t know how I’ll do things) access to the site’s Google Analytics reports. That way they could see what types of visitors they are reaching, and retarget their ad appropriately if they wanted.

So … comments?

Anyway I’m off to Bumbershoot in the morning*. I’ve never been but it’s supposed to be awesome, and it’s a friend’s birthday too.

* Yeah yeah I posted at 4:30am …

August 2007 usage statistics

Now that I’m caught up on stats after having published them for June and July, and now that it’s September, it’s time to publish stats for August.

Overall usage is up a surprising 15% over July, and 25% since May. Vista usage is still growing strongly, and is up another 17% over July and 47% since May! Even the 64-bit slice of the pie saw good growth, showing a 10% increase. Sadly, 64-bit is still only 1.25% of the user base.
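The changes quoted here are relative changes, not percentage-point differences. A quick sketch of the arithmetic, using the total hit counts for July and August from the table. The share rows may not reproduce exactly from the rounded percentages, presumably because they were computed from unrounded values.

```python
def relative_change(before: float, after: float) -> float:
    """Percentage change from `before` to `after`."""
    return (after - before) / before * 100.0


july_hits, august_hits = 1_025_580, 1_182_822
print(f"{relative_change(july_hits, august_hits):+.2f}%")  # +15.33%

# Both months have 31 days, so hits per day is simply total / 31.
print(round(july_hits / 31), round(august_hits / 31))  # 33083 38156
```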

                        July        August      July -> August

Total Hits              1,025,580   1,182,822   +15.33%
Hits Per Day            33,083      38,156

32-bit                  98.83%      98.75%      -0.08%
64-bit                  1.17%       1.25%       +6.55%

Windows XP              86.53%      84.29%      -2.59%
Windows 2003            0.68%       0.66%       -2.29%
Windows Vista           12.79%      15.05%      +17.61%

English                 46.39%      46.66%      +0.58%
German                  18.37%      18.02%      -1.94%
French                  8.23%       7.62%       -7.32%
Portuguese              6.03%       5.92%       -1.85%
Spanish                 4.06%       4.58%       +12.94%
Japanese                2.40%       2.20%       -8.47%
Italian                 2.09%       1.67%       -20.00%
Netherlands             1.71%       1.93%       +12.38%
Russian                 1.45%       1.66%       +14.11%
Chinese (Simplified)    1.30%       1.51%       +16.17%
Polish                  1.87%       1.87%       -0.15%
Chinese (Traditional)   0.92%       0.92%       +0.62%
Turkey                  0.69%       0.74%       +6.81%
Korean                  0.68%       0.63%       -8.33%
The rest                3.80%       4.08%       +7.36%
Have translations       87.46%      87.13%      -0.38%
Don’t have translation  12.54%      12.87%      +2.62%

For the next release of Paint.NET, I am planning on adding some extra telemetry. Namely, I want to be able to count the total number of installations, first-time installations versus upgrade installations, and first-time runs of the application. This will let me gather some more useful information about the actual size of the Paint.NET installed base. The reporting will be completely anonymous, and of course it will be something you may opt out of.

Elusive Bugs: No known cause for missing resources, but solved regardless

Sometimes I get bug reports and, since no one else has ever reported them and they seem weird (“they” referring to the crash, not the user!), I write them off as a fluke, a one-time random occurrence, or what you might call “bit rot”. However, when you start to get many crash logs with the same error messages over and over again, it’s obviously none of those. That doesn’t mean anybody knows how to reproduce the error though!

For months, I used to receive e-mails with the following crash log:

Exception details:
System.Resources.MissingManifestResourceException: Could not find any resources appropriate for the specified culture (or the neutral culture) on disk.
baseName: PaintDotNet.Strings locationInfo: <null> fileName: PaintDotNet.Strings.resources

I could tell that the PaintDotNet.Strings.resources file was missing. I could even reproduce this crash by simply deleting the file. Fixing was usually an easy matter as well – the user just had to reinstall Paint.NET, or click “Repair” in the Add/Remove Programs control panel. However, I had no idea why or how the file was missing. Was it just being randomly deleted? Was it a race condition in the installer? Amazingly enough, I actually had this randomly happen to myself once or twice while doing regular installation and upgrading of some old builds of Paint.NET v3.0. So I knew for 100% sure that it was a real issue that had to be solved. And I wanted my inbox back! This was a top crash report for Paint.NET v2.6 through v2.72.

To this day, I still have no idea what causes this to happen. However, I did solve the problem for users! I decided that Paint.NET would auto-repair itself if it discovered any important files were missing. Thus, PdnRepair.exe was born!

All it really does is execute a “repair” operation on the Paint.NET installation, functionality that is built-in to Windows Installer via MsiReinstallProduct() and the REINSTALLMODE_FILEMISSING flag. This required a small change to the installer for Paint.NET, whereby it simply had to write Paint.NET’s product code GUID out to the registry so that PdnRepair would know which MSI it needed to repair (it ain’t psychic!).
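To make the idea concrete, here is a rough Python sketch of the same approach. It is not the actual PdnRepair source. The file list and the product code are placeholders supplied by the caller, and the registry lookup is left out because the key it uses is not described here. The only Windows Installer pieces used are MsiReinstallProduct (called through its wide-character entry point, MsiReinstallProductW) and the REINSTALLMODE_FILEMISSING flag.

```python
import os
import sys
import tempfile
from typing import List, Sequence

# Value of REINSTALLMODE_FILEMISSING in the Windows Installer header (msi.h).
REINSTALLMODE_FILEMISSING = 0x00000002


def find_missing(install_dir: str, required: Sequence[str]) -> List[str]:
    """Return the names in `required` that are absent from `install_dir`."""
    return [name for name in required
            if not os.path.isfile(os.path.join(install_dir, name))]


def repair(product_code: str) -> int:
    """Ask Windows Installer to restore missing files for one product.

    Returns the Win32 error code from MsiReinstallProduct (0 means success).
    """
    if sys.platform != "win32":
        raise OSError("Windows Installer is only available on Windows")
    import ctypes
    msi = ctypes.WinDLL("msi")
    msi.MsiReinstallProductW.argtypes = [ctypes.c_wchar_p, ctypes.c_uint32]
    msi.MsiReinstallProductW.restype = ctypes.c_uint
    return msi.MsiReinstallProductW(product_code, REINSTALLMODE_FILEMISSING)


def check_and_repair(install_dir: str, required: Sequence[str],
                     product_code: str) -> bool:
    """Run a repair only when something is missing; return True if one ran."""
    if not find_missing(install_dir, required):
        return False
    repair(product_code)
    return True


# Demo of the detection step only, in a throwaway directory.
with tempfile.TemporaryDirectory() as d:
    open(os.path.join(d, "PaintDotNet.exe"), "w").close()
    print(find_missing(d, ["PaintDotNet.exe", "PaintDotNet.Strings.resources"]))
    # ['PaintDotNet.Strings.resources']
```

Detection is kept separate from the repair call so that the check can run on every launch while the installer is invoked only when a file is actually missing.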

You can see this in action pretty easily. Just go to your installation directory for Paint.NET and delete any *.EXE, *.DLL, or *.resources files except for PaintDotNet.exe and PaintDotNet.SystemLayer.dll. Then, launch Paint.NET as normal.

Once you click Repair, a console window will show up with some diagnostic text, followed by some standard Windows Installer dialogs and progress bars. When it’s finished, Paint.NET will automatically relaunch itself. Pretty easy, eh? The only problem with this dialog is that the text had to be hard coded and is English-only. This is because the file that’s always missing in the crash logs I’ve received is the file with the string resources in it!
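The hard-coded text follows from a simple constraint: the repair prompt cannot depend on the very file it is offering to repair. A sketch of that fallback pattern, using a hypothetical string-table format, loader, key name, and message (none of them are Paint.NET's actual resources or code):

```python
# Hard-coded English text, used only when the string table cannot be loaded.
FALLBACK_REPAIR_PROMPT = "Some files are missing. Click Repair to fix the installation."


def load_strings(path: str) -> dict:
    """Hypothetical loader: read 'key=value' lines from a string-table file."""
    table = {}
    with open(path, encoding="utf-8") as f:
        for line in f:
            key, sep, value = line.rstrip("\n").partition("=")
            if sep:
                table[key] = value
    return table


def repair_prompt(strings_path: str) -> str:
    """Return the localized prompt, or the English fallback if unavailable."""
    try:
        return load_strings(strings_path).get("RepairPrompt", FALLBACK_REPAIR_PROMPT)
    except OSError:
        # The string file itself is missing or unreadable.
        return FALLBACK_REPAIR_PROMPT


print(repair_prompt("no-such-file.resources"))
```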

And now I only get about 1 or 2 of these crash logs per month and it’s always from folks who are still running version 2.72. Even though I was never able to get a solid repro on the bug, it was still solved and one of my top support issues was eradicated! It also served as a generic repair utility for any number of other small issues that may crop up.

Lastly, the source code for this utility is very straightforward. If you want to see how it’s done for your project that uses MSIs for installation, just go to the Paint.NET download page and grab the source code! It’s in the src/PdnRepair directory.