A fluent approach to C# parameter validation

Fluent programming gets a bad reputation, since some developers like to write code like the following:

var time = 7.Days().Plus(4.Hours())

Barf. However, when used properly, I think it’s very powerful. Let’s look at a typical method with some parameter validation:

// Copy src[srcOffset, srcOffset + length) into dst[dstOffset, dstOffset + length)
public static void Copy<T>(T[] dst, long dstOffset, T[] src, long srcOffset, long length)
{
    if (dst == null)
        throw new ArgumentNullException(“dst”);

    if (src == null)
        throw new ArgumentNullException(“src”);

    if (dstOffset + length > dst.Length || dstOffset < 0)
        throw new ArgumentOutOfRangeException(
            “dst, dstOffset, length”,
            string.Format(“dst range is not within bounds, ({0} + {1}) > {2}”, dstOffset, length, dst.Length));

    if (srcOffset + length > src.Length || srcOffset < 0)
        throw new ArgumentOutOfRangeException(
            “src, srcOffset, length”,
            string.Format(“src range is not within bounds, ({0} + {1}) > {2}”, srcOffset, length, src.Length));

    if (length < 0)
        throw new ArgumentOutOfRangeException(“length”, “length must be >= 0, ” + length.ToString());

    for (int di = dstOffset; di < dstOffset + length; ++di)
        dst[di] = src[di – dstOffset + srcOffset];

}

That’s a lot of code for parameter validation, but in a robust system it is necessary. For debugging purposes, having all the information in there with the actual parameter values is invaluable, so that you can get a stack trace that tells you, “Length was too big. It was 50, but the max was 49.”

The problem here is twofold. One, code like this gets sprinkled all over the codebase of a large application and so it gets repetitive, tiresome, and is a bug hazard. Having an off-by-1 error is many times worse if it’s in your validation code. Or, because it’s tiresome, sometimes there just won’t be any validation.

The second problem is actually much more subtle. Ask yourself this: if both src and dst are null, what exception does the caller get? (and subsequently, what goes into the crash log or Watson upload?) It will only tell you that dst is null. This leads to more iterations in debugging than is optimal, where you fix the problem of dst equaling null only to immediately get it crashing on you again when src is null. If the exception told you about both errors, you could have saved a lot of time.

This happens more often than I’d like when debugging issues on other people’s systems, especially ones I don’t have any direct access to (physical or remote, ala Remote Desktop). The end-user will post a Paint.NET crashlog to the forum, I’ll fix it and send them a patch or new build, and then the same method will crash on the very next line of code. This is especially relevant to methods for graphics which take parameters for stuff like width, height, location, bounding box, etc. The X value may be bad, but the Y value might also be bad. I need to know about both, along with the valid ranges (and not just “out of range”).

There are times where I have fixed issues with no direct interaction with a user: if I get a bunch of crash logs for a certain issue, but I can’t reproduce it, I have often been able to fix it by incorporating a hopeful and conservative fix into the next release and then monitoring to make sure that no more crash logs come in. And yes, I’ve done that many times with Paint.NET.

Reporting an aggregated judgement like this is just not fun. To go the extra mile you need to create a StringBuilder, decide on the preciding exception type, manage concatenation of multiple parameter names (“sentence-ization”), etc. Like this …

public static void Copy<T>(T[] dst, long dstOffset, T[] src, long srcOffset, long length)
{
    StringBuilder sb = new StringBuilder();
       
    if (dst == null)
        sb.Append(“dst. “);

    if (src == null)
        sb.Append(“src. “);

    if (sb.Length > 0)
        throw new ArgumentNullException(sb.ToString());

    if (dstOffset + length > dst.Length || dstOffset < 0)
        …

    if (srcOffset + length > src.Length || srcOffset < 0)
        …

    if (length < 0)
        …

    if (sb.Length > 0)
        throw new ArgumentOutOfRangeException(sb.ToString());

    …
}

Boo. This is still tiresome, and creates extra objects, etc. Because of the extra work involved, this tends to be done reactively instead of proactively. Only the “hot” methods get the comprehensive logic.

I’ve come up with another method. Check this out:

public static void Copy<T>(T[] dst, long dstOffset, T[] src, long srcOffset, long length)
{
    Validate.Begin()
            .IsNotNull(dst, “dst”)
            .IsNotNull(src, “src”)
            .Check()
            .IsPositive(length)

            .IsIndexInRange(dst, dstOffset, “dstOffset”)
            .IsIndexInRange(dst, dstOffset + length, “dstOffset + length”)
            .IsIndexInRange(src, srcOffset, “srcOffset”)
            .IsIndexInRange(src, srcOffset + length, “srcOffset + length”)
            .Check();

    for (int di = dstOffset; di < dstOffset + length; ++di)
        dst[di] = src[di – dstOffset + srcOffset];
}

Yow! Ok that’s much easier to read. And here’s the kicker: if no problems are found with your parameters, then no extra objects are allocated. The cost for this pattern is only in the extra method calls.

There are three classes involved here: Validate, Validation, and ValidationExtensions. Here’s the Validate class:

public static class Validate
{
    public static Validation Begin()
    {
        return null;
    }
}

That was easy. This allows us to not allocate a “Validation” object, and its enclosed fields, until we actually encounter a problem. The presiding philosophy in code that uses exception handling is to optimize for the non-exceptional code path, and that’s exactly what we’re doing here. Here’s the Validation class:

public sealed class Validation
{
    private List<Exception> exceptions;

    public IEnumerable<Exception> Exceptions
    {
        get
        {
           
return this.exceptions;
       
}
    }

    public Validation AddException(Exception ex)
    {
        lock (this.exceptions)
        {
            this.exceptions.Add(ex);
        }

        return this;
    }

    public Validation()
    {
        this.exceptions = new List<Exception>(1); // optimize for only having 1 exception
    }
}

It’s basically just a list of exceptions. AddException() returns ‘this’ to make some of the code in the ValidationExtensions class easier to write. Check it out:

public static class ValidationExtensions
{
    public static Validation IsNotNull<T>(this Validation validation, T theObject, string paramName)
        where T : class
    {
        if (theObject == null)
            return (validation ?? new Validation()).AddException(new ArgumentNullException(paramName));
        else
            return validation;
    }

    public static Validation IsPositive(this Validation validation, long value, string paramName)
    {
        if (value < 0)
            return (validation ?? new Validation()).AddException(new ArgumentOutOfRangeException(paramName, “must be positive, but was ” + value.ToString()));
        else
            return validation;
    }

    …

    public static Validation Check(this Validation validation)
    {
        if (validation == null)
            return validation;
        else
        {
            if (validation.Exceptions.Take(2).Count() == 1)
                throw new ValidationException(message, validation.Exceptions.First()); // ValidationException is just a standard Exception-derived class with the usual four constructors
            else
                throw new ValidationException(message, new MultiException(validation.Exceptions)); // implementation shown below
        }
    }
}

The sum of these collections allows us to write validation code in a very clean and readable format. It reduces friction for having proper validation in more (or all? :)) methods, and reduces the bug hazard of either incorrect or omitted validation code.

Missing from this implementation, and other kinks to work out:

  • Could use lots of additional methods within ValidationExtensions. (some were omitted for brevity in this blog post)
  • Calling ValidationExtensions.Check() is itself not validated. So, if you forget to put a call to it at the end of your validation expression then the exception will not be thrown. Often you’ll end up plowing into a null reference and getting a NullReferenceException, especially if you were relying on ValidationExtensions.IsNotNull(), but this isn’t guaranteed for the other validations (esp. when dealing with unmanaged data types). It would be simple to add code to Validation to ensure that its list of exceptions was “observed”, and if not then in the finalizer it could yell and scream with an exception.
  • The exception type coming out of any method that uses this will be ValidationException. This isn’t an issue for crash logs, but it is for when you call a method and want to discriminate among multiple exception types and decide what to do next (e.g., FileNotFoundException vs. AccessDeniedException). I’m sure there’s a way to fix that, with better aggregation, and (hopefully) without reflection.
  • Should probably change the IEnumerable<Exception> in Validation to be Exception[].

Here’s the implementation of MultiException, as promised in the code above. And, in fact, it’s incomplete because it does not print all of the exceptions in a ToString() type of call. Umm … how about I leave that as an exercise for the reader? 🙂

[Serializable]
public sealed class MultiException
    : Exception
{
    private Exception[] innerExceptions;

    public IEnumerable<Exception> InnerExceptions
    {
        get
        {
            if (this.innerExceptions != null)
            {
                for (int i = 0; i < this.innerExceptions.Length; ++i)
                {
                    yield return this.innerExceptions[i];
                }
            }
        }
    }

    public MultiException()
        : base()
    {
    }

    public MultiException(string message)
        : base()
    {
    }

    public MultiException(string message, Exception innerException)
        : base(message, innerException)
    {
        this.innerExceptions = new Exception[1] { innerException };
    }

    public MultiException(IEnumerable<Exception> innerExceptions)
        : this(null, innerExceptions)
    {
    }

    public MultiException(Exception[] innerExceptions)
        : this(null, (IEnumerable<Exception>)innerExceptions)
    {
    }

    public MultiException(string message, Exception[] innerExceptions)
        : this(message, (IEnumerable<Exception>)innerExceptions)
    {
    }

    public MultiException(string message, IEnumerable<Exception> innerExceptions)
        : base(message, innerExceptions.FirstOrDefault())
    {
        if (innerExceptions.AnyNull())
        {
            throw new ArgumentNullException();
        }

        this.innerExceptions = innerExceptions.ToArray();
    }

    private MultiException(SerializationInfo info, StreamingContext context)
        : base(info, context)
    {
    }
}

Advertisements

50 thoughts on “A fluent approach to C# parameter validation

  1. Mats G says:

    Hmm… You’re giving me ideas.

    Now, I use PHP, where the syntax would be a little different, but… might just try something like that. 🙂

  2. flipdoubt says:

    Truly slick.

    Rick, are you saying Validate, Validation, and ValidationExtensions are meant as a “generic example” or the problematic “Copy” method? I don’t see why Validate, Validation, and ValidationExtensions are not production ready …

  3. Zyxil says:

    In Validation.Check() you have:

    if (validation.Exceptions.Take(2).Count() == 1)
    throw new ValidationException(message, validation.Exceptions.First());

    Why not just:
    throw validation.Exceptions.First();
    ?

    So a single validation failure is of the type that failed, and multiple failures is a MultiException.

  4. Mike L says:

    Using “throw new ValidationException(message, validation.Exceptions.First());” will preserve the original stack trace.

  5. Jon Fuller says:

    Great idea! I just wrote a post last week on how to use static reflection to make parameter validation better too (no more passing the parameters, and the name of that parameter to include in the thrown exception).

    I think your “framework” above combined with my static reflection trick make for highly readable and elegant parameter validation.

  6. Rick Brewster says:

    Jon Fuller — Hmm. Your approach has significant overhead, with allocation of many temporary objects and emission of IL as a result of using Expression.

    The great part of the approach I list above is that *no* objects are allocated in the mainline code path.

    What we really need is a nameof() operator, or a paraminfo() operator.

    var n = nameof(x); // n is the string “x”
    var p = paraminfo(x); // p is a tuple, { “x”, x }

  7. MAlonso says:

    I fell in love with this method of validating the second I saw it. My only problem with it is the continuous checking to see if the instance of Validator is null or not. It makes the code ugly and you are violating the DRY principle. I don’t think the minimal cost of instantiating an object is worth having that code strewn about all over the place. This is what I don’t like:

    (validation ?? new Validation())

    I know we all love it when we get to use the null coalescing operator, but this is bad use of it.

  8. Rick Brewster says:

    MAlonso — I completely disagree. This code is not “all over”, it’s only within the ValidationExtensions class. The number one priority here is to avoid object allocations unless a validation error (exception) occurs. Otherwise the performance cost is higher than just doing everything manually, and so then anytime you want to use this pattern you have to decide, “Hmm, is the performance worth it?” This way the answer is always yes and the class is universally applicable. It’s one less thing to worry about. Once you’ve written the methods for ValidationExtensions, you’re done.

    This is also *not* a violation of the DRY principle.

  9. Alexey Romanov says:

    Why do you define no-arguments and message-only constructors for MultiException? Sure, the guidelines say so, but this seems to me to be a case where they are wrong.

    As a marginal point of style I would strongly prefer to invert the if conditions in validation checks, so they would actually check what they say:

    public static class ValidationExtensions
    {
    public static Validation IsNotNull(this Validation validation, T theObject, string paramName)
    {
    if (theObject != null)
    return validation;
    return (validation ?? new Validation()).AddException(new ArgumentNullException(paramName));
    }

    IsNotNull probably should not have the “T : class” constraint, since it can also be useful for Nullable struct.

    Finally, given that so many people love it, would you consider making this into a Google Code (or CodePlex, or whatever) project?

  10. Rick Brewster says:

    Alexey — Ok so when/if you adapt the code, feel free to make those changes. The point of the article was not to analyze MultiException. I see no point in making a Google Code / Codeplex project for it, as it’s small enough that you can just adapt it from what you see above.

    IsNotNull would need another override for Nullable. It’s a struct, and the ability to assign and compare to ‘null’ is a operator overloading trick (maybe there’s some compiler syntactic sugar in there too).

  11. Alexey Romanov says:

    Just checked that this test passes:

    int? nullableIntNull = null;
    Assert.Throws(() =>
    Validate.Begin()
    .IsNotNull(nullableIntNull, “nullableIntNull”)
    .Check());

    Sure it’s small, but it would be easier to find and add to. On the other hand, this post _is_ currently the third Google hit for “C# validation”, for me at least.

  12. Barbie Brooks says:

    It’s funny you should mention the hazards of off by one errors in your validation code! Shouldn’t this be:

    .IsIndexInRange(dst, dstOffset + length – 1, “dstOffset + length”)

    I assume being InRange means i being valid for an array[i] access (i>=0 && i < array.Length)

  13. Rick Brewster says:

    Alexey — It may not be doing what you want it to, however. My bet (and I’m happy to be proven wrong) is that it’s calling Object.Equals(object, object). That means that what you’re passing to it is being boxed, nor will it correctly compare ‘reall null’ to Nullable’s ‘null’. No good 😦

  14. Ifeanyi Echeruo says:

    That’s awfully useful. Thanks for sharing. BTW a compromise between always having to test (validation ?? new Validation()) and not allocating heap objects is to change Validation from a class to a struct. C# structs are allocated on the stack (basically incrementing a stack pointer and zeroing memory) not the heap so the GC isn’t involved.

  15. Joe Chung says:

    var time = 7.Days().Plus(4.Hours())

    is ugly, but only because you failed to use overloaded operators, not because of any failing in extension methods:

    var time = 7.Days() + 4.Hours()

    is fine and is just sugar for:

    var time = TimeSpan.FromDays(7) + TimeSpan.FromHours(4)

  16. Rick Brewster says:

    Ifeanyi — Yes, I know what a struct is. If Validation were a struct, there’d be a lot of added copying of data to and from the stack as the methods were called. Not to mention the JIT is currently known to be not as good at inlining or optimizing methods that take structs as parameters. The ‘start with a null reference’ pattern is honestly the cheapest way to go in order to optimize for the non-exceptional case.

  17. Rick Brewster says:

    Joe Chung — I still dislike it; I hardly failed at anything. The methods are very context-sensitive and it’s easy to wind up with an Intellisense dropdown filled to the brim with all these little one-off things. It doesn’t scale.

    Imagine if *everyone* started adding extension methods for every little word they thought would be neat and would “bridge the gap to English” …

    var myPets = 7.Dogs() + 2.Cats() + 3.Goldfish();

    var myIP = 127.Zero().Zero().One();

    etc. etc. etc.

    I repeat: *barf*

  18. Alexey Romanov says:

    Yes, it is boxing structs and calling Object.Equals(object, object). However, null in Nullable is boxed into a null reference, so this works as desired. It may be better to have a more specific overload for Nullable as well, to prevent boxing.

    From MSDN:
    “Boxing and Unboxing

    If the HasValue property of a nullable type is false, the result of a boxing operation is null reference. Consequently, if a boxed nullable type is passed to a method that expects an object argument, that method must be prepared to handle the case where the argument is null reference.”

  19. Rick Brewster says:

    Alexey — Hmm, cool to know! I haven’t used Nullable much so I haven’t really looked into it much. I would still have an overload that specialized on Nullable, because then the non-exceptional case (non-null) will not have boxing.

    Joe Chung — I finally was able to articulate to myself why I strongly dislike the “7.Hours()” pattern. What you’re doing is essentially hanging a TimeSpan constructor off of an unrelated type (Int32). Or at least, from Int32’s perspective it has nothing to do with TimeSpan.

    Consider type R which take types T(1) through T(n) to construct (e.g., a Point takes a double, or a Timespan takes a list of Int32s or a Long), and then hang extension methods on types T(1) through T(n) for the benefit of constructing an R. You will have a *lot* of extension methods. Now extrapolate this to the full cross product of R(1)…R(m) and the types they take for construction, T(R(1), 1) …T(R(1), n) up through T(R(m), 1)…T(R(m), n).

    You end up with the “big pile of utility code” anti-pattern. (Paint.NET has this problem, in fact, in a class named … tada … Utility.)

    Also, what happens when you have different time representations? What does Hours() mean, and how do you disambiguate? For example, System.TimeSpan versus. various other interop formats (FILETIME, SYSTEMTIME, ticks, RDTSC, etc.). It is still much better to specify the target context first (TimeSpan). Otherwise you’re just gaming the system and you have to keep introducing implicit conversion operators, “bounce” types, and more extension methods.

    (A “bounce” type, B, is one that you use as an interim factory object towards constructing T using a syntactic style that T does not support.)

  20. Alexey Romanov says:

    I was surprised, but given these two overloads:

    public static Validation IsNotNull(this Validation validation, T t, string paramName)

    public static Validation IsNotNull(this Validation validation, T? t, string paramName) where T : struct

    the compiler picks the desired case correctly in both of these situations:

    object obj = null;
    int? nullableIntNull = null;

    Requires.That
    .IsNotNull(objNull, “obj”)
    .IsNotNull(nullableIntNull, “nullableIntNull”);

    I expected an ambiguous overload error here.

  21. Alexey Romanov says:

    To those who don’t like repeating (validation ?? new Validation()) why not just add this:

    private static Validation AddException1(this Validation validation, Exception exception) {
    return (validation ?? new Validation()).AddException(exception);
    }

    It should be inlined, the control flow is simple enough and it’s small enough.

  22. Rick Brewster says:

    Alexey — The compiler seems to “correctly” pick the more specialized overload. Offhand, I don’t know what rules it uses. I’m sure the C# language spec spells it out pretty clearly.

  23. Alexey Romanov says:

    Getting back to “The exception type coming out of any method that uses this will be ValidationException” issue. Mike L’s comment says it is to “preserve the original stack trace.” I don’t see how it helps. Given these methods:

    public void AMethodInAStack1() {
    Requires.That
    .IsNotNull(null, “obj”)
    .Check();
    }

    public void AMethodInAStack2() {
    AMethodInAStack1();
    }

    [Fact]
    public void AMethodInAStack3() {
    AMethodInAStack2();
    }

    I get this exception with use of ValidationException in Check(this Validation):

    FP.Validation.ValidationException : Exception of type ‘FP.Validation.ValidationException’ was thrown.
    —- System.ArgumentNullException : Value cannot be null.
    Parameter name: obj
    F:\MyProgramming\FP\FP.Validation\Validation.cs(43,0): at FP.Validation.Validation.Throw()
    F:\MyProgramming\FP\FP.Validation\ValidationExtensions.cs(215,0): at FP.Validation.ValidationExtensions.Check(Validation validation)
    F:\MyProgramming\FP\FPTests\ValidationTests.cs(50,0): at FPTests.ValidationTests.Stack1()
    F:\MyProgramming\FP\FPTests\ValidationTests.cs(56,0): at FPTests.ValidationTests.Stack2()
    F:\MyProgramming\FP\FPTests\ValidationTests.cs(61,0): at FPTests.ValidationTests.Stack3()
    —– Inner Stack Trace —–

    and this one without:

    System.ArgumentNullException : Value cannot be null.
    Parameter name: obj
    F:\MyProgramming\FP\FP.Validation\Validation.cs(43,0): at FP.Validation.Validation.Throw()
    F:\MyProgramming\FP\FP.Validation\ValidationExtensions.cs(215,0): at FP.Validation.ValidationExtensions.Check(Validation validation)
    F:\MyProgramming\FP\FPTests\ValidationTests.cs(50,0): at FPTests.ValidationTests.Stack1()
    F:\MyProgramming\FP\FPTests\ValidationTests.cs(56,0): at FPTests.ValidationTests.Stack2()
    F:\MyProgramming\FP\FPTests\ValidationTests.cs(61,0): at FPTests.ValidationTests.Stack3()

    So what’s the difference? :confused: Indeed, Reflector shows that the constructor of Exception doesn’t set its stack trace, that’s only set by throwing it. And since our Validation extension methods create exception, but don’t throw them, there is no original stack trace to preserve! Am I missing something?

  24. Mark Stafford says:

    Why not just use PostSharp? We’ve had tremendous success battling the same validation issues, and now we have beautiful code that uses attributes for validation. The only thing it wouldn’t get you is the interspersed checks, but I’m sure it would be possible to work that out as well.

  25. Rick Brewster says:

    Mark Stafford — Wouldn’t attributes require reflection in order to enforce them? The whole point of this pattern is to avoid introducing any overhead so that it can be used in any method — including light-weight get/set property accessors. Reflection is prohibitively expensive at runtime for the most part.

  26. Mark Stafford says:

    Rick,

    The whole point of PostSharp was to create AOP that works at compile time. Agreed that usage of reflection to enforce parameter validation is probably a bad idea, but you might be VERY surprised at how much PostSharp gets you with no run-time hit. The only downside is that there is a bit of natural obfuscation that is emitted in the IL as PostSharp generates code. I really encourage you to check out Gael’s videos – they should get you up and running quickly.

  27. Rick Brewster says:

    Mark — makes sense 🙂 The whole point of the above code is to enable the continued use of standard imperative programming with no performance cost on the non-exceptional code path, and to easily do composite exception reporting (“x is bad, and y is also bad”). (Not that Post# necessarily has a perf hit — just stating that as a goal for what I was doing.)

  28. Joan Venge says:

    Hi Rick, I assume you are the author of this article 🙂

    What do you think about the new DbC additions in .NET and C#?

    I think contracts are really the best approach to develop bullet-proof class frameworks.

    I just wish they bring in the Spec#’s full contractual syntax into C# as it looks very slink and complementary.

    What do you think about these?

  29. Rinat Abdullin says:

    Disadvantages of this approach – code looks bulky, plus refactoring could rename variable while keeping its name different. I prefer for null checks (variable type and name are used if exception is to be thrown and this is not an expression-based approach):

    Enforce.Arguments(() => controller, () => viewManager,() => workspace);

    or with rules

    Enforce.Arguments(() => username, StringIs.Limited(2,65), StringIs.ValidEmail);

    Enfor

  30. Flominator says:

    Thanks for this post. I’m quite new to C# and I wasn’t able to get your code running 😦

    * public static Validation Check () seem to lack the definition of “message”. Is it a string?

    * System.Collections.Generic.IEnumerable doesn’t seem to support First(), FirstOrDefault(), AnyNull() and ToArray()

    I’d be thankful for any clues.

    Thanks,

    Flo

  31. Johannes Rössel says:

    Flominator: You need .NET 3.0 or 3.5 for that, since the methods you describe as lacking belong to LINQ, which wasn’t present in .NET 2. You may also simply lack an assembly reference.

  32. Flominator says:

    Johannes meanwhile replied via email and told me, that you could implement something like this:

    public static boolean AnyNull(this IEnumerable ie)
    {
    foreach (x in ie)
    if (x == null) return true;
    return false;
    }

    Thank you, Johannes

Comments are closed.