Returning null when something does not exist is a very common pattern. Suppose we have a system where a UserRepository object retrieves User objects from persistent storage, such as a database.

public class UserRepository
{
    public User Get(string email) =>
        context.Users.SingleOrDefault(u => u.Email == email);

    /* ... */
}

This is also a very common source of bugs. Consumers of the UserRepository routinely forget to check whether the reference returned is null, causing the code to blow up at runtime unexpectedly.

var user = userRepository.Get(nonexistentEmail);
var viewModel = new UserViewModel
{
    FullName = user.FullName // BOOM!!!
};

It could be argued that the design of the API of the UserRepository class is misleading because it does not indicate in any way that the reference returned by Get can be null. That argument, however, is not valid, at least technically. Since all references in the C# language can be set to null, it is therefore reasonable to expect that references returned by methods can also be null and should be checked. Just by declaring a reference return type, you indicate the possibility of returning a null reference.

But the fact remains that null checking is very easy to forget, and even if applied correctly every time, it is distracting and redundant. NullReferenceExceptions often surface many layers away from the source of the problem, which makes them hard to diagnose.

The Type of null

We are soon going to find out what can be done about this inconvenient situation, but first, let me show you something that might surprise you. Please tell me, without actually running it, the value of isString after running this code:

string str = null;
bool isString = str is string;

If you said true, you are in good company. Some of the best developers I know answered the same. Nevertheless, you would be wrong because of a little-known fact:

null is not only a value but also a type. The null type has a single value, which is the null value.
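The behavior is easy to confirm for yourself. The following is a minimal sketch; the NullTypeDemo class and its IsString helper are made up for illustration and are not part of the UserRepository example.

```csharp
using System;

public static class NullTypeDemo
{
    // True only when the reference points at an actual string instance.
    public static bool IsString(object value) => value is string;

    public static void Main()
    {
        string str = null;

        Console.WriteLine(IsString(str));     // False: null is not an instance of string
        Console.WriteLine(IsString("hello")); // True

        // The as operator agrees: it yields null when the runtime type check fails.
        object boxed = 42;
        Console.WriteLine((boxed as string) == null); // True
    }
}
```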

Let’s make up a new syntax for union types in C#.

string|null str = null;
bool isString = str is string; // false, of course
bool isStringOrNull = str is string|null; // true, of course

This nonexistent syntax would make explicit what is true of every reference-typed variable: that it may hold values of two different types, the declared type and null.

To make things worse, this is mentioned only in the C# 2.0 specification. No version since then mentions a null type, which effectively makes null a value without a type.

Documentation

After this little detour, let’s return to our original problem. How can we indicate that the reference returned by Get can be null?

Documentation, how else?

public class UserRepository
{
    /// <summary>
    ///     Retrieves the <see cref="User"/> with the <paramref name="email"/>
    ///     specified from the database.
    /// </summary>
    /// <param name="email">
    ///     The email address of the user to retrieve.
    /// </param>
    /// <returns>
    ///     The <see cref="User"/> with the <paramref name="email"/> specified
    ///     if it exists; otherwise, <c>null</c>.
    /// </returns>
    public User Get(string email) =>
        context.Users.SingleOrDefault(u => u.Email == email);

    /* ... */
}

Suddenly, our short function turned into a monster. And honestly, this is not user-friendly at all. It shifts responsibility to the client without providing any actual help - and don’t try to convince me that anyone reads the XML comments.

Throwing an Exception

So we saw that documenting the possibility of returning null is not good enough. Uncle Bob agrees with our conclusion.

"I think that any discussion about error handling should include mention of the things we do that invite errors. The first on the list is returning null."

So we really should find a more obvious way of indicating a missing user.

public class UserRepository
{
    public User Get(string email)
    {
        var user = context.Users.SingleOrDefault(u => u.Email == email);
        if (user == null)
            throw new NotFoundException(email);

        return user;
    }

    /* ... */
}

And what is more obvious than an exception?

Let’s look at how a naïve programmer using our API would implement the login function in our system.

// in the login method
var user = userRepository.Get(email);
if (user.PasswordHash != hash(password))
    // show error message and return
// handle successful login

The first time our conscientious programmer tests that function with a nonexistent user, they are greeted with a big fat exception. That is more useful than returning null because the exception is thrown at the exact location of the error rather than later, the first time a member is accessed on the reference. So, after it failed the test, the code is modified to catch the exception and handle the case of a missing user.

// in the login method
User user; // declared outside the try block so it is still in scope below
try
{
    user = userRepository.Get(email);
}
catch (NotFoundException)
{
    // show error message
    return;
}

if (user.PasswordHash != hash(password))
    // show error message and return
// handle successful login

The code now works correctly. But exceptions are slow, and try-catch blocks hurt readability. More importantly, a missing user is not at all an exceptional case. It is part of the normal behavior of our program. Exceptions should not be used for control flow; they are better reserved for programming errors or truly exceptional occurrences, like running out of disk space while writing to a file.

Providing an Exists Method

A quick, and actually not that bad, way to fix this issue is to provide an Exists method that can be used to check whether the user exists before trying to retrieve it.

public class UserRepository
{
    public User Get(string email)
    {
        var user = context.Users.SingleOrDefault(u => u.Email == email);
        if (user == null)
            throw new NotFoundException(email);

        return user;
    }

    public bool Exists(string email) { /* ... */ }

    /* ... */
}

This makes the client code more readable and avoids having to deal with the exception every time a nonexistent email address is specified. But it still suffers from the problem that nothing in the signature of Get indicates that the user may be missing, so callers can still forget the check. Another problem is that this solution is less efficient than the previous ones, as it requires two database queries.
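Here is a rough sketch of what the check-then-get call site looks like and why it costs two lookups. Everything in it is an assumption made for illustration: InMemoryUserRepository is a hypothetical dictionary-backed stand-in for the database, QueryCount exists only to make the cost visible, and KeyNotFoundException takes the place of the NotFoundException used above.

```csharp
using System;
using System.Collections.Generic;

public class User
{
    public string Email { get; set; }
    public string FullName { get; set; }
}

// Hypothetical in-memory stand-in for the database-backed repository.
public class InMemoryUserRepository
{
    private readonly Dictionary<string, User> users = new Dictionary<string, User>();

    // Counts lookups so the cost of the check-then-get approach is visible.
    public int QueryCount { get; private set; }

    public void Add(User user) => users[user.Email] = user;

    public bool Exists(string email)
    {
        QueryCount++;
        return users.ContainsKey(email);
    }

    public User Get(string email)
    {
        QueryCount++;
        User user;
        if (!users.TryGetValue(email, out user))
            throw new KeyNotFoundException(email);

        return user;
    }
}

public static class ExistsDemo
{
    public static void Main()
    {
        var repository = new InMemoryUserRepository();
        repository.Add(new User { Email = "ann@example.com", FullName = "Ann Example" });

        if (repository.Exists("ann@example.com"))
        {
            var user = repository.Get("ann@example.com");
            Console.WriteLine(user.FullName); // Ann Example
        }

        Console.WriteLine(repository.QueryCount); // 2: one lookup for Exists, one for Get
    }
}
```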

The Try-Get Pattern

Parsing integers entered by the end user is also not a fitting use case for exception-based error reporting. The designers of the .NET Framework recognized this, so they provided two parsing methods on the Int32 struct: one that throws an exception in case of failure, and one that returns a bool value to indicate success and writes its result into an out variable when parsing succeeds.

var input = "a";
int i;
i = int.Parse(input); // throws FormatException

if (int.TryParse(input, out i))
    // use i
else
    // handle format error

We can employ the same pattern for our UserRepository class.

public class UserRepository
{
    public bool TryGet(string email, out User user)
    {
        user = context.Users.SingleOrDefault(u => u.Email == email);

        return user != null;
    }

    /* ... */
}

This has the advantage of being a standard, if a bit clunky, pattern in .NET that every C# programmer worth their salt recognizes.

// in the login method
User user;
if (!userRepository.TryGet(email, out user))
    // show error message and return
if (user.PasswordHash != hash(password))
    // show error message and return
// handle successful login
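The same shape shows up elsewhere in the base class library, for example in Dictionary<TKey, TValue>.TryGetValue. A small sketch follows; the TryGetValueDemo class and its NameOrDefault helper are invented for this example.

```csharp
using System;
using System.Collections.Generic;

public static class TryGetValueDemo
{
    // Returns the stored name for the email, or the fallback when the key is missing.
    public static string NameOrDefault(
        Dictionary<string, string> names, string email, string fallback)
    {
        string name;
        return names.TryGetValue(email, out name) ? name : fallback;
    }

    public static void Main()
    {
        var names = new Dictionary<string, string>
        {
            { "ann@example.com", "Ann Example" }
        };

        Console.WriteLine(NameOrDefault(names, "ann@example.com", "(unknown)")); // Ann Example
        Console.WriteLine(NameOrDefault(names, "bob@example.com", "(unknown)")); // (unknown)
    }
}
```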

A new feature, out variables, that will appear in C# 7, will make this a bit more compact. You will be able to declare the user variable inside the function call expression.

// in the login method
if (!userRepository.TryGet(email, out var user))
    // show error message and return
if (user.PasswordHash != hash(password))
    // show error message and return
// handle successful login

Null Object

We can also consider returning a null object called NonexistentUser instead of null if a user does not exist. Both this class and the User class would implement an IUser interface that defines all of the properties of the earlier User class plus an extra Exists property, which would return a constant true in User and false in NonexistentUser instances. But what should the other properties, like FullName, return? Returning null or throwing an exception would make this solution just as problematic as returning null from the repository. Returning a default value, like an empty string, would introduce unpredictable, sneaky behavior into our system.
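To make the objection concrete, here is a stripped-down sketch of that design. It assumes the empty-string option for FullName and reduces IUser to two members; the Greet helper is hypothetical. A caller that forgets to check Exists gets no error at all, only quietly wrong output.

```csharp
using System;

public interface IUser
{
    bool Exists { get; }
    string FullName { get; }
}

public class User : IUser
{
    public User(string fullName) { FullName = fullName; }

    public bool Exists => true;
    public string FullName { get; }
}

public class NonexistentUser : IUser
{
    public bool Exists => false;

    // Assumption for this sketch: missing data is represented by an empty string.
    public string FullName => string.Empty;
}

public static class NullObjectDemo
{
    // Hypothetical consumer that forgets to check Exists.
    public static string Greet(IUser user) => "Hello, " + user.FullName + "!";

    public static void Main()
    {
        Console.WriteLine(Greet(new User("Ann Example"))); // Hello, Ann Example!
        Console.WriteLine(Greet(new NonexistentUser()));   // Hello, !
    }
}
```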

We can conclude that the Null Object pattern is not a good solution to our problem.

Expressing Nullability in the Type System

Let’s imagine a version of C# that does not allow null as the value of a reference. Everything stays the same, except null is from now on banned from the language. The idea is not as crazy as it sounds. Lots of languages work like that, for example F#, Haskell, OCaml, Rust and Swift. How could we express that our UserRepository may or may not return a User in such a language?

If we look at all of the languages mentioned, they solve the problem in exactly the same way, using a type called Maybe or Option, based on the same idea as Nullable types in C#, as discussed earlier on this blog.

This is a great idea because now we can use the Option type when we actually want to express that a value can be missing, and use regular (non-nullable) reference types everywhere else.
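For comparison, this is how the built-in Nullable<T> already expresses optionality for value types. The snippet is only a small illustration; the NullableDemo class and its Describe helper are made up for this example.

```csharp
using System;

public static class NullableDemo
{
    // int? is shorthand for Nullable<int>: the type itself says the value may be missing.
    public static string Describe(int? age) =>
        age.HasValue ? "age is " + age.Value : "age is unknown";

    public static void Main()
    {
        Console.WriteLine(Describe(42));   // age is 42
        Console.WriteLine(Describe(null)); // age is unknown
    }
}
```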

We can build such an Option type in regular C# to express optionality:

public struct Option<T> where T : class
{
    public static Option<T> None => default(Option<T>);
    public static Option<T> Some(T value) => new Option<T>(value);
    public static Option<T> From(T value) =>
        value == null ? None : Some(value);

    public bool HasValue { get; private set; }

    private readonly T value;
    public T Value
    {
        get
        {
            if (!HasValue)
                throw new InvalidOperationException();

            return value;
        }
    }

    private Option(T value)
    {
        if (value == null)
            throw new ArgumentNullException(nameof(value));

        this.value = value;
        HasValue = true;
    }

    public void Match(Action<T> ifSome = null, Action ifNone = null)
    {
        if (ifSome != null && HasValue)
            ifSome(Value);
        if (ifNone != null && !HasValue)
            ifNone();
    }

    /* Equals, GetHashCode and conversion operators omitted for brevity */
}

While this does not solve the problem that any reference-typed value can be null, it can make our code more expressive.

public class UserRepository
{
    public Option<User> Get(string email)
    {
        var user = context.Users.SingleOrDefault(u => u.Email == email);
        return Option<User>.From(user);
    }

    /* ... */
}

// in the login method:
var user = userRepository.Get(nonexistentEmail);
user.Match(
    ifSome: u => handleExistingUser(u),
    ifNone: () => handleNonexistentUser());
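When the two branches need to produce a value rather than perform an action, a value-returning Match can be handy. The following is one possible sketch, not part of the Option type shown above: it repeats a cut-down copy of the struct so that it compiles on its own, and the Match<TResult> overload and the MatchDemo class are hypothetical additions.

```csharp
using System;

// Cut-down copy of the Option<T> idea, reduced to what this example needs.
public struct Option<T> where T : class
{
    private readonly T value;

    private Option(T value)
    {
        this.value = value;
        HasValue = true;
    }

    public bool HasValue { get; }

    public static Option<T> None => default(Option<T>);

    public static Option<T> From(T value) =>
        value == null ? None : new Option<T>(value);

    // Hypothetical addition: both handlers are required, and exactly one of them runs.
    public TResult Match<TResult>(Func<T, TResult> ifSome, Func<TResult> ifNone)
    {
        if (ifSome == null) throw new ArgumentNullException(nameof(ifSome));
        if (ifNone == null) throw new ArgumentNullException(nameof(ifNone));

        return HasValue ? ifSome(value) : ifNone();
    }
}

public static class MatchDemo
{
    public static string Describe(Option<string> name) =>
        name.Match(
            ifSome: n => "Welcome back, " + n,
            ifNone: () => "No such user");

    public static void Main()
    {
        Console.WriteLine(Describe(Option<string>.From("Ann"))); // Welcome back, Ann
        Console.WriteLine(Describe(Option<string>.From(null)));  // No such user
    }
}
```

Requiring both handlers means the caller cannot silently ignore the missing case, at the price of a slightly heavier call site.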

My personal opinion is that this added expressivity is not worth the effort of introducing this awkward and non-standard concept into our projects.

Unfortunately, completely disallowing null references in C# while keeping backwards-compatibility is extremely hard, as Eric Lippert points out.

The Case for Nesting

(coming soon)

Non-Nullability and the Future of C#

(coming soon)

References

  1. Null has no type, but Maybe has by Mark Seemann
  2. What is the type of the null literal? by Eric Lippert
  3. Clean Code: A Handbook of Agile Software Craftsmanship by Robert C. Martin