On null and undefined

An important distinction?

I came across a tweet the other day (no link, sorry), where the author said they'd stopped using null in JavaScript, instead using only undefined. The author was encouraging the readers to do the same. From what I can remember, the reason for this was that mixing null and undefined can be confusing and lead to weird bugs in your application.

I've also heard a number of dev friends express negative emotions towards the fact that JavaScript has both null and undefined, saying it makes things harder.

However, I'm in the camp that thinks having both null and undefined is a good thing, because they are separate concepts. In this post, I want to put forward the way I view them, both in JavaScript itself and as part of a JSON API.

Note: this is largely my own opinion, and I do not have a significant amount of literature to back it up. I welcome opinions and discussions on this!

JavaScript

In JavaScript, both null and undefined represent an absence of value. However, the kinds of absence they represent are quite different.

undefined values are values that have simply not been defined. MDN says that 'undefined is a primitive value automatically assigned to variables that have just been declared, or to formal arguments for which there are no actual arguments.':

    let x // x is `undefined`

    var y // y is `undefined`

    const f = (x) =>
      console.log('x is', x)

    f() // prints 'x is undefined'

Additionally, the return value of a function that doesn't return anything is undefined:

    // returns `undefined`
    const g = () => {}

On the other hand, null is never implicitly assigned. It must always be assigned explicitly by the programmer. This makes null well suited for cases when you want to represent a value that can either be present or not, similar to Haskell's Maybe and Rust's Option.

In short

An undefined value is something that hasn't been assigned, while a null value has been assigned that specific value.

Does that matter in your application code? I'm inclined to think that it's not much of a big deal whether you use only one or both. In my experience, most cases will be testing the 'truthiness' of a value anyway, so null and undefined can intermingle freely.

Where it does matter, though, is in areas where these two concepts carry different meanings, such as when communicating via JSON, which is what we'll be looking at in the next section.

Personally, I'm 'Team Both'. I'll use undefined for optional parameters to React components or optional keys of object arguments that I expect, and I'll set a value to null if I intend it to carry a semantic meaning.

What do I think you should do? Whatever fits your use case. If you have no need to differentiate between these two, then it really doesn't matter much what you do. If you'd prefer to only use the one, that's perfectly fine by me.

JSON and APIs

While null and undefined may carry largely the same meaning in JavaScript, they have quite distinct meanings in JSON[1] and when communicating with an API. This is especially true when making PATCH requests.

The RFC for the JSON Merge Patch document format (7396) states that when patching, a field that is missing from the patch (the JSON counterpart of an undefined value) should be left as is in the target, while a field that is set to null should be removed from it. This means that the two values have wildly different meanings and that mixing them up can cause some very unwanted side effects.
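To make those rules concrete, here's a minimal sketch of them for a flat object. I've written it in Rust, since a statically typed language without undefined has to spell out all three cases. It's a simplification rather than a full implementation of RFC 7396 (a real merge patch also recurses into nested objects), and the apply_patch function and the use of Option<String> to stand in for a JSON null are my own assumptions for the sake of illustration:

```rust
use std::collections::BTreeMap;

/// Applies a flat, non-recursive merge patch to `target`.
/// In `patch`, `None` stands in for a JSON `null` and `Some(v)` for a value;
/// a key that is absent from `patch` plays the role of `undefined`.
fn apply_patch(
    target: &mut BTreeMap<String, String>,
    patch: &BTreeMap<String, Option<String>>,
) {
    for (key, value) in patch {
        match value {
            // explicit null: remove the field from the target
            None => {
                target.remove(key);
            }
            // an actual value: add or overwrite the field
            Some(v) => {
                target.insert(key.clone(), v.clone());
            }
        }
    }
    // keys that `patch` never mentions are left untouched
}
```

Patching an object with the fields a, b, and c using a patch that sets a to null and b to a new value should remove a, overwrite b, and leave c alone.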

The API story

However, even if JSON works as described above and JavaScript has both undefined and null, a lot of server-side languages don't have this distinction. And they're also often statically typed languages that need to populate a model. This can lead to problems.

Where I work, most of the backend systems are written in C#, where we must have defined classes for JSON payloads. C# doesn't have undefined, so depending on how you configure your system, it can either mark non-existing fields as null or throw an error because it didn't get all the data it needed. When you're working with PATCH as described above, neither of those is ideal.

I don't know of any libraries (in any statically typed language) that have an elegant solution for this, but I'd be very interested in hearing about it if you do. It may not be applicable to any languages I work with, but just seeing a solution would be very interesting.


So those are my thoughts. I think there is a clear difference between null and undefined, and I support having both. In some cases, they might be interchangeable, but if nothing else, I maintain that they express intent differently, so I think it's worth it just for that.

Footnotes

[1] To be clear: JSON doesn't have an explicit undefined value, but you can leave out a field, which serves the same purpose:

    // JS object
    {
      a: null,
      b: undefined,
      c: 1
    }

    // JSON object, converted
    {
      "a": null,
      "c": 1
    }

Haskell's Maybe and Either types

Let's Read: Haskell Programming from First Principles, pt XII

Haskell logo fill by Paweł Czerwiński on Unsplash

If you don't have exceptions and you don't have null: How do you handle errors and invalid inputs? Depending on your background, this can either be trivial or mind-bending. Coming into the Haskell (actually: Elm and Rust) world from C#, JavaScript, and Python, it definitely wasn't obvious to me at first.

Luckily, the Maybe and Either types are there for you, and they are fairly easy to get started with. In this post, I want to give a quick introduction to both of these data types and how they're used.

To any returning readers: This post is based on chapter 12 of Haskell Programming from First Principles, which deals with basic error handling in Haskell, using the Maybe and Either types. The chapter doesn't actually introduce much else of interest, so I wanted to approach this entry slightly differently.

What we will (and won't) be covering

This post is intended to be a brief introduction to the Maybe and Either data types in Haskell, suitable for someone who has no previous experience with these. Some understanding of Haskell syntax would be beneficial, but is not required. The post should give the reader a basic understanding of what Maybe and Either are and how they can be used for modeling data and responses.

This post is not intended to be a thorough examination of these types. It will not discuss anything related to typeclasses, mapping, or binding. There will also not be any extensive code samples. As such, please note that there is a lot more to these data types than we will look at here, but that most of it is out of scope for this post.

For further reading, please consult the section at the end of the article.

Maybe

Maybe represents the potential absence of a value. It is used when some data can be in one of two states: defined/present, or undefined/absent. Because Maybe represents a value that may or may not be present, we also need to specify the type of the potentially contained item: Maybe String, Maybe Integer, or Maybe a, for instance.

In many ways, Maybe is similar to null in C#, JavaScript, etc., None in Python, and related concepts (like nil) in a lot of other languages. However, in these other languages, the potential lack of a value is (almost) always implicit, meaning the programmer can never be sure whether the value is there or not. In Haskell, this is always explicit. The Maybe type is the same as the Option type in languages like Rust and F#.

The data declaration looks a little something like this:

       data Maybe a = Nothing | Just a

As such, Maybe has two data constructors, Nothing and Just, where the latter takes a value to wrap.

Basic usage

While there's any number of ways you can use the Maybe type in your programs, I find the two most obvious ways to be

  1. to model data where some parts are optional
  2. to return something from a function if the input is invalid

Being able to indicate that a value may or may not be present is a simple but very powerful tool for modeling data. Imagine you've created a dating app for Haskellers (let's call it Hinder) where you want to display a list of users. The list should display only their profile picture and their name, but there's no requirement to have set a profile picture to be listed, so we'll need to account for that. A simplistic way to model that would be by using a list of records like this:

      data User = User
        { name :: String
        , picture :: Maybe String
        }

Here, the picture value is Nothing if there is no profile picture set. If they have a profile picture, it's Just <picture id>.

The other way to use Maybe is as the return value of a function. As much as possible, you should use the type system to constrain the set of allowed arguments to a function, but that'll only get you so far. The built-in div function throws an exception if you pass 0 as the second argument, so let's fix that by using Maybe. If the divisor is 0, return Nothing. Otherwise, return Just <result>.

      safeDiv :: Integral a => a -> a -> Maybe a
      safeDiv _ 0 = Nothing
      safeDiv x y = Just $ x `div` y

Either

The Either type is similar to the Maybe type, but comes with some added functionality. It's also used to model things that can fail, but it provides a way to report what went wrong, not just that something failed.

Either represents a piece of data that can be one of two things. That is, it can be either 'this' or 'that', whatever 'this' and 'that' may be. Either is parameterized by two types (so 'this' and 'that' can be of different types), meaning you'll usually see Either a b or something like Either String Int.

Where Maybe has a very easy analog in the concept of a null value, Either doesn't enjoy the same luxury. However, it is conceptually equivalent to the Result type in Rust and F#, even if the order of the type arguments is flipped.

The data declaration (simplified) looks like this:

     data Either a b = Left a | Right b

Either has two data constructors, Left and Right, each of which takes a single value. By convention, the Left value is used for when something goes wrong, and the Right value for when something goes, well, right.

Basic usage

Either is usually used for error handling and validation, so it's ideal as a return type of functions that can 'fail' or that are otherwise not able to create a valid result based on the arguments given.

Let's return to our safeDiv function from before. This time, however, we'll return an error message instead of Nothing:

     safeDiv :: Integral a => a -> a -> Either String a
     safeDiv _ 0 = Left "You cannot divide by zero."
     safeDiv x y = Right $ x `div` y

The error may be obvious in this case, but for more complex functions (or chains of them), it gets ever more useful.
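Since Rust's Result was mentioned as the closest relative of Either, here's roughly what the same function could look like over there. This is only a sketch for comparison: the name safe_div and the choice of i64 are mine. Note the flipped order of the type arguments: Result<T, E> puts the success type first, while Either puts the error (Left) first.

```rust
/// Rust counterpart of the `Either`-based `safeDiv`.
/// `Err` plays the role of `Left`, and `Ok` the role of `Right`.
fn safe_div(x: i64, y: i64) -> Result<i64, String> {
    if y == 0 {
        Err(String::from("You cannot divide by zero."))
    } else {
        // Note: `/` truncates toward zero, whereas Haskell's `div` rounds
        // down, so the two only agree when both operands are non-negative.
        Ok(x / y)
    }
}
```

Calling safe_div(10, 2) gives Ok(5), while safe_div(1, 0) gives the Err carrying the message.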

For a slightly more advanced example, imagine you're creating a Person record. In this case, a Person has a name of type String and an age of type Integer. But with these constraints, we can easily pass in invalid values, so we'll have to validate that the name isn't an empty string and that the age is non-negative:

     data PersonError = InvalidName | InvalidAge

     data Person = Person
       { name :: String
       , age :: Integer
       }

     makePerson :: String -> Integer -> Either PersonError Person
     makePerson name age = undefined -- exercise for the reader ;)

The actual implementation of the validation isn't important in this case, so I've left that as an exercise for the reader. Oh, and what if there are multiple errors? Well, we can deal with that too, but that's not for now.

Further reading

If you have found the article useful and would like to learn more about Maybe, Either, and error handling, here's a little list of useful resources:

The Haskell Wiki
The Haskell wiki is usually one of the first places I check when I want more information on anything Haskell. They have an article on Maybe as well as one on error handling, which has sections on both Maybe and Either.
Learn You a Haskell's chapter A Fistful of Monads and the section on Maybe
Learn You a Haskell is a pretty good introduction to Haskell that's available in its entirety from the website. The content in the Fistful of Monads chapter is a bit more advanced and might require reading some of the previous chapters, but both the chapter and the book could be well worth a look.
The School of Haskell and the chapter on error handling
While I've not used The School of Haskell as a resource myself, I did find the error handling chapter to be well written. Oh, and it's written by Bartosz Milewski so you can expect some interesting insights.

On Generics and Associated Types

A Rust language feature adventure!

Compared to other languages I've learned, Rust has a fair few concepts that can be a bit tricky to get your head around. Borrowing, ownership, and the borrow-checker are common enough to have spawned a range of memes on their own, and I've personally spent hours on lifetime issues only to give up and rewrite something using cloning.

Associated types, though not something that'll have you banging your head on your desk for hours, are something that took me quite a few tries to finally understand, or at least think I understand.

What really never stuck was how they were different from generics, and why you'd need (or even want) associated types. So what's a good way to learn and internalize a topic like this? Well, write something down and release it to the internet, of course! They'll let you know if you're wrong.[1]

tl;dr:

The quick and dirty answer to when to use generics and when to use associated types is: Use generics if it makes sense to have multiple implementations of a trait for a specific type (such as the From<T> trait). Otherwise, use associated types (like Iterator and Deref).

Goals and constraints of this post

This post is intended to demonstrate and explain the differences and similarities between associated types and generic types. It will deal specifically with traits, as this is the only place where associated types come into play.

Furthermore, even if we're talking about associated types, we will not venture into the dark forest of generic associated types. However, if you're curious about this and haven't kept up to date, I encourage you to look through the RFC.

If, after reading this post, you're still not sure what I'm on about, check out the Advanced Traits chapter of the Book, and specifically the section on associated types.

Finally, this post assumes you have some familiarity with programming (and with Rust specifically), and with some forms of generic programming. For a primer on generic types in Rust, check out Chapter 10.1, Generic Data Types, of the Book.

Definitions

To make sure we're all on the same page, let's have some quick definitions to start us off, shall we?

Generic types

In the context of traits, generic types, also known as type parameters, are a way of deferring the specific types that your trait uses until the trait is implemented. Generic types can be completely open, such that any type would work, or they can be constrained to types that implement some trait.

Take, for instance, Rust's std::convert::From<T> trait. The T in the type signature says that this trait is generic over any type T[2], meaning you can implement a conversion from any type T into the type you're implementing the trait for.

Constrained (or bounded) generic types are more often seen in generic functions than in generic traits, but what they do is allow the author of trait X to say that 'only types which implement some other trait Y can be used for this trait'. We'll see examples of this later, so don't worry if it's a bit fuzzy right now.

Associated types

Associated types are, as the name implies, types that are associated with a trait. When you define the trait, the type is still unspecified. Much like with generics, you can put constraints on the type if you want to, or you can choose not to.

One of the most prominent examples of a trait with associated types is the Iterator trait. The Iterator trait has an associated type Item and a function next. The next function returns an Option<Self::Item>. You could have done the same thing with generic types, but, as we'll see later, using associated types offers some benefits in certain situations.
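To make that a bit more tangible, here's a minimal sketch of an Iterator implementation. The Countdown type is made up for this example. The thing to notice is that Item is picked exactly once, inside the impl block:

```rust
/// Hypothetical iterator that counts down from `remaining` to 1.
struct Countdown {
    remaining: u32,
}

impl Iterator for Countdown {
    // The associated type is fixed once, here in the implementation.
    type Item = u32;

    fn next(&mut self) -> Option<Self::Item> {
        if self.remaining == 0 {
            None
        } else {
            let current = self.remaining;
            self.remaining -= 1;
            Some(current)
        }
    }
}
```

Collecting Countdown { remaining: 3 } into a Vec<u32> yields 3, 2, and 1, in that order.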

Syntax

Before we go any further, let's just quickly review the syntax for these concepts. If for no other reason, then just to make everything a bit less abstract. We'll define two traits, Generic and Associated, which use generics and associated types respectively, and we'll look at using bounds and default types as well.

Basic traits

A type-parameterized trait can look a little something like this:

    trait Generic<T> {
        fn get(&self) -> T;
    }

A similar trait with an associated type looks like this:

    trait Associated {
        type T;
        fn get(&self) -> Self::T;
    }

Note how the type gets moved from the type signature and into the trait definition itself, and how, when referencing it later, we need to use Self::T, instead of just T.
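The same shift shows up on the implementing side. Here's a small sketch with both traits implemented for a made-up Answer type (the type and its contents are only there for illustration):

```rust
trait Generic<T> {
    fn get(&self) -> T;
}

trait Associated {
    type T;
    fn get(&self) -> Self::T;
}

/// Hypothetical type used only for this illustration.
struct Answer(u8);

// With generics, the type is chosen in the `impl` header...
impl Generic<u8> for Answer {
    fn get(&self) -> u8 {
        self.0
    }
}

// ...while the associated type is chosen inside the `impl` body.
impl Associated for Answer {
    type T = String;

    fn get(&self) -> Self::T {
        self.0.to_string()
    }
}
```

Because both traits happen to define a method called get, a call site has to say which one it means, for example <Answer as Generic<u8>>::get(&answer) or <Answer as Associated>::get(&answer).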

With constraints

If we want to set constraints on the associated type or type parameter, we use the same syntax as Rust uses everywhere else for bounds: the : operator.

For instance, say we want to constrain our types to only types that implement the core::fmt::Display trait:

     use core::fmt::Display;

     trait Generic<T: Display> {
         fn get(&self) -> T;
     }

     // or using the `where` keyword
     trait Generic<T>
     where
         T: Display,
     {
         fn get(&self) -> T;
     }

With associated types, the syntax is much the same:

     trait Associated {
         type T: Display;
         fn get(&self) -> Self::T;
     }

With default types

Rust has a cool feature for generic types where you can set the default type, which will be assumed if no type is specified. This can be useful if, for most use cases, you want to use a specific type, but want to be able to override it sometimes. See the section on default generic types in the Book for more information.

They look like this:

     // basic trait, no constraint
     trait Generic<T = String> {
         // ...
     }

     // with constraint
     trait Generic<T: Display = String> {
         // ...
     }

     // or using the `where` clause
     trait Generic<T = String>
     where
         T: Display,
     {
         // ...
     }

From what I tried, it seems you cannot put the default type (= String) in the where clause of the trait.
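To show the default in action, here's a small sketch. The Name type is hypothetical, and I've given the trait a get method like in the earlier examples. Leaving out the type argument in the impl header picks the default, but you can still override it:

```rust
trait Generic<T = String> {
    fn get(&self) -> T;
}

/// Hypothetical type used only for this illustration.
struct Name(&'static str);

// No type argument is given, so this implements `Generic<String>`.
impl Generic for Name {
    fn get(&self) -> String {
        self.0.to_string()
    }
}

// The default can still be overridden explicitly.
impl Generic<usize> for Name {
    fn get(&self) -> usize {
        self.0.len()
    }
}
```

With both implementations in place, <Name as Generic>::get(&name) returns the String, while <Name as Generic<usize>>::get(&name) returns the length.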

For associated types, there is no such thing as default types on stable Rust today. However, if you're on nightly, you can use the #![feature(associated_type_defaults)] flag, which enables this. Judging from the GitHub tracking issue, this is being worked on but I've not seen anything about stabilization of this yet.

But let's see what it'd look like:

     #![feature(associated_type_defaults)]

     // simple
     trait Associated {
         type T = String;
         // ...
     }

     // with constraint
     trait Associated {
         type T: Display = String;
         // ...
     }

It's pretty neat and pretty similar. You could even do something like this:

     trait Associated {
         type T: Display = String;
         type U = Self::T;
         // ...
     }

Don't know when it'd be useful, but it's nifty 🤷

Commonalities

Now that we've covered what they are and what the syntax looks like, let's continue by looking at what they have in common.

The most important thing is that both generics and associated types allow you to defer the decision of what type to use for the trait implementation. Even if the notation is a bit different, anywhere you have an associated type, you can replace it with a generic type instead (though the opposite does not hold true). As the RFC puts it: "associated types do not increase the expressiveness of traits per se, because you can always use extra type parameters to a trait instead". However, they do offer other benefits.

The fact that you can always use generics instead of associated types is why it took me so long to understand what associated types exist for. As we move into the next section, we'll examine what makes them different, and why you'd want to choose one over the other.

Differences

As we've seen, generics and associated types cover a lot of the same use cases, but there are some reasons why you might choose one over the other.

Generics allow you to implement the same trait numerous times for the same type by changing the type parameter. The From<T> trait was mentioned earlier, and it's a great example of this. Because From<T> uses a generic type, we can implement it for any number of type parameters.

For instance, say you have a type MyNumeric. You can implement From<u8>, From<u16>, From<u32>, and so on. This makes generics very useful if it makes sense to have multiple trait implementations varying only in type parameters.
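As a sketch, that could look something like the following. MyNumeric here is just a made-up wrapper around a u64:

```rust
/// Hypothetical numeric wrapper used only for this illustration.
#[derive(Debug, PartialEq)]
struct MyNumeric(u64);

// The same trait, implemented three times for the same type.
// Only the type parameter differs between the implementations.
impl From<u8> for MyNumeric {
    fn from(n: u8) -> Self {
        MyNumeric(u64::from(n))
    }
}

impl From<u16> for MyNumeric {
    fn from(n: u16) -> Self {
        MyNumeric(u64::from(n))
    }
}

impl From<u32> for MyNumeric {
    fn from(n: u32) -> Self {
        MyNumeric(u64::from(n))
    }
}
```

The compiler picks the right implementation based on the argument type, so MyNumeric::from(7u8) and MyNumeric::from(300u16) both work, as does 70_000u32.into() when the target type is known to be MyNumeric.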

Associated types, on the other hand, only allow a single implementation. Because a trait without type parameters can only be implemented once for any given type, this can be used to constrain the number of implementations.

The Deref trait comes to mind. Deref has an associated type Target, which it can be dereferenced to. It would get really confusing if a type could implement Deref into an arbitrary set of other types (and probably very tricky for type inference!).
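Here's a minimal sketch of what implementing Deref looks like. The Username newtype is hypothetical and purely there to show the syntax. In real code, Deref is mostly meant for smart-pointer-like types:

```rust
use std::ops::Deref;

/// Hypothetical newtype around a `String`.
struct Username(String);

impl Deref for Username {
    // There can only be one `Target` per implementing type.
    type Target = String;

    fn deref(&self) -> &Self::Target {
        &self.0
    }
}
```

With this in place, method calls like user.len() resolve through Deref to the underlying String, and there is never any question about which target type is meant.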

Because a trait without type parameters can only be implemented once per type, associated types also offer some notational benefits. Using associated types means you don't have to add type annotations for all the extra types. This is touted as an engineering benefit in the RFC.
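A rough sketch of what that means in practice, reusing the Generic and Associated traits from the syntax section. The fetch_generic and fetch_associated functions and the Answer type are made up for this example:

```rust
trait Generic<T> {
    fn get(&self) -> T;
}

trait Associated {
    type T;
    fn get(&self) -> Self::T;
}

// With a type parameter on the trait, every generic caller has to
// introduce and name that extra type as well.
fn fetch_generic<G, T>(source: &G) -> T
where
    G: Generic<T>,
{
    source.get()
}

// With an associated type, the output type follows from `A` alone.
fn fetch_associated<A: Associated>(source: &A) -> A::T {
    source.get()
}

/// Hypothetical type used only for this illustration.
struct Answer(u8);

impl Generic<u8> for Answer {
    fn get(&self) -> u8 {
        self.0
    }
}

impl Generic<String> for Answer {
    fn get(&self) -> String {
        self.0.to_string()
    }
}

impl Associated for Answer {
    type T = u8;

    fn get(&self) -> Self::T {
        self.0
    }
}
```

Since Answer implements Generic twice, a call to fetch_generic(&answer) needs help from an annotation such as let n: u8 = fetch_generic(&answer);. A call to fetch_associated(&answer) needs no annotation at all, because there is only one possible output type.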

Summary and further reading

In short, use generics when you want a type A to be able to implement a trait any number of times for different type parameters, such as in the case of the From<T> trait.

Use associated types if it makes sense for a type to only implement the trait once, such as with Iterator and Deref.

If you want to know more about associated types and what problems they solve, I recommend starting with the RFC that introduced them and the section on associated types in the Book. The section on the Add trait, which uses both generics (with defaults) and associated types, is also well worth a read. As an alternative resource, this Stack Overflow response also contains a well-worded explanation, with examples, of when you'd use one over the other.

Footnotes

[1] This is mostly said in jest. I haven't actually gotten very many corrections on statements I've made at all; and the ones I have had, have always been very nice. I like you, reader. You're my friend.

[2] Well, actually, it says that this trait is generic over any type T that implements the Sized trait. We'll gloss over this in this article, but for more information, check out this section from chapter 19.4, Advanced Functions and Closures, of the Book.