Polymorphism and Structs in Elixir

Polymorphism has always relied on tags. Elixir’s structs make them safer.

In the earliest days of computing, language designers wanted polymorphism. Take addition: what should + be able to combine? Some languages restricted it to integers and floats, while others extended it to lists and strings. But whichever choice they made, the machine still needed a way to interpret what the raw bits of memory represented.

The solution was tags. Each value carried a small piece of metadata declaring its type. Polymorphic functions used that tag to perform the operation when it made sense, or to raise an error when it didn’t.

JavaScript

Let’s look at today’s most ubiquitous language, JavaScript. Early on it included typeof, which reports a value’s tag as undefined, boolean, number, string, object, or function (later extended to symbol and bigint). Over time this proved too coarse, so the language added the hidden [[Class]] tag to distinguish arrays, dates, and regular expressions. Later, JavaScript introduced symbols as another tagging mechanism, supporting iteration and custom types.

Here’s a list of numbers:

const list = [10, 20, 30]

Originally we needed to manage the loop with an index. Think of how many times you’ve written this classic three-part shortcut:

for (var i = 0; i < list.length; i++) {
  console.log(list[i]) // 10 20 30
}

Now, because our array is identified as an iterable type, we can use for...of:

for (const n of list) {
  console.log(n) // 10 20 30
}

JavaScript has other iterable types, such as strings:

for (const ch of "hi") {
  console.log(ch) // "h" "i"
}

This is polymorphism: the same loop discriminates between array and string inputs, yielding elements for one and characters for the other.

It is also common in JavaScript for libraries to use tagging.

Redux, for example, uses a type tag in its actions:

const addTodo = {
  type: "ADD_TODO",
  payload: { text: "Buy milk" }
}

And the reducer is a polymorphic function applied to action types, using a switch/case to dispatch based on the tag:

function todos(state = [], action) {
  switch (action.type) {
    case "ADD_TODO":
      return [...state, action.payload.text]
    default:
      return state
  }
}

While JavaScript supports custom tagging, much of modern JS relies on passing JSON objects, which have no easy way to embed those custom tags. Instead, most libraries that use tagging fall back to adding a type field, and many modern libraries mark that field as metadata by prefixing its name with underscores.

For example, GraphQL results include a __typename field.

When searching for The Hobbit, we might get a book:

{
  "data": {
    "node": {
      "__typename": "Book",
      "name": "The Hobbit"
    }
  }
}

Or a DVD:

{
  "data": {
    "node": {
      "__typename": "DVD",
      "name": "The Hobbit"
    }
  }
}

Again, with the tagged type, we can distinguish what we are dealing with, enabling polymorphic behavior, such as calculating shipping by media type.

Erlang

Every Erlang value carries a tag that identifies its type. The types include integer, float, atom, tuple, list, binary, map, function, pid, port, and reference.

As in other languages, the + operator is polymorphic. In Erlang, it’s limited to integers and floats. It does more than guard against bad input: it returns different results based on the input types.

If we add two integers we get an integer:

1 + 2.   % 3

And if at least one number is a float, we get back a float:

1 + 2.0. % 3.0

(Note: the trailing . ends the expression in the Erlang shell; it is not part of the number.)

Non-numeric operands raise an exception:

"hi" + "there".
% ** exception error: bad argument in an arithmetic expression

Erlang developers also use the tagged tuple convention to identify types. The most common is the ok/error pattern. Instead of throwing an exception, a function can return either success or failure:

{ok, Result}.
{error, Reason}.

Here, the first element of the tuple is an atom that acts as the tag: ok for success, error for failure.

Like our Redux example, we can use case logic to implement polymorphism:

case file:read_file("data.txt") of
  {ok, _Contents} ->
    io:format("Read file successfully~n");
  {error, Reason} ->
    io:format("Failed: ~p~n", [Reason])
end.

Erlang also supports function head pattern matching, letting us handle polymorphism with separate function clauses:

handle({ok, _Contents}) ->
    io:format("Read file successfully~n");
handle({error, Reason}) ->
    io:format("Failed: ~p~n", [Reason]).

Warning

Tagged tuples are not globally unique. Two different libraries or teams can choose the same tag for different meanings. The runtime has no way to detect this tag collision. When it happens, polymorphic dispatch will break.

Elixir

Elixir runs on the Erlang VM and shares Erlang’s data types, but with a different syntax.

The + operator behaves the same way: it is polymorphic within numbers.

1 + 2    # 3
1 + 2.0  # 3.0

Non-numeric values still raise an exception:

"hi" + "there"
# ** (ArithmeticError) bad argument in arithmetic expression

It is common in Elixir to see Erlang’s tagged-tuple pattern:

{:ok, result}
{:error, reason}

We can implement polymorphism by branching with case:

case File.read("data.txt") do
  {:ok, _contents} ->
    IO.puts("Read file successfully")
  {:error, reason} ->
    IO.puts("Failed: #{reason}")
end

Or by using pattern matching in the function heads:

def handle({:ok, _contents}) do
  IO.puts("Read file successfully")
end

def handle({:error, reason}) do
  IO.puts("Failed: #{reason}")
end
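
Tagged tuples in Elixir inherit the collision risk from the Erlang warning above. As a minimal sketch, assume two hypothetical modules, Billing and Support (both invented for illustration, not taken from any library), that each pick :ticket as their tag. A third hypothetical module, Report, written with only Billing in mind, cannot tell them apart:

```elixir
# Billing, Support, and Report are hypothetical modules for illustration only.
defmodule Billing do
  # Here :ticket means a paid admission; the payload is a price in cents.
  def issue(price_cents), do: {:ticket, price_cents}
end

defmodule Support do
  # Here :ticket means a help-desk request; the payload is a message.
  def open(message), do: {:ticket, message}
end

defmodule Report do
  # Written for Billing tickets, but the pattern matches any {:ticket, _} tuple.
  def describe({:ticket, price_cents}), do: "Paid #{price_cents} cents"
end

IO.puts(Report.describe(Billing.issue(2500)))
# Paid 2500 cents

IO.puts(Report.describe(Support.open("Ride is stuck")))
# Paid Ride is stuck cents
```

Nothing raises here. The support ticket is quietly handled by the billing clause, which is what makes a tag collision hard to spot.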

Elixir has a solution to the tag-collision problem: the struct. A struct is a map with a __struct__ key that Elixir automatically fills with the module name. The VM can hold only one module with a given name at a time, which makes this tag globally unique within a running system.

In Elixir we use the defstruct syntax to define a struct:

defmodule Ride do
  defstruct [:name, :status]
end

defmodule Patron do
  defstruct [:name, :status]
end

Because the struct takes its name from the module, each module can define only one struct.

Elixir also has a syntax to construct struct values:

ride = %Ride{name: "Whirlwind", status: :offline}
patron = %Patron{name: "Alice", status: :vip}
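
To see the tag itself, the following sketch inspects a struct as the map it is. It repeats the Ride and Patron definitions so that it runs on its own, and it assumes Elixir 1.11 or later for is_struct/2.

```elixir
defmodule Ride do
  defstruct [:name, :status]
end

defmodule Patron do
  defstruct [:name, :status]
end

ride = %Ride{name: "Whirlwind", status: :offline}
patron = %Patron{name: "Alice", status: :vip}

# A struct is still a map.
IO.inspect(is_map(ride))
# true

# The tag lives under the :__struct__ key and holds the module name.
IO.inspect(ride.__struct__)
# Ride
IO.inspect(patron.__struct__)
# Patron

# Same fields, different tags.
IO.inspect(is_struct(ride, Ride))
# true
IO.inspect(is_struct(patron, Ride))
# false

# Removing the tag leaves a plain map with the declared fields.
IO.inspect(Map.from_struct(ride))
# %{name: "Whirlwind", status: :offline}
```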

And it has a syntax for pattern matching structs in the function heads:

def greet(%Ride{name: name}) do
  IO.puts("Welcome aboard the #{name}")
end

def greet(%Patron{name: name}) do
  IO.puts("Hello #{name}")
end

With this, we get polymorphism, where the struct type of the input selects which clause runs:

greet(ride)   # Welcome aboard the Whirlwind
greet(patron) # Hello Alice
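
Since Ride and Patron declare identical fields, the dispatch above hinges on the tag alone. The sketch below makes that visible. It wraps the two greet clauses in a hypothetical Greeter module (the name is an assumption; the snippets above do not say where greet lives), returns strings instead of printing, and then passes in a map that has the right keys but no tag.

```elixir
defmodule Ride do
  defstruct [:name, :status]
end

defmodule Patron do
  defstruct [:name, :status]
end

# Greeter is a hypothetical home for the greet/1 clauses.
defmodule Greeter do
  def greet(%Ride{name: name}), do: "Welcome aboard the #{name}"
  def greet(%Patron{name: name}), do: "Hello #{name}"
end

IO.puts(Greeter.greet(%Ride{name: "Whirlwind", status: :offline}))
# Welcome aboard the Whirlwind

IO.puts(Greeter.greet(%Patron{name: "Alice", status: :vip}))
# Hello Alice

# Strip the tag from a Patron: same keys, but now just a plain map.
untagged = Map.from_struct(%Patron{name: "Bob", status: :vip})

# No clause matches an untagged map, so the call raises.
try do
  Greeter.greet(untagged)
rescue
  e in FunctionClauseError -> IO.puts("No match: " <> Exception.message(e))
end
```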

Structs also support defaults. Here we redefine Ride with a keyword list of fields and default values:

defmodule Ride do
  defstruct name: "Unnamed", status: :offline
end

Now, when we create a ride but leave out status:

ride = %Ride{name: "Whirlwind"}
# %Ride{name: "Whirlwind", status: :offline}

Elixir fills in the default :offline value.
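
As a related sketch, fields declared as bare atoms default to nil, and the two forms can be mixed as long as the keyword pairs come last. The Snack module below is hypothetical and exists only to show this.

```elixir
# Snack is a hypothetical struct, used only to illustrate defaults.
defmodule Snack do
  # :name has no explicit default, so it falls back to nil.
  defstruct [:name, price_cents: 500, status: :available]
end

IO.inspect(%Snack{})
# %Snack{name: nil, price_cents: 500, status: :available}

IO.inspect(%Snack{name: "Popcorn", price_cents: 350})
# %Snack{name: "Popcorn", price_cents: 350, status: :available}
```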

And structs also protect their shape. We can use struct/2 to update our ride to be online:

ride = struct(ride, %{status: :online})
# %Ride{name: "Whirlwind", status: :online}

But if we try to update a key that has not been declared:

ride = struct(ride, %{color: :red})
# %Ride{name: "Whirlwind", status: :online}

struct/2 silently discards the undeclared key, so the ride keeps its shape.
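
When a misspelled field name should fail loudly rather than be dropped, struct!/2 is the stricter sibling of struct/2. The sketch below compares the two. It repeats the Ride definition with defaults so that it runs on its own.

```elixir
defmodule Ride do
  defstruct name: "Unnamed", status: :offline
end

ride = %Ride{name: "Whirlwind"}

# struct/2 discards keys the struct does not declare.
IO.inspect(struct(ride, %{color: :red}))
# %Ride{name: "Whirlwind", status: :offline}

# struct!/2 raises a KeyError instead of discarding the key.
try do
  struct!(ride, %{color: :red})
rescue
  e in KeyError -> IO.puts("struct!/2 raised: " <> Exception.message(e))
end

# Declared keys are updated by both functions.
IO.inspect(struct!(ride, %{status: :online}))
# %Ride{name: "Whirlwind", status: :online}
```

The literal syntax is stricter still: writing %Ride{color: :red} is rejected when the code is compiled.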

In a nutshell, structs behave like maps but with defaults, enforced shape, and a globally unique tag.

You might also be interested in my earlier post on polymorphism through protocols: Polymorphism in Elixir.

This topic is explored in more depth in my book, Advanced Functional Programming with Elixir, available now in beta from The Pragmatic Bookshelf.