Polymorphism and Structs in Elixir
Polymorphism has always relied on tags. Elixir’s structs make them safer.
In the earliest days of computing, language designers wanted polymorphism. Take addition: what should + be able to combine? Some languages restricted it to integers and floats, while others extended it to lists and strings. But whichever choice they made, the machine still needed a way to tell what those raw bits of memory represented.
The solution was tags. Each value carried a small piece of meta information declaring what type it represented. Polymorphic functions used that information, performing the operation when it was logically possible or raising an error when it was not.
JavaScript
Let’s look at today’s most ubiquitous language, JavaScript. Early on it gained the typeof operator, which reports a value’s tag as number, string, boolean, undefined, object, or function (later extended with symbol and bigint). Over time this proved too coarse, so the language added the hidden [[Class]] tag to distinguish arrays, dates, and regular expressions. Later, JavaScript introduced symbols as another tagging mechanism, supporting iteration and custom types.
Here’s a list of numbers:
const list = [10, 20, 30]
Originally we needed to manage the loop with an index. Think of how many times you’ve written this classic three-part shortcut:
for (var i = 0; i < list.length; i++) {
  console.log(list[i]) // 10 20 30
}
Now, because arrays are iterable, we can use for...of:
for (const n of list) {
  console.log(n) // 10 20 30
}
JavaScript has other iterable types, such as strings:
for (const ch of "hi") {
  console.log(ch) // "h" "i"
}
This is polymorphism: the same for...of loop steps through elements when given an array and through characters when given a string.
It is also common in JavaScript for libraries to use tagging.
Redux, for example, uses a type tag in its actions:
const addTodo = {
  type: "ADD_TODO",
  payload: { text: "Buy milk" }
}
And the reducer is a polymorphic function applied to action types, using a switch/case to dispatch based on the tag:
function todos(state = [], action) {
  switch (action.type) {
    case "ADD_TODO":
      return [...state, action.payload.text]
    default:
      return state
  }
}
While JavaScript supports custom tagging, much of modern JS relies on passing JSON objects, which have no easy way to embed those custom tags. Instead, most libraries that use tagging still fall back to adding a type field, and some mark the tag as metadata with an underscore prefix.
For example, GraphQL results can include a __typename field.
When searching for The Hobbit, we might get a book:
{
  "data": {
    "node": {
      "__typename": "Book",
      "name": "The Hobbit"
    }
  }
}
Or a DVD:
{
  "data": {
    "node": {
      "__typename": "DVD",
      "name": "The Hobbit"
    }
  }
}
Again, with the tagged type, we can distinguish what we are dealing with, enabling polymorphic behavior, such as calculating shipping costs by media type.
Erlang
Every Erlang term carries a tag identifying its type: integer, float, atom, tuple, list, binary, map, function, pid, port, or reference.
And like other languages, the + operator is polymorphic. In Erlang, it’s limited to integers and floats. It doesn’t just guard against bad input; it also returns different results based on the input types.
If we add two integers we get an integer:
1 + 2. % 3
And if at least one number is a float, we get back a float:
1 + 2.0. % 3.0
(Note: the trailing period is Erlang’s expression terminator, not part of the number.)
Non-numeric operands raise an exception:
"hi" + "there".
% ** exception error: an error occurred when evaluating an arithmetic expression
Erlang developers also use the tagged tuple convention to identify types. The most common is the ok/error pattern. Instead of throwing an exception, a function can return either success or failure:
{ok, Result}.
{error, Reason}.
Here, the first position of the tuple holds an atom that acts as the tag: ok for success, error for failure.
Like our Redux example, we can use case logic to implement polymorphism:
case file:read_file("data.txt") of
    {ok, _Contents} ->
        io:format("Read file successfully~n");
    {error, Reason} ->
        io:format("Failed: ~p~n", [Reason])
end.
Erlang also supports function head pattern matching, letting us handle polymorphism with separate function clauses:
handle({ok, _Contents}) ->
    io:format("Read file successfully~n");
handle({error, Reason}) ->
    io:format("Failed: ~p~n", [Reason]).
Warning
Tagged tuples are not globally unique. Two different libraries or teams can choose the same tag for different meanings. The runtime has no way to detect this tag collision. When it happens, polymorphic dispatch will break.
Elixir
Elixir runs on the Erlang VM and shares its data types, but with a different syntax.
The + operator works the same way and is polymorphic within numbers:
1 + 2 # 3
1 + 2.0 # 3.0
Non-numeric values still raise an exception:
"hi" + "there"
# ** (ArithmeticError) bad argument in arithmetic expression
It is common in Elixir to see Erlang’s tagged-tuple pattern:
{:ok, result}
{:error, reason}
With it, we can implement polymorphism by branching with case:
case File.read("data.txt") do
  {:ok, _contents} ->
    IO.puts("Read file successfully")

  {:error, reason} ->
    IO.puts("Failed: #{reason}")
end
Or by using pattern matching in the function heads:
def handle({:ok, _contents}) do
  IO.puts("Read file successfully")
end

def handle({:error, reason}) do
  IO.puts("Failed: #{reason}")
end
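The tag collision described in the Erlang warning applies to Elixir's tagged tuples too. Here is a minimal sketch of how it can go wrong, assuming two hypothetical libraries, Cartesian and Polar, that both happen to pick :point as their tag. All module and function names are invented for illustration.

```elixir
defmodule Cartesian do
  # Hypothetical library A: {:point, x, y}
  def new(x, y), do: {:point, x, y}
end

defmodule Polar do
  # Hypothetical library B: {:point, radius, angle_in_radians}
  def new(radius, angle), do: {:point, radius, angle}
end

defmodule Geometry do
  # Written with Cartesian points in mind, but the pattern only sees the
  # :point tag and the tuple size, so Polar points match as well.
  def distance_from_origin({:point, x, y}), do: :math.sqrt(x * x + y * y)
end

IO.inspect(Geometry.distance_from_origin(Cartesian.new(3, 4)))
# 5.0, as intended

IO.inspect(Geometry.distance_from_origin(Polar.new(1, 3)))
# No error is raised, but the result is not the true distance of 1
```

Nothing fails at compile time or at run time; the second call simply takes the wrong branch and returns a wrong number.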
Elixir has a solution to the global-tag problem: the struct. A struct is a map with a __struct__ key that Elixir automatically fills with the module name. Because the VM can load only one module under a given name, the module name works as a globally unique tag.
In Elixir we use the defstruct syntax to define a struct:
defmodule Ride do
  defstruct [:name, :status]
end

defmodule Patron do
  defstruct [:name, :status]
end
Note that because the struct uses the module name, we can have only one struct per module.
Elixir also has a syntax to construct struct values:
ride = %Ride{name: "Whirlwind", status: :offline}
patron = %Patron{name: "Alice", status: :vip}
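To see the tag for ourselves, we can look inside a struct value. The sketch below assumes a helper module named TagDemo, which is not part of the original examples, and wraps the checks in a function so the file also runs as a script. It relies on is_struct/2, which needs Elixir 1.11 or later.

```elixir
defmodule Ride do
  defstruct [:name, :status]
end

defmodule TagDemo do
  # Collects a few facts about a Ride value so they can be inspected together.
  def inspect_tag do
    ride = %Ride{name: "Whirlwind", status: :offline}

    %{
      # The module name stored under the :__struct__ key
      tag: ride.__struct__,
      # Structs are maps underneath
      is_map: is_map(ride),
      # Checks the tag, not just the fields
      is_ride: is_struct(ride, Ride),
      # The same fields with the tag removed
      plain: Map.from_struct(ride)
    }
  end
end

IO.inspect(TagDemo.inspect_tag())
```

The tag is an ordinary map entry, which is why the usual map functions still work on structs.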
And it has a syntax for pattern matching structs in the function heads:
def greet(%Ride{name: name}) do
  IO.puts("Welcome aboard the #{name}")
end

def greet(%Patron{name: name}) do
  IO.puts("Hello #{name}")
end
With this, we get polymorphism, where the struct type of the argument selects which clause runs:
greet(ride) # Welcome aboard the Whirlwind
greet(patron) # Hello Alice
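Ride and Patron declare exactly the same fields, so the module tag is the only thing telling them apart. As a rough sketch, the Greeter module below (a hypothetical name) returns strings instead of printing them, and adds a catch-all clause that the original example does not have, to show what happens to a plain map with the same keys.

```elixir
defmodule Ride do
  defstruct [:name, :status]
end

defmodule Patron do
  defstruct [:name, :status]
end

defmodule Greeter do
  def greet(%Ride{name: name}), do: "Welcome aboard the #{name}"
  def greet(%Patron{name: name}), do: "Hello #{name}"
  # Catch-all clause, added for illustration only
  def greet(_other), do: "Unknown visitor"

  # Same field values each time; only the tag (or its absence) differs.
  def demo do
    [
      greet(%Ride{name: "Whirlwind", status: :offline}),
      greet(%Patron{name: "Whirlwind", status: :offline}),
      greet(%{name: "Whirlwind", status: :offline})
    ]
  end
end

IO.inspect(Greeter.demo())
# ["Welcome aboard the Whirlwind", "Hello Whirlwind", "Unknown visitor"]
```

The plain map has no __struct__ key, so it matches neither struct pattern and falls through to the last clause.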
Also, structs support defaults:
defmodule Ride do
  defstruct name: "Unnamed", status: :offline
end
Now, when we create a ride but leave out status:
ride = %Ride{name: "Whirlwind"}
# %Ride{name: "Whirlwind", status: :offline}
Elixir fills in the default :offline value.
And structs also protect their shape. We can use struct/2 to update our ride to be online:
ride = struct(ride, %{status: :online})
# %Ride{name: "Whirlwind", status: :online}
But if we try to update a key that has not been declared:
ride = struct(ride, %{color: :red})
# %Ride{name: "Whirlwind", status: :online}
Our Ride struct ignores the change, protecting its shape.
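Silently dropping the key is only one of the options. If we would rather hear about the mistake, struct!/2 and the map update syntax both raise a KeyError for an undeclared key. The sketch below compares the three; ShapeDemo is a hypothetical helper module, and the rescue clauses exist only to turn the errors into values we can inspect.

```elixir
defmodule Ride do
  defstruct name: "Unnamed", status: :offline
end

defmodule ShapeDemo do
  # struct/2 silently drops keys the struct does not declare.
  def lenient(ride), do: struct(ride, %{color: :red})

  # struct!/2 raises KeyError for the same undeclared key.
  def strict(ride) do
    struct!(ride, %{color: :red})
  rescue
    error in KeyError -> {:error, error.key}
  end

  # The map update syntax also raises KeyError, because it only
  # replaces keys that already exist.
  def update_syntax(ride) do
    %{ride | color: :red}
  rescue
    error in KeyError -> {:error, error.key}
  end

  def demo do
    ride = %Ride{name: "Whirlwind"}
    {lenient(ride) == ride, strict(ride), update_syntax(ride)}
  end
end

IO.inspect(ShapeDemo.demo())
# {true, {:error, :color}, {:error, :color}}
```

Which one to reach for is a design choice: struct/2 suits input that may carry extra keys, while the raising forms surface typos early.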
In a nutshell, structs behave like maps but with defaults, enforced shape, and a globally unique tag.
You might also be interested in my earlier post on polymorphism through protocols: Polymorphism in Elixir.
This topic is explored in more depth in my book, Advanced Functional Programming with Elixir, available now in beta from The Pragmatic Bookshelf.