Elixir Syntax: Data Types and Collections
Elixir Syntax Series
Data Types
If you’re coming from languages like C++ or Java, you might notice that Erlang, and by extension Elixir, does not include some of the basic data types you’re accustomed to, such as fixed-size integers (e.g., int, long) and character types (char). It also introduces the atom, a deliberate design choice for pattern matching.
Atom
Atoms are constants where the name is the value. They are commonly used to signify states like success or error.
In Erlang, variables start with an uppercase letter and atoms start with a lowercase letter:
ok
error
Elixir changed this convention so that atoms are prefixed with a colon:
:ok
:error
Atoms are not garbage-collected, which means dynamically generating atoms from user inputs or other uncontrolled sources is discouraged due to potential memory exhaustion risks.
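One way to respect this when converting external strings is String.to_existing_atom/1, which only succeeds for atoms that already exist. The following is a minimal sketch; the SafeAtom module and its list of allowed atoms are hypothetical names chosen for illustration:

```elixir
defmodule SafeAtom do
  # Hypothetical whitelist; these atoms exist once the module is compiled.
  @allowed [:ok, :error, :pending]

  # Converts a string to an atom only if it names one of the allowed atoms.
  def parse(string) when is_binary(string) do
    atom = String.to_existing_atom(string)
    if atom in @allowed, do: {:ok, atom}, else: :error
  rescue
    # Raised when no atom with that name exists yet.
    ArgumentError -> :error
  end
end

SafeAtom.parse("pending")                 # {:ok, :pending}
SafeAtom.parse("surely_not_an_atom_xyz")  # :error
```

Because no new atom is ever created, arbitrary input cannot grow the atom table.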
Booleans
There is no distinct boolean type. Instead, boolean values are represented using the reserved atoms true and false.
true
false
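A quick way to see that booleans are atoms is to run both kinds of type check on them. This is only an illustrative sketch:

```elixir
# Booleans are atoms, so the atom check and the boolean check both succeed.
is_atom(true)      # true
is_boolean(false)  # true

# The colon is optional for these reserved atoms.
true == :true      # true

# Only true and false count as booleans.
is_boolean(:ok)    # false
```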
Integers
Integers are whole numbers without fractional components. Elixir supports arbitrarily large integers, so there is no upper bound to their size.
123
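As a small sketch of what arbitrary precision means in practice, the following uses Integer.pow/2 (available in Elixir 1.12 and later) to produce a value far beyond the 64-bit range:

```elixir
# Underscores may be used as digit separators in literals.
million = 1_000_000

# Integer.pow/2 returns an exact result, however large.
big = Integer.pow(2, 100)  # 1267650600228229401496703205376

# Arithmetic on large values stays exact instead of overflowing.
bigger = big * big
bigger == Integer.pow(2, 200)  # true
```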
Floats
Floating-point numbers in Elixir are 64-bit double-precision values as defined by the IEEE 754 standard. A float literal must include a decimal point with at least one digit on each side, which distinguishes it from an integer.
3.0
3.1415
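The usual IEEE 754 caveats apply, as this short sketch shows. It also contrasts the two equality operators and the two division operations:

```elixir
# Binary floating point cannot represent 0.1 or 0.2 exactly.
sum = 0.1 + 0.2
sum == 0.3                  # false
Float.round(sum, 2) == 0.3  # true

# == compares by numeric value; === also distinguishes integer from float.
1 == 1.0    # true
1 === 1.0   # false

# The / operator always returns a float; div/2 performs integer division.
10 / 2      # 5.0
div(10, 2)  # 5
```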
Binaries
Binaries are sequences of bytes primarily used to handle encoded text and data. In Elixir, strings are UTF-8 encoded binaries.
"Hello, Elixir!"
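Because strings are UTF-8 binaries, the number of bytes and the number of characters can differ. The sketch below writes the accented letter as a Unicode escape so that the byte count does not depend on how the source file is saved:

```elixir
# "\u00E9" is the letter e with an acute accent, which takes two bytes in UTF-8.
greeting = "h\u00E9llo"

String.length(greeting)  # 5 characters
byte_size(greeting)      # 6 bytes

# A string literal is the same value as the binary holding its bytes.
"hi" == <<104, 105>>     # true
is_binary(greeting)      # true
```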
Type Checking
Elixir is dynamically typed, so it provides predicate functions that check the type of a value at runtime. There is one for each basic data type:
is_boolean(value)
is_integer(value)
is_float(value)
is_binary(value)
is_atom(value)
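These predicates are also allowed in guards, so they can select a function clause by type. The sketch below uses a hypothetical TypeName module to show the idea:

```elixir
defmodule TypeName do
  # Booleans are atoms, so the boolean clause must come before the atom clause.
  def of(value) when is_boolean(value), do: "boolean"
  def of(value) when is_atom(value), do: "atom"
  def of(value) when is_integer(value), do: "integer"
  def of(value) when is_float(value), do: "float"
  def of(value) when is_binary(value), do: "binary"
  def of(_value), do: "other"
end

TypeName.of(true)     # "boolean"
TypeName.of(:ok)      # "atom"
TypeName.of(42)       # "integer"
TypeName.of(3.14)     # "float"
TypeName.of("hello")  # "binary"
```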
Data Collections
Erlang (and by extension, Elixir) provides a fundamental set of built-in data structures, such as lists, tuples, and maps, but lacks an extensive range of complex data collections. This design choice is intentional; the language prioritizes solving problems related to concurrency, fault tolerance, and distributed computing over offering a broad assortment of general-purpose data structures.
Lists
In Elixir, lists are ordered collections of values implemented as linked lists. Each element consists of a value and a pointer to the next element, forming a chain of connected cells. This structure allows the list to grow dynamically and makes certain operations—such as prepending elements—particularly efficient.
Appending elements requires traversing the entire list to reach the end. Therefore, it’s more common to build linked lists by prepending elements to the head and then reversing or sorting the list as needed:
list = []
# Prepend values
list = [1 | list] # [1]
list = [2 | list] # [2, 1]
list = [3 | list] # [3, 2, 1]
reversed_list = Enum.reverse(list) # [1, 2, 3]
The syntax [head | tail] is used for list construction, where head is the first item, and tail is the remainder of the list.
Elixir efficiently manages lists by leveraging immutability. While updating a list might seem to involve copying, the underlying implementation actually shares elements. Since each element’s value is immutable, multiple lists can share the same elements without risk of unintended alterations.
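The same [head | tail] form also works as a pattern for taking a list apart. The sketch below shows both directions. Note that the sharing itself is an implementation detail; the equality check only confirms that the tail of the new list has the same contents as the original:

```elixir
list = [1, 2, 3]

# [head | tail] as a pattern splits off the first element.
[head | tail] = list
head  # 1
tail  # [2, 3]

# Prepending creates one new cell that points at the existing list,
# so the tail of the new list is the original list.
longer = [0 | list]
hd(longer)          # 0
tl(longer) == list  # true
```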
Note: Lists can contain elements of different types:
mixed_list = [:ok, "Hello", 3.14, 42]
Range
Ranges represent sequences of integers, which are especially useful for iteration. A range is defined by specifying the start and end numbers, both inclusive.
range = 1..5
squared = Enum.map(range, fn x -> x * x end) # [1, 4, 9, 16, 25]
Ranges are typically used with the Enum module to perform operations over a series of numbers.
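A few more range operations, as a brief sketch. The first..last//step form assumes Elixir 1.12 or later:

```elixir
# Ranges are enumerable, so Enum functions accept them directly.
Enum.to_list(1..5)  # [1, 2, 3, 4, 5]
Enum.sum(1..100)    # 5050

# An explicit step, including a negative one for counting down.
Enum.to_list(1..10//3)  # [1, 4, 7, 10]
Enum.to_list(5..1//-1)  # [5, 4, 3, 2, 1]

# Membership can be tested with the in operator.
3 in 1..5  # true
```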
List Comprehensions
List comprehensions provide a concise syntax for transforming lists:
squared = for n <- 1..5, do: n * n # [1, 4, 9, 16, 25]
evens = for n <- 1..10, rem(n, 2) == 0, do: n # [2, 4, 6, 8, 10]
products = for x <- [1, 2, 3], y <- [10, 20], do: x * y # [10, 20, 20, 40, 30, 60]
- Keyword for: Initiates the list comprehension.
- Generator: The source of data, typically a range or a list.
- Filters (optional): Conditions that must be met for an element to be included in the results.
- Keyword do: Precedes the expression that defines the operation performed on each element.
Erlang also supports list comprehensions, but with a slightly different syntax:
Squared = [N * N || N <- lists:seq(1, 5)], % [1, 4, 9, 16, 25]
Evens = [N || N <- lists:seq(1, 10), rem(N, 2) == 0], % [2, 4, 6, 8, 10]
Products = [X * Y || X <- [1, 2, 3], Y <- [10, 20]]. % [10, 20, 20, 40, 30, 60]
While list comprehensions are a powerful feature in Erlang and Elixir, most Elixir code prefers transforming lists using the Enum module combined with the pipe operator.
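To make the comparison concrete, here is the same transformation written both ways. This is an illustrative sketch rather than a recommendation for every case:

```elixir
# Comprehension: keep the even numbers, then square them.
with_for = for n <- 1..10, rem(n, 2) == 0, do: n * n

# The same steps as a pipeline of Enum calls.
with_pipe =
  1..10
  |> Enum.filter(fn n -> rem(n, 2) == 0 end)
  |> Enum.map(fn n -> n * n end)

with_for == with_pipe  # true, both are [4, 16, 36, 64, 100]
```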
Enum Module
The Enum
module in Elixir provides an extensive set of algorithms for enumerating over collections. These include the basics, such as:
Enum.map(collection, function)
: Transforms each item in the collection using the provided function.Enum.filter(collection, predicate)
: Returns a new collection filtered by the predicate.Enum.reduce(collection, accumulator, function)
: Reduces the collection to a single value with an accumulator and function.EnumEnum.find(collection, default, predicate)
: Finds the first value in the collection using the predicate, returning the default if none exists.Enum.sort(collection, order)
: Sorts the collection based on the order function.
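The functions listed above can be tried on a small list. The values here are arbitrary and only serve as a sketch:

```elixir
numbers = [5, 3, 8, 1]

doubled = Enum.map(numbers, fn n -> n * 2 end)             # [10, 6, 16, 2]
large = Enum.filter(numbers, fn n -> n > 4 end)            # [5, 8]
total = Enum.reduce(numbers, 0, fn n, acc -> n + acc end)  # 17
found = Enum.find(numbers, :none, fn n -> n > 6 end)       # 8
missing = Enum.find(numbers, :none, fn n -> n > 10 end)    # :none
sorted = Enum.sort(numbers, :desc)                         # [8, 5, 3, 1]
```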
Keyword Lists
Keyword lists (Erlang's counterpart is the proplist) are lists of two-element tuples in which the first element of each tuple, the key, is an atom. They work somewhat like records, but they preserve the order of their entries and allow duplicate keys.
options = [min: 1, max: 100]
min_option = Keyword.get(options, :min) # Returns 1
options = [{:timeout, 500} | options]
timeout_option = Keyword.get(options, :timeout) # Returns 500
Keyword lists are an older artifact of the Elixir/Erlang ecosystem. Looking up a key means scanning the list, so lookup time grows with its length, and they are not a good substitute for records or maps. They tend to show up for the specific use case of passing options to a function.
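The options use case might look like the sketch below. The Fetcher module, its describe function, and the option names are hypothetical and exist only to illustrate the pattern:

```elixir
defmodule Fetcher do
  # Hypothetical function: opts is an optional keyword list of settings.
  def describe(url, opts \\ []) do
    timeout = Keyword.get(opts, :timeout, 5_000)
    retries = Keyword.get(opts, :retries, 0)
    "#{url} (timeout: #{timeout}, retries: #{retries})"
  end
end

# When the keyword list is the last argument, the brackets can be omitted.
Fetcher.describe("example.com", timeout: 500)
# "example.com (timeout: 500, retries: 0)"

# The literal syntax is shorthand for a list of two-element tuples.
[min: 1, max: 100] == [{:min, 1}, {:max, 100}]  # true

# Duplicate keys are kept; Keyword.get/2 returns the first match.
Keyword.get([a: 1, a: 2], :a)         # 1
Keyword.get_values([a: 1, a: 2], :a)  # [1, 2]
```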
Maps
Maps provide a dynamic alternative to Erlang’s records, which are static data structures with predefined schemas. They offer fast key lookups (logarithmic in the number of keys for large maps) while allowing the addition and removal of key-value pairs. Maps accept a variety of key types, including integers, strings, atoms, tuples, lists, and even other maps.
Maps are Enumerable, allowing them to be passed into functions in the Enum module, which includes operations such as mapping and reducing.
person = %{:name => "Alice", "age" => 12}
name = person[:name] # "Alice"
name = person.name # "Alice"
age = person["age"] # 12
# age = person.age # Raises KeyError: dot notation only looks up atom keys, and :age is not a key
person = Map.put(person, :location, "Wonderland")
Although it is possible to mix the types of keys in a map, it is typical to use atoms.
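Since maps are enumerable, each entry is handed to Enum functions as a {key, value} tuple. The following sketch, with made-up data, shows mapping, reducing, and pattern matching on a map:

```elixir
prices = %{apple: 2, pear: 3}

# Enumerating a map yields {key, value} tuples, and Enum.map/2 returns a list.
doubled = Enum.map(prices, fn {fruit, price} -> {fruit, price * 2} end)

# Map.new/1 turns the list of pairs back into a map.
Map.new(doubled)  # %{apple: 4, pear: 6}

# Enum.reduce/3 folds the pairs into a single value.
total = Enum.reduce(prices, 0, fn {_fruit, price}, acc -> acc + price end)  # 5

# A map pattern matches any map that contains the listed keys.
%{apple: apple_price} = prices
apple_price  # 2
```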
Structs
Structs, which exist only in Elixir, are maps with a fixed set of keys. Each struct is defined inside a module and takes that module's name, ensuring a structured and predictable data format. By providing a consistent structure with predefined keys and default values, structs maintain data integrity and reduce errors in function calls that expect certain data shapes.
defmodule Person do
  defstruct name: "", age: 0, height: 0
  def ride_roller_coaster?(%Person{height: height}) do
    if height >= 120 do
      IO.puts("Enjoy the ride!")
      true
    else
      IO.puts("Sorry, you are too short!")
      false
    end
  end
end
Above, the function ride_roller_coaster? leverages the Person struct to determine eligibility for a roller coaster ride based on height.
Attempting to use a map with the appropriate keys instead of a struct will lead to an error, as the function expects a Person struct:
alice = %{name: "Alice", age: 28, height: 122}
Person.ride_roller_coaster?(alice) # Raises FunctionClauseError
Passing a Person struct works:
alice = %Person{name: "Alice", age: 28, height: 122}
Person.ride_roller_coaster?(alice) # true
Constructing a Person struct without specifying a height also works as expected:
alice = %Person{name: "Alice", age: 28}
Person.ride_roller_coaster?(alice) # false
The height defaults to 0, and the function handles it appropriately.
Structs constrain the dynamic nature of maps, reducing the need for defensive programming.
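The constraint on keys can be observed at runtime with struct!/2. The sketch below defines its own hypothetical Visitor struct so that it is self-contained:

```elixir
defmodule Visitor do
  # Hypothetical struct, defined here only for this example.
  defstruct name: "", height: 0
end

visitor = struct!(Visitor, name: "Alice", height: 122)

# A struct is a map that also records which module defined it.
is_map(visitor)                # true
visitor.__struct__ == Visitor  # true

# struct!/2 raises KeyError for a key the struct does not define.
result =
  try do
    struct!(Visitor, name: "Bob", weight: 70)
  rescue
    KeyError -> :unknown_key
  end

result  # :unknown_key
```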
Map Module
The Map module in Elixir provides functions for working with maps and structs. Some key functions include:
- Map.get(map, key, default): Retrieves the value for the given key, returning the default if the key doesn’t exist.
- Map.put(map, key, value): Inserts or updates the key with the given value in the map.
- Map.delete(map, key): Removes the key from the map.
- Map.keys(map): Returns a list of all the keys in the map.
- Map.merge(map1, map2): Merges two maps, with values from the second map overwriting those from the first if keys conflict.
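A short sketch of these functions in use, with an invented config map. Every call returns a new map and leaves the original untouched:

```elixir
config = %{host: "localhost", port: 4000}

Map.get(config, :port, 80)        # 4000
Map.get(config, :scheme, "http")  # "http", the key is absent so the default is used

with_scheme = Map.put(config, :scheme, "https")
without_port = Map.delete(with_scheme, :port)
# %{host: "localhost", scheme: "https"}

Enum.sort(Map.keys(without_port))  # [:host, :scheme]

# On a key conflict, the value from the second map wins.
Map.merge(config, %{port: 8080, ssl: true})
# %{host: "localhost", port: 8080, ssl: true}

# The original map is unchanged.
config  # %{host: "localhost", port: 4000}
```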
Map Update Syntax
Maps have a concise update syntax using the | operator:
superman = %{name: "Clark Kent", city: "Metropolis"}
updated_superman = %{superman | city: "Gotham"}
In this example, the city key is updated from “Metropolis” to “Gotham” while the name key remains unchanged.
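One detail worth knowing is that this syntax only updates keys that already exist. The sketch below, using an invented nickname key, contrasts it with Map.put/3:

```elixir
hero = %{name: "Clark Kent", city: "Metropolis"}

# Map.put/3 adds a key that is not there yet.
Map.put(hero, :nickname, "Superman")
# %{city: "Metropolis", name: "Clark Kent", nickname: "Superman"}

# The update syntax refuses to add keys and raises KeyError instead.
result =
  try do
    %{hero | nickname: "Superman"}
  rescue
    KeyError -> :missing_key
  end

result  # :missing_key
```

This strictness helps catch misspelled keys early.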
Tuples
Tuples are fixed-size collections of elements stored contiguously in memory, making it very fast to access their elements. In many languages, tuples are sort of an afterthought, but in Elixir, they are central to how functions return values and handle pattern-matching. Tuples are particularly useful for representing success or failure in function results, providing a clear and structured way to handle different outcomes.
{:ok, result}
{:error, reason}
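Callers usually handle such results by pattern matching on the tuple. The sketch below assumes the common {:ok, value} / {:error, reason} shape; the SafeMath module is a hypothetical helper written for this example:

```elixir
defmodule SafeMath do
  # Hypothetical helper that reports failure as a tagged tuple.
  def divide(_a, 0), do: {:error, :division_by_zero}
  def divide(a, b), do: {:ok, a / b}
end

# case matches on the shape of the tuple and binds its contents.
message =
  case SafeMath.divide(10, 4) do
    {:ok, value} -> "Result: #{value}"
    {:error, reason} -> "Failed: #{reason}"
  end

message  # "Result: 2.5"

SafeMath.divide(1, 0)  # {:error, :division_by_zero}

# Elements can also be read by zero-based index.
elem({:ok, 42}, 1)     # 42
tuple_size({:ok, 42})  # 2
```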
Conclusion
Erlang was designed to build distributed, scalable, reliable, fault-tolerant systems to handle massive amounts of concurrent communication. Its choice of data types and collections was limited to what those tasks required, and Elixir reflects these goals, but with a more modern syntax.