PoolToy: A (toy) process pool manager in Elixir 1.6 (part 1.5)

Managing a single pool (continued)

(This post is part of a series on writing a process pool manager in Elixir.)

After the last post, we’ve got a pretty fancy-looking pool:

Unfortunately, we don’t have any sort of mechanism to make use of the worker processes. We need to let client processes check out a worker, do some work with it, and check the worker back in once the client is done with it.

Implementing worker checkin and checkout

In lib/pool_toy/pool_man.ex:

def checkout() do
  GenServer.call(@name, :checkout)
end

def checkin(worker) do
  GenServer.cast(@name, {:checkin, worker})
end

def init(size) do
  # truncated for brevity
end

def handle_call(:checkout, _from, %State{workers: []} = state) do
  {:reply, :full, state}
end

def handle_call(:checkout, _from, %State{workers: [worker | rest]} = state) do
  {:reply, worker, %{state | workers: rest}}
end

def handle_cast({:checkin, worker}, %State{workers: workers} = state) do
  {:noreply, %{state | workers: [worker | workers]}}
end

Pretty easy, right? We add checkout/0 and checkin/1 to the API (lines 1-7), and implement the matching server-side functions (lines 13-23). When a client wants to check out a worker, we reply :full if none are available (line 14), and otherwise we provide the pid of the first available worker (line 18). When checking in a worker, we simply add that pid to the list of available worker pids (line 22). Note that since all workers are equal, there is no need to differentiate them and we can therefore always take pids from, and return them to, the head of the workers list (i.e. workers at the head of the list will be used more often, but we don’t care in this case).
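
If you’d like to poke at the checkout/checkin logic without the rest of the project, here’s a minimal self-contained sketch. CheckoutSketch is a hypothetical stand-in for PoolToy.PoolMan: its state is a bare list rather than the State struct, and its “workers” are plain atoms instead of real pids.

```elixir
# Hypothetical stand-in for PoolToy.PoolMan: the "workers" are plain atoms
# rather than real pids, so the sketch runs without the rest of the project.
defmodule CheckoutSketch do
  use GenServer

  def start_link(workers), do: GenServer.start_link(__MODULE__, workers, name: __MODULE__)
  def checkout(), do: GenServer.call(__MODULE__, :checkout)
  def checkin(worker), do: GenServer.cast(__MODULE__, {:checkin, worker})

  def init(workers), do: {:ok, workers}

  # No worker left: tell the client the pool is full.
  def handle_call(:checkout, _from, []), do: {:reply, :full, []}
  # Otherwise hand out the head of the list and keep the rest.
  def handle_call(:checkout, _from, [worker | rest]), do: {:reply, worker, rest}

  # A returned worker goes back on the head of the list.
  def handle_cast({:checkin, worker}, workers), do: {:noreply, [worker | workers]}
end

{:ok, _pid} = CheckoutSketch.start_link([:w1, :w2])
first = CheckoutSketch.checkout()
second = CheckoutSketch.checkout()
third = CheckoutSketch.checkout()
:ok = CheckoutSketch.checkin(first)
again = CheckoutSketch.checkout()

IO.inspect({first, second, third, again})
# => {:w1, :w2, :full, :w1}
```

Since the checkin cast and the following checkout call come from the same process, the server handles them in the order they were sent, which is why the last checkout gets :w1 back.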

But wait, how come this time around we didn’t implement catch-all clauses for handle_call/3 and handle_cast/2? After all, we had to add one for handle_info/2 back in part 1.4, why would these be different? Here’s what the getting started guide has to say about it:

Since any message, including the ones sent via send/2, go to handle_info/2, there is a chance unexpected messages will arrive to the server. Therefore, if we don’t define the catch-all clause, those messages could cause our [pool manager] to crash, because no clause would match. We don’t need to worry about such cases for handle_call/3 and handle_cast/2 though. Calls and casts are only done via the GenServer API, so an unknown message is quite likely a developer mistake.

In other words, this is the “let it crash” philosophy in action: we shouldn’t receive any unexpected messages in calls or casts. But if we do, it means something went wrong and we should simply crash.
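
To see what “simply crash” looks like in practice, here’s a small sketch using a hypothetical CrashSketch module (not part of PoolToy) that handles a single call and has no catch-all clause. The server is started without a link so the crash doesn’t take the calling process down with it. Expect an error report to be logged when the server dies.

```elixir
defmodule CrashSketch do
  use GenServer

  def init(state), do: {:ok, state}

  # Only :ping is handled: there is deliberately no catch-all clause.
  def handle_call(:ping, _from, state), do: {:reply, :pong, state}
end

# GenServer.start/2 (no link), so the crash stays contained in the server.
{:ok, pid} = GenServer.start(CrashSketch, :no_state)
:pong = GenServer.call(pid, :ping)

outcome =
  try do
    # No clause matches, so the server raises and terminates...
    GenServer.call(pid, :unexpected)
  catch
    # ...and the caller's pending call exits as a result.
    :exit, _reason -> :server_crashed
  end

IO.inspect({outcome, Process.alive?(pid)})
```

In a real application the server would sit under a supervisor, which would restart it in a known-good state.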

Our changes so far

We can now fire up IEx with iex -S mix and try out our worker pool:

iex(1)> w1 = PoolToy.PoolMan.checkout()
#PID<0.116.0>
iex(2)> w2 = PoolToy.PoolMan.checkout()
#PID<0.117.0>
iex(3)> w3 = PoolToy.PoolMan.checkout()
#PID<0.118.0>
iex(4)> w4 = PoolToy.PoolMan.checkout()
:full
iex(5)> PoolToy.PoolMan.checkin(w1)
:ok
iex(6)> w4 = PoolToy.PoolMan.checkout()
#PID<0.116.0>
iex(7)> Doubler.compute(w4, 21)
Doubling 21
42

Great success! We were able to check out workers until the pool told us it was :full, but then as soon as we checked in a worker (and the pool was therefore no longer full), we were once again able to check out a worker.

So we’re done with the basic pool implementation, right? No. When working in the OTP world, we need to constantly think about how processes failing will impact our software, and how it should react.

In this case for example, what happens if a client checks out a worker, but dies before it can check the worker back in? We’ve got a worker that isn’t doing any work (because the client died), but still isn’t available for other clients to check out because it was never returned to the pool.

But don’t take my word for it, let’s verify the problem in IEx (after either starting a new session, or checking all the above workers back into the pool):

iex(1)> client = spawn(fn -> PoolToy.PoolMan.checkout() end)
#PID<0.121.0>
iex(2)> Process.alive? client
false
iex(3)> :observer.start()

Within the Observer, if you go to the applications tab (selecting pool_toy on the left if it isn’t already displayed) and double-click PoolMan, then navigate to the “state” tab in the newly opened window, you can see that only 2 workers are available. In other words, even though the worker we checked out is no longer in use by the client (since that process died), the worker process will never be assigned any more work because it was never checked back into the pool.
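
If you’d rather not click through the Observer, the same leak can be reproduced in a few lines. This is only a sketch: LeakSketch is a hypothetical, stripped-down pool whose workers are atoms, and available/0 is a helper invented here to peek at the state (PoolToy has no such function).

```elixir
defmodule LeakSketch do
  use GenServer

  def start_link(workers), do: GenServer.start_link(__MODULE__, workers, name: __MODULE__)
  def checkout(), do: GenServer.call(__MODULE__, :checkout)
  # Hypothetical helper (not in PoolToy) to read the list of available workers.
  def available(), do: GenServer.call(__MODULE__, :available)

  def init(workers), do: {:ok, workers}

  def handle_call(:checkout, _from, []), do: {:reply, :full, []}
  def handle_call(:checkout, _from, [worker | rest]), do: {:reply, worker, rest}
  def handle_call(:available, _from, workers), do: {:reply, workers, workers}
end

{:ok, _pid} = LeakSketch.start_link([:w1, :w2, :w3])

# The client checks out a worker and exits right away, without checking it in.
{client, ref} = spawn_monitor(fn -> LeakSketch.checkout() end)

# Wait until the client process is really gone.
receive do
  {:DOWN, ^ref, :process, ^client, _reason} -> :client_gone
end

IO.inspect(LeakSketch.available())
# => [:w2, :w3]
```

The client is dead, yet :w1 never comes back: nothing in the pool notices that the process holding it has gone away.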

How can we handle this problem? How about babysitting? Coming right up next!


Would you like to see more Elixir content like this? Sign up to my mailing list so I can gauge how much interest there is in this type of content.


PoolToy: A (toy) process pool manager in Elixir 1.6 (part 1.4)

Managing a single pool (continued)

(This post is part of a series on writing a process pool manager in Elixir.)

After the last part, we’ve made some headway but still need some workers:

More useful state

Let’s start by enhancing the state in the pool manager to be a struct: we’re going to have to remember a bunch of things to manage the pool, so a struct will be handy. In lib/pool_toy/pool_man.ex:

defmodule PoolToy.PoolMan do
  use GenServer

  defmodule State do
    defstruct [:size]
  end

  @name __MODULE__

  def start_link(size) when is_integer(size) and size > 0 do
    # edited for brevity
  end

  def init(size) do
    {:ok, %State{size: size}}
  end
end

Starting workers

Our pool manager must now get some worker processes started. Let’s begin with attempting to start a single worker dynamically, from within init/1. Look at the docs for DynamicSupervisor and see if you can figure out how to start a child.

Here we go (still in lib/pool_toy/pool_man.ex):

def init(size) do
  DynamicSupervisor.start_child(PoolToy.WorkerSup, Doubler)
  {:ok, %State{size: size}}
end

start_child/2 requires the dynamic supervisor location and a child spec. Since we’ve named our worker supervisor, we reuse the name to locate it here (but giving its pid as the argument would have worked also). Child specs can be provided the same way as for a normal supervisor: complete child spec map, tuple with arguments, or just a module name (if we don’t need any start arguments). We’ve gone with the latter.
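
The three ways of providing a child spec can be tried side by side. The sketch below assumes a hypothetical SpecSketch.Worker module (an Agent standing in for Doubler) and an anonymous dynamic supervisor located by pid rather than by name.

```elixir
defmodule SpecSketch.Worker do
  use Agent

  # `use Agent` generates child_spec/1, which points at start_link/1.
  def start_link(arg), do: Agent.start_link(fn -> arg end)
end

{:ok, sup} = DynamicSupervisor.start_link(strategy: :one_for_one)

# 1. Just a module name: the start argument defaults to [].
{:ok, w1} = DynamicSupervisor.start_child(sup, SpecSketch.Worker)

# 2. A {module, argument} tuple.
{:ok, w2} = DynamicSupervisor.start_child(sup, {SpecSketch.Worker, :from_tuple})

# 3. A complete child spec map.
spec = %{id: SpecSketch.Worker, start: {SpecSketch.Worker, :start_link, [:from_map]}}
{:ok, w3} = DynamicSupervisor.start_child(sup, spec)

# Each worker's state is the argument it was started with.
states = Enum.map([w1, w2, w3], &Agent.get(&1, fn state -> state end))
IO.inspect(states)
# => [[], :from_tuple, :from_map]
```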

Let’s try it out in IEx. When running iex -S mix we get:

Erlang/OTP 20 [erts-9.3]  [64-bit] [smp:4:4] [ds:4:4:10] [async-threads:10] [hipe] [kernel-poll:false]

Compiling 6 files (.ex)
Generated pool_toy app

08:52:07.275 [info]  Application pool_toy exited: PoolToy.Application.start(:normal, []) returned an error: shutdown: failed to start child: PoolToy.PoolSup
    ** (EXIT) shutdown: failed to start child: PoolToy.PoolMan
        ** (EXIT) exited in: GenServer.call(PoolToy.WorkerSup, {:start_child, {{Doubler, :start_link, [[]]}, :permanent, 5000, :worker, [Doubler]}}, :infinity)
            ** (EXIT) no process: the process is not alive or there's no process currently associated with the given name, possibly because its application isn't started
** (Mix) Could not start application pool_toy: PoolToy.Application.start(:normal, []) returned an error: shutdown: failed to start child: PoolToy.PoolSup
    ** (EXIT) shutdown: failed to start child: PoolToy.PoolMan
        ** (EXIT) exited in: GenServer.call(PoolToy.WorkerSup, {:start_child, {{Doubler, :start_link, [[]]}, :permanent, 5000, :worker, [Doubler]}}, :infinity)
            ** (EXIT) no process: the process is not alive or there's no process currently associated with the given name, possibly because its application isn't started

Oops. What’s going on? Well, let’s read through the error messages:

  1. The pool toy application couldn’t start, because:
  2. The pool supervisor couldn’t start, because:
  3. The pool manager couldn’t start, because:
  4. It tries calling a GenServer process called PoolToy.WorkerSup, which fails because:
  5. the process is not alive or there’s no process currently associated with the given name, possibly because its application isn’t started.

Wait a minute: the worker supervisor should be up and running with that name, since it was started by the pool supervisor! Let’s take a look at that code and see how the pool supervisor goes about its job of starting its children (lib/pool_toy/pool_sup.ex):

def init(args) do
  pool_size = args |> Keyword.fetch!(:size)

  children = [
    {PoolToy.PoolMan, pool_size},
    PoolToy.WorkerSup
  ]

  Supervisor.init(children, strategy: :one_for_all)
end

Right, so on line 9 the pool supervisor starts its children. Which children? The ones defined on lines 4-7. And how are they started? The docs say

When the supervisor starts, it traverses all child specifications and then starts each child in the order they are defined.

So in our code, we start the pool manager and then move on to the worker supervisor. But since the pool manager expects the worker supervisor to already be up and running when the pool manager initializes itself (since it calls the worker sup), our code blows up.
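
The start order is easy to observe for yourself. In this sketch, OrderSketch.Child is a hypothetical GenServer whose only job is to report back to us when its init/1 runs. The ids :first and :second are made up for the example.

```elixir
defmodule OrderSketch.Child do
  use GenServer

  def start_link({name, report_to}), do: GenServer.start_link(__MODULE__, {name, report_to})

  def init({name, report_to}) do
    # Tell the interested process that this child is starting.
    send(report_to, {:started, name})
    {:ok, name}
  end
end

children = [
  Supervisor.child_spec({OrderSketch.Child, {:first, self()}}, id: :first),
  Supervisor.child_spec({OrderSketch.Child, {:second, self()}}, id: :second)
]

{:ok, _sup} = Supervisor.start_link(children, strategy: :one_for_all)

# Collect the two notifications in the order they arrived.
order =
  for _ <- 1..2 do
    receive do
      {:started, name} -> name
    after
      1_000 -> :timeout
    end
  end

IO.inspect(order)
```

The supervisor waits for a child’s init/1 to return before moving on to the next child, so whatever is listed first is fully initialized before the second one even starts.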

Put another way, our software design has taken the position that the worker supervisor will ALWAYS be available if the pool manager is running. However, in our current implementation we’ve failed this guarantee quite spectacularly. (Take a look at what Fred Hébert has to say about supervisor start up and guarantees.)

The fix is easy enough: tell the pool supervisor to start the worker supervisor before the manager (lib/pool_toy/pool_sup.ex).

children = [
  PoolToy.WorkerSup,
  {PoolToy.PoolMan, pool_size}
]

Try running PoolToy in IEx again, and not only does it work, you can actually see a baby worker in the Observer! Huzzah!

As a quick side note, the current example is quite simple and straightforward but initialization order needs to be properly thought through in your OTP projects. Once again, take a look at Fred’s excellent writing on the subject.

With our first worker process created, let’s finish the job and actually start enough workers to fill the requested pool size. Let’s get started with the naive approach (lib/pool_toy/pool_man.ex):

defmodule State do
  defstruct [:size, workers: []]
end

def init(size) do
  start_worker = fn _ ->
    {:ok, pid} = DynamicSupervisor.start_child(PoolToy.WorkerSup, Doubler)
    pid
  end

  workers = 1..size |> Enum.map(start_worker)

  {:ok, %State{size: size, workers: workers}}
end

Pretty straightforward, right? Since we’re going to need to keep track of the worker pids (to know which ones are available), we add a :workers attribute (initialized to an empty list) to our state on line 2. On lines 6-9 we define a helper function that starts a worker and returns its pid. You’ll note that with the pattern match on line 7 we expect the worker creation to always succeed: if anything but an “ok tuple” is returned, we want to crash. Finally, on line 11 we start the requested number of workers, and add them to the state on line 13.

If you take this for a run in IEx and peek at the state for the PoolMan process, you’ll see we indeed have pids in the :workers state property, and that they match the ones visible in the application tab. Great!

Our changes so far

Deferring initialization work

Actually, it’s not so great: GenServer initialization is synchronous, yet we’re doing a bunch of work in there. This could tie up the client process and contribute to making our project unresponsive: the init/1 callback should always return as soon as possible.

So what are we to do, given we need to start the worker processes when the pool manager is starting up? Well, we have the server process send a message to itself to start the workers. That way, init/1 will return right away, and will then process the next message in its mailbox which should be the one telling it to start the worker processes. Here we go (lib/pool_toy/pool_man.ex):

def init(size) do
  send(self(), :start_workers)
  {:ok, %State{size: size}}
end

def handle_info(:start_workers, %State{size: size} = state) do
  start_worker = fn _ ->
    {:ok, pid} = DynamicSupervisor.start_child(PoolToy.WorkerSup, Doubler)
    pid
  end

  workers = 1..size |> Enum.map(start_worker)

  {:noreply, %{state | workers: workers}}
end

def handle_info(msg, state) do
  IO.puts("Received unexpected message: #{inspect(msg)}")
  {:noreply, state}
end

Take a good long look at this code, and try to digest it. Think about why each line/function is there.

First things first: in init/1, we have the process send a message to itself. Remember that init/1 is executed within the server process of a gen server, so send(self(), ...) will send the given message to the server process’ inbox.

Although we’re sending the message immediately, it’s important to note that nothing (e.g. starting new workers) actually gets done at that point: the only thing happening is a message being sent. After sending the message, init/1 is free to do whatever it wants, before returning {:ok, state}. Only after init/1 has returned will the server process look in its mailbox and see the :start_workers message we sent (assuming no other message came in before).
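
Here’s the deferred initialization pattern boiled down to a runnable sketch. DeferSketch is hypothetical: its “workers” are just {:worker, n} tuples so that no supervisor is needed, and workers/1 is a helper added to read the state back.

```elixir
defmodule DeferSketch do
  use GenServer

  def start_link(size), do: GenServer.start_link(__MODULE__, size)
  def workers(pid), do: GenServer.call(pid, :workers)

  def init(size) do
    # Only queues a message: no worker is started here.
    send(self(), :start_workers)
    {:ok, %{size: size, workers: []}}
  end

  def handle_call(:workers, _from, state), do: {:reply, state.workers, state}

  def handle_info(:start_workers, %{size: size} = state) do
    # Stand-in for the real work of starting worker processes.
    workers = Enum.map(1..size, fn n -> {:worker, n} end)
    {:noreply, %{state | workers: workers}}
  end

  def handle_info(_msg, state), do: {:noreply, state}
end

{:ok, pid} = DeferSketch.start_link(3)
started = DeferSketch.workers(pid)
IO.inspect(started)
# => [worker: 1, worker: 2, worker: 3]
```

Because the process isn’t named and its pid is only known once start_link/1 returns, :start_workers is guaranteed to be the first message in the mailbox here. With a named process, that guarantee no longer holds.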

Side note: if there’s a possibility in your project that a message could be sent to the gen server before it is fully initialized, you can have it respond with e.g. {:error, :not_ready} (in handle_call, etc.) when the state indicates that prerequisites are missing. It is then up to the client to handle this case when it presents itself.
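
As a rough illustration of that side note, here’s one possible shape for it. Everything in this sketch is an assumption: NotReadySketch is hypothetical, a nil value for :workers is how it marks the “not initialized yet” state, and the :start_workers message is sent by hand (rather than from init/1) so both states can be observed deterministically.

```elixir
defmodule NotReadySketch do
  use GenServer

  def start_link(size), do: GenServer.start_link(__MODULE__, size)
  def checkout(pid), do: GenServer.call(pid, :checkout)

  # workers: nil marks the "not initialized yet" state (a choice made for this sketch).
  def init(size), do: {:ok, %{size: size, workers: nil}}

  def handle_call(:checkout, _from, %{workers: nil} = state),
    do: {:reply, {:error, :not_ready}, state}

  def handle_call(:checkout, _from, %{workers: []} = state),
    do: {:reply, :full, state}

  def handle_call(:checkout, _from, %{workers: [worker | rest]} = state),
    do: {:reply, worker, %{state | workers: rest}}

  # Stand-in initialization: the "workers" are just the integers 1..size.
  def handle_info(:start_workers, %{size: size} = state),
    do: {:noreply, %{state | workers: Enum.to_list(1..size)}}

  def handle_info(_msg, state), do: {:noreply, state}
end

{:ok, pid} = NotReadySketch.start_link(2)
early = NotReadySketch.checkout(pid)

# Sent by hand here; a real server would send this to itself from init/1.
send(pid, :start_workers)
late = NotReadySketch.checkout(pid)

IO.inspect({early, late})
# => {{:error, :not_ready}, 1}
```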

Ok, onwards: we’ve sent a message to the server process so we need code to handle it. If a message is simply sent to a gen server process (as opposed to using something like GenServer.call/3), it will be treated as an “info” message. So on lines 6-15 we handle that info message, start the worker processes, and store their pids in the server state.

But why did we define another handle_info function on lines 17-20?

A short primer on process mailboxes

Each process in the BEAM has its mailbox where it receives messages sent to it, and from which it can read using receive/1. But as you may recall, receive/1 will pattern match on the message, and process the first message it finds that matches a pattern. What about the messages that don’t match any pattern? They stay in the mailbox.

Every time receive/1 is invoked, the mailbox is consulted, starting with the oldest message and moving chronologically. So if messages don’t match clauses in receive/1 they will just linger in the mailbox and be checked again and again for matching clauses. This is bad: not only does it slow your process down (it’s spending a lot of time needlessly checking messages that will never match), but you also run the risk of running out of memory for the process’ mailbox as the list of unprocessed messages grows until finally the process is killed (and all unread messages are lost).
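
A quick way to watch a message linger is to check the mailbox length after a selective receive. The sketch below runs inside a Task so that it starts from a clean mailbox. The message names are made up.

```elixir
task =
  Task.async(fn ->
    send(self(), :never_matched)
    send(self(), :wanted)

    # Selective receive: skips over :never_matched and takes :wanted.
    received =
      receive do
        :wanted -> :wanted
      end

    # The unmatched message is still sitting in the mailbox.
    {:message_queue_len, leftover} = Process.info(self(), :message_queue_len)
    {received, leftover}
  end)

result = Task.await(task)
IO.inspect(result)
# => {:wanted, 1}
```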

To prevent this undesirable state of affairs, it is good practice to have a “catch all” clause in handle_info/2 that logs the fact that an unexpected message was received while also removing it from the process’ mailbox, so that unprocessed messages can’t pile up.

When we use GenServer, default catch all implementations of handle_info/2 (as well as handle_call/3 and handle_cast/2, which we’ll meet later) are created for us and added to the module. But as soon as we define a function with the same name (e.g. we write a handle_info/2 function as above) it will override the one created by the use statement. So when we write a handle_info/2 message-handling function in a gen server, we need to follow through and write a catch all version also.

Since these unexpected messages shouldn’t have been sent to this process, in production code we would most likely log them as errors using the Logger instead of just printing a message.
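
A Logger-based catch-all could look something like the sketch below. CatchAllSketch is a hypothetical server whose state is a simple counter. The point is only that the unexpected message gets logged, dropped, and leaves the state untouched.

```elixir
defmodule CatchAllSketch do
  use GenServer
  require Logger

  def start_link(count), do: GenServer.start_link(__MODULE__, count)
  def init(count), do: {:ok, count}

  # The one message this server knows about.
  def handle_info(:expected, count), do: {:noreply, count + 1}

  # Catch-all: log the stray message and carry on with the same state.
  def handle_info(msg, count) do
    Logger.error("Received unexpected message: #{inspect(msg)}")
    {:noreply, count}
  end
end

{:ok, pid} = CatchAllSketch.start_link(0)
send(pid, :expected)
send(pid, {:who, :sent, :this})

# :sys.get_state/1 is handled after the two messages above,
# so it doubles as a synchronization point.
count = :sys.get_state(pid)
IO.inspect({count, Process.alive?(pid)})
```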

Improving the internal API

Right now we’re starting workers like so (lib/pool_toy/pool_man.ex):

{:ok, pid} = DynamicSupervisor.start_child(PoolToy.WorkerSup, Doubler)

This isn’t great, because it means that our pool manager needs relatively intimate knowledge about the internals of the worker supervisor (namely, that it is a dynamic supervisor, and not a plain vanilla one). It also means that if we change how children get started (e.g. by setting a default config), we’d have to make that change everywhere that requests new workers to be started.

Let’s move the knowledge of how to start workers where it belongs: within the WorkerSup module (lib/pool_toy/worker_sup.ex):

defmodule PoolToy.WorkerSup do
  use DynamicSupervisor

  @name __MODULE__

  def start_link(args) do
    DynamicSupervisor.start_link(__MODULE__, args, name: @name)
  end

  defdelegate start_worker(sup, spec), to: DynamicSupervisor, as: :start_child

  def init(_arg) do
    DynamicSupervisor.init(strategy: :one_for_one)
  end
end

I’ve reproduced the whole module here, but the only change is on line 10: we define a start_worker/2 function, which in effect proxies the call to DynamicSupervisor.start_child/2. If you’re not familiar with the defdelegate macro, you can learn more about it here, but in essence the line we’ve added is equivalent to:

def start_worker(sup, spec) do
  DynamicSupervisor.start_child(sup, spec)
end
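
If you want to convince yourself that the two forms behave the same, here’s a tiny sketch that doesn’t need any supervisor. DelegateSketch and its function names are made up for the example. It delegates to String.upcase/1 simply because that’s easy to check.

```elixir
defmodule DelegateSketch do
  # Proxies DelegateSketch.shout/1 to String.upcase/1.
  defdelegate shout(text), to: String, as: :upcase

  # The hand-written equivalent.
  def shout_by_hand(text) do
    String.upcase(text)
  end
end

IO.inspect({DelegateSketch.shout("hello"), DelegateSketch.shout_by_hand("hello")})
# => {"HELLO", "HELLO"}
```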

With this addition in place, we can update our pool manager code to make use of it (lib/pool_toy/pool_man.ex):

def handle_info(:start_workers, %State{size: size} = state) do
  workers =
    for _ <- 1..size do
      {:ok, pid} = PoolToy.WorkerSup.start_worker(PoolToy.WorkerSup, Doubler)
      pid
    end

  {:noreply, %{state | workers: workers}}
end

Our changes so far

At this stage in our journey, we’ve got a pool with worker processes. However, right now clients can’t check out worker processes to do work with them. Let’s get to that next!



PoolToy: A (toy) process pool manager in Elixir 1.6 (part 1.3)

Managing a single pool (continued)

(This post is part of a series on writing a process pool manager in Elixir.)

In the previous post, we managed to get our pool supervisor to start the pool manager. Now, our next objective is to have the pool supervisor start the worker supervisor, and we’ll be one step closer to our intermediary goal:

Before we just dive in, let’s think about our worker supervisor a bit and how it differs from the pool supervisor. The worker supervisor’s children are all going to be the same process type (i.e. all workers based on the same module), whereas the pool supervisor has different children. Further, the number of children under the worker supervisor’s care will vary: in the future, we’ll want to be able to spin up temporary “overflow” workers to handle bursts in demand. These features make it a great fit for the DynamicSupervisor.

The worker supervisor

Here’s our bare-bones implementation of the worker supervisor (lib/pool_toy/worker_sup.ex):

defmodule PoolToy.WorkerSup do
  use DynamicSupervisor

  @name __MODULE__

  def start_link(args) do
    DynamicSupervisor.start_link(__MODULE__, args, name: @name)
  end

  def init(_arg) do
    DynamicSupervisor.init(strategy: :one_for_one)
  end
end

This should look familiar: it’s not very different from the code we wrote for the pool supervisor. The main differences aside from useing DynamicSupervisor on line 2 are the restart strategy on line 11 (DynamicSupervisor only supports :one_for_one at this time), and the fact that we don’t initialize any children. Dynamic supervisors always start with no children and add them later dynamically (hence their name).
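
To check the “starts with no children” part, we can count the children before and after adding one. This sketch uses a hypothetical, unnamed EmptySketch.WorkerSup and a bare Agent as a stand-in worker.

```elixir
defmodule EmptySketch.WorkerSup do
  use DynamicSupervisor

  def start_link(args), do: DynamicSupervisor.start_link(__MODULE__, args)

  def init(_args), do: DynamicSupervisor.init(strategy: :one_for_one)
end

{:ok, sup} = EmptySketch.WorkerSup.start_link([])

# A freshly started dynamic supervisor has no children at all.
before_count = DynamicSupervisor.count_children(sup).active

# Any child spec works here; an Agent is just a convenient stand-in for a worker.
{:ok, _worker} = DynamicSupervisor.start_child(sup, {Agent, fn -> :idle end})
after_count = DynamicSupervisor.count_children(sup).active

IO.inspect({before_count, after_count})
# => {0, 1}
```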

We still need our pool supervisor to start the worker supervisor (lib/pool_toy/pool_sup.ex):

def init(args) do
  pool_size = args |> Keyword.fetch!(:size)
  children = [
    {PoolToy.PoolMan, pool_size},
    PoolToy.WorkerSup
  ]

  Supervisor.init(children, strategy: :one_for_all)
end

Since this time around we don’t need to pass any special values for the initialization, we don’t need a tuple and can just pass in the module name directly. A quick detour in IEx to try everything out:

iex -S mix
PoolToy.PoolSup.start_link(size: 4)
:observer.start()

In the processes tab, you can see we’ve got entries for PoolSup, PoolMan, and WorkerSup. In addition, if you double click on the pool manager and worker supervisor entries and navigate to their “state” tabs, you’ll see that they both have the pool supervisor as their parent. Great (intermediary) success!

Our changes so far

Throwing the OTP application into the mix

Poking around the Observer using the process list has been helpful, but it’d be even better if we could have a more visual representation of the hierarchy between our processes. You may have noticed the “applications” tab in the Observer: it has pretty charts of process hierarchies, but our processes are nowhere to be found. It turns out, that’s because right now our code is just a bunch of processes, they’re not an actual OTP application.

First of all, what is an OTP application? Well, it’s probably not the same size/scope as the application you’re thinking of: OTP applications are more like components in that they’re bundles of reusable code that gets started/stopped as a unit. In other words, the software you build in OTP (whether it’s with Elixir, Erlang, or something else) will nearly always contain or depend on several OTP applications.

To turn our project into an application, we need to first write the application module callback (lib/pool_toy/application.ex):

defmodule PoolToy.Application do
  use Application

  def start(_type, _args) do
    children = [
      {PoolToy.PoolSup, [size: 3]}
    ]

    opts = [strategy: :one_for_one]
    Supervisor.start_link(children, opts)
  end
end

Pretty straightforward, right? This code once again closely resembles the code we’ve been writing thus far, except we’re using Application on line 2 and the callback we needed to implement is start/2.

You’ll probably have noticed that we’ve hard-coded a pool size of 3 on line 6. That’s just temporary to get us up and running: in our final implementation, the application will start a top-level supervisor and we’ll once again be able to specify pool sizes as we create them.

So now that we’ve defined how our application should be started, we still need to wire it up within mix so it gets started automatically (mix.exs):

def application do
  [
    mod: {PoolToy.Application, []},
    registered: [
      PoolToy.PoolSup,
      PoolToy.PoolMan,
      PoolToy.WorkerSup
    ],
    extra_applications: [:logger]
  ]
end

Line 3 is where the magic happens: we specify the module that is the entry point for our application, as well as the start argument. Here, that means the application gets started by calling the start/2 callback in PoolToy.Application and providing it [] as the second argument (the first argument is used to specify how the application is started, which is useful in more complex failover/takeover scenarios which won’t be covered here).

You can safely ignore the registered key/value here: I’ve only included it for completeness. It’s essentially a list of named processes our application will register. The Erlang runtime uses this information to detect name collisions. If you leave it out, it will default to [] and your app will still work.

Our changes so far

Start an IEx session once again (using iex -S mix, remember?), fire up the Observer with :observer.start(), and check out what we can see in the “applications” tab:

Pretty sweet, right? (If you don’t see this, click on “pool_toy” on the left.)

Now that we’ve got a visual representation of what the relationships between our processes look like, let’s mess around with them a bit… Double click on each of our processes, take note of their pids (visible in the window’s title bar), and close their windows. Now, right click on PoolMan and select “Kill process” (press “ok” when prompted for the exit reason). Look at the pid for PoolMan by double clicking on it again: it’s different now, since it was restarted by PoolSup. If you look at WorkerSup’s pid, you’ll see it’s also different: PoolSup restarted it as well, because we told it to use a :one_for_all strategy. Ah, the magic of Erlang’s supervision trees…
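
The same experiment can be scripted instead of clicked. The sketch below uses a hypothetical RestartSketch.Child (an Agent) under an anonymous :one_for_all supervisor, with the made-up ids :sketch_a and :sketch_b playing the roles of PoolMan and WorkerSup. Killing a child will log an error report, which is expected.

```elixir
defmodule RestartSketch.Child do
  use Agent

  def start_link(label), do: Agent.start_link(fn -> label end)
end

children = [
  Supervisor.child_spec({RestartSketch.Child, :a}, id: :sketch_a),
  Supervisor.child_spec({RestartSketch.Child, :b}, id: :sketch_b)
]

{:ok, sup} = Supervisor.start_link(children, strategy: :one_for_all)

# Builds a map of child id => current pid.
pids = fn ->
  sup
  |> Supervisor.which_children()
  |> Map.new(fn {id, pid, _type, _modules} -> {id, pid} end)
end

before = pids.()

# Watch the sibling, then kill the first child.
ref = Process.monitor(before.sketch_b)
Process.exit(before.sketch_a, :kill)

# Once the sibling is down, the supervisor is busy restarting both children.
receive do
  {:DOWN, ^ref, :process, _pid, _reason} -> :sibling_terminated
after
  1_000 -> raise "sibling was not terminated"
end

# This call is answered only after the supervisor has finished restarting.
after_restart = pids.()

IO.inspect(before.sketch_a != after_restart.sketch_a)
IO.inspect(before.sketch_b != after_restart.sketch_b)
```

Both comparisons should print true: we only killed one child, but :one_for_all made the supervisor terminate and restart its sibling too.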

Starting new projects as OTP applications

We went about writing our application in a bit of a roundabout way: we wrote our code, and introduced the OTP application when we needed/wanted it by writing the application callback module ourselves and adding to mix.exs.

Naturally, Elixir can help us out here when we’re starting a new project: passing the --sup option to mix new will create a skeleton of an OTP application with a supervision tree (docs). So we could have saved ourselves some work by starting out our project with

mix new pool_toy --sup

The more you know!

Our application is starting to look like the figure at the top of the post, but we still need actual worker processes. Let’s get to that next!



PoolToy: A (toy) process pool manager in Elixir 1.6 (part 1.2)

Managing a single pool (continued)

(This post is part of a series on writing a process pool manager in Elixir.)

Continuing from the previous post, we’re working on having a single pool of processes that we can check out to do some work with. When we’re done, it should look like this:

The pool manager

Our pool manager is going to be a GenServer: it needs to be a process so it can maintain its state (e.g. tracking which worker processes are checked out). Let’s start with a super simplified version (lib/pool_toy/pool_man.ex):

defmodule PoolToy.PoolMan do
  use GenServer

  @name __MODULE__

  def start_link(size) when is_integer(size) and size > 0 do
    GenServer.start_link(__MODULE__, size, name: @name)
  end

  def init(size) do
    state = List.duplicate(:worker, size)
    {:ok, state}
  end
end

start_link/1 on line 6 takes a size indicating the number of workers we want to have in the pool. At this time, our pool manager is basically a trivial GenServer whose state consists of a list of identical atoms. These atoms represent workers we’ll implement later: right now, we’ll just use these atoms as simple stand-ins.

Let’s try it out and make sure it’s behaving as intended:

iex -S mix

iex(1)> PoolToy.PoolMan.start_link(3)
{:ok, #PID<0.112.0>}
iex(2)> :observer.start()
:ok

In the Observer, navigate to the pool manager process, and take a look at its state. Having trouble? Follow the steps in part 1.1. You can see that the process’ state is [worker, worker, worker] as expected (don’t forget, those are Erlang’s representation of atoms). Yay!

Our changes so far

And now, for our next trick, let’s have the pool supervisor start the pool manager upon startup.

Starting supervised children

This is what our pool supervisor currently has (lib/pool_toy/pool_sup.ex):

defmodule PoolToy.PoolSup do
  use Supervisor

  @name __MODULE__

  def start_link() do
    Supervisor.start_link(__MODULE__, [], name: @name)
  end

  def init([]) do
    children = []

    Supervisor.init(children, strategy: :one_for_all)
  end
end

We’ve got this convenient children value on line 11, let’s throw the pool manager value in there and see what happens: children = [PoolToy.PoolMan]. Within an IEx session (again, started with iex -S mix), try calling PoolToy.PoolSup.start_link() and you’ll get the following result:

** (EXIT from #PID<0.119.0>) shell process exited with reason: shutdown: failed to start child: PoolToy.PoolMan
    ** (EXIT) an exception was raised:
    ** (FunctionClauseError) no function clause matching in PoolToy.PoolMan.start_link/1
        (pool_toy) lib/pool_toy/pool_man.ex:6: PoolToy.PoolMan.start_link([])
        (stdlib) supervisor.erl:365: :supervisor.do_start_child/2
        (stdlib) supervisor.erl:348: :supervisor.start_children/3
        (stdlib) supervisor.erl:314: :supervisor.init_children/2
        (stdlib) gen_server.erl:365: :gen_server.init_it/2
        (stdlib) gen_server.erl:333: :gen_server.init_it/6
        (stdlib) proc_lib.erl:247: :proc_lib.init_p_do_apply/3

Darn. It looks like this programming stuff is going to be harder than just making random changes until we get something to work! Let’s take a closer look at the problem: the process couldn’t start PoolMan (line 1), because it was calling start_link([]) (line 4) but no function clause matched (line 3).

So how come PoolMan.start_link/1 is getting called with []? Let’s back up and see what Supervisor.init/2 does (docs). It refers us to more docs for additional information, where we’re told that each child in the list given as the first argument to init/2 may be one of:

  • a map representing the child specification itself – as outlined in the “Child specification” section
  • a tuple with a module as first element and the start argument as second – such as {Stack, [:hello]}. In this case, Stack.child_spec([:hello]) is called to retrieve the child specification
  • a module – such as Stack. In this case, Stack.child_spec([]) is called to retrieve the child specification

What’s this about a child specification? It’s basically how a supervisor will (re)start and shutdown processes it supervises. Let’s leave it at that for now (but feel free to read more about child specs in the docs).

So based on what we’ve read, since we’re giving just a module within the children variable, at some point PoolMan.child_spec([]) gets called. This is pretty suspicious, because we haven’t defined a child_spec/1 function in the pool manager module. Let’s once again turn to the console to investigate what the heck is going on:

iex(1)> PoolToy.PoolMan.child_spec([])
%{id: PoolToy.PoolMan, start: {PoolToy.PoolMan, :start_link, [[]]}}
iex(2)> PoolToy.PoolMan.child_spec([foo: :bar])
%{id: PoolToy.PoolMan, start: {PoolToy.PoolMan, :start_link, [[foo: :bar]]}}

Let’s take another look at our pool manager module (lib/pool_toy/pool_man.ex):

defmodule PoolToy.PoolMan do
  use GenServer

  @name __MODULE__

  def start_link(size) when is_integer(size) and size > 0 do
    GenServer.start_link(__MODULE__, size, name: @name)
  end

  def init(size) do
    state = List.duplicate(:worker, size)
    {:ok, state}
  end
end

There’s definitely no child_spec/1 defined in there, so where’s it coming from? Well the only candidate is line 2, since use can generate code within our module. Sure enough, the GenServer docs tell us what happened:

use GenServer also defines a child_spec/1 function, allowing the defined module to be put under a supervision tree.

Now that that’s cleared up, let’s look at the resulting child spec again:

iex(2)> PoolToy.PoolMan.child_spec([foo: :bar])
%{id: PoolToy.PoolMan, start: {PoolToy.PoolMan, :start_link, [[foo: :bar]]}}

We’ve got an id attribute that the supervisor uses to differentiate its children: parents do the same thing by giving their kids different names, right? (Besides George Foreman and his sons, I mean.) The other key in there is used to define how the child should be started: as the docs indicate, it is a tuple containing the module and function to call, as well as the arguments to pass in. This is often referred to as an MFA (i.e. Module, Function, Arguments). You’ll note the arguments are always wrapped within a single list. So in the example above where we want to pass a keyword list as an argument, it gets wrapped within another list in the MFA tuple.
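
You can reproduce this outside of the project with any module that uses GenServer. In the sketch below, ChildSpecSketch is a hypothetical module that defines no child_spec/1 of its own. We only look at the :id and :start keys, since the exact set of other keys in the generated map may vary between Elixir versions.

```elixir
defmodule ChildSpecSketch do
  use GenServer

  def start_link(arg), do: GenServer.start_link(__MODULE__, arg)

  def init(arg), do: {:ok, arg}
end

# child_spec/1 was generated for us by `use GenServer`.
default_spec = ChildSpecSketch.child_spec([])
custom_spec = ChildSpecSketch.child_spec(foo: :bar)

IO.inspect(default_spec.start)
# => {ChildSpecSketch, :start_link, [[]]}
IO.inspect(custom_spec.start)
# => {ChildSpecSketch, :start_link, [[foo: :bar]]}
```

Whatever we give to child_spec/1 ends up as the single element of the arguments list, which is exactly how [] found its way into start_link/1.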

Fixing the startup problem

Right, so after this scenic detour discussing child specs, let’s get back to our immediate problem:

  1. If we give just a module name to Supervisor.init/2, it will call PoolMan.child_spec([]);
  2. This will generate a child spec with a :start value of `{PoolToy.PoolMan, :start_link, [[]]}`;
  3. The supervisor will attempt to call PoolMan’s start_link function with [] as the argument;
  4. The process crashes, because PoolMan only defines a start_link/1 function expecting an integer.
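
The four steps above can be reproduced by hand. This is only a sketch: GuardedStartSketch is a hypothetical stand-in for PoolMan (it has the same guard on start_link/1), and apply/3 plays the role of the supervisor calling the MFA.

```elixir
defmodule GuardedStartSketch do
  # Hypothetical stand-in for PoolMan: same guard on start_link/1.
  use GenServer

  def start_link(size) when is_integer(size) and size > 0 do
    GenServer.start_link(__MODULE__, size)
  end

  def init(size), do: {:ok, List.duplicate(:worker, size)}
end

# Steps 1 and 2: the default child spec passes [] as the single argument...
%{start: {mod, fun, args}} = GuardedStartSketch.child_spec([])

# Steps 3 and 4: ...which no clause of start_link/1 accepts.
result =
  try do
    apply(mod, fun, args)
  rescue
    FunctionClauseError -> :no_matching_clause
  end

IO.inspect(result)
```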

To fix this issue, we need the child spec to be something like `%{id: PoolToy.PoolMan, start: {PoolToy.PoolMan, :start_link, [3]}}`, which would make the supervisor call PoolMan’s start_link/1 function with size 3, and we’d get a pool with 3 workers.

How can this be solved? One option is to override the child_spec/1 function defined by use GenServer, to return the child spec we want, for example (lib/pool_toy/pool_man.ex):

def child_spec(_) do
  %{
    id: @name,
    start: {__MODULE__, :start_link, [3]}
  }
end

That’ll work, but it’s not the best choice in our case: this isn’t flexible and is overkill for what we’re trying to achieve.

Another possibility is to customize the generated child_spec/1 function by altering the use GenServer statement on line 2 (lib/pool_toy/pool_man.ex):

use GenServer, start: {__MODULE__, :start_link, [3]}

Once again, not great: we want to be able to specify the pool size dynamically.

Referring back to the Supervisor.init/2 docs, we can see the other options we’ve got:

  • a map representing the child specification itself – as outlined in the “Child specification” section
  • a tuple with a module as first element and the start argument as second – such as {Stack, [:hello]}. In this case, Stack.child_spec([:hello]) is called to retrieve the child specification
  • a module – such as Stack. In this case, Stack.child_spec([]) is called to retrieve the child specification

Per the first bullet point, we could also directly provide the child spec to Supervisor.init/2 in lib/pool_toy/pool_sup.ex:

defmodule PoolToy.PoolSup do
  use Supervisor

  @name __MODULE__

  def start_link(args) when is_list(args) do
    Supervisor.start_link(__MODULE__, args, name: @name)
  end

  def init(args) do
    pool_size = args |> Keyword.fetch!(:size)
    children = [
      %{
        id: PoolToy.PoolMan,
        start: {PoolToy.PoolMan, :start_link, [pool_size]}
      }
    ]

    Supervisor.init(children, strategy: :one_for_all)
  end
end

On line 6 we’ve modified start_link to take a keyword list with our options (as recommended by José) and forward it to init/1. In there, we extract the size value on line 11, and use that in our handmade child spec on line 15. In IEx, we can now call PoolToy.PoolSup.start_link(size: 5) and have our pool supervisor start up, and start its pool manager child. So this version also works, but it seems writing our own child spec is a lot of extra work when we were almost there using just the module name…

If you look at the 2nd bullet point in the quoted docs above, you’ll find the simpler solution: use a tuple to specify the child spec. Just provide a tuple with the child module as the first element, and the startup args as the second element (lib/pool_toy/pool_sup.ex):

defmodule PoolToy.PoolSup do
  use Supervisor

  @name __MODULE__

  def start_link(args) when is_list(args) do
    Supervisor.start_link(__MODULE__, args, name: @name)
  end

  def init(args) do
    pool_size = args |> Keyword.fetch!(:size)
    children = [{PoolToy.PoolMan, pool_size}]

    Supervisor.init(children, strategy: :one_for_all)
  end
end

Back in our IEx shell, let’s make sure we didn’t break anything:

iex(1)> PoolToy.PoolSup.start_link(size: 5)
{:ok, #PID<0.121.0>}
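
If you would like to see what the tuple form expands to, here is a small sketch. TupleSpecSketch is a hypothetical stand-in for PoolMan (without the process name), so the snippet can be run on its own in any IEx session.

```elixir
defmodule TupleSpecSketch do
  # Hypothetical stand-in for PoolMan.
  use GenServer

  def start_link(size) when is_integer(size) and size > 0 do
    GenServer.start_link(__MODULE__, size)
  end

  def init(size), do: {:ok, List.duplicate(:worker, size)}
end

# Giving {TupleSpecSketch, 5} to a supervisor leads to this call:
spec = TupleSpecSketch.child_spec(5)
IO.inspect(spec.start)

# The same tuple, handed to an actual supervisor.
{:ok, sup} = Supervisor.start_link([{TupleSpecSketch, 5}], strategy: :one_for_all)
[{id, pid, :worker, _modules}] = Supervisor.which_children(sup)

# The child was started with size 5, so its state holds 5 worker atoms.
IO.inspect({id, :sys.get_state(pid)})
```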

Our changes so far

Another look at Observer

Let’s take a look in Observer with :observer.start(). Go to the “Processes” tab, find the Elixir.PoolToy.PoolMan process, and double-click it to open its process info window. If you now navigate to the “State” tab, you’ll see that we indeed have 5 worker atoms as the state: the size option was properly forwarded down to our pool manager from the pool supervisor! You can also see that there’s a parent pid on this screen: click it.

In the new window, you’ll be looking at the pool manager’s parent process: the window title indicates that it’s Elixir.PoolToy.PoolSup! So our pool supervisor did indeed start the pool manager as its child, as we intended. Finally, go to the “State” tab in this window for PoolSup and click on the “expand above term” link: you’ll see something like

{state,{local,'Elixir.PoolToy.PoolSup'},
       one_for_all,
       {['Elixir.PoolToy.PoolMan'],
        #{'Elixir.PoolToy.PoolMan' =>
              {child,<0.143.0>,'Elixir.PoolToy.PoolMan',
                     {'Elixir.PoolToy.PoolMan',start_link,[5]},
                     permanent,5000,worker,
                     ['Elixir.PoolToy.PoolMan']}}},
       undefined,3,5,[],0,'Elixir.PoolToy.PoolSup',
       [{size,5}]}

Once again, we’re poking our nose into something that we’re not really supposed to be aware of, but you can probably guess what a lot of information in there corresponds to: we’ve got a one_for_all restart strategy, a single child with pid <0.143.0> (followed essentially by Erlang’s representation of the child spec used to start this particular child).

Continue on to the next post to add the worker supervisor and convert our app into an OTP application.


Would you like to see more Elixir content like this? Sign up to my mailing list so I can gauge how much interest there is in this type of content.

Posted in Elixir | Comments Off on PoolToy: A (toy) process pool manager in Elixir 1.6 (part 1.2)

PoolToy: A (toy) process pool manager in Elixir 1.6 (part 1.1)

A new project

(This post is part of a series on writing a process pool manager in Elixir.)

Without further ado, let’s get started with PoolToy:

mix new pool_toy

Pools will contain worker processes that will actually be doing the work (as their name implies). In the real world, workers could be connections to databases, APIs, etc. for which we want to limit concurrency. In this example, we’ll pretend that doubling a number is expensive in terms of computing resources, so we want to avoid having too many computations happening at the same time.

Here’s our worker GenServer (lib/doubler.ex):

defmodule Doubler do
  use GenServer

  def start_link([] = args) do
    GenServer.start_link(__MODULE__, args, [])
  end

  defdelegate stop(pid), to: GenServer

  def compute(pid, n) when is_pid(pid) and is_integer(n) do
    GenServer.call(pid, {:compute, n})
  end

  def init([]) do
    {:ok, nil}
  end

  def handle_call({:compute, n}, _from, state) do
    IO.puts("Doubling #{n}")
    :timer.sleep(500)
    {:reply, 2 * n, state}
  end
end

Nothing fancy going on here: we’ve got a public API with start_link/1, stop/1, and compute/2 functions. As you can see from the return value in init/1, we’re not making use of state here, since our use case is so trivial. Finally, when handling a compute request on the server, we sleep for half a second to simulate expensive processing.
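
Here is a quick sketch of how a client would use this API. The module is repeated (unchanged) only so that the snippet runs on its own: inside the project, the calls at the bottom are all you need.

```elixir
defmodule Doubler do
  # Same module as above, repeated to keep this snippet self-contained.
  use GenServer

  def start_link([] = args), do: GenServer.start_link(__MODULE__, args, [])

  defdelegate stop(pid), to: GenServer

  def compute(pid, n) when is_pid(pid) and is_integer(n) do
    GenServer.call(pid, {:compute, n})
  end

  def init([]), do: {:ok, nil}

  def handle_call({:compute, n}, _from, state) do
    IO.puts("Doubling #{n}")
    :timer.sleep(500)
    {:reply, 2 * n, state}
  end
end

# Start a worker, hand it some work, then shut it down.
{:ok, worker} = Doubler.start_link([])

# Blocks for roughly half a second while the worker "computes".
result = Doubler.compute(worker, 21)
IO.inspect(result)

# GenServer.stop/1 is synchronous: the worker is gone once it returns.
:ok = Doubler.stop(worker)
IO.inspect(Process.alive?(worker))
```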

Also of note, the Doubler module is at the top level within the lib folder, and not within a pool_toy subfolder. This is because we really only have Doubler within PoolToy for convenience: if PoolToy became a fully-fledged library, the worker would be provided by the client code.

Our changes so far

Managing a single pool

So what’s our pool going to look like? Roughly (ok, exactly…) like this:

 

No need to praise me for my elite design skills, this is straight from the Observer tool, which we’ll get to know better later on.

At the top level, we’ve got PoolSup which is responsible for supervising an entire pool: after all, once we’ve got several pools being managed, if one pool crashes the others should keep chugging along. Below that, we’ve got PoolMan and WorkerSup. PoolMan is the pool manager: it’s the process we communicate with to interact with the pool (such as borrowing a worker to do some work, then returning it).

WorkerSup supervises the actual worker processes: if one goes down, a new worker will be started in its place. The unnamed workers will be instances of Doubler in our case. After all, we didn’t write that module for nothing…

This nice little gathering of processes is commonly referred to as a supervision tree: each black line essentially means “supervises” when read from left to right. You’ve probably heard of Erlang’s “let it crash” philosophy, and it goes hand in hand with the supervision tree: if something goes wrong within a process and we don’t know how to fix it or haven’t anticipated that failure, it’s best to simply kill the process and start a new one from a “known good” state.

After all, we do the same thing when our computers or phones start acting up, and no obvious actions seem to fix the issue: reboot it. Rebooting a device will let it start from a clean state, and fixes many problems. In fact, if you’ve ever had to provide some sort of IT support to friends or family, “Have you tried turning it off and on again?” is probably one of your trusty suggested remedies.

But why are we bothering with a worker supervisor and a pool manager? Couldn’t we combine them? While it’s definitely possible to do so technically, it’s a bad idea. In order to provide robust fault-tolerance, supervisors should essentially be near impossible to crash, which means they should have as little code as possible (since the probability of bugs will only increase with more code). Therefore, our design has a worker supervisor that only takes care of supervision (starting/stopping processes, etc.), while the manager is the brains of the operation (tracking which workers are available, which process is currently using a checked out worker, etc.).

The pool supervisor

Let’s write our pool supervisor (lib/pool_toy/pool_sup.ex):

defmodule PoolToy.PoolSup do
  use Supervisor

  @name __MODULE__

  def start_link() do
    Supervisor.start_link(__MODULE__, [], name: @name)
  end

  def init([]) do
    children = []

    Supervisor.init(children, strategy: :one_for_all)
  end
end

On line 2, we indicate this module will be a supervisor which will do a few things for us. Don’t worry about what it does for now, we’ll get back to it later.

We define a start_link/0 function on line 6 because, hey, it’s going to start the supervisor and link the calling process to it. Technically, the function could have been given any name, but start_link is conventional as it communicates clearly what the expected outcome is. You’ll note that we also give this supervisor process a name, to make it easier to locate later.

Our code so far

Naming processes

Since we expect there to be a single pool supervisor (at least for now), we can name the process to make it easier to find and work with.

Process names have to be unique, so we need to make sure we don’t accidentally provide a name that is already (or will be) in use. You know what else needs to be unique? Module names! This is one of the main reasons unique processes are named with their module name. As an added bonus, it makes it easy to know at a glance what the process is supposed to be doing (because we know what code it’s running).

But why have Supervisor.start_link(__MODULE__, [], name: @name) when Supervisor.start_link(__MODULE__, [], name: __MODULE__) would work just as well? Because the first argument is actually the current module’s name, while the name option could be anything (i.e. using the module name is just a matter of convention/convenience). By declaring a @name module attribute and using that as the option value, we’re free to have our module and process names change independently.
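
As a small illustration of the module name and the process name being independent, here is a sketch with a hypothetical NamedSketch module whose process is deliberately registered under a different name (:my_sketch_process, also made up for this example):

```elixir
defmodule NamedSketch do
  # Hypothetical module: the process name is deliberately not the module name.
  use GenServer

  @name :my_sketch_process

  def start_link(arg), do: GenServer.start_link(__MODULE__, arg, name: @name)

  def init(arg), do: {:ok, arg}
end

{:ok, pid} = NamedSketch.start_link(:some_state)

# The registered name resolves to the pid...
IO.inspect(Process.whereis(:my_sketch_process) == pid)

# ...while the module name isn't registered as a process name at all.
IO.inspect(Process.whereis(NamedSketch))

# Names must be unique: a second process can't claim the same one.
IO.inspect(NamedSketch.start_link(:other_state))
```

The last call returns an error tuple instead of starting a second process, which is the uniqueness constraint mentioned above in action.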

And now, back to our regularly scheduled programming

The supervisor behaviour requires an init/1 callback to be defined (see docs):

def init([]) do
  children = []

  Supervisor.init(children, strategy: :one_for_all)
end

At some point in the future, our supervisor will supervise some child processes (namely the “pool manager” and “worker supervisor” processes we introduced above). But for now, let’s keep our life simple and child-free ;-)

Finally, we call Supervisor.init/2 (docs) to properly set up the initialization information as the supervisor behaviour expects it. We provided :one_for_all as the supervision strategy. Supervision strategies dictate what a supervisor should do when a supervised process dies. In our case, if a child process dies, we want the supervisor to kill all other surviving supervised processes, before restarting them all.
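
If you’re curious about what Supervisor.init/2 hands back, you can call it directly in IEx. The sketch below only looks at a few keys of the returned flags, since the exact contents may vary between Elixir versions; the intensity and period values shown are the defaults for the max_restarts and max_seconds options.

```elixir
# Supervisor.init/2 is a plain function: it only builds the value that the
# init/1 callback is expected to return.
{:ok, {flags, children}} = Supervisor.init([], strategy: :one_for_all)

# The strategy we asked for.
IO.inspect(flags.strategy)

# Restart limits: at most `intensity` restarts within `period` seconds.
IO.inspect({flags.intensity, flags.period})

# No children yet.
IO.inspect(children)
```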

But is that really necessary, or is it overkill? After all, there are other supervision strategies we could use (only restarting the dead process, or restarting the processes that were started after it), why kill all supervised processes if a single one dies? Let’s take another look at our process tree and think it through:

As mentioned above, PoolMan will take care of the pool management (checking out a worker, keeping track of which workers are busy, etc.), while WorkerSup will supervise the workers (replacing dead ones with fresh instances, creating a new worker if the pool manager asks for it, etc.).

What happens if PoolMan dies? Well, we’ll lose all information about which workers are checked out (and busy) and which ones can be used by clients needing work to be done. So if PoolMan dies, we want to kill WorkerSup also, because then once WorkerSup gets restarted all of its children will be available (and therefore PoolMan will know all workers are available for use).

You might be worried about the poor clients that checked out workers to perform a task, and suddenly have that worker killed. The truth is, in the Erlang/Elixir world, you always have to think about processes dying, being unreachable, and so on: the worker process can die at any time and for any number of reasons (e.g. software bug, remote timeout), not necessarily because a supervisor killed it. In other words, the client process should have code to handle the worker dying appropriately. And of course, the client process could very well decide that “appropriately handling the death of a worker process” means “just crash”. We are in the Erlang world, after all :D

Ok, so we’ve determined that if PoolMan dies, we should bring down WorkerSup along with it. What about the other way around? What happens if WorkerSup dies? We’ll have no more workers, and the accounting done within PoolMan will be useless: the list of busy processes (referenced by their pid) will no longer match any existing pid, since the new workers will have been given new pids. So we’ll have to kill PoolMan to ensure it starts back up with an empty state (i.e. no worker processes registered as checked out).

Since we’ve concluded that in the event of a child process dying we should kill the other one, the correct supervision strategy here is :one_for_all.
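
To see this restart behaviour in action before we have real children, here is a minimal sketch. The two children are plain Agents with made-up names (:sketch_manager and :sketch_workers) standing in for PoolMan and WorkerSup; the short sleep is just a crude way to give the supervisor time to do its job.

```elixir
# Two hypothetical children standing in for PoolMan and WorkerSup.
children = [
  %{id: :manager, start: {Agent, :start_link, [fn -> :manager end, [name: :sketch_manager]]}},
  %{id: :workers, start: {Agent, :start_link, [fn -> :workers end, [name: :sketch_workers]]}}
]

{:ok, _sup} = Supervisor.start_link(children, strategy: :one_for_all)

old_manager = Process.whereis(:sketch_manager)
old_workers = Process.whereis(:sketch_workers)

# Kill only the manager...
Process.exit(old_manager, :kill)

# ...and give the supervisor a moment to restart its children.
Process.sleep(200)

new_manager = Process.whereis(:sketch_manager)
new_workers = Process.whereis(:sketch_workers)

# Compare the pids before and after: with :one_for_all, the sibling that
# was never killed directly gets replaced too.
IO.inspect(new_manager != old_manager)
IO.inspect(new_workers != old_workers)
```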

Poking around in Observer

Let’s start an IEx session and investigate what we’ve got so far. From within the pool_toy folder, run iex -S mix : in case you’ve forgotten, this will start an IEx session and run the mix script, so we’ll have our project available to use within IEx.

First, let’s start the pool supervisor with PoolToy.PoolSup.start_link(). Then, let’s start Erlang’s Observer with :observer.start() (note that autocompletion doesn’t work when calling Erlang modules, so you have to type the whole thing out): since Observer is an Erlang module, the syntax to call it is slightly different (because we use Erlang’s syntax). Here’s a quick (and very inadequate) primer on Erlang syntax: in Erlang, atoms are written in lower snake_case, while upper CamelCase tokens are variables (which cannot be rebound in Erlang). Whereas in Elixir module atom names are CamelCase while “normal” atoms are lower snake_case, in Erlang both are lower snake_case (i.e. a module name is an atom like any other).

To call an Erlang module’s function you join them with a colon. So in Erlang, foo:bar(MyVal) would execute the bar/1 function within the foo module with the MyVal variable as the argument. Finally, back in the Elixir world, we need to prepend the Erlang module’s name with a colon to make it an atom: the Observer module therefore gets referenced as :observer and :observer.start() will call the start/0 function within the Observer module. Whew!
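
A few lines you can paste into IEx to convince yourself of the above. This only uses the Erlang standard library (the lists module) and atoms, so nothing needs to be defined beforehand; note that PoolToy.PoolSup is used here purely as an atom, so the module doesn’t even have to be compiled.

```elixir
# In Erlang this would be written lists:reverse([1, 2, 3]).
IO.inspect(:lists.reverse([1, 2, 3]))

# Elixir module names are atoms too...
IO.inspect(is_atom(String))

# ...they just carry an "Elixir." prefix behind the scenes.
IO.inspect(Atom.to_string(PoolToy.PoolSup))
IO.inspect(PoolToy.PoolSup == :"Elixir.PoolToy.PoolSup")

# An Erlang module name is a plain lower-case atom.
IO.inspect(is_atom(:observer))
```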

Ok, so a new window should have popped up, containing the Erlang Observer:

If nothing came up, search the web for the error message you get: it’s likely you’re missing a dependency (e.g. wxWidgets).

Click on the “Processes” tab, then on the “Name or Initial Func” header to sort the list of processes, then scroll down to Elixir.PoolToy.PoolSup which is the supervisor process we just created (don’t forget that Elixir prefixes all module names with Elixir). Let’s now open the information window for that particular process by double clicking on it (or right-clicking and selecting the “Process info for <pid>” option):

Notice that I’ve switched to the last tab, because that’s all we’ll be looking at for now. We can see a few things of interest here: our supervisor implements the GenServer behaviour, it’s running, and its parent process is <0.115.0> (this pid could very well be different in your case). If you click on the pid, a new window will open, where you’ll find out that the parent is indeed the IEx session (in the “Process Information” tab, the “Current Function” value is Elixir.IEx.Evaluator:loop/1).

We can also see our supervisor has some sort of internal state: click the provided link in the window to see what the state contains. You’ll see something like

{state,{local,'Elixir.PoolToy.PoolSup'}, 
           one_for_all,{[],#{}},undefined,3,5,[],0,'Elixir.PoolToy.PoolSup',[]}

This is the supervisor’s internal state (as Erlang terms), so we’re kind of peeking behind the curtains here, but we can see that the state contains the :one_for_all strategy, and the name of the module where we’ve defined the supervisor behaviour. Don’t worry about the other stuff: we’re looking at the internal details of something we didn’t write, so it’s not really our business to know what all that stuff is. It’ll be much more meaningful later when we observe processes where we defined the state ourselves (because then what the state contains will make sense to us!).

Take a break, and join me in the next post where we’ll implement the pool manager and have our supervisor start it.



Posted in Elixir | Comments Off on PoolToy: A (toy) process pool manager in Elixir 1.6 (part 1.1)

PoolToy: A (toy) process pool manager in Elixir 1.6

Contents

This series of posts will guide you in writing an OTP application that will manage pools of processes. In particular, it will do so using the DynamicSupervisor (introduced in Elixir 1.6) and the Registry (available since Elixir 1.4).

There’s quite a bit of content I wanted to cover, and I tried to present the material in a way for readers to learn (and retain!) as much as possible given the time spent. One reason for the length is that we’ll be making mistakes on our journey. Improving a skill is about learning from your mistakes, and Elixir/OTP is no different: I want you to see how/why things don’t work out sometimes, and to understand why another design is better suited. In other words, I won’t show you the “best” implementation right away, but my hope is that you’ll be better off for it: you’ll be able to think critically about your own designs and implement corrections as necessary.

Another focus is on the Observer: everybody says it’s a great tool, but beyond that you’re most often left to your own devices. We’ll periodically use the Observer as we develop PoolToy to see what’s going on in our software and to help us diagnose and fix design problems.

Without further ado, here are the contents of this series:

  1. Managing a single pool
    • Part 1.1: create a new mix project, add a worker and pool supervisor
    • Part 1.2: implement the pool manager and have our pool supervisor start it
    • Part 1.3: add the (dynamic) worker supervisor and have our pool supervisor start it ; convert our code into an OTP application
    • Part 1.4: make the worker supervisor start workers when it initializes
    • Part 1.5: implement worker checkin and checkout
    • Part 1.6: monitor client processes
    • Part 1.7: use an ETS table to track client monitors
    • Part 1.8: handle worker deaths
    • Part 1.9: make workers temporary and have the pool manager in charge of restarting them
  2. Managing multiple pools
    • Part 2.1: remove static process names, provide the worker spec dynamically
    • Part 2.2: preparing for multiple pools with the pools supervisor
    • Part 2.3: starting multiple pools, using the Registry to locate pool supervisors and stop them
  3. Enhancing our pool capabilities (coming soon?)
    1. Allow overflow workers
    2. Cooldown period before terminating overflow workers
    3. Queueing client demand

Intro

This series of posts assumes that you’re familiar with Elixir syntax and basic concepts (such as GenServers and supervisors). But don’t worry: no need to be an expert, if you’ve read an introduction or two about Elixir you should be fine. If you haven’t read anything about Elixir yet, have a quick look at the excellent Getting started guide.

Processes and supervision are a core tenet of Elixir, Erlang, and the OTP ecosystem. In this series of posts, we’ll see how to create a process pool similar to Poolboy as this will give us ample opportunity to see how processes behave and how they can be handled within a supervision tree.

As a quick reminder, process pools can be used to:

  • limit concurrency (e.g. only N simultaneous connections to a DB, or M simultaneous requests to an API with rate limiting)
  • smooth occasional heavy activity bursts by queueing the excess demand
  • allocate more resources to more important parts of the system (e.g. payment processing gets a bigger pool than report creation)

Hopefully, you won’t have any trouble following along (if you struggle, please let me know!), but if you do, here’s a list of alternative learning resources that might give you a different perspective and get you “unstuck”:

  • a blog post on the service/worker pattern (the code is in Erlang, but the concepts fully apply to Elixir).
  • building an OTP application from Fred Hébert’s “Learn you some Erlang”. As you can guess, it’s in Erlang and the design is a bit different (e.g. no DynamicSupervisor in Erlang, so you must use a :simple_one_for_one strategy) but even so it’s definitely worth a read to give you a better understanding about process wrangling.
  • pooly, another simple process pool manager written in Elixir, but built on older concepts (it was written before the newer ones were introduced). Mainly: it uses a supervisor with a :simple_one_for_one strategy instead of a DynamicSupervisor, and doesn’t use Elixir’s registry to locate the pools.
  • the poolboy application itself (in Erlang), which is probably what you’d reach for if you need pool management in a production application

And if you’re the kind of person who prefers to just dive into the code, here it is.

Now that we’re situated, let’s get started in the next post.



Posted in Elixir | Comments Off on PoolToy: A (toy) process pool manager in Elixir 1.6

Data massaging in pipes with anonymous functions

This technique has been obsolete since Elixir v1.12, which introduced then/2.

The pipe operator is great. So great, in fact, that sometimes you just want to keep that pipe juice flowing and feel bad about breaking it up just to massage data into the appropriate shape before continuing with a new pipe.

Anonymous functions can help out here. But you need to realize that each step in the pipe is a function call. Therefore, to use an anonymous function within a pipe, you have to call it with my_func.():

def to_email(full_name, domain) do
  full_name
  |> String.split()
  |> Enum.join(".")
  |> (&"#{&1}@#{domain}").()
  |> String.downcase()
end

# iex> to_email("John Doe", "example.com")
# john.doe@example.com

On line 5, I simply wanted to create an email string without going through the fuss of writing a new function. Of course you can name the anonymous function, too:

def to_email(full_name, domain) do
    append_domain = &"#{&1}@#{domain}"

    full_name
    |> String.split()
    |> Enum.join(".")
    |> append_domain.()
    |> String.downcase()
end

As you’ve likely noticed, this technique is rife with potential for misuse and abuse: if you decide to use it, make sure your code’s readability doesn’t suffer! Otherwise, simply resort to the usual technique of grouping piped operations within larger functions (which have compatible output/input data shapes), and pipe those functions together.
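
For reference, here is what the same transformation looks like with then/2, for those on Elixir 1.12 or later. EmailSketch is just a hypothetical wrapper module so the snippet can be run as is.

```elixir
defmodule EmailSketch do
  # Same pipeline as above, with Kernel.then/2 (Elixir 1.12+) replacing
  # the explicit `.()` call on the anonymous function.
  def to_email(full_name, domain) do
    full_name
    |> String.split()
    |> Enum.join(".")
    |> then(&"#{&1}@#{domain}")
    |> String.downcase()
  end
end

IO.puts(EmailSketch.to_email("John Doe", "example.com"))
```

The anonymous function is passed to then/2 as a regular argument, so there is no longer any need to wrap it in parentheses and call it.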

Posted in Elixir, Pattern snippets | Comments Off on Data massaging in pipes with anonymous functions

Finer control in `with` failed matches

Elixir gives us the with construct to combine matching clauses, which is very handy as programs frequently need to perform an operation only if/when a set of preconditions is met.

Let’s say we want to charge a product to a credit card:

with {:ok, validated_card} <- validate_cc(card),
      {:ok, amount} <- apply_sales_tax(item_price, :base),
      {:ok, amount} <- apply_sales_tax(item_price, :luxury),
      {:ok, transaction_id} <- charge_card(validated_card, amount) do
  {:ok, %{id: transaction_id, card: validated_card, amount: amount}}
else
  _ -> {:error, "unable to charge credit card"}
end

We’ve got a few steps we need to succeed, and if any one of those steps fails, we return an error. Unfortunately, as our code stands, the calling code won’t be able to differentiate what caused the error. Maybe it would ask the customer to enter credit card details again if the card validation failed, or log a support ticket with the credit card processor if the card couldn’t be charged? Too bad: you can’t decide what to do without first knowing what went wrong.

But take a closer look at line 7: we’re pattern matching there just like we would in a case. So we should be able to pattern match on the returned errors, right? Unfortunately, all 3 functions called on lines 1-4 just return :error if they’re not successful. Foiled again!

But maybe, just maybe, we could tag these statements? That would allow us to determine exactly what step failed even in cases where we’re calling the same function several times (like on lines 2-3). Let’s give it a go:

with {:validation, {:ok, validated_card}} <- {:validation, validate_cc(card)},
      {:base_tax, {:ok, amount}} <- {:base_tax, apply_sales_tax(item_price, :base)},
      {:luxury_tax, {:ok, amount}} <- {:luxury_tax, apply_sales_tax(item_price, :luxury)},
      {:transaction, {:ok, transaction_id}} <- {:transaction, charge_card(validated_card, amount)} do
  {:ok, %{id: transaction_id, card: validated_card, amount: amount}}
else
  {:validation, _} -> {:error, :invalid_card}
  {:base_tax, _} -> {:error, :base_tax_failed}
  {:luxury_tax, _} -> {:error, :luxury_tax_failed}
  {:transaction, _} -> {:error, :transaction_failed}
end

Now, the calling code can know exactly which step failed and respond appropriately. Or if it doesn’t care, it can simply match on {:error, _} and handle all failures the same way. And of course, it could handle only a subset of the errors (e.g. invalid cards) and just use the _ match to have a catch-all case for handling other errors. The point is: the power is in the caller’s hands, where it should be.
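
Here is a small runnable sketch of what such calling code could look like. Everything in it is hypothetical: ChargeSketch, its stubbed validate_cc/1 and charge_card/2 (which, like the functions discussed above, return a bare :error on failure), and the messages returned by checkout/2.

```elixir
defmodule ChargeSketch do
  # Hypothetical stubs: they return a bare :error when unsuccessful.
  defp validate_cc(%{number: n}) when is_binary(n) and byte_size(n) == 16 do
    {:ok, %{number: n}}
  end

  defp validate_cc(_card), do: :error

  defp charge_card(_card, amount) when amount <= 1_000, do: {:ok, "tx-1"}
  defp charge_card(_card, _amount), do: :error

  # Tagging each step tells us which one failed.
  def charge(card, amount) do
    with {:validation, {:ok, valid}} <- {:validation, validate_cc(card)},
         {:transaction, {:ok, id}} <- {:transaction, charge_card(valid, amount)} do
      {:ok, %{id: id, amount: amount}}
    else
      {:validation, _} -> {:error, :invalid_card}
      {:transaction, _} -> {:error, :transaction_failed}
    end
  end

  # The caller handles one failure specifically, and the rest generically.
  def checkout(card, amount) do
    case charge(card, amount) do
      {:ok, %{id: id}} -> "charged (#{id})"
      {:error, :invalid_card} -> "please re-enter your card details"
      {:error, _other} -> "something went wrong, support has been notified"
    end
  end
end

IO.puts(ChargeSketch.checkout(%{number: "4242424242424242"}, 50))
IO.puts(ChargeSketch.checkout(%{number: "123"}, 50))
IO.puts(ChargeSketch.checkout(%{number: "4242424242424242"}, 5_000))
```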

Don’t forget that you can choose to tag some result values, but not others. You could also tag several lines with the same atom:

with {:validation, {:ok, validated_card}} <- {:validation, validate_cc(card)},
      {:tax, {:ok, amount}} <- {:tax, apply_sales_tax(item_price, :base)},
      {:tax, {:ok, amount}} <- {:tax, apply_sales_tax(item_price, :luxury)},
      {:ok, transaction_id} <- charge_card(validated_card, amount) do
  {:ok, %{id: transaction_id, card: validated_card, amount: amount}}
else
  {:validation, _} -> {:error, :invalid_card}
  {:tax, _} -> {:error, :tax_failed}
  _ -> {:error, :unknown}
end

I hope you’ll find this concept of tagging useful. In fact, you might come across it in other places, such as in Ecto.Multi.

Posted in Elixir, Pattern snippets | Comments Off on Finer control in `with` failed matches

Pattern matching in function heads: don’t go overboard

Elixir’s pattern matching is great, but make sure it’s not reducing your code’s readability. In particular, just because you can match all the variables you’ll need from within the function’s head doesn’t mean you should.

Take this code:

  def handle_call(:checkout, %{workers: [h | t], monitors: monitors}) do
    # ...
  end

  def handle_call(:checkout, %{workers: [], idle_overflow: [h | t]}) do
    # ...
  end

  def handle_call(:checkout, %{workers: [], idle_overflow: [],
      overflow: overflow, overflow_max: max, worker_sup: sup,
      spec: spec, monitors: monitors})
      when overflow < max do
    # ...
  end

  def handle_call(:checkout, %{workers: [], idle_overflow: [],
      overflow: overflow, overflow_max: max, waiting: waiting}) do
    # ...
  end

A lot of matching is going on there, but how much of it is to differentiate the function heads, and how much is convenience (i.e. binding for later use)?

By leaving only the matches that differentiate the heads and moving other bindings to the function bodies, readability can be improved (although, sadly, this is poorly demonstrated due to the line wrapping required by this blog format):

  def handle_call(:checkout, %{workers: [h | t]} = state) do
    %{monitors: monitors} = state
    # ...
  end

  def handle_call(:checkout, %{workers: [], idle_overflow: [h | t]}) do
    # ...
  end

  def handle_call(:checkout, %{workers: [], idle_overflow: [],
      overflow: overflow, overflow_max: max} = state) when overflow < max do
    %{worker_sup: sup, spec: spec, monitors: monitors} = state
    # ...
  end

  def handle_call(:checkout, state) do
    %{workers: [], idle_overflow: [], overflow: overflow,
        overflow_max: max, waiting: waiting} = state
    # ...
  end

Of course, I’m not saying that you should always split matching/binding between function head and body. But if you find yourself binding so many variables in your function heads that your code’s readability suffers, take a good look at them. The bindings that aren’t discriminating can most likely be moved to the function’s body, simplifying your code.

Update June 28th, 2018: It turns out José also uses this technique:

I tend to use this rule: if the key is necessary when matching the pattern, keep it in the pattern, otherwise move it to the body. So I end-up with code like this:

def some_fun(%{field1: :value} = struct) do
  %{field2: value2, field3: value3} = struct
  ...
end
Posted in Elixir, Pattern snippets | Comments Off on Pattern matching in function heads: don’t go overboard

Keyword list reduction

Keyword lists are often used to provide optional values, which are then processed (for example to initialize the state). One really nice way to do so is with a reduce pattern.

I’ve never seen anyone draw attention to it in the Elixir training materials I’ve come across (although I did see it mentioned in this thread). Given how prevalent keyword list processing is for option handling, and how readable the resulting code is, I thought I’d mention it here.

Here’s, the code:

defmodule ReducerDemo do
  @options [:name, :timeout]

  def init([type, count | opts]) when is_atom(type) and is_integer(count) and count > 0 do
    init(opts, %{type: type, count: count})
  end

  defp init([], state), do: {:ok, state}

  defp init([{:timeout, _} | _rest], %{type: :immediate}) do
    {:stop, {:conflicting_option, "cannot use `timeout` option with type :immediate"}}
  end

  defp init([{:timeout, t} | rest], state) when is_integer(t) and t > 0 do
    init(rest, Map.put(state, :timeout, t))
  end

  defp init([{:name, n} | rest], state) when is_atom(n) do
    init(rest, Map.put(state, :name, n))
  end

  defp init([{name, value} | _], _state) when name in @options do
    {:stop, {:invalid_option, "invalid value `#{value}` given to option `#{name}`"}}
  end

  defp init([{name, _value} | _], _state) do
    {:stop, {:invalid_option, "`#{name}` is not a valid option"}}
  end
end

Let’s say init/1 is a callback for a GenServer: it should return {:ok, state} or {:stop, reason} (see docs). Within the list, it expects 2 mandatory arguments, along with an optional list of options. (Note that all of these are provided within a list due to the callback’s arity.)

In the public init/1 function head, we typecheck the mandatory arguments. Then, within its body, we call a private init/2 function with the list of options and the initial state containing the mandatory values.

On line 8, we have the actual return taking place: if no options are left to process, just return the current state we’ve built up.

Let’s ignore lines 10-12 for now, and swing back to them later.

Lines 14-20 represent the happy path: match on an option (with guards ensuring its value is of the expected type), add it to the state, and recursively call init/2. Recall that a keyword list such as [a: 1, b: 2] is just nicer syntax for the real representation, which is a list of tuples: [{:a, 1}, {:b, 2}] (read more about that here). In order to pattern match, we need to rely on the tuple representation.
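To make the tuple representation concrete, here is a small sketch (the option names are simply the ones used in this post) showing that both syntaxes build the same list, and what matching on its head binds:

```elixir
# The keyword syntax and the tuple syntax build the same list.
opts = [timeout: 500, name: :foobar]
true = opts == [{:timeout, 500}, {:name, :foobar}]

# Matching the head of the list therefore means matching a 2-element tuple.
[{key, value} | rest] = opts
IO.inspect({key, value, rest})
# prints: {:timeout, 500, [name: :foobar]}
```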

Finally, we get to error handling. The first type of error we handle is conflicting options, on lines 10-12: if we discover an option that is incompatible with one already provided/processed, we return an error. Error clauses return directly: there is no further processing of remaining options since we don’t recursively call init/2. The {:stop, reason} tuple is returned only because this example deals with initializing state for a GenServer, and that is the expected error return value (see docs). Of course, your own code should return whatever makes sense in the current context.

We’ve got 2 more cases to handle: we were provided a valid option with an invalid value, or we were given an invalid option.

The first case is handled with the function on line 22: if the option name is in the whitelisted options (defined as a module attribute on line 2), then its value is incorrect.

If, on the other hand, an unrecognized option was provided, the function head on line 26 will be matched and return the appropriate message.

Let’s give it a spin:

iex(1)> c "reducer_demo.ex"
iex(2)> ReducerDemo.init([:delayed, 3, name: :foobar, timeout: 500])
{:ok, %{count: 3, name: :foobar, timeout: 500, type: :delayed}}
iex(3)> ReducerDemo.init([:delayed, 3, timeout: 500]) 
{:ok, %{count: 3, timeout: 500, type: :delayed}}
iex(4)> ReducerDemo.init([:delayed, 3]) 
{:ok, %{count: 3, type: :delayed}}
iex(5)> ReducerDemo.init([:immediate, 3, timeout: 500])
{:stop, {:conflicting_option, "cannot use `timeout` option with type :immediate"}}
iex(6)> ReducerDemo.init([:delayed, 3, nane: :foobar])
{:stop, {:invalid_option, "`nane` is not a valid option"}}
iex(7)> ReducerDemo.init([:delayed, 3, name: "foobar"])
{:stop, {:invalid_option, "invalid value `foobar` given to option `name`"}}

Finally, astute readers will have noticed that the “reason” tagged tuples on lines 23 and 27 have the same tag and just differ in their message. This was an arbitrary API choice: in this case, it would be expected that clients only match on the tag (therefore being unable to determine whether the error was caused by an invalid option name or value). In other words, the string message isn’t intended for flow control, it’s there for the developer: if an invalid option pops up, the message will tell you if you’ve mistyped an option name or if the value was invalid.
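For comparison, the same walk over the options can also be written with Enum.reduce_while/3 from the standard library instead of explicit recursion. The following is only a sketch under assumptions: the ReduceWhileDemo module, its build/2 function, and the error message are hypothetical and not part of the code above, and only the :timeout and :name options are handled. The case expression at the end shows a caller matching on the tag alone.

```elixir
defmodule ReduceWhileDemo do
  # Hypothetical variant of the option walk, written with
  # Enum.reduce_while/3 instead of explicit recursion.
  def build(opts, state) do
    Enum.reduce_while(opts, {:ok, state}, fn option, {:ok, acc} ->
      case apply_option(option, acc) do
        # Keep going with the updated state.
        {:ok, new_acc} -> {:cont, {:ok, new_acc}}
        # Stop at the first error; remaining options are not processed.
        {:stop, _reason} = error -> {:halt, error}
      end
    end)
  end

  defp apply_option({:timeout, t}, state) when is_integer(t) and t > 0 do
    {:ok, Map.put(state, :timeout, t)}
  end

  defp apply_option({:name, n}, state) when is_atom(n) do
    {:ok, Map.put(state, :name, n)}
  end

  defp apply_option({name, _value}, _state) do
    {:stop, {:invalid_option, "`#{name}` could not be applied"}}
  end
end

# A caller matches on the tag only; the message is meant for the developer.
case ReduceWhileDemo.build([timeout: 500, nane: :foobar], %{type: :delayed}) do
  {:ok, state} -> IO.inspect(state)
  {:stop, {:invalid_option, message}} -> IO.puts("bad option: " <> message)
end
```

Whether the recursive or the reduce_while form reads better is largely a matter of taste: the recursive version keeps each case in its own function head, while this one separates "apply one option" from "walk the list".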
