Introduction §
I’ve been using Nix and NixOS for many years now,
both personally and professionally, on workstations and servers.
While the tools aren’t without their warts, I strongly believe the
model espoused by Nix for package management is a leap ahead of other
available tools. However, the Nix language can be very unstructured,
and knowing how to use it in an effective and composable way can
involve a lot of searching and deep-diving through nixpkgs
. This
series aims to collect patterns and good practices I encounter or
devise, significantly for my own future benefit, but hopefully other
people will find it helpful too.
In this entry, I aim to document a pattern for defining services in NixOS that I’ve found useful for separating concerns and producing reüsable code.
The problem §
Service configuration often mixes together service inputs and outputs. Inputs are things that the user cares about configuring — things that change the user-visible behaviour of the service. Outputs are things the user doesn’t care about, except to make sure that they are passed around correctly — examples include socket paths or database names.
Nix teaches us that only the inputs to a build process should be user-provided. Outputs (such as installation paths) can be generated by the process itself; the user shouldn’t have to care about them. All we need to do is make sure the outputs can be passed effectively from the process to its caller.
But NixOS modules often forget this. For example, when setting up a NGINX configuration to proxy to a backend application, we typically see:
services.backend-service.address = "address";
services.nginx
.virtualHosts."host"
.locations."/location"
.proxyPass = "address";
There are two problems here. Firstly, it is impossible to set up
multiple instances of the backend-service
. This makes sense for
some services, for which it is meaningless to have more than one per
machine, but in general we’d like to be able to have as many instances
as we need. The second issue is that the administrator has to deal
with the address "address"
and keep it consistent between output
(our backend-service
) and input (NGINX).
Both of these problems have the same root cause: both
services.backend-service
and "address"
are essentially global
variables. "address"
is particularly bad, as it must be manually
plumbed around by hand, and mismatches will produce a configuration
that will activate correctly, but whose resulting system components
will be unable to connect to each other.
User-defined instances §
The first part of the problem can be solved by letting the user define their own names for their service instances. This takes the form of expecting an attrset (with user-defined keys) as configuration for our service module, instead of just a single instance’s configuration. At this juncture we could actually use a list, as the key names are largely useless, but having a human-readable name to attach to things belonging to the instance helps significantly with debugging. In the next section we will want to refer back to the name, and it is much less fragile to do this with a user-chosen name than an integer list index.
services.backend-service.instance1 = {
enable = true;
address = "address1";
};
services.backend-service.instance2 = {
enable = true;
address = "address2";
};
services.nginx
.virtualHosts."host"
.locations."/instance1"
.proxyPass = "address1";
services.nginx
.virtualHosts."host"
.locations."/instance2"
.proxyPass = "address2";
The module interface to allow this kind of usage looks something like:
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
{ lib, config }:
let
instance = name: cfg: lib.mkIf cfg.enable {
# per-instance system configuration here, e.g.:
systemd.services."backend-service.${name}" = {
# …
serviceConfig.ExecStart = ''
${cfg.package}/bin/backend-service \
--address "${cfg.address}"
'';
# …
};
# …
};
in
{
options.services.backend-service = mkOption {
description = ''
Named instances of backend-service to run.
'';
type = types.attrsOf (types.submodule {
options = {
# instance configuration options here, e.g.
enable = lib.mkEnableOption { … };
package = lib.mkOption { … };
address = lib.mkOption { … };
# …
};
});
};
# `lib.mergeAttrsList` combines all our instance configs into a
# larger config object by merging
config = lib.mergeAttrsList
# `lib.mapAttrsToList` applies our `instance` function to each of
# the instance configuration attrsets and returns the result as a
# list of attrsets
(lib.mapAttrsToList
instance
config.services.backend-service);
}
… at least, that’s what we’d like to write. Unfortunately, since
attrset keys in Nix are strict, this will cause an infinite recursion
when Nix attempts to evaluate the keys of config
.
Instead, we need to expressly restrict the keys that can appear in the output:
34
35
36
37
38
39
40
config = let
c = lib.mapAttrsToList
instance
config.services.gunicorn;
in {
systemd = lib.mkMerge (lib.catAttrs "systemd" c);
};
This helps the first problem, but it’s still unpleasant to manually
have to plumb around "address1"
and "address2"
, when we really
don’t care about it. One can work around this problem by having
backend-service
understand NGINX, and indeed this is a pattern used
in several places in nixpkgs
, but it’s rather clumsy to push
knowledge of the outer proxy into the backend service.
Passing data back out of instances §
A helpful insight is that modules can update their own configuration attrset as well as read from it, a fact that we can use to implement a sort of out-param pattern. This example assumes we’re using UNIX sockets; finding a fresh TCP port is much harder. As John Day notes in Patterns in Network Architecture, the system of TCP port numbers essentially commits the sin we note here on a much larger scale.
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
options.services.backend-service = mkOption {
default = { };
description = ''
Named instances of backend-service to run.
'';
type = types.attrsOf (types.submodule ({ name, ... }: {
options = {
enable = lib.mkEnableOption { … };
package = lib.mkOption { … };
address = lib.mkOption {
description = "Read-only attribute!";
…
};
};
config.address = "/var/lib/backend-service/${name}.sock";
}));
};
Here we use the
parameterized
submodule functionality to generate the socket
attribute locally,
guaranteeing that the keys of our submodules can only depend on their
names, not other keys.
With this, our end-user can use
config.services.backend-service."some-identifier".address
to refer
to the address of the service they’ve defined, without having to
manually devise that address and plumb it through:
services.backend-service.instance1.enable = true;
services.backend-service.instance2.enable = true;
services.nginx
.virtualHosts."host"
.locations."/instance1"
.proxyPass
= config.services.backend-service.instance1.address;
services.nginx
.virtualHosts."host"
.locations."/instance2"
.proxyPass
= config.services.backend-service.instance2.address;
We do unfortunately still have to use the instance1
and instance2
names, despite them only being used in one place, but this is a much
better scenario: an illegal instance name will be caught at
configuration-build time, and not result in a broken system.
Imagining a better solution §
An even better way to write this might be to use something like algebraic effects to allow us to write:
services.nginx
.virtualHosts."host"
.locations."/instance"
.proxyPass
= mk-backend-service { … };
where mk-backend-service
is an effectful function that both
registers the service configuration in the global system configuration
attrset and returns the address of the registered service directly to
its caller (i.e. a
state
effect). This way we can avoid naming the expression if we like, and
if we do want to refer to it multiple times we can re-use the Nix
identifier binding system, e.g.
let
instance1 = mk-backend-service { … };
in {
services.nginx
.virtualHosts."host1"
.locations."/instance1"
.proxyPass
= instance1;
services.nginx
.virtualHosts."host2"
.locations."/instance1"
.proxyPass
= instance1;
}
This is not currently supported by the Nix language, though, and the encoding of it (using a manual CPS transform to pass the remainder of the configuration to the function) is probably too clumsy to be worthwhile.