Architecture
This page explains how vpn-confinement implements fail-closed VPN confinement
for selected systemd services.
At a glance
- The trust boundary is the namespace, not the individual service.
- The WireGuard interface lives inside that namespace while host networking stays unchanged for non-confined services.
- Namespace-local nftables and resolver files enforce the policy surface that confined services see.
- systemd `BindsTo=` relationships make namespace or tunnel loss propagate to dependent services and sockets.
Use this page to understand the moving parts and lifecycle. Read
Threat Model for explicit guarantees and non-goals, and
Generated Options Reference when tuning a
real deployment.
Detailed design
- Services opt in with `systemd.services.<name>.vpn.enable = true`.
- Socket units opt in with `systemd.sockets.<name>.vpn.enable = true`.
- In the common path, each vpn-enabled service or socket sets `vpn.namespace` explicitly.
- Per-service behavior config is limited to namespace attachment and hardening; network policy is namespace-level.
- `services.vpnConfinement.namespaces.<name>.securityProfile` provides a small, opinionated top-level selector for stronger defaults and assertions.
- Confinement uses a dedicated Linux network namespace at `/run/netns/<name>`.
- WireGuard is configured via `networking.wireguard.interfaces.<if>` and assigned with `interfaceNamespace`.
- The module can also set WireGuard `socketNamespace` for advanced cases, but the recommended path is to leave it unset or use `"init"`.
- Hostname WireGuard endpoints are treated as an explicit advanced opt-in with `wireguard.allowHostnameEndpoints = true`.
- `interfaceNamespace` is the main mechanism: the WireGuard link itself is kept inside the confinement namespace.
- `socketNamespace` controls only the UDP socket birthplace and should be viewed as an advanced escape hatch, not the primary design surface.
- `wireguard-<if>.service` explicitly requires and orders after the namespace preparation unit and also binds to it for fail-closed teardown.
- Namespace-local nftables enforces deny-by-default egress and allows only tunnel traffic according to namespace egress mode.
- Store-generated resolver files are bind-mounted directly into confined units.
- DNS policy is namespace-scoped and controlled by `services.vpnConfinement.namespaces.<name>.dns.mode`.
- In `dns.mode = "strict"`, DNS policy blocks non-allowlisted DNS-like traffic on ports `53`, `853`, `5353`, and `5355` before generic tunnel egress allow.
- `dns.mode = "strict"` is about common resolver leak resistance.
- `securityProfile = "highAssurance"` defaults `egress.mode = "allowList"` and rejects weaker compatibility paths such as hostname endpoints or host resolver IPC.
- `securityProfile = "highAssurance"` also rejects inline `networking.wireguard.interfaces.<if>.privateKey`; use `privateKeyFile` or `generatePrivateKeyFile` instead.
- `securityProfile = "highAssurance"` requires non-empty `egress.allowedCidrs` so outbound policy remains destination-constrained.
- In allowlist mode, narrow ICMP / ICMPv6 error traffic is still permitted for PMTU and control-plane reliability when destination CIDRs are configured.
- Strict mode also bind-mounts namespace `resolv.conf` and `nsswitch.conf` (`hosts: files myhostname dns`) into confined services while hiding resolver helper paths.
- In `highAssurance`, vpn-enabled services must run as non-root by default (`DynamicUser = true` or explicit non-root `User`) unless explicitly opted out per service.
- `dns.allowHostResolverIPC = false` (default) blocks common host resolver helpers (`/run/nscd` and system D-Bus sockets) in strict mode; setting it to `true` opts out of those helper blocks.
- `dns.mode = "compat"` is a weaker compatibility path that skips strict DNS containment entirely.
- Egress policy is explicit:
  - `egress.mode = "allowAllTunnel"`: allow all tunnel egress (after DNS policy).
  - `egress.mode = "allowList"`: allow only configured ports/CIDRs.
- nftables rules use named sets for DNS servers, blocked DNS ports, allowed ports, and allowed CIDRs so the policy stays auditable as the ruleset grows.
- IPv6 defaults to fail-closed (`services.vpnConfinement.namespaces.<name>.ipv6.mode = "disable"`).
- Namespace lifecycle is on-demand through `vpn-confinement-netns@<name>.service` and cleaned up when unneeded.
- Namespace setup validates the generated nftables rules before applying them and uses shell traps to clean up partial state on failed starts.
- Confined services bind to both the namespace unit and the WireGuard unit so namespace teardown propagates cleanly.
- Optional `vpn.restrictBind = true` derives `SocketBindAllow`/`SocketBindDeny` from namespace ingress policy as defense in depth for service-created listeners when ingress ports are declared.
- `publishToHost.tcp` is the common-path host ingress abstraction and maps to host-link based ingress behavior.
- Effective host-link addresses are exported under `services.vpnConfinement.namespaces.<name>.derived.hostLink.*`.
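As a rough illustration of how these options fit together, the following Nix sketch confines one service to a namespace under the `highAssurance` profile. The namespace name `vpn`, the service name `myapp`, the assumption that `vpn.namespace` takes the namespace name as a string, and the CIDR value are all hypothetical. The option that links a namespace to its WireGuard interface is not shown, so consult the Generated Options Reference for exact names, types, and defaults.

```nix
{ ... }:
{
  # Namespace-level policy. Names and values are illustrative assumptions.
  services.vpnConfinement.namespaces.vpn = {
    securityProfile = "highAssurance"; # defaults egress.mode to "allowList"
    dns.mode = "strict";               # block non-allowlisted DNS-like traffic
    dns.allowHostResolverIPC = false;  # documented default, shown for clarity
    ipv6.mode = "disable";             # documented default: IPv6 fails closed
    # highAssurance requires a non-empty destination allowlist.
    egress.allowedCidrs = [ "203.0.113.0/24" ]; # hypothetical range
  };

  systemd.services.myapp = {
    # Per-service config is limited to attachment and hardening.
    vpn = {
      enable = true;
      namespace = "vpn"; # assumed to take the namespace name
    };
    # highAssurance expects a non-root service by default.
    serviceConfig.DynamicUser = true;
  };
}
```

Network policy stays on the namespace, so adding a second confined service only needs another `vpn.enable` / `vpn.namespace` pair.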
Security model
- Host network remains unchanged unless a service explicitly enables VPN confinement.
- The trust boundary is the namespace, not the individual service.
- Confined services fail closed if tunnel dependencies are required and unavailable.
- Runtime tunnel drops are propagated to vpn-enabled services and sockets with `BindsTo=wireguard-<if>.service`.
- Namespace teardown is propagated through `BindsTo=vpn-confinement-netns@...` on services, sockets, and generated WireGuard dependency units.
- DNS leakage is reduced by namespace resolver pinning and blocked DNS-like ports.
- Literal WireGuard peer endpoints are preferred.
- Hostname endpoints are permitted only with explicit opt-in and endpoint refresh enabled, and remain outside the module’s strict DNS guarantee.
- Direct resolver API use over D-Bus is outside the strict DNS guarantee unless `dns.allowHostResolverIPC = false` (or equivalent unit-local restrictions).
- Bind restrictions are supplemental hardening only; nftables remains the primary policy mechanism.
- Advanced knobs such as `wireguard.socketNamespace` and manual `hostLink.*` tuning are optional escape hatches, not the primary deployment path.
- vpn-enabled services and sockets must not manually set namespace attachment controls that conflict with module-managed `NetworkNamespacePath` behavior.
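The propagation rules above correspond roughly to the unit settings sketched below. This is a hand-written approximation using standard NixOS systemd options, not the module's generated output. The names `myapp`, `vpn`, and `wg0` are hypothetical, and the `after` ordering is an assumption. It is shown only to make the dependency shape concrete; a vpn-enabled unit should leave these settings to the module.

```nix
{ ... }:
{
  # Illustration only: approximately what the module arranges for a
  # confined service. Do not combine this with vpn.enable = true.
  systemd.services.myapp = {
    # BindsTo= stops this unit when a listed unit stops.
    bindsTo = [
      "vpn-confinement-netns@vpn.service"
      "wireguard-wg0.service"
    ];
    # Assumed ordering so the namespace and tunnel exist before start.
    after = [
      "vpn-confinement-netns@vpn.service"
      "wireguard-wg0.service"
    ];
    # Attach the service to the confinement namespace.
    serviceConfig.NetworkNamespacePath = "/run/netns/vpn";
  };
}
```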
Socket activation pattern
- Recommended default for host-facing services: leave `.socket` in the host namespace and run `.service` in the VPN namespace.
- This preserves host listener behavior while confining service-originated outbound traffic.
- Use socket namespace attachment only when the listening socket itself must be inside the VPN namespace.
For a practical host reverse-proxy pattern, see
Reverse Proxy.
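A minimal sketch of the recommended split, assuming a hypothetical unit pair named `myapp`, a namespace named `vpn`, and a loopback listen address chosen only for illustration:

```nix
{ ... }:
{
  # The listener stays in the host namespace: no vpn.enable on the socket.
  systemd.sockets.myapp = {
    wantedBy = [ "sockets.target" ];
    listenStreams = [ "127.0.0.1:8080" ]; # hypothetical host-side address
  };

  # The service that handles connections runs inside the VPN namespace,
  # so its own outbound traffic is confined.
  systemd.services.myapp.vpn = {
    enable = true;
    namespace = "vpn"; # assumed to take the namespace name
  };
}
```

The socket is created in the host namespace and handed to the service at activation, which is why inbound connections keep working while service-originated traffic leaves through the tunnel.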
Limitations
- Generic HTTPS-based DoH on port 443 is not reliably detectable with simple port-based policy.
- DNS-over-HTTPS/DNS-over-QUIC can still traverse generic egress paths unless destination allowlisting is enabled.
- `networking.wireguard.interfaces.<if>.fwMark` and `.mtu` remain upstream WireGuard controls; this module documents them but does not build policy around them.
- This module supports WireGuard integration through `networking.wireguard.interfaces` only.
- Endpoint pinning is implemented with `wireguard.endpointPinning.enable = true`.
- Endpoint pinning applies in the effective WireGuard socket birthplace namespace (`init` by default, or custom `wireguard.socketNamespace` when configured).
- Endpoint pinning requires literal peer endpoints and applies nftables policy keyed by WireGuard fwmark in that birthplace namespace.
- For strict environments, combine confinement with application policy and egress inspection.
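For orientation, an endpoint-pinning configuration might look like the sketch below. It assumes that `wireguard.endpointPinning.enable` lives under `services.vpnConfinement.namespaces.<name>`, and that `fwMark` needs to be set explicitly for the pinning policy to key on; neither is confirmed here. The interface name, key path, mark value, and peer details are placeholders.

```nix
{ ... }:
{
  # Assumed location of the module's pinning switch.
  services.vpnConfinement.namespaces.vpn.wireguard.endpointPinning.enable = true;

  networking.wireguard.interfaces.wg0 = {
    privateKeyFile = "/run/secrets/wg0.key"; # hypothetical path
    # Assumption: an explicit mark is set so pinning policy can match on it.
    fwMark = "0x6e6978";
    peers = [
      {
        publicKey = "<peer public key>";   # placeholder
        # Pinning requires a literal IP endpoint, not a hostname.
        endpoint = "198.51.100.10:51820";  # hypothetical address
        allowedIPs = [ "0.0.0.0/0" ];
      }
    ];
  };
}
```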
Why netns over policy routing
- The module is designed for “only selected services use VPN”. A dedicated namespace is a cleaner trust boundary than host-global policy-routing rules.
- The WireGuard interface is moved into the namespace, so confined cleartext traffic lives inside that namespace boundary.
- This model minimizes accidental clearnet fallback paths for confined units and keeps host networking behavior unchanged for non-confined services.
Compatibility baseline
- Supported baseline: NixOS unstable.
- This assumes modern systemd features required by this module, including `NetworkNamespacePath=` and `RestrictNetworkInterfaces=`.
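If you want an explicit guard in your own configuration, a standard NixOS assertion can check the systemd version. The threshold below comes from the systemd documentation (`NetworkNamespacePath=` was added in version 242 and `RestrictNetworkInterfaces=` in version 250), not from this module, whose actual minimum may differ.

```nix
{ config, lib, ... }:
{
  assertions = [
    {
      # 250 is the newer of the two feature introductions named above.
      assertion = lib.versionAtLeast config.systemd.package.version "250";
      message = "systemd >= 250 is assumed for RestrictNetworkInterfaces=.";
    }
  ];
}
```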
Documentation map
- Threat and guarantee boundaries: threat-model.
- Generated option reference: reference/options-generated.
Read next
- Threat Model for guarantees, weaker modes, and non-goals.
- Generated Options Reference for exact option names and defaults.