Skip to content

Architecture

This page explains how vpn-confinement implements fail-closed VPN confinement for selected systemd services.

  • The trust boundary is the namespace, not the individual service.
  • The WireGuard interface lives inside that namespace while host networking stays unchanged for non-confined services.
  • Namespace-local nftables and resolver files enforce the policy surface that confined services see.
  • systemd BindsTo= relationships make namespace or tunnel loss propagate to dependent services and sockets.

Use this page to understand the moving parts and lifecycle. Read Threat Model for explicit guarantees and non-goals, and Generated Options Reference when tuning a real deployment.

  • Services opt in with systemd.services.<name>.vpn.enable = true.
  • Socket units opt in with systemd.sockets.<name>.vpn.enable = true.
  • In the common path, each vpn-enabled service or socket sets vpn.namespace explicitly.
  • Per-service behavior config is limited to namespace attachment and hardening; network policy is namespace-level.
  • services.vpnConfinement.namespaces.<name>.securityProfile provides a small, opinionated top-level selector for stronger defaults and assertions.
  • Confinement uses a dedicated Linux network namespace at /run/netns/<name>.
  • WireGuard is configured via networking.wireguard.interfaces.<if> and assigned with interfaceNamespace.
  • The module can also set WireGuard socketNamespace for advanced cases, but the recommended path is to leave it unset or use "init".
  • Hostname WireGuard endpoints are treated as an explicit advanced opt-in with wireguard.allowHostnameEndpoints = true.
  • interfaceNamespace is the main mechanism: the WireGuard link itself is kept inside the confinement namespace.
  • socketNamespace controls only the UDP socket birthplace and should be viewed as an advanced escape hatch, not the primary design surface.
  • wireguard-<if>.service explicitly requires and orders after the namespace preparation unit and also binds to it for fail-closed teardown.
  • Namespace-local nftables enforces deny-by-default egress and allows only tunnel traffic according to namespace egress mode.
  • Store-generated resolver files are bind-mounted directly into confined units.
  • DNS policy is namespace-scoped and controlled by services.vpnConfinement.namespaces.<name>.dns.mode.
  • In dns.mode = "strict", DNS policy blocks non-allowlisted DNS-like traffic on ports 53, 853, 5353, and 5355 before generic tunnel egress allow.
  • dns.mode = "strict" is about common resolver leak resistance.
  • securityProfile = "highAssurance" defaults egress.mode = "allowList" and rejects weaker compatibility paths such as hostname endpoints or host resolver IPC.
  • securityProfile = "highAssurance" also rejects inline networking.wireguard.interfaces.<if>.privateKey; use privateKeyFile or generatePrivateKeyFile instead.
  • securityProfile = "highAssurance" requires non-empty egress.allowedCidrs so outbound policy remains destination-constrained.
  • In allowlist mode, narrow ICMP / ICMPv6 error traffic is still permitted for PMTU and control-plane reliability when destination CIDRs are configured.
  • Strict mode also bind-mounts namespace resolv.conf and nsswitch.conf (hosts: files myhostname dns) into confined services while hiding resolver helper paths.
  • In highAssurance, vpn-enabled services must run as non-root by default (DynamicUser = true or explicit non-root User) unless explicitly opted out per service.
  • dns.allowHostResolverIPC = false (default) blocks common host resolver helpers (/run/nscd and system D-Bus sockets) in strict mode; setting it to true opts out of those helper blocks.
  • dns.mode = "compat" is a weaker compatibility path that skips strict DNS containment entirely.
  • Egress policy is explicit:
    • egress.mode = "allowAllTunnel": allow all tunnel egress (after DNS policy).
    • egress.mode = "allowList": allow only configured ports/CIDRs.
  • nftables rules use named sets for DNS servers, blocked DNS ports, allowed ports, and allowed CIDRs so the policy stays auditable as the ruleset grows.
  • IPv6 defaults to fail-closed (services.vpnConfinement.namespaces.<name>.ipv6.mode = "disable").
  • Namespace lifecycle is on-demand through vpn-confinement-netns@<name>.service and cleaned up when unneeded.
  • Namespace setup validates the generated nftables rules before applying them and uses shell traps to clean up partial state on failed starts.
  • Confined services bind to both the namespace unit and the WireGuard unit so namespace teardown propagates cleanly.
  • Optional vpn.restrictBind = true derives SocketBindAllow / SocketBindDeny from namespace ingress policy as defense in depth for service-created listeners when ingress ports are declared.
  • publishToHost.tcp is the common-path host ingress abstraction and maps to host-link based ingress behavior.
  • Effective host-link addresses are exported under services.vpnConfinement.namespaces.<name>.derived.hostLink.*.
  • Host network remains unchanged unless a service explicitly enables VPN confinement.
  • The trust boundary is the namespace, not the individual service.
  • Confined services fail closed if tunnel dependencies are required and unavailable.
  • Runtime tunnel drops are propagated to vpn-enabled services and sockets with BindsTo=wireguard-<if>.service.
  • Namespace teardown is propagated through BindsTo=vpn-confinement-netns@... on services, sockets, and generated WireGuard dependency units.
  • DNS leakage is reduced by namespace resolver pinning and blocked DNS-like ports.
  • Literal WireGuard peer endpoints are preferred.
  • Hostname endpoints are permitted only with explicit opt-in and endpoint refresh enabled, and remain outside the module’s strict DNS guarantee.
  • Direct resolver API use over D-Bus is outside the strict DNS guarantee unless dns.allowHostResolverIPC = false (or equivalent unit-local restrictions).
  • Bind restrictions are supplemental hardening only; nftables remains the primary policy mechanism.
  • Advanced knobs such as wireguard.socketNamespace and manual hostLink.* tuning are optional escape hatches, not the primary deployment path.
  • vpn-enabled services and sockets must not manually set namespace attachment controls that conflict with module-managed NetworkNamespacePath behavior.
  • Recommended default for host-facing services: leave .socket in host namespace and run .service in VPN namespace.
  • This preserves host listener behavior while confining service-originated outbound traffic.
  • Use socket namespace attachment only when the listening socket itself must be inside the VPN namespace.

For a practical host reverse-proxy pattern, see Reverse Proxy.

  • Generic HTTPS-based DoH on port 443 is not reliably detectable with simple port-based policy.
  • DNS-over-HTTPS/DNS-over-QUIC can still traverse generic egress paths unless destination allowlisting is enabled.
  • networking.wireguard.interfaces.<if>.fwMark and .mtu remain upstream WireGuard controls; this module documents them but does not build policy around them.
  • This module supports WireGuard integration through networking.wireguard.interfaces only.
  • Endpoint pinning is implemented with wireguard.endpointPinning.enable = true.
  • Endpoint pinning applies in the effective WireGuard socket birthplace namespace (init by default, or custom wireguard.socketNamespace when configured).
  • Endpoint pinning requires literal peer endpoints and applies nftables policy keyed by WireGuard fwmark in that birthplace namespace.
  • For strict environments, combine confinement with application policy and egress inspection.
  • The module is designed for “only selected services use VPN”. A dedicated namespace is a cleaner trust boundary than host-global policy-routing rules.
  • The WireGuard interface is moved into the namespace, so confined cleartext traffic lives inside that namespace boundary.
  • This model minimizes accidental clearnet fallback paths for confined units and keeps host networking behavior unchanged for non-confined services.
  • Supported baseline: NixOS unstable.
  • This assumes modern systemd features required by this module, including NetworkNamespacePath= and RestrictNetworkInterfaces=.
  • Threat and guarantee boundaries: threat-model.
  • Generated option reference: reference/options-generated.