Yash Pritwani
10 min read

# Zero-Trust Networking for Self-Hosted Infrastructure: Enterprise Security on a Startup Budget

You run your own servers. Maybe a dedicated box at Hetzner, a Proxmox cluster in a colocation rack, or a stack of NUCs in your office. You have 20, 50, maybe 80+ containers running services your team depends on daily: Git forges, CI runners, monitoring dashboards, internal tools, databases.

Now answer this: how many inbound ports do you have open?

If the answer is anything other than zero, this article is for you.

## The Problem: Exposing Self-Hosted Services Securely

The traditional approach to making self-hosted services reachable from the internet follows a well-worn path. You open port 443 on your firewall. You point a DNS record at your public IP. You terminate TLS with Let's Encrypt. Maybe you put nginx or Traefik in front. You add HTTP Basic Auth or IP allowlisting for sensitive dashboards.

This model worked for two decades. It also has fundamental problems that compound as your infrastructure grows.

Every open port is attack surface. Port 443 is expected, but the moment you expose it, you invite the entire internet to probe your TLS stack, your reverse proxy, and every application behind it. Your firewall rules become a growing list of exceptions. Your VPN becomes a bottleneck that everyone complains about. Your SSH bastion host is one misconfigured `authorized_keys` away from compromise.

The traditional perimeter model assumes that the network boundary is the security boundary. Zero-trust architecture rejects this assumption entirely. Instead of asking "is this request coming from inside the network?", it asks "is this specific request, from this specific identity, authorised for this specific resource, right now?"

For self-hosters in the EU, there is an additional dimension. GDPR and national data protection frameworks (the German BDSG, the UK Data Protection Act 2018) impose obligations on how traffic flows, where data is processed, and what third parties touch it. Your architecture choices have compliance implications.

## The Architecture: Zero Ports Open

We serve 84 Docker containers through a single Cloudflare Tunnel with zero inbound ports open on the host firewall. Here is how the architecture works.

Cloudflare Tunnel (cloudflared) runs as a lightweight daemon on the host. It establishes outbound-only QUIC connections to Cloudflare's edge network. Traffic flows like this:

```text
User -> Cloudflare Edge (TLS termination, WAF, DDoS protection)
     -> Cloudflare Tunnel (outbound connection from your server)
     -> Traefik (internal reverse proxy, port 443/80 on localhost only)
     -> Target container (e.g., Gitea, Grafana, n8n)
```

The critical property: your server initiates all connections. There are no listening ports. A port scan of your public IP returns nothing. The attack surface is reduced to the Cloudflare edge, which is hardened infrastructure operated by a team whose entire job is absorbing attacks.
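You can verify the zero-port claim from any external host; a quick scan with nmap should come back empty (the IP below is a placeholder for your server's public address):

```shell
# Run from a machine OUTSIDE your network (e.g. a cheap VPS).
# 203.0.113.10 is a placeholder for your server's public IP.

# Full TCP sweep: every port should come back closed or filtered.
nmap -p- -T4 203.0.113.10

# Spot-check common ports over TCP and UDP. The tunnel's QUIC traffic
# is outbound-only, so even UDP 443 shows nothing listening.
sudo nmap -sT -sU -p 22,80,443 203.0.113.10
```

Re-run this after every firewall change; it is the cheapest regression test for the whole architecture.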

Traefik acts as the internal reverse proxy. It routes requests based on hostname to the correct container. It handles internal TLS between services where needed. It applies middleware chains for authentication, rate limiting, and header manipulation.

Authelia provides SSO and two-factor authentication for internal services. It integrates with Traefik as a forward-auth middleware. Any request to a protected service is redirected to Authelia for authentication before it ever reaches the target application. TOTP, WebAuthn, and push notification second factors are supported.

The layered architecture looks like this:

```text
Internet
  |
  v
Cloudflare Edge (WAF, DDoS, Bot Management, Access Policies)
  |
  v  (outbound QUIC tunnel, initiated by your server)
cloudflared daemon
  |
  v
Traefik reverse proxy (localhost only, Docker provider)
  |
  +---> [authelia middleware] ---> Internal service (Grafana, Gitea, etc.)
  +---> Public service (marketing site, API endpoints)
```

No port forwarding. No NAT rules. No firewall exceptions. No VPN clients to distribute.

## Setup: 45 Minutes to Zero-Trust

### Step 1: Install cloudflared and Create a Tunnel

```bash
# Debian/Ubuntu
curl -L https://github.com/cloudflare/cloudflared/releases/latest/download/cloudflared-linux-amd64.deb -o cloudflared.deb
sudo dpkg -i cloudflared.deb

# Authenticate with Cloudflare
cloudflared tunnel login

# Create the tunnel
cloudflared tunnel create infrastructure

# Note the tunnel UUID -- you will need it in the next step
```

### Step 2: Configure Tunnel Routes

Create `/etc/cloudflared/config.yml`:

```yaml
tunnel: a1b2c3d4-e5f6-7890-abcd-ef1234567890
credentials-file: /root/.cloudflared/a1b2c3d4-e5f6-7890-abcd-ef1234567890.json

ingress:
  # Git forge
  - hostname: git.example.cloud
    service: https://traefik:443
    originRequest:
      noTLSVerify: true

  # Monitoring
  - hostname: metrics.example.cloud
    service: https://traefik:443
    originRequest:
      noTLSVerify: true

  # CI/CD dashboards
  - hostname: ci.example.cloud
    service: https://traefik:443
    originRequest:
      noTLSVerify: true

  # Marketing site (public)
  - hostname: www.example.cloud
    service: https://traefik:443
    originRequest:
      noTLSVerify: true

  # Catch-all: return 404
  - service: http_status:404
```

Note that the origin service must be `https://` when pointing at Traefik's TLS entrypoint on 443; `noTLSVerify` only applies to HTTPS origins.

The `noTLSVerify: true` on origin requests is acceptable here because the connection between cloudflared and Traefik never leaves the host (same Docker network or localhost), so Traefik's certificate does not need to be publicly valid. There is no wire to sniff.
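Before restarting the daemon, it is worth sanity-checking the ingress rules; cloudflared ships a validator and a rule matcher for exactly this:

```shell
# Validate the ingress section of /etc/cloudflared/config.yml.
cloudflared tunnel ingress validate

# Show which ingress rule a given URL would match.
cloudflared tunnel ingress rule https://git.example.cloud
```

This catches ordering mistakes (a broad hostname shadowing a specific one) without touching live traffic.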

### Step 3: Traefik Integration

Traefik discovers containers via the Docker provider. A typical container label set:

```yaml
# docker-compose.yml (excerpt)
services:
  grafana:
    image: grafana/grafana:latest
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.grafana.rule=Host(`metrics.example.cloud`)"
      - "traefik.http.routers.grafana.entrypoints=websecure"
      - "traefik.http.routers.grafana.middlewares=authelia@docker"
      - "traefik.http.services.grafana.loadbalancer.server.port=3000"
    networks:
      - proxy

  authelia:
    image: authelia/authelia:latest
    labels:
      - "traefik.enable=true"
      - "traefik.http.routers.authelia.rule=Host(`auth.example.cloud`)"
      - "traefik.http.routers.authelia.entrypoints=websecure"
      - "traefik.http.middlewares.authelia.forwardauth.address=http://authelia:9091/api/verify?rd=https://auth.example.cloud"
      - "traefik.http.middlewares.authelia.forwardauth.trustForwardHeader=true"
      - "traefik.http.middlewares.authelia.forwardauth.authResponseHeaders=Remote-User,Remote-Groups,Remote-Name,Remote-Email"
    networks:
      - proxy
```

### Step 4: Authelia Configuration

The Authelia access control policy defines which services require authentication and at what level:

```yaml
# authelia configuration.yml (excerpt)
access_control:
  default_policy: deny
  rules:
    # Public services - no auth required
    - domain: www.example.cloud
      policy: bypass

    # Internal services - two-factor required
    - domain:
        - metrics.example.cloud
        - git.example.cloud
        - ci.example.cloud
      policy: two_factor

    # Admin services - two-factor + specific group
    - domain:
        - admin.example.cloud
      policy: two_factor
      subject:
        - "group:admins"
```

### Step 5: Run as a systemd Service

```bash
sudo cloudflared service install
sudo systemctl enable cloudflared
sudo systemctl start cloudflared
```

At this point, your services are reachable through Cloudflare's edge network. Your firewall can drop all inbound traffic. The tunnel reconnects automatically after restarts or network interruptions.
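Dropping all inbound traffic is then a one-time, default-deny policy; a sketch using ufw (adapt to nftables or your distro's tooling):

```shell
# Default-deny inbound; the tunnel only needs outbound connectivity.
sudo ufw default deny incoming
sudo ufw default allow outgoing

# WARNING: this blocks inbound SSH as well. Confirm you have IPMI/KVM
# or provider console access BEFORE enabling this on a remote machine.
sudo ufw enable
sudo ufw status verbose
```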

## GDPR Considerations

If you operate in the EU, EEA, or handle data of EU residents, the Cloudflare Tunnel architecture introduces a data processor relationship that requires examination.

Data flow path: User requests travel through Cloudflare's edge network before reaching your server. This means request metadata (IP addresses, headers, URLs) and potentially request bodies pass through Cloudflare's infrastructure.

Cloudflare's Data Processing Addendum (DPA): Cloudflare acts as a data processor under GDPR. Their DPA covers the processing of personal data in transit. You must execute this DPA (available in the Cloudflare dashboard) to establish the legal basis for processing.

EU data residency: Cloudflare offers a Data Localisation Suite (paid add-on) that restricts where TLS termination and WAF inspection occur. Without it, traffic may be processed at any Cloudflare edge location globally, including the US. For most use cases, the Standard Contractual Clauses (SCCs) in Cloudflare's DPA provide adequate legal basis for US transfers post-Schrems II.

German BSI perspective: The BSI (Bundesamt für Sicherheit in der Informationstechnik) recommends zero-trust architectures in their IT-Grundschutz framework. Using Cloudflare Tunnel aligns with BSI recommendations for reducing attack surface. However, the BSI also emphasises digital sovereignty, and routing all traffic through a US-based provider introduces a dependency that some German organisations may find unacceptable.

Practical guidance: For most startups and SMEs, the standard Cloudflare setup with an executed DPA is sufficient. For fintech, healthcare, or government-adjacent work, consider the Data Localisation Suite or a self-hosted alternative like WireGuard with Headscale.

## Comparison: Cloudflare Tunnel vs Alternatives

| Criteria | Cloudflare Tunnel | Tailscale | WireGuard (raw) | ngrok | Traditional VPN |
|---|---|---|---|---|---|
| Cost | Free tier available | Free for personal, paid for teams | Free (OSS) | Free tier limited, paid scales | Varies (OpenVPN: free, commercial: paid) |
| Setup complexity | Low (30-45 min) | Very low (15 min) | Medium (1-2 hours) | Very low (5 min) | High (half day+) |
| Inbound ports required | Zero | Zero | One (UDP) | Zero | One or more |
| Public-facing services | Yes (primary use case) | No (mesh VPN only) | Not directly | Yes | Not directly |
| DDoS protection | Built-in | No | No | Basic | No |
| WAF | Built-in | No | No | Limited | No |
| GDPR compliance | DPA available, SCCs for US transfer | EU servers available, DPA available | Self-hosted, full control | DPA available | Self-hosted, full control |
| Vendor lock-in | Medium (DNS + tunnel config) | Low (standard WireGuard underneath) | None | Medium | Low |
| Latency overhead | 3-8ms typical | 1-3ms (direct mesh) | <1ms (direct) | 5-15ms | 2-5ms |
| Best for | Public services, zero-trust access | Internal team access, dev environments | Site-to-site, maximum control | Quick demos, webhooks | Legacy environments |

The right choice depends on what you are exposing. For public-facing services (websites, APIs, SaaS products), Cloudflare Tunnel is difficult to beat on the cost-to-security ratio. For purely internal team access, Tailscale is simpler and keeps traffic off third-party infrastructure entirely. For maximum control and zero vendor dependency, raw WireGuard with Headscale (open-source Tailscale control plane) is the path.

## Performance Impact

We measured the latency overhead of the Cloudflare Tunnel path versus direct connections over a two-week period from three European locations (Frankfurt, London, Stockholm).

Additional latency: 3-8ms on average. The tunnel adds one extra network hop (your server to Cloudflare edge), but Cloudflare's Anycast network typically routes to a nearby PoP, limiting the overhead. From Frankfurt, connecting to a Cloudflare-fronted service in the same city added 4ms median latency.

Throughput: We observed no meaningful throughput degradation for typical web application workloads (API calls, dashboard rendering, Git operations). For large file transfers (multi-gigabyte), the tunnel's QUIC connection can become a bottleneck compared to a direct connection, though this is rarely the limiting factor in practice.

Connection pooling: cloudflared maintains persistent connections to Cloudflare's edge and multiplexes requests. Cold-start latency (first request after tunnel establishment) is higher, around 50-80ms, but subsequent requests benefit from connection reuse.

WebSocket support: Cloudflare Tunnel supports WebSocket connections natively. We run real-time dashboards and terminal sessions (ttyd) through the tunnel without issues. Keep-alive timeouts are configurable but default to 30 seconds.

## Limitations and Trade-offs

This architecture is not without costs. You should understand what you are accepting.

Single vendor dependency: All your public traffic flows through Cloudflare. If Cloudflare experiences an outage, your services are unreachable, even though your server is perfectly healthy. Cloudflare's track record is strong (99.99% SLA on Business plans), but it has experienced notable outages, including the June 2022 incident that took down a significant portion of the internet. You are trading your own availability risk for Cloudflare's, which is usually a good trade, but it is a trade nonetheless.

No direct server access: If the tunnel goes down and you need emergency access, you need a fallback path. We maintain a WireGuard peer that is disabled by default and can be activated via out-of-band access (IPMI/KVM console). This is your break-glass procedure: document it, test it quarterly.
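The break-glass peer can live as a disabled wg-quick unit; a sketch, assuming a WireGuard profile named `wg-breakglass` in `/etc/wireguard`:

```shell
# Normal operation: the WireGuard interface stays down and disabled,
# so it adds no attack surface day to day.
sudo systemctl disable --now wg-quick@wg-breakglass

# Emergency (run from the IPMI/KVM console):
sudo systemctl start wg-quick@wg-breakglass
sudo wg show wg-breakglass    # confirm the peer handshake completes

# After the incident, tear it back down.
sudo systemctl stop wg-quick@wg-breakglass
```

The quarterly test is simply running the emergency steps end to end and confirming you can actually SSH in over the tunnel.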

TLS inspection: Cloudflare terminates TLS at their edge and re-encrypts to your origin. This means Cloudflare can, in principle, inspect your traffic. For most use cases this is acceptable (and enables their WAF and bot protection features). For end-to-end encryption requirements, consider Cloudflare Spectrum or client-side encryption.

DNS coupling: Your DNS must be managed through Cloudflare for tunnelled domains. Migrating away requires changing both DNS and tunnel configuration simultaneously. Plan your exit strategy before you need it.

## Mistakes We Made

Mistake 1: Running cloudflared as root. The daemon does not need root privileges after initial setup. We ran it as root for months before realising the credentials file was world-readable. Fix: create a dedicated cloudflared user, restrict credentials to that user, run the systemd service as that user.
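The fix looks roughly like this on a systemd-managed install (paths assume the Step 2 layout; remember to update `credentials-file` in config.yml to the new location):

```shell
# Dedicated system user with no login shell.
sudo useradd --system --shell /usr/sbin/nologin cloudflared

# Move the tunnel credentials out of /root and lock them down.
sudo mv /root/.cloudflared/*.json /etc/cloudflared/
sudo chown -R cloudflared:cloudflared /etc/cloudflared
sudo chmod 600 /etc/cloudflared/*.json

# Drop privileges in the unit via an override; add under [Service]:
#   User=cloudflared
#   Group=cloudflared
sudo systemctl edit cloudflared
sudo systemctl restart cloudflared
```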

Mistake 2: Not setting up health checks. cloudflared will happily route traffic to a dead backend. We had a container crash go unnoticed for hours because Cloudflare returned a 502 that looked like a Cloudflare issue, not a backend issue. Fix: configure origin health checks in the tunnel config and set up Uptime Kuma or similar monitoring on the backend side.
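On the backend side, even a tiny origin-check script catches a dead container before your users do; a minimal sketch (hostnames are placeholders, and the alerting hook is left to you):

```shell
#!/bin/sh
# Minimal origin health check: curl a service and fail loudly on
# anything that is not a 2xx/3xx response.

check_origin() {
  url="$1"
  # -m 5: give up after 5s; -w '%{http_code}': print only the status code
  code=$(curl -s -o /dev/null -m 5 -w '%{http_code}' "$url") || code=000
  case "$code" in
    2*|3*) echo "OK:   $url (HTTP $code)" ;;
    *)     echo "DOWN: $url (HTTP $code)" >&2; return 1 ;;
  esac
}

# Run from cron every minute against each internal origin, e.g.:
# check_origin https://metrics.example.cloud/api/health || notify-oncall
```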

Mistake 3: Forgetting the catch-all rule. Without the `http_status:404` catch-all at the end of your ingress rules, cloudflared will refuse to start. We lost 20 minutes debugging a "no matching rule" error on our first deployment. The catch-all is mandatory, not optional.

Mistake 4: Ignoring Cloudflare Access for API endpoints. We initially only protected browser-facing services with Authelia. API endpoints were left with application-level auth only. This meant that automated scanners hitting the API endpoints were consuming backend resources. Adding Cloudflare Access service tokens for machine-to-machine communication reduced unwanted API traffic by 90%.
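With a service token created in the Cloudflare Zero Trust dashboard, machine clients authenticate by sending two headers; a sketch (the hostname, path, and token values are placeholders):

```shell
# Call an Access-protected API endpoint using a service token.
# The Client-Id/Client-Secret pair is issued when you create the
# service token in the Zero Trust dashboard.
curl -s https://api.example.cloud/v1/status \
  -H "CF-Access-Client-Id: <client-id>.access" \
  -H "CF-Access-Client-Secret: <client-secret>"
```

Requests without a valid token are rejected at the Cloudflare edge, so the scanner traffic never reaches your origin at all.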

## FAQ

### What happens when Cloudflare goes down?

Your services become unreachable from the internet. Your server continues running normally. The tunnel daemon will automatically reconnect when Cloudflare recovers. For critical services, maintain a secondary access path (WireGuard peer, SSH over Tailscale, or IPMI/KVM console access). We keep a documented break-glass procedure that we test every quarter.

### Is this compliant for fintech in the UK?

It depends on your specific regulatory requirements. The FCA does not prescribe specific networking architectures, but PSD2 and the FCA's operational resilience requirements (PS21/3) expect you to understand and manage third-party dependencies. Cloudflare Tunnel introduces Cloudflare as a critical third-party. You need to: (1) execute Cloudflare's DPA, (2) document Cloudflare as a material outsourcing arrangement if applicable, (3) ensure your incident response plan covers Cloudflare outages, and (4) consider the Data Localisation Suite to keep data processing within the UK/EU. Consult your compliance team, but the architecture itself is not a blocker.

### Can I use this with multiple servers or data centres?

Yes. You can run multiple cloudflared instances pointing to the same tunnel, and Cloudflare will load-balance between them. Alternatively, create separate tunnels per server and use Cloudflare's load balancing (paid feature) to distribute traffic. We run a single tunnel per physical host and use Traefik internally for container-level routing.
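Running a replica amounts to starting the same tunnel from a second host with a copy of the credentials file; a sketch (reusing the placeholder UUID from Step 2):

```shell
# On the second host, after copying the credentials JSON over:
cloudflared tunnel run \
  --credentials-file /etc/cloudflared/a1b2c3d4-e5f6-7890-abcd-ef1234567890.json \
  infrastructure

# From any host with access: list the active connectors for the tunnel.
cloudflared tunnel info infrastructure
```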

### How does this compare to running Caddy with automatic HTTPS?

Caddy with automatic HTTPS is an excellent solution for simple setups. It handles TLS termination, certificate renewal, and reverse proxying in a single binary. However, it requires inbound port 443 (and optionally 80 for ACME challenges), which means your server is directly exposed to the internet. You do not get DDoS protection, WAF, or bot management. For a personal blog or small project, Caddy is simpler and has fewer moving parts. For production infrastructure with multiple services and security requirements, the Cloudflare Tunnel architecture provides meaningfully more protection.

## Related Reading

If you found this useful, these related posts go deeper on adjacent topics:

- [Secret Management Best Practices for DevOps](https://www.techsaas.cloud/blog/secret-management-best-practices-devops) -- Securing the credentials that your zero-trust architecture protects.
- [Multi-Cloud Hidden Costs and Pitfalls](https://www.techsaas.cloud/blog/multi-cloud-hidden-costs-pitfalls) -- Why we chose self-hosted over multi-cloud, and the real cost of vendor lock-in.
- [CTO Playbook: First 90 Days at a Startup](https://www.techsaas.cloud/blog/cto-playbook-first-90-days-startup) -- Infrastructure decisions that compound, including security architecture choices like this one.

---

*Running self-hosted infrastructure and want enterprise-grade security without the enterprise price tag? We help startups and scale-ups design, build, and maintain zero-trust architectures that actually work. [Get in touch](https://www.techsaas.cloud/services/) for a free infrastructure review.*
