Automate TLS for dynamic domains with Traefik on Hetzner.

When I started building apidex.dev, I knew custom domains would matter. Users should be able to publish API docs at docs.acme.com, not only at a generic subdomain.

That sounds simple until you get to HTTPS.

Every domain needs a valid TLS certificate. Users can add domains at any time.

Static Traefik labels do not handle that well. Manual certbot scripts are not much better. You either restart containers often or build a pile of glue code.

I wanted this to be boring. No restarts. No manual steps. No background job to watch.

The problem

Traefik works well with Docker labels when routes are known ahead of time. You put routers on containers as labels. Traefik reads them and wires everything up.

But labels are fixed when a container is created. Traefik picks them up when the container starts, and adding a router later means recreating the container. If a user adds a domain in the meantime, Traefik does not know about it.

You can work around this by regenerating docker-compose.yml and running docker-compose up -d.

But you do not want to do that each time someone clicks "Add domain". It recreates containers, which can interrupt traffic, and it is annoying to automate.

Once you have more than a few custom domains, you need dynamic routing.

The solution: Traefik's HTTP provider

Traefik can read config from more than one place. Most Docker setups use the Docker provider. There is also an HTTP provider, and it is more useful than it looks.

With the HTTP provider, Traefik polls an endpoint and merges the response into its live config.

# docker-compose.yml (traefik service)
- "--providers.http.endpoint=https://api.example.com/_dynamic-config/<unguessable-token>"
- "--providers.http.pollInterval=30s"

In my setup, the real endpoint uses an opaque path. Here, Traefik calls the endpoint every 30 seconds. It compares the response with the config it already has, then applies the changes.

Routers can appear and disappear while the container keeps running.

No restarts. No downtime.
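
Because the endpoint lists every customer domain, the token in the path is the only thing protecting it. The following is a minimal sketch of how that token could be checked, not the code apidex.dev runs. The helper name token_matches? and the example token values are assumptions made for illustration.

```ruby
require "digest"

# Hypothetical helper: compares the token taken from the request path with the
# expected one. Both values are hashed first so the comparison always runs over
# 32 bytes, and the loop never returns early on the first differing byte.
def token_matches?(given, expected)
  return false if expected.to_s.empty? # never accept an unset secret

  a = Digest::SHA256.digest(given.to_s).bytes
  b = Digest::SHA256.digest(expected.to_s).bytes
  diff = 0
  a.each_with_index { |byte, i| diff |= byte ^ b[i] }
  diff.zero?
end

puts token_matches?("s3cret-path", "s3cret-path") # true
puts token_matches?("guess", "s3cret-path")       # false
```

In a controller, a failed check would typically answer with a 404 so the path does not reveal that it exists.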

What the endpoint returns

On the backend, one controller action builds the Traefik config from the database.

It finds projects with a custom domain and turns each domain into routers.

# app/controllers/traefik_config_controller.rb
def build_traefik_config(domains)
  routers = {}

  domains.each do |domain|
    # Router names must be unique, so derive one from the domain.
    safe_name = domain.gsub(/[^a-z0-9]/i, "-")

    # HTTPS router: serves traffic and requests a certificate for the domain.
    routers["custom-#{safe_name}"] = {
      rule: "Host(`#{domain}`)",
      service: "frontend",
      entryPoints: ["websecure"],
      tls: { certResolver: "letsencrypt" }
    }

    # HTTP router: exists only to redirect plain HTTP to HTTPS.
    routers["custom-#{safe_name}-http"] = {
      rule: "Host(`#{domain}`)",
      service: "frontend",
      entryPoints: ["web"],
      middlewares: ["https-redirect"]
    }
  end

  # Services and middlewares are elided here.
  { http: { routers: routers, services: { ... }, middlewares: { ... } } }
end

Each domain gets two routers.

The first router listens on port 443. It enables TLS through the letsencrypt resolver. This is the router that serves traffic.

The second router listens on port 80. It only redirects to HTTPS with a 301.

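The service: "frontend" part points to the service for the Next.js app. If that service is defined only in the static Docker labels, Traefik needs the provider suffix to resolve it from another provider, so the reference becomes frontend@docker. The same applies to the https-redirect middleware.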

The HTTP provider only adds the routing rules.
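
One detail in the router naming is easy to miss: replacing every non-alphanumeric character with "-" maps a-b.com and a.b.com to the same name, and the second router would overwrite the first. The sketch below is one possible way around that, under the assumption that a short hash suffix in the router name is acceptable. The helpers router_name and build_routers are hypothetical, and the services and middlewares sections are left out on purpose.

```ruby
require "json"
require "digest"

# Hypothetical helper: a readable slug plus a short hash of the full domain,
# so two domains that collapse to the same slug still get distinct names.
def router_name(domain)
  normalized = domain.downcase
  slug = normalized.gsub(/[^a-z0-9]/, "-")
  "custom-#{slug}-#{Digest::SHA256.hexdigest(normalized)[0, 8]}"
end

# Builds the HTTPS router and the HTTP redirect router for each domain.
def build_routers(domains)
  domains.each_with_object({}) do |domain, routers|
    name = router_name(domain)
    rule = "Host(`#{domain.downcase}`)"

    routers[name] = {
      rule: rule, service: "frontend", entryPoints: ["websecure"],
      tls: { certResolver: "letsencrypt" }
    }
    routers["#{name}-http"] = {
      rule: rule, service: "frontend", entryPoints: ["web"],
      middlewares: ["https-redirect"]
    }
  end
end

config = { http: { routers: build_routers(["docs.acme.com", "a-b.com", "a.b.com"]) } }
puts JSON.pretty_generate(config)
```

Three domains produce six routers here, including separate entries for a-b.com and a.b.com.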

Certificate issuance: Let's Encrypt HTTP-01

Traefik handles ACME, so the config is small.

- "--certificatesresolvers.letsencrypt.acme.httpchallenge=true"
- "--certificatesresolvers.letsencrypt.acme.httpchallenge.entrypoint=web"
- "--certificatesresolvers.letsencrypt.acme.email=admin@example.com"
- "--certificatesresolvers.letsencrypt.acme.storage=/letsencrypt/acme.json"

This uses the HTTP-01 challenge.

When Traefik asks for a cert, Let's Encrypt checks that Traefik can serve a token under /.well-known/acme-challenge/ over plain HTTP.

That is why port 80 must stay open. The site redirects to HTTPS, but the challenge still starts over HTTP.

Once Traefik sees a new domain in the next poll, it starts the ACME flow.

The cert is stored in acme.json. Traefik renews it before it expires.
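
Since the certificates end up in acme.json, the backend could read that file to tell a user whether their domain already has one. The sketch below assumes the layout Traefik v2 writes, with the resolver name as the top-level key and a "Certificates" array whose entries carry "domain" with "main" and optional "sans". That layout is an assumption to verify against your own file, and issued_domains is a hypothetical helper.

```ruby
require "json"

# Hypothetical helper: returns every domain that has a certificate stored
# under the given resolver. Assumes the acme.json layout described above.
def issued_domains(acme_json, resolver: "letsencrypt")
  certs = JSON.parse(acme_json).dig(resolver, "Certificates") || []
  certs.flat_map { |c| [c.dig("domain", "main"), *c.dig("domain", "sans")] }
       .compact
       .uniq
end

# Inline sample standing in for the contents of /letsencrypt/acme.json.
sample = <<~JSON
  { "letsencrypt": { "Certificates": [
    { "domain": { "main": "docs.acme.com" } }
  ] } }
JSON

puts issued_domains(sample).include?("docs.acme.com")
```

The file also holds private keys, so anything that reads it should run with the same care as Traefik itself.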

User flow

For the user, the flow is simple.

  1. The user adds docs.acme.com in apidex.dev settings.
  2. They point docs.acme.com at the server, either with a CNAME record to apidex.dev or with an A record to the server IP.
  3. Traefik sees the new domain on its next HTTP provider poll.
  4. The ACME challenge runs and a certificate is issued.
  5. The domain goes live on HTTPS.

No one has to approve or provision anything by hand.
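
Step 1 is where user input enters the Traefik config, so the domain should be cleaned up before it is stored. The value ends up inside a Host(`...`) rule, and a stray backtick or space would break that rule. This is a minimal sketch, assuming a plain hostname check is enough. The HOSTNAME pattern and normalize_domain are hypothetical names.

```ruby
# Lowercase labels of letters, digits and hyphens, at least two labels,
# at most 253 characters, and a last label that starts with a letter
# (which also rules out bare IPv4 addresses).
HOSTNAME = /\A(?=.{1,253}\z)(?:[a-z0-9](?:[a-z0-9-]{0,61}[a-z0-9])?\.)+[a-z][a-z0-9-]{0,61}[a-z0-9]\z/

# Hypothetical helper: returns the cleaned-up domain, or nil when the input
# is not a usable hostname.
def normalize_domain(input)
  domain = input.to_s.strip.downcase.chomp(".")
  domain.match?(HOSTNAME) ? domain : nil
end

p normalize_domain(" Docs.Acme.com. ")  # "docs.acme.com"
p normalize_domain("bad`name.com")      # nil
```

A real form would probably also reject the platform's own domain and domains already claimed by another project.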

What's on Hetzner specifically

The infrastructure is simple.

There is one Hetzner VPS running Docker Compose. Traefik, the Rails API, and the Next.js frontend all run on that machine.

The firewall allows inbound traffic on ports 80 and 443.

There is no load balancer. There is no DNS API integration.

HTTP-01 works fine at this scale, so I did not add more parts.

Wildcard subdomains like *.apidex.dev are handled separately with a HostRegexp rule in Docker labels.

The HTTP provider only handles custom domains users bring in.

Gotchas

A few things are worth knowing.

  • acme.json must have strict permissions. If the mode is more permissive than 600, Traefik rejects the file and the resolver cannot issue certificates.
  • DNS propagation can bite you. If Traefik asks for a cert before the domain points to your server, the challenge fails.
  • The HTTP provider merges with the existing config. It does not replace it. This is useful, because static routers keep working.
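  • Let's Encrypt has rate limits. The best-known one is 50 certificates per registered domain per week. With custom domains, the limit on failed validations (5 per hostname per hour) matters more, because every attempt against a misconfigured domain counts.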
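
The permission requirement on acme.json from the first bullet is easy to verify in a deploy script or a health check. A minimal sketch, with acme_storage_private? as a hypothetical helper name:

```ruby
# Hypothetical helper: true when the file mode is exactly 600, meaning
# read and write for the owner and nothing for group or others.
def acme_storage_private?(path)
  (File.stat(path).mode & 0o777) == 0o600
end

# Example use, with the storage path from the resolver config:
#   warn "fix permissions on acme.json" unless acme_storage_private?("/letsencrypt/acme.json")
```

The mask with 0o777 drops the file-type bits that File.stat includes in the mode, so only the permission bits are compared.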

Let's Encrypt does not retry a failed challenge on its own. Any retry has to come from Traefik as the ACME client, and the user may not see clear feedback in the meantime. It is worth checking DNS before accepting the domain.
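
A DNS check before accepting the domain could look like the sketch below. It uses Ruby's standard resolv library. The helper points_to_server?, the injectable resolver argument, and the example IP are assumptions for illustration. It requires every resolved address to belong to the server, since a leftover A or AAAA record pointing elsewhere can send the challenge to the wrong machine.

```ruby
require "resolv"

# Hypothetical helper: true when the domain resolves and every address it
# resolves to is one of the server's own IPs. The resolver is injectable so
# the check can be exercised without real DNS.
def points_to_server?(domain, server_ips, resolver: Resolv)
  addresses = resolver.getaddresses(domain).map(&:to_s)
  !addresses.empty? && addresses.all? { |ip| server_ips.include?(ip) }
rescue Resolv::ResolvError
  false
end

# Example use (performs a real lookup):
#   points_to_server?("docs.acme.com", ["203.0.113.10"])
```

The lookup goes through the resolver of the machine running the check, so a cached answer there may still differ from what Let's Encrypt sees.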

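During development, use the Let's Encrypt staging endpoint so you do not hit rate limits. In Traefik that means setting the resolver's caServer option to https://acme-staging-v02.api.letsencrypt.org/directory.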

Conclusion

This was simpler than I expected.

Traefik's HTTP provider gives you dynamic routing without restarts. ACME handles the certificates.

The only custom part is a small controller that returns JSON.

No certbot scripts, no cron jobs, no DNS API keys.

If you are building a multi-tenant app with custom domains, this pattern is worth considering.

It stays out of your way as more domains get added.