Not every workload needs Kubernetes. I say that as someone who runs Kubernetes in anger — and who has also spent enough late nights debugging a control plane to know when the complexity isn't paying for itself. For a lot of teams, especially ones running a mix of containers and plain old binaries on a handful of nodes, Kubernetes is too much machine for the job.

That's the gap HashiCorp Nomad fills. It's a scheduler and orchestrator that does one thing well — places workloads on machines and keeps them running — without dragging in a dozen other moving parts. And once you pair it with the Nomad Autoscaler, you get the elastic, scale-with-demand behaviour that most people assume only Kubernetes can give you.

This is an honest look at what Nomad is, how the autoscaler actually works, where it shines, and where it doesn't.

What Nomad actually is

Nomad is a workload orchestrator from HashiCorp. You give it a job — "run 5 instances of this container, here are the resources each needs" — and Nomad figures out which machines have room, schedules the work there, restarts anything that dies, and reschedules onto healthy nodes when a machine goes down. That's the core loop.

The part that sets it apart is what it's willing to schedule. Kubernetes orchestrates containers, full stop. Nomad orchestrates containers and raw binaries, Java JARs, QEMU virtual machines, and batch jobs — through a pluggable task driver model. If you have a legacy service that was never containerised, Nomad will still run it under the same scheduler as everything else. That single trait is why a lot of brownfield shops pick it.

A few things worth knowing about the architecture:

  • One binary. The same nomad binary runs as a server (the control plane, using Raft consensus) or as a client (the agent that runs your workloads). No etcd, no separate scheduler, controller, and API server processes to babysit.
  • It integrates, it doesn't absorb. Service discovery, secrets, and networking are handled by its siblings Consul and Vault if you want them — but Nomad also has built-in service discovery now, so a small setup needs nothing else.
  • Multi-region and multi-datacenter are first-class concepts, not bolt-ons.
  • It scales far. HashiCorp describes Nomad as proven on clusters of more than 10,000 nodes, and it's used in production by the likes of Cloudflare. The simplicity isn't a toy-scale tradeoff.

As of mid-2026 the current line is Nomad 2.0, which arrived alongside HashiCorp's new versioning and support model. It's a mature project, not a moving target.

A word on licensing

Nomad is not open source anymore, and I'd rather say that plainly than let you find out later. In August 2023 HashiCorp moved Nomad — along with Terraform, Vault, Consul and the rest — from the MPL 2.0 open-source licence to the Business Source License (BSL/BUSL) 1.1. It's now "source-available": you can read, modify, and use it freely, including in production, for basically everything except offering a competing commercial product or hosted Nomad service. Each release also re-licenses to MPL 2.0 four years after it ships.

For the vast majority of self-hosters and internal platform teams, nothing about your day-to-day changes. But it's a real consideration if your business model is "we host orchestration for others." HashiCorp itself is now part of IBM, following the acquisition that closed in 2025 — worth keeping in mind when you're betting infrastructure on a vendor's long-term direction.

Getting a cluster running

Unlike most tools I cover here, Nomad isn't really a "docker compose up" affair — it's a single binary you run as a system service. For local experimentation, though, you can have a working cluster in one command:

# Single-node dev cluster — server + client in one process
nomad agent -dev -bind 0.0.0.0 -network-interface eth0

That gives you the full API and UI on http://localhost:4646. From there a job spec is just HCL. Here's a minimal one running an nginx container:

job "web" {
  group "frontend" {
    count = 2

    network {
      port "http" { to = 80 }
    }

    task "nginx" {
      driver = "docker"

      config {
        image = "nginx:stable"
        ports = ["http"]
      }

      resources {
        cpu    = 200  # MHz
        memory = 128  # MB
      }
    }
  }
}

Run nomad job run web.nomad.hcl and you have two nginx allocations, each reachable on a dynamically assigned host port. Nothing balances traffic between them yet: for production you'd run the binary under systemd on each node, point clients at the servers, and front it with Consul or a reverse proxy. The job spec, though, is identical whether it's your laptop or a 50-node cluster.
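To make that production layout a little more concrete, here is a rough sketch of the two agent configs, assuming a three-server cluster. The file paths, data directory and server addresses are placeholders I made up for illustration, not values from a real deployment.

```hcl
# /etc/nomad.d/server.hcl (hypothetical path), one copy per server node
data_dir = "/opt/nomad/data"

server {
  enabled          = true
  bootstrap_expect = 3 # wait for three servers before electing a Raft leader
}
```

```hcl
# /etc/nomad.d/client.hcl (hypothetical path), one copy per client node
data_dir = "/opt/nomad/data"

client {
  enabled = true
  # Hypothetical server addresses; 4647 is Nomad's default RPC port
  servers = ["10.0.0.10:4647", "10.0.0.11:4647", "10.0.0.12:4647"]
}
```

Each node's systemd unit would then start the agent with something like nomad agent -config /etc/nomad.d, picking up whichever file is present on that machine.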

The Nomad Autoscaler — the actual point of this post

Out of the box, Nomad keeps the count you asked for. It does not change that count on its own. The Nomad Autoscaler is what makes the cluster elastic, and it's a deliberately separate piece of software: its own binary, its own release cycle, its own GitHub repo. You run it as a long-lived daemon — almost always deployed as a Nomad job itself, so the thing that scales your cluster is scheduled by the cluster.
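As a sketch of what "the autoscaler as a Nomad job" can look like, the job below runs the official container image and renders a minimal agent config from a template. Treat the details as assumptions: the image tag should be whichever release you have tested, the datacenter name is a placeholder, and the template assumes the Nomad API is reachable on the client's own IP.

```hcl
job "autoscaler" {
  datacenters = ["dc1"] # placeholder datacenter name

  group "autoscaler" {
    count = 1

    task "autoscaler" {
      driver = "docker"

      config {
        image   = "hashicorp/nomad-autoscaler:0.4.0" # assumed tag; pin your own
        command = "nomad-autoscaler"
        args    = ["agent", "-config", "${NOMAD_TASK_DIR}/config.hcl"]
      }

      # Render the autoscaler's agent config into the task directory.
      template {
        data = <<EOF
nomad {
  address = "http://{{ env "attr.unique.network.ip-address" }}:4646"
}
EOF

        destination = "${NOMAD_TASK_DIR}/config.hcl"
      }

      resources {
        cpu    = 50  # MHz
        memory = 128 # MB
      }
    }
  }
}
```

With only the nomad block configured, the agent falls back on the plugins it launches by default, which is enough for task-group policies that use Nomad's own metrics.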

The whole design is plugin-based, and it helps to think in three plugin categories:

  • APM plugins: where the metrics come from. Nomad's own metrics, Prometheus, or Datadog. This is the signal that drives decisions.
  • Target plugins: what actually gets scaled. A Nomad task group, or a cloud autoscaling group: AWS ASG, GCP managed instance group, Azure VMSS.
  • Strategy plugins: how the decision is made. The common one is target-value (keep some metric near a target, e.g. 70% CPU), plus threshold, pass-through, and fixed-value.
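In the autoscaler's own agent config, each of those categories shows up as a block of the same name. The sketch below wires in one plugin of each kind. The Prometheus address and AWS region are placeholders, and you would only declare the plugins you actually use.

```hcl
# Where the Nomad API lives (placeholder address)
nomad {
  address = "http://127.0.0.1:4646"
}

# APM plugin: the metrics source
apm "prometheus" {
  driver = "prometheus"
  config = {
    address = "http://prometheus.example.internal:9090" # hypothetical host
  }
}

# Target plugin: the thing being scaled
target "aws-asg" {
  driver = "aws-asg"
  config = {
    aws_region = "us-east-1" # placeholder region
  }
}

# Strategy plugin: how a metric turns into a desired count
strategy "target-value" {
  driver = "target-value"
}
```

The label on each block is the name that policies refer to, for example in a check's source field or a strategy block's label.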

With those pieces, the autoscaler covers two distinct jobs.

1. Horizontal application autoscaling

This changes the allocation count of a task group — more copies of your service when load rises, fewer when it falls. You declare it right inside the job spec with a scaling block:

group "frontend" {
  count = 2

  scaling {
    enabled = true
    min     = 2
    max     = 10

    policy {
      cooldown            = "1m"
      evaluation_interval = "10s"

      check "cpu_load" {
        source = "prometheus"
        query  = "avg(nomad_client_allocs_cpu_total_percent{task_group='frontend'})"

        strategy "target-value" {
          target = 70
        }
      }
    }
  }
}

This says: keep average CPU around 70%, never drop below 2 or climb above 10 allocations, and wait a minute after any change before acting again. Note the check block: a policy can have several, each querying a different metric, and when their results disagree the autoscaler acts on the safest one, preferring to scale out over scaling in. That's a big step up from the early days when a policy could only watch one number.

The built-in Nomad APM — scaling without Prometheus

You don't actually need Prometheus to get started. The autoscaler ships with a built-in Nomad APM that reads CPU and memory straight from Nomad's own state — no external monitoring stack at all. It's the default: if you omit the source field, this is what runs. Queries use a simple <operation>_<metric> form instead of PromQL:

check "cpu_usage" {
  # No `source` line also works — the Nomad APM is the default
  source       = "nomad-apm"
  query        = "avg_cpu-allocated"   # avg CPU across the group's allocations
  query_window = "1m"

  strategy "target-value" {
    target = 70
  }
}

Operations are avg, min, max and sum; metrics are cpu, memory, cpu-allocated and memory-allocated. There's one honest limit worth knowing: the Nomad APM only understands CPU and memory. The moment you want to scale on request latency, queue depth, or anything application-specific — or scale a cluster all the way down to zero clients — you reach for Prometheus or Datadog instead. But for the bread-and-butter case of "run more copies when CPU climbs," the built-in APM gets you there with zero extra infrastructure, which is exactly the kind of low-friction default that fits Nomad's whole philosophy.
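Since a policy can carry more than one check, a group can watch CPU and memory together without leaving the built-in APM. This is a minimal sketch. The limits and targets are illustrative numbers, not recommendations.

```hcl
scaling {
  enabled = true
  min     = 2
  max     = 10

  policy {
    cooldown            = "1m"
    evaluation_interval = "10s"

    # CPU used, as a percentage of what the allocations were given
    check "cpu_usage" {
      source       = "nomad-apm"
      query        = "avg_cpu-allocated"
      query_window = "1m"

      strategy "target-value" {
        target = 70
      }
    }

    # Memory used, as a percentage of what the allocations were given
    check "memory_usage" {
      source       = "nomad-apm"
      query        = "avg_memory-allocated"
      query_window = "1m"

      strategy "target-value" {
        target = 80
      }
    }
  }
}
```

For intuition on the numbers: the target-value strategy's documented rule is next count = current count × (metric ÷ target). Four allocations averaging 90% CPU against a target of 70 work out to about 5.1, which the autoscaler turns into a whole number of allocations within the min and max bounds.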

2. Horizontal cluster autoscaling

The other half: adding and removing actual client nodes. When the metrics in your policy show the fleet running out of room (allocated CPU or memory climbing toward capacity, for example), the autoscaler talks to your cloud provider's scaling group and brings up more machines; when the cluster is over-provisioned, it drains a node gracefully (so allocations migrate cleanly) and terminates it. The target here is something like aws-asg or gce-mig instead of a task group, but the policy structure is the same. This is the piece that turns "fixed fleet of servers" into "pay for what you're actually running."
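A cluster policy lives in its own file that the autoscaler loads from its policy directory, rather than inside a job spec. Here is a sketch modelled on the shape of HashiCorp's published examples. The ASG name, node class and PromQL query are assumptions you would replace with your own.

```hcl
scaling "cluster_policy" {
  enabled = true
  min     = 1 # never fewer than one client node
  max     = 5

  policy {
    cooldown            = "2m"
    evaluation_interval = "1m"

    # Percentage of CPU already allocated across clients in this node class
    check "cpu_allocated_percentage" {
      source = "prometheus"
      query  = "sum(nomad_client_allocated_cpu{node_class='web'}) * 100 / sum(nomad_client_allocated_cpu{node_class='web'} + nomad_client_unallocated_cpu{node_class='web'})"

      strategy "target-value" {
        target = 70
      }
    }

    # Scale the AWS Auto Scaling group that backs these clients
    target "aws-asg" {
      dry-run             = "false"
      aws_asg_name        = "nomad-clients-web" # hypothetical ASG name
      node_class          = "web"               # hypothetical node class
      node_drain_deadline = "5m"                # time allowed for allocations to migrate
    }
  }
}
```

Setting dry-run to "true" first is a cheap way to watch what the policy would do before letting it touch real instances.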

A note on vertical scaling

There's also Dynamic Application Sizing — the autoscaler recommending right-sized CPU/memory values for your tasks rather than changing counts. Be aware this one is an Enterprise-only feature, not in the community build. The two horizontal modes above are free.

Nomad vs Kubernetes — the honest comparison

This is the question everyone actually has, so let me not dodge it.

Where Nomad wins: operational simplicity. One binary, a config file, and you have a cluster — the conceptual surface area is a fraction of Kubernetes. It schedules non-containerised workloads, which K8s simply can't. It's lighter on resources, so it's viable on small clusters and edge nodes where a full K8s control plane would be absurd. For a team that wants orchestration without hiring a platform team to run the orchestrator, it's a genuinely different cost structure.

Where Kubernetes wins: the ecosystem, and it's not close. Helm charts, operators, CNI plugins, service meshes, a CNCF universe of tooling, and an enormous hiring pool already fluent in it. If you need something, someone has built a K8s solution for it. Nomad's autoscaler is clean and capable, but the Kubernetes scaling story — HPA, VPA, Cluster Autoscaler, KEDA for event-driven scaling — is broader and more battle-tested across edge cases. And "everyone already knows Kubernetes" is a real, unglamorous advantage when you're hiring.

The honest summary: Nomad is the better tool for a surprising number of teams, and the worse career bet for a surprising number of engineers. Both of those things are true at once.

When Nomad fits — and when it doesn't

Pick Nomad if: you run a mix of containerised and non-containerised workloads; you want orchestration without a dedicated platform team; you're at small-to-medium scale, on-prem, or at the edge; or you value being able to understand your whole stack in an afternoon. The autoscaler then gives you elasticity without bolting on a separate scaling system.

Skip Nomad if: your team and tooling are already deep in Kubernetes and it's working; you depend heavily on the CNCF ecosystem (operators, service meshes, the Helm world); you need vertical autoscaling and don't want to pay for Enterprise; or the BSL licensing is a dealbreaker for your organisation. In those cases the friction of being a smaller community isn't worth it.

My take

Nomad is one of those tools that quietly makes you question how much accidental complexity you've been carrying. The first time you run a real workload on it and realise there's no control plane to nurse, no etcd to back up, no fifteen YAML files for a single service — it's a small relief that adds up. The autoscaler extends that same philosophy: it does horizontal app and cluster scaling cleanly, it's configured in the same HCL you already write, and it stays out of your way.

I won't pretend it's the right call everywhere. The licensing change is a real mark against it, and the gravitational pull of the Kubernetes ecosystem is hard to argue with. But for the right workload — heterogeneous, modest-scale, run by a small team — Nomad plus the autoscaler is a genuinely sane place to be. It's the orchestrator I reach for when Kubernetes feels like answering a question nobody asked.

