Docker Success Center

The Docker enterprise customer portal.

Docker, Inc.

Docker Reference Architecture: Universal Control Plane 2.0 Service Discovery and Load Balancing


Service discovery is an integral part of any distributed system and service-oriented architecture. As applications are increasingly moving towards microservices and service-oriented architectures, the operational complexity of these environments can increase. Service discovery will register the service and publish its connectivity information so that other services are aware of how to connect to the service.

Docker Datacenter on Docker Engine 1.12 includes service discovery and load balancing capabilities to aid the devops initiatives across any organization. Service discovery and load balancing make it easy for developers to create applications that can dynamically discover each other. Also, these features simplify the scaling of applications by operations engineers.

Docker Engine 1.12 introduced a new way to deploy applications called services. A service consists of containers created from the same image, and each service is made up of tasks that execute on worker nodes and define the state of the application. A service definition is supplied when the service is created. It specifies, among other things, the containers that comprise the service, which ports are published, which networks are attached, and the number of replicas. Together, these make up the desired state of the service. If a node or an individual service task fails a health check, the cluster reconciles the service state by rescheduling the affected tasks on another healthy node. For all of this orchestration to work seamlessly, new features were added to Docker to aid in service discovery, load balancing, scaling, and reconciliation events.

What You Will Learn

This reference architecture covers the solutions that Docker Datacenter on Docker Engine 1.12 provides in the topic areas of service discovery and load balancing. As services are created, we will explain how DNS is used for service discovery while also going into detail about the different routing meshes that are built into Docker to ensure your application will remain highly available. The release of UCP 2.0 introduces a new application layer routing mesh called the HTTP Routing Mesh (HRM) that routes HTTP traffic based on DNS hostname. We will develop an understanding of how the HRM works and how it integrates with the other Docker service discovery and load balancing features.

DNS for Service Discovery

Docker uses embedded DNS to provide service discovery for containers running on a single Docker Engine and tasks running in a Docker Swarm. Docker Engine has an internal DNS server that provides name resolution to all of the containers on the host in user-defined bridge, overlay, and MACVLAN networks. Each Docker container (or task in swarm mode) has a DNS resolver that forwards DNS queries to Docker Engine, which acts as a DNS server. Docker Engine then checks whether the DNS query belongs to a container or service on each network that the requesting container belongs to. If it does, Docker Engine looks up the IP address that matches the container, task, or service name in its key-value store and returns that IP or service Virtual IP (VIP) to the requester.

Service discovery is network-scoped, meaning only containers or tasks that are on the same network can use the embedded DNS functionality. Containers not on the same network cannot resolve each other's addresses. Additionally, only the nodes that have containers or tasks on a particular network store that network's DNS entries. This promotes security and performance.

If the destination container or service and the source container are not on the same network, Docker Engine forwards the DNS query to the configured default DNS server.

Figure: Service Discovery

In this example there is a service of two containers called myservice. A second service (client) exists on the same network. The client executes two curl operations, one for an external hostname and one for myservice. These are the resulting actions:

  • DNS queries are initiated by client for the external hostname and for myservice.
  • The container's built-in resolver intercepts the DNS queries and sends them to Docker Engine's DNS server.
  • myservice resolves to the Virtual IP (VIP) of that service, which is internally load balanced to the individual task IP addresses. Container names are resolved as well, albeit directly to their IP addresses.
  • The external hostname does not exist as a service name in the mynet network, so the request is forwarded to the configured default DNS server.
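
The lookups described above can be reproduced from inside a running container. The following is a minimal sketch, assuming the client image includes nslookup (busybox does) and that the helper is run on a node that hosts one of the client's tasks. The helper name resolve_from is hypothetical. The tasks.myservice name is the swarm mode DNS entry that returns the individual task IPs instead of the VIP.

```shell
# Run nslookup for a name from inside the first local container whose name
# matches the given service. Assumes the image ships nslookup.
resolve_from() {
  service="$1"
  name="$2"
  cid="$(docker ps --quiet --filter "name=${service}" | head -n 1)"
  if [ -z "$cid" ]; then
    echo "no local container found for ${service}" >&2
    return 1
  fi
  docker exec "$cid" nslookup "$name"
}

# Usage (on a node running a client task):
#   resolve_from client myservice         # expected to return the service VIP
#   resolve_from client tasks.myservice   # expected to return the task IPs
```

Because service discovery is network-scoped, the same lookup from a container on a different network should fail to resolve myservice.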

Internal Load Balancing

When services are created in a Docker Swarm cluster, they are automatically assigned a Virtual IP (VIP) that is part of the service's network. The VIP is returned when resolving the service's name. Traffic to that VIP will be automatically sent to all healthy tasks of that service across the overlay network. This approach avoids any client-side load balancing because only a single IP is returned to the client. Docker takes care of routing and equally distributing the traffic across the healthy service tasks.

Figure: Internal Load Balancing

To see the VIP, run the docker service inspect myservice command like so:

# Create an overlay network called mynet
$ docker network create -d overlay mynet

# Create myservice with 2 replicas as part of that network
$ docker service create --network mynet --name myservice --replicas 2 busybox ping localhost

# See the VIP that was created for that service
$ docker service inspect myservice

"VirtualIPs": [
                    "NetworkID": "a59umzkdj2r0ua7x8jxd84dhr",
                    "Addr": ""

DNS round robin (DNS RR) load balancing is another load balancing option for services (configured with --endpoint-mode). In DNS RR mode, a VIP is not created for each service. The Docker DNS server resolves a service name to individual container IPs in round robin fashion.
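
To compare the two endpoint modes, the sketch below wraps two Docker CLI calls in helper functions. The function names and the myservice-rr service name are hypothetical. The --endpoint-mode dnsrr flag and the .Endpoint.VirtualIPs field of docker service inspect are part of the Docker CLI, but treat the exact output as something to verify on your own cluster.

```shell
# Print "<network-id> <address>" for every VIP assigned to a service.
service_vips() {
  docker service inspect \
    --format '{{range .Endpoint.VirtualIPs}}{{.NetworkID}} {{.Addr}}{{println}}{{end}}' \
    "$1"
}

# Create a service that uses DNS round robin instead of a VIP.
create_dnsrr_service() {
  network="$1"
  name="$2"
  replicas="$3"
  docker service create \
    --endpoint-mode dnsrr \
    --network "$network" \
    --name "$name" \
    --replicas "$replicas" \
    busybox ping localhost
}

# Usage:
#   service_vips myservice
#   create_dnsrr_service mynet myservice-rr 2
#   service_vips myservice-rr   # a DNS RR service is expected to list no VIPs
```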

External Load Balancing (Swarm Mode Routing Mesh)

You can expose services externally by using the --publish flag when creating or updating the service. Publishing ports in Docker Swarm mode means that every node in your cluster will be listening on that port. But what happens if the service's task isn't on the node that is listening on that port?

This is where the routing mesh comes into play. The routing mesh is a new feature in Docker 1.12 that combines ipvs and iptables to create a powerful cluster-wide transport-layer (L4) load balancer. It allows all the Swarm nodes to accept connections on the service's published ports. When any Swarm node receives traffic destined for the published TCP/UDP port of a running service, it forwards the traffic to the service's VIP using a pre-defined overlay network called ingress. The ingress network behaves similarly to other overlay networks, but its sole purpose is to transport mesh routing traffic from external clients to cluster services. It uses the same VIP-based internal load balancing as described in the previous section.

Once you launch services, you can create an external DNS record for your applications and map it to any or all Docker Swarm nodes. You do not need to worry about where your container is running, because the routing mesh makes all nodes in your cluster look like one.

# Create a service with two replicas and publish port 8000 on the cluster
$ docker service create --name app --replicas 2 --network appnet --publish 8000:80 nginx

Figure: Routing Mesh

This diagram illustrates how the routing mesh works.

  • A service is created with two replicas, and it is port mapped externally to port 8000.
  • The routing mesh exposes port 8000 on each host in the cluster.
  • Traffic destined for the app can enter on any host. In this case the external LB sends the traffic to a host without a service replica.
  • The kernel's IPVS load balancer redirects traffic on the ingress overlay network to a healthy service replica.
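
One way to observe this behavior is to request the published port on every node, including nodes that run no replica. The sketch below is an illustration only: the helper name probe_published_port and the node addresses in the usage line are hypothetical, and it assumes curl is installed on the machine running the probe.

```shell
# Request the published port on each node and report whether it answered.
probe_published_port() {
  port="$1"
  shift
  for node in "$@"; do
    if curl --silent --output /dev/null --max-time 5 "http://${node}:${port}/"; then
      echo "${node}: reachable"
    else
      echo "${node}: unreachable"
    fi
  done
}

# Usage (hypothetical node addresses):
#   probe_published_port 8000 10.0.0.11 10.0.0.12 10.0.0.13
```

With the routing mesh active, every node in the swarm is expected to answer, regardless of where the two replicas are scheduled.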

The HTTP Routing Mesh

The swarm mode routing mesh is great for transport-layer routing: it routes to services using the service's published ports. But what if you wanted to route traffic to services based on hostname instead? The HTTP Routing Mesh (HRM) is a new feature added in Docker Datacenter on Engine 1.12 that enables service discovery on the application layer (L7). The HRM extends the swarm mode routing mesh available in Docker 1.12 by adding application layer capabilities such as inspecting the HTTP header. The HRM and the swarm mode routing mesh are used together for flexible and robust service delivery. With the HRM, each service becomes accessible via a DNS label passed to the service. As the service scales horizontally and more replicas are added, requests are round-robin load balanced across them as well.

Note: The HRM feature is currently labeled as an experimental feature. Since the HRM is experimental, it is currently unsupported under the Docker SLA. However, you can use the Docker support channels to provide feedback on the feature, and best efforts will be used to fix any issues while under experimental status.

The HRM works by using the HTTP/1.1 header field definition. Every HTTP/1.1 request contains a Host: header. An HTTP request header can be viewed using curl:

$ curl -v
* Rebuilt URL to:
*   Trying
* Connected to ( port 80 (#0)
> GET / HTTP/1.1
> Host:
> User-Agent: curl/7.49.1
> Accept: */*

As mentioned, the swarm mode routing mesh and the HRM are used in tandem. When a service is created with the com.docker.ucp.mesh.http label, the HRM configuration is updated so that all HTTP requests containing the Host: header specified in that label are routed to the VIP of the newly created service. Since the HRM is itself a service, it is accessible on any node in the cluster using the configured published port.

Below is a diagram that displays a higher level view of how the swarm mode routing mesh and HRM work together.

Figure: HRM High Level

  • The traffic comes in from the external load balancer into the swarm mode routing mesh.
  • The HRM service was configured on port 80, so any request to port 80 on the UCP cluster will go to the HRM service.
  • All services attached to the ucp-hrm network can utilize the HRM to have traffic routed based on their HTTP Host: header.

Looking closer, we can see what the HRM is doing. The graphic below is a closer look at the previous diagram.

Figure: HRM Up Close

  • Traffic comes in through the swarm mode routing mesh on the ingress network to the HRM service's published port.
  • As services are created, they are assigned a VIP on the swarm mode routing mesh (L4).
  • The HRM receives the TCP packet and inspects the HTTP header.
    • Services that contain the label com.docker.ucp.mesh.http are checked against the HTTP Host: header.
    • If the Host: header and a service label match, then traffic is routed to the service's VIP using the swarm mode routing mesh (L4).
  • If a service contains multiple replicas, then each replica container will be load balanced via round-robin using the internal L4 routing mesh.
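
The matching step above can be modeled in a few lines of shell. This is a conceptual sketch, not the HRM's actual implementation: the routing table, the hostnames, the shop-app service, and the route_by_host function are all hypothetical, and the port stripping and case folding reflect general HTTP hostname handling rather than documented HRM behavior.

```shell
# Hypothetical routing table: one "hostname service-name" pair per line,
# standing in for the hostnames collected from com.docker.ucp.mesh.http labels.
ROUTES="demo.example.com demo-hrm-app
shop.example.com shop-app"

# Print the service that should receive a request with the given Host: header.
# Prints nothing when no label matches.
route_by_host() {
  # Drop an optional :port suffix and lowercase the name before comparing.
  wanted="$(printf '%s' "${1%%:*}" | tr 'A-Z' 'a-z')"
  printf '%s\n' "$ROUTES" | while read -r host service; do
    if [ "$host" = "$wanted" ]; then
      printf '%s\n' "$service"
      break
    fi
  done
}

# Usage:
#   route_by_host "demo.example.com"      # prints demo-hrm-app
#   route_by_host "unknown.example.com"   # prints nothing
```

Once a service is selected, the request would be handed to that service's VIP, and the L4 routing mesh would pick the replica.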

Differences Between the HRM and Swarm Mode Routing Mesh

The main difference between the HRM and swarm mode routing mesh is that the HRM is intended to be used only for HTTP traffic at the application-layer, while the swarm mode routing mesh works at a lower level on the transport-layer.

Deciding which to use depends on the application. If the application is intended to be publicly accessible and is an HTTP service then the HRM could be a good fit. If mutual TLS is required to the backend application, then using the transport layer would probably be preferred.

Another advantage of using the HRM is that less configuration is required for traffic to be routed to the service. Often, the only steps are configuring a DNS record and setting the label on the service. If a wildcard DNS entry is used, then no configuration outside of setting the service label is necessary. In many organizations, access to load balancers and DNS is restricted. Being able to control requests to applications with just a service label can empower developers to iterate quickly. With the swarm mode routing mesh alone, any front-end load balancer would need to be configured to send traffic to the service's published port.

Enabling the HRM

The HTTP Routing Mesh can be enabled from the UCP web console.

To enable it:

  1. Log in to the UCP web console
  2. Navigate to Admin Settings > Routing Mesh
  3. Check Enable HTTP Routing Mesh
  4. Configure the port for the HRM to publish the service on


Once enabled, UCP creates a service on your swarm cluster that routes traffic to the specified container based on the HTTP Host: header. Since the HRM service is a swarm mode service, every node in the UCP cluster can route traffic to the HRM by receiving traffic on its published port. For example, if the HRM is configured on port 80, it exposes port 80 cluster-wide, and any request on port 80 to the cluster is sent to the HRM.

The HRM creates an overlay network called ucp-hrm, and your application container will need to attach to this network to use the HRM. All containers labeled with the com.docker.ucp.mesh.http label and attached to the ucp-hrm network will have their traffic routed to their respective service tasks. If a service is created and does not reside in the ucp-hrm overlay network, then it won't be able to use the HRM. A service can reside in multiple networks, but to make use of HRM, the ucp-hrm network needs to be one of the attached networks.

HRM Requirements

There are three requirements for services to use the HRM.

  1. One of the networks that the service resides on must be the ucp-hrm overlay network
  2. The service must publish one or more ports
  3. The service must contain a --label of com.docker.ucp.mesh.http

Configuring DNS with the HRM

In this section we will cover how to configure DNS for services using the HRM. In order to use the HRM, a DNS record for the service needs to point to the UCP cluster. This can be accomplished through a variety of different ways because of the flexibility that the swarm mode routing mesh provides.

If a service needs to be publicly accessible at a given hostname, then the DNS record for that hostname can be configured in one of the following ways:

  1. Configure DNS to point to any single node on the UCP cluster. All requests for the hostname are routed through that node to the HRM.
  2. Configure round-robin DNS to point to multiple nodes on the UCP cluster. Any node that receives a request for the hostname routes it to the HRM.
  3. The best solution for high availability is to configure an external HA load balancer in front of the UCP cluster. There are some considerations to keep in mind when configuring it this way:
    • Set the DNS record for the hostname to point to the external load balancer.
    • The external load balancer should point to multiple UCP nodes that reside in different availability zones for increased resiliency.
    • Configure the external load balancer to perform a TCP health check on the HRM service's configured published port so that traffic will route through healthy UCP nodes.
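
The TCP health check is normally configured in the load balancer itself, but the same check can be approximated by hand when validating a setup. The sketch below assumes a netcat build that supports the -z and -w options. The function name healthy_nodes and the node addresses are hypothetical.

```shell
# Print the nodes that accept TCP connections on the HRM's published port.
healthy_nodes() {
  hrm_port="$1"
  shift
  for node in "$@"; do
    # -z: connect without sending data; -w 2: give up after two seconds.
    if nc -z -w 2 "$node" "$hrm_port" >/dev/null 2>&1; then
      printf '%s\n' "$node"
    fi
  done
}

# Usage (hypothetical node addresses, HRM published on port 80):
#   healthy_nodes 80 10.0.0.11 10.0.0.12 10.0.0.13
```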


HRM Usage

Now that we understand how the HRM works and the requirements associated with it, we will cover the syntax for using the HRM and work through some examples.

The syntax for the com.docker.ucp.mesh.http label is a list of one or more values separated by commas. Each of these values is in the form of internal_port=protocol://host, where:

  • internal_port is the port the service is listening on (and may be omitted if there is only one port published)
  • protocol is http
  • host is the hostname that should be routed to this service
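
To make the label syntax concrete, the following sketch splits a label value into its parts. It only illustrates the format described above and is not how the HRM parses labels internally. The function name parse_hrm_label and the example hostnames are hypothetical.

```shell
# Split a com.docker.ucp.mesh.http label value into "port protocol host" lines.
# An omitted internal_port is printed as "-".
parse_hrm_label() {
  old_ifs="$IFS"
  IFS=','
  set -f   # disable globbing while the value is split on commas
  for entry in $1; do
    case "$entry" in
      *=*://*) port="${entry%%=*}"; url="${entry#*=}" ;;
      *://*)   port="";             url="$entry" ;;
      *)
        echo "invalid entry: $entry" >&2
        IFS="$old_ifs"; set +f
        return 1
        ;;
    esac
    proto="${url%%://*}"
    host="${url#*://}"
    printf '%s %s %s\n' "${port:--}" "$proto" "$host"
  done
  IFS="$old_ifs"
  set +f
}

# Usage (hypothetical hostnames):
#   parse_hrm_label '8080=http://demo.example.com,http://other.example.com'
```

The first entry in the usage line names the internal port explicitly. The second omits it, which the syntax allows when only one port is published.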

Using the HRM can be done via the CLI or the UCP web console.

UCP Web Console

Creating services to utilize the HRM can also be done via the UCP web console. This section will cover creating a demo service that can be used by the HRM.

  1. Log in to the UCP web console.
  2. On top, go to the Resources tab.
  3. Go to Services in the left pane and click the Create Service button.
  4. Enter demo-hrm-app as the Service Name.
  5. Enter ehazlett/docker-demo:latest as the Image Name.
  6. Enter -close-conn in the Arguments field.
  7. Click Next to go to the next page.

Figure: UCP Console Details

  1. The defaults can be used for the Scheduling tab. Click Next to go to the next page.
  2. On the Resources tab, publish port 8080 and set ucp-hrm as an attached network. Click Next once those resources are configured.

Figure: UCP Console Resources

  1. On the final Environment page, add the label to be used by HRM. The key must be com.docker.ucp.mesh.http. Set the value to what traffic this service should receive such as 8080=

Figure: UCP Console Environment

Once the service is deployed and the DNS record from its label is configured to point to your UCP cluster, then it should be accessible from a web browser. To scale the services from the UCP web console, select the service from the Services tab on the left pane. Select the service, go to the Scheduling tab, and set the scale to the desired scale.

Docker Client CLI

When using a UCP client bundle and the Docker CLI, creating a service utilizing the HRM can be done by passing a label upon service creation.

The --label of com.docker.ucp.mesh.http is a key and a corresponding value is 80= Once a service is created with these values, the HRM picks up the label and reconfigures itself. Let's create a demo service using the HRM.

docker service create -p 8080 \
    --network ucp-hrm \
    --name demo-hrm-app \
    --label com.docker.ucp.mesh.http=8080= \
    ehazlett/docker-demo:latest -close-conn

In this example command:

  • The demo application listens on port 8080 by default. The -p flag is only given one port, which means an external port will be dynamically assigned (starting at port 30000) to the service and mapped to port 8080 for the application container.
  • The newly created service must reside on the ucp-hrm overlay network to work with the HRM.
  • The label com.docker.ucp.mesh.http=8080= is grabbed by the HRM, and all requests passed through the HRM with the HTTP Host: header of will route to the application containers.

You can now access the demo application by going to the domain provided to the HRM in a web browser. Currently there is no load balancing since the service was created with only one replica, but you should be able to see the current container that is serving content through the HRM.
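
If the DNS record is not in place yet, the route can still be exercised by sending the Host: header explicitly to any node. This is a minimal sketch, assuming curl is available. The function name hrm_request, the node address, and the hostname in the usage line are hypothetical and should be replaced with the values used in your service label.

```shell
# Send one request to the HRM on a given node, overriding the Host: header.
hrm_request() {
  node="$1"
  hrm_port="$2"
  host="$3"
  curl --silent --show-error --header "Host: ${host}" "http://${node}:${hrm_port}/"
}

# Usage (hypothetical node address and hostname, HRM published on port 80):
#   hrm_request 10.0.0.11 80 demo.example.com
```

Repeating the request after scaling the service is a simple way to check that responses come from more than one replica.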

Now let's try scaling up the service that was just created with the following command:

docker service scale demo-hrm-app=5

There should be five replicas of demo-hrm-app running now, and it should be viewable from the service DNS label in a web browser.

Non Swarm Mode Containers

The HRM and swarm mode routing mesh are only supported for applications deployed using "services." For non-swarm mode containers, such as containers running on pre-1.12 Docker Engines and applications deployed without services (e.g., using docker run), Interlock and NGINX must be used.

Interlock is a containerized, event-driven tool that connects to the UCP controllers and watches for events. In this case, events are containers being started or stopped. Interlock also looks for certain metadata on these containers, such as configured hostnames or labels. It then uses the metadata to register or de-register these containers with a load-balancing backend (NGINX). The load balancer uses the updated backend configuration to direct incoming requests to healthy containers. Both Interlock and the load balancer containers are stateless, and hence can be scaled horizontally across multiple nodes to provide a highly available load balancing service for all deployed applications.

There are three requirements for containers to use Interlock and NGINX:

I. Interlock and NGINX need to be deployed on one or more UCP worker nodes.

The easiest way to deploy Interlock and NGINX is by using Docker Compose in the UCP portal:

  1. Log into the UCP web console.
  2. On top, go to the Resources tab.
  3. Go to Applications in the left pane, and click the Deploy compose.yml button.
  4. Enter interlock as the Application Name.
  5. For the compose.yml file, enter the following sample compose.yml file. You can alter the Interlock or NGINX config as you desire. Full documentation is on GitHub.

    interlock:
        image: ehazlett/interlock:1.3.0
        command: -D run
        tty: true
        ports:
            - 8080
        environment:
            INTERLOCK_CONFIG: |
                ListenAddr = ":8080"
                DockerURL = "tcp://${UCP_CONTROLLER_IP}:2376"
                TLSCACert = "/certs/ca.pem"
                TLSCert = "/certs/cert.pem"
                TLSKey = "/certs/key.pem"
                PollInterval = "10s"

                [[Extensions]]
                Name = "nginx"
                ConfigPath = "/etc/nginx/nginx.conf"
                PidPath = "/etc/nginx/"
                MaxConn = 1024
                Port = 80
        volumes:
            - ucp-node-certs:/certs
        restart: always
    nginx:
        image: nginx:latest
        entrypoint: nginx
        command: -g "daemon off;" -c /etc/nginx/nginx.conf
        ports:
            - 80:80
        labels:
            - ""
        restart: always

    Note: Substitute `UCP_NODE_NAME` and `UCP_CONTROLLER_IP`. `UCP_NODE_NAME` is the name of the node that you wish to run Interlock and NGINX on (as displayed under the Resources/Nodes section). The DNS name for your application(s) needs to resolve to this node. `UCP_CONTROLLER_IP` is the IP or DNS name of one or more UCP controllers.

  6. Click Create to deploy Interlock and NGINX.

Note: You can deploy Interlock and NGINX on multiple nodes by repeating steps 3-6 above and changing the Application Name and UCP_NODE_NAME. This allows you to front these nodes with an external load-balancer (e.g., ELB or F5) for high availability. The DNS records for your applications would then need to be registered to the external load-balancer IP.

II. The container must publish one or more ports.

III. The container must be launched with Interlock labels.

For example, to deploy a container that exposes port 8080 and is accessed on the DNS name, launch it as follows:

docker run --name demo -p 8080 --label interlock.hostname=demo --label interlock.domain=<domain> ehazlett/docker-demo:dcus

Once you launch your container with the correct labels and published ports, you can access it using the desired DNS name.


The ability to scale and discover services in Docker is now easier than ever. With the service discovery and load balancing features built into Docker, engineers can spend less time creating these types of supporting capabilities on their own and more time focusing on their applications. Instead of creating API calls to set DNS for service discovery, Docker automatically handles it for you. If an application needs to be scaled, Docker takes care of adding it to the load balancer pool. By leveraging these features, organizations can deliver highly available applications in a shorter amount of time.

Document Version: 1.0.1

Tested on: Docker CS Engine 1.12.3-cs3, UCP 2.0