How to functionally test Layer 7 Loadbalancing

Article ID: KB000998

Issue

When faced with a problem accessing a service behind a proxy, it's important to isolate the different points where the client/server path could fail. Commonly, we access a service from our local workstation; the request travels across the Internet to the datacenter, where it is load-balanced to a worker node, and finally traverses an overlay network to the container. Because there are many variables in this system, we need a repeatable methodology to determine which component in the system is failing.

This document outlines a process that can be used to functionally test the Layer 7 Loadbalancing features in Docker Enterprise Edition, but the same approach can be applied to other service meshes, proxies, and similar components that operate at Layer 7. There is a separate, expanded document on troubleshooting the underlying Layer 4 container networking.

Prerequisites

  • You are operating a Docker Enterprise Edition cluster with the Layer 7 Loadbalancing feature enabled
  • You have at least 1 service deployed using Interlock labels for Layer 7 Loadbalancing
  • You have a UCP client bundle configured in your shell

For this example, we assume an EE cluster configured for production with:

  • 2 nodes labeled as loadbalancers, both running Interlock in host mode, with IPs:
      • 10.0.0.10
      • 10.0.0.11
  • Interlock listening on ports 8080 and 8443 for HTTP and HTTPS respectively
  • A DNS record resolving *.app.domain.com to an AWS Network LoadBalancer and Target Group containing our 2 loadbalancer nodes [10.0.0.10, 10.0.0.11]

Methodology

Start by examining the core pieces of the Interlock Architecture.

Of primary concern are the ucp-interlock and ucp-interlock-extension logs. Check to make sure that the Interlock application is detecting and completing updates. Upon deploying a new service, we expect to see log messages such as the following:

$ docker service logs -f ucp-interlock
... time="2019-06-10T18:44:19Z" level=info msg="interlock interlock/2.0.0-dev (08d890a1) linux/amd64"
... time="2019-06-10T18:44:19Z" level=info msg="starting server"
... time="2019-06-10T18:44:22Z" level=info msg="update detected" currentVersion= updatedVersion=3edc6f
... time="2019-06-18T12:39:19Z" level=info msg="update detected" currentVersion=3edc6f updatedVersion=6e1626
... time="2019-06-18T12:39:21Z" level=info msg="configured proxy service" id=122wzraywvkpyjt6gdb8oc9pq service_cluster=

Note that when we deploy a new service we expect to see an update detected log message, and when Interlock finishes configuring itself we expect to see configured proxy service. If Interlock never reaches the configured stage, then there may be a problem with the labels attached to your services. Also check that your services are stable: every time Interlock polls and detects a new container, it reconfigures itself. If one of your services is constantly creating new containers, such as when it is stuck in a restart loop, then Interlock may be "stuck" updating constantly.
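
As a quick check of the labels, you can dump a service's labels and confirm that the Interlock labels (for example com.docker.lb.hosts, com.docker.lb.port, and com.docker.lb.network) carry the values you expect. A sketch, where simple_foo is a placeholder service name:

$ docker service inspect simple_foo --format '{{ json .Spec.Labels }}'

A crash-looping service will show a stream of failed tasks in its task history, for example: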

$ docker service ps crashes_foo
ID                  NAME                IMAGE                  NODE               DESIRED STATE       CURRENT STATE            ERROR                       PORTS
23lh9uhwhus8        crashes_foo.1       ehazlett/demo:latest   ip-172-31-10-254   Ready               Preparing 1 second ago
zrmcbkvzevgr         \_ crashes_foo.1   ehazlett/demo:latest   ip-172-31-29-30    Shutdown            Failed 2 seconds ago     "task: non-zero exit (1)"
u0623sodf28r         \_ crashes_foo.1   ehazlett/demo:latest   ip-172-31-29-30    Shutdown            Failed 9 seconds ago     "task: non-zero exit (1)"
0qosnzy7x3d1        crashes_foo.2       ehazlett/demo:latest   ip-172-31-34-126   Running             Starting 1 second ago
t4a4k95q5xdv         \_ crashes_foo.2   ehazlett/demo:latest   ip-172-31-12-191   Shutdown            Failed 6 seconds ago     "task: non-zero exit (1)"
o8ot98rvj7mw         \_ crashes_foo.2   ehazlett/demo:latest   ip-172-31-12-191   Shutdown            Failed 14 seconds ago    "task: non-zero exit (1)"

$ docker service logs ucp-interlock
... time="2019-06-27T19:55:34Z" level=info msg="update detected" currentVersion=15e243 updatedVersion=be1286
... time="2019-06-27T19:55:36Z" level=info msg="proxy service is currently being updated; skipping update" service=122wzraywvkpyjt6gdb8oc9pq state=updating
... time="2019-06-27T19:55:41Z" level=info msg="proxy service is currently being updated; skipping update" service=122wzraywvkpyjt6gdb8oc9pq state=updating
... time="2019-06-27T19:55:46Z" level=info msg="proxy service is currently being updated; skipping update" service=122wzraywvkpyjt6gdb8oc9pq state=updating

To bring Interlock into a stable configuration, temporarily scale this service to 0 replicas:

docker service scale crashes_foo=0

To force Interlock to re-check for tasks, you can also scale the ucp-interlock and ucp-interlock-extension services to 0 and then back to 1. This is a safe procedure, as both ucp-interlock and ucp-interlock-extension are stateless services, and is preferred over deleting or toggling Layer 7 Loadbalancing off and on in the UCP GUI:

docker service scale ucp-interlock=0 ucp-interlock-extension=0 \
  && docker service scale ucp-interlock=1 ucp-interlock-extension=1

Next, check the proxy config file that Interlock has generated.

The ucp-interlock-extension service is responsible for taking the list of tasks from ucp-interlock and building a docker config object containing the proxy's config.

$ docker config ls -f name=com.docker.interlock.proxy
ID                          NAME                                CREATED              UPDATED
3ipc8r05mkxp7n84y04czb4zs   com.docker.interlock.proxy.966719   7 minutes ago        7 minutes ago
7pe17li8re9w7dqiyyyfrbnsx   com.docker.interlock.proxy.a91774   About a minute ago   About a minute ago

$ docker config inspect --pretty 3ipc8r05mkxp7n84y04czb4zs | less
...

Now we can view the generated config file for your proxy extension. By default, the extension service builds an Nginx config file.

Search the generated config file for your service's hostname string:

$ docker config inspect --pretty 3ipc8r05mkxp7n84y04czb4zs | grep -A4 foo.app.domain.com
    upstream up-foo.app.domain.com {
        zone up-foo.app.domain.com_backend 64k;
        server 10.0.26.3:8080;
        server 10.0.26.4:8080;
--
  server_name foo.app.domain.com;
--
            proxy_pass http://up-foo.app.domain.com;
        }

Notice that the nginx.conf file contains an upstream and server_name section for your service, and server directives for each replica of your service. We can cross-reference the IP addresses used by the upstream with docker network inspect -v:

$ docker network inspect -v simple_demo
...
        "Containers": {
            "1e4f22aed7c05818f10d11f25fbf28a8c31793290ff1b2cf4fc56c5857e0c624": {
                "Name": "simple_foo.1.ev25mzv4b2xcztxjnvp6j65ir",
                "EndpointID": "5b7a6996447e4c1a053e62a92f6133e677b5f60514fd8fcfdbaf42bcaa4e4d94",
                "MacAddress": "02:42:0a:00:1a:03",
                "IPv4Address": "10.0.26.3/24",
                "IPv6Address": ""
            },
            "e3556bb0ef90787330284f7f1525434b33b2fcad90801c43a13be13c39b8fc95": {
                "Name": "simple_foo.2.g5ku12udhw69gb12cpw4tyaux",
                "EndpointID": "3718043c34c71d5e808bb3e0c7536ce13103f9eac5f4a494d7f7b0a858ae4b68",
                "MacAddress": "02:42:0a:00:1a:04",
                "IPv4Address": "10.0.26.4/24",
                "IPv6Address": ""
            },
        },
...

Check that the ucp-interlock-proxy service is stable:

$ docker service ps ucp-interlock-proxy -f desired-state=running
ID                  NAME                    IMAGE                              NODE               DESIRED STATE       CURRENT STATE         ERROR               PORTS
1szbvvfhtxnm        ucp-interlock-proxy.1   docker/ucp-interlock-proxy:3.1.5   ip-172-31-44-143   Running             Running 2 hours ago
6pi8hxwbjnw9        ucp-interlock-proxy.2   docker/ucp-interlock-proxy:3.1.5   ip-172-31-10-254   Running             Running 2 hours ago

You'll notice that whenever ucp-interlock detects a new task, the ucp-interlock-proxy config is updated and the ucp-interlock-proxy containers are restarted in order to mount the new config. This may cause a small period of downtime for your applications on Interlock. It can be avoided by using the VIP mode option on your services, at the cost of a small increase in connection latency, because requests are routed through the service VIP before reaching the target container.
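
As a sketch of how VIP mode can be enabled on an existing service (assuming the com.docker.lb.backend_mode service label supported by recent Interlock releases; check the documentation for your version), update the service in place:

$ docker service update --label-add com.docker.lb.backend_mode=vip simple_foo

In VIP mode the generated upstream targets the service's virtual IP rather than individual task IPs, so scaling tasks up and down no longer forces a proxy restart.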

At this point we can confirm several things:

  • Interlock application is stable and reading the Docker API correctly
  • The nginx.conf file contains the correct upstream and server lines to redirect our queries to the right containers
  • ucp-interlock-proxy is stable and listening on our dedicated loadbalancer nodes

Next, we can functionally test ucp-interlock-proxy itself by using curl.

We know that our Interlock proxies are pinned to 10.0.0.10 and 10.0.0.11, and we also know that we are using an AWS Network LoadBalancer to terminate connections from the Internet to our cluster.

So, if we model our network path from our workstation to the container, we have something like:

client requests                   (AWS NLB)                  (ucp-interlock-proxy)               (simple_foo)
http://foo.app.domain.com ---> *.app.domain.com:80/443 ---> [10.0.0.10, 10.0.0.11]:8080/8443 ---> {service}

We want to isolate the different sections of the network path and check for functionality at each layer. We will work backwards up the chain, starting with the innermost connection in the path (the connection between ucp-interlock-proxy and our service) and moving outwards. Since our workstation is located outside the LB, we need to ssh into the cluster so our connection does not traverse the LB. It's convenient to ssh to one of the dedicated loadbalancer nodes so we can use the 127.0.0.1 loopback address to reach ucp-interlock-proxy.

We will use curl to send a specifically crafted HTTP request to ucp-interlock-proxy:

$ curl -LSsvvvv -H "Host: foo.app.domain.com" http://127.0.0.1:8080 | head
* Rebuilt URL to: http://127.0.0.1:8080/
*   Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 8080 (#0)
> GET / HTTP/1.1
> Host: foo.app.domain.com
> User-Agent: curl/7.47.0
> Accept: */*
>
< HTTP/1.1 200 OK
< Server: nginx/1.14.0
< Date: Thu, 27 Jun 2019 21:16:17 GMT
< Content-Type: text/html; charset=utf-8
< Transfer-Encoding: chunked
< Connection: keep-alive
< x-request-id: acf996106d8a0bf65d1f6364b66c5d28
< x-proxy-id: d0d4970ab287
< x-server-info: interlock/2.0.0-dev (08d890a1) linux/amd64
< x-upstream-addr: 10.0.26.4:8080
< x-upstream-response-time: 3655709.734
<
{ [6505 bytes data]
* Connection #0 to host 127.0.0.1 left intact
<!DOCTYPE html>
<html lang="en">
    <head>
        <meta charset="utf-8">
        <title></title>

We have chosen to use some specific curl flags in this example to make the HTTP transaction more clear. Let's examine each of them in context:

  • curl the HTTP client issuing the request
  • -L follow redirects, just in case the base URL for our application is a 301/302
  • -Ss be silent, except for errors; don't print the download progress bar
  • -vvvv use maximum verbosity, to show extra curl debugging output, including the headers exchanged
  • -H "Host: foo.app.domain.com" include a custom "Host" header
  • http://127.0.0.1:8080 the request URI

Because we use name-based virtual hosting and our request URI http://127.0.0.1:8080 does not contain any hostname, omitting the custom "Host" header leaves Nginx with no server_name to match, so it returns a 503 from its default server instead of proxying the request:

$ curl -vvvv -LSs http://127.0.0.1:8080 | head
* Rebuilt URL to: http://127.0.0.1:8080/
*   Trying 127.0.0.1...
* Connected to 127.0.0.1 (127.0.0.1) port 8080 (#0)
> GET / HTTP/1.1
> Host: 127.0.0.1:8080
> User-Agent: curl/7.47.0
> Accept: */*
>
< HTTP/1.1 503 Service Temporarily Unavailable
< Server: nginx/1.14.0
< Date: Thu, 27 Jun 2019 21:27:21 GMT
< Content-Type: text/html
< Content-Length: 537
< Connection: keep-alive
< ETag: "5b98580f-219"
<
{ [537 bytes data]
* Connection #0 to host 127.0.0.1 left intact
<!DOCTYPE html>
<html>
<head>
<title>Error</title>
...

If you are hosting HTTPS (SSL/TLS) sites with Interlock, it leverages Server Name Indication (SNI) to present multiple TLS certificates on the same IP address and TCP port. With SNI, the client indicates which hostname it is connecting to at the start of the TLS handshake. Because this happens before any HTTP headers are exchanged, the --header "Host: " flag alone is not sufficient when testing HTTPS sites with curl.

curl supports another flag, --resolve, which we can use instead.
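
The --resolve flag maps a hostname and port to a fixed IP address, so the TLS handshake (and therefore SNI) uses the real hostname while the TCP connection still goes to the loopback address. A minimal sketch, assuming foo.app.domain.com is published over HTTPS on the Interlock port 8443 (add -k if the certificate is not trusted by your workstation):

$ curl -LSsvvvv --resolve foo.app.domain.com:8443:127.0.0.1 https://foo.app.domain.com:8443 | head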

Next, we can expand the test to include the section between the AWS NLB and ucp-interlock-proxy.
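
The exact test here depends on your environment, but the idea is to repeat the same curl request from a machine inside the datacenter/VPC, this time targeting the loadbalancer nodes' published ports (and then the NLB itself) rather than the loopback address. A sketch, using the node IPs and ports from the example configuration above:

$ curl -LSsvvvv -H "Host: foo.app.domain.com" http://10.0.0.10:8080 | head
$ curl -LSsvvvv -H "Host: foo.app.domain.com" http://10.0.0.11:8080 | head

If both nodes respond with the expected x-upstream-addr header, the proxy layer is healthy on its published ports, and any remaining failure lies in front of it (NLB listeners, target group health checks, or security groups).
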
Finally, we will expand the test to include the section between the AWS NLB and our client machine.
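
From the client workstation, the request no longer needs a custom Host header or --resolve, because the public DNS name should resolve to the NLB and the NLB should forward to the Interlock nodes. A sketch, assuming the wildcard DNS record from the example configuration:

$ dig +short foo.app.domain.com
$ curl -LSsvvvv http://foo.app.domain.com | head

If DNS resolves to the NLB but the request fails in a way that the earlier tests did not, the problem lies between the client and the NLB (DNS, NLB listeners, or the target group) rather than inside the cluster.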

When connectivity is established all the way out to the client, we can consider this a successful functional test.

What's Next