
Swarm join fails with transport error on 18.09 engine with http proxy configuration

Article ID: KB000935

Issue

If you have configured the Docker Engine 18.09 with the HTTP_PROXY or HTTPS_PROXY environment variables, then the following error may prevent joining or promoting manager nodes:

Error response from daemon: manager stopped: can't initialize raft node: rpc error: code = Unavailable desc = all SubConns are in TransientFailure, latest connection error: connection error: desc = "transport: Error while dialing dial tcp: lookup proxy.example.com on 192.168.121.1:53: no such host"

This error appears on the command line in response to docker swarm join with a manager token, or in the docker info output on the node being promoted, after the promotion fails.

Prerequisites

  • Docker Engine prior to 18.09.4
  • Joining or promoting a manager node
  • HTTP or HTTPS proxy configured (the configuration is visible in docker info)
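
The prerequisites above can be checked on a node with a short script. The following is a minimal sketch, not an official tool: the function names are hypothetical, it assumes that only 18.09.0 through 18.09.3 count as affected (based on the title and the fixed version), and it assumes that docker info prints "HTTP Proxy:" and "HTTPS Proxy:" lines when a proxy is set.

```shell
#!/bin/sh
# Hypothetical helpers; names are illustrative, not part of Docker.

# Succeeds if the engine version is an 18.09 release older than 18.09.4.
# Assumption: only 18.09.0 through 18.09.3 are treated as affected.
is_affected_version() {
    case "$1" in
        18.09.[0-3]|18.09.[0-3]-*) return 0 ;;
        *) return 1 ;;
    esac
}

# Succeeds if `docker info` text on stdin reports a non-empty
# "HTTP Proxy:" or "HTTPS Proxy:" value.
has_proxy_configured() {
    grep -Eiq '^[[:space:]]*https? proxy:[[:space:]]*[^[:space:]]'
}

# Combines both checks against the local engine. Not called automatically.
check_node() {
    version="$(docker version --format '{{.Server.Version}}')"
    if is_affected_version "$version" && docker info 2>/dev/null | has_proxy_configured; then
        echo "Engine $version with a proxy configured: manager join or promotion may fail"
    else
        echo "Engine $version: prerequisites for this issue are not met"
    fi
}
```

Run check_node on the node that is joining or being promoted as a manager. Worker joins are not covered by this article.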

Root Cause

An update to the Docker Swarm gRPC libraries unintentionally caused them to start respecting proxy environment variables HTTP_PROXY, HTTPS_PROXY, and NO_PROXY.

Resolution

A patch at https://github.com/docker/swarmkit/pull/2802 was proposed to address this issue by restoring the original behavior of ignoring proxy environment variables. The issue has been fixed in Docker Engine 18.09.4: "Fixed issue for swarm nodes not being able to join as masters if http proxy is set." [moby/moby#36951]

Workaround

Add all manager IP addresses to the NO_PROXY environment variable, using the instructions at docs.docker.com.

Note: You must list each manager node's Swarm IP address individually in NO_PROXY. Docker EE Engine 18.09 and earlier do not support CIDR notation for NO_PROXY.
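
As an illustration of the workaround, the sketch below prints a systemd drop-in that lists each manager IP individually in NO_PROXY, per the note above. It is a sketch under stated assumptions: the helper names, the proxy port, and the three manager IP addresses are placeholders, and it assumes the engine is managed by systemd and that both proxy variables point at the same proxy.

```shell
#!/bin/sh
# Hypothetical helpers; replace the placeholder proxy URL and IPs with real values.

# Join the given manager IPs into a comma-separated NO_PROXY value.
# Each IP is listed individually; CIDR ranges are not used.
build_no_proxy() {
    result="localhost,127.0.0.1"
    for ip in "$@"; do
        result="${result},${ip}"
    done
    printf '%s\n' "$result"
}

# Print a systemd drop-in for the Docker service.
# Usage: render_dropin PROXY_URL MANAGER_IP [MANAGER_IP...]
render_dropin() {
    proxy_url="$1"
    shift
    no_proxy="$(build_no_proxy "$@")"
    printf '[Service]\n'
    printf 'Environment="HTTP_PROXY=%s"\n' "$proxy_url"
    printf 'Environment="HTTPS_PROXY=%s"\n' "$proxy_url"
    printf 'Environment="NO_PROXY=%s"\n' "$no_proxy"
}

# Example with placeholder values (port 3128 and the IPs are assumptions).
render_dropin "http://proxy.example.com:3128" 192.168.121.10 192.168.121.11 192.168.121.12
```

On a systemd host, the output would typically be saved as /etc/systemd/system/docker.service.d/http-proxy.conf, followed by sudo systemctl daemon-reload and sudo systemctl restart docker. After the restart, docker info should show the new No Proxy value.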
