docker stack rm fails when removing overlay networks with Interlock backend services on them:
$ docker stack remove myapp
Removing service myapp_web
Removing network myapp_default
Failed to remove network vbxm26mirq7no6kn09pofknd8: Error response from daemon: Error response from daemon: network myapp_default id vbxm26mirq7no6kn09pofknd8 has active endpointsFailed to remove some resources from stack: myapp
This happens the first time docker stack rm is issued. If it is issued again about a minute later, docker stack rm myapp removes the network.
- Universal Control Plane version 3.0.x starting with version 3.0.0
- Interlock Layer 7 Routing
Interlock's ability to dynamically reconfigure itself to use new networks can violate the ownership model of Docker stacks, under which a stack expects to exclusively manage the lifecycle of its own resources. This behavior was not present in the predecessor to Interlock, Host Routing Mesh (HRM), because HRM operated on long-lived networks external to any stack.
docker stack is implemented with individual client-side operations. The ucp-interlock-extension service watches the swarm API and asynchronously adds or removes the ucp-interlock-proxy service to or from networks with Interlock backend services on them.
Sequence of Events
This sequence details the events that occur when the following example stack is deployed and then removed.
docker stack deploy myapp -c - <<EOSTACK
version: "3.2"
services:
  web:
    image: nginx
    deploy:
      labels:
        com.docker.lb.hosts: app.example.org
        com.docker.lb.network: myappnet
        com.docker.lb.port: 80
EOSTACK
- docker stack deploy creates a default network myapp_default and deploys the service myapp_web.
- (within seconds) The ucp-interlock service notices the new service, polls for an updated nginx.conf, and then updates the ucp-interlock-proxy service, adding it to myapp_default.
- (some time later) The user issues docker stack rm to remove the service and begin removing the network. Network removal fails because ucp-interlock-proxy is still on the network.
- (within seconds) ucp-interlock notices there are no Interlock-labelled services on the network and polls for an updated nginx.conf, which it propagates to ucp-interlock-proxy to remove it from the network.
- (after the ucp-interlock-proxy service update completes) A repeated docker stack rm succeeds in removing the network.
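The window in the sequence above can be observed by checking whether ucp-interlock-proxy is still attached to the stack's network. The following is a minimal sketch, not an official procedure. The function name proxy_on_network is hypothetical, and the sketch assumes that the proxy's network attachments appear under .Spec.TaskTemplate.Networks in docker service inspect output, as either a network ID or a network name.

```shell
# Hypothetical helper: returns 0 while ucp-interlock-proxy is still attached
# to the given network, 1 once it has detached, and 2 if either inspect call
# fails (for example, because the network has already been removed).
proxy_on_network() {
  net_name=$1
  net_id=$(docker network inspect --format '{{.Id}}' "$net_name") || return 2
  # Assumption: attachments are listed in the service spec's task template.
  targets=$(docker service inspect \
    --format '{{range .Spec.TaskTemplate.Networks}}{{.Target}} {{end}}' \
    ucp-interlock-proxy) || return 2
  # Compare each attachment against both the ID and the name.
  for target in $targets; do
    if [ "$target" = "$net_id" ] || [ "$target" = "$net_name" ]; then
      return 0
    fi
  done
  return 1
}
```

For example, proxy_on_network myapp_default && echo "proxy still attached" could be run between the two docker stack rm attempts.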
This issue is tracked by Docker-internal engineering ticket ID escalation/731 and currently has no known resolution. As of 7 Sept 2018 Docker engineering was evaluating behavioral changes in swarm mode to address this issue.
There are two operational accommodations for this issue. One involves changes to stack files and the other involves changes to the affected CI/CD pipeline.
External Network Work Around
Create swarm networks prior to deploying stacks and reference them in stacks as
external networks. Under this configuration stack removal will not try to remove the networks, thereby avoiding the issue.
Source a Universal Control Plane client certificate bundle.
Create an overlay network:
docker network create -d overlay myappnet
Create a docker-compose.yml that declares and uses the network:
version: "3.2"
services:
  web:
    image: nginx
    deploy:
      labels:
        com.docker.lb.hosts: app.example.org
        com.docker.lb.network: myappnet
        com.docker.lb.port: 80
    networks:
      myappnet:
networks:
  myappnet:
    external: true
Deploy the stack:
docker stack deploy myapp_externalnetwork -c docker-compose.yml
Wait a minute for Interlock to reconfigure itself to use the network, then remove the stack:
docker stack remove myapp_externalnetwork
Note: stack removal succeeds because the network is declared as
external, so the problematic network removal is skipped.
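Because the external network now outlives the stack, an automated deployment may need to create it only when it is missing. The following is a minimal sketch of such a step. The function name ensure_overlay_network is hypothetical, and the sketch assumes that docker network inspect exits non-zero when the named network does not exist.

```shell
# Hypothetical helper: create the overlay network only when it is missing,
# so the step can be re-run safely before every stack deployment.
ensure_overlay_network() {
  net_name=$1
  if docker network inspect "$net_name" >/dev/null 2>&1; then
    echo "network $net_name already exists"
  else
    docker network create -d overlay "$net_name"
  fi
}
```

Calling ensure_overlay_network myappnet before docker stack deploy would then stand in for the manual docker network create step.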
CI/CD Retry Work Around
Implement a docker stack rm retry in the affected CI/CD pipeline. Eventually (typically after about 30 seconds) Interlock will remove itself from the affected network, and docker stack rm will succeed.
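A retry step of this kind could look like the following minimal sketch. The function name retry_stack_rm and the default attempt count and delay are illustrative choices, not values recommended by Docker. The sketch assumes that docker stack rm exits with a non-zero status when it fails to remove some resources.

```shell
# Hypothetical helper: retry `docker stack rm` until it succeeds or the
# attempt budget is used up. Assumes a non-zero exit status on failure.
retry_stack_rm() {
  stack=$1
  max_attempts=${2:-5}   # illustrative default
  delay=${3:-15}         # seconds between attempts, illustrative default
  attempt=1
  while [ "$attempt" -le "$max_attempts" ]; do
    if docker stack rm "$stack"; then
      return 0
    fi
    echo "docker stack rm $stack: attempt $attempt of $max_attempts failed" >&2
    # Sleep only when another attempt will follow.
    if [ "$attempt" -lt "$max_attempts" ]; then
      sleep "$delay"
    fi
    attempt=$((attempt + 1))
  done
  return 1
}
```

In a pipeline, retry_stack_rm myapp 5 15 would try up to five times with 15 seconds between attempts, and the job fails only if every attempt fails.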