docker swarm: dns resolution fails #121
Comments
From @mcastelino on April 27, 2017 17:56 @devimc We need to add this to our release notes.
From @devimc on April 27, 2017 18:49 great! thanks @mcastelino and nice description
Initial proposal for support for DNS resolution for clear containers with docker swarm. The internal DNS resolution can be supported by running a DNS proxy agent within each Clear Container VM, which then forwards the request to the host side network namespace to the docker DNS resolver. This can be done in one of two ways
However, neither of these methods will solve the resolution of external DNS. Today the docker DNS resolver running in the host namespace does another round of DNS resolution for external names using the host DNS configuration, but it sends that request from within the container network namespace. In the case of Clear Containers there is no network connectivity between the host and the host side container namespace. To support this there are two options
The challenge is that DNS works fine when not running in a swarm (and it works when we run CC in kubernetes). So the agent will need to detect this and only activate itself when running in a swarm, which will end up being a very specific hack! If there are better ideas, I would welcome them before I jump into this implementation.
@mcastelino @sameo -- we need a refresh on this: the way I see it we either:
(sorry if I am mistaken and this is already resolved - do we change the resolv.conf today?)
From @mcastelino on April 27, 2017 17:37
When running Clear Container based containers in a docker swarm, DNS resolution does not work for either internal or external DNS when the resolution is performed from within the Clear Container.
This is due to the way the DNS resolution is implemented within docker swarm.
DNS Resolution in Swarm
All docker swarm containers have the DNS resolver set to 127.0.0.11:53
Docker swarm has an internal DNS based load balancer that round-robins the DNS requests to spread load.
The resolver listens on localhost on the host, bound to a host port specific to the container.
https://github.com/docker/libnetwork/blob/5ac04367ae7b0b12c33bed5f5b395bd4c104fff9/sandbox.go#L815
There is an iptables rule injected into the container namespace which is used to implement the docker DNS load balancer/resolver. That way 127.0.0.11:53 maps to the specific port on which the corresponding resolver is running.
Here the DNS request is NATed to a container specific TCP and UDP port.
The resolver in this case is dockerd
In the case of Clear Containers there is currently no way for a DNS request from within the VM to reach the dockerd running on the host side. The only host connectivity that the VM has is via the docker_gwbridge. However, the DNS resolver running within the network namespace is not reachable from the VM.
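One way to observe this from inside a container is to send a DNS query straight to 127.0.0.11:53 and see whether anything answers. The following is a minimal sketch, not part of the original report: the helper names `build_dns_query` and `resolver_reachable` are hypothetical, and it only checks that some reply with a matching query ID comes back within the timeout.

```python
import socket
import struct


def build_dns_query(name: str, query_id: int = 0x1234) -> bytes:
    """Build a minimal DNS query for an A record (hypothetical helper)."""
    # Header: ID, flags (recursion desired), 1 question, 0 other records.
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    # QNAME: each label prefixed by its length, terminated by a zero byte.
    qname = b"".join(
        bytes([len(label)]) + label.encode("ascii")
        for label in name.rstrip(".").split(".")
    ) + b"\x00"
    # QTYPE = 1 (A), QCLASS = 1 (IN).
    return header + qname + struct.pack("!HH", 1, 1)


def resolver_reachable(name: str,
                       server: tuple = ("127.0.0.11", 53),
                       timeout: float = 2.0) -> bool:
    """Return True if the resolver answers the query before the timeout."""
    query = build_dns_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, server)
            reply, _ = sock.recvfrom(512)
        except OSError:  # covers timeouts and unreachable/refused errors
            return False
    # A valid reply has at least a 12-byte header and echoes the query ID.
    return len(reply) >= 12 and reply[:2] == query[:2]
```

Based on the description above, calling `resolver_reachable` with a service name would be expected to return `False` inside a Clear Container VM, since the query never reaches dockerd.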
Network setup with Clear Containers
Internal DNS Resolution
Internal DNS resolution is handled completely by dockerd. So dockerd directly responds to the DNS request from the container process for any cluster local resource.
External DNS Resolution
External DNS resolution is not handled by dockerd. When dockerd is unable to resolve the name to a cluster local resource it will then perform a DNS resolution using the host's resolv.conf.
Hence the DNS resolution process for an external name is
Note that dockerd sends packets out from within the namespace to the host via the interface bound to the docker_gwbridge.
In the case of Clear Containers, as there is no network connectivity between the container network namespace and the host, this request can never be fulfilled.
Workaround for External DNS
For external DNS resolution, the resolv.conf can be updated to point to an external DNS resolver. This will ensure that external DNS resolution works.
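As a rough illustration of that workaround, the sketch below rewrites resolv.conf contents so that the only `nameserver` entry is an external resolver. The function name `point_resolv_conf_at` is hypothetical, and `192.0.2.53` is a documentation-range placeholder rather than a real resolver; substitute one that is reachable from the container.

```python
def point_resolv_conf_at(contents: str, nameserver: str) -> str:
    """Replace every nameserver line with a single external resolver.

    Hypothetical helper: other directives (search, options) are kept as-is.
    """
    kept = [
        line for line in contents.splitlines()
        if not line.strip().startswith("nameserver")
    ]
    kept.append(f"nameserver {nameserver}")
    return "\n".join(kept) + "\n"


# Example input resembling what a swarm container might see (assumed).
original = "search example.internal\nnameserver 127.0.0.11\noptions ndots:0\n"
updated = point_resolv_conf_at(original, "192.0.2.53")
```

Note that dropping the 127.0.0.11 entry means cluster-local names are no longer sent to dockerd at all, so this only addresses external names.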
Fixing this issue in Clear Containers
The long term plan is to proxy the internal DNS requests from within the VM to dockerd. If that resolution fails, the lookup has to be performed from within the VM against the host resolver. However, assuming that the host resolver is the right resolver to use when dockerd fails to resolve a name may not be correct. This approach also results in longer resolution times, as dockerd takes a significant amount of time to fail the external DNS request.
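The fallback order described in this plan could be sketched roughly as follows. This is an illustration under assumptions, not the Clear Containers agent: `resolve_with_fallback` and the `Resolver` type are hypothetical, and the two resolvers are passed in as plain callables so that the transport to dockerd and to the host resolver stays out of scope.

```python
from typing import Callable, Optional

# A resolver takes a name and returns an address, or None if it cannot
# resolve the name (hypothetical interface for this sketch).
Resolver = Callable[[str], Optional[str]]


def resolve_with_fallback(name: str,
                          internal: Resolver,
                          external: Resolver) -> Optional[str]:
    """Try the proxied dockerd resolver first, then the fallback resolver."""
    try:
        address = internal(name)
    except OSError:
        # Treat transport errors and timeouts like a failed lookup.
        address = None
    if address is not None:
        return address
    # Only reached after the internal lookup has failed, which is where
    # the extra latency mentioned above comes from.
    return external(name)
```

For example, with `internal = {"web": "10.0.0.5"}.get` and `external = {"example.com": "192.0.2.10"}.get` (made-up addresses), a lookup for `web` is answered by the internal resolver and a lookup for `example.com` falls through to the external one.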
Copied from original issue: intel/cc-oci-runtime#854