Troubleshooting Intermittent DNS Resolution Issues on EKS Clusters

By Sam Babaiants

May 14, 2024

Prior to joining Rearc, I worked as a cloud support engineer at AWS specializing in container services, specifically EKS. Of the many interesting cases I solved for customers during my time there, the ones that stand out involve DNS resolution issues.

I am going to share the strategy I used to troubleshoot these issues, along with their common root causes.

VPC DNS Throttling issue

Let’s say you have an EKS cluster running some workload, and you notice that some pods intermittently fail to resolve domain names. The most common cause is VPC DNS throttling [1]. Amazon-provided DNS servers enforce a limit of 1024 packets per second per elastic network interface (ENI), and they reject any traffic that exceeds this limit.

EKS clusters use CoreDNS as their DNS solution. Typically, all DNS requests from pods are sent to the CoreDNS replicas, which forward requests for external names to the VPC DNS resolver. By default, an EKS cluster comes with two CoreDNS pods (replicas). The CoreDNS deployment has a podAntiAffinity rule that tells the scheduler to prefer placing these pods on different worker nodes:

...
        podAntiAffinity:
          preferredDuringSchedulingIgnoredDuringExecution:
          - podAffinityTerm:
              labelSelector:
                matchExpressions:
                - key: k8s-app
                  operator: In
                  values:
                  - kube-dns
              topologyKey: kubernetes.io/hostname
            weight: 100
...

However, this rule is only a preference. If the number of available worker nodes is less than the number of CoreDNS pods, several CoreDNS pods will be scheduled on the same worker node. Since we cannot control which of the node’s ENIs a CoreDNS pod is assigned to, several CoreDNS pods might share the same ENI on that worker node.

When the cluster workload sends lots of DNS requests, they are distributed among the CoreDNS pods. If the cluster does not have enough CoreDNS pods running on different nodes, the number of requests per ENI can exceed the VPC DNS resolver limit, and some DNS requests will be dropped.
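The effect of pod placement on the per-ENI rate can be sketched with a little arithmetic. The following is a rough model, not a measurement tool: it assumes upstream queries are spread evenly across CoreDNS pods and that each query is one packet. The pod names, ENI IDs, and the `total_upstream_pps` figure are hypothetical inputs you would have to collect yourself.

```python
from collections import defaultdict

# Per-ENI limit enforced by the Amazon-provided DNS servers (from the text above).
VPC_DNS_PPS_LIMIT = 1024


def per_eni_rate(pod_to_eni, total_upstream_pps):
    """Split the upstream query rate evenly across pods, then sum per ENI."""
    if not pod_to_eni:
        raise ValueError("at least one CoreDNS pod is required")
    per_pod = total_upstream_pps / len(pod_to_eni)
    rates = defaultdict(float)
    for eni in pod_to_eni.values():
        rates[eni] += per_pod
    return dict(rates)


def throttled_enis(pod_to_eni, total_upstream_pps, limit=VPC_DNS_PPS_LIMIT):
    """Return the ENIs whose estimated rate exceeds the limit."""
    rates = per_eni_rate(pod_to_eni, total_upstream_pps)
    return sorted(eni for eni, rate in rates.items() if rate > limit)


# Hypothetical placements: both pods behind one ENI vs. one pod per ENI.
shared = {"coredns-a": "eni-1", "coredns-b": "eni-1"}
spread = {"coredns-a": "eni-1", "coredns-b": "eni-2"}

print(throttled_enis(shared, 1500))  # eni-1 carries all 1500 pps
print(throttled_enis(spread, 1500))  # 750 pps per ENI, under the limit
```

With the same total load, the shared placement exceeds the limit while the spread placement does not, which is why the number of distinct ENIs matters more than the number of pods.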

Solution

There are a few ways to resolve the issue; the best one depends on the cluster configuration.

Case 1

The cluster always has enough worker nodes to schedule additional CoreDNS pods. In this case, increase the number of CoreDNS replicas, for example:

kubectl scale --replicas=8 deployment/coredns -n kube-system
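To pick a replica count, one rough approach is to divide the peak upstream query rate by the per-ENI limit, leaving some headroom. The sketch below assumes each replica ends up behind its own ENI, that load is spread evenly, and that one query is one packet; `peak_upstream_pps` and the `headroom` factor are hypothetical inputs, not values from AWS documentation.

```python
import math

# Per-ENI limit enforced by the Amazon-provided DNS servers (from the text above).
VPC_DNS_PPS_LIMIT = 1024


def min_coredns_replicas(peak_upstream_pps, headroom=0.5, limit=VPC_DNS_PPS_LIMIT):
    """Estimate how many replicas keep each ENI below `headroom * limit`.

    Assumes one replica per ENI and an even spread of queries.
    Never returns fewer than 2, the EKS default replica count.
    """
    if peak_upstream_pps < 0:
        raise ValueError("peak_upstream_pps must be non-negative")
    if not 0 < headroom <= 1:
        raise ValueError("headroom must be in (0, 1]")
    needed = math.ceil(peak_upstream_pps / (limit * headroom))
    return max(2, needed)


print(min_coredns_replicas(4000))                # 8 with 50% headroom
print(min_coredns_replicas(4000, headroom=1.0))  # 4 with no headroom
```

Treat the result as a starting point and confirm it against the actual error rate after scaling.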

Case 2

The cluster does not have enough worker nodes available for scheduling additional CoreDNS pods. In this case, I recommend creating a dedicated Fargate profile for CoreDNS pods and launching the number of replicas the cluster needs. Each Fargate pod will be launched on its own Fargate node with a dedicated ENI.

Case 3

The workload is highly scalable, and the cluster needs to change the number of CoreDNS pods based on the number of nodes. In this case you can:

Install NodeLocal DNSCache [2]. It deploys a DaemonSet pod on each node and redirects DNS requests from that node’s pods to it.

Use the DNS horizontal autoscaler [3] and modify the CoreDNS deployment so that multiple replicas cannot be scheduled on the same worker node. To do so, replace the podAntiAffinity snippet shown earlier with this one:

...
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
          - labelSelector:
              matchExpressions:
              - key: k8s-app
                operator: In
                values:
                - kube-dns
            topologyKey: kubernetes.io/hostname
...
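To see how the DNS horizontal autoscaler would size the deployment, here is a small sketch of its "linear" mode as described in the Kubernetes documentation: the replica count is the larger of a cores-based and a nodes-based estimate, clamped to optional bounds. The parameter values used in the example are illustrative, not recommendations.

```python
import math


def linear_replicas(cores, nodes, cores_per_replica, nodes_per_replica,
                    min_replicas=1, max_replicas=None):
    """Replica count for the autoscaler's linear mode.

    replicas = max(ceil(cores / coresPerReplica), ceil(nodes / nodesPerReplica)),
    then clamped to [min_replicas, max_replicas].
    """
    by_cores = math.ceil(cores / cores_per_replica)
    by_nodes = math.ceil(nodes / nodes_per_replica)
    replicas = max(by_cores, by_nodes)
    if max_replicas is not None:
        replicas = min(replicas, max_replicas)
    return max(replicas, min_replicas)


# Illustrative settings: one replica per 256 cores or per 16 nodes.
print(linear_replicas(cores=64, nodes=10, cores_per_replica=256, nodes_per_replica=16))
print(linear_replicas(cores=160, nodes=40, cores_per_replica=256, nodes_per_replica=16))
```

Because the required anti-affinity rule allows only one CoreDNS pod per node, a replica that cannot get its own node stays Pending. It is therefore worth checking that the settings you choose cannot ask for more replicas than the cluster has nodes.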

Network protocol issue

Other potential issues relate to the network protocols (UDP and TCP) that CoreDNS and NodeLocal DNSCache use. There are two problems here:

Some DNS clients (on the pod side) cannot correctly interact with CoreDNS via UDP when DNS responses are truncated because of their large size (for example, this happens with the latest images of busybox).
Some EC2 instance types have an undocumented bug, which causes a high volume of DNS connections via TCP to fail.
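To make the first problem concrete: when a response does not fit into a UDP message, the server sets the TC (truncated) flag in the DNS header, and a well-behaved client is expected to retry the query over TCP. The sketch below only inspects that flag in a raw DNS message; the sample headers are hand-built for illustration rather than captured from a real resolver.

```python
import struct

# Bit mask of the TC (truncated) flag in the 16-bit DNS header flags field (RFC 1035).
TC_FLAG = 0x0200


def is_truncated(message: bytes) -> bool:
    """Return True if the DNS message header has the TC flag set."""
    if len(message) < 12:
        raise ValueError("a DNS message header is 12 bytes long")
    # The header starts with a 16-bit ID followed by the 16-bit flags field.
    _query_id, flags = struct.unpack("!HH", message[:4])
    return bool(flags & TC_FLAG)


# Hand-built headers: ID, flags, then four 16-bit section counts.
truncated = struct.pack("!HHHHHH", 0x1234, 0x8380, 1, 0, 0, 0)  # QR, TC, RD, RA
complete = struct.pack("!HHHHHH", 0x1234, 0x8180, 1, 1, 0, 0)   # QR, RD, RA

print(is_truncated(truncated))  # True
print(is_truncated(complete))   # False
```

A client that ignores this flag, or cannot fall back to TCP, ends up with an incomplete answer, which shows up as an intermittent resolution failure for names with large responses.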

Solution

We cannot settle on only one of these protocols, because each has issues in some situations. A tried-and-true solution is to use UDP by default and switch to TCP only when UDP fails to process a DNS request. Luckily, CoreDNS has this option [5].

The forward plugin has two relevant options, force_tcp and prefer_udp. In an AWS environment, we should specify prefer_udp in all situations. Here is how to make the change in NodeLocal DNSCache, if you use it.

Edit the node-local-dns config map:

kubectl edit cm node-local-dns -n kube-system

Replace this snippet (it appears three times):

...
        forward . __PILLAR__CLUSTER__DNS__ {
                force_tcp
        }
...

with this one (force_tcp → prefer_udp):

...
        forward . __PILLAR__CLUSTER__DNS__ {
                prefer_udp
        }
...

and this one:

...
        forward . __PILLAR__UPSTREAM__SERVERS__
...

with this one (add the prefer_udp parameter):

...
        forward . __PILLAR__UPSTREAM__SERVERS__ {
                prefer_udp
        }
...

Using the prefer_udp parameter should help us avoid issues with the network protocols.
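If you would rather script the change than edit by hand, the two replacements above can be sketched as a text transformation. This is a minimal sketch that assumes you have the Corefile text in a string (for example, exported from the config map). The function name is hypothetical, and the output should be reviewed before it is applied to a cluster.

```python
import re


def use_prefer_udp(corefile: str) -> str:
    """Rewrite forward directives in Corefile text to use prefer_udp."""
    # 1) Swap force_tcp for prefer_udp inside existing forward blocks.
    text = re.sub(r"\bforce_tcp\b", "prefer_udp", corefile)

    # 2) Give bare "forward <zone> <upstream...>" lines (no "{" block)
    #    a block that contains prefer_udp, keeping the indentation.
    def add_block(match):
        indent, directive = match.group(1), match.group(2)
        return f"{indent}{directive} {{\n{indent}    prefer_udp\n{indent}}}"

    bare_forward = r"^([ \t]*)(forward[ \t]+\S+(?:[ \t]+[^\s{]+)+)[ \t]*$"
    return re.sub(bare_forward, add_block, text, flags=re.MULTILINE)


sample = (
    "forward . __PILLAR__CLUSTER__DNS__ {\n"
    "        force_tcp\n"
    "}\n"
    "forward . __PILLAR__UPSTREAM__SERVERS__\n"
)
print(use_prefer_udp(sample))
```

Running the function a second time on its own output leaves the text unchanged, so it is safe to re-run.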

Summary

Amazon EKS clusters can present different kinds of DNS resolution issues, with known root causes.

One prevalent issue is VPC DNS throttling, where Amazon-provided DNS servers enforce a limit on packets per second per elastic network interface, leading to dropped DNS requests. Depending on the cluster configuration, the solution is to increase the number of CoreDNS replicas, run CoreDNS pods under a dedicated Fargate profile, or use NodeLocal DNSCache or the DNS horizontal autoscaler.

Additionally, network protocol issues can arise with both UDP (truncated responses) and TCP (failures at high connection volume). The recommended solution is to configure CoreDNS and NodeLocal DNSCache with prefer_udp, so that UDP is used by default and TCP only as a fallback.