Developing CoreDNS backends with gRPC

I recently prototyped a public DNS server using CoreDNS. CoreDNS normally serves DNS zone data from files, but it can also proxy queries to a backend endpoint via gRPC. I used this DNS server as the nameserver for a domain with thousands of subdomains.

This approach allows you to write a custom DNS server in any popular language and serve records stored in an external datastore (like a Redis cache or a MySQL database). In this article, I’ll go over the sample backend I’ve built.

# Problem

So I set out to write and host my own public Internet-facing DNS server (which is something you should never do, but would that stop me?). Assume I was trying to prototype the nameservers of *.herokuapp.com, which tells the internet which Heroku app is at which IP address.

My requirements were:

- Public DNS server, answering only A/AAAA records with a single IP
- Will potentially serve thousands of subdomain records
- Records are added/updated frequently and they should be available in <10s
- DNS server should be highly available (multiple replicas)
- I will deploy this system to Kubernetes.

This is something CoreDNS can help with! CoreDNS offers a lot of plugins for serving zone data from various sources:

- file: provide a zone file, which is a considerably complicated format. CoreDNS reloads this every minute. I don’t want to deal with the format or the latency of polling/reloading this file.
- etcd: reads the zone data from etcd, but I don’t want to host etcd.
- hosts: provide an /etc/hosts-style file and CoreDNS will reload it if its mod_time has changed. Fairly straightforward for A/AAAA records, but still does file polling and re-parsing.

These plugins were not the right fit for me: I needed to serve my records from a database such as MySQL, Redis, Spanner etc. The file-based plugins require me to update the file periodically (from a database), and do an atomic move and hope there’s no disruption or half-reads.
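To make the "atomic move" concrete, here is a minimal sketch of what such a periodic file update could look like in Go. The function names (renderHosts, writeAtomically), the sample records, and the output path are all hypothetical; the only technique shown is writing to a temporary file in the same directory and renaming it over the target, so a reader sees either the old file or the new one.

```go
package main

import (
	"fmt"
	"os"
	"path/filepath"
	"sort"
	"strings"
)

// renderHosts formats name->IP records as /etc/hosts-style lines,
// sorted by name so the output is deterministic.
func renderHosts(records map[string]string) string {
	names := make([]string, 0, len(records))
	for name := range records {
		names = append(names, name)
	}
	sort.Strings(names)
	var b strings.Builder
	for _, name := range names {
		fmt.Fprintf(&b, "%s %s\n", records[name], name)
	}
	return b.String()
}

// writeAtomically writes data to a temp file next to path and renames it
// over path, so readers never observe a partially written file.
func writeAtomically(path string, data []byte) error {
	tmp, err := os.CreateTemp(filepath.Dir(path), ".hosts-*")
	if err != nil {
		return err
	}
	// Clean up the temp file on failure; after a successful rename the
	// temp name no longer exists and this call fails harmlessly.
	defer os.Remove(tmp.Name())
	if _, err := tmp.Write(data); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Sync(); err != nil {
		tmp.Close()
		return err
	}
	if err := tmp.Close(); err != nil {
		return err
	}
	return os.Rename(tmp.Name(), path)
}

func main() {
	// Hypothetical records; in the scenario above these would come from a database.
	records := map[string]string{
		"app1.example.com": "10.0.0.1",
		"app2.example.com": "10.0.0.2",
	}
	path := filepath.Join(os.TempDir(), "hosts.example")
	if err := writeAtomically(path, []byte(renderHosts(records))); err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("wrote", path)
}
```

Even with the rename in place, CoreDNS still has to notice the change and re-parse the file, which is the latency I wanted to avoid.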

At this point I thought I needed to write my own CoreDNS plugin. This requires you to copy your plugin into the coredns/plugin tree and recompile. I did not want to maintain a fork or recompile CoreDNS.

So I thought: wouldn’t it be cool to have a “grpc” plugin that lets CoreDNS query backends for records over gRPC? Then I proposed it to CoreDNS.

Turns out it already exists! The proxy plugin offers a “grpc” option. This passes raw DNS packets to the backends for handling. So you can write a DNS backend in any language that has gRPC, without implementing a DNS server. This way CoreDNS does the heavy lifting of the DNS tasks and you write the question/reply logic.
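To give a feel for what a "raw DNS packet" is, here is a rough sketch that builds a wire-format query and reads the question back out using only the Go standard library. Both helpers (buildQuery, parseQuestion) are hypothetical and deliberately incomplete: buildQuery assumes a non-empty name whose labels are at most 63 bytes, and parseQuestion rejects compressed names. A real backend would hand the bytes to a DNS library instead of parsing them by hand.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
	"os"
	"strings"
)

// buildQuery encodes a minimal DNS query (one question, class IN) in wire format.
func buildQuery(id uint16, name string, qtype uint16) []byte {
	msg := make([]byte, 12) // fixed-size header
	binary.BigEndian.PutUint16(msg[0:], id)
	binary.BigEndian.PutUint16(msg[2:], 0x0100) // flags: recursion desired
	binary.BigEndian.PutUint16(msg[4:], 1)      // QDCOUNT = 1
	for _, label := range strings.Split(strings.TrimSuffix(name, "."), ".") {
		msg = append(msg, byte(len(label))) // each label is length-prefixed
		msg = append(msg, label...)
	}
	msg = append(msg, 0)                           // zero-length root label ends the name
	msg = append(msg, byte(qtype>>8), byte(qtype)) // QTYPE
	msg = append(msg, 0, 1)                        // QCLASS = IN
	return msg
}

// parseQuestion extracts the first question's name and type from a raw DNS message.
func parseQuestion(msg []byte) (string, uint16, error) {
	if len(msg) < 12 {
		return "", 0, errors.New("message shorter than DNS header")
	}
	if binary.BigEndian.Uint16(msg[4:6]) == 0 {
		return "", 0, errors.New("message has no question")
	}
	var labels []string
	off := 12
	for {
		if off >= len(msg) {
			return "", 0, errors.New("truncated name")
		}
		n := int(msg[off])
		off++
		if n == 0 {
			break // root label: end of name
		}
		if n&0xC0 != 0 {
			return "", 0, errors.New("compressed names are not handled in this sketch")
		}
		if off+n > len(msg) {
			return "", 0, errors.New("truncated label")
		}
		labels = append(labels, string(msg[off:off+n]))
		off += n
	}
	if off+4 > len(msg) {
		return "", 0, errors.New("truncated question")
	}
	qtype := binary.BigEndian.Uint16(msg[off : off+2])
	return strings.Join(labels, ".") + ".", qtype, nil
}

func main() {
	raw := buildQuery(0x1234, "app1.example.com.", 1) // 1 = A record
	name, qtype, err := parseQuestion(raw)
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Printf("%d bytes on the wire, question: %s type %d\n", len(raw), name, qtype)
}
```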

# Architecture

Since I needed a somewhat highly available (read: multi-replica) DNS server, I needed to build a stateless backend that queries an external data source like a database. (Assume the records in the db are updated by another component.)

I considered two deployment topologies for this:

1. CoreDNS+Backend “in a Pod”: These will scale up and down together and CoreDNS will talk to the backend on localhost (since they’re in the same Pod). Therefore TLS is not necessary.
2. CoreDNS and Backends are separate: This would let you scale them up and down independently, but ideally requires TLS to be set up for the gRPC communication. In this setup, CoreDNS can individually know the IP address of each backend, or we can point it to an internal TCP load balancer.

These two models are pictured as follows:

In both cases, the assumption is that the database has equal or higher availability than the CoreDNS and the backend, so the availability of the service is not primarily determined by the database.

I ended up choosing (1), as it is much simpler to deploy, does not require TLS, and I do not have independent scaling requirements.

# Configuring CoreDNS

A simple Corefile like this starts a DNS server on port 1053 (tcp/udp) and proxies queries for the example.com. zone to the backend on port 8053:

example.com.:1053 {
   proxy . 127.0.0.1:8053 {
       protocol grpc insecure
   }
}

You can add the cache plugin to cache the results from the proxy.
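As a rough sketch, the same Corefile with caching enabled could look like the following. The 10-second value is an assumption, picked to line up with the <10s freshness requirement: the cache plugin's TTL argument caps how long a response is cached, so a cached answer should not outlive that window.

```
example.com.:1053 {
    cache 10
    proxy . 127.0.0.1:8053 {
        protocol grpc insecure
    }
}
```

A shorter cap means more queries reach the backend, so the right number depends on how much load the backend and the database can take.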

# Coding the backend

You can code the backend in any language that is supported by gRPC. I chose Go for this.

As mentioned above, the CoreDNS proxy plugin can query gRPC backends that implement the service defined in dns.proto. I simply generated the Go code stubs from this proto:

protoc dns.proto --go_out=plugins=grpc:.

Then, on top of the generated code, I wrote this simple server method that handles only A/AAAA queries and responds to them all with 127.0.0.1 (or ::1):

// Query implements the Query RPC from dns.proto: it receives a raw DNS
// query from CoreDNS and returns a raw DNS reply.
func (d *dnsServer) Query(ctx context.Context, in *pb.DnsPacket) (*pb.DnsPacket, error) {
	// Decode the wire-format query.
	m := new(dns.Msg)
	if err := m.Unpack(in.Msg); err != nil {
		return nil, fmt.Errorf("failed to unpack msg: %v", err)
	}

	// Start a reply that mirrors the query's ID and question.
	r := new(dns.Msg)
	r.SetReply(m)
	r.Authoritative = true

	// TODO: query a database and provide real answers here!
	for _, q := range r.Question {
		hdr := dns.RR_Header{Name: q.Name, Rrtype: q.Qtype, Class: q.Qclass}
		switch q.Qtype {
		case dns.TypeA:
			r.Answer = append(r.Answer, &dns.A{
				Hdr: hdr,
				A:   net.IPv4(127, 0, 0, 1)})
		case dns.TypeAAAA:
			r.Answer = append(r.Answer, &dns.AAAA{
				Hdr:  hdr,
				AAAA: net.IPv6loopback})
		default:
			return nil, fmt.Errorf("only A/AAAA supported, got qtype=%d", q.Qtype)
		}
	}

	// No answers at all: respond with NXDOMAIN.
	if len(r.Answer) == 0 {
		r.Rcode = dns.RcodeNameError
	}

	// Encode the reply back to wire format.
	out, err := r.Pack()
	if err != nil {
		return nil, fmt.Errorf("failed to pack msg: %v", err)
	}
	return &pb.DnsPacket{Msg: out}, nil
}

In the method above, you can easily customize the implementation by querying a database like Redis, MySQL, or something more highly available like Cloud Spanner.
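One way to structure that lookup, sketched with the standard library only: hide the datastore behind a small interface and let the handler ask it for the single IP of a name. Everything here (RecordStore, memoryStore, answerFor) is hypothetical and not part of the sample repository; the in-memory map stands in for Redis, MySQL, or Spanner.

```go
package main

import (
	"fmt"
	"net"
	"strings"
	"sync"
)

// RecordStore is a hypothetical lookup interface. A database-backed
// implementation would satisfy it the same way the in-memory one does.
type RecordStore interface {
	Lookup(name string) (net.IP, bool)
}

// memoryStore is an in-memory stand-in for a real datastore.
type memoryStore struct {
	mu      sync.RWMutex
	records map[string]net.IP
}

func newMemoryStore() *memoryStore {
	return &memoryStore{records: map[string]net.IP{}}
}

// normalize lowercases the name and ensures a trailing dot, because DNS
// names are case-insensitive and queries carry fully qualified names.
func normalize(name string) string {
	name = strings.ToLower(name)
	if !strings.HasSuffix(name, ".") {
		name += "."
	}
	return name
}

// Set stores a single IP (v4 or v6) for a name.
func (s *memoryStore) Set(name, ip string) error {
	parsed := net.ParseIP(ip)
	if parsed == nil {
		return fmt.Errorf("invalid IP %q", ip)
	}
	s.mu.Lock()
	defer s.mu.Unlock()
	s.records[normalize(name)] = parsed
	return nil
}

// Lookup returns the IP stored for a name, if any.
func (s *memoryStore) Lookup(name string) (net.IP, bool) {
	s.mu.RLock()
	defer s.mu.RUnlock()
	ip, ok := s.records[normalize(name)]
	return ip, ok
}

// answerFor returns the address for an A answer (wantIPv6=false) or an
// AAAA answer (wantIPv6=true), or nil if the name has no record of that family.
func answerFor(store RecordStore, name string, wantIPv6 bool) net.IP {
	ip, ok := store.Lookup(name)
	if !ok {
		return nil
	}
	v4 := ip.To4()
	if wantIPv6 {
		if v4 != nil {
			return nil // the stored address is IPv4, so there is no AAAA answer
		}
		return ip.To16()
	}
	return v4 // nil when the stored address is IPv6
}

func main() {
	store := newMemoryStore()
	if err := store.Set("app1.example.com", "10.0.0.1"); err != nil {
		panic(err)
	}
	fmt.Println(answerFor(store, "APP1.example.com.", false))
	fmt.Println(answerFor(store, "missing.example.com.", false) == nil)
}
```

In the Query method, the A and AAAA cases would call something like answerFor instead of returning the loopback addresses, and skip appending an answer when it returns nil.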

You can find this code sample at my coredns-grpc-backend-sample repository.

# Conclusion

I was able to learn CoreDNS quickly and put it to use. I used the official coredns Docker image, shipped my server in a container, deployed it all to Kubernetes, and it works!

I probably won’t use this in production, but I learned a lot already.

I can’t say I’m a big fan of the current “proxy” model, as it forces you to deal with the “raw” DNS packets in the backend. I have proposed an alternative to this that is more high-level (like Cloudflare’s JSON DoH API). This way, developers writing gRPC backends don’t have to deal with parsing raw DNS messages.

If this sounds interesting to you, hit me up on Twitter and let’s chat!