wagl is a DNS server which allows microservices running as containers on a distributed Docker Swarm cluster to find and talk to each other. It is minimalist and works as a drop-in container in your cluster.
This article is intended to describe the inner workings of wagl and give a broader sense of the state of the service discovery problem in today’s container clusters.
I presented wagl and the service discovery topic at the Docker Seattle Meetup last month and at DockerCon EU 2015 last week. I hope you find this article interesting.
wagl is open source on GitHub: https://github.com/ahmetb/wagl
If you use swarm you *need* wagl! Nice Kubernetes style, based on labels, service discovery for swarm. #dockercon https://t.co/DWAFm1oZPs
— Kelsey Hightower (@kelseyhightower) November 16, 2015
Is Service Discovery still a problem?
Yes! I gave a talk at DockerCon EU 2015 in Barcelona about how we have been doing service discovery in the microservices era, and to my surprise it turned out to be the highest-rated talk of the conference.
In a nutshell, there are many methods out there and none of them solves the problem fully or reliably. Neither does wagl. However, it makes several good points about what the ideal solution should look like.
Many service discovery methods that people have blogged about, and that others have then deployed to their container clusters, have far too many moving parts to rely on in a production environment. Scarier still, some of these components are not proven in production for high-scale workloads.
Some methods I found require changes to the application code. This is the worst kind: it closely couples the service discovery concern with the service itself.
Service discovery is an infrastructure-level problem and it should not change your application code.
Some methods I came across are just too tedious to install and maintain, or are closely coupled to a certain tool you need to install and maintain inside your cluster.
Various solutions I reviewed and compared in my DockerCon EU talk are:
- Interlock + nginx/haproxy
- registrator + consul/etcd + confd/consul-template + nginx/haproxy
- DNS-based solutions: Mesos-DNS, SkyDNS
- Port scanning in overlay networks with nmap
If you want to find out more about this I suggest just watching my talk.
Motives for developing wagl
In my opinion, Docker Swarm is the easiest-to-install cluster manager out there capable of managing Docker containers at scale (as of this writing). Therefore, any tool or plugin written for Swarm must have the same property.
My criteria for developing wagl were:
- Application code should not change to use service discovery.
- Installing the tool should be as simple as one command.
- There should be no maintenance cost.
- There should be no configuration files.
- The tool should have good defaults and should not make people read the docs to get started with it.
If any of these does not hold true, then it means I have failed. ☺︎
Introduction to wagl
wagl is a DNS server. It responds to DNS A/SRV queries just like a normal DNS server and listens on port 53/udp.
wagl is specifically designed for Docker Swarm. It speaks the Docker language, understands your containers, and figures out ports.
You install wagl on your Docker Swarm cluster (single command) and forget about it.
Normally, with Docker, you would run a web server container such as:
docker run -d -p 80:80/tcp nginx
However, this container would land on an arbitrary machine in the cluster, and you have no easy way of finding out what IP:port it landed on.
In wagl, you use “Docker labels” to name your microservices:
docker run -d -p 80:80/tcp -l dns.service=api nginx
and all your containers can reach this container as http://api.swarm:80 no matter where they are. Easy as that!
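To make the label-to-name idea concrete, here is a minimal Go sketch of how a dns.service label could be turned into a name under the swarm domain. The serviceName function and its signature are hypothetical and written only for this illustration; this is not wagl's actual code.

```go
package main

import (
	"fmt"
	"strings"
)

// serviceName builds the DNS name a container would be reachable at, given
// its Docker labels. The "dns.service" label and the "swarm" suffix come from
// the example above; the function itself is a hypothetical illustration.
func serviceName(labels map[string]string, suffix string) (string, bool) {
	svc := strings.TrimSpace(labels["dns.service"])
	if svc == "" {
		// The container did not opt in to service discovery.
		return "", false
	}
	return strings.ToLower(svc) + "." + suffix, true
}

func main() {
	labels := map[string]string{"dns.service": "api"}
	if name, ok := serviceName(labels, "swarm"); ok {
		fmt.Println(name) // prints "api.swarm"
	}
}
```

Containers without the label are simply skipped, which matches the opt-in nature of naming services through labels.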
If you are interested in learning more please check out the documentation.
Installing wagl
As promised, installing wagl is a one-time operation that consists of executing a single command on the Swarm manager node(s):
$ docker run -d --restart=always --name=dns \
-p 53:53/udp \
--link swarm_manager:swarm \
ahmet/wagl wagl --swarm tcp://swarm:2375
Ta-da! It’s done.
How is it developed?
The inspiration came from reading the source code of Mesos-DNS. It is a well-designed DNS server for Apache Mesos and does the job just fine. Other examples I could find, such as SkyDNS, are very complicated and have a lot of features.
Perhaps I could have developed a plugin for SkyDNS, but instead I decided to start from scratch (rarely a good idea). Fun fact: we are all using the same Go DNS package, and it is not very hard to develop a DNS server in Go after all.
After a couple of days of coding I was able to get something up and running, and the code was very much functional. That is where I stopped: I had a minimalist DNS server that was letting me do service discovery.
The reason I started from scratch rather than writing a plugin for SkyDNS or Interlock is simply that I wanted a minimal feature set and fewer moving parts.
By minimal, I mean really, really minimal. It just barely works, and yet it is good enough to accommodate most of the use cases.
wagl - inspired by mesos-dns provides service discovery for Swarm. #dockercon pic.twitter.com/yp4YpOpmw2
— Kelsey Hightower (@kelseyhightower) November 16, 2015
Now what?
Many experts in the area I spoke to think that the Domain Name System (DNS) is the right way to go for the service discovery problem. It has its own shortcomings, such as connection draining, the lack of port information in A records (and the fact that nobody uses SRV records), languages (Java) not obeying the TTL of the records, and so on.
I think that, combined with Docker 1.9 overlay networks, wagl will provide a solution that is very close to seamless and frictionless.
wagl is not meant to change the service discovery scene dramatically. It is more of a proof of concept that actually works. It proves that a simple drop-in tool that utilizes the DNS protocol is good enough to accommodate the service discovery needs of most applications.
The service discovery problem, however, remains unsolved for many users of container clusters out there. Companies like Google, Microsoft and many others have been rolling out their own solutions, and the projects we see, such as kube-proxy, are just the tip of the iceberg.
Expect more changes in this area, as this is the problem most engineers working on containers and microservices will tackle next.
What is ahead?
I am hoping to develop wagl further; the project has already started to get some usage and feature requests. I intend to support Docker 1.9 multi-host overlay networks out of the box with wagl to solve the static port allocation problem of DNS-based service discovery.
Combined with what is already out there and what is next for Docker Networking, I think wagl can continue to provide a good solution, and with its simplicity it will be a great tool for beginners as well.
Learn more and contribute
wagl on GitHub: https://github.com/ahmetb/wagl
wagl website: https://ahmetalpbalkan.github.io/wagl/
You can find the source code of wagl on GitHub and visit its website.
The project is a little over 1,000 lines of Go and (in my opinion) is pretty readable. Feel free to check it out and “★” the repo as well.
Trivia: Why the name?
While I was on a quest for a name that is about bees and discovery, our very own Ross Gardler told me about the Waggle Dance of the Honeybee.
It turns out that honeybees returning from food sources to the beehive tend to dance around by “waggling”, and the shape in which they dance, combined with radius, frequency and many other parameters, is a way to tell the exact location of the food source to the other honeybees.
Sounds a lot like wagl, huh? ☺︎