In this doc I'll try to set up 4 VPSes to host about 700K RPZ blacklist records on a MySQL backend, and see what the best practice would be for distributing the software across the servers: RecursorServer, DnsDist, IxfrDist and AuthoritativeServer.
Virtual Hardware is:
The table below shows my initial thinking about how the software should be distributed. As usual, thoughts and reality never go hand in hand, but this time they were actually rather close.
I divided the servers this way based on how the software components query each other, weighed against hardware and network delays.
DnsDist distributes DNS queries among the recursors, and the recursors get their data from one of 4 sources (in query order):
- The recursor cache
- The Authoritative Server for selfhosted DNS Zones
- The RPZ backend (IxfrDist)
- The recursor server itself, which queries the root servers
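The query order above can be sketched as a simple fall-through chain. This is only a toy model of the order as I describe it, not how the recursor is implemented internally; every name, zone and address in it is a made-up placeholder.

```python
from typing import Callable, Dict, List, Optional, Tuple

# Hypothetical stand-ins for the four sources (illustrative data only).
CACHE: Dict[str, str] = {"cached.example.": "192.0.2.10"}
LOCAL_ZONES: Dict[str, str] = {"internal.example.": "192.0.2.20"}  # self-hosted zones
RPZ_POLICY: Dict[str, str] = {"blocked.example.": "NXDOMAIN"}      # RPZ records


def from_cache(name: str) -> Optional[str]:
    return CACHE.get(name)


def from_authoritative(name: str) -> Optional[str]:
    # Answer only for names at or below a zone we host ourselves.
    for zone, answer in LOCAL_ZONES.items():
        if name == zone or name.endswith("." + zone):
            return answer
    return None


def from_rpz(name: str) -> Optional[str]:
    return RPZ_POLICY.get(name)


def from_root(name: str) -> Optional[str]:
    # Placeholder for full recursion starting at the root servers.
    return "198.51.100.1"


# Sources in the same order as the list above.
SOURCES: List[Tuple[str, Callable[[str], Optional[str]]]] = [
    ("cache", from_cache),
    ("authoritative", from_authoritative),
    ("rpz", from_rpz),
    ("root", from_root),
]


def resolve(name: str) -> Tuple[str, str]:
    """Return (source label, answer) from the first source that answers."""
    for label, source in SOURCES:
        answer = source(name)
        if answer is not None:
            return label, answer
    raise LookupError(name)


if __name__ == "__main__":
    for qname in ("cached.example.", "www.internal.example.",
                  "blocked.example.", "other.example."):
        print(qname, resolve(qname))
```

The point of the sketch is only that each stop is tried when the previous one has no answer, so the expensive root lookup is reached last.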
The recursors are configured to know which domains they should look up in the Authoritative Server, which saves a root server lookup whose only result would be the answer "you are the DNS for this zone".
The next stop is the RPZ zone, which holds our DNS FireWall, or a whitelist to ensure access to a source that might otherwise be blocked by an external RPZ zone.
The fourth and last stop is the root server query, which is the slowest and the one that demands the most resources, and it's here I got the initial idea wrong.
My wrong idea from the beginning was that the shortest path from DnsDist to the root server answers would be to host both DnsDist and the RecursorServer on the same VPS, which gives the shortest packet delay because it rules out any network delay. But guess what, the packet delay part was right. The error was plainly and simply the CPU load: on a single-core vCPU it ramped up to 100% right away in the benchmark test when the queried zone was unknown to the caches of both DnsDist and the recursor, and about 90% of the queries were dropped. The only workaround is simply to add two more vboxes for the recursor (yes, I did try adding more resources to the VPS, doubling the vCPU and RAM, with the same result).
The reason you can't move the recursors to the AuthoritativeServer is simply that MySQL uses the resources there whenever the ~700K-record RPZ zone is altered.
Conclusion for setting up a low-priced, working DNS with an RPZ DNS FireWall: do as in the scheme below for the initial setup, but when you start getting real-world queries, move your RecursorServer to its own VPS. Plain and simple.
Another lesson is that DnsDist and the RecursorServer don't do well on the same installation, as the CPU load rises tremendously compared with having them on separate installations. I've seen this a couple of times during my test period before this setup.
Let's crunch a few numbers...
With DnsDist + recursor on the same vbox vs. on individual vboxes (uncached):
Combined, the number of successful queries is around 2,500 QPS.
Divided, the number runs up to ~250k QPS 😃
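A quick check of the ratio between the two figures above. The function and variable names are mine, and the inputs are just the rounded QPS numbers quoted here.

```python
def speedup(combined_qps: float, divided_qps: float) -> float:
    """How many times more successful queries the divided setup handles."""
    return divided_qps / combined_qps


if __name__ == "__main__":
    combined = 2_500    # DnsDist + recursor on the same vbox (uncached)
    divided = 250_000   # DnsDist and recursor on individual vboxes (uncached)
    print(f"{speedup(combined, divided):.0f}x")  # prints "100x"
```

So going by these rounded numbers, splitting the two gives roughly a hundred times more successful uncached queries per second.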
|DnsDist 22.214.171.124:53 [2a01:4f9:c010:410e::53]:53||DnsDist 126.96.36.199:53 :53|
|RecursorServer 188.8.131.52:5301 [2a01:4f9:c010:410e::53]:5301||RecursorServer 184.108.40.206:5301 :5301|
|AuthoritativeServer 216.166.138:53,[2a01:4f9:c010:2166::1]:53||AuthoritativeServer 220.127.116.11:53,[2a01:4f8:1c1c:abe4::1]:53|
|IxfrDist [2a01:4f9:c010:2166::53]:53||IxfrDist [2a01:4f8:1c1c:abe4::53]:53|