CMU DNS: Cleaning out the Augean Stables

G. Somlo, May 2010

1. Intro, Motivation, and Overview

DNS (Domain Name Service) is one of the most highly visible services supported by the Network Group. Any outage is almost guaranteed to be immediately noticed by end users, and, as such, it is of utmost importance to ensure the high availability of the service through careful design for redundancy, failover, and load sharing.

The purpose of this writeup is primarily to document the new, recently deployed system. However, to better illustrate our progress, and justify some of the technical decisions that were made, I will compare and contrast key differences between the current state of affairs and the original point from where I started.

In a nutshell, we started out with a system consisting of 18 servers interacting in ways both complex and unpredictable, with configurations automatically generated by a poorly understood NetReg subsystem (dns-config.pl) which both lacked documentation and was perfectly willing to generate broken, unloadable DNS configurations without so much as a warning. We now have a three-tiered architecture, where all servers within a tier or class are identically configured, allowing for virtually free horizontal scaling (cost of hardware notwithstanding). With geographical redundancy built in, this system could be deployed on as few as five total servers (we currently use 10, for symmetry -- see 3.1). The dns-config.pl configuration file generator has been completely rewritten, and will now issue useful warnings while refusing to generate broken configurations. The NetReg mechanism by which DNS servers get populated with zone data and configuration is now well understood and documented (see Section 4).

The remainder of this document is organized as follows: Section 2 delves into the various underlying mechanisms for implementing IP anycast, which is the essential building block for providing both load sharing, and seamless failover in the event of hardware failure or maintenance. Section 3 showcases the overall architecture of the new system, explains the role played by each of the three server tiers, and describes how the tiers interoperate with each other. Section 4 documents NetReg's DNS Server Group service (read: object) type, the primary interface by which a NetReg admin is currently expected to make DNS changes. Our current methodology for supporting DNSSec is presented in Section 5. Section 6 concludes with ideas for future work, such as offering service on IPv6, and improving NetReg to better support the management of restricted DNS views for special-purpose clients.

2. Failover and Load-Sharing with IP Anycast

2.1. Built-In DNS Failover

DNS has a very simple built-in failover mechanism: all speakers of the DNS protocol (clients as well as servers) typically encourage (and often require) that at least two server IP addresses be configured for every DNS "reference". To illustrate this point, let's consider DNS end clients (stub resolvers, in DNS parlance) found on most users' desktop, laptop, and handheld devices. In addition to a primary DNS server address to which all queries are sent by default, the device asks (and sometimes insists) that at least one more, secondary DNS server address be configured, to which queries are sent in the event the primary server becomes unavailable. On our campus network, the two (or more) DNS server IP addresses are automatically set via DHCP, but can still be viewed by running ipconfig /all from a Windows shell, or by viewing the contents of /etc/resolv.conf on Mac and Linux machines. Similarly, when DNS servers refer to one another for purposes such as forwarding or delegation, best practices encourage using multiple server IPs for the primary and backup target of the forward or delegation, respectively.

This failover method is not always sufficient in real-world, real-time scenarios. DNS record lookups (e.g., using 'host', or from within a Web browser) will merely take a few seconds longer (while the secondary server is being queried), and go mostly unnoticed. There are, however, situations when a few extra seconds of DNS response delay leads to highly visible outages. For example, the sshd remote login service will attempt a reverse lookup on the connecting clients' IP addresses (for reasons as trivial as attempting to generate a more informative log entry). When the lookup doesn't complete within sshd's set time limit, the login attempt is timed out, resulting in denial of service to ssh clients. It is, of course, possible to configure sshd to forgo reverse DNS lookup on connecting clients. However, rather than externalize this cost across all system administrators, along with addressing all other cases where real-time DNS responses are required, it is much more economical to just make DNS more reliable, as demonstrated in the remainder of this section.

Another highly visible outage scenario occurs when DNS forwarding is used. When a DNS server forwards a client query to another server, the client may give up before a second forwarder has a chance to respond. In effect, having secondary (and ternary, etc.) forwarders may be a complete waste of resources. This specific failure scenario was observed in the field only a few months ago.

2.2. Basics of IP Anycast

IP Anycast is a technique by which one or more IP addresses are advertised using the underlying routing protocol from more than one physical location. As applicable to DNS, we configure a well-known DNS IP address (such as, e.g., 128.2.1.10) on multiple servers, and then announce host routes to this address pointed at each server.

The primary application for anycast is load sharing: each server's advertised anycast IP route "attracts" the "closest" clients' queries, where "distance" is determined by the routing protocol. As will become apparent in Section 3, this also means that server placement within the network is very important to ensure an even load distribution.

Failover can be accomplished if the individual routes comprising an anycast virtual server IP are announced conditionally, based on each real server's availability, and can be quickly withdrawn in the event of a failure. Once a real server's anycast route to the virtual IP is pulled, the underlying routing protocol will redistribute client queries across the remaining real servers.

2.3. Anycast with Quagga

This method was in use on the recently phased-out campus DNS infrastructure, before the upgrade. Each DNS server was expected to participate in the campus backbone routing protocol (OSPF), and announce the virtual DNS service IP(s) it had configured as extra loopbacks using Quagga (an open source routing software package).

This method suffered from severely limited failover capabilities. In the event of the entire machine losing power, network connectivity, or crashing, its announced anycast route went away along with its availability for DNS service, which in turn allowed clients to be re-routed to other available servers. However, in the event of a failure limited to the DNS software (bind/named), the Quagga routing daemon would continue fulfilling its intended purpose (to announce routes for locally configured loopback IPs, in this case), causing a denial of service to those clients which had the misfortune of having their DNS queries routed to this server. A partial outage of this type would last until, through manual intervention, either the DNS software was restored to operational status on the server, or Quagga was shut down to stop announcing the DNS anycast IP and thus stop attracting clients whose queries would then be dropped.

One potential way to improve this situation would (have) be(en) to add support for probes and conditional route announcements to Quagga (similar to SLA, see 2.5) and submit the patch upstream. Another, less robust, but still workable solution would have been to write a small script to shut off Quagga in the event that bind/named was unresponsive (e.g. a simple cron job).

Instead, the adopted solution was one entirely worthy of Rube Goldberg. An extra machine-specific loopback IP was set up on each server, and added to the list of IPs monitored for DNS service availability, in the idea that:

if bind/named died, we'd get paged for DNS on:
- The server's real (management) IP address
- The server's dedicated loopback IP address
- Intermittently, the DNS service IP (since the monitoring system was equally "far" away from all servers in terms of OSPF distance, and thus could hit either one with an equal probability when querying the anycast service IP)
- but all the above IPs would still be reachable via ping
if quagga died, we'd only get paged on the server-specific loopback
if the entire machine became unreachable, we'd get paged for everything.

Additionally, if clients were configured with two anycast service IPs (primary and secondary, see 2.1), these two IPs had to be supported on entirely separate and non-overlapping pools of hardware. In the event that bind/named failed on a server announcing both primary and secondary anycast IPs via a still-operational Quagga daemon, clients failing over from primary to secondary would have their queries routed back to the same failed server, guaranteeing a full DNS blackout instead of simply degraded service.

In conclusion, once the appropriate number of servers to support the primary DNS anycast IP was determined, it had to be doubled in order to support a secondary IP for the same client population, requiring a ridiculously high number of DNS servers which spent the majority of their time idling.

2.4. Anycast with SLB

SLB (Server Load Balancer) is a feature included with the Cisco 6500 hardware platform on which our campus routing infrastructure is built. While SLB is mainly intended to offer rich support for various load balancing policies, we use it for its ability to probe service availability on real servers and to conditionally announce (anycast) routes for virtual service IPs based on the probes' result. Our new, upgraded DNS infrastructure currently uses SLB to implement IP anycast, offering both load sharing (one real server for each tier -- see Section 4 -- advertised from each distribution router) and fast failover (announcements are withdrawn within 10 seconds if probing detects a failed server).

With SLB, individual servers no longer have to run anything beyond bind/named, greatly simplifying administration and monitoring. However, the most important gain obtained by moving anycast support off the servers and back into the network infrastructure was that the same hardware pool of identically configured machines can now support both primary and secondary anycast IPs. Since SLB immediately removes all anycast routes pointing at a failed server, we have eliminated all scenarios in which client traffic would have been sent to failed servers.

For an example of how SLB is configured to manage and monitor DNS servers, consider the following fragment of IOS configuration deployed on each distribution router (see Figure 1 for the relevant network topology elements of the campus backbone, post-DNS-overhaul):

	ip slb probe CMUDNS dns
	 lookup cmu.edu
	!
	ip slb serverfarm CMUDNS
	 predictor leastconns
	 probe CMUDNS
	 !
	 real IP.OF.REAL.DNS.SERVER
	  no faildetect inband
	  inservice
	!
	ip slb serverfarm CMUDNS-BAK
	 predictor leastconns
	 !
	 real IP.OF.CORE.ROUTER
	  no faildetect inband
	  inservice
	!
	ip slb vserver CMUDNS-UDP-10
	 virtual 128.2.1.10 udp dns
	 serverfarm CMUDNS backup CMUDNS-BAK
	 advertise active
	 idle 4
	 inservice

SLB "vservers" must be configured per-IP, per-protocol, but vserver snippets for TCP and a secondary service IP, 128.2.1.11 were omitted for brevity.

Each distribution router will forward DNS queries to the local real server, if it is available. In the event of a failure, queries are routed to the core, which then evenly spreads them out across the remaining distribution routers with operational DNS servers.

2.5. Anycast with SLA

SLA (Service Level Agreements) is another technology available on the Cisco 6500 platform, beginning with IOS version 12.2.(33)SXI. It directly allows the insertion of conditional route statements into the running configuration, and is therefore much simpler and lightweight than SLB. Also, SLA has a much better chance of becoming supported under IPv6 in future Cisco IOS releases. We plan to switch our DNS IP anycast implementation to SLA as soon as all distribution routers are upgraded to SXI.

The equivalent configuration snippet for implementing anycast using SLA is:

	ip sla 100
	 dns cmu.edu name-server IP.OF.REAL.DNS.SERVER
	 frequency 10
	 exit
	!
	ip sla schedule 100 life forever start-time now
	!
	track 100 ip sla 100
	 exit
	!
	ip route 128.2.1.10 255.255.255.255 IP.OF.REAL.DNS.SERVER track 100
	ip route 128.2.1.11 255.255.255.255 IP.OF.REAL.DNS.SERVER track 100

Packets destined for one of the DNS service IPs will simply be routed to the real server, but only as long as the monitor confirms the service is indeed available. Otherwise, the local route(s) are withdrawn, and client traffic is free to follow OSPF to whichever other distribution router still announces routes to the service. NOTE: In the above example, the real server is sent a query for cmu.edu, and an A record is specifically expected in return. There is currently no way to specify a different record type, so we must be sure to query for a known A record to determine DNS service availability on the probed host.

3. Overall Architecture

3.1. Server Tiers and Network Placement

Figure 1. Network Topology vs. DNS Server Placement

Prior to the upgrade, we had a relatively large number of DNS server classes, each responsible for a separate subset of our (sub)domains, which made the overall system unnecessarily difficult to comprehend. To illustrate, we had:

Top-level servers: Used to host domains such as cmu.edu, cmu.net, carnegiemellon.org, cmu.local, and the corresponding reverses such as 2.128.in-addr.arpa. These servers also hosted subdomains not enabled for dynamic DNS updates (DDNS): ad.cmu.local, andrew.cmu.local, etc.

Dotcom servers: Similar to the toplevel class above, these servers hosted various .com, .org, etc. "vanity" domains for various campus projects and groups. Hosting these domains separately originated from an overly paranoid fear w.r.t. the possibility of future policies against serving domains that could potentially be interpreted as "commercial" by Internet2 and Educause.
DDNS servers: Subdomains such as andrew.cmu.edu, net.cmu.edu etc. which support DDNS updates (from other services such as dhcp, loadbalancers, and even NetReg itself). Separating domains with support for DDNS from those without allegedly originated due to a bug in earlier versions of named/bind, which was reportedly causing problems when dynamically updating certain types of records.
caching servers, which supported the anycast addresses advertised to the campus client population.

Within each of the first three main categories (except caching, of course), one server was designated as master, while the rest acted as slaves. Sometimes the master itself was used to serve queries, and sometimes it was used in "shadow" mode, where it was only allowed to serve zone transfer requests from its slaves, which in turn supported the publicly visible service.

After the DNS infrastructure overhaul, only two server classes remain: caching and authoritative. Each class is accessed via two anycast service IPs (primary and secondary), which are both supported on every class member. For reasons of symmetry (so we wouldn't have to designate two distribution routers as "special" by having them host DNS servers) we have one cache and one authoritative server connected to each distribution router. This way, not only are all DNS servers equal (within their designated class), but all distribution routers are also equal (by supporting the same anycast statements to advertise the anycast DNS service IPs). Last, but not least, each distribution routers' clients are exactly one Layer-3 routing hop away from their default DNS server. A simplified network diagram illustrating DNS server placement is shown in Figure 1.

3.2. Server Interactions

Figure 2. Overview of the CMU campus DNS architecture

The only special case server is the "shadow" authoritative master, which is the single point of update for all external systems (such as NetReg, dhcp, lbnamed, etc) . All anycast-enabled authoritative servers are configured as slaves to this machine. Redundancy and failover for the shadow master is accomplished through VMotion, as the shadow master is deployed as a guest on our VM infrastructure. Since queries for authoritative DNS data are only ever sent to the authoritative anycast service IPs, a temporary software failure on the shadow master will only impede the ability to enact further DNS updates, but never prevent querying existing data. An illustration of how the various servers interact with each other is given in Figure 2.

We can now count the minimum number of servers required to support this architecture with physical redundancy:

one shadow master,
two authoritative slaves (each supporting two authoritative anycast service IPs), and
two caching servers (both supporting the two anycast IPs advertised to campus clients),

for a total of five machines. As explained earlier, we run five caches and four authoritative slaves (without much extra effort, since all caches and all authorities are configured identically to each other), for a total of 10 servers.

3.3. Bypassing the DNS Root

Another issue related to how our servers interact is configuring caches to bypass the DNS root. This is a requirement when resolving *.local names or RFC1918 reverse lookups such as *.172.in-addr.arpa. Since these domains are expected to be private to each site that supports them, the DNS root servers must necessarily fail to resolve any related queries. Therefore, caches that begin resolving such queries at the DNS root will be guaranteed to fail, put undue load on the root servers (poor netiquette), and advertise the lack of skill of their administrators to the Internet community.

Caches must instead be configured to begin recursion for any *.local or reverse RFC1918 queries directly at our own authoritative servers. It's a handy extra improvement in response time to also bypass the DNS root for *.cmu.edu, *.2.128.in-addr.arpa, and any other top-layer domain known to be hosted on our authoritative servers.

Two methods exist to configure a cache to bypass the DNS root: zones can be configured as either "forward" or "stub". The former was used extensively with our previous DNS infrastructure, but has now been abandoned in favor of the latter, which brings about vast improvements in simplicity and flexibility.

To bypass the DNS root for all *.cmu.local lookups using "forward", we'd have to start by adding the following to the cache's named.conf file:

	zone "cmu.local" {
		type forward;
		forwarders {IP.OF.AUTH.1; IP.OF.AUTH.2; ...};
	};

If a query were received for something.cmu.local, it would be forwarded to the first IP address on the forwarders list. The caching server would expect a full answer, which it would then return to its client, and also cache for future reference. The big drawback to this method is that any server listed as a forwarder is expected to either be able to answer the query authoritatively, or be willing to recurse on the forwarding cache's behalf. Answering with a delegation is considered a forwarding failure, and the cache would proceed to query the next IP on its list of forwarders. Assuming that, on the forwarders, a subdomain of cmu.local (e.g. cs.cmu.local) was delegated to another set of servers, it would be necessary to either configure the forwarders to allow recursion from our cache, or to add another entry for zone "cs.cmu.local" to the caching server's named.conf, forwarding to the new set of authorities for that domain.

Needless to say, using this method made our old server configuration brittle and complex, by requiring that data (delegations are accomplished by inserting NS records into the parent zone) be replicated in configuration (the same servers mentioned in the NS delegation records now had to be listed as forwarders in named.conf on the caches). It also required that our authoritative servers act as caches (but only to our caching servers), adding to the overall Rube Goldberg-esque nature of the entire system.

The proper solution to the problem of bypassing the DNS root is to use "stub" zones instead:

	zone "cmu.local" {
		type stub;
		masters {IP.OF.AUTH.1; IP.OF.AUTH.2; ...};
	};

This statement simply forces the cache to always keep a copy of the zone's NS records up to date from the listed masters. The other piece of the puzzle is the fact that the DNS recursion algorithm (by which a cache starts at the DNS root and keeps following delegations until it finds a name server which can answer authoritatively) is, in fact, opportunistic. The recursion only starts at the DNS root if there are no better (i.e. more specific) servers already cached. In conclusion, using "stub" will not only bypass the root, but also allow the cache to follow any sub-delegations (e.g. to cs.cmu.local) without requiring them to be spelled out in the actual config file.

Using "stub" allowed a much cleaner and simpler configuration of our servers: authorities are no longer expected to perform recursion for anyone, and caches only need explicit configuration entries for the top layer (local, cmu.edu, etc.) of domains we serve, while subdomains are resolved by following any applicable delegations returned by the authorities.

4. NetReg and DNS

Campus DNS servers continue to have their configuration and zone data generated and pushed from NetReg (our homebrew IPAM system). At a high level, DNS management in NetReg occurs via two types of objects (or "service groups"): DNS Server Group and DNS View Definition. Two scripts, dns-config.pl and dns.pl generate configuration and zone files, respectively. At a lower level, the configuration elements of the NetReg service groups were lacking documentation, and the scripts were poorly written (and therefore mostly unreadable). Also, at least in the case of dns-config.pl, they allowed the creation of broken, unloadable configuration files with absolutely no warning or feedback of any kind.

The purpose of this section is to document NetReg's DNS configuration elements, explain how they affect the resulting configuration files, and serve as a guide to administrators wishing to make DNS changes via NetReg. This is knowledge I acquired during a rewrite of dns-config.pl.

4.1. Service Groups Demystified

NetReg service groups are simply sets of other NetReg objects of various types (machine, dns_zone, service, etc.), with various attributes that may be attached either to individual members, or to the set as a whole. Other, separate logic (such as the dns-config.pl config file generator) are tasked with assigning meaning to these object sets and taking action based on their contents.

Let's begin by examining the layout of our typical named.conf file:

	options {
		GLOBAL OPTIONS;
	}

	keys and ACLs (master server only);

	view "some_view" {
		VIEW OPTIONS;
		zone blocks ...
	};
	...
	view "global" {
		zone blocks ...
	}

Note how the "global" view does not have its own set of options. Instead, the 'GLOBAL OPTIONS' from the top of the file apply. Other views may come with their own options which override or add to the inherited global ones. Master servers may contain keys and ACLs which control the DDNS update process.

4.1.1. The `DNS View Definition` Type

DNS options (whether global or view-specific) are added using this (improperly named, should be something like 'DNS Option Set' instead) service group type. This type of object contains no members, and is used only as a collection of attributes, which may of the following types:

Server Version: Other options and attributes contained in this group will only be applied to servers with matching version strings.

Import Unspecified Zones: Controls whether zones which have no Zone In View attributes configured on their DNS Server Group membership will be added to views generated based on this group (see 4.1.2 below). By default (if unset), the value of this attribute is assumed to be 'yes'.
DNS Parameter: Option lines which will be added to either a specific view, or to the global options of named.conf, depending on this group's Service View Name attribute in the parent DNS Server Group. No error checking is performed on the values of this attribute, so caution must be taken to avoid introducing syntax errors into the resulting config file.

4.1.2. The `DNS Server Group` Type

This is the main DNS configuration mechanism offered by NetReg. Its purpose is to tie together DNS servers (machine type members) with zones (dns_zone type), and views (supplied by adding DNS View Definition groups as service type members). For a DNS Server Group to have any effect, it must contain machines, and at least one of either a set of zones, or a set of views. Any group which contains all three -- machines, zones, and views -- can be separated out into two equivalent groups, one assigning the zones to the machines, and another one adding views to the same machines. In fact, the latter method of keeping views and zones in separate DNS Server Group objects is strongly recommended, as a way of preserving the legibility of NetReg configuration.

Machine members of a DNS Server Group may have the following attributes set:

Server Version: Specifies the type and version of DNS software expected to run on that machine. A machine with this attribute unset may be used as a configuration element in other servers' config files, but will not have any configuration elements generated based on its membership in the current service group. Also, a machine must have this attribute set to the same value across all service groups of which it is a member, unless omitted (e.g., setting the version to 'bind9' in one service group and to 'foobar123' in another is considered an error, and dns-config.pl will complain loudly; setting to 'bind9' in one group and not setting it at all in another is OK, and only the first group will lead to data being added to the machine's named.conf). Currently, dns-config.pl will skip generating a config file for a server if the version is set to something other than 'bind9'.
Server Type: Specifies the role this server will play relative to the zones assigned to it by this service group. Allowed values are 'master', 'slave', 'stub', 'forward', and 'none', with the latter assumed to be the default when the attribute is omitted.

A server's membership in any DNS Server Group will only lead to entries being added to its named.conf if both a valid version and a valid type (other than 'none') are set. A machine with a valid type and no version may end up being referred to in the named.conf of other group members, but will not receive any entries in its own config file based on its membership in this service group.

Service (i.e., view) members of a DNS Server Group may have the following attributes:

Service View Name: Specifies name of the view which will be created on the group's machine members, and to which the options contained in this view member will be added. Using '_default_' or 'global' will result in the options being added to the named.conf global options section. Using anything different will result in the creation of a view with that name on all applicable machine members, with all options added under that view.
Service View Order: Specifies a view's relative position in named.conf as compared to all other views. Has no effect on 'global', which always comes last. Defaults to '0' (earliest order) if not set.

Zone members of a DNS Server Group may have the following attributes:

Zone In View: If no attributes of this type are configured on a zone, it is considered 'unspecified' (see 4.1.1). Unspecified zones are added to any view which does not forbid it by setting Import Unspecified Zones to 'no'. If one or more Zone In View attributes are set, the zone will only be added to the views thus specified.
Zone Parameter: Specifies an arbitrary named.conf line to be added inside a zone's configuration block. Multiple such attributes may be added to a zone. No error checking is performed on the values of this attribute, so caution must be taken to avoid introducing syntax errors into the resulting config file.

The DNS Server Group object type may also contain the following service-wide attribute:

Forward To: Specifies which type of machine members to use as targets for forwarding or stubbing when configuring zones on 'forward' or 'stub' type servers, respectively. The allowed values are 'master', 'slave', or 'both'. For example, if machine members of all types are present in a service group, and Forward To is set to 'both', then any zones added to a 'forward' type server will have 'forwarders' set to both master and slaves. In the absence of a set value for this parameter, 'master' is assumed by default.

4.2. Building Configuration Files

The dns-config.pl script will first construct the set of servers which need configuration files, by iterating across the full list of available DNS Server Group objects and retaining all machine members with a valid version and type attribute.

For each host, dns-config.pl will again iterate across all DNS Server Group objects and collect all views applicable to the host.

Finally, for each view on the current host, dns-config.pl will iterate across all DNS Server Group objects once more, collecting all zones applicable the current view. The whole process is shown in the following snippet of Perl-ish pseudocode:

	foreach $DnsSrvGrp {
	  foreach machine member $HN of $DnsSrvGrp {
	    push ($DnsSrvGrp, @{$HGroups{$HN}});
	  }
	}
	foreach $HN (keys %HGroups) {
	  foreach $DnsSrvGrp (@{$HGroups{$HN}}) {
	    verify version consistency for $HN;
	    collect views in $ViewOrder{$VName};
	  }
	  foreach $VName (sort keys %ViewOrder) {
	    foreach $DnsSrvGrp (@{$HGroups{$HN}}) {
	      collect zones from $DnsSrvGrp and process them for inclusion under $HN->$VName;
	    }
	  }
	}

When adding a zone to a 'master' server's config file, dns-config.pl will check whether the zone is configured for DDNS updates, and, if so, also add the necessary authorization bits to the configuration. This information is primarily originated within the DDNS Authorization string attached to each zone, which contains DDNS key information. In addition, host-based authorization for DDNS updates can be configured via service groups of type DDNS_Zone_Auth. This service group type simply bundles zone and machine members, without any attributes. The semantics of a DDNS_Zone_Auth group is that any machine member will be allowed to DDNS-update any zone member, and dns-config.pl ensures this information is added to the relevant named.conf files in the form of ACL statements.

The following is a list of issues to be aware of, in order to prevent dns-config.pl from generating broken named.conf files:

A zone's dynamic update keys from NetReg will be added to the first valid (version, type) master server encountered; this master server will also be added to dhcpd.conf on the dhcp servers, which will then send it dynamic updates when handing out leases.
NetReg will allow multiple servers to be configured as masters, but only one of them will receive the necessary DDNS key material and be added to dhcpd.conf. Which of the multiple servers "wins" is undetermined, so dns-config.pl will send out a warning when this situation is detected.
NetReg will allow a zone to be associated with a server from more than one DNS Server Group. To avoid dns-config.pl throwing an error, only one such association should lead to named.conf material being generated for the server. In other words, all but one of these groups must have the server listed as versionless, to be referenced in other machines' named.conf as a forwarding or stubbing target.

4.3. Current NetReg Service Group Setup

We are currently managing and monitoring the Pittsburgh campus DNS infrastructure, as well as the two off-site servers located on our New York satellite campus. Several other servers under the CMU umbrella are independently managed (Qatar, West, etc.) and, while we slave a few zones from them, we are not concerned with their configuration.

This section explains how the Pittsburgh campus servers are configured, both with global options and in terms of which zones they end up supporting.

4.3.1. Setting Global Options

Several DNS View Definition objects exist in NetReg with options for various (sub)types of DNS servers:

view.globalopts.common: common across all campus servers
view.globalopts.cache: specific to the caching servers
view.globalopts.auth: common across all authoritative servers
view.globalopts.authslave: specific to the slave auth. servers only
view.globalopts.authmaster: specific to the master server only

For example, the location of the statistics file must be configured on all campus servers, and therefore must go under view.globalopts.common. Only caching servers allow recursion, so allow-recursion goes under view.globalopts.cache. Only the master needs to send notification messages (to its slaves) when zones get updated, so also-notify goes under view.globalopts.authmaster. All authoritative servers maintain journal files (for either dynamic updates or incremental zone transfers), so the 'max-journal-size 1024k' setting (which limits the unbounded growth of a journal and thus prevents performance issues when many full zone transfers need to occur simultaneously) goes under view.globalopts.auth.

New options should be added to exactly one of the above groups. In order to keep NetReg configuration relatively easy to follow, I strongly recommend that DNS Server Group objects used to connect machines with their global options be separate from those used to connect machines with the zones they need to serve. The following DNS Server Group objects are currently in use for setting global options:

master.globalopts.dns
- machine: nsauth-master.net.cmu.edu
- options: view.globalopts.[common,auth,authmaster]
slave.globalopts.dns
- machines: nsauth-[cyh,weh]-[a-w].net.cmu.edu
- options: view.globalopts.[common,auth,authslave]
caching.globalopts.dns
- machines: nscache-[cyh,weh]-[a-w].net.cmu.edu
- options: view.globalopts.[common,cache]
- extra views/options: view.proxy.[quickreg,authbridge].bind9

The two extra views (quickreg and authbridge) configured on the caching servers are specific to a limited number of sources (unregistered wired IP space and the AuthBridge/RAWR proxies, respectively), and serve a set of special, hand-configured zones to these clients to redirect certain types of traffic to our "portal"-style page, a.k.a. FirstConnect. The hand-configured zone files must currently be independently loaded onto each caching server (See Section 5 for a potential future enhancement to NetReg which would allow these zones to also be pushed to the servers automatically).

4.3.2. Assigning Zones to Servers

There are three types of zones for which we are authoritative:

Top Layer: cmu.edu, cmu.net, 2.128.in-addr.arpa, cmu.local, etc., and a whole set of "vanity" domains for various campus entities such as cyrusmail.org, etc.
Subdomains: net.cmu.edu, andrew.cmu.edu, 6.2.128.in-addr.arpa, etc
Slaved zones from other servers: 237.128.in-addr.arpa, west.cmu.edu, qatar.cmu.edu, etc. Slaved zones may be at the top layer (237.128.in-addr.arpa) or subdomains (the remainder of the examples just listed).

We map these zones to our servers using four DNS Server Group objects:

toplayer.dns: all top layer zones under our full control are configured as follows:

as master zone files on nsauth-master.net.cmu.edu
slaved from nsauth-master on nsauth-[cyh,weh]-[a-w].net.cmu.edu
stubbed from nsauth-[cyh,weh]-[a-w] on nscache-[cyh,weh]-[a-w]

toplayer-slaved.dns: all top layer zones we do NOT control:

nsauth-master is listed as a versionless master to avoid attempts to generate conflicting zone configuration (see 4.2). The zones are slaved form their respective masters on nsauth-master via separate, dedicated DNS Server Group objects.
further slaved from nsauth-master on nsauth-[cyh,weh]-[a-w]
stubbed from nsauth-[cyh,weh]-[a-w] on nscache-[cyh,weh]-[a-w]
subdomains.dns: all subdomains under our full control:
- master zone files on nsauth-master
- slaved from nsauth-master on nsauth-[cyh,weh]-[a-w]
- no configuration is necessary on the caches, since stubbing top layer domains facilitates recursivey following sub-delegations.
subdomains-slaved.dns: all subdomains we do not control:
- again, nsauth-master is listed as a versionless master, and zones are slaved from their respective masters via separate, dedicated DNS Server Group objects.
- slaved from nsauth-master on nsauth-[cyh,weh]-[a-w]
- still no configuration is necessary on the caches, since stubbing top layer domains facilitates recursivey following sub-delegations.

When a new zone is created in NetReg, it should be assigned to precisely one of the above service groups. If the zone is under our control, that will most likely be subdomains.dns (unless, of course, we're adding a new vanity top layer domain like awesome-cmu-project.org, in which case the service group should be toplayer.dns).

The New York machines (ny-server-[03,04]) are multi-purpose servers, and the DNS service they offer is also serving multiple purposes. Both machines act as caching servers for campus clients. ny-server-03 is master for the IP subnet we use at our NY satellite campus (so that their on-site dhcp server can perform DDNS updates locally, and have the result available in the event of a WAN outage). ny-server-03 also acts as an off-site backup authority for our top layer zones (and is therefore listed as an additional slave in the toplayer*.dns groups). Separate options are configured for these servers via the ny-*.globalopts.dns groups.

4.4. Frequently Encountered NetReg DNS Operations

Several tasks are encountered on a regular basis when administering DNS through NetReg. The list includes zone creation, adding CNAMEs to external entities, etc. More tasks will be described here as they are identified along the way.

4.4.1. Creating New Domains (Zones)

DNS (sub)domains are improperly referred to as "zones" in NetReg parlance. When a new one must be created, we start at NetReg's "Add a DNS Zone" form. First, we fill in the zone name. Forward zones will be named something like foo.cmu.edu, or myvanitydomain.org. Reverse zones look like 123.237.128.in-addr.arpa. The Type pull-down allows three different subtypes for both forward (fw-) and reverse (rv) zone types:

toplevel: A standalone domain with its own dedicated zone file on the master DNS server. A more appropriate name for this could have been '*-standalone'.
permissible: A domain whose records are hosted within its parent's zone file. A better name for this category could have been '*-parent_hosted'.
delegated: A domain we delegate to a (set of) DNS server(s) outside our control. Only NS records are generated for this type of domain, and they are published into the domain's parent's zone file.

The remaining fields (under SOA Parameters) are only applicable to *-toplevel (a.k.a. standalone) domains (or zones):

Email: Use '.' instead of '@', as expected by the DNS SOA record formatting convention. Typically set to 'host-master.andrew.cmu.edu', less frequently to 'advisor.andrew.cmu.edu'.
Host: Leave blank. This value is ignored by dns.pl, which overwrites it with the server marked as master in the DNS Server Group which contains the zone.
Refresh, Retry, Expire, Minimum, Default: Leave blank. NetReg will automatically fill in default values, which are only worth modifying under very specific (and unlikely) circumstances.
DDNS Authorization: If blank, NetReg will push new versions of the zone file to the master server via scp, which will be the only method of updating records within that domain. Populating this field with appropriate key information will enable both NetReg and other systems (such as DHCP, lbnamed, etc.) to dynamically update the zone. An auxiliary command-line script shipped with NetReg (ddns-auth-keygen) should be used to automatically generate content to be pasted into this field, if management of the zone via DDNS is desired.

Once created, standalone and delegated domains also need NS records, which must be added under the DNS Resources section. Domains under our "jurisdiction" typically use nsauth1.net.cmu.edu and nsauth2.net.cmu.edu as their NS records.

Standalone domains should also be added to the appropriate DNS Server Group, in order to have their zone file pushed to the appropriate master server, and replicated on the appropriate slave servers. Domains under our control should be members of either toplayer.dns or (much more likely) subdomains.dns.

Finally, standalone and parent-hosted (i.e., toplevel and permissible) domains may be associated with one or more subnets, for the purpose of allowing machines to be registered with IPs within those subnets and names within those domains.

4.4.2. CNAMEs to External Entities

We are frequently asked to publish CNAME records pointing at an external entity, such as:

	guides.library.cmu.edu	IN	CNAME	cmu.libguides.com

which would allow clients attempting to connect to an advertised destination of guides.library.cmu.edu to be redirected to an externally hosted service at cmu.libguides.com.

NetReg will only generate CNAME records when a CNAME resource is added to a registered host. For example, adding a somecname.domain.cmu.edu CNAME resource to somehost.net.cmu.edu will result in the following record being added under the domain.cmu.edu zone:

	somecname	IN	CNAME	somehost.net.cmu.edu

Netreg will throw an error if the CNAME's domain (domain.cmu.edu) does not exist as a NetReg-managed "zone", since there would not be a place to publish the CNAME record into DNS. Using the guides.library as an example, we'll frequently see users mistakenly register something like guides.library.cmu.edu in NetReg and then attempt to add a CNAME resource named cmu.libguides.com to it. Not only will this fail due to libguides.com not being a NetReg-managed zone, but even if it could succeed, it would publish a CNAME in the wrong, opposite direction of what's actually desired, in the wrong zone (libguides.com instead of library.cmu.edu):

	cmu.libguides.com	IN	CNAME	guides.library.cmu.edu

To correctly complete this task, we need to perform the following steps:

Create libguides.com as a fw-permissible zone. We don't really need NetReg to ever generate and publish a libguides.com zone file (and, for the record, com is also registered as fw-permissible), but need to be capable of registering fake hosts named *.libguides.com, which requires the domain to exist in NetReg.
Associate the newly created libguides.com zone with the Reserved Devices subnet.
Register a machine named cmu.libguides.com on the Reserved Devices subnet, using Mode = reserved.
On the newly registered machine, under DNS Resources, add a CNAME named guides.library.cmu.edu. This should work just fine, since library.cmu.edu is a valid zone controlled by NetReg.

The next time NetReg generates DNS data, a guides CNAME record will be added under the library.cmu.edu zone, pointing toward cmu.libguides.com, which is the desired result.

    dnssec-keygen -r /dev/urandom -K /var/named/CMU/keys -n ZONE -3 \
                  -b 2048 -f KSK $zone_name

    dnssec-keygen -r /dev/urandom -K /var/named/CMU/keys -n ZONE -3 \
                  -b 1024 $zone_name

where $zone_name is the name of the zone in question (e.g. cmu.edu). On RHEL, make sure that all such keys are owned by named:named, so the bind daemon can access them.

A DS record corresponding to the KSK must be uploaded to the zone's parent (edu in our case). To generate this record, use the following command:

    dnssec-dsfromkey -f $zone_file $zone_name > dsset-$zone_name.

where $zone_file is the file name under which the zone is stored on the master (e.g. CMU.EDU.zone). Upload the DS record to the parent via whatever interface is available (e.g., http://net.educause.edu/edudomain/ for .edu domains).

To force bind to (re-)sign the zone immediately, use:

    rndc sign $zone_name

5.2.2 Key Rollover

To roll over a key, first generate a replacement KSK and/or ZSK (using 'dnssec-keygen as illustrated above in 5.2.1). Next, calculate the largest advertised TTL within the zone (MaxTTL). A quick and easy way to accomplish this would be:

    named-checkzone -o - $zone_name $zone_file | awk '{print $2}' | sort -n

If this value is very large, the DNSKEY RRSIG expiration time may occur sooner, and we may use that lower wait time instead.

We now need to re-sign the zone with

    rndc sign $zone_name

which will include both the old and new versions of the key being rolled over.

If this is the KSK being rolled over, we must now also generate a DS record for the new version of the key:

    dnssec-dsfromkey -f $zone_file $zone_name > dsset-$zone_name.

upload it to the parent, and wait for its publication (use whois to determine when publication has occurred).

At this point, we mark the old key for expiration/deletion in $MaxTTL as calculated above:

    dnssec-settime -K /var/named/CMU/keys \
                   -R +$MaxTTL -I +$MaxTTL -D +$MaxTTL $old_key_file

Look for $old_key_file under /var/named/CMU/keys. It should look something like K$zone_name.+007+$old_key_serial_no.key. The old key should stop automatically stop being used shortly after now + $MaxTTL, but 'rndc sign' may be used anytime after that deadline to force bind to resign the zone without the old key.

6. Next Steps

This section serves as a running wish- (and ToDo) list for remaining DNS related work.

IPv6 service: all our DNS servers are IPv6 enabled. We are currently updating the campus backbone software, and IPv6 networking will be made available to the servers within a month, enabling them to respond to IPv6 queries from on- and off-campus

DNSSec: NetReg needs the ability to place DS record attributes on a child zone such as andrew.cmu.edu in order to publish them in the parent (e.g. cmu.edu), in a way very much similar to how NS records are currently handled. This will allow us to sign zones below cmu.edu, whether they're separate files we maintain (andrew) or delegated (cs, qatar, etc.).

Rewrite dns.pl (zone file generator). One idea is to move zone and config generation off NetReg and onto each server (using SOAP). Short of that, we should at least clean up the bitrot, since the code has not been touched in years. I am starting a list of things I found dns.pl here, to keep track of the issues:

under some circumstances, dns.pl will not send nsupdates to remove stale NS records from a zone.

Implement NetReg support for "virtual" zones, which get all their records configured within the zone (and receive no data from subnet associations). These could then be used to replace the hand-written zone files used with quickreg and authbridge (see 4.3.1).