Really odd one: parts of global DNS just dropped off the map
jim at rfc1035.com
Thu Nov 25 07:59:44 UTC 2004
>>>>> "Andy" == Andy Holyer <andy at holyer.org> writes:
Andy> It seemed a bunch of (mainly US-based) sites were failing
Andy> DNS. Other (UK-based) worked fine. Trying another server in
Andy> the same facility gave the same result. Finally I switched
Andy> forwarders to another ISP and called it a day.
Sounds like there was some sort of routing or connectivity problem:
nothing to do with the DNS. Take this up with your ISP.
Why are you using forwarding? This is silly, dangerous and
pointless. Consult the list archives for an explanation. Perhaps your
DNS infrastructure has been forwarding queries to servers that were
broken or had connectivity problems? This is one of the reasons why
people should run their own name servers: when something goes wrong,
there are fewer links in the chain to troubleshoot.
Andy> This morning I switched things back and all appears
Andy> fine. However, digging around, I don't get ping responses from
Andy> about half the hosts in named.root. b.root-servers.net, for
Andy> example. Now, I can understand that root servers would just
Andy> turn off ICMP echo since they're busy enough as it is, but
Andy> it still worries me a bit.
Why? If you want to know if a name server is running, query it!
Sending a ping only establishes if there's connectivity: it doesn't
prove the target is running a working name server. And many busy name
servers (and networks) rate-limit inbound ICMP traffic or don't let
ping traffic through at all.
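To make "query it" concrete, here is a minimal sketch in Python that sends one non-recursive NS query for the root zone over UDP and reports whether a valid DNS reply came back. It does roughly what `dig @server . ns +norec` does. The function names, the fixed query ID and the 3 second timeout are assumptions made for illustration; they are not part of BIND or any other tool.

```python
import socket
import struct

QUERY_ID = 0x1234  # arbitrary ID chosen for this sketch


def build_query(name: str, qtype: int = 2, query_id: int = QUERY_ID) -> bytes:
    """Build a minimal DNS query in RFC 1035 wire format (qtype 2 = NS)."""
    # Header: ID, flags (0 = standard query, recursion not desired),
    # QDCOUNT=1, ANCOUNT=0, NSCOUNT=0, ARCOUNT=0.
    header = struct.pack("!HHHHHH", query_id, 0, 1, 0, 0, 0)
    qname = b""
    for label in name.rstrip(".").split("."):
        if label:
            encoded = label.encode("ascii")
            qname += bytes([len(encoded)]) + encoded
    qname += b"\x00"  # the root label terminates every name
    return header + qname + struct.pack("!HH", qtype, 1)  # QCLASS 1 = IN


def parse_header(packet: bytes) -> dict:
    """Decode the fixed 12-byte DNS header."""
    ident, flags, qd, an, ns, ar = struct.unpack("!HHHHHH", packet[:12])
    return {
        "id": ident,
        "is_response": bool(flags & 0x8000),
        "rcode": flags & 0x000F,
        "answers": an,
        "authority": ns,
        "additional": ar,
    }


def server_answers(address: str, name: str = ".", timeout: float = 3.0) -> bool:
    """Return True if the server at `address` sends back a NOERROR reply."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        try:
            sock.sendto(query, (address, 53))
            reply, _ = sock.recvfrom(4096)
        except OSError:  # covers timeouts and unreachable networks
            return False
    if len(reply) < 12:
        return False
    header = parse_header(reply)
    return header["is_response"] and header["id"] == QUERY_ID and header["rcode"] == 0
```

Call `server_answers()` with a server address taken from your own root hints file. A False result still does not tell you whether the server or the path to it is at fault, but a True result proves far more than a ping reply would.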
Andy> I've never seen this sort of behaviour before, and I'm not
Andy> at all sure where to start in finding out what's going on,
Andy> and whether there's some subtle mis-configuration on my
Andy> part. From my part, the serial number in my root db files
Andy> tells me that I haven't touched the named config since early
Serial numbers in zone files tell you nothing. They're only used for
comparisons, i.e. deciding which version of a zone is most recent. They don't
(have to) relate to dates. They have nothing to do with the
configuration of a name server. Or the network. If what you meant to
say was your DNS configuration hasn't changed for a while, then the
problem you describe was most likely caused by something else. So
check those things: routing, firewalls, peering at upstream providers,
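As an aside on how serial numbers get compared: the comparison uses the 32-bit wrap-around arithmetic of RFC 1982, not a reading of the number as a date. The sketch below is one way to write that rule in Python. The function name is made up for illustration, and the example serials are arbitrary values.

```python
SERIAL_BITS = 32
MOD = 2 ** SERIAL_BITS
HALF = 2 ** (SERIAL_BITS - 1)


def serial_is_newer(candidate: int, current: int) -> bool:
    """True if `candidate` is greater than `current` under RFC 1982 rules."""
    if candidate == current:
        return False
    # Distance from current forward to candidate, modulo 2^32.
    diff = (candidate - current) % MOD
    # A forward distance of less than 2^31 means candidate is newer.
    # A distance of exactly 2^31 is undefined in RFC 1982; this sketch
    # treats it as "not newer".
    return diff < HALF


# A date-like serial and a plain counter are compared the same way.
print(serial_is_newer(2004112501, 2004112401))  # True
print(serial_is_newer(1, 4294967295))           # True: the serial wrapped
print(serial_is_newer(5, 2004112401))           # False
```

Nothing in the comparison cares whether the number looks like YYYYMMDDnn. That is only a convention some operators follow.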
Andy> *Any* advice as to where I could go from here to ensure
Andy> integrity of DNS is most gratefully received.
Are you volunteering to fix every broken name server configuration and
zone file on the planet? :-) Good for you! :-)
All you can do is make sure your own servers are working correctly:
 Don't use forwarding. Ever.
 Always run up to date DNS software.
 Put the zone files and config files under version control.
 Check these files before feeding them to a name server.
named-checkzone and named-checkconf are your friends.
 Disable recursive service for non-local users.
 Monitor the name server's logs and act on any errors.
 Document your DNS installation & management processes: zone file
changes, upgrades, where name servers are located and which users
they serve, contact names and addresses, SLAs with slave server
operators, problem escalation procedures, support arrangements, etc.
 Follow the advice in RFC2182 and BCPs on DNS operations.
Google will point you at these.
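For the "check these files before feeding them to a name server" item, a small wrapper along these lines could run from a version control hook or a Makefile. It is only a sketch: the function names, the config path and the zone name and file in the usage note are hypothetical, and it assumes named-checkconf and named-checkzone are on the PATH and are invoked as `named-checkconf <config>` and `named-checkzone <zone> <file>`.

```python
import subprocess


def check_commands(config_path: str, zones: dict) -> list:
    """Build the list of check commands: one for the config, one per zone.

    `zones` maps a zone name to the path of its zone file.
    """
    commands = [["named-checkconf", config_path]]
    for zone_name, zone_file in zones.items():
        commands.append(["named-checkzone", zone_name, zone_file])
    return commands


def run_checks(commands: list) -> list:
    """Run each command and return (command, output) for every failure."""
    failures = []
    for command in commands:
        try:
            result = subprocess.run(command, capture_output=True, text=True)
        except FileNotFoundError:
            failures.append((command, "command not found"))
            continue
        if result.returncode != 0:
            failures.append((command, result.stdout + result.stderr))
    return failures
```

Usage would look something like `run_checks(check_commands("/etc/named.conf", {"example.com": "zones/example.com.db"}))`, with the name server reloaded only when the returned list is empty.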