Still Having Intermittent Issues Resolving External Domains

Joel M Nimety jnimety at cybergnostic.com
Tue Jun 1 02:34:05 UTC 2004


Everything that is commented out is commented out on purpose.  If an 
option appears in my config as a comment, it means that we tried running 
bind with that option but still experienced problems.  So, to keep the 
config relatively simple, I commented those options out.  If you can 
justify why allowing more tcp connections is beneficial, I will 
uncomment that option.

Also, when running rndc status on 9.3beta4, I always see 0/xxx tcp 
connections and approx. 20-30 active recursive queries.  When we 
experience problems, the number of recursive queries starts to rise.  I 
believe this is a symptom of the problem (concurrent recursive queries 
increase because every recursive query stays alive for the maximum time 
allowed), not a cause of it.  Incidentally, I haven't seen the number of 
recursive queries rise to alarming levels.  The largest I've seen is ~200.
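
For anyone who wants to watch these counters over time, here is a 
minimal sketch that pulls the recursive and tcp client counts out of 
rndc status output.  It assumes the output contains lines of the form 
"recursive clients: 27/5000" and "tcp clients: 0/100" (some later 
releases appear to print three fields, current/soft/hard, which the 
pattern also tolerates).  The function names and the threshold argument 
are illustrative only; nothing here is part of bind itself.

```python
import re

# Assumed line shapes (not taken from this thread's actual output):
#   recursive clients: 27/5000
#   recursive clients: 27/4900/5000   (three-field variant)
#   tcp clients: 0/100
_COUNT_RE = re.compile(
    r"^\s*(recursive|tcp) clients:\s*(\d+)/(?:\d+/)?(\d+)\s*$",
    re.MULTILINE,
)


def parse_client_counts(status_text):
    """Map "recursive"/"tcp" to (current, limit) for each line that is present."""
    counts = {}
    for kind, current, limit in _COUNT_RE.findall(status_text):
        counts[kind] = (int(current), int(limit))
    return counts


def recursive_above(status_text, threshold):
    """True if the current recursive client count exceeds the threshold."""
    counts = parse_client_counts(status_text)
    if "recursive" not in counts:
        return False  # line missing: nothing to report
    return counts["recursive"][0] > threshold
```

Feeding it the text captured from "rndc status" once a minute and 
logging the result alongside a timestamp would show whether the rise in 
recursive queries starts before or after lookups begin to fail.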

We have had a *very* difficult time determining what exactly is a cause 
vs. a symptom, which is another reason this issue is so difficult to 
debug.  I don't want to assume that something is wrong with bind (if I 
did, I'm afraid I'd get flamed, a sad side effect of asking intelligent 
questions on a newsgroup, and I'd rather something be wrong with my 
setup anyway).

This I know:

1) root servers use bind and it works.
2) We use bind and it doesn't work.

Please, someone, what is wrong with my config/setup?  I want something 
to be wrong on my end.  If there is something wrong here, I can fix it 
and move on.  Otherwise...

-- Joel

Sten Carlsen wrote:

> Comments below. You still have some statements commented out, though I 
> think you want to have them active.
> 
> Joel M Nimety wrote:
> 
>>I am running bind chroot'd as user named.  I will turn up debugging.
>>tcpdumps during the outages show recursive requests going out but no
>>replies coming back.
>>
>>Here is my config:
>>  
>>
> snip...
> 
>> logging {
>> category lame-servers { null; };
>> };
>>
>> // Set options for security
>> // We run BIND9 chroot'ed so all paths are relative to /var/lib/named.
>> options {
>> directory "/var/cache/bind";
>> statistics-file "/var/run/named.stats";
>> pid-file "/var/run/named.pid";
>> memstatistics-file "/var/named/named.memstats";
>> dump-file "/var/adm/named.dump";
>> zone-statistics yes;
>>
>> // We need to allow more recursive queries for spam and email.
>> recursive-clients 5000;
>>
>> // We need to explicitly set the edns size so as not to confuse
>> // certain firewalls
>> // edns-udp-size 512;
> 
> 
>>
>> // Force port 53;
>> // query-source address * port 53;
>>
>>
>> // Allow more simultaneous connections.
>> // tcp-clients 500;
> 
> 
> These are still part of a comment, remove the "//" to activate these 
> options.
> 
>>
>> // Prevent DoS attacks by generating bogus zone transfer
>> // requests. This will result in slower updates to the
>> // slave servers (e.g. they will await the poll interval
>> // before checking for updates).
>> notify no;
>>
>> // Generate more efficient zone transfers. This will place
>> // multiple DNS records in a DNS message, instead of one per
>> // DNS message.
>> transfer-format many-answers;
>>
>> // Set the maximum zone transfer time to something more
>> // reasonable. In this case, we state that any zone transfer
>> // that takes longer than 60 minutes is unlikely to ever
>> // complete. WARNING: If you have very large zone files,
>> // adjust this to fit your requirements.
>> max-transfer-time-in 60;
>>
>> // We have no dynamic interfaces, so BIND shouldn't need to
>> // poll for interface state {UP|DOWN}.
>> interface-interval 0;
>>
>> allow-transfer {
>> // Zone transfers limited to members of the
>> // "xfer" ACL.
>> xfer;
>> };
>>
>> allow-query {
>> // Accept queries from our "trusted" ACL. We will
>> // allow anyone to query our master zones below.
>> // This prevents us from becoming a free DNS server
>> // to the masses.
>> trusted;
>> };
>>
>> blackhole {
>> // Deny anything from the bogon networks as
>> // detailed in the "bogon" ACL.
>> bogon;
>> };
>>
>> // conform to RFC1035
>> auth-nxdomain no;
>>
>> };
>>
>> view "internal-in" in {
>> // Our internal (trusted) view. We permit the internal networks
>> // to freely access this view. We perform recursion for our
>> // internal hosts, and retrieve data from the cache for them.
>>
>> match-clients { trusted; };
>> recursion yes;
>> additional-from-auth yes;
>> additional-from-cache yes;
>>
>> zone "." {
>> // prime the server with knowledge of the root servers
>> type hint;
>> file "/etc/bind/db.root";
>> };
>>
>> // be authoritative for the localhost forward and reverse zones, and for
>> // broadcast zones as per RFC 1912
>>
>> zone "localhost" {
>> type master;
>> file "/etc/bind/db.local";
>>
>> allow-query {
>> any;
>> };
>>
>> allow-transfer {
>> none;
>> };
>> };
>>
>> zone "127.in-addr.arpa" {
>> type master;
>> file "/etc/bind/db.127";
>>
>> allow-query {
>> any;
>> };
>>
>> allow-transfer {
>> none;
>> };
>>
>> };
>>
>> zone "0.in-addr.arpa" {
>> type master;
>> file "/etc/bind/db.0";
>>
>> allow-query {
>> any;
>> };
>>
>> allow-transfer {
>> none;
>> };
>>
>> };
>>
>> zone "255.in-addr.arpa" {
>> type master;
>> file "/etc/bind/db.255";
>>
>> allow-query {
>> any;
>> };
>>
>> allow-transfer {
>> none;
>> };
>>
>> };
>>
>> // more zones here, removed for brevity -- Joel
>>
>> };
>>
>> // Create a view for external DNS clients.
>> view "external-in" in {
>> // Our external (untrusted) view. We permit any client to access
>> // portions of this view. We do not perform recursion or cache
>> // access for hosts using this view.
>>
>> match-clients { any; };
>> recursion no;
>> additional-from-auth no;
>> additional-from-cache no;
>>
>> // Link in our zones
>> zone "." in {
>> type hint;
>> file "/etc/bind/db.root";
>> };
>> };
>>
>> // Create a view for all clients perusing the CHAOS class.
>> // We allow internal hosts to query our version number.
>> // This is a good idea from a support point of view.
>>
>> view "external-chaos" chaos {
>> match-clients { any; };
>> recursion no;
>>
>> zone "." {
>> type hint;
>> file "/dev/null";
>> };
>>
>> zone "bind" {
>> type master;
>> file "/etc/bind/db.bind";
>>
>> allow-query {
>> trusted;
>> };
>> allow-transfer {
>> none;
>> };
>> };
>>
>>
>> };
>>
>>
>>
>> Richard Maynard wrote:
>> >> > We're experiencing intermittent issues resolving domain names. Often
>> >> > these domains are microsoft.com, cnn.com, etc. We are running 3
>> >> > identical servers and sometimes they can go a week or two without any
>> >> > trouble (other times only hours), then without warning one server will
>> >> > be unable to perform a recursive lookup for a few domains.
>> >
>> >
>> > Is bind running as a non root user? If so, is there anything that
>> > might be changing the interfaces so bind can not rebind to them?
>> >
>> > Is there anything at all appearing in your bind logs? If not, do you
>> > have the log level cranked up to see if perhaps you could get something?
>> >
>> >
>> >>We are getting desperate. I can provide tcpdumps and configs if
>> >>necessary.
>> >
>> >
>> > Configs are a good place to start so folks can take a look at how you
>> > expect your server to start.
>> >
>>
> 
> -- 
> Best regards
> 
> Sten Carlsen
> 
> No improvements come from shouting:
> 
>        "MALE BOVINE MANURE!!!" 
> 


