strange problem with MX records, firewall, Bind and Windows DNS

Kevin Darcy kcd at daimlerchrysler.com
Fri May 31 13:04:59 UTC 2002


RFC 974 was technically obsoleted by RFC 2821, but I think the MX algorithm is
essentially the same.

In any case, you're getting a SERVFAIL, so basically all bets are off as far as
mail delivery is concerned (mailers are supposed to do something "reasonable",
which covers a lot of ground).
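
As a rough illustration of what "reasonable" could mean, here is a minimal
sketch of the decision described in the RFC 974 text quoted below. The
function name and the returned labels are hypothetical, and lumping every
other non-zero rcode in with "retry later" is my assumption, not something
the RFC spells out.

```python
# Hypothetical sketch: not taken from any particular mailer.
NOERROR, SERVFAIL, NXDOMAIN = 0, 2, 3  # RCODE values from the DNS header


def mailer_action(rcode, mx_count):
    """Return what a mailer might reasonably do with an MX response."""
    if rcode == NXDOMAIN:
        return "bounce"              # non-existent domain: return to sender
    if rcode != NOERROR:
        return "retry_later"         # SERVFAIL (and, by assumption, other errors)
    if mx_count == 0:
        return "use_address_record"  # empty answer: try the name itself as the mail host
    return "use_mx_list"             # normal case: sort and walk the MX RRs
```

With the numbers from the traces below, mailer_action(2, 0) matches the
inside case (the message stays queued) and mailer_action(0, 0) matches the
outside case (restest goes on to gethostbyname()).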

Perhaps you could summarize exactly when you get a SERVFAIL and when you
don't: inside the firewall versus outside it? Using "dig", "nslookup", or
whatever Micro$oft's lookup tool is?
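
If none of those tools is handy on both sides of the firewall, the same
comparison can be made with a few lines of Python. This is only a sketch
using the standard library; the function names are made up for illustration,
it sends a single UDP query, and it reads nothing but the response header.

```python
import socket
import struct

# RCODE names as defined for the DNS header
RCODES = {0: "NOERROR", 1: "FORMERR", 2: "SERVFAIL",
          3: "NXDOMAIN", 4: "NOTIMP", 5: "REFUSED"}


def build_mx_query(name, query_id=1):
    """Build a recursive (rd=1) query for type MX (15), class IN (1)."""
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = b"".join(
        bytes([len(part)]) + part.encode("ascii")
        for part in name.rstrip(".").split(".")
    )
    return header + labels + b"\x00" + struct.pack("!HH", 15, 1)


def decode_header(packet):
    """Return (id, rcode name, answer count) from a DNS response."""
    query_id, flags, _qd, ancount, _ns, _ar = struct.unpack("!HHHHHH", packet[:12])
    rcode = flags & 0x000F  # low four bits of the flags word
    return query_id, RCODES.get(rcode, str(rcode)), ancount


def ask(server, name, timeout=5.0):
    """Send one UDP query to `server` on port 53 and summarize the reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.settimeout(timeout)
        sock.sendto(build_mx_query(name), (server, 53))
        reply, _addr = sock.recvfrom(512)
    return decode_header(reply)
```

Calling ask() with the address of the internal forwarder and then with an
outside server, for the same name, shows directly which path produces the
SERVFAIL.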


- Kevin

bindlist wrote:

> Thanks for the reply. Unfortunately, the client's security people told me
> to clean the domain information before sending any of it out.
>
> Here's the complete Microsoft restest query in debug mode (IPs and domains
> cleaned again; I'll try for permission today to send it unfiltered).
>
> The first is from an NT server behind the firewall trying to forward the
> query to the BIND server. It returns rcode 2 (SERVFAIL).
>
> The second is from an NT server on the lan I work at NOT using the BIND
> server in question. It was done to show the output differences between the
> problem I am seeing and what I would normally expect.
>
> Also, playing "read the RFC", I did find the following, which suggests
> something is goofed on the external DNS server.
>
> RFC 974
>
> Issuing a Query
>
> The first step for the mailer at LOCAL is to issue a query for MX RRs for
> REMOTE. It is strongly urged that this step be taken every time a mailer
> attempts to send the message. The hope is that changes in the domain
> database will rapidly be used by mailers, and thus domain administrators
> will be able to re-route in-transit messages for defective hosts by simply
> changing their domain databases.
> Certain responses to the query are considered errors:
>
> Getting no response to the query. The domain server the mailer queried
> never sends anything back. (This is distinct from an answer which contains
> no answers to the query, which is not an error).
>
> Getting a response in which the truncation field of the header is set.
> (Recall discussion of incomplete queries above). Mailers may not use
> responses of this type, and should repeat the query using virtual circuits
> instead of datagrams.
>
> Getting a response in which the response code is non-zero.
> Mailers are expected to do something reasonable in the face of an error.
> The behaviour for each type of error is not specified here, but
> implementors should note that different types of errors should probably be
> treated differently. For example, a response code of "non-existent domain"
> should probably cause the message to be returned to the sender as invalid,
> while a response code of "server failure" should probably cause the
> message to be retried later.
> There is one other special case. If the response contains an answer which
> is a CNAME RR, it indicates that REMOTE is actually an alias for some
> other domain name. The query should be repeated with the canonical domain
> name.
>
> If the response does not contain an error response, and does not contain
> aliases, its answer section should be a (possibly zero length) list of MX
> RRs for domain name REMOTE (or REMOTE's true domain name if REMOTE was a
> alias). The next section describes how this list is interpreted.
>
> Also, another RFC (I forget which one) explains the DNS header fields and
> the result codes. An error 2 (server failure) points to a misconfigured
> DNS server.
>
> Other points of note: I was told the internal (NT DNS) systems have not
> changed in their config. The external one was changed, and then the
> problem started. Unfortunately, someone forgot the concept of revision
> control and being able to back out of a change. By the time I was called
> in, I couldn't even restore from an old backup, since the backups had
> already rotated out (i.e. months had passed before I was contacted to work
> on this).
>
> Thanks again for the assistance. The last part of this is the testing I
> referenced above:
>
> Restest from inside firewall:
>
> C:\>restest -debug cdy.bdsys.com
>
> Microsoft (R) Name Resolution Test Utility (5.5.1960.3)
> Copyright (C) Microsoft Corp 1986-1997. All rights reserved.
>
> res_querydomain(cdy.bdsys.com, foo.com, 1, 15)
> res_query(cdy.bdsys.com.foo.com, 1, 15)
> res_mkquery(0, cdy.bdsys.com.foo.com, 1, 15)
> res_send()
> HEADER:
>         opcode = QUERY, id = 3, rcode = NOERROR
>         header flags:  rd
>         qdcount = 1, ancount = 0, nscount = 0, arcount = 0
>
> QUESTIONS:
>         cdy.bdsys.com.foo.com, type = MX, class = IN
>
> Querying server (# 1) address = x.x.8.4
> got answer:
> HEADER:
>         opcode = QUERY, id = 3, rcode = NXDOMAIN
>         header flags:  qr aa rd ra
>         qdcount = 1, ancount = 0, nscount = 1, arcount = 0
>
> QUESTIONS:
>         cdy.bdsys.com.foo.com, type = MX, class = IN
>
> NAME SERVERS:
>         foo.com
>         type = SOA, class = IN, ttl = 1 hour, dlen = 64
>         origin = x.xdns02.foo.com
>         mail addr = administrator at foo.com
>         serial = 8806
>         refresh = 1 hour
>         retry = 10 mins
>         expire = 1 day
>         min = 10 mins
>
> rcode = 3, ancount=0
> res_querydomain(cdy.bdsys.com, (null), 1, 15)
> res_query(cdy.bdsys.com, 1, 15)
> res_mkquery(0, cdy.bdsys.com, 1, 15)
> res_send()
> HEADER:
>         opcode = QUERY, id = 4, rcode = NOERROR
>         header flags:  rd
>         qdcount = 1, ancount = 0, nscount = 0, arcount = 0
>
> QUESTIONS:
>         cdy.bdsys.com, type = MX, class = IN
>
> Querying server (# 1) address = x.x.8.4
> got answer:
> HEADER:
>         opcode = QUERY, id = 4, rcode = SERVFAIL
>         header flags:  qr rd ra
>         qdcount = 1, ancount = 0, nscount = 0, arcount = 0
>
> QUESTIONS:
>         cdy.bdsys.com, type = MX, class = IN
>
> rcode = 2, ancount=0
> An error occurred while trying to resolve an IP address for cdy.bdsys.com.
> The message would have been queued for another delivery attempt later.
> Check your DNS configuration and make sure your DNS server(s) are running.
>
> Restest from outside:
>
> C:\>restest -debug cdy.bdsys.com
>
> Microsoft (R) Name Resolution Test Utility (5.5.1960.3)
> Copyright (C) Microsoft Corp 1986-1997. All rights reserved.
>
> res_querydomain(cdy.bdsys.com, external-system.com, 1, 15)
> res_query(cdy.bdsys.com.external-system.com, 1, 15)
> res_mkquery(0, cdy.bdsys.com.external-system.com, 1, 15)
> res_send()
> HEADER:
>         opcode = QUERY, id = 4, rcode = NOERROR
>         header flags:  rd
>         qdcount = 1, ancount = 0, nscount = 0, arcount = 0
>
> QUESTIONS:
>         cdy.bdsys.com.external-system.com, type = MX, class = IN
>
> Querying server (# 1) address = x.x.181.227
> got answer:
> HEADER:
>         opcode = QUERY, id = 4, rcode = NXDOMAIN
>         header flags:  qr aa rd ra
>         qdcount = 1, ancount = 0, nscount = 1, arcount = 0
>
> QUESTIONS:
>         cdy.bdsys.com.external-system.com, type = MX, class = IN
>
> NAME SERVERS:
>         external-system.com
>         type = SOA, class = IN, ttl = 2 days, dlen = 41
>         origin = dc1.external-system.com
>         mail addr = postmaster.external-system.com
>         serial = 174
>         refresh = 1 hour
>         retry = 10 mins
>         expire = 1 day
>         min = 2 days
>
> rcode = 3, ancount=0
> res_querydomain(cdy.bdsys.com, external.com, 1, 15)
> res_query(cdy.bdsys.com.external.com, 1, 15)
> res_mkquery(0, cdy.bdsys.com.external.com, 1, 15)
> res_send()
> HEADER:
>         opcode = QUERY, id = 5, rcode = NOERROR
>         header flags:  rd
>         qdcount = 1, ancount = 0, nscount = 0, arcount = 0
>
> QUESTIONS:
>         cdy.bdsys.com.external.com, type = MX, class = IN
>
> Querying server (# 1) address = x.x.181.227
> got answer:
> HEADER:
>         opcode = QUERY, id = 5, rcode = NXDOMAIN
>         header flags:  qr aa ra
>         qdcount = 1, ancount = 0, nscount = 1, arcount = 0
>
> QUESTIONS:
>         cdy.bdsys.com.external.com, type = MX, class = IN
>
> NAME SERVERS:
>         external.com
>         type = SOA, class = IN, ttl = 1 hour, dlen = 38
>         origin = host.external.com
>         mail addr = addr.external.com
>         serial = 1072
>         refresh = 1 hour
>         retry = 10 mins
>         expire = 1 day
>         min = 1 hour
>
> rcode = 3, ancount=0
> res_querydomain(cdy.bdsys.com, (null), 1, 15)
> res_query(cdy.bdsys.com, 1, 15)
> res_mkquery(0, cdy.bdsys.com, 1, 15)
> res_send()
> HEADER:
>         opcode = QUERY, id = 6, rcode = NOERROR
>         header flags:  rd
>         qdcount = 1, ancount = 0, nscount = 0, arcount = 0
>
> QUESTIONS:
>         cdy.bdsys.com, type = MX, class = IN
>
> Querying server (# 1) address = x.x.181.227
> got answer:
> HEADER:
>         opcode = QUERY, id = 6, rcode = NOERROR
>         header flags:  qr aa ra
>         qdcount = 1, ancount = 0, nscount = 1, arcount = 0
>
> QUESTIONS:
>         cdy.bdsys.com, type = MX, class = IN
>
> NAME SERVERS:
>         bdsys.com
>         type = SOA, class = IN, ttl = 2 hours 24 mins, dlen = 48
>         origin = ns1.host.net
>         mail addr = support.host.net
>         serial = 2001072502
>         refresh = 1 day
>         retry = 30 mins
>         expire = 10 days
>         min = 2 hours 24 mins
>
> rcode = 0, ancount=0
> No MX Records were found in the response from the DNS server.  Doing
> gethostbyname()
>
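
In both restest traces above, the tool first appends each local search
suffix to the name and only queries the bare name last, so the NXDOMAIN
answers are expected noise and the final query is the one that matters. A
minimal sketch of that ordering follows; the function name is hypothetical,
and it mirrors what the traces show rather than restest's actual code.

```python
def candidate_names(name, search_list):
    """List query names in the order the traces show them being tried."""
    name = name.rstrip(".")
    candidates = [f"{name}.{suffix}" for suffix in search_list]
    candidates.append(name)  # the bare name is tried last, with no suffix
    return candidates
```

For the outside trace, candidate_names("cdy.bdsys.com",
["external-system.com", "external.com"]) gives the three names queried
there, in the same order.
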
> On Thu, 30 May 2002, Simon Waters wrote:
>
> >
> > bindlist wrote:
> > >
> > > Testing shows that queries work for the most part except for one instance
> > > that is causing mail problems with other mailsystems.
> > >
> > > If, say, I am on inner.foo.bar (or a workstation on the net using the
> > > inner.foo.bar NS) and try to look up an MX record for some.com, the
> > > request is forwarded to the external DNS server. Next, ns.foo.bar tries
> > > to look up the MX record and gets (one example, taken from a cache
> > > dump, but any such lookup behaves the same):
> >
> > > The result sent to the inner.foo.bar nameserver ends up as:
> > >
> > > DNS R  Error:2(Server Fail)
> >
> > Sorry, can you show us what query you actually ran?
> >
> > Use "dig" and show the whole output cut and pasted.
> >
> > > At this point mail gets queued to the outside domain (in this case
> > > x.berkeley.edu) because the error return is keeping something from falling
> >
> > I assume you mean ssl.berkeley.edu?
> >
> > > back to an A record to try to hand mail off to since there is no MX
> > > record.
> > >
> > > Or at least that's what I thought occurred. If there are no MX records,
> > > BIND would then try for an A record, yes?
> >
> > No, if you ask for an MX record you will get an MX record. Most
> > modern MTAs either ask for MX then A, or ask for "ANY" and sort
> > it out themselves.
> >
> > But none of this logic is implemented in BIND: you ask the DNS
> > for an MX, and if there is none it typically returns no-error and
> > 0 answers (assuming the domain exists).
> >
> > > Querying server (# 1) address = x.x.x.x
> > > got answer:
> > > HEADER:
> > >         opcode = QUERY, id = 4, rcode = SERVFAIL
> > >         header flags:  qr rd ra
> > >         qdcount = 1, ancount = 0, nscount = 0, arcount = 0
> >
> > > Anyone got any ideas?
> >
> > What do you get if you run "dig @ns.foo.bar ssl.berkeley.edu mx" from
> > the inner name server? The debug output suggests you should get a
> > server failure. If you do, the external server is broken; if you get
> > an empty answer, then it is an MS DNS problem and off topic here.
> >
> > We generally try not to obscure domain names and the like. If
> > the external name server provides recursion to everyone I could
> > have answered one of these queries in a couple of seconds.
> >
> >


