glibc IPv6 resolver issue - search order and getaddrinfo()

Ted_Rule at flextech.co.uk
Mon Dec 10 13:49:52 UTC 2001




Over the weekend we had a nasty loss of Internet connectivity, and during
that outage we found that some of our more recently patched Linux boxes had
extreme difficulty telnetting to anywhere.

Further investigation revealed that the most recent telnet RPMs have included
IPv6 support, and in particular use getaddrinfo instead of gethostbyname or
gethostbyname2 to determine the host address.

getaddrinfo() is called with an address family hint of PF_UNSPEC.
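
For anyone unfamiliar with the call, here is a minimal sketch of a lookup
with an unspecified family hint. It is not taken from the telnet source: the
helper name count_addresses() and the numeric port "23" are illustrative
assumptions. POSIX spells the hint AF_UNSPEC; glibc defines it to the same
value as PF_UNSPEC.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Hypothetical helper: resolve `host` with the given family hint and
 * return the number of addresses found, or -1 on a resolver error. */
int count_addresses(const char *host, int family, int flags)
{
    struct addrinfo hints, *res, *p;
    int n = 0;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = family;         /* AF_UNSPEC: any address family */
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_flags = flags;

    if (getaddrinfo(host, "23", &hints, &res) != 0)
        return -1;

    for (p = res; p != NULL; p = p->ai_next)
        n++;

    freeaddrinfo(res);
    return n;
}
```

Calling count_addresses(name, AF_UNSPEC, 0) lets the resolver look for both
AAAA and A records, whereas AF_INET restricts it to A records. Passing
AI_NUMERICHOST with a literal such as "127.0.0.1" exercises the same code
path without sending any DNS query.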

As a result, the DNS server is queried for AAAA and A records in the resolv.conf
search order.

However, somewhat to my surprise, the exact order is arguably "wrong".

For a connection to a host called badhost, the DNS lookups performed are
(assuming no such host exists, and given the resolv.conf shown below):

$ cat /etc/resolv.conf
search flextech.co.uk oursister.com
nameserver 10.1.2.3
nameserver 10.2.3.4

badhost.flextech.co.uk/AAAA/IN
badhost.oursister.com/AAAA/IN
badhost/AAAA/IN
badhost.flextech.co.uk/A/IN
badhost.oursister.com/A/IN
badhost/A/IN

rather than

badhost.flextech.co.uk/AAAA/IN
badhost.flextech.co.uk/A/IN
badhost.oursister.com/AAAA/IN
badhost.oursister.com/A/IN
badhost/AAAA/IN
badhost/A/IN

i.e. the resolver library searches all possible IPv6 hosts in the search path,
including the root domain itself, before attempting any IPv4 lookup.
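
The difference between the two orderings is just which loop is outermost.
The sketch below only generates the query names in the notation used above;
it performs no DNS lookups and is not glibc's code. The function names and
the use of an empty suffix to stand for the final bare-name lookup are
assumptions made for illustration.

```c
#include <stdio.h>
#include <string.h>

#define QLEN 64

static const char *const TYPES[] = { "AAAA", "A" };

/* Format one query, e.g. "badhost.oursister.com/AAAA/IN".
 * An empty suffix stands for the final lookup of the bare name. */
void format_query(char *buf, size_t len, const char *host,
                  const char *suffix, const char *type)
{
    if (suffix[0] != '\0')
        snprintf(buf, len, "%s.%s/%s/IN", host, suffix, type);
    else
        snprintf(buf, len, "%s/%s/IN", host, type);
}

/* Fill out[] with the query sequence and return how many were written.
 * family_major != 0: every suffix for one record type, then the next
 *                    record type (the order observed).
 * family_major == 0: every record type for one suffix, then the next
 *                    suffix (the order suggested instead). */
size_t list_queries(int family_major, const char *host,
                    const char *const *suffixes, size_t nsuffix,
                    char out[][QLEN], size_t max)
{
    size_t ntypes = sizeof(TYPES) / sizeof(TYPES[0]);
    size_t total = nsuffix * ntypes;
    size_t n;

    for (n = 0; n < total && n < max; n++) {
        size_t s = family_major ? n % nsuffix : n / ntypes;
        size_t t = family_major ? n / nsuffix : n % ntypes;
        format_query(out[n], QLEN, host, suffixes[s], TYPES[t]);
    }
    return n;
}
```

With the suffix list { "flextech.co.uk", "oursister.com", "" } and host
"badhost", family_major = 1 reproduces the first listing and
family_major = 0 reproduces the second.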

Once Internet connectivity is lost, the AAAA lookup in the root domain becomes
impossible, and local connectivity between hosts via this version of telnet
takes several tens of seconds to fall through to the local domain A record
lookup.

Whilst I have been able to patch together a fix for telnet which avoids any
AAAA lookups from hosts with no IPv6 support (diff below), the problem is not
really telnet's fault per se.

I would say that getaddrinfo() ought to be performing a lookup across all
available address families for each suffix in the search order before moving
on to the next suffix, thus ensuring that if a host early in the search order
has no IPv6 records it will still be preferred ahead of any match further
down the search order which does have IPv6 support.
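
Until the library behaves that way, an application can approximate the
suggested order itself by qualifying the name with each suffix in turn. A
name ending in a dot is treated as fully qualified, so the resolver does not
apply its own search list to it. This is only a sketch under assumptions:
the function names are hypothetical, the caller must supply the suffix list
(nothing here reads resolv.conf), and the final bare-name lookup is
deliberately left out.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Build "host.suffix." (note the trailing dot) into buf.
 * Returns 0 on success, -1 if the result would not fit. */
int build_fqdn(char *buf, size_t len, const char *host, const char *suffix)
{
    int n = snprintf(buf, len, "%s.%s.", host, suffix);
    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/* Try every suffix in order, asking for all address families each time.
 * On success returns 0 and stores the result list in *res (release it
 * with freeaddrinfo()); otherwise returns a getaddrinfo() error code. */
int lookup_suffix_first(const char *host, const char *service,
                        const char *const *suffixes, size_t nsuffix,
                        struct addrinfo **res)
{
    struct addrinfo hints;
    char fqdn[1026];
    int error = EAI_NONAME;
    size_t i;

    memset(&hints, 0, sizeof(hints));
    hints.ai_family = AF_UNSPEC;      /* AAAA and A for this one name */
    hints.ai_socktype = SOCK_STREAM;

    for (i = 0; i < nsuffix; i++) {
        if (build_fqdn(fqdn, sizeof(fqdn), host, suffixes[i]) != 0)
            continue;                 /* name too long, skip this suffix */
        error = getaddrinfo(fqdn, service, &hints, res);
        if (error == 0)
            return 0;                 /* first suffix with any address wins */
    }
    return error;
}
```

The trade-off is that the application duplicates resolver configuration,
which is why a fix inside getaddrinfo() itself would be preferable.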

By making that subtle change to resolution order, my problems would have been
eliminated. Whether or not making that change in order is 'illegal' in terms
of existing IPv6 APIs, I have no idea.

There is a second issue, though, and one that has bugged me for a long time.
Should the resolver ever look up an A/AAAA record at the root level if there
is an explicit search path which doesn't explicitly include ".", and the
original call to getaddrinfo() or gethostbyname() contained no dots?

The only regularly used A/AAAA records at "root" level are those for
"localhost.", and my personal workaround for this is to create

     localhost.flextech.co.uk IN   CNAME      localhost.

so that a client resolver can always find the localhost. record without
resorting to a "." suffix lookup.

By NOT looking up records in the root domain, my problems would also have
been avoided. The resolv.conf manpages I've seen suggest that the resolver
will attempt a lookup with the root suffix if the original hostname contains
a dot (the ndots option), but make no explicit statement I can find that a
final root suffix lookup is always performed.
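
The documented ndots rule can be written down in a few lines. This is a
sketch of the rule as the manpage describes it, not of glibc's actual code,
and the function names are hypothetical. It says nothing about the final
bare-name lookup observed above, which is exactly the undocumented part.

```c
#include <stddef.h>

/* Count the dots in a host name. */
size_t count_dots(const char *name)
{
    size_t dots = 0;

    for (; *name != '\0'; name++)
        if (*name == '.')
            dots++;
    return dots;
}

/* Nonzero if, under the documented rule, `name` is tried as an absolute
 * name before any search suffix is appended. The default threshold is
 * ndots = 1, i.e. any name containing at least one dot. */
int absolute_first(const char *name, size_t ndots)
{
    return count_dots(name) >= ndots;
}
```

With the default threshold, absolute_first("badhost", 1) is 0, so by the
documentation alone one would not expect an absolute lookup of "badhost" to
come first; whether one should come last is the open question.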

Whilst this problem was found on RedHat 6.2 / glibc 2.1.3-15 /
telnet-0.17-6x.18, I haven't yet been able to determine how widespread the
"flaw" is in the resolver libraries of other OSes, or in bind8's own resolver
library, or how lwresd responds to this situation. I would be most interested
to know, if some kind person(s) would care to run up the appropriate
debugging tools.

Note that the fix below is also slightly kludgy in that getaddrinfo()
supports more than just IPv4 and IPv6: my patch reduces the "Any Family"
lookup to just "IPv4", instead of "Any Family except IPv6".

$ more telnet-0.17-ipv6.patch
diff -uNr netkit-telnet-0.17.orig/telnet/commands.c netkit-telnet-0.17/telnet/commands.c
--- netkit-telnet-0.17.orig/telnet/commands.c   Sun Dec  9 17:56:23 2001
+++ netkit-telnet-0.17/telnet/commands.c        Mon Dec 10 12:31:06 2001
@@ -2254,6 +2254,10 @@
     int error;
     struct sockaddr_in sin;
     unsigned long temp;
+
+    /* Dynamically set flag to determine local ipv6 support */
+    int ipv6support = 0;
+
 #if !defined(__linux__)
     extern char *inet_ntoa();
 #endif
@@ -2328,6 +2332,17 @@
     if (hostp == 0)
        goto usage;

+    /* Dynamically determine IPv6 support */
+    ipv6support = socket(PF_INET6,SOCK_STREAM,0);
+    if ( ipv6support != -1 ) {
+       close(ipv6support);
+       ipv6support = 1;
+    } else {
+       printf("IPv6 support unavailable on this host: %s\r\n",
+                               strerror(errno));
+       ipv6support = 0;
+    }
+
 #if    defined(IP_OPTIONS) && defined(IPPROTO_IP)
     if (hostp[0] == '@' || hostp[0] == '!') {
        if ((hostname = strrchr(hostp, ':')) == NULL)
@@ -2349,7 +2364,11 @@
     {
        hostname = hostp;
        memset(&hints, 0, sizeof(hints));
-       hints.ai_family = PF_UNSPEC;
+       /* Force IPv4 lookup if no IPv6 support available */
+       if ( ipv6support == 1 )
+               hints.ai_family = PF_UNSPEC;
+       else
+               hints.ai_family = PF_INET;
        hints.ai_socktype = SOCK_STREAM;
        hints.ai_flags = AI_CANONNAME;
        if (portp == NULL) {
@@ -2390,7 +2409,11 @@
        if (aliasp) {
            struct addrinfo ahints, *ares;
            memset(&ahints, 0, sizeof(ahints));
-           ahints.ai_family = PF_UNSPEC;
+           /* Force IPv4 lookup if no IPv6 support available */
+           if ( ipv6support == 1 )
+               ahints.ai_family = PF_UNSPEC;
+           else
+               ahints.ai_family = PF_INET;
            ahints.ai_socktype = SOCK_STREAM;
            ahints.ai_flags = AI_PASSIVE;
            error = getaddrinfo(aliasp, "0", &ahints, &ares);
$
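
The probe used in the patch above can also be pulled out into a pair of
small helpers, which keeps the two getaddrinfo() call sites identical. The
helper names are hypothetical and this is a sketch rather than a tested
change to netkit-telnet. Note that the probe only shows that the kernel can
create an IPv6 socket, not that the host has a usable IPv6 address. Where
the C library provides the POSIX AI_ADDRCONFIG flag, that may be a cleaner
way to get a similar effect; whether glibc 2.1.3 supports it has not been
checked here.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>

/* Returns 1 if an IPv6 stream socket can be created on this host, else 0. */
int have_ipv6(void)
{
    int fd = socket(AF_INET6, SOCK_STREAM, 0);

    if (fd == -1)
        return 0;
    close(fd);
    return 1;
}

/* Fill in the hints as the patch does: any family when IPv6 is usable,
 * IPv4 only otherwise. */
void init_hints(struct addrinfo *hints, int ipv6_ok)
{
    memset(hints, 0, sizeof(*hints));
    hints->ai_family = ipv6_ok ? AF_UNSPEC : AF_INET;
    hints->ai_socktype = SOCK_STREAM;
}
```

A caller would write init_hints(&hints, have_ipv6()) and then set ai_flags
as required before calling getaddrinfo().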


Ted Rule,
Flextech Television







More information about the bind-workers mailing list