TCP Receive Error

Cahill, Stephen stephen.cahill at LibertyMutual.com
Thu Aug 3 15:00:38 UTC 2006



You can also read this explanation.

http://www.incidents.org/diary.php?storyid=1538&isc=90f7e0cf745ef88d67ec9d6641feb4e8


named/bind error messages - solved 
Published: 2006-08-02,
Last Updated: 2006-08-02 18:41:18 UTC by Daniel Wesemann (Version: 1) 

ISC readers report a significant increase of "odd" error messages in
their named/bind logs.

server named[18013]: dispatch 0x8face08: shutting down due to TCP
receive error: [IP REMOVED]#53: connection reset.
named[8428]: dispatch 0x81eb2b0: shutting down due to TCP receive error:
<unknown address, family 48830>: connection reset

Update 18:30 UTC:  It looks like we got the solution, or at least parts
of it:

Some DNS servers of "secureserver.net" are apparently broken and
sometimes return incomplete records. Two DNS servers in particular,
64.202.165.202 and 68.178.211.201, are implicated in the majority of the
"TCP receive error" packet traces that we have received.

What happens is that "named" sends a UDP DNS query to one of the broken
servers and receives a truncated UDP response. As the DNS protocol
requires, "named" retries the same query over TCP, which the broken
servers answer with a rude "tcp reset" packet. That reset in turn
causes "named" to write the log line above. This behaviour can be
reproduced with "dig" as shown below:

daniel at debian:$ dig whatever.net @68.178.211.201
;; Truncated, retrying in TCP mode.
;; communications error to 68.178.211.201#53: connection reset
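
The same UDP-then-TCP sequence can be sketched in a few lines of Python.
This is only an illustration of the fallback described above, not how
"named" or "dig" are implemented. The function names, the query ID and
the timeout are made up for the example, and only the standard library
is used. Against one of the broken servers, the TCP step would be
expected to raise ConnectionResetError.

```python
import socket
import struct


def build_query(name, qtype=1, query_id=0x1234):
    """Build a minimal DNS query (class IN, RD set). Hypothetical helper."""
    # Header: ID, flags (RD only), QDCOUNT=1, ANCOUNT=NSCOUNT=ARCOUNT=0
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 0)
    labels = [label for label in name.rstrip(".").split(".") if label]
    qname = b"".join(bytes([len(l)]) + l.encode("ascii") for l in labels) + b"\x00"
    return header + qname + struct.pack("!HH", qtype, 1)


def is_truncated(message):
    """Return True if the TC bit (0x0200) is set in the DNS header flags."""
    (flags,) = struct.unpack("!H", message[2:4])
    return bool(flags & 0x0200)


def _recv_exact(sock, count):
    """Read exactly count bytes from a TCP socket or raise ConnectionError."""
    data = b""
    while len(data) < count:
        chunk = sock.recv(count - len(data))
        if not chunk:
            raise ConnectionError("peer closed before the full message arrived")
        data += chunk
    return data


def query_with_tcp_fallback(name, server, timeout=3.0):
    """Ask over UDP first; if the reply has TC set, repeat the query over TCP."""
    query = build_query(name)
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as udp:
        udp.settimeout(timeout)
        udp.sendto(query, (server, 53))
        response, _ = udp.recvfrom(4096)
    if not is_truncated(response):
        return response
    # DNS over TCP prefixes each message with a two-byte length field.
    with socket.create_connection((server, 53), timeout=timeout) as tcp:
        tcp.sendall(struct.pack("!H", len(query)) + query)
        (length,) = struct.unpack("!H", _recv_exact(tcp, 2))
        return _recv_exact(tcp, length)
```

Calling query_with_tcp_fallback("whatever.net", "68.178.211.201") would
mirror the dig command above. Whether those servers still behave this
way is not something the sketch can tell you.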

Lookups against ISIPP's IADB spam / sender database seem to have ended
up on the broken servers listed above from time to time, causing the
"link" between receiving email and seeing the named log entries as
reported by some readers.

The IP address in the named log does not seem to have anything to do
with the IP that causes the problem. I have no idea where this logged IP
comes from, but seeing that some versions are printing "address unknown"
instead of an IP, I suspect that this error print statement is broken in
several (older?) BIND releases.

A big thank you to all the readers who have volunteered their packet
traces and time to help with this analysis!




-----Original Message-----
From: bind-users-bounce at isc.org [mailto:bind-users-bounce at isc.org] On
Behalf Of Mark Andrews
Sent: Wednesday, August 02, 2006 8:34 PM
To: Barry Margolin
Cc: comp-protocols-dns-bind at isc.org
Subject: Re: TCP Receive Error 


> In article <earbuh$tfn$1 at sf1.isc.org>,  Mark Andrews 
> <Mark_Andrews at isc.org> wrote:
> > > On my caching nameserver I have been seeing these since July 31:
> > > 
> > > Aug  2 09:41:14 tesla named[18013]: dispatch 0x8face08: shutting 
> > > down due to TCP receive error: 68.178.211.201#53: connection reset
> > > 
> > > The IP is always as above or 64.202.165.202
> >
> > 	It's just stupid nameservers not following RFC 1035 that
> > 	accept TCP connections after returning "TC" to the UDP
> > 	request then they reply to the TCP/DNS query with a TCP
> > 	RESET.  If they don't want to answer the query then return
> > 	a refused, or perform a graceful TCP shutdown.  The TCP
> > 	RESET is just plain rude.
> 
> Indeed, that's what it is:
> 
> barmar $ dig . ns @64.202.165.202 +norec
> ;; Truncated, retrying in TCP mode.
> ;; communications error to 64.202.165.202#53: connection reset

	I suspect what is happening is that they are sending a
	REFUSED then immediately closing the socket but have not
	set SO_LINGER.

	named sends the refused then waits for another query and
	eventually closes when the querier closes or the wait timer
	expires.   Most DNS clients close immediately after they
	get the answer.  At this point there is no data in transit
	and a graceful shutdown of the TCP connection occurs.
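
	The pattern described for named (send the reply, then hold the
	connection open until the querier closes or a wait timer expires)
	might look roughly like the Python sketch below. It is not
	named's code. The function name, the return strings and the
	default timeout are invented for illustration, and it ignores
	any further queries a real server would parse.

```python
import socket


def reply_then_wait_for_peer(conn, reply, wait_timeout=5.0):
    """Send the reply, then close only after the peer closes or the timer expires.

    Hypothetical helper: returns a short string saying why the wait ended.
    """
    conn.sendall(reply)
    conn.settimeout(wait_timeout)
    reason = "peer closed"
    try:
        # recv() returns b"" once the peer has closed its side.
        # A real server would parse further queries here instead of discarding them.
        while conn.recv(4096):
            pass
    except socket.timeout:
        reason = "timer expired"
    except ConnectionError:
        reason = "connection error"
    conn.close()
    return reason
```

	Because the server side closes only after the wait ends, the
	reply has normally been read by the querier before the close.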
 
> Any idea why so many sites are suddenly seeing lots of these errors?  
> There are messages on the NANOG mailing list (North American Network 
> Operators Group, which is mostly very knowledgeable network engineers 
> at major ISPs) reporting many of these errors as well.

	I suspect they upgraded their nameservers.
 
> Both of those IPs resolve to ip-<ipaddr>.ip.secureserver.net.  
> They're also near to, but not the same as, the addresses of 
> cns1.secureserver.net and cns2.secureserver.net.  So my guess is that 
> they're servers that are hosting some domains that secureserver.net 
> hosts.  Why would everyone suddenly be querying them so often?
> 
> --
> Barry Margolin, barmar at alum.mit.edu
> Arlington, MA
> *** PLEASE post questions in newsgroups, not directly to me ***
> *** PLEASE don't copy me on replies, I'll read them in the group ***
> 
> 
--
Mark Andrews, ISC
1 Seymour St., Dundas Valley, NSW 2117, Australia
PHONE: +61 2 9871 4742                 INTERNET: Mark_Andrews at isc.org


