statschannel assertion failure

Timothy Holtzen tah at NebrWesleyan.edu
Fri Feb 27 19:31:47 UTC 2009


OK, I think I've included what you're looking for below.  If it's not the
right thing, please let me know how to generate what you're looking for
from the core dump.  I will readily admit this level of debugging isn't
something I'm very familiar with.

If you want to try to reproduce it, I made a little script on the remote
system like this:

foo.sh
#!/bin/sh
# Fetch the statistics channel page 300 times in a row, discarding the output.
for i in $(seq 1 300) ; do
    wget --quiet -O - 10.9.2.18:8085 > /dev/null
done

and then run the command:

./foo.sh & ./foo.sh

I find that you have to run two instances of the wget loop at the same
time in order to get two requests to occur close enough together to
trigger the crash.  Sometimes I have to extend the length of the loop,
but it usually triggers a crash when run for any length of time from a
separate system.  Interestingly, however, if I run the same thing on the
same system BIND is running on, it never triggers the crash.
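
In case it helps anyone vary the test, here is a rough sketch of the same idea with the number of parallel loops and the loop length made adjustable. The names URL, WORKERS, COUNT, FETCH, hammer and run_workers are only illustrative (they are not in my original script), and the defaults simply mirror the two-loop, 300-request run described above.

```shell
#!/bin/sh
# Sketch: run several sequential fetch loops in parallel against the
# statistics channel. All variable and function names here are illustrative.
URL="${URL:-http://10.9.2.18:8085/}"
WORKERS="${WORKERS:-2}"     # how many loops run at the same time
COUNT="${COUNT:-300}"       # requests issued by each loop
FETCH="${FETCH:-wget --quiet -O /dev/null}"

hammer() {
    # One worker: issue COUNT requests back to back, ignoring failures.
    i=0
    while [ "$i" -lt "$COUNT" ]; do
        $FETCH "$URL" || true
        i=$((i + 1))
    done
}

run_workers() {
    # Start WORKERS copies of hammer in the background, then wait for all.
    w=0
    while [ "$w" -lt "$WORKERS" ]; do
        hammer &
        w=$((w + 1))
    done
    wait
}
```

Calling run_workers with the defaults should behave like starting two copies of foo.sh at once; raising WORKERS or COUNT is an easy way to try for a tighter overlap between requests.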


 
=========
Thread 1 (process 12282):
#0  0x0000003930c30155 in raise () from /lib64/libc.so.6
No symbol table info available.
#1  0x0000003930c31bf0 in abort () from /lib64/libc.so.6
No symbol table info available.
#2  0x0000000000411cb5 in assertion_failed (file=0x556e7c
"statschannel.c", line=152,
    type=isc_assertiontype_insist, cond=0x568d53 "0") at ./main.c:161
No locals.
#3  0x000000393e230f98 in __xmlRaiseError () from /usr/lib64/libxml2.so.2
No symbol table info available.
#4  0x000000393e231aaa in __xmlErrEncoding () from /usr/lib64/libxml2.so.2
No symbol table info available.
#5  0x000000393e2336d6 in xmlCurrentChar () from /usr/lib64/libxml2.so.2
No symbol table info available.
#6  0x000000393e243c57 in xmlParseCharDataComplex () from
/usr/lib64/libxml2.so.2
No symbol table info available.
#7  0x000000393e246baf in xmlParseChunk () from /usr/lib64/libxml2.so.2
No symbol table info available.
#8  0x000000393e2f02ea in xmlFreeTextWriter () from /usr/lib64/libxml2.so.2
No symbol table info available.
#9  0x000000393e2595e1 in xmlOutputBufferWrite () from
/usr/lib64/libxml2.so.2
No symbol table info available.
#10 0x000000393e2ef254 in xmlTextWriterEndElement () from
/usr/lib64/libxml2.so.2
No symbol table info available.
#11 0x000000000053f397 in isc_taskmgr_renderxml (mgr=0x2b03dae22058,
writer=0x8f95730)
    at task.c:1352
        task = (isc_task_t *) 0x2b03dae39c48
#12 0x0000000000429a16 in render_index (url=<value optimized out>,
    querystring=<value optimized out>, arg=0x2b03dae2d010,
retcode=0x8f962b0,
    retmsg=0x8f962b8, mimetype=0x8f962a8, b=0x8f962c0, freecb=0x8f962f8,
    freecb_args=0x8f96300) at statschannel.c:745
        msg = <value optimized out>
        msglen = <value optimized out>
#13 0x000000000052c5a9 in isc_httpd_recvdone (task=0x2b03dae4ef88,
ev=0x2b03db4bcc40)
    at httpd.c:688
        r = {base = 0x2b03dae4ef88 "KSAT����X ��\003+", length = 5551396}
        result = <value optimized out>
        httpd = (isc_httpd_t *) 0x8f95e00
        url = (isc_httpdurl_t *) 0x2b03daf15bf0
        now = {seconds = 1235744333, nanoseconds = 318308000}
        datebuf = "Fri, 27 Feb 2009 14:18:53 GMT\000\000"
#14 0x000000000053f6f5 in isc__taskmgr_dispatch () at task.c:862
        manager = (isc_taskmgr_t *) 0x2b03dae22058
#15 0x000000000054209f in evloop () at app.c:358
        when = {seconds = 1235744333, nanoseconds = 784265000}
        tv = {tv_sec = 0, tv_usec = 472793}
        n = 2
        now = {seconds = 1235744333, nanoseconds = 311472000}
        tvp = <value optimized out>
        swait = (isc_socketwait_t *) 0x7b5b5c
        readytasks = <value optimized out>
        result = <value optimized out>
#16 0x000000000054232a in isc_app_run () at app.c:550
        event = (isc_event_t *) 0x0
        next_event = (isc_event_t *) 0x0
        task = (isc_task_t *) 0x0
#17 0x0000000000412434 in main (argc=0, argv=0x7fffcfcc2fe8) at ./main.c:914
        result = <value optimized out>
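
For anyone wanting to produce a comparable trace from their own core file: output of this shape (every thread, with local variables) is what gdb's "thread apply all bt full" prints. Below is a minimal sketch, assuming gdb is installed; named_backtrace is a hypothetical helper name, and the binary and core paths are placeholders you would supply yourself, not paths from this system.

```shell
#!/bin/sh
# Sketch: dump a full backtrace of every thread from a core file.
# named_backtrace is a hypothetical helper; pass your own binary and core.
named_backtrace() {
    binary="$1"   # the executable that crashed (for example, named)
    core="$2"     # the core file it left behind
    if [ ! -r "$binary" ] || [ ! -r "$core" ]; then
        echo "usage: named_backtrace <binary> <core>" >&2
        return 1
    fi
    # -batch makes gdb exit once the commands have run;
    # "bt full" prints local variables for each frame.
    gdb -batch -ex "thread apply all bt full" "$binary" "$core"
}
```

Redirecting the output to a file makes it easier to paste into a message. Frames from libraries without debugging symbols will still show "No symbol table info available."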


JINMEI Tatuya / 神明達哉 wrote:
> At Thu, 26 Feb 2009 07:58:29 -0600,
> Timothy Holtzen <tah at NebrWesleyan.edu> wrote:
>
>   
>> No it is a single processor on both production and test systems. 
>> Production is an Opteron and the test system is an Athlon64 but both are
>> single core processors.  Just to be sure I did a configured with a
>> --disable-threads on the test system and tried again.  Testing still
>> triggers the exception with the same errors.
>>     
>
> Okay, then please try the revised patch.  This will make named abort
> itself in the context of the libxml2 error, so please then get the
> stack trace of the core dump and show it.
>
> BTW, I tried to reproduce the problem by mostly concurrent access
> like:
>   wget http://127.0.0.1:5300/ &; wget http://127.0.0.1:5300/
> but couldn't see the crash.
>
> Also, since this happened even --disable-threads, it's very unlikely
> to be a kind of race condition.  I have no idea how the concurrent
> access relates to the problem at this moment.
>
> ---
> JINMEI, Tatuya
> Internet Systems Consortium, Inc.
>
> Index: statschannel.c
> ===================================================================
> RCS file: /proj/cvs/prod/bind9/bin/named/statschannel.c,v
> retrieving revision 1.2.2.13.2.1
> diff -u -r1.2.2.13.2.1 statschannel.c
> --- statschannel.c	18 Dec 2008 02:39:12 -0000	1.2.2.13.2.1
> +++ statschannel.c	26 Feb 2009 17:43:50 -0000
> @@ -109,10 +109,60 @@
>  #endif
>  }
>  
> +#ifdef HAVE_LIBXML2
> +static void
> +error_libxml2(void *ctx, xmlErrorPtr error) {
> +	xmlParserCtxtPtr pctx;
> +	ns_server_t *server = ctx;
> +	char *msg, *cp;
> +
> +	REQUIRE(server != NULL);
> +	REQUIRE(error != NULL);
> +
> +	/*
> +	 * Save the error code, if available, so that it can be used in the main
> +	 * code.  No lock is necessary here.
> +	 */
> +	pctx = error->ctxt;
> +	if (pctx != NULL && pctx->myDoc != NULL &&
> +	    pctx->myDoc->_private != NULL) {
> +		*(int *)pctx->myDoc->_private = error->code;
> +	}
> +
> +	/*
> +	 * Log the error message.  Since some libxml2 error messages are
> +	 * terminated with a CR, we make a local copy to remove it.  This is
> +	 * expensive, but should be okay as we don't expect to see libxml2
> +	 * errors so often.
> +	 */
> +	if (error->message != NULL) {
> +		msg = isc_mem_strdup(server->mctx, error->message);
> +		if (msg == NULL)
> +			return;
> +		cp = strchr(msg, '\n');
> +		if (cp != NULL)
> +			*cp = '\0';
> +
> +		isc_log_write(ns_g_lctx, NS_LOGCATEGORY_GENERAL,
> +			      NS_LOGMODULE_SERVER, ISC_LOG_INFO,
> +			      "libxml2 Error: %s", msg);
> +
> +		isc_mem_free(server->mctx, msg);
> +
> +		INSIST(0);
> +	}
> +}
> +#endif
> +
>  static void
>  init_desc(void) {
>  	int i;
>  
> +#ifdef HAVE_LIBXML2
> +	xmlSetStructuredErrorFunc(ns_g_server, error_libxml2);
> +	xmlInitParser();
> +#endif
> +
>  	/* Initialize name server statistics */
>  	memset(nsstats_desc, 0,
>  	       dns_nsstatscounter_max * sizeof(nsstats_desc[0]));
>   

-- 
Timothy A. Holtzen
Campus Network Administrator
Nebraska Wesleyan University



