Memory corruption after AXFER
Cody.Gibson at intermec.com
Thu Feb 3 23:11:58 UTC 2000
I found the source of the problem. It turned out to be a subtle difference in
the way that readv() and writev() behave under OS/2. The BIND functions
readable() and writable() [contained in ev_streams.c] ASSUME that the iov
array is left intact, exactly as it was passed in to readv() and writev().
They then call consume() to advance the pointers contained in the iov array
to the next available byte, based on how many bytes were just
read/written.
The problem is that under OS/2, the iov array is ALREADY updated by the
readv() and writev() functions before they return to the caller,
duplicating what consume() was attempting to accomplish. So consume()
advanced the pointers a second time, and the next read/write through those
pointers corrupted memory.
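To make the double advance concrete, here is a minimal sketch. It is not the BIND code: advance_iov() is a hypothetical, simplified stand-in for consume(), and the first call merely simulates what an OS/2-style writev() would do to the array before returning.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical stand-in for consume(): move the leading iovec(s)
 * forward by `bytes`, shrinking their lengths to match. */
static void advance_iov(struct iovec *iov, int iocnt, size_t bytes)
{
    int cur = 0;

    while (bytes > 0 && cur < iocnt) {
        if (bytes < iov[cur].iov_len) {
            /* Partial use of this entry: bump pointer, shrink length. */
            iov[cur].iov_base = (char *)iov[cur].iov_base + bytes;
            iov[cur].iov_len -= bytes;
            bytes = 0;
        } else {
            /* Entry fully used: empty it and move to the next one. */
            bytes -= iov[cur].iov_len;
            iov[cur].iov_len = 0;
            cur++;
        }
    }
}

/* Account for one 4-byte transfer twice and report where the
 * pointer ends up relative to the start of the buffer. */
static size_t double_advance_offset(void)
{
    static char buf[16];
    struct iovec iov = { .iov_base = buf, .iov_len = sizeof buf };

    advance_iov(&iov, 1, 4);   /* simulated: the stack updates iov itself */
    advance_iov(&iov, 1, 4);   /* the caller then "consumes" again */

    assert(iov.iov_len == 8);  /* only 4 bytes moved, but 8 are gone */
    return (size_t)((char *)iov.iov_base - buf);
}
```

After a single 4-byte transfer the pointer sits 8 bytes into the buffer instead of 4. Repeat that over a stream and the pointer eventually runs past the end of the allocation.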
I cannot find any clear documentation for what the TCP stack may/may not do
with the iov array during a readv() or writev() call. It therefore appears to
be up to the specific implementation of the TCP stack whether or not the
array is changed by the call, so other operating systems may have a similar
problem.
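One way to find out how a given platform behaves is to probe it. The sketch below assumes a POSIX-like environment with pipe(), which may not match what the OS/2 toolkit offers, and the function name writev_modifies_iov() is made up for illustration. For what it is worth, POSIX declares the iov argument of readv() and writev() as const struct iovec *, so a conforming implementation should leave the array alone.

```c
#include <assert.h>      /* for callers that want to assert on the result */
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Hypothetical probe: returns 1 if writev() changed the caller's
 * iovec, 0 if it left it intact, -1 if the probe itself failed. */
static int writev_modifies_iov(void)
{
    int fds[2];
    char payload[] = "probe";
    struct iovec iov = { .iov_base = payload, .iov_len = sizeof payload - 1 };
    const struct iovec before = iov;   /* snapshot to compare against */
    ssize_t n;

    if (pipe(fds) != 0)
        return -1;

    n = writev(fds[1], &iov, 1);
    close(fds[0]);
    close(fds[1]);

    if (n < 0)
        return -1;
    return iov.iov_base != before.iov_base || iov.iov_len != before.iov_len;
}
```

A result of 1 would suggest the platform needs something like DONT_NEED_CONSUME. A result of 0 only covers writev() on a pipe; readv() and real sockets would need the same kind of check.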
This is what I did to fix it on my system:
ev_streams.c (Feb 03 2000 13:47:00)
x:ev_streams.c (Feb 03 2000 13:28:38)
===================
39 43 |
40 44 |static int copyvec(evStream *str, const struct iovec *iov, int iocnt);
+ 45 |#ifndef DONT_NEED_CONSUME
41 46 |static void consume(evStream *str, size_t bytes);
+ 47 |#endif
42 48 |static void done(evContext opaqueCtx, evStream *str);
43 49 |static void writable(evContext opaqueCtx, void *uap, int fd, int evmask);
===================
215 221 |}
216 222 |
+ 223 |#ifndef DONT_NEED_CONSUME
217 224 |/* Pull off or truncate lead iovec(s). */
218 225 |static void
===================
233 240 | }
234 241 |}
+ 242 |#endif
235 243 |
236 244 |/* Add a stream to Done list and deselect the FD. */
===================
262 270 | if ((str->flags & EV_STR_TIMEROK) != 0)
263 271 | evTouchIdleTimer(opaqueCtx, str->timer);
+ 272 |#ifndef DONT_NEED_CONSUME
264 273 | consume(str, bytes);
+ 274 |#else
+ 275 | str->ioDone += bytes;
+ 276 |#endif
265 277 | } else {
266 278 | if (bytes < 0 && errno != EINTR) {
===================
283 295 | if ((str->flags & EV_STR_TIMEROK) != 0)
284 296 | evTouchIdleTimer(opaqueCtx, str->timer);
+ 297 |#ifndef DONT_NEED_CONSUME
285 298 | consume(str, bytes);
+ 299 |#else
+ 300 | str->ioDone += bytes;
+ 301 |#endif
286 302 | } else {
287 303 | if (bytes == 0)
===================
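An alternative to the compile-time switch in the diff above would be to never hand the real array to the stack at all: pass a scratch copy and keep doing the bookkeeping in one place. This is only a sketch of that idea, not something I have tried in BIND. writev_preserving() and MAX_SCRATCH_IOV are hypothetical names, and the fixed cap of 16 entries is an assumption made for brevity.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

#define MAX_SCRATCH_IOV 16   /* assumed cap for this sketch */

/* Hypothetical wrapper: call writev() on a private copy of the
 * array, so the caller's iovecs stay intact whatever the stack does. */
static ssize_t writev_preserving(int fd, const struct iovec *iov, int iocnt)
{
    struct iovec scratch[MAX_SCRATCH_IOV];

    if (iocnt < 0 || iocnt > MAX_SCRATCH_IOV) {
        errno = EINVAL;
        return -1;
    }
    memcpy(scratch, iov, (size_t)iocnt * sizeof *iov);
    return writev(fd, scratch, iocnt);
}

/* Usage check: write 5 bytes into a pipe and confirm that the
 * caller's iovec is untouched afterwards. Returns 1 on success. */
static int demo_iov_untouched(void)
{
    int fds[2];
    char hello[] = "hello";
    struct iovec iov = { .iov_base = hello, .iov_len = 5 };
    ssize_t n;
    int rc = pipe(fds);

    assert(rc == 0);
    n = writev_preserving(fds[1], &iov, 1);
    close(fds[0]);
    close(fds[1]);

    return n == 5 && iov.iov_base == (void *)hello && iov.iov_len == 5;
}
```

With a wrapper like this, consume() could stay enabled on every platform, at the cost of one small copy per call.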
-----Original Message-----
From: Cody.Gibson at intermec.com [mailto:Cody.Gibson at intermec.com]
Sent: Wednesday, February 02, 2000 1:15 PM
To: bind-workers at isc.org
Subject: Memory corruption after AXFER
I am trying to track down what appears to be some sort of memory corruption
by AXFER in my OS/2 port of BIND 8.2. I would like to hear from anyone who
can reproduce this, or knows how to fix it. If I do the following:
>nslookup
>ls -d <any active primary domain name here... I'm using jon.intermec.com>
<results displayed here>
>/exit
>ndc reconfig
I get an access violation inside of __memget_record() where it's dealing
with the freelists[] array (line 292) because a "next" pointer is invalid.
The "ls -d" works fine by itself, even when repeated. "ndc reconfig" also
works fine (even if repeated many times) if NOT preceded by an
AXFER. It's the combination of the two that is deadly.
It would be very useful to know if I am dealing with an OS/2 port specific
problem, or a problem that exists in the common code base. Thx for any help
you can provide.
Cody Gibson