dhcpd-4.2.2 maxsocks problem

Jan Markus markus.jan at seznam.cz
Fri Jan 13 19:33:18 UTC 2012


Well, I've found one more interesting thing...

In RELNOTES, there is a note in section "Changes since 4.2.0":

- Disable the use of kqueue in the ISC library.  This avoids a problem
   between the fork and socket code that caused the dhcpd process to
   use all available cpu if the program daemonized itself.
   [ISC-Bugs #21911]

So, when I make this change in "dhcp-4.2.3-P2/bind/Makefile":

- @(cd ${bindsrcdir} && ./configure --disable-kqueue ...
+ @(cd ${bindsrcdir} && ./configure --enable-kqueue ...

the dhcp daemon really does accept the config, but it also eats 99% of the CPU. When I run the daemon
with the -f switch and put it in the background myself, like this:

./dhcpd -f .... &

it runs just "normal". I think I can live with this for now.
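
For anyone who wants to repeat the workaround, it could be wrapped roughly like this. It is only a sketch: the function name start_in_background and the pid file and log file arguments are my own placeholders, not anything dhcpd provides.

```sh
#!/bin/sh
# Sketch: keep a daemon in the foreground (e.g. "dhcpd -f ...") and let the
# shell put it in the background, so the program never forks itself.
# Usage: start_in_background PIDFILE LOGFILE COMMAND [ARGS...]
start_in_background() {
    pidfile=$1
    logfile=$2
    shift 2
    # nohup keeps the process alive after the invoking shell exits.
    nohup "$@" >>"$logfile" 2>&1 &
    # $! is the PID of the job that was just started in the background.
    echo "$!" >"$pidfile"
}
```

It would be called with the same arguments as before, with the two placeholder paths in front, for example: start_in_background /tmp/dhcpd.bg.pid /tmp/dhcpd.bg.log ./dhcpd -f -cf /tmp/dhcpd.conf -lf /tmp/dhcpd.leases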

Maybe it's a clue. And maybe it's not...

-Jan




On 01/13/2012 07:41 PM, Jeff Waller wrote:
> OK, 1024.  Pretty much what we expected. That means, I think, that USE_SELECT is being defined.
>
> Somehow this is being set, and really it shouldn't be.  Judging from the code, to get
> around this problem we have to compile this program with USE_KQUEUE
> (if it's actually supported!), but this is BSD, so it should be supported.
>
>
> The reason we can't simply change the number is that FD_SETSIZE can't really
> be grown beyond 1024, because the data structures aren't large enough to accept it.
>
>
> Here's the relevant part of socket.c:
> /*%
>   * Choose the most preferable multiplex method.
>   */
> #ifdef ISC_PLATFORM_HAVEKQUEUE
> #define USE_KQUEUE
> #elif defined (ISC_PLATFORM_HAVEEPOLL)
> #define USE_EPOLL
> #elif defined (ISC_PLATFORM_HAVEDEVPOLL)
> #define USE_DEVPOLL
> typedef struct {
>          unsigned int want_read : 1,
>                  want_write : 1;
> } pollinfo_t;
> #else
> #define USE_SELECT
> #endif  /* ISC_PLATFORM_HAVEKQUEUE */
>
>
>
> I did find this even in my distribution:
>
> ISC_PLATFORM_HAVEKQUEUE='#undef ISC_PLATFORM_HAVEKQUEUE'
> So maybe this is being forced to "no"? Or maybe it's not being set properly.
> Not sure, but this is what needs to be fixed.
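
One way to check what configure actually decided could look like the sketch below. The function name multiplex_method is made up, and I am assuming the macros end up in a generated header such as bind's lib/isc/include/isc/platform.h; adjust the path to wherever your build puts them. The order of the checks mirrors the #ifdef chain quoted above.

```sh
#!/bin/sh
# Sketch: report which multiplex method socket.c would pick, based on the
# ISC_PLATFORM_HAVE* macros defined in a generated platform header.
# Usage: multiplex_method HEADER
multiplex_method() {
    header=$1
    for pair in HAVEKQUEUE:kqueue HAVEEPOLL:epoll HAVEDEVPOLL:devpoll; do
        macro=ISC_PLATFORM_${pair%%:*}
        # Only a real "#define MACRO" counts; "#undef MACRO" does not.
        if grep -Eq "^[[:space:]]*#[[:space:]]*define[[:space:]]+${macro}([[:space:]]|\$)" "$header"; then
            echo "${pair##*:}"
            return 0
        fi
    done
    # None of the macros is defined, so the #else branch (USE_SELECT) applies.
    echo select
}
```

If this prints "select" on a BSD box, the build ended up with USE_SELECT and therefore with the FD_SETSIZE limit.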
>
> -Jeff
>
>
>
>
>
> On Jan 13, 2012, at 12:39 PM, Jan Markus wrote:
>
>> Hello,
>>
>> I'm sorry for the late reply, I've had a very busy day.
>>
>> Anyway, here's my first gdb adventure! ;)
>>
>>
>> # wget ftp://ftp.isc.org/isc/dhcp/4.2.3-P2/dhcp-4.2.3-P2.tar.gz
>> # tar -xzf dhcp-4.2.3-P2.tar.gz
>> # cd dhcp-4.2.3-P2
>> # ./configure --enable-debug
>> # make
>> # cd server
>>
>>
>> # ./dhcpd -cf /tmp/dhcpd.conf -lf /tmp/dhcpd.leases -pf /tmp/dhcpd.pid
>> [...]
>> ../../../../lib/isc/unix/socket.c:959: INSIST(fd >= 0 && fd < (int)manager->maxsocks) failed, back trace
>> #0 0x81cd6f1 in ??
>> #1 0x81cd639 in ??
>> #2 0x81f1972 in ??
>> #3 0x81f1bfb in ??
>> #4 0x81f5143 in ??
>> #5 0x81e5bd8 in ??
>> #6 0x80d44f3 in ??
>> #7 0x809f819 in ??
>> #8 0x804b5fb in ??
>> #9 0x804a617 in ??
>> #10 0x804a588 in ??
>> #11 0x7 in ??
>> Abort trap: 6 (core dumped)
>>
>>
>>
>> # gdb dhcpd dhcpd.core
>> [...]
>> Core was generated by `dhcpd'.
>> Program terminated with signal 6, Aborted.
>> Reading symbols from /lib/libc.so.7...done.
>> Loaded symbols for /lib/libc.so.7
>> Reading symbols from /libexec/ld-elf.so.1...done.
>> Loaded symbols for /libexec/ld-elf.so.1
>> #0  0x2836ac77 in kill () from /lib/libc.so.7
>>
>>
>>
>> (gdb) bt
>> #0  0x2836ac77 in kill () from /lib/libc.so.7
>> #1  0x2836abd6 in raise () from /lib/libc.so.7
>> #2  0x283697aa in abort () from /lib/libc.so.7
>> #3  0x081cd63e in isc_assertion_failed (file=Could not find the frame base for "isc_assertion_failed".
>> ) at ../../../lib/isc/assertions.c:58
>> #4  0x081f1972 in wakeup_socket (manager=0x28458000, fd=1024, msg=-3) at
>> ../../../../lib/isc/unix/socket.c:959
>> #5  0x081f1bfb in select_poke (manager=0x28458000, fd=1024, msg=-3) at
>> ../../../../lib/isc/unix/socket.c:1092
>> #6  0x081f5143 in isc__socket_fdwatchcreate (manager0=0x28458000, fd=1024, flags=1,
>> callback=0x80d4280<omapi_iscsock_cb>, cbarg=0x28d47e80,
>>      task=0x2845b000, socketp=0x28d47eac) at ../../../../lib/isc/unix/socket.c:2699
>> #7  0x081e5bd8 in isc_socket_fdwatchcreate (manager=0x28458000, fd=1024, flags=1, callback=0x80d4280
>> <omapi_iscsock_cb>, cbarg=0x28d47e80,
>>      task=0x2845b000, socketp=0x28d47eac) at ../../../lib/isc/socket_api.c:205
>> #8  0x080d44f3 in omapi_register_io_object (h=0x28403480, readfd=0x809f960<if_readsocket>,
>> writefd=0, reader=0x809fad0<got_one>, writer=0,
>>      reaper=0) at dispatch.c:259
>> #9  0x0809f819 in discover_interfaces (state=1) at discover.c:1289
>> #10 0x0804b5fb in main (argc=7, argv=0xbfbfec0c) at dhcpd.c:709
>>
>>
>>
>> (gdb) break assertions.c:58
>> Breakpoint 1 at 0x81cd639: file ../../../lib/isc/assertions.c, line 58.
>>
>>
>>
>> (gdb) run -cf /tmp/dhcpd.conf -lf /tmp/dhcpd.leases -pf /tmp/dhcpd.pid
>> [...]
>> ../../../../lib/isc/unix/socket.c:959: INSIST(fd >= 0 && fd < (int)manager->maxsocks) failed, back trace
>> #0 0x81cd6f1 in ??
>> #1 0x81cd639 in ??
>> #2 0x81f1972 in ??
>> #3 0x81f1bfb in ??
>> #4 0x81f5143 in ??
>> #5 0x81e5bd8 in ??
>> #6 0x80d44f3 in ??
>> #7 0x809f819 in ??
>> #8 0x804b5fb in ??
>> #9 0x804a617 in ??
>> #10 0x804a588 in ??
>> #11 0x7 in ??
>>
>> Breakpoint 1, isc_assertion_failed (file=0x8236e38 "../../../../lib/isc/unix/socket.c", line=959,
>> type=isc_assertiontype_insist,
>>      cond=0x8236ebc "fd >= 0 && fd < (int)manager->maxsocks") at ../../../lib/isc/assertions.c:58
>> 58      ../../../lib/isc/assertions.c: No such file or directory.
>>          in ../../../lib/isc/assertions.c
>>
>>
>>
>> (gdb) where
>> #0  isc_assertion_failed (file=0x8236e38 "../../../../lib/isc/unix/socket.c", line=959,
>> type=isc_assertiontype_insist,
>>      cond=0x8236ebc "fd >= 0 && fd < (int)manager->maxsocks") at ../../../lib/isc/assertions.c:58
>> #1  0x081f1972 in wakeup_socket (manager=0x28458000, fd=1024, msg=-3) at
>> ../../../../lib/isc/unix/socket.c:959
>> #2  0x081f1bfb in select_poke (manager=0x28458000, fd=1024, msg=-3) at
>> ../../../../lib/isc/unix/socket.c:1092
>> #3  0x081f5143 in isc__socket_fdwatchcreate (manager0=0x28458000, fd=1024, flags=1,
>> callback=0x80d4280<omapi_iscsock_cb>, cbarg=0x28d47d40,
>>      task=0x2845b000, socketp=0x28d47d6c) at ../../../../lib/isc/unix/socket.c:2699
>> #4  0x081e5bd8 in isc_socket_fdwatchcreate (manager=0x28458000, fd=1024, flags=1, callback=0x80d4280
>> <omapi_iscsock_cb>, cbarg=0x28d47d40,
>>      task=0x2845b000, socketp=0x28d47d6c) at ../../../lib/isc/socket_api.c:205
>> #5  0x080d44f3 in omapi_register_io_object (h=0x28403840, readfd=0x809f960<if_readsocket>,
>> writefd=0, reader=0x809fad0<got_one>, writer=0,
>>      reaper=0) at dispatch.c:259
>> #6  0x0809f819 in discover_interfaces (state=1) at discover.c:1289
>> #7  0x0804b5fb in main (argc=7, argv=0xbfbfec04) at dhcpd.c:709
>>
>>
>>
>> (gdb) up
>> #1  0x081f1972 in wakeup_socket (manager=0x28458000, fd=1024, msg=-3) at
>> ../../../../lib/isc/unix/socket.c:959
>> 959     ../../../../lib/isc/unix/socket.c: No such file or directory.
>>          in ../../../../lib/isc/unix/socket.c
>>
>>
>>
>>
>> (gdb) print fd
>> $1 = 1024
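
The bound that the failed INSIST enforces can be sketched like this. The function name check_fd is invented for illustration, and the value 1024 for maxsocks is an assumption (FD_SETSIZE, which is what a select() build would use); 4096 is the limit from the ISC_SOCKET_MAXSOCKETS excerpt quoted further down.

```sh
#!/bin/sh
# Sketch of the condition checked in socket.c:
#   fd >= 0 && fd < maxsocks
# Prints "ok" when the descriptor is acceptable, "INSIST fails" otherwise.
# Usage: check_fd FD MAXSOCKS
check_fd() {
    fd=$1
    maxsocks=$2
    if [ "$fd" -ge 0 ] && [ "$fd" -lt "$maxsocks" ]; then
        echo ok
    else
        echo "INSIST fails"
    fi
}

check_fd 1023 1024   # last descriptor that still fits a 1024-entry table
check_fd 1024 1024   # the value gdb printed, with the assumed select() limit
check_fd 1024 4096   # same descriptor with the kqueue/epoll/devpoll limit
```

With roughly one descriptor per interface, about 1050 interfaces are enough to push a descriptor number to 1024, which is the first value the select() limit rejects.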
>>
>>
>> Best regards and thank you for your help so far,
>> -Jan
>>
>>
>>
>> On 01/12/2012 07:53 PM, Jeff Waller wrote:
>>> Hmm,
>>>
>>> Looks like the variable values are missing...
>>>
>>> could you do
>>> up
>>> up
>>> up
>>> list
>>>
>>> I have a feeling that won't work because you're missing the source code.
>>>
>>>
>>> So, I think you're going to need to get the source and compile it. There is an option to configure that
>>> allows you to compile dhcpd for debugging.  Do that; I think it would be
>>>
>>> get source
>>> unpack source
>>> read install instructions but this will be something like
>>>
>>> configure --enable-debug
>>> make
>>>
>>>
>>> You don't even need to install this yet, because you'll debug it from the source directory.
>>>
>>> I do not know the arguments that need to be used to start the dhcp server on your particular system, but
>>> you'll need to know these
>>>
>>> here's an example, but, again, you'll need to know how yours is different
>>>
>>> dhcpd -lf /usr/local/dhcpd/dhcpd.leases -cf /usr/local/dhcpd/dhcpd.conf eth0
>>>
>>> Once you know this, then you can use that same information using gdb.
>>>
>>> In the source directory for dhcp, when the dhcp server is compiled, it is placed in the subdirectory server
>>>
>>> so cd to that and then do
>>>
>>> gdb dhcpd
>>>
>>> at the gdb prompt type
>>>
>>> break assertions.c:58 (because that's where it died in the stack trace before).  However, if you get the
>>> latest source, version 4.2.3 may have that particular statement on a different line than 4.2.2,
>>> so break assertions.c:58 may be wrong and you'll need another number. Comparing the source
>>> code and checking which line the statement is on in each version will suffice.
>>>
>>>
>>> This will catch the server and stop it just before it kills the process with abort.
>>>
>>>
>>> then type
>>>
>>> run <but with the arguments to dhcpd you found out before>, so in my example above (but again yours
>>> may be different)
>>>
>>> run -lf /usr/local/dhcpd/dhcpd.leases -cf /usr/local/dhcpd/dhcpd.conf eth0
>>>
>>> Then the server will attempt to start up, hit the error you found, and reach the point where it
>>> was just about to kill itself. gdb stops it first, and at that point you can type
>>>
>>> where
>>> up
>>> print fd
>>>
>>> Then tell us what fd is.
>>>
>>> On Jan 12, 2012, at 12:54 PM, Jan Markus wrote:
>>>
>>>> On 01/12/2012 04:38 PM, Jeff Waller wrote:
>>>>> Without looking into the code too much, I found the following comment in socket.c (see below). I think it
>>>>> means that yes, you can change the kernel to allow more file descriptors, but you will run into problems
>>>>> trying to go beyond either 4096 or FD_SETSIZE, whichever applies depending on how
>>>>> the server was compiled.   It appears you may be able to change that number and recompile, but
>>>>> be wary.
>>>>>
>>>>> Here's what FD_SETSIZE is set to on Linux:
>>>>>
>>>>> ./linux/posix_types.h:#define __FD_SETSIZE  1024
>>>>>
>>>>>
>>>>>
>>>>> A couple of things to try.
>>>>>
>>>>> 1)  First off, what is that number?  Is it 1024 or is it 4096?  Use gdb to find out where this is hitting the error.
>>>>> It may be as simple as gdb dhcpd core (modify to the actual filenames if they are not all in the same
>>>>> directory).
>>>>
>>>> Oh, one more thing, in case it is of any help:
>>>>
>>>>
>>>> #0  0x00000008009d1fcc in kill () from /lib/libc.so.7
>>>> (gdb) bt
>>>> #0  0x00000008009d1fcc in kill () from /lib/libc.so.7
>>>> #1  0x00000008009d0dcb in abort () from /lib/libc.so.7
>>>> #2  0x000000000052b71f in isc_assertion_failed (file=Variable "file" is not available.
>>>> ) at ../../../lib/isc/assertions.c:58
>>>> #3  0x000000000054872b in select_poke (manager=0x157cc, fd=Variable "fd" is not available.
>>>> ) at ../../../../lib/isc/unix/socket.c:959
>>>> #4  0x000000000054a013 in isc__socket_fdwatchcreate (manager0=0x800c69000, fd=1024, flags=1,
>>>>      callback=0x46e930<omapi_iscsock_cb>, cbarg=0x801908460, task=0x800c6e000, socketp=0x8019084b0)
>>>>      at ../../../../lib/isc/unix/socket.c:2699
>>>> #5  0x0000000000470340 in omapi_register_io_object (h=0x8011d0f00, readfd=0x444860<if_readsocket>,
>>>> writefd=0,
>>>>      reader=0x4452c0<got_one>, writer=0, reaper=0) at dispatch.c:259
>>>> #6  0x0000000000445e9b in discover_interfaces (state=1) at discover.c:1289
>>>> #7  0x000000000040de69 in main (argc=Variable "argc" is not available.
>>>> ) at dhcpd.c:709
>>>> (gdb)
>>>>
>>>> -Jan
>>>>
>>>>
>>>>>
>>>>> 2)  Get 4.2.3  (4.2.2 is 6 months old).  I would have thought that you could simply download the
>>>>> source and compile it; doesn't it compile on BSD?  If you're going to modify it, you're going to need
>>>>> the source anyway.  And figure out a way to not use select(). There is definitely support for
>>>>> this in configure, and it looks like kqueue is on by default, but make sure.  BTW this is all BIND stuff, which
>>>>> dhcp depends on.
>>>>>
>>>>> ./configure --help
>>>>>
>>>>> ....
>>>>>
>>>>>    --enable-kqueue         use BSD kqueue when available [default=yes]
>>>>>    --enable-epoll          use Linux epoll when available [default=auto]
>>>>>    --enable-devpoll        use /dev/poll when available [default=yes]
>>>>>
>>>>> ...
>>>>>
>>>>>
>>>>> ==================================================================================
>>>>> /*%
>>>>>   * Maximum number of allowable open sockets.  This is also the maximum
>>>>>   * allowable socket file descriptor.
>>>>>   *
>>>>>   * Care should be taken before modifying this value for select():
>>>>>   * The API standard doesn't ensure select() accept more than (the system default
>>>>>   * of) FD_SETSIZE descriptors, and the default size should in fact be fine in
>>>>>   * the vast majority of cases.  This constant should therefore be increased only
>>>>>   * when absolutely necessary and possible, i.e., the server is exhausting all
>>>>>   * available file descriptors (up to FD_SETSIZE) and the select() function
>>>>>   * and FD_xxx macros support larger values than FD_SETSIZE (which may not
>>>>>   * always by true, but we keep using some of them to ensure as much
>>>>>   * portability as possible).  Note also that overall server performance
>>>>>   * may be rather worsened with a larger value of this constant due to
>>>>>   * inherent scalability problems of select().
>>>>>   *
>>>>>   * As a special note, this value shouldn't have to be touched if
>>>>>   * this is a build for an authoritative only DNS server.
>>>>>   */
>>>>> #ifndef ISC_SOCKET_MAXSOCKETS
>>>>> #if defined(USE_KQUEUE) || defined(USE_EPOLL) || defined(USE_DEVPOLL)
>>>>> #define ISC_SOCKET_MAXSOCKETS 4096
>>>>> #elif defined(USE_SELECT)
>>>>> #define ISC_SOCKET_MAXSOCKETS FD_SETSIZE
>>>>> #endif  /* USE_KQUEUE... */
>>>>> #endif  /* ISC_SOCKET_MAXSOCKETS */
>>>>> ==================================================================================
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> On Jan 12, 2012, at 8:51 AM, Jan Markus wrote:
>>>>>
>>>>>> Hello,
>>>>>>
>>>>>> we have isc-dhcpd 4.2.2 from ports on FreeBSD 8.2. We have about 1050 VLAN interfaces:
>>>>>>
>>>>>>
>>>>>> ifconfig vlan1234 create inet 10.0.0.1/29 vlan 1234 vlandev igb1
>>>>>>
>>>>>>
>>>>>> and the same number of dhcpd declarations like this:
>>>>>>
>>>>>> shared-network vlan1234 {
>>>>>>      subnet 10.0.0.0 netmask 255.255.255.248 {
>>>>>>          range 10.0.0.2 10.0.0.5;
>>>>>>          option routers 10.0.0.1;
>>>>>>      }
>>>>>> }
>>>>>>
>>>>>> But our DHCP server refuses to start, saying:
>>>>>>
>>>>>> ../../../../lib/isc/unix/socket.c:958: INSIST(fd >= 0 && fd < (int)manager->maxsocks) failed, back trace
>>>>>> #0 0x52a5ca in ??
>>>>>> #1 0x52a77a in ??
>>>>>> #2 0x54768b in ??
>>>>>> #3 0x548f63 in ??
>>>>>> #4 0x46fc80 in ??
>>>>>> #5 0x445cfb in ??
>>>>>> #6 0x40de69 in ??
>>>>>> #7 0x40c38e in ??
>>>>>> #8 0x8006bc000 in ??
>>>>>> Abort trap (core dumped)
>>>>>> /usr/local/etc/rc.d/isc-dhcpd: WARNING: failed to start dhcpd
>>>>>>
>>>>>> My kernel socket limits are:
>>>>>>
>>>>>> # sysctl -a | grep soc
>>>>>> kern.ipc.maxsockbuf: 16777216
>>>>>> kern.ipc.maxsockets: 204800
>>>>>> kern.ipc.numopensockets: 17
>>>>>> net.inet.ip.mcast.maxsocksrc: 128
>>>>>>
>>>>>> Please, what should I do?
>>>>>>
>>>>>> Thank you very much for your time.
>>>>>> -Jan
>>>>>> _______________________________________________
>>>>>> dhcp-users mailing list
>>>>>> dhcp-users at lists.isc.org
>>>>>> https://lists.isc.org/mailman/listinfo/dhcp-users