BIND 10 #642: SIGHUP and other signals cause boss to leave BIND 10 processes lying around

BIND 10 Development do-not-reply at isc.org
Fri Apr 8 13:10:42 UTC 2011


#642: SIGHUP and other signals cause boss to leave BIND 10 processes lying around
-----------------------------------------+-------------------------------------
                 Reporter:  shane        |                Owner:  shane
                     Type:  defect       |               Status:  reviewing
                 Priority:  minor        |            Milestone:  Sprint-20110419
                Component:  Boss of BIND |           Resolution:
                 Keywords:               |            Sensitive:  0
Estimated Number of Hours:  0.0          |  Add Hours to Ticket:  0
                Billable?:  1            |          Total Hours:  0
                Internal?:  0            |
-----------------------------------------+-------------------------------------

Comment (by shane):

 I sent a mail to the bind10-dev list about this:

 {{{
 From: Shane Kerr <shane at isc.org>
 To: bind10-dev <bind10-dev at lists.isc.org>
 Date: Fri, 18 Mar 2011 14:30:01 +0100
 Subject: [bind10-dev] Handling Disappearing Terminals

 All,

 We need to think about what happens to the server when the terminal it
 is running in disappears.

 History
 -------
 (Skip if you are impatient for the good stuff.)

 At the end of last month, Jeremy sent a mail about his problems setting
 up a forwarding resolver:

 https://lists.isc.org/pipermail/bind10-dev/2011-February/002038.html

 He reported this:

         I know why my bind10 was killed; it doesn't daemonize so when I
         closed terminal it was running in, it was killed -- but
         sometimes children didn't get killed. HUP or whatever signal was
         not trapped or passed to children?

 This led me to make a ticket so that we handle SIGHUP and other signals
 that might kill the boss process:

 http://bind10.isc.org/ticket/642

 However, Michal noted that this didn't seem to do anything at all when
 he started a process in the background and the terminal was closed. So I
 had a look and discovered that the behavior for processes varies quite a
 bit depending on the exact details of how the controlling terminal goes
 away.


 Details of Terminal Closing
 ---------------------------
 I looked at what happens to a process under 3 ways of being started:

 1. Running the program
 2. Using "su" and then running the program
 3. Using "sudo" to run the program

 My theory was that each of these might behave slightly differently,
 and that turns out to be true.

 I tried 3 types of test:

 A. Start program and close the terminal window
 B. Start program in the background (with & at the shell) then log out
 C. Start program in the background then close the terminal window

 I wrote small Python programs to use for this test, to concentrate on
 figuring out the behavior.

 My 1st program intercepted all signals possible, and then just waited
 around for a KILL signal. :)

 My 2nd program intercepted all signals possible, and then wrote a stream
 to STDOUT in a loop.

 My 3rd program intercepted all signals possible, and then used select()
 to see if anything was available for reading, and tried to read if it
 was.

 Results:

 --[ 1: idle ]----------------------------------------------------------
               Start/Close     Background/Logout    Background/Close
 normal          SIGHUP            nothing               SIGHUP
 su              nothing           nothing               nothing
 sudo          SIGHUP (3x)         nothing               SIGHUP

 --[ 2: writing ]-------------------------------------------------------
               Start/Close      Background/Logout   Background/Close
 normal   SIGHUP, SIGTSTP, err         err             SIGHUP, err
 su       err, SIGHUP, SIGTSTP         err                 err
 sudo     SIGHUP, SIGTSTP, err         err             SIGHUP, err

 --[ 3: reading ]-------------------------------------------------------
               Start/Close      Background/Logout   Background/Close
 normal        SIGHUP, EOF             EOF             SIGHUP, EOF
 su                EOF             SIGTTIN, EOF       SIGTTIN, EOF
 sudo          SIGHUP, err      SIGTTIN, SIGTSTP,      SIGHUP, EOF
                              SIGTERM, SIGTSTP, EOF

 If more than one thing happened, they are listed in the order they
 occurred.

 Key:
   SIGXXX is a signal arriving
   err is an I/O error (either writing or reading)
   EOF means a read returned 0 bytes, indicating EOF



 Michal's Observation
 --------------------
 I think we can understand Michal's results:

       * When the terminal window closed, the boss got no signal at all.
       * Then when one of the child processes tried to output some
         message, it got a write error.
       * When the boss caught the dying child, it tried to output a
         message explaining this and *also* got a write error.
       * Over time, more and more children got write errors and died.


 Analysis
 --------
 The boss process can adapt to the terminal going away. The results above
 show that the loss is detectable (a SIGHUP, an I/O error on write, or
 EOF on read), so the boss can redirect its output to /dev/null or,
 better yet, switch to output functions that do nothing.

 The problem becomes what we do with child processes. If we want them to
 write to the console, then they will get some sort of error too.

       * We could let the children die, and restart them, but this is...
         inelegant.
       * We could perhaps have the boss act as a proxy and use pipes to
         read the output.
       * We could do the same thing, but with pseudo-ttys. Python even
         has a module for this:
         http://docs.python.org/py3k/library/pty.html
       * We could shut down.

 I realize some people want us to 'properly' daemonize. This would make
 the problem go away, but we'll have to change all of the processes to
 live in such an environment, and we'll *still* have to deal with these
 issues when the program is run in the equivalent of '-f' or '-g' from
 BIND 9 (run in foreground).

 Please let me know what you think.

 --
 Shane

 }}}
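
 The mail describes its test programs but does not include them. As a
 rough idea of what the first ("idle") one could look like, here is a
 minimal sketch. It is not the original program: the names `record`,
 `install_handlers` and `main`, and the log file name, are assumptions
 made for illustration, and it needs a POSIX system with Python 3.8 or
 later.

```python
import signal

received = []  # names of the signals seen so far, in arrival order


def signal_name(signum):
    """Return 'SIGHUP' etc., or a numeric label for unnamed real-time signals."""
    try:
        return signal.Signals(signum).name
    except ValueError:
        return "signal %d" % signum


def record(signum, frame):
    """Signal handler: remember which signal arrived."""
    received.append(signal_name(signum))


def install_handlers(signals=None):
    """Hook `record` to every signal that can be caught; return the names hooked."""
    if signals is None:
        signals = signal.valid_signals()  # Python 3.8+
    hooked = []
    for sig in sorted(signals):
        try:
            signal.signal(sig, record)
        except (OSError, ValueError):
            continue  # SIGKILL and SIGSTOP cannot be caught
        hooked.append(signal_name(sig))
    return hooked


def main(log_path="signal_probe.log"):  # hypothetical log file name
    """Idle until killed with SIGKILL, logging each signal that arrives."""
    install_handlers()
    written = 0
    # Log to a file, since the terminal is exactly the thing being taken away.
    with open(log_path, "a") as log:
        while True:
            signal.pause()  # sleep until a handler has run
            pending = received[written:]
            written += len(pending)
            for name in pending:
                log.write(name + "\n")
            log.flush()
```

 Calling main() and then closing the terminal should leave the sequence
 of signals in the log file. Since SIGINT and SIGTERM are caught as
 well, the only way to stop it is SIGKILL, as in the mail.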

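 Two ideas from the Analysis section of the mail can be sketched
 together: the boss reads child output through a pipe, and the boss
 stops writing once its own output fails. This is only an illustration
 under stated assumptions, not how the boss is implemented; the names
 `ForgivingOutput` and `relay_child_output` are hypothetical, and it
 handles a single child with stderr merged into stdout.

```python
import subprocess
import sys


class ForgivingOutput:
    """Write to a stream until a write fails, then silently discard everything."""

    def __init__(self, stream):
        self.stream = stream
        self.alive = True

    def write(self, text):
        if not self.alive:
            return
        try:
            self.stream.write(text)
            self.stream.flush()
        except (OSError, ValueError):
            # OSError: the terminal went away; ValueError: the stream is closed.
            self.alive = False


def relay_child_output(argv, out):
    """Run argv with stdout/stderr on a pipe, copy each line to `out`,
    and return the child's exit status."""
    child = subprocess.Popen(argv, stdout=subprocess.PIPE,
                             stderr=subprocess.STDOUT, text=True)
    for line in child.stdout:
        out.write(line)  # never raises, even after the terminal is gone
    child.stdout.close()
    return child.wait()


def main():
    # Hypothetical child command, used only to have something to relay.
    out = ForgivingOutput(sys.stdout)
    return relay_child_output([sys.executable, "-c", "print('hello')"], out)
```

 Because the child writes to a pipe rather than to the terminal, it
 does not see a write error when the terminal disappears. Only the boss
 does, and it keeps draining the pipe so the child is not blocked.
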
-- 
Ticket URL: <http://bind10.isc.org/ticket/642#comment:8>
BIND 10 Development <http://bind10.isc.org>