BIND 10 #2903: need more asio error handling in asiodns

Tue Apr 30 18:41:51 UTC 2013

#2903: need more asio error handling in asiodns
-------------------------------------+-------------------------------------
            Reporter:  jinmei        |                        Owner:
                Type:  defect        |  jinmei
            Priority:  medium        |                       Status:
           Component:  b10-auth      |  reviewing
            Keywords:                |                    Milestone:
           Sensitive:  0             |  Sprint-20130514
         Sub-Project:  DNS           |                   Resolution:
Estimated Difficulty:  4             |                 CVSS Scoring:
         Total Hours:  0             |              Defect Severity:  N/A
                                     |  Feature Depending on Ticket:
                                     |          Add Hours to Ticket:  0
                                     |                    Internal?:  0
-------------------------------------+-------------------------------------

Comment (by jinmei):

 Replying to [comment:16 vorner]:

 > > I'm sorry for the excessive amount of changes.  [...]
 >
 > Well, my point about the workflow was about the fact that it seems
 > wrong there's a motivation for the piggy-backing. It shows that it

 I understood it, but simply thought your main point was about this
 particular piggyback and focused on it in my response.  Regarding the
 workflow, I agree there's a room to improve in terms of how to handle
 very small tasks.  But at the same time I do not necessarily think the
 higher barrier of incorporating them is a bad practice.  Any such work
 can be a distraction shifting our focus from the main feature tasks,
 and if we consider ensuring quality, including tests and
 documentation, it's not uncommon that even a few lines of patch is not
 really that small as it's originally deemed.

 We have some operational workaround in our practice: we already allow
 making very trivial changes such as fixing typo in comments directly,
 sometimes even skipping any review.  We also allow skipping the formal
 review process for urgent fixes if they are simple enough.  And I
 think reasonable piggy-back is actually not that bad as workaround:
 by definition of piggy-back, both the developer and reviewer should
 already be working in the context closer to the piggy-backed code, so
 the overhead of distraction can be reduced.  And yet we can (probably)
 make more changes than we would if we strictly follow the formal
 workflow.

 As admitted already, I agree the current practice would still not be
 ideal in terms of the balance of quality, speed, and focus.  But
 that's far beyond of the scope of this single ticket.  More of a
 subject of a team call, or ideally we could have discussed it in the
 f2f meeting.

 > '''About the changes''':
 >
 > {{{
 > % ASIODNS_SYNC_UDP_CLOSE_SOCKET_FAIL_ON_STOP failed to close a
 > This is the same to ASIODNS_UDP_CLOSE_SOCKET_FAIL_ON_STOP but happens
 > on the "synchronous UDP server", mainly used for the authoritative DNS
 > server daemon.
 > }}}
 >
 > Looking at that, it seems quite a long ID. Even longer than the human-
 readable message. But that might be related to the fact the human-readable
 message looks incomplete.

 I admit they are quite long and probably look awkward.  It's sometimes
 difficult to come up with reasonable ID names due to the constraint of
 uniqueness and if we still want to make them human understandable (not
 something like "ASIODNS_LOG_1137").  Anyway, I tried to make them a bit
 more concise.  The incomplete message was a clear error, which was
 also fixed.

 > Many messages speak about closing when the server is stopping itself. I
 believe we close the sockets under other circumstances too, for example
 when reconfiguring the listening addresses.

 I tried to clarify that with the above change.

 > When did we see a failure on close? Under what circumstances can that
 happen? And do we leak a file descriptor in that case, or are we trying to
 close something that is not open?

 To be honest, it's a sort of mystery to me.  But it certainly happened
 on our AS112 server (this ticket even has the log and stack trace of
 that case).  I tried to provide much more detailed explanation for
 that log, including information about these questions, to the extent
 of my best knowledge, but I still couldn't make it definitive.

 > From cppcheck (I don't know if they were present before, but it seems
 related):
 > {{{
 > src/lib/asiodns/sync_udp_server.cc:41: check_fail: Member variable
 'SyncUDPServer::resume_called_' is not initialized in the constructor.
 (warning,uninitMemberVar)
 > src/lib/asiodns/sync_udp_server.cc:41: check_fail: Member variable
 'SyncUDPServer::done_' is not initialized in the constructor.
 (warning,uninitMemberVar)
 > }}}

 They were in the original code, and shouldn't have caused a real
 problem as they are always set before read, but explicit
 initialization is a generally good practice anyway, if only to silence
 cppcheck.  So I fixed it in the branch.

-- 
Ticket URL: <http://bind10.isc.org/ticket/2903#comment:17>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development