BIND 10 #353: Move function 'check_port()' and 'check_addr()' to one library

Sun Oct 17 06:47:29 UTC 2010

#353: Move function 'check_port()' and 'check_addr()' to one library
-------------------------------+--------------------------------------------
      Reporter:  zhanglikun    |        Owner:  vorner   
          Type:  enhancement   |       Status:  reviewing
      Priority:  major         |    Milestone:           
     Component:  Unclassified  |   Resolution:           
      Keywords:                |    Sensitive:  0        
Estimatedhours:  0.0           |        Hours:  0        
      Billable:  1             |   Totalhours:  0        
      Internal:  0             |  
-------------------------------+--------------------------------------------
Changes (by jinmei):

  * owner:  jinmei => vorner

Comment:

 Replying to [comment:6 vorner]:

 > I modified the function to guess first by a regular expression which
 version of address it is. This saves one call to inet_ntop() that fails in
 case of IPv6 address and should workaround about this problem as a
 sideeffect. Could you run the tests once more and see if it really does? I
 don't have MacOS.
 >
 The offending test now passes on MacOS X.

 Understanding it may be controversial, however, I always wonder why some
 people tend to write in-house versions of this type of parser (e.g., by
 using a regular expression) while the standard getaddrinfo() library is
 provided exactly for this purpose.

 Regular expressions are often difficult to understand for many people,
 which is in itself against one major goal of BIND 10: better code
 understandability.  And, in general, in-house code tends to more errors
 than carefully written and widely deployed standard libraries (which is
 why I believe the "NIH syndrome" is considered a bad practice).  In fact,
 the regular expression "isv6" has a minor bug: it cannot recognize the
 third form of IPv6 textual representation as defined in Section 2.2 of
 RFC4291: It incorrectly rejects forms like "::ffff:192.0.2.1" or
 "2001:4f8:3:bb::0.0.0.5" (yes, the latter form is a valid representation
 even though it's quite unlikely to see the latter form in practice).

 Admittedly, standard libraries are not always correct (just like we saw it
 in OS X's inet_pton()), but if we use it as an excuse to introduce an in-
 house partial parser, we should end up even replacing inet_pton() with an
 in-house complete parser.

 So, my suggestion is:

  - ignore the buggy behavior of MacOS.  it should be a minor and rare
 error anyway, so IMO it's not worth making our implementation more
 complicated and less understandable.  If we want to ensure this type of
 bogus string be rejected in other platforms, I'd rather move the
 complexity to the test code, e.g.:
 {{{
         try:
             # XXX: MacOS X's inet_pton() doesn't reject this form, so we
             # check the behavior of the underlying library implementation
             # before the actual test
             socket.inet_pton(socket.AF_INET, "0000.0.0.0")
         except socket.error:
             self.assertRaises(ValueError, addr_parse, "0000.0.0.0")
 }}}
  - then, independently from how to address the buggy inet_pton() behavior,
 if we want to save the unnecessary call to inet_pton() for IPv6, use
 socket.getaddrinfo(), e.g.:
 {{{
         try:
             addrinfo = socket.getaddrinfo(addr, None)[0]
             self.family = addrinfo[0]
             # (we could avoid the call to inet_pton() if it's okay to just
             # hold the string address or address tuple)
             self.addr = socket.inet_pton(self.family, addrinfo[4][0])
         except socket.error as e:
             raise InvalidAddress(str(e))
 }}}

 > > I also have following minor comments.  I'd leave it to you whether and
 how to address these points.
 > >  - port_check()/addr_check() may not be a good name (and in that sense
 check.py, too) because xxx_check() sounds like performing a check only and
 it's not intuitive for it to return something
 >
 > I changed it to _parse, which can be expected to complain if it is
 invalid. Does it seem better?
 >
 Yes, I think so.

 > >  - it's not clear that IPAddr() raises socket.error against an invalid
 input (unless you read the source code, and has knowledge that inet_pton()
 raises that exception on failure).  Please at least describe it in the
 function description, and it's probably better to use a user-defined
 specific exception.
 >
 > Added and documented.
 >
 Confirmed.

 > >  - in check_test.py, you assume specific output of inet_ntop(), but as
 far as I know it's implementation dependent (although we may expect a
 "canonical form" with future implementation due to RFC5952).
 >
 > I left it there. For one, I do not see any simple way to check it works.
 For another, even if it is implementation dependent, the canonical form
 (which I hope I put there) seems the only reasonable thing it could return
 and I guess there's no system that would return something else. If there
 is, the only thing that happens would be a false positive of the test (it
 would fail even when it shouldn't), which is better than the other way
 around. Would you mind if it is left there until we actually find a system
 that returns something else?
 >
 That's fine for me, but I'd suggest commenting on it in the test code.

-- 
Ticket URL: <https://bind10.isc.org/ticket/353#comment:7>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development