BIND 10 #2902: Direct DHCPv4 traffic on Linux - Linux Packet Filter (LPF)

Wed May 22 08:17:36 UTC 2013

#2902: Direct DHCPv4 traffic on Linux - Linux Packet Filter (LPF)
-------------------------------------+-------------------------------------
            Reporter:  marcin        |                        Owner:
                Type:  enhancement   |  stephen
            Priority:  medium        |                       Status:
           Component:  dhcp4         |  reviewing
            Keywords:                |                    Milestone:
           Sensitive:  0             |  Sprint-DHCP-20130523
         Sub-Project:  DHCP          |                   Resolution:
Estimated Difficulty:  0             |                 CVSS Scoring:
         Total Hours:  0             |              Defect Severity:  N/A
                                     |  Feature Depending on Ticket:
                                     |          Add Hours to Ticket:  0
                                     |                    Internal?:  0
-------------------------------------+-------------------------------------
Changes (by marcin):

 * owner:  marcin => stephen

Comment:

 Replying to [comment:5 stephen]:
 > Reviewed commit 9c2f41d0ce4250623b36966eb04698d55eac7509
 >
 > '''src/bin/dhcp4/dhcp4_srv.cc'''[[BR]]
 > Dhcp4Srv constructor: appears to be a word missing in the first comment
 block inside the
 function: it currently says "The 'true' value of in the call".

 Corrected.

 >
 > Dhcp4Srv constructor: a comment should be added to the effect that the
 check on whether the source/destination hardware address are specified
 before copying them across is because the copy function throws an
 exception if the pointer is null.  However, what should the logic be when
 these two fields are null and the corresponding fields in "answer" are
 not? (In other words, should Pkt4::setLocalHWAddr() zero out the
 Pkt4::local_hwaddr_ field if passed a null pointer?)

 I don't have strong opinion here. I followed the logic for the setHWAddr
 when I created the set{Local,Remote}HWAddr. I think that there is no use
 case currently, when we want to set the HWAddr to NULL after it was set to
 non-NULL value. So, it makes sense to me to throw an exception when such
 attempt occurs, so as we can detect unexpected behavior. If we find such
 use case, we can always change it.

 >
 >
 > '''src/bin/dhcp4/tests/dhcp4_srv_unittest.cc'''[[BR]]
 > The comment in the NakedDhcpv4Srv constructor disables raw traffic owing
 to lack of root privileges.  If root privileges are enabled, is there a
 way of running unit tests with raw sockets?

 The PktFilterLPF unit tests cover the part which requires root privileges.

 >
 > testDiscoverRequest:
 > * "i*10" should be "i * 10"

 Corrected.

 > * Adding a comment above the block of code that actually creates the
 request packet saying that this is what is happening would immediately
 make the point of the next few statements clear.

 Added.

 >
 >
 > '''src/lib/dhcp/iface_mgr.cc'''[[BR]]
 > As Iface::closeSockets now requires that !SocketCollection be a
 container whose iterators are not evalidated when erase() has been called,
 an additional comment to that effect at the point at which
 !SocketCollection is defined might avoid problems later on. (For example,
 if someone decided to change it to a vector.)

 Comment added.

 >
 > IfaceMgr::setPacketFilter: The comment "There is at least one socket
 open, so we have to fail" should be after the "if" statement. (Either
 that, or precede the comment with the word "If".)
 >

 ok.

 >
 > '''src/lib/dhcp/iface_mgr.h'''[[BR]]
 > Trivial point: all the comments the Iface class should start with a
 capital letter.

 I corrected all I found.

 >
 > IfaceMgr::setPacketFilter declaration: !PktFilterPtr has been defined in
 pkt_filter.h and can be used in place of the type
 "boost::shared_ptr<!PktFilter>".

 Changed to !PktFilterPtr.

 >
 >
 > '''src/lib/dhcp/iface_mgr_{bsd,linux,sun}.cc'''[[BR]]
 > IfaceMgr::setMatchingPacketFilter - could use !PktFilterPtr instead of
 explicitly using shared_ptr (and shortens the statements enough to allow
 the removal of the intermediate variable without loss of clarity).

 Changed to !PktFilterPtr.

 >
 >
 > '''src/lib/dhcp/pkt4.cc'''[[BR]]
 > setHWAddr()/setLocalHWAddr()/setLocalHWAddr(): these could call a common
 function, passing through the member variable as a parameter.
 Alternatively, leave it as-is, but expand the exception message to
 indicate which hardware address is being set to NULL.

 Exception message expanded.

 >
 > setPacketFilter declaration: !PktFilterPtr has been defined in
 pkt_filter.h and can be used in place of the type
 "boost::shared_ptr<!PktFilter>" in the argument list.

 Changed to !PktFilterPtr.

 >
 >
 > '''src/lib/dhcp/pkt_filter.h'''[[BR]]
 > If the forward declaration of "class Iface" deserves a comment, so does
 the preceding declaration of "struct !SockInfo".

 Added a brief comment.

 >
 >
 > '''src/lib/dhcp/pkt_filter_inet.cc'''[[BR]]
 > PktFilterInet::openSocket: should the "if (receive_bcast)" statement in
 the #ifdef'd block be changed to "if (receive_bcast &&
 iface.flag_broadcast)" in line with similar changes elsewhere?

 Added additional condition as suggested.

 >
 >
 > '''src/lib/dhcp/pkt_filter_lpf.cc'''[[BR]]
 > Header comments: several typos
 > s/other then DHCP/other than DHCP/
 > s/Socket filter program is/The socket filter program is/
 > s/origins from/originates from/

 Corrected.

 >
 > Check that there is a space around operators in the BPF_ statements.
 Also, it's a bit unusual to split the macro BPF_STMT/BPF_JUMP from its
 body.

 Space added. What do you mean by split macro from its body?

 >
 > It would help if the header comments gave a reference to a man page or
 and a bit more help in  interpreting the BPF_STMT and BPF_JUMP statements,
 e.g.
 > {{{
 > The following structure defines a Berkely Packet Filter program to
 perform
 > the filtering. The program operates on Ethernet packets.  To help with
 > interpretation of the program, for the types of Ethernet packets we are
 > interested in, the header layout is:
 >
 >   6 bytes  Destination Ethernet Address
 >   6 bytes  Source Ethernet Address
 >   2 bytes  Ethernet packet type
 >
 >  20 bytes  Fixed part of IP header
 >  variable  Variable part of IP header
 >
 >   2 bytes  UDP Source port
 >   2 bytes  UDP destination port
 >   4 bytes  Rest of UDP header
 > }}}
 > Then expand the comments a bit.  For example:
 > {{{
 >     // Make sure this is an IP packet: check the half-word (two bytes)
 >     // at offset 12 in the packet (the Ethernet packet type).  If it
 >     // is, advance to the next instruction.  If not, advance 8
 >     // instructions (which takes execution to the last instruction in
 >     // the sequence: "drop it").
 >     BPF_STMT (BPF_LD + BPF_H + BPF_ABS, 12),
 >     BPF_JUMP (BPF_JMP + BPF_JEQ + BPF_K, ETHERTYPE_IP, 0, 8),
 >
 >     // Make sure it's a UDP packet.  The IP protocol is at offset
 >     // 9 in the IP header so, adding the Ethernet packet header size
 >     // of 14 bytes gives an absolute byte offset in the packet of 23.
 >     BPF_STMT (BPF_LD + BPF_B + BPF_ABS, 23),
 >     BPF_JUMP (BPF_JMP + BPF_JEQ + BPF_K, IPPROTO_UDP, 0, 6),
 >
 >     // Make sure this isn't a fragment by checking that the fragment
 >     // offset field in the IP header is zero.  This field is the
 >     // least-significant 13 bits in the bytes at offsets 6 and 7 in
 >     // the IP header, so the half-word at offset 20 (6 + size of
 >     // Ethernet header) is loaded and an appropriate mask applied.
 >     BPF_STMT(BPF_LD + BPF_H + BPF_ABS, 20),
 >     BPF_JUMP(BPF_JMP + BPF_JSET + BPF_K, 0x1fff, 4, 0),
 >
 >     // Get the IP header length.  This is achieved by the following
 >     // (special) instruction that, given the offset of the start
 >     // of the IP header (offset 14) loads the IP header length.
 >     BPF_STMT (BPF_LDX + BPF_B + BPF_MSH, 14),
 >
 >     // Make sure it's to the right port.  The following instruction
 >     // adds the previously extracted IP header length to the given
 >     // offset to locate the correct byte.  The given offset of 16
 >     // comprises the length of the Ethernet header (14) plus the offset
 >     // of the UDP destination port (2) within the UDP header.
 >     BPF_STMT (BPF_LD + BPF_H + BPF_IND, 16),
 >     // The following instruction tests against the default DHCP server
 port,
 >     // but the action port is actually set in
 PktFilterLPF::openSocket().
 >     // N.B. The code in that method assumes that this instruction is at
 >     // offset 8 in the program.  If this is changed, openSocket() must
 be
 >     // updated.
 >     BPF_JUMP (BPF_JMP + BPF_JEQ + BPF_K, isc::dhcp::DHCP4_SERVER_PORT,
 0, 1),
 >
 >     // If we passed all the tests, ask for the whole packet.
 >     BPF_STMT(BPF_RET + BPF_K, (u_int) - 1),
 >
 >     // Otherwise, drop it.
 >     BPF_STMT(BPF_RET + BPF_K, 0),
 > }}}
 > (I've gone on about this a little bit because, having not come across
 BPF before, I spend quite a bit of time working it out.  As well as
 reading and understanding the BPF documentation, checking offsets etc.
 involved looking up the format of Ethernet headers, IP headers and UDP
 headers. (And finally, it took a little time puzzling over it before I
 realised why an offset of 16 was used when extracting the UDP port
 number.)

 I appreciate detailed comments you provided for each instruction in this
 program. They are very comprehensive, so I just used the in the code as
 they are.

 >
 > PktFilterLPF::openSocket(): would appreciate a comment before the
 "bind()" call saying why this is happening (as the method is openSocket()
 and the only documentation for it is "open socket") rather than have to
 look at the exception text for the case where it has failed.

 Added comment to bind().

 >
 > PktFilterLPF::receive(): read() can return -1 to indicate an error.
 However, this case is ignored and treated as if an end of file was
 received. (This might be a valid way to treat the error, but it should be
 documented.)

 Comment added.

 >
 > PktFilterLPF::receive(): why don't we know where the DHCP data starts?
 Is it because we have an Ethernet packet and there is the variable IP
 header in it? If so, it should be stated.

 There are two reasons why we have to decode the whole packet before
 getting the DHCP data:
 - The IP header has variable length which is encoded in the packet
 - The packet may be invalid (e.g. truncated). We have to make sure that
 the data is actually there.

 >
 > PktFilterLPF::send(): pkt->getLocalAddress() returns an IOAddress
 object.  This has an "operator uint32_t" to cast an IPV4 address to a
 uint32_t.  Instead of converting the address to text and checking against
 "255.255.255.255", it would be faster to use this cast and compare it the
 value against UINT32_MAX. (Or create a 'static const
 IOAddress("255.255.255.255")' within the method and compare two IOAddress
 objects for equality.)

 Using static const makes the code clearer in my opinion, so I created
 static const.

 >
 >
 > '''src/lib/dhcp/protocol_util.cc'''[[BR]]
 > decodeEthernetHeader(): rather than put the explicit "14" in the
 exception message, ETHERNET_HEADER_LEN should be used as an argument
 instead.

 ok.

 >
 > decodeIpUdpHeader(): if the IP header length is less that five words,
 this is a corrupt packet.  An exception should be thrown instead of the
 error being ignored.

 I now throw an exception.

 >
 > decodeIpUdpHeader(): The comment about skipping the IP header options
 should note "Skip IP header options (if any) to the start of the UDP
 packet".

 Added.

 >
 > decodeIpUdpHeader(): The pointer position is not set to the tail of the
 UDP header (which implies the last byte in the header) but to the start of
 the UDP payload, the DHCP packet.

 Corrected.

 >
 > writeEthernetHeader(): rather than use the constant 0x0800, can't you
 use the symbolic constant ETHERTYPE_IP inmstead?

 This implies that I need to pull another header but ok, I changed that.

 >
 > calcChecksum(): Spaces around "+".

 Corrected.

 >
 >
 > '''src/lib/dhcp/protocol_util.h'''[[BR]]
 > Having read this after pkt_filter_lpf.cc, use of the symbolic constants
 defined here (extended with additional relevant ones) in the BPF macros
 may make the filter instructions easier to understand.

 I have added a couple of constants and use them instead of raw values.

 >
 > Headers for decode{Ethernet,Ip}Header should either say "the pkt object"
 or "a Pkt4 object".

 Corrected.

 >
 > decodeEthernetHeader() does not check if the pointer is NULL, but
 decodeIpUdpHeader() does check.  Why the difference?

 I am not sure I understand this point. Both do check!

 >
 > calcChecksum(): is there a suitable reference to where the checksum
 calculation is defined?

 I added a reference to RFC.

 >
 >
 > '''src/lib/dhcp/tests/iface_mgr_unittest.cc'''[[BR]]
 > socketsForRemoteAddress test: will the OS_LINUX test be re-endabled?  If
 so, under what circumstances?

 It will not. The behavior of this test is driven by the actual interface
 configuration of the machine. On reflection, I think it is better to
 remove this code.

 >
 > sendReceive4 test: Remove the commented out line testing that socket2 is
 set.

 Removed.

 >
 > setMatchingPacketFilter test: both the Linux and non-linux versions look
 identical to me.

 They are not identical. Take a look at last lines of each test. One
 expects TRUE another expects  FALSE.

 >
 >
 >
 > '''src/lib/dhcp/tests/pkt4_unittest.cc'''[[BR]]
 > hwaddrSrcRemote() test: I think that the second
 EXPECT_NO_THROW/EXPECT_TRUE test should use a different hardware address
 to the first.  (Currently both use dst_hwaddr) Otherwise the test could
 pass if the "set" did nothing and the "get" returned the wrong field.

 Yes. This is because I copied the piece of code and forgot to modify it.

 >
 > '''!ChangeLog'''[[BR]]
 > This will need a !ChangeLog entry.

 How about...

 {{{
 6XX.    [func]*         marcin
         b10-dhcp4: Server is capable to respond to a directly connected
 client,
         which doesn't have an address yet. On Linux, server will unicast
 the
         response to the 'yiaddr' which holds the client's new address.
 Sending
         response to unicast address prevents other (not interested) hosts
 from
         receiving server response. This capability is not yet implemented
 on
         non-Linux Operating Systems. On these systems, server will respond
 to
         the broadcast address. The new logic conforms to the chapter 4.1
 of
         RFC2131.
         (Trac #2902, git abc)
 }}}

 As I mentioned during one of our discussions, I discovered bugs in the
 implementation of Direct Traffic which you had reviewed. The
 implementation did not follow some of the rules stated in section 4.1 of
 RFC2131:
 - Broadcast flag set by the client was not respected by the server. Server
 must respond to the broadcast (rather than unicast) address when this flag
 is set.
 - DHCPNAK must be sent to broadcast address because server denied
 assignment of the requested address to the client.
 - Server should respond to ciaddr if available.

 I corrected these issues, and as a result I had to add a few new unit
 tests and modify existing unit tests.

-- 
Ticket URL: <http://bind10.isc.org/ticket/2902#comment:6>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development