BIND 10 #1528: Interface detection on Linux refactoring
BIND 10 Development
do-not-reply at isc.org
Fri Mar 16 16:19:21 UTC 2012
#1528: Interface detection on Linux refactoring
-------------------------------------+-------------------------------------
Reporter: tomek | Owner: tomek
Type: | Status: reviewing
enhancement | Milestone: Sprint-
Priority: | DHCP-20120319
medium | Resolution:
Component: | Sensitive: 0
libdhcp | Sub-Project: DHCP
Keywords: | Estimated Difficulty: 0
Defect Severity: N/A | Total Hours: 0
Feature Depending on Ticket: |
Add Hours to Ticket: 0 |
Internal?: 0 |
-------------------------------------+-------------------------------------
Comment (by tomek):
Replying to [comment:6 stephen]:
> '''src/lib/dhcp/iface_mgr_linux.cc'''[[BR]]
> Should update copyright to 2011-2012.
Done.
> Really needs an introduction explaining the logic of Linux interfaces,
e.g. what is netlink, what messages are being used, what needs to happen
etc.
Such introduction/explanation with links to netlink interface is included
in ticket #1540.
> It is not clear why a vector of pointers is required to have many nlmsg
structures pointing to the same header. Can this comment be expanded?
Yes. See if this updated explanation is sufficient.
> There is no need to use the keyword "struct" in the declaration of a
vector of pointers to nlmsghdr objects.
Removed.
> "Holds information about interface or address attributes." This does
not hold the information, it merely points to it. Also, what is the
significance of "rt" in "rtaddtr", "RTattribs" and rtnl_* - Real-Time?
(What I'm trying to get across is that the reader may not know much about
the Netlink interface as it is not as widely-used as, say, the socket
interface. Adding comments to guide them makes it much easier to follow
the logic and to spot errors.)
I'm not really sure. I'm just following convention here: rtnetlink.h,
The original interface used in somewhat similar to naming conventions used
in ip tool from iproute2, which is used as a reference implementation for
netlink. I assume that RT stands for routing, as netlink interface allows
route modification as well. In fact, it is extremely robust, you can
create/get/set/delete links, addresses, routes, queuing, manipulate
traffic classes, manipulate neithborhood tables and even do something with
address labels (whatever that is). I must admit that I don't fully
comprehend all the details of the netlink interface. I can study it
further if you want me to.
> Both !NetlinkMessages and RTattribs objects hold pointers to other
objects. This immediately raises the question of what is expected to free
up the memory pointed to by them. This should be documented.
Added extra explanation with more background about netlink, the way how
normal byte buffer is read from netlink socket and then later is parsed as
collection of netlink messages concatenated together.
Also added comments about releasing allocated memory (see comments around
calls to rtnl_process_reply).
>(In fact, looking further on in the code I note that there is a special
function that deletes the pointed-to objects. In this case, why not
declare the vectors as vectors of scoped_ptrs? Then the objects will be
automatically deleted when the vector is deleted.)
Because this code is already convoluted enough. Have you ever tried to
debug such horribly aliased interface? I'm willing to write those extra
few lines of code for trivial memory release (I don't have to, as the code
is already there).
> Typo: "addres info", should be "address info". Also "any code reuse"
should be "any code reusable".
>
> The assumption that IFA_MAX is shorter than IFLA_MAX can be explicitly
checked during compilation using BOOST_STATIC_ASSERT.
>
> No need to use "struct" in the declaration of rtnl_handle.
Done.
> Should document why !__u32 is used as a data type rather than uint32_t
in rtnl_handle.
>
> sndbuf/rcvbuf. These variables sound like send buffer and receive
buffer, whereas in fact they are the size of the buffer. SNDBUF_SIZE and
RCVBUF_SIZE (or xxx_MAX) would be better names for them.
You mentioned both things in #1540. I will not duplicate it in #1528 to
avoid merge problems.
> ''rtnl_send_request''[[BR]]
> The members of "struct req" do not need to include the "struct"
declaration in C++.
>
> In rtnl, two structs are declared with another one. The enclosing
struct is then sent via a sendto() call with sizeof(enclosing-struct) as
the size. Are we sure that the compiler is not aligning the member
structs on natural boundaries? In other words, in some circumstances could
there be unused bytes between the two? Addition of an assertion to check
that would be useful.
It will work as sizeof(nlmsghdr) = 8, so regardless of memory alignment,
the next structure will start immediately after it. Nevertheless, it is a
good idea to add assert to confirm that. I tried this approach:
BOOST_STATIC_ASSERT(sizeof(nlmsghdr) == offsetof(req,generic) );
But it does not work. Compiler complains about a comma not being allowed
in macro. I think gcc is confused and thinks that two parameters are
passed to BOOST_STATIC_ASSERT that takes only one.
> Spaces around the binary operator "|" are required.
>
> A comment as to why handle.dump is set to ++handle.seq would be useful -
that is non-obvious, as handle is just described as a "context structure".
Also, I would explicitly put that assignment as a separate statement -
it's easy to overlook.
Done as requested. Also added extra comments for seq and dump fields.
> In the call to sendto(), the casting should be done using C++
constructs.
Done.
> ''rtnl_store_reply''[[BR]]
> "Appends nlmsg to a storage" should be "Appends a copy of a !NetLink
message to the message store".
>
> Why the redundant parentheses in the call to "new"?
Copy and paste error. Fixed.
>
> No need to check if {{{copy == NULL}}} as the "new" operation would
throw an exception if this were true. (But if it didn't, the check should
follow the memcpy() call, which would segfault if the destination were
null.)
>
> ''parse_rtattr''[[BR]]
> Reading this raised the following question: why is a boost array used to
store pointers to the rtattr structure rather than a vector (as is the
case for a !NetlinkMessage pointer)? (As it is, if the number of rtattr
structures is too many for the array, the excess are silently ignored.)
Because boost array offers the ability to fix size of the container. This
saves up checking if the vector passed is of proper size.
> In the "if" block, braces are required even for single-line statements.
>
> What are the RTA_OK and RTA_NEXT macros? Some comments are required
here.
These are macros defined in linux/rtnetlink.h for RTA manipulation. RTA_OK
checks if pointed rta structure is sane. RTA_NEXT() returns pointer to the
next rtattr message that immediately follows pointed rta. Added comment
that explains that.
> ''ipaddrs_get''[[BR]]
> This parses addr_info - but addr_info is described as a collection of
parsed netlink messages. More information about what this function
actually does is required.
Added more information.
> Why is addr declared with a length of 16? A symbolic constant should be
used.
Done.
> {{{struct ifaddrmsg *ifa"}}} should be {{{ifaddrmsg* ifa}}} (note the
absence of "struct" and the position of the "*".)
Done.
> Should use C++ construct to cast to "ifaddrmsg*".
Done.
>
> Redundant space after "(" in the "if" statement.
>
> No need to use "struct rtattr*" in call to fill() - "rtattr*" should be
sufficient.
>
> The two "if" blocks checking ifa->family are identical, with the
exception of the address family in the call to IOAddress::from_bytes and
the size of the data in the memcpy() call. The common code should be
abstracted into a separate function. (Additionally: (a) The "if" blocks
in this code should include braces, (b) C++ casts should be used in the
call to memcpy (and the second argument should be cast to {{{const
void*}}}) and (c) symbolic constants should be used as the length of data
to be copied in the call to memcpy.
Collapsed the code into a single block.
> ''rtnl_process_reply''[[BR]]
> It would be helpful if the header were to say what processing was going
to be done. (And comments in the code to say that as well.)
Added.
> Using "rth" as the designation for "netlink parameters" seems a bit
obscure.
Renamed to context.
> No need to use the "struct" keyword when declaring variables.
Removed.
> Declaration of "status" should be at first use. (The same goes for the
declaration of "header".)
Done.
> Does iov.iov_len vary between iterations of the loop? If not, why not
set it outside?
Moved.
> The "if" statement checking for EINTR should have the body within
braces.
Done.
> If recvmsg() returns a value less than zero, it does not automatically
mean that there is an overrun - it could be any one of a number of errors
(determined by errno). Including errno in the exception thrown would be
useful for diagnostic purposes.
Updated.
>
> I notice that here, if recvmsg fails due to an interrupted system call
status, the call is retried. This does not happen in the recvmsg() calls
in iface_mgr.cc. If this is an omission, I would create a separate
function that calls recvmsg and handles the case of an interrupted system
call.
The idea here is that there is no guarantee that messages received over
netlink will come back in one piece. We have such guarantee for UDP
packets, though. (For now we ignore the unlikely possibility of receiving
fragmented UDP which is unheard of in DHCP world).
> Assignment of address of "buf" to "header" - use C++ casts.
Done.
> The comment "Why we received this anyway?" does not give much
information. From the code, better would be something like "Received a
message not addressed to our process, or not with a sequence number we are
expecting. Ignore, and look at the next one."
Done.
> ''release_list''[[BR]]
> This would not be needed if the object were a vector of scoped_ptrs - a
call to vector::clear() would handle the destruction of the pointed-to
objects.
See my comments above about trade-offs between ease of debugging and
language fancy stuff. I had bad experience with trying to debug smart
pointers in gdb, so I try to avoid smart pointers in code that deals with
strange memory tricks. That certainly is true for netlink interface.
> ''detectIfaces''[[BR]]
> The comment "link info" on a variable named "link_info" is not really
helpful (same for addr_info).
>
> Start comments with a capital letter.
>
> "len" can be declared within the main loop, at the point of first use.
>
> Within the main "for" loop, why not set "len" with a single statement.
And what is "len" being set to?
Moved len into "for" loop.
>
> The "valgrind" report has to be bogus - iface_name is scope-limited to
the "for" loop and takes a copy of its argument. However, it is not clear
what the string represents - a comment would be useful. And are we sure
the string is non-null?
No, we are not sure. But we act on two assumptions:
1. RFC3549 states that it is ASCII string. Null termination is not
mentioned, but somewhat assumed.
2. Practical experience shows that it is Null-terminated.
The same approach is used in ip route2. It is sort of reference
implementation and was implemented by Alexey Kuznetzov, the same guy who
wrote Linux kernel counter-part.
> Looking at the declaration of iface:
> * The index is set in the constructor
> * The hardware type is set via a direct access to a member of Iface
> * The flags are set through a call to the method setFlagsx().
> This is inconsistent.
To some degree it is. However, that was chosen for a reason. The absolute
minimum of parameters that are required are interface name and ifindex.
That is useful in some cases, where other information is not available,
e.g. when interface information is read from a file rather than actually
detected. There are many parameters that can possibly be available: name,
index, MAC address, flags, assigned IPv4 and IPv6 addresses, possibly
additional info. We have 3 choices here:
later if available.
1. Create constructor with all parameters with most of the defaulting to 0
or NULL.
2. Create multiple constructor combinations, depending on what information
is available.
3. Create simple constructor with just minimal list of parameters and set
all other parameters
later, using dedicated methods.
First choice gives only a illusion of flexibility, as it is not possible
to sort parameter from most to least likely to be available. Therefore any
order proposed will not work in some cases.
Second option is not good due to code bloat as number of combinations are
really large.
Therefore third option was chosen as the most flexible. Keep in mind that
sooner or later, we will need to implement interface detection for systems
like Solaris that provides only limited information about available APIs
for interface detection.
> The inconsistency is carried through to the setting of the mac_len_ and
mac_ members: these should be zeroed in the the constructor. Similarly
the copying of the MAC from the attribs_table should be handled by a
single call to an Iface method, passing to it the address of the data and
length of the data.
Implemented setMac(),getMac(), getMacLen(), setHWType() and getHWType().
> Note that if exceptions are thrown, the socket will not be closed. This
suggests that rtnl_handle should contain a destructor that closes the
socket. Also, since a socket is
Agree. Added destructor that closes the socket if the descriptor is set.
opened with rtnl_open_socket(), it would be symmetrical for there to be a
rtnl_close_socket() call to avoid the need to expose handle.fd. In fact,
going further, I would suggest that
rtnl_handle should be a fully-fledged class, and that the all the rtnl_xxx
functions should be class methods. This would give maximum information
hiding.
And what benefit that would give us? I would rather work on something
useful instead. During next meeting when asked about progress I would very
much prefer to say "logging is implemented", "developer's documentation is
ready" or "fixed some bugs", rather than "there's no real progress, but
our interface detection code is now symmetric and nicely wrapped in a
class".
Anyway, I did as you suggested.
Rest of the review comments will follow.
--
Ticket URL: <http://bind10.isc.org/ticket/1528#comment:7>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development
More information about the bind10-tickets
mailing list