BIND 10 #2404: MySQL IPv4 lease database access functions
BIND 10 Development
do-not-reply at isc.org
Wed Dec 5 18:51:49 UTC 2012
#2404: MySQL IPv4 lease database access functions
-------------------------------------+-------------------------------------
Reporter: stephen | Owner:
Type: task | stephen
Priority: medium | Status:
Component: dhcpdb | reviewing
Keywords: | Milestone:
Sensitive: 0 | Sprint-DHCP-20121213
Sub-Project: DHCP | Resolution:
Estimated Difficulty: 0 | CVSS Scoring:
Total Hours: 0 | Defect Severity: N/A
| Feature Depending on Ticket:
| Add Hours to Ticket: 0
| Internal?: 0
-------------------------------------+-------------------------------------
Changes (by tomek):
* owner: tomek => stephen
Comment:
Replying to [comment:9 stephen]:
> In the longer term, I would like to use a utility like "dbutil" to
update the schema. This requires discussion with the DNS team as although
the schema and backend database is different, there will be a lot of
commonaility between the DNS and DHCP update programs. I would prefer
that we don't duplicate code.)
Feel free to initiate such discussion. If you need more time for this than
you can dedicate now, please submit a new ticket for that. Currently the
schema is rather simple, so we can update it manually. But it will get
much more complicated in the future.
> (Although as a note, if we were to put the definitions of getters and
setters in the class definition, the compiler should inline them and there
should be no penalty. However, I think a "struct" is warranted in this
case: I tend to use a "struct" where the intention is to collect data
items together, and a "class" where the contained data is less important
than the functionality the class provides. In this case, both Lease4 and
Lease6 are principally data stores, so a "struct" is appropriate.)
Agree. The only real difference between class and a struct is that struct
cannot have private members. Which is entirely the case here. We may
develop some utility methods, but the core purpose will still be to store
data.
> '''src/lib/dhcpsrv/mysql_lease_mgr.cc'''[[BR]]
> > CLIENT_ID_MAX_LEN constant: Please comment that this is arbitrarily
chosen value as RFC2131 does not specify upper bounds. It just follows 128
byte limit from DHCPv6.
>
> > We can possibly add a comment that a minor performance boost may be
achieved with shortening this field.
> Is that actually the case? VARCHAR fields in the database are variable
length and only store the data passed to them. I'm not sure that changing
the maximum size affects performance.
Ok, I was thinking about changing the whole schema. I do not have
operational experience with v4, so I can't comment on feasibility of
client-id being 128 bytes long. It definitely seems too large, especially
taking into consideration the limitations of DHCPv4 packet size.
I'm not saying that we should this now. Just marking it as "potential
(minor) performance improvement" will be sufficient.
> > On a side note, it may be a good idea to mark such comments somehow.
It will be useful when a customer comes to us and says that extra
perfomance is needed even if corners are cut. Note that it is "when", not
"if". Not all performance improvements are sane in general case, but it
would be useful to have pointers to start.
> We could just ensure that such sections start "Performance:", although
it might be better to define a suitable tag: the Doxygen command \xrefitem
may be our friend here. However, as adding a suitable tag such as
"\performance" means altering the Doxygen configuration file to include a
suitable ALIAS statement, I would prefer to get agreement with the DNS
team first.
Ok. This is not urgent, it's just something that would be good to have.
> > ADDRESS6_TEXT_MAX_LEN is 42 bytes. Comment for address:varchar states
that we define the field as "couple bytes longer", so we can null
terminate it. Why couple bytes and not just one? Also, is the MySQL API
require null-termination? In that case why string length is passed as
well?
> The documentation is not as clear as it could be, but on reflection I
think you are correct - the number of bytes transferred from the database
for a VARCHAR is the number of characters in the field. As such, the
buffer need only allocate one additional byte to allow for a trailing
NULL.
Ok.
> > Why did you remove error checking (bind_[X].error values)? There
shouldn't be any overflows, but it is better to check it anyway. One
example that comes to mind is overflow my happen with IDN hostnames and
odd UTF-8 encoding. That encoding may be set by the user and internal
MySQL conversion is outside of our control. Let's keep it.
> Mainly because it wasn't used. But you are right, we might as well
check for truncation now (it was a "todo"). The error variables have been
restored and truncated data causes an exception to be thrown.
Great. It's better to be safe than sorry. And with MySQL exposed, I'm sure
there will be folks eager to tinker with it "and see what happens".
> > Why MySqlLease6Exchange::getLeaseData() does not set T1 and T2 and
uses zeros? I haven't reviewed the tests yet, but aren't tests supposed to
catch that?
> Should T1 and T2 be stored in the database? At the moment, they aren't.
Good point. memfile stores them, just because it uses Lease6 (and will use
Lease4). It is worth mentioning in the documentation, though. A single
sentence in .dox will do.
> > You may add @todo here noting that innodb_flush_log_at_trx_commit
parameter will be tweakable from bind10 configuration one day.
> Is that wise? I'm not certain if the parameter can be altered from the
C API and besides, it is a database tuning parameter, not really in the
scope of BIND 10.
Ok. I thought that it will be controllable via bind10 config, but changing
it directly in MySQL is fine as well. So how is it changed? Using mysql
client? Or in some config file?
> > getLease4: I think we should throw exception if the subnet-id field
does not match, and not silently ignore it. This means database
corruption.
> I was a puzzled when I saw this method in the !LeaseMgr class: the
address is the unique primary key of the table so we can find at most one
lease with that address. The subnet_id is just an attribute of the row
returned, so getting by address and subnet ID is really over-constraining
the retrieval operation.
I agree to some degree. Unless someone tries to assign several overlapping
RFC1918 address in different subnets. But that will not work for other
reasons...
Yes, you are right. Please remove this call.
Also, I've found a minor bug in the getLease4(addr) comment. It mentions
DUID instead of client-id, which is likely a copy-paste error.
> Really, I don't think this method belongs in the lease manager: it is
the caller that is expecting a particular relationship between address and
subnet ID, not the database layer. So it seems more logical for the
caller to perform the check. Having said that, a performance optimisation
could be for the database to perform the search using SQL: if the
combination of address and subnet ID is not found, it would save the
transfer of a record from the database.
Agree. Feel free to remove it as part of this ticket or submit separate
ticket for that.
> > getName() method is supposed to return backend type. That is required
for Kea to report which backend is used. See DHCP6_DB_BACKEND_STARTED log
message and memfile implementation.
> getType() returns the type of backend (memfile, !MySql etc.) getName()
returns the name of the instance being used. For !MySql, getType returns
"mysql", getName returns the name of the database ("kea", "keatest" etc.)
Fair enough. I have either forgotten about getType() method or someone
else added it. Can I then ask you to update getName() in
memfile_lease_mgr.h? It's a single line change.
> > deleteLease4/6 returns boolean value, but updateLease4/6 status is
reported as exception (or lack of thereof). This should be consistent. I
think exeptions are they way to go (there may be other exeptions as well,
e.g. db error), but I'm ok with whatever consistent approach you'll
propose.
> It's really up to you as you are writing the code that uses it. But I
would have thought that the cases are different:
>
> The desired post-condition of deleteLease() is that the lease is not in
the database. Whether it was there beforehand is not really relevant as
the aim is to get rid of it. Therefore, deleting a lease that is not
there may be worthy of a note somewhere but it does not affect the flow of
control of whatever is deleting the lease. (If the pre-condition that
presence of the lease is important, getLease should have been used to
check.)
Agree to some degree. But an attempt to delete a lease that is not there
likely indicates a flaw in the server logic. On the other hand, I can
imagine at least two scenarios where this could happen:
1. administrator manually deleted a lease shortly before client sent
release. Less than thoughtful administrator, I admit, but such cases will
happen.
2. we may later develop performance mode, when all non-critical checks are
disabled. So when doing release we may skip the check that the lease is
really there and go straight to delete. Implications of such optimization
must be investigated, but that is at least something to consider.
So I withdraw my comment. Please keep the code as it is now - returning
boolean.
> > deleteLease4/6 todo comment: I'm not sure about unifying them. I think
it would be better from the consistency perspective to keep the API
clearly label protocol number.
> addlease does not include a numeric suffix, the version of the method
being chosen by argumewnt type. With deleteLease, the version of the
method to be used can be inferred from the type of address passed to it.
So why introduce a complication?
Fair enough.
> > We could query several subnets and just keep adding results to the
same collection. I'm not sure if that is how we will implement leasequery,
but I'm somewhat doubtful about result.clear(). If you have strong opinion
about keeping it as it is now, I won't object.
> As noted above, as currently written the clear() was unecessary. If we
do go for this idea, then we will have to rewrite some of the getLease()
methods to modify a !LeaseCollection object passed to them as opposed to
returning one via the method return value.
Let's not rewrite anything yet. It was just an idea that something like
this may potentially be useful. We may need to rewrite it once we get a
specific usecase that we decide to support.
> > I like the templated getLeaseCollection approach. Do you think
specifying 3 parameter getLeaseCollection methods are needed? Templated
getLeaseCollection could have been called directly. The only (minor)
drawback is that you'd need to use exchange4_ and exchange6_ explicitly in
every call. Again, that is only a suggestion. I'm fine with keeping the
code as it is now.
> I added them more for ease of use and understandability than anything
else. Without them, at the call to getLeaseCollection() you would need to
introduce the correct exchange object, which has not appeared anywhere in
the method up to then. With them, you don't worry about the exchange
objects, they only appear in the common code. The three-parameter methods
are defined in the header, so the compiler should inline them.
Ok.
> I agree, and "But let's keep things simple for now" was my approach. In
the long-term I was envisioning something like the DNS "dbutil" utility
for creating and updating the schema. It may be possible to add something
to that to generate the .h file.
Ok.
> '''src/lib/dhcpsrv/tests/lease_mgr_unittest.cc'''
> > Please include a one or two line comment before each test to explain
the scope of a test. That is Shawn's idea that I very much support. We
will convert them to a more structured test documentation one day.
> Done.
Thank you. In a near future, I think we should extend TEST and TEST_F
macros (or rather define BIND_TEST and BIND_TEST_F) that take additional
parameter that explains test purpose and possibly have other structured
information. This would be useful to automatically generate test
documentation. Your comments will be very useful for that purpose.
> > Also, we probably should defined !ClientIdPtr (or perhaps use !DuidPtr
for as a !ClientId is derived from Duid class).
> I leave that up to you.
Let's define ClientIdPtr. ClientId field will probably get extra methods,
e.g. like getHwType(), when we add hardware type to it, as Shawn
suggested.
> > checkLeasesDifferent(): I agree with the first loop using ASSERT_TRUE
instead of EXPECT_TRUE, but I question the second double loop: It will
print out the first error and abort. It would be more informational if all
> combinations were printed...
> Done by adding a SCOPED_TRACE statement prior to the call.
Nice trick!
> > ... Also, it should be EXPECT_EQ or ASSERT_EQ, so leases would be
printed in case of mismatch.
> That will only work where the lease can be converted to a string. For
the moment, I think that just identifying which leases are equal is
sufficient.
Ok.
> > Finally, I think it would be good to compare not the shared pointers,
but the objects they point to. This would catch the case where you have 2
exact copies with all parameters the same.
> Good catch. Done - of course, it did mean adding Lease4::operator==()
and then writing unit tests for both Lease4::Operator==() and
Lease6::operator==().
Great!
I was thinking about writing Lease6Ptr::operator== that will always throw
exception to easier trigger cases, when developer wants to compare leases,
but actually compares pointers to them. But I haven't found a way to do
that.
> > getLease4ClientId, getLease4HwaddrSubnetId, getLease4Hwaddr: Please
check that the functions do not crash when empty client-id and/or hw
address is provided. (Remember the fun we had with zero length client-
ids?) Please test maximum sizes of the client-id (128) and hwaddr (20).
> Done. Also tried lengths in excess of these limits to verify they are
truncated (although I can't find a way to get that information when the
data is added to the database). In addition, the tests were extended to
check DUID lengths in the Lease6 tests.
Ok.
> > One generic comment about client-id: Shawn suggested that we should
keep hardware type as the first byte in hwaddr. That is something we are
doing in isc-dhcp. I'm not saying that we should do it within this ticket,
but we should keep in mind that something like this will likely be
implemented. The best course of action would be to talk about if briefly
and the probably create a ticket for it.
> Agreed. Or define a !HardwareAddress class which contains both type and
value, and wrote both into separate columns in the database.
That is backend implementation detail. I was more thinking about how
!HardwareAddress and !ClientId are handled in the server code. I don't
have much experience with v4, so we will probably need to consult with
Shawn on this.
> > Generic comment about getLeaseX family of tests. They should be
eventually refactored to cover other backends. I think we need one common
class for all backends. As this is a huge task, we should probably create
a separate ticket for it. I for one would like to avoid replicating this
huge work you've done here when implementing similar tests for memfile.
> Agreed, but I don't think it's a huge task. The tests are virtually
completely backend agnostic, so it is really just getting them to run with
different backends. Gtest has "parameterised tests", which may meet that
need. I've created ticket #2533 for this.
Ok, thank you.
Hmm, seems that you either addressed on convinced me that things are good
as they are now. The only outstanding thing is my request to update
getName() in memfile backend. After that this ticket is ready for merge. I
don't need to see it again.
--
Ticket URL: <http://bind10.isc.org/ticket/2404#comment:10>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development
More information about the bind10-tickets
mailing list