BIND 10 #2404: MySQL IPv4 lease database access functions

Wed Dec 5 18:51:49 UTC 2012

#2404: MySQL IPv4 lease database access functions
-------------------------------------+-------------------------------------
            Reporter:  stephen       |                        Owner:
                Type:  task          |  stephen
            Priority:  medium        |                       Status:
           Component:  dhcpdb        |  reviewing
            Keywords:                |                    Milestone:
           Sensitive:  0             |  Sprint-DHCP-20121213
         Sub-Project:  DHCP          |                   Resolution:
Estimated Difficulty:  0             |                 CVSS Scoring:
         Total Hours:  0             |              Defect Severity:  N/A
                                     |  Feature Depending on Ticket:
                                     |          Add Hours to Ticket:  0
                                     |                    Internal?:  0
-------------------------------------+-------------------------------------
Changes (by tomek):

 * owner:  tomek => stephen

Comment:

 Replying to [comment:9 stephen]:
 > In the longer term, I would like to use a utility like "dbutil" to
 update the schema.  This requires discussion with the DNS team as although
 the schema and backend database is different, there will be a lot of
 commonaility between the DNS and DHCP update programs.  I would prefer
 that we don't duplicate code.)
 Feel free to initiate such discussion. If you need more time for this than
 you can dedicate now, please submit a new ticket for that. Currently the
 schema is rather simple, so we can update it manually. But it will get
 much more complicated in the future.

 > (Although as a note, if we were to put the definitions of getters and
 setters in the class definition, the compiler should inline them and there
 should be no penalty.  However, I think a "struct" is warranted in this
 case: I tend to use a "struct" where the intention is to collect data
 items together, and a "class" where the contained data is less important
 than the functionality the class provides.  In this case, both Lease4 and
 Lease6 are principally data stores, so a "struct" is appropriate.)
 Agree. The only real difference between class and a struct is that struct
 cannot have private members. Which is entirely the case here. We may
 develop some utility methods, but the core purpose will still be to store
 data.

 > '''src/lib/dhcpsrv/mysql_lease_mgr.cc'''[[BR]]
 > > CLIENT_ID_MAX_LEN constant: Please comment that this is arbitrarily
 chosen value as RFC2131 does not specify upper bounds. It just follows 128
 byte limit from DHCPv6.
 >
 > > We can possibly add a comment that a minor performance boost may be
 achieved with shortening this field.
 > Is that actually the case?  VARCHAR fields in the database are variable
 length and only store the data passed to them.  I'm not sure that changing
 the maximum size affects performance.
 Ok, I was thinking about changing the whole schema. I do not have
 operational experience with v4, so I can't comment on feasibility of
 client-id being 128 bytes long. It definitely seems too large, especially
 taking into consideration the limitations of DHCPv4 packet size.

 I'm not saying that we should this now. Just marking it as "potential
 (minor) performance improvement" will be sufficient.

 > > On a side note, it may be a good idea to mark such comments somehow.
 It will be useful when a customer comes to us and says that extra
 perfomance is needed even if corners are cut. Note that it is "when", not
 "if". Not all performance improvements are sane in general case, but it
 would be useful to have pointers to start.
 > We could just ensure that such sections start "Performance:", although
 it might be better to define a suitable tag: the Doxygen command \xrefitem
 may be our friend here.  However, as adding a suitable tag such as
 "\performance" means altering the Doxygen configuration file to include a
 suitable ALIAS statement, I would prefer to get agreement with the DNS
 team first.
 Ok. This is not urgent, it's just something that would be good to have.

 > > ADDRESS6_TEXT_MAX_LEN is 42 bytes. Comment for address:varchar states
 that we define the field as "couple bytes longer", so we can null
 terminate it. Why couple bytes and not just one? Also, is the MySQL API
 require null-termination? In that case why string length is passed as
 well?
 > The documentation is not as clear as it could be, but on reflection I
 think you are correct - the number of bytes transferred from the database
 for a VARCHAR is the number of characters in the field.  As such, the
 buffer need only allocate one additional byte to allow for a trailing
 NULL.
 Ok.

 > > Why did you remove error checking (bind_[X].error values)? There
 shouldn't be any overflows, but it is better to check it anyway. One
 example that comes to mind is overflow my happen with IDN hostnames and
 odd UTF-8 encoding. That encoding may be set by the user and internal
 MySQL conversion is outside of our control. Let's keep it.
 > Mainly because it wasn't used.  But you are right, we might as well
 check for truncation now (it was a "todo").  The error variables have been
 restored and truncated data causes an exception to be thrown.
 Great. It's better to be safe than sorry. And with MySQL exposed, I'm sure
 there will be folks eager to tinker with it "and see what happens".

 > > Why MySqlLease6Exchange::getLeaseData() does not set T1 and T2 and
 uses zeros? I haven't reviewed the tests yet, but aren't tests supposed to
 catch that?
 > Should T1 and T2 be stored in the database?  At the moment, they aren't.
 Good point. memfile stores them, just because it uses Lease6 (and will use
 Lease4). It is worth mentioning in the documentation, though. A single
 sentence in .dox will do.

  > > You may add @todo here noting that innodb_flush_log_at_trx_commit
 parameter will be tweakable from bind10 configuration one day.
 > Is that wise?  I'm not certain if the parameter can be altered from the
 C API and besides, it is a database tuning parameter, not really in the
 scope of BIND 10.
 Ok. I thought that it will be controllable via bind10 config, but changing
 it directly in MySQL is fine as well. So how is it changed? Using mysql
 client? Or in some config file?

 > > getLease4: I think we should throw exception if the subnet-id field
 does not match, and not silently ignore it. This means database
 corruption.
 > I was a puzzled when I saw this method in the !LeaseMgr class: the
 address is the unique primary key of the table so we can find at most one
 lease with that address.  The subnet_id is just an attribute of the row
 returned, so getting by address and subnet ID is really over-constraining
 the retrieval operation.
 I agree to some degree. Unless someone tries to assign several overlapping
 RFC1918 address in different subnets. But that will not work for other
 reasons...

 Yes, you are right. Please remove this call.

 Also, I've found a minor bug in the getLease4(addr) comment. It mentions
 DUID instead of client-id, which is likely a copy-paste error.

 > Really, I don't think this method belongs in the lease manager: it is
 the caller that is expecting a particular relationship between address and
 subnet ID, not the database layer.  So it seems more logical for the
 caller to perform the check.  Having said that, a performance optimisation
 could be for the database to perform the search using SQL: if the
 combination of address and subnet ID is not found, it would save the
 transfer of a record from the database.
 Agree. Feel free to remove it as part of this ticket or submit separate
 ticket for that.

 > > getName() method is supposed to return backend type. That is required
 for Kea to report which backend is used. See DHCP6_DB_BACKEND_STARTED log
 message and memfile implementation.
 > getType() returns the type of backend (memfile, !MySql etc.)  getName()
 returns the name of the instance being used.  For !MySql, getType returns
 "mysql", getName returns the name of the database ("kea", "keatest" etc.)
 Fair enough. I have either forgotten about getType() method or someone
 else added it. Can I then ask you to update getName() in
 memfile_lease_mgr.h? It's a single line change.

 > > deleteLease4/6 returns boolean value, but updateLease4/6 status is
 reported as exception (or lack of thereof). This should be consistent. I
 think exeptions are they way to go (there may be other exeptions as well,
 e.g. db error), but I'm ok with whatever consistent approach you'll
 propose.
 > It's really up to you as you are writing the code that uses it.  But I
 would have thought that the cases are different:
 >
 > The desired post-condition of deleteLease() is that the lease is not in
 the database.  Whether it was there beforehand is not really relevant as
 the aim is to get rid of it.  Therefore, deleting a lease that is not
 there may be worthy of a note somewhere but it does not affect the flow of
 control of whatever is deleting the lease.  (If the pre-condition that
 presence of the lease is important, getLease should have been used to
 check.)
 Agree to some degree. But an attempt to delete a lease that is not there
 likely indicates a flaw in the server logic. On the other hand, I can
 imagine at least two scenarios where this could happen:

 1. administrator manually deleted a lease shortly before client sent
 release. Less than thoughtful administrator, I admit, but such cases will
 happen.
 2. we may later develop performance mode, when all non-critical checks are
 disabled. So when doing release we may skip the check that the lease is
 really there and go straight to delete. Implications of such optimization
 must be investigated, but that is at least something to consider.

 So I withdraw my comment. Please keep the code as it is now - returning
 boolean.

 > > deleteLease4/6 todo comment: I'm not sure about unifying them. I think
 it would be better from the consistency perspective to keep the API
 clearly label protocol number.
 > addlease does not include a numeric suffix, the version of the method
 being chosen by argumewnt type.  With deleteLease, the version of the
 method to be used can be inferred from the type of address passed to it.
 So why introduce a complication?
 Fair enough.

 > > We could query several subnets and just keep adding results to the
 same collection. I'm not sure if that is how we will implement leasequery,
 but I'm somewhat doubtful about result.clear(). If you have strong opinion
 about keeping it as it is now, I won't object.
 > As noted above, as currently written the clear() was unecessary.  If we
 do go for this idea, then we will have to rewrite some of the getLease()
 methods to modify a !LeaseCollection object passed to them as opposed to
 returning one via the method return value.
 Let's not rewrite anything yet. It was just an idea that something like
 this may potentially be useful. We may need to rewrite it once we get a
 specific usecase that we decide to support.

 > > I like the templated getLeaseCollection approach. Do you think
 specifying 3 parameter getLeaseCollection methods are needed? Templated
 getLeaseCollection could have been called directly. The only (minor)
 drawback is that you'd need to use exchange4_ and exchange6_ explicitly in
 every call. Again, that is only a suggestion. I'm fine with keeping the
 code as it is now.
 > I added them more for ease of use and understandability than anything
 else.  Without them, at the call to getLeaseCollection() you would need to
 introduce the correct exchange object, which has not appeared anywhere in
 the method up to then.  With them, you don't worry about the exchange
 objects, they only appear in the common code.  The three-parameter methods
 are defined in the header, so the compiler should inline them.
 Ok.

 > I agree, and "But let's keep things simple for now" was my approach.  In
 the long-term I was envisioning something like the DNS "dbutil" utility
 for creating and updating the schema.  It may be possible to add something
 to that to generate the .h file.
 Ok.

 > '''src/lib/dhcpsrv/tests/lease_mgr_unittest.cc'''
 > > Please include a one or two line comment before each test to explain
 the scope of a test. That is Shawn's idea that I very much support. We
 will convert them to a more structured test documentation one day.
 > Done.
 Thank you. In a near future, I think we should extend TEST and TEST_F
 macros (or rather define BIND_TEST and BIND_TEST_F) that take additional
 parameter that explains test purpose and possibly have other structured
 information. This would be useful to automatically generate test
 documentation. Your comments will be very useful for that purpose.

 > > Also, we probably should defined !ClientIdPtr (or perhaps use !DuidPtr
 for as a !ClientId is derived from Duid class).
 > I leave that up to you.
 Let's define ClientIdPtr. ClientId field will probably get extra methods,
 e.g. like getHwType(), when we add hardware type to it, as Shawn
 suggested.

 > > checkLeasesDifferent(): I agree with the first loop using ASSERT_TRUE
 instead of EXPECT_TRUE, but I question the second double loop: It will
 print out the first error and abort. It would be more informational if all
 > combinations were printed...
 > Done by adding a SCOPED_TRACE statement prior to the call.
 Nice trick!

 > > ... Also, it should be EXPECT_EQ or ASSERT_EQ, so leases would be
 printed in case of mismatch.
 > That will only work where the lease can be converted to a string.  For
 the moment, I think that just identifying which leases are equal is
 sufficient.
 Ok.

 > > Finally, I think it would be good to compare not the shared pointers,
 but the objects they point to. This would catch the case where you have 2
 exact copies with all parameters the same.
 > Good catch.  Done - of course, it did mean adding Lease4::operator==()
 and then writing unit tests for both Lease4::Operator==() and
 Lease6::operator==().
 Great!

 I was thinking about writing Lease6Ptr::operator== that will always throw
 exception to easier trigger cases, when developer wants to compare leases,
 but actually compares pointers to them. But I haven't found a way to do
 that.

 > > getLease4ClientId, getLease4HwaddrSubnetId, getLease4Hwaddr: Please
 check that the functions do not crash when empty client-id and/or hw
 address is provided. (Remember the fun we had with zero length client-
 ids?) Please test maximum sizes of the client-id (128) and hwaddr (20).
 > Done.  Also tried lengths in excess of these limits to verify they are
 truncated (although I can't find a way to get that information when the
 data is added to the database). In addition, the tests were extended to
 check DUID lengths in the Lease6 tests.
 Ok.

 > > One generic comment about client-id: Shawn suggested that we should
 keep hardware type as the first byte in hwaddr. That is something we are
 doing in isc-dhcp. I'm not saying that we should do it within this ticket,
 but we should keep in mind that something like this will likely be
 implemented. The best course of action would be to talk about if briefly
 and the probably create a ticket for it.
 > Agreed.  Or define a !HardwareAddress class which contains both type and
 value, and wrote both into separate columns in the database.
 That is backend implementation detail. I was more thinking about how
 !HardwareAddress and !ClientId are handled in the server code. I don't
 have much experience with v4, so we will probably need to consult with
 Shawn on this.

 > > Generic comment about getLeaseX family of tests. They should be
 eventually refactored to cover other backends. I think we need one common
 class for all backends. As this is a huge task, we should probably create
 a separate ticket for it. I for one would like to avoid replicating this
 huge work you've done here when implementing similar tests for memfile.
 > Agreed, but I don't think it's a huge task.  The tests are virtually
 completely backend agnostic, so it is really just getting them to run with
 different backends.  Gtest has "parameterised tests", which may meet that
 need.  I've created ticket #2533 for this.
 Ok, thank you.

 Hmm, seems that you either addressed on convinced me that things are good
 as they are now. The only outstanding thing is my request to update
 getName() in memfile backend. After that this ticket is ready for merge. I
 don't need to see it again.

-- 
Ticket URL: <http://bind10.isc.org/ticket/2404#comment:10>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development