BIND 10 #3360: memfile: implement file support

Wed Apr 2 13:55:30 UTC 2014

#3360: memfile: implement file support
-------------------------------------+-------------------------------------
            Reporter:  tomek         |                        Owner:
                Type:  enhancement   |  stephen
            Priority:  medium        |                       Status:
           Component:  dhcpdb        |  reviewing
            Keywords:                |                    Milestone:  DHCP-
           Sensitive:  0             |  Kea0.9-alpha
         Sub-Project:  DHCP          |                   Resolution:
Estimated Difficulty:  5             |                 CVSS Scoring:
         Total Hours:  2             |              Defect Severity:  N/A
                                     |  Feature Depending on Ticket:
                                     |          Add Hours to Ticket:  2
                                     |                    Internal?:  0
-------------------------------------+-------------------------------------
Changes (by marcin):

 * owner:  marcin => stephen

Comment:

 Replying to [comment:6 stephen]:
 > Reviewed commit 991fea36336a0303a1f61f1ef0bcfe82d3a682ce
 >
 > I've made some small changes to comments and to the BIND 10 guide and
 pushed.
 >
 >
 > '''General'''[[BR]]
 > This is mentioned below, but I think it should be highlighted here.
 Rather than introduce two new configuration parameters ("leasefile" and
 "persist"), why not use the existing "name" parameter?  If the "type"
 parameter is "memfile", then "name" is the name of the file to which
 leases are written; if "name" is null, leases are not persisted to disk.
 This simplifies the parsing.

 I agree that using !''name!'' parameter is a good idea.

 I disagree with the suggestion that the null name should be used to
 disable persistence:
 - I want to stick to the concept of the default file location. The default
 is taken when the file is not specified or is empty. This is not really
 doable if we want to use null or not null to disable or enable
 persistence. This can't have two meanings.
 - I don't want to remove the file name from the configuration to
 temporarily disable persistence because I will have hard time recreating
 the name when I want to re-enable it. It is so much easier for the user to
 flip boolean flag, rather than copy back and forth the absolute file path.

 >
 > A related issue is one of defaults.  I would suggest that we remove the
 default values for the lease database and explicitly require the user to
 set it. (They have to do a fair amount of configuration to run a DHCP
 server already, so setting the lease database information is not a big
 deal. And they would not be able to run the server until it was set, as
 LeaseMgrFactory::create logs an error and throws an exception if the
 database type is not set or is unrecognised.)  This would eliminate the
 problem of deciding a default location for the lease file; it would also
 remove the case where a parameter default value is hard-coded instead of
 being in the .spec file.

 I disagree. We should strive to make configuration user friendly. People
 will install the server and will want it to work instantly. And less they
 have to configure, the better. I don't think it is a good idea to provide
 defaults for MySQL or Postgres because there is very little chance that
 the defaults will be accurate. In fact it is impossible, given that
 password has to be specified.

 For Memfile there is no password and there is a known directory structure,
 with the var folder in it, that can be used by default. I don't see a
 compelling reason to not use this opportunity and create a lease file for
 a user if we can. We do it once in the code, users would have to do it
 over and over again.

 Also, this doesn't impose any additional burden on the user, comparing
 with the case when default is not supported. He'd have to change the
 location if he wanted a non-default one anyway.

 Finally, is .spec file really an issue? We are going to toss our
 configuration mechanism quite soon anyway. I also remember that defaults
 didn't really work properly through the spec file.

 >
 >
 > '''doc/guide/bind10-guide.xml'''[[BR]]
 > I've made some changes to the text in this file.

 Thanks.

 >
 > '''src/lib/dhcp/duid.cc'''[[BR]]
 > decode: see the comment below about the fromText method in hwaddr.cc
 concerning the handling of two adjacent separators.

 I did as suggested.

 >
 > '''src/lib/dhcp/hwaddr.{cc,h}'''[[BR]]
 > fromText: at present, the hardware type set in the HWAddr object is
 HTYPE_ETHER.  For more flexibility, it could be passed as the second
 argument which has a default value of HTYPE_ETHER.

 Added.

 >
 > fromText: do we want token compression in the split operation?  AIUI, it
 treats adjacent separators as a single separator, so 5::7 appears as 5:7.
 Given that an IPv6 address can look similar to a hardware address - and
 that given the size of the IPv6 address range, most IPv6 addresses include
 a "::" - it might be better to treat the apperance of two adjacent
 separators as an error. That way we have some form of guard should an IPv6
 address be passed to this class.

 Ok. I did as suggested.

 >
 > '''src/lib/dhcp/tests/hwaddr_unittest.cc'''[[BR]]
 > fromText test: should also check the case where the input string
 contains two adjacent colons.

 It checks now.
 >
 >
 > '''src/lib/dhcpsrv/csv_lease_file4.cc'''[[BR]]
 > CSVLeasefile4::next(): There seems no reason not to be the the first few
 statements (apart from the declaration of "row") in the "try" block as
 well. If this is done, the initial lease.reset() call can be moved into
 the "if" block that tests for the empty row returned from the next() call.
 It is a trivial saving, but it means that lease.reset() is only ever
 called once regardless of the path through the method.

 Not a big deal to change that. I did.

 >
 > CSVLeasefile4::next(): client_id_len can simply be set to
 client_id_vec.size(). If the vector is empty, size() will return 0.

 Ooops. Fixed.

 >
 > '''src/lib/dhcpsrv/csv_lease_file4.h'''
 > I've made some small changes to the comments, but have not made large
 ones.  Unless there is a plan to change the documentation, I suggest
 removing the reference to ticket #2405.  A ticket is a transient thing and
 the comment will not really be relevant in a year or so.

 There is a plan to update documentation. Specifically, revisit @todos.

 >
 > '''src/lib/dhcpsrv/csv_lease_file6.cc'''[[BR]]
 > See comment for CSVLeasefile4::next() (above).
 >
 > '''src/lib/dhcpsrv/csv_lease_file6.h'''[[BR]]
 > See comments for src/lib/dhcpsrv/csv_lease_file4.h.
 >

 Addressed.

 >
 > '''src/lib/dhcpsrv/dbaccess_parser.cc'''[[BR]]
 > !DbAccessParser constructor: ctx_ has already been set in the
 initializer list, so the body of the method can be empty.

 Ooops. Fixed.

 >
 >
 > '''src/lib/dhcpsrv/memfile_lease_mgr.cc'''[[BR]]
 > A future ticket should look at this file: there's a lot of replication,
 I'm sure it could be shrunk with judicious use of templates.

 There is a bunch of things that should be fixed in this file. I agree.
 But, nothing in this ticket.

 >
 > '''src/lib/dhcpsrv/memfile_lease_mgr.h'''[[BR]]
 > I would suggest an alteration to the lease manager parameters.  Instead
 of "leasefile" and "persist", use "name" as the name of the file: if it is
 empty, persistence is disabled.  ("name" is meant to be the name of the
 database - in this case, the file is the database.).

 See my answer above.

 >
 > I've made some small changes to the header, but some comments are wrong:
 there is no "persist{4,6}" or "leasefile{4,6}" in the database access
 string, they are "persist" and "leasefile".

 Yes. I had a few turns of action with this ticket and these parameters
 were a reminder of it.

 >
 > I do not think the idea of a default file is a good one: we don't supply
 a default for the database name, why a default for the memfile backend?

 I explained that above. The database name and the lease file location is
 not an apple to apple.

 >
 > persistLeases: The universe could be stored in the Memfile_LeaseMgr
 class when it is constructed, so it is not clear why persistLeases()
 requires a "universe" parameter.

 It could. But, nothing precludes memfile from being extended to support v4
 and v6 leases simultaneously, so I thought it may be good to query the
 persistence using universe.

 >
 > '''src/lib/dhcpsrv/tests/Makefile.am'''[[BR]]
 > As all the .csv files are created by the tests, is there any really need
 to put them in a special testdata directory? Why not put them in the build
 directory (as is done elsewhere for other temporary files)?  It eliminates
 the need for the testdata directory.

 Initially I thought that I will have some files there (instead of
 dynamically creating them). Then I decided differently, but the folder
 remained and I didn't see any harm. But, I took your point and removed it.

 >
 >
 > '''src/lib/dhcpsrv/tests/generic_lease_mgr_unittest.cc'''[[BR]]
 > testBasicLease4: one of the reopen() calls should be passed an explicit
 "v4" parameter. A possible future edit could change the default value and
 have that default value open a V4 lease file, but an explicit "v4"
 parameter open something else. (Admittedly this is highly unlikely, but it
 is a possibility.)

 Ok.

 >
 > '''src/lib/dhcpsrv/tests/lease_file_io.h'''[[BR]]
 > Should be a header describing the purpose of the class.

 Added.

 >
 > '''src/lib/dhcpsrv/tests/memfile_lease_mgr_unittest.cc'''[[BR]]
 > There needs to be two additional tests checking that the recording of a
 lease deletion survives a re-opening of the lease file.
 > * Add a lease and delete it, then reopen the lease file and check that
 the lease is absent.

 This test has been there already in testBasicLeaseX.

 > * Add a lease, delete it, and add it again with different parameters.
 Reopen the lease file and check that the lease is as expected.

 Added new unit test.

 >
 > '''src/lib/dhcpsrv/tests/mysql_lease_mgr_unittest.cc'''
 > Comments: Strictly speaking, there is no need for the v4 and v6 leases
 to share the same database: the system could be configured so that they
 use different databases.  However the schema for the database defines both
 V4 and V6 tables so whatever the configuration, the database has
 appropriate tables to hold the data being stored.

 Ok. But, what exactly I should do about it? I couldn't find a culprit
 comment.

 >
 >
 > '''src/lib/util/csv_file.{h,cc}'''[[BR]]
 > CSVRow::parse: as this is C++, shouldn't the argument be a const
 std::string&? Every time it is used in this file, the caller converts a
 string to a char* using the c_str() method and the first thing the method
 does is to convert it back to a string.

 I made it a string.

 >
 > CSVRow::parse: wouldn't it be easier to use boost::split (as used in
 hwaddr.cc) to split a CSV line into multiple fields?

 Ok. I use split now.

 >
 > CSVRow::EMPTY_ROW: I can see why EMPTY_ROW is used, but as the use is
 typically comparing a row just read from the file to EMPTY_ROW, I can't
 help feeling that a CSVRow::empty() method would be better.

 I assign empty row to the object, not only check that it is empty.

 >
 > General: Some of the one-line methods (e.g. operator==(), operator!=(),
 writeAt with a "string" as a second argument) could be encoded in the
 header file, allowing for optimisation by the compiler.

 Moved to the header file.

 >
 > CSVFile: rather than open/recreate, why not open() with an argument
 stating "always open a new file regardless"?

 These are two different functionalities. I didn't want to enclose them in
 one super-big and complex function. I think it makes them clearer.

 >
 > CSVFile::checkStreamStatus: in most cases where it is used,
 checkStreamStatus() is followed by fs_->clear().  (The exception is in the
 flush() method, where clear is not called: is this an oversight?)  Suggest
 adding the clear() call into the method and renaming the method something
 like checkStreamStatusAndReset().

 True. I put clear() into checkStreamStatus

 >
 > CSVFile::addColumn: it appears that columns can be added dynamically
 while the file is open. If this is not desired, the method should throw an
 exception if fs_ is not null.

 Right. addColumn was also used internally in which case it was allowed for
 the stream to be opened. But, I added another protected function that
 serves this purpose. From the public interface it is now not possible to
 add column while a stream is open.

 >
 > CSVFile::append: seekg is only used on input streams, seekp is for
 output streams.  As this is an output method, I think the call to seekg is
 unnecessary. (I see that the unit test checks that if a file is being read
 and is then written to, the input pointer is pushed to the end of the file
 - I'm not clear why this behaviour is required.)

 In theory. In practice the seekg and seekp happen to do the same thing on
 iostream. I added some extra comment.

 >
 > CSVFile::getColumnName: the comparison should return an error if the
 given col_index is ">=" the number of columns (indexes start at 0 and go
 through to "size-1").

 Good catch.

 >
 > CSVFile::open: a check is made whether the file is open and if not, an
 error is thrown.  The isc_throw call is preceded by a call to close: this
 is unnecessary as the immediately preceding "if" test shows that the file
 is not open. (But see the next comment.)

 Not only does close() performs an actual close, but it also sets the fs_
 pointer to NULL, which is essential.

 >
 > CSVFile::open: the "else" block comprises a series of tests; if a test
 fails, the file is closed and an exception thrown.  The final clause in
 that block is a loop that calls addColumn on each iteration.  Should
 addColumn throw an exception (which it can if adding a duplicate column),
 close() will not be called.  To get round this, I suggest putting the
 whole "else" clause in a "try" block with the close() call in the
 corresponding "catch", i.e.
 > {{{
 >     else {
 >         try {
 >               :
 >             if (!fs_->good()) {
 >                 isc_throw(CSVFileError, "unable to set read pointer in
 the file '"
 >                           << filename_ << "'");
 >             }
 >               :
 >               etc.
 >               :
 >         } catch (...) {
 >             close();
 >             throw;    // re-throw the exception
 >         }
 >     }
 > }}}

 I wanted to avoid too many "catches" but I agree it is less error prone
 and more readable.

 >
 > '''!ChangeLog'''[[BR]]
 > If my suggestions about parameter names is adopted, s/new
 configuration/configuration/
 >
 > s/using configuration//

 They aren't.

 >
 > Otherwise the !ChangeLog entry looks OK.

 Ok. Thanks.

-- 
Ticket URL: <http://bind10.isc.org/ticket/3360#comment:7>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development