BIND 10 #547: Update of statistics daemon for HTTP/XML reporting: implement

Fri Apr 8 01:22:08 UTC 2011

#547: Update of statistics daemon for HTTP/XML reporting: implement
-------------------------------------+-------------------------------------
                 Reporter:           |                Owner:  vorner
  naokikambe                         |               Status:  reviewing
                     Type:           |            Milestone:
  enhancement                        |  Sprint-20110419
                 Priority:  major    |           Resolution:
                Component:           |            Sensitive:  0
  statistics                         |  Add Hours to Ticket:  0
                 Keywords:           |          Total Hours:  0
Estimated Number of Hours:  40.0     |
                Billable?:  1        |
                Internal?:  0        |
-------------------------------------+-------------------------------------
Changes (by naokikambe):

 * owner:  naokikambe => vorner

Comment:

 Replying to [comment:8 vorner]:
 Thank you very much for so deeply reviewing and for giving so many
 comments!  And I'm very sorry for confusing reviewers a lot. So I would
 like to suggest you this. I'll close this ticket and split it into
 following 4 tickets.
 * TICKET1: document a manpage about XML/HTTP interface for statistics
 feature
 * TICKET2: implement a XML/HTTP interface for statistics feature
 * TICKET3: update the existing statistics daemon for the XML/HTTP
 interface
 * TICKET4: update mockup classes referred from statistics modules
 And I'll delete the existing branches on the repository (trac547,
 trac547-summarized) and recreate new branch and complete history by each
 ticket. How about that? Can it make sense for you? I'll do that if it is
 okay for you. Please let me know.

 BTW, I will answer to each your comment here.

 > I fixed some English in the man page, pushed into the -summarised
 branch.
 Thank you. That's good.
 >  * Every process/module has a separate directory in the bin/. Would the
 xml stats server deserve it's own directory as well?
 Because I think the stats daemon and the HTTP server are grouped in the
 same category according to the picture in [[DesignDiagrams]], and they
 almost share the same libraries and the same stub codes, so I put them
 into the same directory. If it is incorrect, I will separate the
 directory.
 >  * Do we expect users to want to know the internal how it works? Like
 how it is started by boss, etc. Maybe the man page could be less detailed
 in that sense, and just say that it presents the statistics from stats
 module trough XML/HTTP?
 Yes, the users needs to know details but I don't exactly know what we
 should add in the man page. What about this? Please show me more
 suggestion if you have.
 {{{
 For example, The HTTP server gets the following data from the stats daemon
 in python dict format. :
 {
   "auth.queries.tcp": 1,
   "auth.queries.udp": 1,
   "bind10.boot_time": "2011-04-06T10:52:10Z",
   "report_time": "2011-04-06T10:56:32Z",
   "stats.boot_time": "2011-04-06T10:52:12Z",
   "stats.last_update_time": "2011-04-06T10:56:11Z",
   "stats.lname": "4d9c45dc_c at fdvm",
   "stats.start_time": "2011-04-06T10:52:12Z",
   "stats.timestamp": 1302087392.218623
 }
 Then the server pushes this to the client in XML format through HTTP. :
   <stats>
   <auth.queries.tcp>0</auth.queries.tcp>
   <auth.queries.udp>0</auth.queries.udp>
   <bind10.boot_time>2011-04-06T10:52:10Z</bind10.boot_time>
   <report_time>2011-04-06T10:55:10Z</report_time>
   <stats.boot_time>2011-04-06T10:52:12Z</stats.boot_time>
   <stats.last_update_time>2011-04-06T10:54:11Z</stats.last_update_time>
   <stats.lname>4d9c45dc_c at fdvm</stats.lname>
   <stats.start_time>2011-04-06T10:52:12Z</stats.start_time>
   <stats.timestamp>1302087310.02</stats.timestamp>
   </stats>
 That is, one element in python dict corresponds to one tag in XML.
 }}}
 >  * The man page talks about parameters when it talks on which addresses
 it listens. IMO that's little bit misleading, user might want to put it
 into the command line after reading it. Maybe call them options or
 configuration points so it is clear they are not command line arguments?
 OK. What about adding the following description into OPTIONS section in
 the man page?
 {{{
 The verbose mode is only supported in the command-line option of
 b10-stats-httpd. You can set the parameters like addresses and ports in
 HTTP
 into the specification file (stats-httpd.spec). Please see "CONFIGURATION
 AND
 COMMANDS" section below for details.
 }}}
 >  * The {{{BASE_LOCATION = BASE_LOCATION.replace("${datarootdir}",
 DATAROOTDIR).replace("${prefix}", PREFIX)}}} is little bit suspicious. Can
 the `BASE_LOCATION` really contain strings like '${prefix}'? Where do they
 come from?
 Yes. In my environment, BASE_LOCATION is "${datarootdir}/bind10-devel",
 DATAROOTDIR is "${prefix}/share" and PREFIX is "/home/kambe/bind10", then
 ${datarootdir} in BASE_LOCATION is replaced with DATAROOTDIR and ${prefix}
 in DATAROOTDIR is replaced with "/home/kambe/bind10". That is,
 BASE_LOCATION contains ${datarootdir} and ${datarootdir} contains
 ${prefix}. Is that correct?
 >  * You seem to use {{{dict(key=value)}}} consistently trough the code.
 Is there any special reason about that? I think the use of
 {{{{'key':value}}}} is more common in the python world.
 I don't have so special reason. The way of dict method is more intuitive
 for me because we don't need to bracket the key name with quotations. If
 it is the bad manner, I will correct through the all codes.
 >  * This might be out of the scope of this task, but is there a way to
 ask for only part of the data? Like there is for the stats module?
 Yes, there is a way, but it is not prepared for the HTTP server. It is
 only prepared for the stats daemon. You can invoke the show command with
 the argument in bindctl like this. Is the same way needed for the HTTP
 server?
 {{{
 > Stats show auth.queries.ucp
 {
     "auth.queries.tcp": 1
 }
 }}}
 >  * Why is this limitation in place?:
 >    {{{#!python
 >    # The size of template file should be under 100k. TODO: should be
 >    # considered later.
 >    MAX_SIZE_OF_TEMPLATE = 102400
 >    }}}
 >    Allowing arbitrary size should be no problem, since the file is local
 one and the only one able to change it is the administrator. Why stop him?
 Reading the whole file by single call without the limitation should be no
 problem in python (the file object should have something like
 {{{readAll()}}} or similar).
 I don't have so specific reason for it. But I cannot expect the situation
 which so huge file is read-in. For the unexpected situation, I set such
 limitation. In almost cases an administrator dosen't need to change this
 limitation. Do we need to remove it?
 >  * There is number of `assert` calls trough the code. While this is
 generally good thing, asserts for given types of things (eg. {{{assert
 isinstance(self.server, HttpServer)}}}) is AFAIK explicitly not
 recommended by python guidelines. The strength of python is that you can
 put whatever object you like into it as long as it has the needed methods.
 So, such assert actually make the code reuse harder, because every user
 needs to inherit of the given checked base class. Example of where it
 might make trouble are mock classes in testing.
 That's right. I will remove assertions like the following in the all
 codes. Is that okay?
 {{{#!python
 assert isinstance(xxx, yyy)
 assert type(xxx) is yyy
 }}}
 >  * The server redirects all unknown addresses to the one address for the
 data. That's quite surprising behaviour, wouldn't it be better to redirect
 only things like `/` and `index.html`, but leave
 `/something_that_does_not_exist` to actually return 404? If the user took
 the work to type some address, he had some intention and we should tell
 him we don't know the answer for him instead of giving him something else.
 That's right, I will remove the codes for redirection.
 >  * Why does it subscribe to the stats' messages? It does nothing when it
 receives such message and it asks for the statistics on demand when they
 are needed. I guess nothing would happen if it didn't subscribe to it,
 simplifying the code.
 OK. I will remove the codes to subscribe/unsubscribe.
 >  * You sort out duplicate addresses by some set. But if they are
 problem, shouldn't a configuration with duplicate addresses be rejected
 instead of silently modifying it?
 OK. I will remove the code to sort addresses and ports.
 >  * If you want to raise the original exception:
 >    {{{#!python
 >    except select.error as err:
 >        if err.args[0] == errno.EINTR:
 >        (rfd, wfd, xfd) = ([], [], [])
 >    else:
 >        raise select.error(err)
 >    }}}
 >    You might want to use just raise without any parameters. That raises
 the exact instance of exception as it caught in the except, preserving all
 other information (like errno, which your way doesn't) and the backtrace
 is more helpful (it doesn't show the note about exception being raised
 while handling another exception).
 OK. I will replace with another exception instead of the original
 exception(select.error).

 >  * You use select around the httpd object and when a socket is readable,
 you call it. But, does it use non-blocking sockets? If there's a client
 that sends `GET url` and gets stuck before sending the rest of headers,
 will our program get stuck as well, trying to read until the end of the
 headers?
 The HTTP server can handle only one request at a time. If the server is
 trying to read, it will be timed out and the client will be reset later. I
 think this interface is not so busy like a DNS server, because I think
 this feature is for administration use only. Is that correct?
 >  * Aren't you sending answer to answer? I guess only commands should
 have answers:
 >  {{{#!python
 >  try:
 >        seq = self.cc_session.group_sendmsg(
 >            isc.config.ccsession.create_command('show'),
 >            self.stats_module_name)
 >        (answer, env) = self.cc_session.group_recvmsg(False, seq)
 >        if answer:
 >            (rcode, value) = isc.config.ccsession.parse_answer(answer)
 >            self.cc_session.group_reply(
 >                env, isc.config.ccsession.create_answer(0))
 >  }}}
 That's right, this code to reply is unnecessary. I'll remove it.
 >  * You check that the resulting list of statistics is not empty
 ({{{assert len(xml_list) > 0}}}). However, it should be empty in case you
 get empty statistics list from the stats module. It is true that there
 should be some statistics from it, but is it really illegal? And should we
 crash for assertion in that case?
 I think xml_list cannot be empty as long as it gets along with the stats
 daemon, but we consider the case that xml_list become empty as you
 mentioned.  I will remove it.
 >  * The process will return code 0 (success) even when it catches an
 exception in main and terminates. That should generate an error.
 OK, it will exit with non-zero status code when it catches any exception
 in the main routine.
 >  * You seem to use regexps in the tests when you really think
 substrings:
 >  {{{#!python
 >  self.assertRegexpMatches(handler.response.body,
 stats_httpd.XSD_NAMESPACE)
 >  self.assertRegexpMatches(handler.response.body,
 stats_httpd.XSD_URL_PATH)
 >  }}}
 >  The paths aren't valid regexps, they are just string. Furthermore, if
 they contain regexp metacharacters, the test would break.
 That's right. I'll use find method in string object for it. What about
 this?
 {{{#!python
 self.assertTrue(handler.response.body.find(stats_httpd.XSD_NAMESPACE)>0)
 self.assertTrue(handler.response.body.find(stats_httpd.XSD_URL_PATH)>0)
 }}}
 >  * Isn't there a spelling error in `_clear_ques`? Shouldn't it be
 `_clear_queues` (used consistently with the mock ccsession).
 OK. I'll correct it with the proper spelling(`_clear_queues`).
 >  * There's a class containing tests only in setUp. That's counter-
 intuitive and might generate wrong error messages when the test actually
 fail. It should be moved to some test_something method (the setUp method
 is optional).
 OK. I'll use test_xxx method instead of setUp method in the unittest class
 which has only setUp method.
 >  * You test only the error condition in `test_open_template`, but not
 the successful run.
 I think the successful condition of `test_open_template` is tested in
 other test cases, but I'll add the successful one into
 `test_open_template` too.
 >  * Can this assert fail under any circumstances?
 >  {{{#!python
 >  d = dict(_UNKNOWN_KEY_=None)
 >  assert type(d) is dict
 >  }}}
 No, but I'll remove these assertions as I mentioned above.
 >  * In `test_config`, you call it with various configs and check that it
 doesn't crash and it returns the correct answer. But shouldn't there be a
 check it did anything „internal“, like it sets any value inside?
 OK. I will add tests to check internal values. What about this?
 {{{#!python
   self.assertEqual(
       self.stats_httpd.config_handler(
                   dict(listen_on=[dict(address="::2",port=8000)])),
       isc.config.ccsession.create_answer(0))
   self.assertTrue(self.stats_httpd.config in "listen_on")
   self.assertTrue(self.stats_httpd.config["listen_on"] in "address")
   self.assertTrue(self.stats_httpd.config["listen_on"] in "port")
   self.assertTrue(self.stats_httpd.config["listen_on"]["address"] ==
 "::2")
   self.assertTrue(self.stats_httpd.config["listen_on"]["port"] == 8000)
 }}}
 >  * What is tested in `test_no_buildpath`? How can the test fail, if
 there's no self.assert* call? The same with `test_osenv` (probably older
 code, but I saw it in the diff as well).
 There aren't so specific reason. I intended to let it go through the
 following code in stats_http.py. It will fail if it has errors in syntax.
 Do we need to remove `test_no_buildpath`?
 {{{#!python
 else:
     PREFIX = "@prefix@"
     DATAROOTDIR = "@datarootdir@"
     BASE_LOCATION = "@datadir@" + os.sep + "@PACKAGE@"
     BASE_LOCATION = BASE_LOCATION.replace("${datarootdir}",
 DATAROOTDIR).replace("${prefix}", PREFIX)
 SPECFILE_LOCATION = BASE_LOCATION + os.sep + "stats-httpd.spec"
 STATS_SPECFILE_LOCATION = BASE_LOCATION + os.sep + "stats.spec"
 XML_TEMPLATE_LOCATION = BASE_LOCATION + os.sep + "stats-httpd-xml.tpl"
 XSD_TEMPLATE_LOCATION = BASE_LOCATION + os.sep + "stats-httpd-xsd.tpl"
 XSL_TEMPLATE_LOCATION = BASE_LOCATION + os.sep + "stats-httpd-xsl.tpl"
 }}}
 >  * You modified the `dequeue` method in the mock ccsession. But why is
 there the `seq` parameter now? It seems to be unused.
 Yes, it isn't necessary. I'll remove it.
 >  * The get_value seems suspicious. It returns either some spec file
 default value or default value of a type. But is it able to return any
 actual configured value? Or why isn't it needed?
 No, it is not able to return any actual value. The purpose of this
 get_value is only for getting the default value in the spec file, because
 this method is called only when stats_http is loading the spec file at
 first.  But the behavior of this method is unlike to the behavior of the
 original method in the original class. So what about switching the name of
 it to `get_default_value`?
 >  * The mock select has some hidden state which can make it quite
 surprising, if someone tried to reuse it in some other tests or wrote
 another test here that used it. Shouldn't the variable be exported and
 explicitly set in the test using it?
 I think these mock classes (select.py, socket.py, etc) are not reusable
 for other tests, the codes should be more coded if they need to be, and I
 think these tasks are out of scope of this ticket. Should they be
 reusable?

-- 
Ticket URL: <http://bind10.isc.org/ticket/547#comment:9>
BIND 10 Development <http://bind10.isc.org>
BIND 10 Development