BIND 10 #547: Update of statistics daemon for HTTP/XML reporting: implement
BIND 10 Development
do-not-reply at isc.org
Sat Apr 2 18:32:25 UTC 2011
#547: Update of statistics daemon for HTTP/XML reporting: implement
-------------------------------------+-------------------------------------
                 Reporter:  naokikambe  |                Owner:  naokikambe
                     Type:  enhancement |               Status:  reviewing
                 Priority:  major       |            Milestone:  Sprint-20110405
                Component:  statistics  |           Resolution:
                 Keywords:              |            Sensitive:  0
Estimated Number of Hours:  40.0        |  Add Hours to Ticket:  0
                Billable?:  1           |          Total Hours:  0
                Internal?:  0           |
-------------------------------------+-------------------------------------
Changes (by vorner):
* owner: vorner => naokikambe
Comment:
Hello
The problem with size was not about the number of commits. It was about
the amount of changed code that has to be read during the review. I did
go through it, but squashing the whole branch into one commit does not
help, quite the reverse. Splitting into commits might help a lot, but
that expects the commits to have good descriptions and the history to be
clean. (Btw. how did you manage to have some commits 3 times in the
history? Do you by any chance combine rebase with merge? It does nasty
things to the history.) Anyway, keep it as it is for now (I tried
cleaning the history up, but it seems it would require quite a lot of
time). Jinmei originally asked if you could say something like „commits
from this point to this one implement the startup and listening and are
independent of the rest“, and so on, so one could read a part, write
comments for it, forget it and start reading the next part ‒ it helps
both from a psychological point of view (it's less tiresome going
through 300 lines of changes, even 10 times, than 3000 lines at once)
and from a practical point of view (while reviewing, I try to remember
all the relations within the new code, and it gets confusing when the
amount gets large).
I fixed some English in the man page, pushed into the -summarised branch.
About the code:
* Every process/module has a separate directory in bin/. Would the
xml stats server deserve its own directory as well?
* Do we expect users to want to know the internal details of how it
works? Like how it is started by the boss, etc. Maybe the man page could
be less detailed in that sense, and just say that it presents the
statistics from the stats module through XML/HTTP?
* The man page talks about parameters when describing which addresses
it listens on. IMO that's a little bit misleading; a user might want to
put them on the command line after reading it. Maybe call them options
or configuration points so it is clear they are not command line
arguments?
* The {{{BASE_LOCATION = BASE_LOCATION.replace("${datarootdir}",
DATAROOTDIR).replace("${prefix}", PREFIX)}}} is a little bit suspicious.
Can the `BASE_LOCATION` really contain strings like '${prefix}'? Where
do they come from?
* You seem to use {{{dict(key=value)}}} consistently through the code.
Is there any special reason for that? I think the use of
{{{{'key':value}}}} is more common in the Python world.
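For illustration, a sketch of what I mean (the keys here are made up,
not taken from your code):

```python
# Both forms build the same dictionary; the literal is the more common
# idiom in Python code.
a = dict(address="127.0.0.1", port=8000)
b = {"address": "127.0.0.1", "port": 8000}
assert a == b

# The literal form also accepts keys that are not valid identifiers,
# which the dict(...) keyword syntax cannot express:
c = {"listen-on": ["127.0.0.1"]}
```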
* This might be out of the scope of this task, but is there a way to ask
for only part of the data? Like there is for the stats module?
* Why is this limitation in place?:
{{{#!python
# The size of template file should be under 100k. TODO: should be
# considered later.
MAX_SIZE_OF_TEMPLATE = 102400
}}}
Allowing an arbitrary size should be no problem, since the file is a
local one and the only one able to change it is the administrator. Why
stop him? Reading the whole file in a single call without the limitation
is no problem in Python (the file object's read() method with no size
argument reads the entire file).
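Something like this, i.e. no size cap at all (the file here is a
throw-away stand-in so the sketch is self-contained):

```python
import os
import tempfile

# Create a small throw-away "template" file for the demonstration.
fd, path = tempfile.mkstemp()
os.write(fd, b'<stats version="1.0"/>')
os.close(fd)

# read() with no size argument returns the whole file in one call,
# so no MAX_SIZE_OF_TEMPLATE limit is needed for a local template.
with open(path) as f:
    template = f.read()
os.remove(path)
```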
* There are a number of `assert` calls through the code. While asserts
are generally a good thing, asserting the type of something (eg.
{{{assert isinstance(self.server, HttpServer)}}}) is AFAIK explicitly
not recommended by Python guidelines. The strength of Python is that you
can pass in whatever object you like as long as it has the needed
methods. So such asserts actually make code reuse harder, because every
user needs to inherit from the given checked base class. An example of
where it might cause trouble are mock classes in testing.
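To sketch the point (all the class and method names here are made up,
not your actual ones):

```python
# The wrapper only ever calls handle_request(), so it should accept any
# object providing that method; an isinstance assert would rule out a
# perfectly usable mock like the one below.
class StatsPageServer:
    def __init__(self, httpd):
        self.httpd = httpd        # note: no isinstance(httpd, ...) check

    def serve_once(self):
        return self.httpd.handle_request()

class MockHttpd:                  # inherits from no HTTP base class
    def handle_request(self):
        return "handled"

# The mock is accepted because it has the needed method.
result = StatsPageServer(MockHttpd()).serve_once()
```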
* The server redirects all unknown addresses to the one address for
the data. That's quite surprising behaviour; wouldn't it be better to
redirect only things like `/` and `index.html`, but leave
`/something_that_does_not_exist` to actually return a 404? If the user
took the trouble to type some address, he had some intention and we
should tell him we don't know the answer instead of giving him something
else.
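The dispatch I have in mind would look roughly like this (the URL path
constant is made up for illustration):

```python
# Only the obvious entry points get the convenience redirect; anything
# else is an honest 404.
XML_URL_PATH = "/bind10/statistics/xml"

def route(path):
    if path in ("/", "/index.html"):
        return (302, XML_URL_PATH)   # redirect the expected entry points
    if path == XML_URL_PATH:
        return (200, None)           # serve the statistics document
    return (404, None)               # unknown address: say so
```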
* Why does it subscribe to the stats module's messages? It does
nothing when it receives such a message, and it asks for the statistics
on demand when they are needed. I guess nothing would happen if it
didn't subscribe, which would simplify the code.
* You sort out duplicate addresses with a set. But if they are a
problem, shouldn't a configuration with duplicate addresses be rejected
instead of being silently modified?
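One possible shape of such a check (the helper name is made up):

```python
# Refuse a configuration that lists the same listen address twice,
# instead of silently collapsing the duplicates with a set.
def check_listen_addresses(addresses):
    seen = set()
    for addr in addresses:          # addr is an (address, port) pair
        if addr in seen:
            raise ValueError("duplicate listen address: %s port %d" % addr)
        seen.add(addr)
    return addresses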
* If you want to re-raise the original exception:
{{{#!python
except select.error as err:
if err.args[0] == errno.EINTR:
(rfd, wfd, xfd) = ([], [], [])
else:
raise select.error(err)
}}}
You might want to use just `raise` without any parameters. That
re-raises the exact exception instance that was caught in the except
clause, preserving all the other information (like errno, which your way
doesn't) and the backtrace is more helpful (it doesn't show the note
about an exception being raised while handling another exception).
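I.e. roughly this (the function wrapper is just for illustration):

```python
import errno
import select

# A bare `raise` re-raises the caught exception object itself, keeping
# err.args (errno included) and the original traceback intact.
def interruptible_select(rlist, wlist, xlist, timeout):
    try:
        return select.select(rlist, wlist, xlist, timeout)
    except select.error as err:
        if err.args[0] == errno.EINTR:
            return ([], [], [])
        raise                      # not: raise select.error(err)
```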
* You use select around the httpd object, and when a socket is
readable, you call it. But does it use non-blocking sockets? If there's
a client that sends `GET url` and gets stuck before sending the rest of
the headers, will our program get stuck as well, trying to read until
the end of the headers?
* Aren't you sending an answer to an answer? I guess only commands
should have answers:
{{{#!python
try:
seq = self.cc_session.group_sendmsg(
isc.config.ccsession.create_command('show'),
self.stats_module_name)
(answer, env) = self.cc_session.group_recvmsg(False, seq)
if answer:
(rcode, value) = isc.config.ccsession.parse_answer(answer)
self.cc_session.group_reply(
env, isc.config.ccsession.create_answer(0))
}}}
* You check that the resulting list of statistics is not empty
({{{assert len(xml_list) > 0}}}). However, it would be empty in case you
get an empty statistics list from the stats module. It is true that
there should be some statistics, but is an empty list really illegal?
And should we crash on an assertion failure in that case?
* The process will return exit code 0 (success) even when it catches
an exception in main and terminates. That should be reported as an
error.
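Something along these lines (the `run` callable stands in for the
daemon's real main loop, it's not a name from your code):

```python
import sys

# Translate an uncaught failure into a non-zero exit status, so the
# boss process can see that the daemon died abnormally.
def main(run):
    try:
        run()
    except Exception as err:
        sys.stderr.write("fatal: %s\n" % err)
        return 1                   # failure must show in the exit code
    return 0

# The entry point would then be: sys.exit(main(real_main))
```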
* You seem to use regexps in the tests when you really mean substrings:
{{{#!python
self.assertRegexpMatches(handler.response.body,
stats_httpd.XSD_NAMESPACE)
self.assertRegexpMatches(handler.response.body, stats_httpd.XSD_URL_PATH)
}}}
The paths aren't valid regexps, they are just strings. Furthermore,
if they contained regexp metacharacters, the test would break.
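A substring assertion states the intent directly and cannot be broken
by metacharacters; for illustration (the response body here is a
made-up stand-in, not your actual test data):

```python
import io
import unittest

class SubstringCheck(unittest.TestCase):
    def test_body_contains_path(self):
        body = '<link href="/bind10/statistics/xsd?v=1.0"/>'
        # '?' and '.' are literal characters for assertIn, while
        # assertRegexpMatches would interpret them as regexp syntax.
        self.assertIn("/bind10/statistics/xsd?v=1.0", body)

# Run the single test case quietly.
result = unittest.TextTestRunner(stream=io.StringIO()).run(
    unittest.TestLoader().loadTestsFromTestCase(SubstringCheck))
```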
* Isn't there a spelling error in `_clear_ques`? Shouldn't it be
`_clear_queues` (used consistently with the mock ccsession)?
* There's a class containing tests only in setUp. That's counter-
intuitive and might generate wrong error messages when the tests
actually fail. The checks should be moved to some test_something method
(the setUp method is optional).
* You test only the error condition in `test_open_template`, but not the
successful run.
* Can this assert fail under any circumstances?
{{{#!python
d = dict(_UNKNOWN_KEY_=None)
assert type(d) is dict
}}}
* In `test_config`, you call it with various configs and check that it
doesn't crash and returns the correct answer. But shouldn't there be a
check that it did something „internal“, like setting a value inside?
* What is tested in `test_no_buildpath`? How can the test fail if
there's no self.assert* call? The same goes for `test_osenv` (probably
older code, but I saw it in the diff as well).
* You modified the `dequeue` method in the mock ccsession. But why is
there the `seq` parameter now? It seems to be unused.
* The get_value seems suspicious. It returns either a spec file
default value or the default value of a type. But is it able to return
any actually configured value? Or why isn't that needed?
* The mock select has some hidden state, which can make it quite
surprising if someone tries to reuse it in other tests or writes
another test here that uses it. Shouldn't the variable be exported and
explicitly set in the test using it?
Anyway, I did read the code, but my brain is too limited to check that
every function is tested, as the diff is so large. Hopefully you
remember: did you write tests for every function, or should I make the
effort, find some piece of paper and write down what is tested by which
test?
Thanks
--
Ticket URL: <http://bind10.isc.org/ticket/547#comment:8>
BIND 10 Development <http://bind10.isc.org>
More information about the bind10-tickets
mailing list