[bind10-dev] comment to StatsModule 2010/6/1

Kazunori Fujiwara fujiwara at wide.ad.jp
Tue Jun 1 14:49:51 UTC 2010


I will comment on the "StatsModule" idea.

http://bind10.isc.org/wiki/StatsModule

|+------ Sending modules -----+
|| +------+ +------+ +------+ |
|| | Boss | | Auth | | etc. | | <- *1
|| +------+ +------+ +------+ |
|+-----^--------^--------^----+
|      |        |        |
|      +--[CC protocol]--+
|               |               <- *2
|               v
|        +--------------+  
|        |    Stats     |       <- *3
|        +--------------+ 
|               |               <- *4
|               v 
|      +-----------------+
|      |    Cmd-Ctrld    |      <- *5
|      +-----------------+
|
|*1 Modules except Boss and Auth, which send stats data to stats module,
|   is not supported in initial version of stats module

No, it has nothing to do with the statistics specification.

|== Procedure of stats module ==
|=== Basic procedure ===
| * Initial process:
|   0. Boss starts stats daemon and other modules.
| * Main process in loop:

First, the "statistics module" contacts the config manager.

Configuration changes and commands from bindctl come from the config
manager.

|   1. Stats starts to subscribe in stats channel.
|   1. Other modules send stats data to stats module periodically.
|   1. Stats module collects data and then aggregates it.
|   1. When print_stats command is invoked via bindctl, stats daemon
|      reports formatted statistics data via bindctl.
| * Final process:
|   X. When Boss is shutting down, stats module and other modules are
|      killed.

|== Collecting items ==
|Stats module collects following items from Boss and Auth.
| * In general (for both modules)
|   * version -- A version number of this stats data definition

version is not necessary in the protocol because the version number
will be written in the *.spec configuration file.

|   * module -- A module name which sends the  stats data
|   * process_id -- A process id of the module

process_id is not used in any other part of BIND 10.
So the local name defined in msgq may be better.

|   * processes -- A number of processes of the same module, if
|     multiple processes of the module are running

Then "process_id" may be a list of processes.

|   * send_time -- Milli-seconds of current time since epoch time
|     (1970-01-01T00:00:00Z)

Why milliseconds?
I prefer the unixtime + microseconds (struct timeval) format.

What are the "T" and "Z" characters?
A printable text format is hard to parse.
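
To make the comparison concrete, here is a minimal Python sketch of the
two representations. The function names are hypothetical and not part of
BIND 10. It assumes the quoted text format is always
"YYYY-MM-DDTHH:MM:SSZ" in UTC, as in the wiki example.

```python
import time
from datetime import datetime, timezone

def timeval_now():
    """Return (seconds, microseconds) since the epoch, like struct timeval."""
    ns = time.time_ns()
    return ns // 1_000_000_000, (ns % 1_000_000_000) // 1000

def timeval_from_text(text):
    """Convert 'YYYY-MM-DDTHH:MM:SSZ' (assumed UTC) to (seconds, microseconds)."""
    dt = datetime.strptime(text, "%Y-%m-%dT%H:%M:%SZ")
    dt = dt.replace(tzinfo=timezone.utc)
    # The text format carries no sub-second part, so microseconds is 0.
    return int(dt.timestamp()), 0
```

With the numeric pair, a receiver can compare or subtract timestamps
directly. The text form needs a parsing step like timeval_from_text()
first.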

|   * sequence -- A sequence number which must be unique and consistent
|     in the sending module

Is this necessary?

| * For Boss module
|   * boot_time -- A date time when BIND 10 starts up, format is
|     YYYY-MM-DDTHH:MM:SSZ
| * For Auth module
|   * queries_in [[BR]]
|     * tcp -- A number of query counts per a process which Auth servers receives in
|       TCP since it sends last
|     * udp -- A number of query counts per a process which Auth servers receives in
|       UDP since it sends last

The CC module has good counters.

|== Reporting items ==
|Stats module reports following items via bindctl.

This format will be generated by parsing each *.spec file.

| * Local name -- A localname, which is returned from msgq module in CC protocol

A local name is assigned to each process.

| * Boot time  -- A date time when BIND 10 process starts
| * Reported time -- A date time when stats module reports
| * Process id    -- Process ids of all related modules
| * Incoming Queries (TCP) -- A calculated query counts by stats module
| * Incoming Queries (UDP) -- A calculated query counts by stats module
|
|This is an example of output image via bindctl.
|{{{
|    ++ BIND 10 Statistics Report ++
|    Local name: 4bea7903_4 at host
|    Boot time:  2010-05-13T05:19:43Z
|    Report time:      2010-05-13T05:44:41Z
|    Process id(Boss):         777
|    Process id(Auth):         888
|    Process id(Stats):         999
|    Incoming Queries (TCP):     8888
|    Incoming Queries (UDP):     9999
|    ++ BIND 10 Statistics Report ++
|}}}

The output format requires additional knowledge.
The statistics module only knows the input data format.
Otherwise, we must define an output data format.

My idea was:

* BIND 10 Statistics report
  Report time: ...

  bind10.LocalName: xxxx at localhost
  bind10.BootTime: ...

  auth.LocalName: xxx
  auth.queries.tcp: ...
  auth.queries.udp: ...
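
A report in this dotted style could be produced without a separate
output format definition by walking the nested data. The following is
only a sketch in Python; flatten() is a hypothetical helper, and the
input dictionary is an assumed shape, not a defined BIND 10 structure.

```python
def flatten(data, prefix=""):
    """Turn nested dicts into 'a.b.c: value' lines, in insertion order."""
    lines = []
    for key, value in data.items():
        name = f"{prefix}.{key}" if prefix else key
        if isinstance(value, dict):
            # Descend into the sub-dictionary, extending the dotted name.
            lines.extend(flatten(value, name))
        else:
            lines.append(f"{name}: {value}")
    return lines

# Assumed example input; the counter values are taken from the wiki sample.
stats = {"auth": {"queries": {"tcp": 123, "udp": 4567}}}
report = flatten(stats)
```

Here report holds one line per counter, each named by its path in the
data.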

|== Available commands in bindctl ==
|Two commands via bindctl are available in initial version of stats
|module.
| * "print_stats" command:
|    Stats module aggregates current numbers and prints the list of
|    them by using formatted text.

The print_stats command may take module name arguments.
print_stats without arguments shows all statistics.
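
As an illustration of that behavior, here is a small Python sketch.
select_stats() is a hypothetical name, and it assumes the stats module
keeps its data in a dictionary keyed by module name.

```python
def select_stats(all_stats, modules=()):
    """Return stats for the named modules, or for all modules if none given."""
    if not modules:
        return dict(all_stats)
    # Unknown module names are skipped here; reporting an error instead
    # would be an equally valid choice.
    return {name: all_stats[name] for name in modules if name in all_stats}
```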

| * "print_clear" command:

Typo. It should be "clear_statistics".

|    Stats module resets query counts to zero. If this command is
|    invoked, then at first 'Are you sure?' prompt to confirm it.

The command may also have module name arguments.

|== Backend DB for stats module ==
|'''(TBD)'''
|A specific DB, like sqlite3 or Berkeley DB, is not used in stats
|module in initial version. It's assumed that stats module keeps
|aggregated data in memory.

It is not defined, I think.

|== Message format ==
|'''Message format from Boss module to Stats:'''
|{{{
|#!js
|{
|  "stats_data":
|    {
|      "General":
|        {
|          "version": "1.0",

It is not necessary.

|          "module": "Auth",

The parameter may be included in the data itself.

|          "process_id": 777,

The localname in the envelope is sufficient.

|          "send_time": "2010-05-13T05:40:41Z",

It may be included in the data itself.

|          "sequence": 2345,

It is not necessary.

|        },
|      "Boss":

Is this a mistake? Is it "Auth:"?
But it contains the module name.
Then the "General" section is not necessary.

|'''Message format from Auth module to Stats:'''
|{{{
|#!js
|{
|  "stats_data":
|    {
|      "General":
|        {
|          "version": "1.0",
|          "module": "Auth",
|          "process_id": 888,
|          "processes": 2,
|          "send_time": "2010-05-13T05:40:41Z",
|          "sequence": 2345,
|        },
|      "Auth":
|        {
|          "queries_in":
|            {
|              "tcp" : 123,
|              "udp": 4567
|            }
|        }
|    }
|}
|}}}

To simplify the format, I propose a new data format.
Add a "timestamp" as a unixtime at the outermost level.
The "from" or local name is obtained from the envelope.
The module name is "Auth", which already exists in the data.

      {
       "timestamp": unixtime,
|      "Auth":
|        {
|          "queries_in":
|            {
|              "tcp" : 123,
|              "udp": 4567
|            }
|        }
      }
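
The simplified format above could be built and read back as in the
following Python sketch. The function names are hypothetical, the
message is assumed to travel as JSON, and how the sender's local name is
taken from the envelope is left out because it depends on the CC
protocol.

```python
import json
import time

def build_stats_message(module, counters, now=None):
    """Build {"timestamp": unixtime, <module>: counters}."""
    timestamp = time.time() if now is None else now
    return {"timestamp": timestamp, module: counters}

def parse_stats_message(text):
    """Split a JSON message into (timestamp, {module: counters})."""
    msg = json.loads(text)
    timestamp = msg.pop("timestamp")
    # Everything left is keyed by module name, e.g. "Auth".
    return timestamp, msg

# The timestamp value here is arbitrary, for illustration only.
message = build_stats_message(
    "Auth", {"queries_in": {"tcp": 123, "udp": 4567}}, now=1273729241)
timestamp, modules = parse_stats_message(json.dumps(message))
```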

|== Data schema ==
|A schema which defines above massage formats, filename of which is configured in
|spec file for stats module.[[BR]]
|'''stats_data_schema.spec:'''
|{{{
|#!js
|{
|  "stats_data":
|    {
|      "description": "A schema for BIND 10 stats data definitions \
|      		      using JSON schema syntax (http://json-schema.org/)",
|      "type": "object",
               ~~~~~~~~
                 I prefer "dict".

The "stats_data" schema is defined for each module.
I would prefer it to be written in each module's *.spec file.

I would prefer the new spec file format to be:
  { "module_spec": { ... },
    "commands": { ... },
    "stats_spec": { ... }
  }

The "stats_spec" format should be compatible with the "module_spec" and
"commands" formats.

-- 
Kazunori Fujiwara


