[bind10-dev] Rate limiting for logging?

Michal 'vorner' Vaner michal.vaner at nic.cz
Mon Jul 15 07:04:30 UTC 2013


Hello

On Fri, Jul 12, 2013 at 04:55:27PM +0200, Shane Kerr wrote:
> It reminded me of a suggestion that Stephen made at the last
> face-to-face meeting, which is that rather than being super-careful
> about logging levels and restricting the amount of information that we
> put out we should solve the *actual* problem, which is a chance of
> getting too many logging messages by adding rate limiting to logging.
> 
> Mukund and I thought that whatever we come up with should probably be
> pushed upstream into the log4cplus library itself.

I'm not completely sure about that (the pushing upstream).

The problem, as I see it, is if the logs get spammed by one or two message
types. In that case I'm likely to overlook something important.

On the other hand, I don't want to stop logging completely just because I logged
300 messages in the last second. That would mean I'd lose important things too.

Obviously, setting the limit per message level does not get us anywhere (we
would still fight over whether a given message belongs at a given level or
whether it might be logged too many times).

Therefore it seems to me we want to limit the rate based on the message itself
(each message may be logged at most 20 times per second, but hitting this limit
won't stop other messages from being logged).


If we wanted to solve this problem at the log4cplus level, we would need to
somehow detect that two messages are the same, which will be hard, because they
don't have to be exactly the same (they may carry different parameters, for
example a slightly different domain name or source port of the query). So we
need some kind of detection of similar messages and we need to hash them
somehow. I wouldn't be ashamed to assign such a task as a whole-year homework in
some algorithms & data structures class, and I wouldn't be completely sure I'd
get a solution that works fast.

Or we may solve it at the level of message IDs. Each message ID gets two new items:
• Counter for number of times being logged.
• Time at which the counter started.

And we add a few more rules around the logging:
• If the time is older than a second, reset the time to now() and the counter to
  0.
• Increase the count by 1.
• If the count is equal to the limit, log a warning message saying that we hit
  the limit and that not all messages are being logged. Don't log the original
  message.
• If the count is higher than the limit, don't log the message at all.
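
The rules above could be sketched roughly as follows. This is only an
illustration in Python, not existing BIND 10 or log4cplus code: the class name
MessageRateLimiter, the check() method, its "log"/"warn"/"drop" return values
and the injectable clock are all made up for the example, and the message IDs
in the usage note are placeholders.

```python
import time
from dataclasses import dataclass


@dataclass
class _Window:
    started: float  # time at which the counter started
    count: int = 0  # number of logging attempts since `started`


class MessageRateLimiter:
    """Per-message-ID rate limiter (hypothetical helper, not a real API)."""

    def __init__(self, limit=20, interval=1.0, clock=time.monotonic):
        self.limit = limit        # attempts per interval before suppression
        self.interval = interval  # window length in seconds
        self.clock = clock        # injectable so the logic can be tested
        self._windows = {}        # message ID -> _Window

    def check(self, message_id):
        """Decide what to do with one attempt to log `message_id`.

        Returns "log" (emit the message), "warn" (emit the limit-hit
        warning instead of the message) or "drop" (emit nothing).
        """
        now = self.clock()
        window = self._windows.get(message_id)
        if window is None:
            window = self._windows[message_id] = _Window(started=now)
        elif now - window.started > self.interval:
            # The window is older than the interval: start a new one.
            window.started = now
            window.count = 0
        window.count += 1
        if window.count < self.limit:
            return "log"
        if window.count == self.limit:
            return "warn"
        return "drop"
```

A caller would run limiter.check("SOME_MESSAGE_ID") before formatting the
message and act on the result. Because the state is kept per ID, a flood of one
message never suppresses another one. Note that, following the rules literally,
the attempt that reaches the limit is replaced by the warning, so with a limit
of 20 only 19 copies of the original message appear in each window.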


It may be that I don't see a simple way of solving it in log4cplus itself, but I
think that the second way is much easier and therefore the way to go forward.
The problem with the second approach is that log4cplus doesn't know our message
IDs.

With regards

-- 
No, I will not change your lightbulb.

Michal 'vorner' Vaner