hash buckets
Brian J. Murrell
brian_murrell at ssd.bctel.net
Wed Jan 5 00:01:52 UTC 2000
from the quill of Ted Lemon <mellon at isc.org> on scroll
<200001042330.SAA09897 at grosse.manhattan.fugue.com>
>
> > Meaning a number of hash buckets as high as 10k, right?
>
> Right.
>
> > > I'd have to think about it before recommending that you go higher.
> >
> > Higher than 10k hash buckets, or than 97? I am pretty sure you mean 10k.
>
> 10k.
Cool. I will look into upping the number of buckets when I can more
closely determine how many leases we do keep outstanding.
> It's superstition. I read somewhere once that it was a good idea.
Ahhhh, interesting, because the idea of the bucket count being a prime number
is not all that foreign to me.
> I have no idea whether or not it makes sense mathematically, but
> intuitively it seems sensible. The hash algorithm could probably use
> some tuning if you're feeling like being scientific about it... :')
Maybe, though likely not. I am not particularly a math genius, nor do I
enjoy it much.
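That said, my loose understanding of the prime-bucket argument, sketched in
made-up C rather than dhcpd's actual hash code (the names here are purely
illustrative): with a prime modulus, keys that share a common factor with the
table size still spread across all the buckets instead of piling into a few.

    #include <stddef.h>

    #define HASH_BUCKETS 97       /* prime, like the current default */

    /* Illustrative only -- not the server's real hash function. */
    static size_t
    bucket_index (const unsigned char *key, size_t len)
    {
            size_t h = 0, i;

            for (i = 0; i < len; i++)
                    h = h * 31 + key [i];   /* simple multiplicative hash */
            return h % HASH_BUCKETS;        /* prime modulus spreads clustered keys */
    }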
> Given that leases are the biggest database being hashed, it would
> probably make sense to just count how many leases are created during
> the parsing of the configuration file, and then resize all the hashes
> based on that.
Oh man! You do throw me a looper every now and then, Ted. I was thinking
of a hashing API that would resize a table when it determined the load was
too high, but thought you would skin me for suggesting it. :-)
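Just to make the lease-counting idea concrete, this is roughly what I
picture -- purely hypothetical, nothing that exists in the tree today:
count the leases while parsing, then derive a bucket count from that.

    /* Hypothetical sketch: after the parse has counted the leases, pick
     * a bucket count that keeps the average chain short.  Not real
     * dhcpd code -- just the shape of the idea. */
    static unsigned
    choose_bucket_count (unsigned lease_count)
    {
            unsigned n = lease_count / 4 + 1;   /* aim for ~4 entries per bucket */

            if (!(n & 1))       /* round up to odd; a prime would be even
                                   better, superstition permitting */
                    n++;
            return n < 97 ? 97 : n;             /* never go below the default */
    }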
> You don't need to reparse - just
> create a new hash structure with more buckets and re-hash what's in
> the old structure into the new.
That was exactly what I was thinking about. Would have to be careful not
to resize too often and chew up your savings, though. This would make an
interesting case study in C++ (or any OO language, I guess) class design and
writing, methinks: a hashing class that manages itself to be efficient.
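Off the top of my head, the resize itself might look roughly like the
following -- hypothetical names, not the server's real hash API: grow only
when the average chain length crosses a threshold, then re-hash the existing
entries into the bigger bucket array.

    #include <stdlib.h>

    /* Hypothetical sketch, not dhcpd's hash code.  Entries are chained
     * off each bucket; when the load factor gets too high, allocate a
     * bigger bucket array and re-hash the old entries into it -- no
     * re-parse of the lease file needed. */
    struct entry {
            unsigned long key;
            void *value;
            struct entry *next;
    };

    struct hash_table {
            struct entry **buckets;
            size_t nbuckets;
            size_t nentries;
    };

    #define MAX_LOAD 4  /* grow when chains average more than 4 entries */

    static int
    maybe_grow (struct hash_table *t)
    {
            size_t newsize, i;
            struct entry **newbuckets, *e, *next;

            if (t->nentries <= t->nbuckets * MAX_LOAD)
                    return 0;           /* load is fine; do nothing */

            newsize = t->nbuckets * 2 + 1;
            newbuckets = calloc (newsize, sizeof *newbuckets);
            if (!newbuckets)
                    return -1;          /* keep the old table on failure */

            /* Re-hash every existing entry into the new bucket array. */
            for (i = 0; i < t->nbuckets; i++) {
                    for (e = t->buckets [i]; e; e = next) {
                            size_t b = e->key % newsize;
                            next = e->next;
                            e->next = newbuckets [b];
                            newbuckets [b] = e;
                    }
            }
            free (t->buckets);
            t->buckets = newbuckets;
            t->nbuckets = newsize;
            return 0;
    }

Doubling (plus one, to stay odd) keeps the number of resizes logarithmic in
the lease count, so the re-hashing cost should not chew up the savings.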
> It's probably reasonably winning at high volumes, yes.
Sounds like me. :-)
> If you have a
> hundred entries in a bucket, it's probably taking about the same
> amount of time to find a given entry as it is to do the rest of the
> processing for that entry.
Hmmmmm.
> If you have ten entries in a bucket, it's
> probably not worth the effort, because theoretically we only do a hash
> lookup between one and three times for each packet processed, and
> there are other things in packet processing that can be equally
> time-consuming.
Ahhhhh, cool. It was this analysis that I was looking for. Another
interesting case study would to be putting timers into the server like Joe
Grecco's (I hope I have the reference correct) INND timers. I did this
with my SPF packet filter code and it was quite educational to discover
where all the time was being spent. I realize you can do this with
"profiling", but it's more interesting to see it based on roles in the code
rather than broken down to functions.
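What I mean by role-based timers is roughly this sort of thing -- a rough
sketch, nothing that exists in the server: bracket each logical role with
start/stop calls, accumulate the elapsed time, and dump the totals now and
then.

    #include <stdio.h>
    #include <sys/time.h>

    /* Rough sketch of role-based timers (not existing dhcpd code); the
     * role names are whatever you decide to measure. */
    #define NROLES 3
    static const char *role_names [NROLES] = {
            "hash lookup", "lease write", "icmp check"
    };
    static double role_total [NROLES];
    static struct timeval role_start [NROLES];

    static void
    timer_start (int role)
    {
            gettimeofday (&role_start [role], NULL);
    }

    static void
    timer_stop (int role)
    {
            struct timeval now;

            gettimeofday (&now, NULL);
            role_total [role] += (now.tv_sec - role_start [role].tv_sec)
                    + (now.tv_usec - role_start [role].tv_usec) / 1e6;
    }

    static void
    timer_report (void)
    {
            int i;

            for (i = 0; i < NROLES; i++)
                    fprintf (stderr, "%-12s %10.3f s\n",
                             role_names [i], role_total [i]);
    }

Wrapping, say, the lease-lookup path in timer_start/timer_stop and calling
timer_report every few thousand packets would show where the time goes by
role rather than by function.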
b.
--
Brian J. Murrell Brian_Murrell at ssd.bctel.net
BCTel Advanced Communications 604 454 5279
Vancouver, B.C.