hash buckets
Ted Lemon
Ted.Lemon at iengines.com
Tue Jan 4 23:30:14 UTC 2000
> Meaning a number of hash buckets as hight as 10k right?
Right.
> > I'd have to think about it before recommending that you go higher.
>
> Than 10k hash buckets or 97. I am pretty sure you mean 10k.
10k.
> Can I ask why? Does the hashing algorithm (which I have not studied at any
> length yet) work better on prime numbers?
It's superstition. I read somewhere once that it was a good idea.
I have no idea whether or not it makes sense mathematically, but
intuitively it seems sensible. The hash algorithm could probably use
some tuning if you're feeling like being scientific about it... :')
> I was wondering about the feasibility of the server dynamically deciding on
> a good number for hash buckets at (say) startup. I cannot think of a way
> that does not include a double parsing or processing at least, of the
> configuration details though.
Given that leases are the biggest database being hashed, it would
probably make sense to just count how many leases are created during
the parsing of the configuration file, and then resize all the hashes
based on that. Each hash table can have its own size, so you could
be even fancier and just have hash_add notice when the table was
getting full and make it bigger. You don't need to reparse - just
create a new hash structure with more buckets and re-hash what's in
the old structure into the new.
> And I guess one last thing: IYO, is there much benefit to be gained by
> reducing the number of items in a bucket from say 50-100 down to 5-10 or
> less? Is a decent enough time spent in the server dealing with the hashes
> to make more buckets a decent win?
It's probably reasonably winning at high volumes, yes. If you have a
hundred entries in a bucket, it's probably taking about the same
amount of time to find a given entry as it is to do the rest of the
processing for that entry. If you have ten entries in a bucket, it's
probably not worth the effort, because theoretically we only do a hash
lookup between one and three times for each packet processed, and
there are other things in packet processing that can be equally
time-consuming.
_MelloN_
More information about the dhcp-hackers
mailing list