64-bit bugs in new ovdb

Dan Riley dsr at mail.lns.cornell.edu
Thu Oct 5 19:20:04 UTC 2000

Heath Kehoe <hakehoe at arthur.avalon.net> writes:
>> struct datakey {
>>     group_id_t groupnum;
>>     ARTNUM artnum;
>> };
>> On the alpha, ARTNUM is unsigned long (64 bits) and group_id_t is
>> defined by ovdb as u_int32_t.  By the common rules of padding C
>> structs, the compiler adds an extra 32 bit word of padding after
>> groupnum to 64-bit align artnum.  Since BerkeleyDB apparently treats
>> the datakey as a chunk of memory sizeof(datakey), that padding *must*
>> be zeroed (or at least set to a consistent value).  The enclosed patch
>> adds the appropriate memsets, and appears to fix the obvious problem:
>Hi, thanks for your report.  I hadn't considered the padding issue.
>Are alphas big-endian?  

Little endian (well, they can be either, and I believe the Cray T3E
operates in big endian mode--do we have to worry about INN running on
the T3E? :-).

>There's another potential problem, in that htonl() and ntohl() macros
>are used on ARTNUM datatypes.  Are those macros safe to use on 64-bit

A network long is 32 bits, so htonl and ntohl are going to break for
article numbers that don't fit in 32 bits--probably by losing the high
bits (in fact, htonl and ntohl were my first suspects, but I eventually
convinced myself that they aren't causing trouble yet).  Other things
will probably break as well--for example,

void *OVopensearch(char *group, int low, int high);

probably ought to be

void *OVopensearch(char *group, ARTNUM low, ARTNUM high);

It seemed like an academic issue--my active file will need to be
regenerated long before article numbers get that large--so I didn't
worry about it (despite being an academic...).
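
To make the truncation concrete, here is a minimal sketch.  It does not
call htonl itself (how htonl treats a 64-bit argument depends on the
platform's declaration); low32() only models the usual outcome, where
the value is narrowed to 32 bits first.  The names artnum64_t, low32,
put_be64 and get_be64 are made up for illustration and are not part of
INN or ovdb.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a 64-bit ARTNUM, as on the alpha. */
typedef uint64_t artnum64_t;

/* Models a 64-bit value being narrowed to a 32-bit network long:
 * only the low 32 bits survive. */
static uint32_t low32(artnum64_t n)
{
    return (uint32_t)(n & 0xffffffffu);
}

/* Store n in out[0..7], most significant byte first, independent of
 * host byte order and of the width of long. */
static void put_be64(unsigned char *out, artnum64_t n)
{
    int i;

    assert(out != NULL);
    for (i = 7; i >= 0; i--) {
        out[i] = (unsigned char)(n & 0xff);
        n >>= 8;
    }
}

/* Inverse of put_be64: rebuild the value from 8 big-endian bytes. */
static artnum64_t get_be64(const unsigned char *in)
{
    artnum64_t n = 0;
    int i;

    assert(in != NULL);
    for (i = 0; i < 8; i++)
        n = (n << 8) | in[i];
    return n;
}
```

With these, an article number of 2^32 + 5 comes back from low32() as 5,
while a put_be64()/get_be64() round trip preserves all 64 bits.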

>If not, I'll have to change the artnum's so that they are u_int32_t.
>If there are two 32-bit ints, there won't be padding between them,
>right? (on a 64-bit system, I mean).

There won't on the alpha, and I'd expect that to be true on most other
64-bit systems, but there's nothing that guarantees that--the compiler
is allowed to insert padding anywhere except the beginning of a
struct.  When you're passing a struct that is going to be treated as a
chunk of memory by the callee, you generally have to zero it first if
you want to be safe.  The alternative is to pack it yourself into an
array of char, which is a pain, but is the only choice if the
resulting database needs to be portable across architectures.
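
Both options can be sketched roughly as follows.  The struct and
typedefs mirror the ones quoted above; datakey_fill, datakey_pack and
PACKED_KEY_LEN are hypothetical names for illustration, not ovdb's
actual code or the patch that was posted.  The 12-byte big-endian
layout is just one possible choice.

```c
#include <stdint.h>
#include <string.h>

typedef uint32_t group_id_t;    /* as ovdb defines it */
typedef unsigned long ARTNUM;   /* 64 bits on the alpha */

struct datakey {
    group_id_t groupnum;
    ARTNUM artnum;
};

/* Option 1: zero the whole struct, padding included, before setting
 * the fields, so the bytes handed to the database are consistent.
 * This works in practice but the layout is still machine-specific. */
static void datakey_fill(struct datakey *dk, group_id_t g, ARTNUM a)
{
    memset(dk, 0, sizeof *dk);
    dk->groupnum = g;
    dk->artnum = a;
}

/* Option 2: pack the fields by hand into a fixed-size byte array,
 * 4 bytes of group number then 8 bytes of article number, each most
 * significant byte first.  No padding, no byte-order dependence. */
#define PACKED_KEY_LEN 12

static void datakey_pack(unsigned char out[PACKED_KEY_LEN],
                         group_id_t g, ARTNUM a)
{
    uint64_t a64 = a;   /* widen so 32-bit longs pack the same way */
    int i;

    for (i = 3; i >= 0; i--) {
        out[i] = (unsigned char)(g & 0xff);
        g >>= 8;
    }
    for (i = 11; i >= 4; i--) {
        out[i] = (unsigned char)(a64 & 0xff);
        a64 >>= 8;
    }
}
```

The key passed to the database would then be either the struct with
sizeof(struct datakey), or the char array with PACKED_KEY_LEN; only the
second gives the same bytes on every architecture.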

Dan Riley                                         dsr at mail.lns.cornell.edu
Wilson Lab, Cornell University      <URL:http://www.lns.cornell.edu/~dsr/>
    "History teaches us that days like this are best spent in bed"
