buffindexed: could not open overview

Alex Kiernan alexk at demon.net
Wed Jan 22 10:57:14 UTC 2003


Russ Allbery <rra at stanford.edu> writes:

> Alex Kiernan <alexk at demon.net> writes:
> 
> > I'm almost certain that all of our concurrent file-based locking where
> > we're using mmap is broken on SMP (and it'll depend on your architecture
> > whether it's really broken or just theoretically broken). On SMP the
> > locking primitives for concurrent code use memory barrier operations to
> > ensure that all CPUs have consistent caches. I suspect the lock file
> > stuff we're using doesn't generate the right memory barriers, so we
> > manage to exit critical regions with inconsistent cache views of the
> > mmap()ed regions on different CPUs, which means we get all confused
> > about the state of the bitmaps.
> 
> Hm, that sounds like an operating system bug to me.  Why wouldn't the
> operating system make sure that fcntl works properly with SMP?
> 

I'd thought not, because I thought I'd seen it on Solaris and Linux,
but I'm not sure the Solaris problem wasn't the NFS/mmap/range lock
problem (since at the time I was testing I had some FC-AL disk and a
NetApp F840, and I really can't remember what I tested where).

We really need someone who understands the intimate details of SMP on a
particular CPU, I guess.

I did try to write a test program to show the problem, but I ended up
needing real synchronisation primitives, which I think would change the
conditions too much :(
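
For reference, the access pattern under discussion looks roughly like
the sketch below: a bitmap kept in an mmap()ed file, with an fcntl lock
taken around each access. This is Python rather than INN's C, and the
names (make_bitmap, set_bit, get_bit) and the 64-byte bitmap size are
illustrative assumptions, not buffindexed's actual code. It only shows
the lock/modify/unlock sequence; it does not reproduce the suspected
SMP problem, and the explicit flush is a cautious choice in the sketch
rather than something known to be required.

```python
import fcntl
import mmap
import os
import tempfile

BITMAP_BYTES = 64  # illustrative size only, not buffindexed's real layout


def make_bitmap(path):
    """Create a zero-filled file to back the mmap()ed bitmap."""
    with open(path, "wb") as f:
        f.write(b"\0" * BITMAP_BYTES)


def set_bit(path, bit):
    """Set one bit while holding an exclusive fcntl lock on the file."""
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.lockf(fd, fcntl.LOCK_EX)  # enter critical region (blocks)
        try:
            with mmap.mmap(fd, BITMAP_BYTES) as m:  # shared mapping by default
                m[bit // 8] |= 1 << (bit % 8)
                m.flush()  # push the change out before releasing the lock
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN)  # leave critical region
    finally:
        os.close(fd)


def get_bit(path, bit):
    """Read one bit while holding a shared fcntl lock on the file."""
    fd = os.open(path, os.O_RDWR)
    try:
        fcntl.lockf(fd, fcntl.LOCK_SH)
        try:
            with mmap.mmap(fd, BITMAP_BYTES) as m:
                return bool(m[bit // 8] & (1 << (bit % 8)))
        finally:
            fcntl.lockf(fd, fcntl.LOCK_UN)
    finally:
        os.close(fd)


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "bitmap")
    make_bitmap(path)
    set_bit(path, 10)
    print(get_bit(path, 10), get_bit(path, 11))
```

Running several copies of set_bit() against the same file from separate
processes would be the obvious next step for a reproduction attempt,
though whether that exposes anything will depend on the OS and CPU.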

-- 
Alex Kiernan, Principal Engineer, Development, THUS plc

