brainstorming about shared memory and configuration files (esp. for sm -d )
David L. Nicol
david at kasey.umkc.edu
Fri Apr 27 02:51:26 UTC 2001
xload showed a humongous load spike.
I looked through the log files and apparently the load spike
was due to dozens of instances of sm being called, not sure from
where, as part of handling of a burst of control files.
I assume this is normal. I also suppose this kind of thing is
what kept killing my ovdb, and hope that moving the log files
to a different spindle, as suggested on the ovdb man page, will
help. But I digress.
Each sm starts by reading the control file, then it does what
it is told to do, then exits. So a burst of cancels would cause
the load spike.
I wonder these things:
1: if the sm tool could queue up commands instead of seeing
to them directly, possibly with a -defer switch
2: if shared memory segments could be used for these purposes:
a: eliminating configuration file reading, by passing
the segment ID of a previously read configuration to each
tool, or having a one-line file that the segment ID can
be read from, eliminating the parsing step. (The segment
would hold the struct conf_vars; on error, a compliant program
would read the file, share the segment, and rewrite the
(hypothetical) $pathdb/shmid/inn.conf file.)
b: a fast queueing system.
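To make idea 2a concrete, here is a rough sketch in Python rather than C, purely as an illustration. Everything in it is an assumption: the function names, the two-field layout standing in for struct conf_vars, the simplified "key: value" parser, and the use of Python's multiprocessing.shared_memory in place of a SysV segment. The one-line file holds the segment name; any failure to attach falls back to parsing and republishing.

```python
import struct
from multiprocessing import shared_memory

# Hypothetical fixed layout standing in for struct conf_vars:
# a 64-byte pathdb string followed by one unsigned int.
CONF_LAYOUT = struct.Struct("64sI")


def parse_conf_file(conf_path):
    """Slow path: parse 'key: value' lines (a simplified stand-in for inn.conf)."""
    values = {}
    with open(conf_path) as fh:
        for line in fh:
            line = line.strip()
            if line and not line.startswith("#"):
                key, _, value = line.partition(":")
                values[key.strip()] = value.strip()
    return values["pathdb"], int(values["artcutoff"])


def publish_conf(conf_path, shmid_path):
    """Parse once, copy the result into a new segment, record the segment name."""
    pathdb, artcutoff = parse_conf_file(conf_path)
    seg = shared_memory.SharedMemory(create=True, size=CONF_LAYOUT.size)
    CONF_LAYOUT.pack_into(seg.buf, 0, pathdb.encode(), artcutoff)
    with open(shmid_path, "w") as fh:
        fh.write(seg.name + "\n")  # the one-line file other tools read
    return seg


def load_conf(conf_path, shmid_path):
    """Fast path: attach to the recorded segment; on error, parse and republish."""
    try:
        with open(shmid_path) as fh:
            name = fh.readline().strip()
        seg = shared_memory.SharedMemory(name=name)
    except (OSError, ValueError):
        seg = publish_conf(conf_path, shmid_path)
    raw_pathdb, artcutoff = CONF_LAYOUT.unpack_from(seg.buf, 0)
    return seg, raw_pathdb.rstrip(b"\0").decode(), artcutoff
```

The first caller pays for the parse; later callers only read the one-line file and unpack a fixed-size record. Something still has to decide when the segment is stale and who removes it, which this sketch does not address.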
I do not know what kind of locking sm uses, so it is possible that
a dozen sm utilities running at the same time are all simply waiting
for the lock instead of using any CPU, in which case the problem is
simply with the OS for showing a blocked process as part of the load.
If not, if it is busy-waiting, that could be changed, for instance by
designating a temporary directory to copy sm directives to, one instance
per file, moving the directives to a second directory, and having a smd
that checks the second directory every few seconds and serializes the
directives, in effect doing a MailDir delivery to the daemon. We might lose
or even exacerbate elevator issues this way, by giving the fs Yet More to
do.
I imagine the directories would live w/in pathdb, next to history, and
be called sm_queue and sm_queue_temp, and that few files in there would
last long enough to ever get written to disk.
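A minimal sketch of the two-directory handoff, in Python for brevity. The function names enqueue and drain, the file naming scheme, and the handle callback are all made up for illustration; only the sm_queue / sm_queue_temp layout comes from the description above. It relies on rename() being atomic when both directories are on the same filesystem.

```python
import os
import uuid


def enqueue(temp_dir, queue_dir, directive):
    """Write one directive to its own file in temp_dir, then move it into queue_dir."""
    name = f"{os.getpid()}.{uuid.uuid4().hex}"
    temp_path = os.path.join(temp_dir, name)
    with open(temp_path, "w") as fh:
        fh.write(directive + "\n")
    # rename() is atomic within one filesystem, so the daemon
    # never sees a half-written directive in queue_dir.
    os.rename(temp_path, os.path.join(queue_dir, name))
    return name


def drain(queue_dir, handle):
    """One pass of the hypothetical smd: handle queued directives one at a time."""
    processed = 0
    # Name order, which is not necessarily arrival order.
    for name in sorted(os.listdir(queue_dir)):
        path = os.path.join(queue_dir, name)
        with open(path) as fh:
            handle(fh.read().strip())
        os.unlink(path)
        processed += 1
    return processed
```

The daemon would call drain() every few seconds, so a burst of a hundred callers turns into a hundred cheap renames plus one serialized worker.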
Synchronizing them all by doing what I described above, but into data
in a big shared memory block instead of via the FS, would be sneakier. The
first sm process would want to stay alive as the daemon, after starting out
by creating the Big Shared Memory Block, writing its ID to
and queuing up whatever it is supposed to do for itself when it switches
If the queueing system gets full (!) how big is a token, anyway (!) sm could
open a second shared block or could start blocking for a chance to write
into the queue.
Say we use a queue block with space in it for 200 tokens. sm takes its pid
mod 200 and checks to see if that space is empty (set to all 0 -- is that a
valid article ID?) and if so, writes the tokens it wants deleted into its
hashed slot and exits.
The sm server process checks all 200 slots until they are all zeros, sleeps
a second, tries again. When it finds a token in a slot, it deletes the
article (using sm as it is now).
Maybe slots have space in them for many tokens. Maybe locking of some kind
is required before slotting tokens.
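Roughly what the 200-slot scheme looks like, sketched in Python with a bytearray standing in for the shared segment. The names client_post and server_sweep and the delete_article callback are invented here, the 16-byte slot size is an assumption, and so is the server zeroing a slot once it has handled it. The check-then-write in client_post is not atomic across processes, so two clients whose pids hash to the same slot could still collide.

```python
TOKEN_SIZE = 16          # assumed size of one token, in bytes
SLOTS = 200
EMPTY = bytes(TOKEN_SIZE)  # an all-zero slot means "free"


def client_post(table, pid, token):
    """Write token into slot pid % SLOTS if that slot is empty; report success."""
    if len(token) != TOKEN_SIZE:
        raise ValueError("token must be exactly TOKEN_SIZE bytes")
    start = (pid % SLOTS) * TOKEN_SIZE
    if table[start:start + TOKEN_SIZE] != EMPTY:
        return False  # slot taken: caller must retry, block, or go elsewhere
    table[start:start + TOKEN_SIZE] = token
    return True


def server_sweep(table, delete_article):
    """One pass over every slot: delete what is found, then zero the slot."""
    handled = 0
    for slot in range(SLOTS):
        start = slot * TOKEN_SIZE
        token = bytes(table[start:start + TOKEN_SIZE])
        if token != EMPTY:
            delete_article(token)
            table[start:start + TOKEN_SIZE] = EMPTY
            handled += 1
    return handled
```

The server loop would repeat server_sweep() until a pass handles nothing, sleep a second, and start over.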
tokens are 16 bytes, which means an 8k segment could hold 512 of them.
Mod your pid by 128 and multiply by four and start putting tokens in
zeroed slots, if you get all the way back around to where you started and
you still have tokens left to delete, open your own shared segment and
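The arithmetic and the wrap-around scan, as a Python sketch under the same assumptions as the figures above (16-byte tokens, one 8k segment, zero meaning free). The names start_slot and post_tokens are invented, and the bytearray stands in for the segment. Whatever comes back from post_tokens is what did not fit after one full lap.

```python
SEGMENT_BYTES = 8192
TOKEN_SIZE = 16
SLOT_COUNT = SEGMENT_BYTES // TOKEN_SIZE  # 8192 / 16 = 512 slots
EMPTY = bytes(TOKEN_SIZE)


def start_slot(pid):
    """pid mod 128, times four: one of 128 starting points spread over 512 slots."""
    return (pid % 128) * 4


def post_tokens(table, pid, tokens):
    """Fill zeroed slots, scanning circularly from start_slot(pid).

    Returns the tokens that did not fit after one full lap of the table.
    """
    pending = list(tokens)
    for token in pending:
        if len(token) != TOKEN_SIZE:
            raise ValueError("token must be exactly TOKEN_SIZE bytes")
    start = start_slot(pid)
    for step in range(SLOT_COUNT):
        if not pending:
            break
        offset = ((start + step) % SLOT_COUNT) * TOKEN_SIZE
        if table[offset:offset + TOKEN_SIZE] == EMPTY:
            table[offset:offset + TOKEN_SIZE] = pending.pop(0)
    return pending
```

Spacing the starting points four slots apart gives each caller a little room before it runs into its neighbour's tokens.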
If "copy to destination address if value there is zero and return success"
is not atomic enough (is it?) for use in SMP architectures --- hmmm ---
semaphores can be atomically incremented and can hold integers, a semaphore
could indicate which slot in the deletion table is the next one to write
to. A second semaphore is needed to do locking on the first semaphore,
makes using the semaphore for anything other than controlling mutexes
we would use a semaphore to control access to incrementing the slot
so a second or later sm process, instead of directly dealing with storage,
to delete an article, would do this, after reading in the semaphore and shm
IDs from a configuration file.
Acquire the mutex
find a zero slot and copy the token into it, for all tokens
we are to schedule deletions for
release the mutex
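Those three steps, sketched in Python. To keep it runnable in one process, threading.Lock stands in for the semaphore-based mutex and a bytearray stands in for the shared segment; real sm processes would need a cross-process lock. The name schedule_deletions and the 512-slot, 16-byte layout are assumptions.

```python
import threading

TOKEN_SIZE = 16
SLOT_COUNT = 512
EMPTY = bytes(TOKEN_SIZE)


def schedule_deletions(table, mutex, tokens):
    """Acquire the mutex, copy each token into a zero slot, release the mutex.

    Returns how many tokens were placed; fewer than len(tokens) means
    the table filled up.
    """
    for token in tokens:
        if len(token) != TOKEN_SIZE:
            raise ValueError("token must be exactly TOKEN_SIZE bytes")
    placed = 0
    with mutex:  # acquired here, released on leaving the block
        slot = 0
        for token in tokens:
            while slot < SLOT_COUNT:
                offset = slot * TOKEN_SIZE
                slot += 1
                if table[offset:offset + TOKEN_SIZE] == EMPTY:
                    table[offset:offset + TOKEN_SIZE] = token
                    placed += 1
                    break
            else:
                break  # scanned to the end without finding a zero slot
    return placed
```

With the whole scan inside the mutex, the find-a-zero-slot step no longer has to be atomic by itself, at the cost of every caller queueing behind one lock.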
http://www.p-nand-q.com/linux/mutex.htm provides some alleged example code
of setting and releasing a semaphore-based mutex.
I just want to smooth the performance so you can stack up a hundred
sm -d calls and you don't get as high of a load spike.
what think everyone?
David Nicol 816.235.1187 dnicol at cstp.umkc.edu
and they all say "yodelahihu"
More information about the inn-workers mailing list