nnrpdcheckart speedup possible?

Russ Allbery rra at stanford.edu
Tue Jul 5 05:30:12 UTC 2005


It's been a while since your message, so I figured I'd cc you; not sure if
you're following inn-workers actively.

Lars Magne Ingebrigtsen <larsi at gnus.org> writes:

> Another approach to take would be to have tradindex mark the NOV line
> as "removed".  Looking at the code for outputting the NOV lines, it
> looks like if the Xref data is blanked out, the XOVER command won't
> output the lines.  The comments for tradindexed_cancel() say that
> it's not implemented yet because you can't go from token to group
> name, which I guess is true, but tradspool does just that by looking
> at the Xref header and cancelling all the instances there.  Can't
> tradindexed_cancel() use the same approach?  (And the other backends
> as well, but that doesn't really interest me as much.  :-)

I decided to finally make this work.  I've implemented OVcancel in CURRENT
(as well as the corresponding function in the new API) and modified innd
there to call it whenever cancelling a message.  Here's the commit
message:

| Add an overview_cancel (which takes group and article number) and
| overview_cancel_xref (which takes a token and retrieves the Xref header
| from the article to find groups and article numbers) interfaces to the
| new overview API.
| 
| Change the underlying cancel method of overview backends to take a group
| and article number rather than a token, since an overview backend is
| never going to know what to do with a token.
| 
| Modify OVcancel, copying the new overview_cancel_xref implementation, to
| retrieve the article and parse the Xref header to find groups and
| article numbers and then call the cancel method for each pair.
| 
| Implement the new cancel method for tradindexed.  buffindexed and ovdb
| still have stub implementations at this point.
| 
| Modify innd to cancel the overview data for an article whenever
| cancelling an article.

I expect this will remove much of the need to call nnrpdcheckart going
forward.  Once we have cancel implementations for the other two overview
methods, I'll probably turn it off by default, assuming this works out
well.  A cancel isn't exactly a cheap operation now, but these days it's
not as horribly common as it used to be either.

I have not (and do not plan to) backport this to STABLE, but it shouldn't
be too hard to do so.  Attached are the bits of this patch that you're
interested in for STABLE.  It may actually apply with a bit of fuzz.

Index: innd/art.c
===================================================================
--- innd/art.c	(revision 7349)
+++ innd/art.c	(working copy)
@@ -1238,6 +1238,8 @@
   }
 
   /* Get stored message and zap them. */
+  if (innconf->enableoverview)
+    OVcancel(token);
   if (!SMcancel(token) && SMerrno != SMERR_NOENT && SMerrno != SMERR_UNINIT)
     syslog(L_ERROR, "%s cant cancel %s (SMerrno %d)", LogName,
 	TokenToText(token), SMerrno);
Index: storage/buffindexed/buffindexed.c
===================================================================
--- storage/buffindexed/buffindexed.c	(revision 7347)
+++ storage/buffindexed/buffindexed.c	(working copy)
@@ -1552,7 +1552,7 @@
   return true;
 }
 
-bool buffindexed_cancel(TOKEN token UNUSED) {
+bool buffindexed_cancel(const char *group UNUSED, ARTNUM artnum UNUSED) {
     return true;
 }
 
Index: storage/buffindexed/buffindexed.h
===================================================================
--- storage/buffindexed/buffindexed.h	(revision 7347)
+++ storage/buffindexed/buffindexed.h	(working copy)
@@ -13,7 +13,7 @@
 bool buffindexed_groupdel(const char *group);
 bool buffindexed_add(const char *group, ARTNUM artnum, TOKEN token,
                      char *data, int len, time_t arrived, time_t expires);
-bool buffindexed_cancel(TOKEN token);
+bool buffindexed_cancel(const char *group, ARTNUM artnum);
 void *buffindexed_opensearch(const char *group, int low, int high);
 bool buffindexed_search(void *handle, ARTNUM *artnum, char **data, int *len,
                         TOKEN *token, time_t *arrived);
Index: storage/ovdb/ovdb.h
===================================================================
--- storage/ovdb/ovdb.h	(revision 7347)
+++ storage/ovdb/ovdb.h	(working copy)
@@ -12,7 +12,7 @@
 bool ovdb_groupdel(const char *group);
 bool ovdb_add(const char *group, ARTNUM artnum, TOKEN token, char *data,
               int len, time_t arrived, time_t expires);
-bool ovdb_cancel(TOKEN token);
+bool ovdb_cancel(const char *group, ARTNUM artnum);
 void *ovdb_opensearch(const char *group, int low, int high);
 bool ovdb_search(void *handle, ARTNUM *artnum, char **data, int *len,
                  TOKEN *token, time_t *arrived);
Index: storage/ovdb/ovdb.c
===================================================================
--- storage/ovdb/ovdb.c	(revision 7347)
+++ storage/ovdb/ovdb.c	(working copy)
@@ -128,7 +128,7 @@
 bool ovdb_add(const char *group UNUSED, ARTNUM artnum UNUSED, TOKEN token UNUSED, char *data UNUSED, int len UNUSED, time_t arrived UNUSED, time_t expires UNUSED)
 { return false; }
 
-bool ovdb_cancel(TOKEN token UNUSED)
+bool ovdb_cancel(const char *group UNUSED, ARTNUM artnum UNUSED)
 { return false; }
 
 void *ovdb_opensearch(const char *group UNUSED, int low UNUSED, int high UNUSED)
@@ -2201,7 +2201,7 @@
     return true;
 }
 
-bool ovdb_cancel(TOKEN token UNUSED)
+bool ovdb_cancel(const char *group UNUSED, ARTNUM artnum UNUSED)
 {
     return true;
 }
Index: storage/ovinterface.h
===================================================================
--- storage/ovinterface.h	(revision 7347)
+++ storage/ovinterface.h	(working copy)
@@ -24,7 +24,7 @@
     bool	(*groupdel)(const char *group);
     bool	(*add)(const char *group, ARTNUM artnum, TOKEN token,
                        char *data, int len, time_t arrived, time_t expires);
-    bool	(*cancel)(TOKEN token);
+    bool	(*cancel)(const char *group, ARTNUM artnum);
     void	*(*opensearch)(const char *group, int low, int high);
     bool	(*search)(void *handle, ARTNUM *artnum, char **data, int *len,
                           TOKEN *token, time_t *arrived);
Index: storage/tradindexed/tradindexed.h
===================================================================
--- storage/tradindexed/tradindexed.h	(revision 7347)
+++ storage/tradindexed/tradindexed.h	(working copy)
@@ -26,7 +26,7 @@
 bool tradindexed_groupdel(const char *group);
 bool tradindexed_add(const char *group, ARTNUM artnum, TOKEN token,
                      char *data, int length, time_t arrived, time_t expires);
-bool tradindexed_cancel(TOKEN token);
+bool tradindexed_cancel(const char *group, ARTNUM artnum);
 void *tradindexed_opensearch(const char *group, int low, int high);
 bool tradindexed_search(void *handle, ARTNUM *artnum, char **data,
                         int *length, TOKEN *token, time_t *arrived);
Index: storage/tradindexed/tdx-data.c
===================================================================
--- storage/tradindexed/tdx-data.c	(revision 7347)
+++ storage/tradindexed/tdx-data.c	(working copy)
@@ -622,6 +622,31 @@
 
 
 /*
+**  Cancels a particular article by removing its index entry.  The data will
+**  still be present in the data file until the next expire run, but it won't
+**  be returned by searches.
+*/
+bool
+tdx_data_cancel(struct group_data *data, ARTNUM artnum)
+{
+    static const struct index_entry empty;
+    off_t offset;
+
+    if (!data->writable)
+        return false;
+    if (data->base == 0 || artnum < data->base || artnum > data->high)
+        return false;
+    offset = (artnum - data->base) * sizeof(struct index_entry);
+    if (xpwrite(data->indexfd, &empty, sizeof(empty), offset) < 0) {
+        syswarn("tradindexed: cannot cancel index record for %lu in %s.IDX",
+                artnum, data->path);
+        return false;
+    }
+    return true;
+}
+
+
+/*
 **  Start the process of packing a group (rewriting its index file so that it
 **  uses a different article base).  Takes the article number of an article
 **  that needs to be written to the index file and is below the current base.
Index: storage/tradindexed/tradindexed.c
===================================================================
--- storage/tradindexed/tradindexed.c	(revision 7347)
+++ storage/tradindexed/tradindexed.c	(working copy)
@@ -209,15 +209,27 @@
 
 
 /*
-**  Cancel an article.  At present, tradindexed can't do anything with this
-**  information because we lack a mapping from the token to newsgroup names
-**  and article numbers, so we just silently return true and let expiration
-**  take care of this.
+**  Cancel an article.  We do this by blanking out its entry in the group
+**  index, making the data inaccessible.  The next expiration run will remove
+**  the actual data.
 */
 bool
-tradindexed_cancel(TOKEN token UNUSED)
+tradindexed_cancel(const char *group, ARTNUM artnum)
 {
-    return true;
+    struct group_entry *entry;
+    struct group_data *data;
+
+    if (tradindexed == NULL || tradindexed->index == NULL) {
+        warn("tradindexed: overview method not initialized");
+        return NULL;
+    }
+    entry = tdx_index_entry(tradindexed->index, group);
+    if (entry == NULL)
+        return NULL;
+    data = data_cache_open(tradindexed, group, entry);
+    if (data == NULL)
+        return NULL;
+    return tdx_data_cancel(data, artnum);
 }
 
 
Index: storage/tradindexed/tdx-private.h
===================================================================
--- storage/tradindexed/tdx-private.h	(revision 7347)
+++ storage/tradindexed/tdx-private.h	(working copy)
@@ -141,6 +141,9 @@
 /* Store article data. */
 bool tdx_data_store(struct group_data *, const struct article *);
 
+/* Cancel an entry. */
+bool tdx_data_cancel(struct group_data *, ARTNUM);
+
 /* Start a repack of the files for a newsgroup. */
 bool tdx_data_pack_start(struct group_data *, ARTNUM);
 
Index: storage/ov.c
===================================================================
--- storage/ov.c	(revision 7347)
+++ storage/ov.c	(working copy)
@@ -18,6 +18,8 @@
 
 #include "inn/innconf.h"
 #include "inn/messages.h"
+#include "inn/wire.h"
+#include "inn/vector.h"
 #include "libinn.h"
 #include "ov.h"
 #include "ovinterface.h"
@@ -237,12 +239,58 @@
 bool
 OVcancel(TOKEN token)
 {
+    ARTHANDLE *art;
+    const char *xref, *xrefend, *group;
+    size_t xreflen, i;
+    char *xref_copy, *p, *end;
+    ARTNUM artnum;
+    struct cvector *groups;
+
     if (!ov.open) {
-	/* must be opened */
+        /* must be opened */
         warn("ovopen must be called first");
-	return false;
+        return false;
     }
-    return ((*ov.cancel)(token));
+
+    /* There's no easy way to go from a token to the group and article number
+       pairs that we need to do this.  Retrieve the article and find the Xref
+       header and then parse it to figure out what to cancel.  Articles
+       without Xref headers lose. */
+    art = SMretrieve(token, RETR_HEAD);
+    if (art == NULL)
+        return false;
+    xref = wire_findheader(art->data, art->len, "Xref");
+    if (xref == NULL)
+        goto fail;
+    xrefend = wire_endheader(xref, art->data + art->len - 1);
+    if (end == NULL)
+        goto fail;
+    xreflen = xrefend - xref + 1;
+    xref_copy = xstrndup(xref, xreflen);
+    SMfreearticle(art);
+    groups = cvector_split_space(xref_copy, NULL);
+    for (i = 0; i < groups->count; i++) {
+        group = groups->strings[i];
+        p = (char *) strchr(group, ':');
+        if (p == NULL || p == group || p[1] == '-')
+            continue;
+        *p = '\0';
+        errno = 0;
+        artnum = strtoul(p + 1, &end, 10);
+        if (artnum == 0 || *end != '\0' || errno == ERANGE)
+            continue;
+
+        /* Don't worry about the return status; the article may have already
+           expired out of some or all of the groups. */
+        (*ov.cancel)(group, artnum);
+    }
+    free(xref_copy);
+    cvector_free(groups);
+    return true;
+
+fail:
+    SMfreearticle(art);
+    return false;
 }
 
 void *

-- 
Russ Allbery (rra at stanford.edu)             <http://www.eyrie.org/~eagle/>

    Please send questions to the list rather than mailing me directly.
     <http://www.eyrie.org/~eagle/faqs/questions.html> explains why.


More information about the inn-workers mailing list