Ternary search trie 1.4

Julien ÉLIE julien at trigofacile.com
Sat Jul 5 18:27:00 UTC 2008


Hi,

The new 1.4 version of the ternary search trie has the following changes:

* the insert routine has been slightly restructured;
* a new option TST_SUBSTRING_MATCH has been added for the search function
  so that if the trie contains the strings "test" and "testing", a search
  for "testin" with this option will return the entry for "test".  Basically,
  the most complete substring found is returned.  Another optional parameter
  to the search function returns the length of the match.


I have done a diff between tst 1.3 and tst 1.4.  I confirm that nothing else
has been modified.
I do not think the changes in the search routine are useful for us.
However, I tried to understand the changes in the insert routine and
I could not see what was happening...
The code between tst 1.3 and its implementation in INN is not the same
(tst 1.3 is more verbose and has more stuff).

The only thing I see in tst 1.4 is that there is this addition:


if(key[key_index] == 0) {
  if(tst->free_list == NULL) {
    if(tst_grow_node_free_list(tst) != 1)
      return TST_ERROR;
  }
  new_node = tst->free_list;
  tst->free_list = tst->free_list->middle;
  memcpy((void*)new_node, (void*)current_node, sizeof(struct node));
  current_node->value = 0;
  if(new_node->value < 64) {
    current_node->left = new_node;
    current_node->right = '\0';
  } else {
    current_node->left = '\0';
    current_node->right = new_node;
  }
  current_node->middle = data;
  return TST_OK;
}


and it would occur in INN's implementation here:


    while (root_node == NULL) {
        if (key[key_index] == current_node->value) {
            if (key[key_index] == '\0') {
                if (exist_ptr != NULL)
                    *exist_ptr = current_node->middle;
                if (option == TST_REPLACE) {
                    current_node->middle = data;
                    return TST_OK;
                } else
                    return TST_DUPLICATE_KEY;
            }
            if (current_node->middle == NULL)
                root_node = &current_node->middle;
            else {
                current_node = current_node->middle;
                key_index++;
            }
---> XXXXX
        } else if (LEFTP(current_node, key[key_index])) {
            if (current_node->left == NULL)
                root_node = &current_node->left;
            else
                current_node = current_node->left;
        } else {
            if (current_node->right == NULL)
                root_node = &current_node->right;
            else
                current_node = current_node->right;
        }

    }


It looks as though the algorithm might stop at XXXXX with TST_OK
without going again in the loop.

Since Peter A. Friend only mentions a restructuration (and not a bug),
maybe we should not change anything...  It seems wiser :-)

-- 
Julien ÉLIE

« -- Ben où est l'eau ?
  -- Elle est sortie quand tu es entré Obélix. Il n'y a pas de place
  pour vous deux là dedans ! » (Astérix) 



More information about the inn-workers mailing list