INN config file parsing infrastructure
Forrest J. Cavalier III
mibsoft at epix.net
Thu May 4 16:02:15 UTC 2000
I think allowing "" around booleans and integers
is an invitation to confusion.
> This is a good point. On the other hand, I really don't want to get into
> handling the full set of C backslash sequences, at least for right now
> until we have some pressing need for them. INN doesn't really need the
> ability to embed newlines in a string; shall we just say that escaped
> newlines disappear completely like they do in C?
>
I think the following should be done. (Tested code for
translating strings follows.)
---------------------------
Configuration String Values
---------------------------
String values (and only string values) are surrounded by
doublequotes.
Within a string, the following escape sequences
are recognized and translated:
\t
\n
\f
\r
\b
\\
\x<DD> where <DD> is two characters from the set [0-9A-Fa-f]
To embed a doublequote character in a string, use \x22.
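
To illustrate the escape list from the writing side, here is a minimal
sketch of the reverse translation. encodeCBE is a hypothetical helper and
is not part of the code posted in this message. It assumes that a
doublequote, and any byte outside printable ASCII, should be written
as \x<DD>.

```c
#include <stddef.h>

/* Hypothetical helper (an assumption, not part of the posted code):
   write the escaped form of src[0..len-1] into dst, using only the
   escape sequences listed above.  Returns the number of characters
   written, not counting the terminating NUL, or -1 if dst (of size
   dstsize) is too small. */
int encodeCBE(const char *src, size_t len, char *dst, size_t dstsize)
{
    static const char hex[] = "0123456789ABCDEF";
    size_t i, o = 0;

    for (i = 0; i < len; i++) {
        unsigned char c = (unsigned char)src[i];
        char esc = 0;

        switch (c) {
        case '\t': esc = 't';  break;
        case '\n': esc = 'n';  break;
        case '\f': esc = 'f';  break;
        case '\r': esc = 'r';  break;
        case '\b': esc = 'b';  break;
        case '\\': esc = '\\'; break;
        }
        if (esc) {
            /* Two output characters, plus room for the final NUL. */
            if (o + 2 >= dstsize) return -1;
            dst[o++] = '\\';
            dst[o++] = esc;
        } else if (c == '"' || c < 0x20 || c > 0x7e) {
            /* Doublequote and non-printable bytes become \xDD. */
            if (o + 4 >= dstsize) return -1;
            dst[o++] = '\\';
            dst[o++] = 'x';
            dst[o++] = hex[c >> 4];
            dst[o++] = hex[c & 0x0f];
        } else {
            if (o + 1 >= dstsize) return -1;
            dst[o++] = (char)c;
        }
    }
    if (o >= dstsize) return -1;
    dst[o] = '\0';
    return (int)o;
}
```

Given the nine input characters say "hi" followed by a tab, this sketch
writes say \x22hi\x22\t, so the result can sit between doublequotes in a
config file without ending the string early.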
Line Continuation
-----------------
For readability, strings may continue across more than one
text line, as long as the last non-blank character on a line
is '\'. In the case of such continuation, the backslash and
any spaces or tabs which follow up to and including the newline
are discarded.
To embed a newline in a string use the \n escape sequence.
Note that continuing a string across a line break without the
trailing '\' is a syntax error, and the parser will report it
as one.
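
Since the decoder posted below skips a bare line break silently, a parser
that wants to report the syntax error needs a separate check. The
following is a rough sketch of one way to do that. findBareLinebreak is a
hypothetical name and is not part of the posted code. It assumes the
argument is the raw text between the doublequotes.

```c
#include <stddef.h>

/* Hypothetical helper (an assumption, not part of the posted code):
   return the offset of the first line break that is not covered by a
   backslash continuation, or -1 if there is none. */
long findBareLinebreak(const char *s)
{
    const char *p = s;

    while (*p) {
        if (*p == '\n' || *p == '\r') {
            return (long)(p - s);   /* line break with no '\' before it */
        }
        if (*p == '\\') {
            const char *q = p + 1;

            /* Spaces and tabs may sit between the '\' and the newline. */
            while (*q == ' ' || *q == '\t') {
                q++;
            }
            if (*q == '\r' || *q == '\n') {
                /* Continuation: skip backslash, blanks, and line ending. */
                if (*q == '\r' && q[1] == '\n') {
                    q++;
                }
                p = q + 1;
                continue;
            }
            /* Ordinary escape: skip the backslash and the next character. */
            p++;
            if (*p) {
                p++;
            }
            continue;
        }
        p++;
    }
    return -1;
}
```

A caller could run this check first and only hand the string to the
decoder when the result is -1.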
----------------------------------------------------------------
/* decodeCBE placed in the public domain.  4 May 2000
   Forrest J. Cavalier III  http://www.mibsoftware.com/
*/
#include <ctype.h>

char *decodeCBE(char *pszBackslashEscaped, int *pcb)
{
    /*
       Translate backslash escapes of the form:
           \f \t \n \r \b \xXX '\\' \0
       Other characters which follow a backslash character are
       stored literally (but the backslash itself is not returned.)
       if (pcb), the number of characters in the translated
       string is stored at *pcb.  This allows \0 to be
       used within the string.
    */
    /* We can work in place, since we know dptr <= sptr always. */
    char *sptr = pszBackslashEscaped;
    char *dptr = sptr;
    char *ptr;

    while (*sptr) {
        if ((*sptr == '\n') || (*sptr == '\r')) {
            /* Syntax error.  Backslash must appear at end of line */
            sptr++; /* ignore */
        } else if (*sptr != '\\') {
            *dptr++ = *sptr++;
        } else {
            /* Got a backslash */
            sptr++;
            ptr = sptr;
            while ((*ptr == ' ') || (*ptr == '\t')) {
                ptr++;
            }
            if ((*ptr == '\r') || (*ptr == '\n')) {
                /* Backslash at end of line.  Discard */
                if ((*ptr == '\r') && (ptr[1] == '\n')) {
                    ptr += 2;
                } else {
                    ptr++;
                }
                sptr = ptr;
            } else {
                if (*sptr == 't') {
                    *dptr++ = 0x09;
                    sptr++;
                } else if (*sptr == '\\') {
                    *dptr++ = '\\';
                    sptr++;
                } else if (*sptr == 'n') {
                    *dptr++ = 0x0a;
                    sptr++;
                } else if (*sptr == 'f') { /* 10-3-95 */
                    *dptr++ = '\f';
                    sptr++;
                } else if (*sptr == 'r') {
                    *dptr++ = 0x0d;
                    sptr++;
                } else if (*sptr == 'b') {
                    *dptr++ = 0x08;
                    sptr++;
                } else if (*sptr == '0') {
                    *dptr++ = '\0';
                    sptr++;
                } else if (*sptr == 'x') {
                    /* Specify in hex.  This does no character validity
                       checking, but it stops at the end of the string
                       instead of reading past the terminator. */
                    int i;
                    int value = 0;

                    sptr++;
                    for (i = 0; i < 2 && *sptr; i++, sptr++) {
                        value <<= 4;
                        /* Cast: ctype functions need unsigned char values */
                        if (isdigit((unsigned char)*sptr)) {
                            value += *sptr - '0';
                        } else {
                            value += toupper((unsigned char)*sptr) - 'A' + 10;
                        }
                    }
                    *dptr++ = (char)value;
                } else if (*sptr) {
                    /* Other escape.  Send as literal.  Debatably a
                       syntax error. */
                    *dptr++ = *sptr++;
                }
                /* else: lone backslash at end of string.  Drop it and let
                   the loop end without stepping past the terminator. */
            }
        }
    }
    *dptr = '\0';
    if (pcb) {
        *pcb = (int)(dptr - pszBackslashEscaped);
    }
    return pszBackslashEscaped;
} /* decodeCBE */
/****************************************************************/
#ifdef TEST_decodeCBE
#include <stdio.h>
#include <string.h>

/* The backslashes are doubled so that decodeCBE, not the C compiler,
   sees each escape sequence (including \x61). */
char *tests[] = {
    "This is a\\ttest",
    "This is \\x61\\ttest",
    "This is \\x61\\tte\\ \r\nst",
    "This is \\x61\\tte\\ \rst",
    "This is \\x61\\tte\\ \nst",
    "This is \\x61\\tte\\\r\nst",
    "This is \\x61\\tte\\\rst",
    "This is \\x61\\tte\\\nst",
    "This is a \\\\test",
    "{This is a \\n test}",
    0
};

int main(void)
{
    char buf[1024];
    int i;

    i = 0;
    while (tests[i]) {
        strcpy(buf, tests[i]);
        printf("%s\n", decodeCBE(buf, 0));
        i++;
    }
    return 0;
} /* main */
#endif