Proposal [A] for Re: INN config file parsing infrastructure

Todd Olson tco2 at cornell.edu
Fri May 5 16:21:13 UTC 2000


Hi Russ

I recently sent notes questioning the proposed line-continuation spec
for handling INN config files.  Here is an alternative proposal.

  1) <newline> is significant to the parser for parameter: statements
     it always terminates a "parameter: value" statement
     unless it is "escaped"
     (<newline> need not terminate "group" statements)

  2) A <newline> is escaped by a '\'
     Wherever \<newline> occurs, it and all subsequent whitespace are elided.
     This is independent of quotes.

  3) \n is rendered as a newline character.
     This could be limited to only in quoted strings,
     or not ... I have no strong opinion (at the moment).
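
A minimal sketch of the three rules in Python, to make the intended behavior
concrete.  The function names are illustrative only.  It assumes that
"white space" in rule 2 means spaces and tabs, and that rule 3 applies
everywhere rather than only inside quotes; neither point is settled above.

```python
import re

# Rule 2: a backslash immediately followed by a newline, plus any spaces or
# tabs after it, is removed, whether or not it sits inside quotes.
# (Assumption: "white space" here means spaces and tabs only.)
CONTINUATION = re.compile(r"\\\n[ \t]*")


def join_continuations(text):
    """Elide every escaped newline and the whitespace that follows it."""
    return CONTINUATION.sub("", text)


def logical_lines(text):
    """Rule 1: after elision, each remaining newline ends a statement."""
    joined = join_continuations(text)
    return [line for line in joined.split("\n") if line.strip()]


def render_newlines(value):
    """Rule 3: the two characters backslash + 'n' become one newline."""
    return value.replace("\\n", "\n")
```

For instance, logical_lines("p:  a,\\\n    b\nq: x\n") returns
["p:  a,b", "q: x"]: the escaped newline is joined, the bare one terminates.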
     
The major virtue of this scheme is that it separates formatting of
the config file from the format of the data.  As a result the
following good things happen:
      a) it will be possible to format config files so that they
         are *readable*
      b) the "values" can be unambiguously specified
      c) it will be possible to format config files to support
         *simple* home-rolled script manipulation.
There are two downsides I see:
      a) to continue a line past a newline requires an explicit
         continuation character.
      b) it is awkward to use '\' with  '\n       '  (see examples #4 & #5)
The only way around (a) is to *require* a termination character (e.g. ';')

This proposal is consistent with many of the other contexts I work in,
so it will lead to the least confusion for me.

Note also that, because of #1, ';' style terminators are not required
but can be optionally supported.

EXAMPLES:

#1] =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-==-=
     parameter:  host1@very.long.domain.name1,host2@very.long.domain.name2,host3@very.long.domain.name3

     parameter:  host1@very.long.domain.name1,\
                 host2@very.long.domain.name2,\
                 host3@very.long.domain.name3

     parameter: "\
                ,host1@very.long.domain.name1\
                ,host2@very.long.domain.name2\
                ,host3@very.long.domain.name3\
                "

   These all mean the same thing.

   The first is hard to read.
   The second and third are *much* more readable.

   The third in particular is amenable to manipulation by simple scripts
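
   As an illustration of that last point, here is a small Python sketch of
   the kind of home-rolled script I have in mind.  Because the third format
   puts each list item alone on a line of the form ",item\", dropping an item
   is a one-line deletion.  The function name and the host names in the usage
   note are hypothetical.

```python
def drop_host(config_text, host):
    """Remove the continuation line that holds exactly `,host\\`.

    Assumes the one-item-per-line layout of the third format; any other
    layout is left untouched.
    """
    kept = []
    for line in config_text.splitlines(keepends=True):
        # A matching line is optional indentation, a comma, the host name,
        # and the trailing continuation backslash.
        if line.strip() == "," + host + "\\":
            continue
        kept.append(line)
    return "".join(kept)
```

   Called as drop_host(text, "b.example"), it deletes only the line
   ",b.example\" and leaves every other line byte-for-byte unchanged.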

#2] =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-==-=

     wildmat: @to.*,@contro*,@clari.*,@uo.*,@*warez*,@*cd.image*,@*mp3*\
             ,@*bootleg*,@*crack*,@*hack*,@*2600*,@*phreak*,@*password*\
             ,@*job*,@*test*,@*mlm*\
             ,@*multilevel*,@*nospam*,@*anon*,@mail*,@private.*,@*lists.*\
             ,@*bonehead*\

     
   In my current newsfeeds file I have things like this that go for 30+ lines

   This presumably gets around the 8K character line length limit (??)

   I'm not sure how the above would be accommodated in Russ's proposal.


#3] =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-==-=

      parameter:  "x-comment:  A long line.... ...... ......... .... .....formated gracefully"

      parameter:  "x-comment:  A long line.... ...... .....\
                                   .... .... .....formated gracefully"

   Both set the parameter to
"x-comment:  A long line.... ...... ......... .... .....formated gracefully"

   but the second definitely leads to a cleaner config file.
   Especially with the grouping structure and the tendency for people
   to use indentation with the grouping.

#4] =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-==-=
     parameter:  "x-comment:  Three lines\n	with leading\n	tabs"

     parameter:  "x-comment:  Three lines\n	\
                              with leading\n	\
                              tabs"

     parameter:  "x-comment:  Three lines\
                \n	with leading\
                \n	tabs\
                 "

   These give the same result
   (I like the third format best as it is clearest)

   I think Russ's proposal would require the following
               parameter:  "x-comment:  Three lines\
	with leading\
	tabs"
   Which I find confusing ...
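
   A short Python check that the three forms under this proposal really do
   collapse to one value.  This is a sketch under two assumptions not fixed
   above: escaped newlines are elided before \n is rendered, and only spaces
   and tabs count as the white space that follows an escaped newline.  The
   name rendered() is made up for the example, and the continuation
   indentation is shortened.

```python
import re


def rendered(raw):
    # Assumed order: elide \<newline> plus following spaces/tabs (rule 2),
    # then turn the two characters backslash + 'n' into a newline (rule 3).
    joined = re.sub(r"\\\n[ \t]*", "", raw)
    return joined.replace("\\n", "\n")


# First form: everything on one physical line.
one_line = '"x-comment:  Three lines\\n\twith leading\\n\ttabs"'

# Second form: \n and the tab at the end of each physical line.
trailing = ('"x-comment:  Three lines\\n\t\\\n'
            '          with leading\\n\t\\\n'
            '          tabs"')

# Third form: \n and the tab at the start of each continuation line.
leading = ('"x-comment:  Three lines\\\n'
           '     \\n\twith leading\\\n'
           '     \\n\ttabs\\\n'
           '      "')

assert rendered(one_line) == rendered(trailing) == rendered(leading)
```

   The tab survives in the second form because it sits before the backslash;
   only white space after the escaped newline is elided.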

#5] =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-==-=
     parameter:  "x-comment:  Three lines\
                              with leading\
                              tabs"

     parameter:  "x-comment:  Three lineswith leadingtabs"

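   Both of these set the same value: the escaped newlines and the
   indentation after them are elided, so the words run together.
   This is the greatest source of confusion I see with this proposal.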

#6] =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-==-=
   Russ said

>Syntax error.  Actually, if you write:
>
>    parameter1:
>    parameter2:
>
>you'll actually set parameter1 to the value "parameter2:", so inncheck
>should check for this.  (Or we could require that the value start on the
>same line as the parameter, but I'm not sure that's necessary.)

In the present proposal this is *not* a problem ... <newline> terminates the
statement.
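
To illustrate, a rough Python sketch of a newline-terminated parameter
parser.  parse_parameters() is a hypothetical name, and treating an empty
value as legal is an assumption; inncheck could just as well flag it.

```python
import re


def parse_parameters(text):
    """Parse "name: value" statements, one per logical line."""
    # Join escaped newlines first; every newline left ends a statement.
    joined = re.sub(r"\\\n[ \t]*", "", text)
    params = {}
    for line in joined.split("\n"):
        if not line.strip():
            continue  # skip blank lines
        name, sep, value = line.partition(":")
        if not sep:
            raise ValueError("not a parameter statement: %r" % line)
        params[name.strip()] = value.strip()
    return params


print(parse_parameters("parameter1:\nparameter2:\n"))
# {'parameter1': '', 'parameter2': ''}
```

parameter1 ends up with an empty value rather than swallowing "parameter2:",
because the unescaped newline has already ended its statement.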

#7] =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-==-=
Finally, the example Russ gives works pretty nicely in this scheme too

    group blah { parameter: value; parameter2: value2 }

    group
    blah
    { parameter:\
    value
    parameter2: value2
    }

Note that I am assuming the config parser continues from line to line
to accumulate the "group ..." tokens


Regards,
Todd Olson
Cornell University







