unified conf syntax proposal..

Sun Mar 11 17:38:14 UTC 2001

According to Russ Allbery:
> 
> Fabien Tassin <fta at sofaraway.org> writes:
> 
> > I've planned to implement a parser for the following syntax RSN.
> 
> Is there any way that you could use the syntax that we previously hashed
> out for INN's configuration file? 

By RSN, I really meant *soon*. It is written now and almost polished.
Compared to my latest proposal, I've added a maxconnections keyword
(applying to both incoming and outgoing connections) and
'^$INCLUDE "file"' with a configurable maximum depth.
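
A minimal sketch of how an include directive with a depth limit might be
expanded, written in Python rather than Lex/Yacc for brevity. The names
expand_includes and IncludeDepthError, the default depth of 8, and the rule
that relative names resolve against the including file are all assumptions
made for illustration; none of them is taken from the actual parser.

```python
import os
import re
import tempfile

# Assumed directive form: a line that starts with $INCLUDE "file".
INCLUDE_RE = re.compile(r'^\$INCLUDE\s+"([^"]+)"\s*$')


class IncludeDepthError(Exception):
    """Raised when nested includes exceed the configured limit."""


def expand_includes(path, max_depth=8, _depth=0):
    """Return the lines of `path` with each $INCLUDE line replaced by the
    lines of the named file, refusing to nest deeper than max_depth."""
    if _depth > max_depth:
        raise IncludeDepthError(f"include depth exceeds {max_depth} at {path}")
    lines = []
    base = os.path.dirname(path)
    with open(path) as handle:
        for raw in handle:
            line = raw.rstrip("\n")
            match = INCLUDE_RE.match(line)
            if match:
                # Assumption: relative names are resolved against the
                # directory of the including file.
                target = os.path.join(base, match.group(1))
                lines.extend(expand_includes(target, max_depth, _depth + 1))
            else:
                lines.append(line)
    return lines
```

The depth counter also catches a file that includes itself, which would
otherwise recurse forever.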

Oh, and it is written in Lex/Yacc. Strictly speaking it requires Flex, as
some complex checks are made in the lexical analyzer, but since innfeed
already requires Flex, that should not be a problem.

> It looks like you're doing something
> *almost* similar, but just different enough that we couldn't use the same
> parser, and I think that the existing syntax could still do what you want
> to do.

My parser is a library. It can be used in INN, but that will be difficult
because innfeed and innxmit/nntpsend are separate processes. Each process
has to parse the common file and take what it needs.

> It would be really unfortunate if we couldn't use the same syntax for all
> of INN's configuration files.  :/  I suppose it's my fault for not having
> managed to finish writing that parser yet.
> 
> I'd also recommend against the default group and stuff like "no default";
> I think it adds confusing linkage between different groups that aren't
> obviously nested.

This is a point that stopped me during the API design. Defaults *are* needed
to avoid non-deterministic situations. The way they are defined is another
problem that I'm currently fighting with.

> The idea of defaults can be handled with nested groups that inherit all
> properties from their enclosing groups unless they're explicitly
> overridden;

That was also what I thought when I created incoming.conf years ago.
I've never used groups on my own servers because they make things harder
when you have a lot of peers. The main reasons are that a) you have to
remember all the nested changes introduced by defaults and groups (easy if
everything fits on the same page, but that is rare for a transit box) and
b) if you change a parameter (global or in a group) you have to rewrite
everything that depends on it. Ick.
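
To make the inheritance model under discussion concrete, here is a small
Python sketch of "inner scopes override outer ones". The helper name
effective and the parameter names (max-connections, streaming) are
hypothetical and chosen only for the example; this is not how INN or my
parser resolves settings.

```python
from collections import ChainMap


def effective(*scopes):
    """Merge scopes given from outermost to innermost; inner values win."""
    # ChainMap searches its maps left to right, so the innermost scope
    # has to come first.
    return dict(ChainMap(*reversed(scopes)))


global_defaults = {"max-connections": 10, "streaming": "true"}
group_transit = {"max-connections": 50}
peer_foo = {"streaming": "false"}   # overrides one inherited value
peer_bar = {}                       # inherits everything

foo_settings = effective(global_defaults, group_transit, peer_foo)
bar_settings = effective(global_defaults, group_transit, peer_bar)
```

Nothing in peer_bar itself says what its settings are: they depend on every
enclosing scope, and editing group_transit silently changes them. That is
the bookkeeping problem described in a) and b) above.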

> it's then easier to tell at a glance what properties apply.
> I think a descriptive approach to feeds would be much clearer, like:
> 
> peer foo {
>     accept {
>         groups: "*,@*poison*"
>         address: "news.foo.org, 1.2.3.4"
>     }
>     feed {
>         condition: "small"
>         exclude-path: "foo, news-small.foo.org"
>         address: news-small.foo.org
>         port: 456
>         type: builtin
>     }
>     feed {
>         condition: big
>         exclude-path: "foo, news-big.foo.org"
>         address: news-big.foo.org
>         port: 456
>         type: builtin
>     }
> }
> 
> I don't think you're actually gaining anything from the "condition then
> action" syntax model; the above expresses the same thing without requiring
> that additional structure.

Hmm. I admit that it looks simpler but I don't think it really is.
Conditions and parameters are mixed. You (the human reader) have to read
them all before being able to determine whether the rule applies or not.

Action blocks are not named either.

> That has all the same information as your example, but the syntax elements
> are much simpler.  No special syntax for lists (although I could be argued
> into [] instead of a de facto standard of comma-separated lists -- there's
> a good justification for adding that syntax element), a uniform syntax
> that always has "key: value", and the same syntax as incoming.conf,
> readers.conf, or innfeed.conf so that we can use the same parser for all
> of them.

My goal was not only to reduce the number of parsers but also the
number of files. For these 3, I want 1 parser and 1 file.

> I don't care much about whether the accept and feed blocks have labels;
> I'm happy to have the syntax allow either putting in labels or leaving
> them off, actually.

it can be.

> I do actually care a lot about keeping to a "key: value" syntax if at all
> possible, if it doesn't make much difference to you.  That sort of detail
> doesn't change what the syntax can express, but it makes a *huge*
> difference when writing the parser and using the same parser for
> everything.

You mean what? The trailing ":" or the 'scalar' value?
My view:
- ":" adds nothing.
- A scalar imposes the use of strings and then moves the syntax checking to
all the places that will use the parameters. With [] lists, the parser
can already check that all items are what is expected.
Oh, I forgot to mention that in my grammar " [ x ] " is equivalent to " x ".

excludepathentry: EXCLUDEPATH listpaths SCOLON;
listpaths: path | LBRACKET paths RBRACKET;
paths: path | paths path;
path: WORD {
  /* ... */
};
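
For readers who do not speak Yacc, the same rule can be sketched in Python.
This is only an illustration of the "[ x ] is equivalent to x" behaviour:
the keyword spelling "exclude-path", the whitespace-based tokenizer, and the
function names are assumptions and do not come from the real lexer.

```python
PUNCTUATION = ("[", "]", ";")


def tokenize(text):
    """Split on whitespace, treating '[', ']' and ';' as separate tokens."""
    for ch in PUNCTUATION:
        text = text.replace(ch, f" {ch} ")
    return text.split()


def parse_listpaths(tokens):
    """listpaths: path | '[' paths ']'. Always returns (list, rest)."""
    if not tokens:
        raise ValueError("expected a path or '['")
    if tokens[0] == "[":
        if "]" not in tokens:
            raise ValueError("missing ']'")
        end = tokens.index("]")
        paths, rest = tokens[1:end], tokens[end + 1:]
    else:
        paths, rest = tokens[:1], tokens[1:]
    # paths: path | paths path, so at least one plain word is required.
    if not paths or any(p in PUNCTUATION for p in paths):
        raise ValueError("expected at least one path")
    return paths, rest


def parse_excludepath(text):
    """excludepathentry: EXCLUDEPATH listpaths SCOLON."""
    tokens = tokenize(text)
    if not tokens or tokens[0] != "exclude-path":
        raise ValueError("expected 'exclude-path'")
    paths, rest = parse_listpaths(tokens[1:])
    if rest != [";"]:
        raise ValueError("expected ';' after the path list")
    return paths
```

Both "exclude-path foo;" and "exclude-path [ foo ];" produce the same
one-element list, so the code consuming the parameter only ever sees a list.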

> > - are 'option's in a program block needed or not ? This is the place for
> >   non-peer outgoing parameters but there's no way to use them except for
> >   'builtin'.

I no longer see why only 'builtin' could use them. All programs are able
to take their parameters that way.

> I'd use a completely separate block for global settings, probably called
> something like "global".

The goal was, once again, to drive all programs from a single conf file,
if possible. Your 'global' is then program-dependent.

> > - Some (including Russ) dislike named terms. I agree that the names are
> > useless in the semantic but I want to be able to manipulate terms by
> > name like in 'ctlinntd del peer foo term fromme1'.
> 
> Yeah, that's a good reason for them.
> 
> Okay, assuming that we add [] for lists, where a value is expected, the
> syntax that I'm proposing has the following semi-formal description:
> 
>     group         := <type> *1(<space> <name>) { *<item> *<group> }
>     type          := <label>
>     name          := <string>
>     item          := <parameter> : (<string> / <list>)
>     <parameter>   := <label>
>     <string>      := <label> / " <quoted-text> "
>     <list>        := [ <string> *(<space> <string>) ]
>     <label>       := 1*(A-Z / a-z / 0-9 / _)
>     <quoted-text> := *(any character, \ escapes " or newline, standard
>                        C backslash escapes apply)
> 
> Whitespace can be placed arbitrarily anywhere except in the middle of a
> label (and is significant in quoted-text), about as you'd expect.  There
> are two places where at least one space is required; between elements of a
> list, and between the type of the group and the group name if given.
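
As a way to compare the two approaches, the semi-formal description above
can be turned into a small recursive-descent parser. The Python sketch below
follows that description as literally as I can read it; the class and
function names and the dictionary layout of the result are my own
assumptions, backslash escapes in quoted text are recognized but left
undecoded, and the "at least one space" rule between list elements is not
enforced.

```python
import re

# label | "quoted-text" | one of { } [ ] :
TOKEN_RE = re.compile(
    r'\s*(?:([A-Za-z0-9_]+)|"((?:[^"\\]|\\.)*)"|([{}\[\]:]))', re.S)


def tokenize(text):
    tokens, pos, text = [], 0, text.rstrip()
    while pos < len(text):
        m = TOKEN_RE.match(text, pos)
        if not m:
            raise SyntaxError(f"unexpected input at offset {pos}")
        label, quoted, punct = m.groups()
        if label is not None:
            tokens.append(("label", label))
        elif quoted is not None:
            tokens.append(("string", quoted))
        else:
            tokens.append(("punct", punct))
        pos = m.end()
    return tokens


class Parser:
    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self, offset=0):
        i = self.pos + offset
        return self.tokens[i] if i < len(self.tokens) else (None, None)

    def take(self, kind, value=None):
        tok = self.peek()
        if tok[0] != kind or (value is not None and tok[1] != value):
            raise SyntaxError(f"expected {value or kind}, got {tok[1]!r}")
        self.pos += 1
        return tok[1]

    def string(self):
        # <string> := <label> / " <quoted-text> "
        kind, value = self.peek()
        if kind not in ("label", "string"):
            raise SyntaxError(f"expected a string, got {value!r}")
        self.pos += 1
        return value

    def value(self):
        # <string> or <list> := [ <string> *(<space> <string>) ]
        if self.peek() != ("punct", "["):
            return self.string()
        self.take("punct", "[")
        items = [self.string()]
        while self.peek() != ("punct", "]"):
            items.append(self.string())
        self.take("punct", "]")
        return items

    def group(self):
        # group := <type> *1(<space> <name>) { *<item> *<group> }
        node = {"type": self.take("label"), "name": None,
                "items": {}, "groups": []}
        if self.peek()[0] in ("label", "string"):
            node["name"] = self.string()
        self.take("punct", "{")
        while self.peek()[0] == "label" and self.peek(1) == ("punct", ":"):
            key = self.take("label")
            self.take("punct", ":")
            node["items"][key] = self.value()
        while self.peek() != ("punct", "}"):
            node["groups"].append(self.group())
        self.take("punct", "}")
        return node


def parse(text):
    parser = Parser(tokenize(text))
    tree = parser.group()
    if parser.pos != len(parser.tokens):
        raise SyntaxError("trailing input after the top-level group")
    return tree
```

Note that a label is limited to letters, digits and "_", so a value such as
news.foo.org has to be quoted to pass this parser.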

Well well well... now that the code is written, what should we do?
There are only two of us, with opposing visions. What do others think about
all of this? There is still time to change things if I'm the only one who
thinks that my proposal is good.

-- 
Fabien Tassin -+- fta at sofaraway.org