What's the alternative to left recursion in GrammarKit?

I'm trying to write a .bnf for a domain-specific language using GrammarKit.

In this language (as in most languages), boolean expressions can be combined and nested within parentheses with arbitrary depth, including such expressions as ((true and ((true) or(false))) and (false or true)).

I would express this using left recursion as something similar to the following:

{
boolean='regexp:(true|false)'
// ...
}
// ...
condition ::= lparen boolean rparen | boolean (and|or) boolean

but I don't know how to make GrammarKit bnf support this.

 

4 comments

Try this bnf:

{
extends(".*Expr")=Expr
tokens = [
BOOLEAN = "regexp:(true|false)"
AND = "and"
OR = "or"
WHITESPACE="regexp:[ \n\r\t\f]"
]

}

Expr ::= relationalExpr | primaryGroup
relationalExpr ::= Expr relOp Expr
parenExpr ::= parenConstruct
literalExpr ::= BOOLEAN

private primaryGroup ::= literalExpr | parenExpr
private relOp ::= AND | OR
private parenConstruct ::= '(' [ !')' Expr (',' Expr) * ] ')'

The line

extends(".*Expr")=Expr

is the important one. It allows left recursion for *Expr rules and make AST more clean.

0

Thanks, folks. George, I noticed that if I replace your final line there with this,

private parenConstruct  ::= Expr a Expr

The left recursion error message comes back. Having played around with this for a couple days now and having read the documentation a few times (Thank you Gregory), I am still pretty mystified by the behavior of extends(.*Expr)  .

I thought that, maybe, extends(.*Expr) means that any rule that ends with the characters "Expr" can use left recursion. This is not the case. I have not been able to identify exactly what circumstances in which an expression violates the left-recursion rule.

Any advice or additional documentation on this matter would be greatly appreciated. It's a great kit by the way and I'm really enjoying working with it.

Thanks

0

Consider the ordinary logic: a non-private rule means a dedicated AST node.

Then the AST for a text "TRUE" will be: FILE->EXPR->LITERAL->TRUE

"Extends" semantic collapses the meaningless EXPR resulting in: FILE->LITERAL->TRUE

 

Recursive Descend (top-down) algorithm and PEG doesn't support left recursion as it always leads to stack overflow.

But if certain conditions are met the Grammar-Kit generator switches to Pratt parsing algorithm that is suitable for operators and looks more like bottom-up parsing. Bottom-up parsers do tolerate left recursion.

 

One basically writes the "expression part" of the grammar in the left-recursive style (operator precedence increases left-to-right, top-to-bottom), uses "rightAssociative" attribute if needed, groups operators with the same priority using private rules if needed. And there's just one requirement:

 

the "root" expression rule MUST ALWAYS collapse, i.e. it shall never appear in AST

 

So Grammar-Kit switches to Pratt expression parsing algorithm if 1. left recursion, 2. a "root" rule that always collapses. For example:

 

E = ADD_E | MUL_E | LITERAL_E    // the root rule. always a choice

ADD_E ::= E '+' E {extends=E}    // specific operator. extends the root

MUL_E ::= E '*' E {extends=E}    // ...

LITERAL_E ::= ID {extends=E}     // ..

 

See that there's a left recursion and E node will be always collapsed because all its possible children extends it.

One can move {extends=E} to top attributes group using rule-name-pattern { extends(".*_E")=E } in order not to repeat that for each operator rule.

That's it. The "left recursion" warning shall disappear.

 

0

Please sign in to leave a comment.