The dangling else is a problem in programming of parser generators in which an optional else clause in an if–then(–else) statement results in nested conditionals being ambiguous. Formally, the reference context-free grammar of the language is ambiguous, meaning there is more than one correct parse tree.
In many programming languages one may write conditionally executed code in two forms: the if-then form, and the if-then-else form – the else clause is optional: if a then s if b then s1 else s2
This gives rise to an ambiguity in interpretation when there are nested statements, specifically whenever an if-then form appears as s1
in an if-then-else form:
if a then if b then s else s2In this example, s
is unambiguously executed when a
is true and b
is true, but one may interpret s2
as being executed when a
is false (thus attaching the else to the first if) or when a
is true and b
is false (thus attaching the else to the second if). In other words, one may see the previous statement as either of the following expressions: if a then (if b then s) else s2 if a then (if b then s else s2)
The dangling else problem dates to ALGOL 60,[1] and has been resolved in various ways in subsequent languages. In LR parsers, the dangling else is the archetypal example of a shift-reduce conflict.
This is a problem that often comes up in compiler construction, especially scannerless parsing. The convention when dealing with the dangling else is to attach the else to the nearby if statement,[2] allowing for unambiguous context-free grammars, in particular. Programming languages like Pascal,[3] C[4] and Java[5] follow this convention, so there is no ambiguity in the semantics of the language, though the use of a parser generator may lead to ambiguous grammars. In these cases alternative grouping is accomplished by explicit blocks, such as begin...end
in Pascal[6] and {...}
in C.
Depending on the compiler construction approach, one may take different corrective actions to avoid ambiguity:
The problem can also be solved by making explicit the link between an else and its if, within the syntax. This usually helps avoid human errors.[7]
Possible solutions are:
if
without a fallback clause to be an error, effectively distinguishing conditional expressions (i.e if
) from conditional statements (i.e when
and unless
, which do not have fallback clauses).if e do s
for the one-alternative case and if e1 then e2 else e3
for the general case.Concrete examples follow.
In C, the grammar reads, in part:
statement = ... | selection-statement selection-statement = ... | IF (expression) statement | IF (expression) statement ELSE statement
Thus, without further rules, the statementelse
block is associated with the nearest if
. Therefore, the first tree is chosen.
The above example could be rewritten in the following way to remove the ambiguity :
statement: open_statement | closed_statement ; open_statement: IF '(' expression ')' statement | IF '(' expression ')' closed_statement ELSE open_statement ; closed_statement: non_if_statement | IF '(' expression ')' closed_statement ELSE closed_statement ; non_if_statement: ... ;
Any other statement-related grammar rules may also have to be duplicated in this way if they may directly or indirectly end with a statement
or selection-statement
non-terminal.
However, we give grammar that includes both of if and while statements.
statement: open_statement | closed_statement ; open_statement: IF '(' expression ')' statement | IF '(' expression ')' closed_statement ELSE open_statement | WHILE '(' expression ')' open_statement ; closed_statement: simple_statement | IF '(' expression ')' closed_statement ELSE closed_statement | WHILE '(' expression ')' closed_statement ; simple_statement: ... ;
Finally, we give the grammar that forbids ambiguous IF statements.
statement: open_statement | closed_statement ; open_statement: IF '(' expression ')' statement | IF '(' expression ')' closed_statement ELSE open_statement | WHILE '(' expression ')' open_statement ; closed_statement: simple_statement | IF '(' expression ')' closed_statement ELSE closed_statement | WHILE '(' expression ')' closed_statement ; simple_statement: ... ;
With this grammar the statement if (a) if (b) c else d
can only be parsed one way, because the other interpretation (if (a) {if (b) c} else d
) is produced as
statement open_statement IF '(' expression ')' closed_statement ELSE open_statement 'if' '(' 'a' ')' closed_statement 'else' 'd'and then the parsing fails trying to match
closed_statement
to "if (b) c". An attempt with closed_statement
fails in the same way. The other parse, if (a) {if (b) c else d}
) succeeds:statement open_statement IF '(' expression ')' statement IF '(' expression ')' closed_statement IF '(' a ')' (IF '(' expression ')' closed_statement ELSE closed_statement) IF '(' a ')' (IF '(' b ')' c ELSE 'd')