IBM
Contents Index Previous Next



Syntax Definition


In this section, a notation called a context-free grammar (grammar for short) is introduced. This notation is used for specifying the syntax of a language.

A grammar naturally describes the hierarchical structure of many programming language constructs. For example, an if-else statement in C has the form as below.

if ( expression ) statement else statement

That is, the statement is the concatenation of the keyword if, an opening parenthesis, an expression, a closing parenthesis, a statement, the keyword else, and another statement. Using the variables expr to denote an expression and the variables stmt, stmt1 and stmt2 to denote three (possible different) statements, this structuring rules can be expressed as

stmt -> if ( expr ) stmt1 else stmt2

in which the arrow may be read as "can have the form". Such a rule is called a production. In a production lexical elements like the keyword if and the parentheses are called tokens. Variables like expr and stmt2 represent sequence of tokens and are called nonterminals.

A context-free grammar has four components:

Through this chapter, examples and pictures will be present in order to clarify a topic or explain a specific concept. As TTCN, at this stage, has a too large grammar definition to be included in this document, a language called SMALL is defined that in this chapter will be used for examples and discussions.

SMALL will be defined as a language only consisting of digits and the plus and minus signs, e.g., 9-5+2, 3-1 and 7. In the language SMALL, a plus or minus sign must appear between two digits, and expressions as above will be referred to as "lists of digits separated by plus or minus sign".

The following grammar specifies the syntax of this language with list as its start symbol:

1:list -> list + digit
2:list -> list - digit
3:list -> digit
4:digit -> 0 | 1 | 2 | 3 | 4 | 5 | 6 | 7 | 8 | 9

From production (3), a single digit by itself is a list. Productions (1) and (2) express the fact that if one take any list and follow it by a plus or a minus sign and then another digit one has a new list. For example, it is possible to deduce that 9-5+2 is a list as follows:

a) 9 is a list by production (3), since 9 is a digit.
b) 9-5 is a list by production (2), since 9 is a list and 5 is a digit.
c) 9-5+2 is a list by production (1), since 9-5 is a list and 2 is a digit.

This reasoning is illustrated by the tree in the next figure. Each node in the tree is labelled by a grammar symbol. An interior node and its children correspond to a production; the interior node corresponds to the left side of the production, the children to the right side. Such trees are called parse trees and are discussed in the next chapter.

Figure 173 : A parse tree for 9-5+2


http://www.ibm.com/rational
Contents Index Previous Next