The easiest way is to use a parser generator, for example the lex / yacc pair (or flex / bison).
The parse tree then falls out almost automatically from the semantic actions attached to each rule:
    program:
        statements                      { $$ = new Program($1); }
        ;

    statements:
        /* empty */                     { $$ = new StatementList(); }
        | statements statement          { $1->push_back($2); $$ = $1; }
        ;

    inner_statement:
        compound_statement
        | assignment
        | loop
        ;

    statement:
        inner_statement
        | vardef
        ;

    compound_statement:
        '{' statements '}'              { $$ = new CompoundStatement($2); }
        ;

    assignment:
        IDENT '=' expression ';'        { $$ = new AssignmentStatement($1, $3); }
        ;

    loop:
        while_loop
        | for_loop
        ;

    vardef:
        TYPE IDENT maybe_initializer ';' { $$ = new VarDefinition($1, $2, $3); }
        ;

    maybe_initializer:
        /* empty */                     { $$ = nullptr; }
        | '=' expression                { $$ = $2; }
        ;

    while_loop:
        WHILE '(' expression ')' inner_statement { $$ = new WhileLoop($3, $5); }
        ;
and so on.
See how the tree is built recursively?
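
To make the recursion concrete, here is a rough C++ sketch of node classes that such actions could construct. The class names are taken from the grammar above, but everything else (the `Node` base class, the members, the `dump()` method, and the `buildExample()` helper) is hypothetical and only there for illustration. `buildExample()` performs by hand the same `new` calls, in the same bottom-up order, that the parser's reductions would perform for the input `x = 1; while (x) { x = 0; }`.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical base class: every tree node can print itself.
struct Node {
    virtual ~Node() = default;
    virtual std::string dump() const = 0;
};

// Stand-in for whatever the 'expression' rules would build.
struct Expression : Node {
    std::string text;
    explicit Expression(std::string t) : text(std::move(t)) {}
    std::string dump() const override { return text; }
};

// Takes ownership of the raw pointers handed over by the actions.
struct StatementList : Node {
    std::vector<std::unique_ptr<Node>> items;
    void push_back(Node* s) { items.emplace_back(s); }
    std::string dump() const override {
        std::string out;
        for (const auto& s : items) {
            if (!out.empty()) out += " ";
            out += s->dump();
        }
        return out;
    }
};

struct AssignmentStatement : Node {
    std::string name;
    std::unique_ptr<Node> value;
    AssignmentStatement(std::string n, Node* v) : name(std::move(n)), value(v) {}
    std::string dump() const override { return name + " = " + value->dump() + ";"; }
};

struct CompoundStatement : Node {
    std::unique_ptr<StatementList> body;
    explicit CompoundStatement(StatementList* b) : body(b) {}
    std::string dump() const override { return "{ " + body->dump() + " }"; }
};

struct WhileLoop : Node {
    std::unique_ptr<Node> cond, body;
    WhileLoop(Node* c, Node* b) : cond(c), body(b) {}
    std::string dump() const override {
        return "while (" + cond->dump() + ") " + body->dump();
    }
};

struct Program : Node {
    std::unique_ptr<StatementList> statements;
    explicit Program(StatementList* s) : statements(s) {}
    std::string dump() const override { return statements->dump(); }
};

// Mimics the reductions for:  x = 1; while (x) { x = 0; }
std::unique_ptr<Program> buildExample() {
    auto* top = new StatementList();                      // statements: /* empty */
    top->push_back(new AssignmentStatement("x", new Expression("1")));
    auto* inner = new StatementList();                    // list inside the braces
    inner->push_back(new AssignmentStatement("x", new Expression("0")));
    auto* block = new CompoundStatement(inner);           // '{' statements '}'
    top->push_back(new WhileLoop(new Expression("x"), block));
    return std::unique_ptr<Program>(new Program(top));    // program: statements
}
```

Each node only knows about its direct children, so `buildExample()->dump()` walks the whole tree recursively and returns `x = 1; while (x) { x = 0; }`. In a real bison grammar you would also have to declare the semantic value types (for example with `%union` or `%type`) so that `$1`, `$2` and so on carry the right pointer types.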