* Techie(S)pArK *: Getting Started With ANTLR:Basics

Yeah! After a gap of a month or so, there is finally a new post on this blog! :)
Well, this post walks you through the basics of ANTLR. Previously, we learnt how to set up ANTLR as an external tool.
RECAP! It's here: ANTLR External Tool
:)
So, here we go....

What is ANTLR?

• ANTLR (ANother Tool for Language Recognition) is a language tool that provides a framework for constructing recognizers, interpreters, compilers, and translators from grammatical descriptions containing actions.

What can be the target languages?

• ActionScript
• Ada

• C
• C#; C#2
• C#3
• D
• Emacs ELisp
• Objective-C
• Java
• JavaScript
• Python
• Ruby
• Perl6
• Perl
• PHP
• Oberon
• Scala

What does ANTLR support?

• Tree construction
• Error recovery
• Error handling
• Tree walking
• Translation

What environment does it support?

ANTLRWorks is the IDE for ANTLR. It is a graphical grammar editor and debugger, written by Jean Bovet using Swing.

What can ANTLR be used for?
• "Real" programming languages

• domain-specific languages [DSL]

Who is using ANTLR?
• Programming languages: Boo, Groovy, Mantra, Nemerle, XRuby, etc.

• Other tools: Hibernate, IntelliJ IDEA, Jazillian, JBoss Rules, Keynote (Apple), WebLogic (Oracle), etc.

Where can you look for ANTLR?

You can always go to http://www.antlr.org for:
• downloads of ANTLR and ANTLRWorks, which are free and open source

• docs, articles, wiki, mailing list, examples....

You can catch everything here!

Row your Boat....

Basic terms

• Lexer : converts a stream of characters to a stream of tokens.
• Parser : processes a stream of tokens, possibly creating an AST
• Abstract Syntax Tree (AST): an intermediate tree representation of the parsed input that is simpler to process than the stream of tokens. It can also be processed multiple times.
• Tree Parser: processes an AST
• StringTemplate: a library that supports using templates with placeholders for outputting text
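
To make the AST and tree-walking terms a little more concrete, here is a rough Java sketch. It assumes a made-up `Node` class and a hand-built tree for a declaration like "int a,b;" (the root label `DECL` is invented for illustration). None of this is ANTLR's own tree API; it only shows why a tree is convenient: the same tree can be walked more than once, each time for a different purpose.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: a hand-rolled tree, not ANTLR's tree classes.
class AstSketch {

    // A minimal tree node: some text plus any number of children.
    static class Node {
        final String text;
        final List<Node> children = new ArrayList<>();

        Node(String text) {
            this.text = text;
        }

        Node add(Node child) {
            children.add(child);
            return this;
        }
    }

    // First walk: render the tree in a LISP-like form.
    static String render(Node n) {
        if (n.children.isEmpty()) {
            return n.text;
        }
        StringBuilder sb = new StringBuilder("(").append(n.text);
        for (Node c : n.children) {
            sb.append(' ').append(render(c));
        }
        return sb.append(')').toString();
    }

    // Second walk over the same tree: count the leaf nodes.
    static int countLeaves(Node n) {
        if (n.children.isEmpty()) {
            return 1;
        }
        int total = 0;
        for (Node c : n.children) {
            total += countLeaves(c);
        }
        return total;
    }

    public static void main(String[] args) {
        // Hypothetical tree for "int a,b;": the root DECL with the type and the names as children
        Node decl = new Node("DECL")
                .add(new Node("int"))
                .add(new Node("a"))
                .add(new Node("b"));
        System.out.println(render(decl));
        System.out.println(countLeaves(decl));
    }
}
```

Running `main` prints `(DECL int a b)` and then `3`. Notice that the comma and the semicolon are gone: the tree keeps only what later processing needs.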

General Steps

• Write the grammar in one or more files
• Write string templates [optional]
• Debug your grammar with ANTLRWorks
• Generate classes from grammar
• Write an application that uses generated classes
• Feed the application text that conforms to the grammar

A Bit Further....

Let's write a simple grammar which consists of
• Lexer
• Parser
Lexer: breaks the input stream into tokens
Let's take the example of a simple C declaration of the form "int a,b;" or "int a;", and the same with float. We can write the lexer as follows:

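//TestLexer.g
lexer grammar TestLexer;
// DATATYPE is listed before ID so that 'int' and 'float' are not matched as identifiers
DATATYPE: 'int' | 'float';
ID : ('a'..'z'|'A'..'Z'|'_') ('a'..'z'|'A'..'Z'|'0'..'9'|'_'|'.')*;
COMMA: ',';
SEMICOLON:';';
// Keep whitespace away from the parser
WS : (' '|'\t'|'\r'|'\n')+ {$channel=HIDDEN;};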

As we can see, these rules describe how the input characters are converted to tokens.
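
If it helps to picture what a lexer for these tokens does, here is a small hand-written Java sketch. It is only an illustration under a few assumptions: the class name `LexerSketch` and the `TYPE(text)` output format are invented here, whitespace is simply skipped, and this is not the code ANTLR generates from the grammar.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: a hand-written tokenizer, not ANTLR's generated lexer.
class LexerSketch {

    // First character of an identifier: an ASCII letter or an underscore.
    static boolean isIdStart(char c) {
        return (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z') || c == '_';
    }

    // Later characters may also be digits or '.', mirroring the ID rule.
    static boolean isIdPart(char c) {
        return isIdStart(c) || (c >= '0' && c <= '9') || c == '.';
    }

    // Turns a string into a list of tokens written as TYPE(text).
    static List<String> tokenize(String input) {
        List<String> tokens = new ArrayList<>();
        int i = 0;
        while (i < input.length()) {
            char c = input.charAt(i);
            if (Character.isWhitespace(c)) {
                i++; // whitespace separates tokens but produces none
            } else if (c == ',') {
                tokens.add("COMMA(,)");
                i++;
            } else if (c == ';') {
                tokens.add("SEMICOLON(;)");
                i++;
            } else if (isIdStart(c)) {
                int start = i;
                while (i < input.length() && isIdPart(input.charAt(i))) {
                    i++;
                }
                String word = input.substring(start, i);
                // The words 'int' and 'float' are data types, everything else is an identifier
                boolean isType = word.equals("int") || word.equals("float");
                tokens.add((isType ? "DATATYPE(" : "ID(") + word + ")");
            } else {
                throw new IllegalArgumentException(
                        "Unexpected character '" + c + "' at index " + i);
            }
        }
        return tokens;
    }

    public static void main(String[] args) {
        System.out.println(tokenize("int a,b;"));
    }
}
```

For the input "int a,b;" this prints `[DATATYPE(int), ID(a), COMMA(,), ID(b), SEMICOLON(;)]`: five tokens instead of eight characters.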
So, now let's write some rules that process the generated tokens and may create a parse tree accordingly.

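//TestParser.g
parser grammar TestParser;
// tokenVocab pulls in the token types defined by TestLexer
options { language = Java; tokenVocab = TestLexer; }
// One declaration, e.g. "int a,b;"
decl : DATATYPE ID (COMMA ID)* SEMICOLON ;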
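
To get a feel for how a parser rule consumes tokens, here is a hand-written recursive-descent sketch in Java for a rule of the shape DATATYPE ID (COMMA ID)* followed by a closing SEMICOLON, which is what an input like "int a,b;" needs. The class `DeclParserSketch`, its `match` helper, and the use of plain strings as token types are all assumptions made for illustration; ANTLR's generated parser is considerably more elaborate.

```java
import java.util.Arrays;
import java.util.List;

// Illustration only: a hand-written parser, not the code ANTLR generates.
class DeclParserSketch {
    private final List<String> tokens; // token type names such as "DATATYPE" or "ID"
    private int pos = 0;               // index of the next unread token

    DeclParserSketch(List<String> tokens) {
        this.tokens = tokens;
    }

    // Consumes the next token if it has the expected type, otherwise reports an error.
    private void match(String expected) {
        if (pos >= tokens.size() || !tokens.get(pos).equals(expected)) {
            String found = pos < tokens.size() ? tokens.get(pos) : "end of input";
            throw new IllegalStateException("Expected " + expected + " but found " + found);
        }
        pos++;
    }

    // decl : DATATYPE ID (COMMA ID)* SEMICOLON
    void decl() {
        match("DATATYPE");
        match("ID");
        // The (COMMA ID)* part becomes a loop that runs while the next token is a COMMA
        while (pos < tokens.size() && tokens.get(pos).equals("COMMA")) {
            match("COMMA");
            match("ID");
        }
        match("SEMICOLON");
    }

    // True if the whole token list forms exactly one valid declaration.
    static boolean accepts(List<String> tokens) {
        DeclParserSketch parser = new DeclParserSketch(tokens);
        try {
            parser.decl();
            return parser.pos == tokens.size();
        } catch (IllegalStateException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // Token types for "int a,b;"
        System.out.println(accepts(Arrays.asList("DATATYPE", "ID", "COMMA", "ID", "SEMICOLON")));
        // Token types for "int a,;" (an identifier is missing after the comma)
        System.out.println(accepts(Arrays.asList("DATATYPE", "ID", "COMMA", "SEMICOLON")));
    }
}
```

Running `main` prints `true` for the first token list and `false` for the second. Each rule becomes a method, each token reference becomes a `match` call, and the `*` becomes a loop, which is roughly the idea behind the methods you call on a generated parser.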

Running ANTLR on the grammars only generates the lexer and parser classes, TestLexer and TestParser. To actually try the grammar on some input, we need a test rig with a main() method as follows:

// Test.java
import org.antlr.runtime.*;

public class Test {

    public static void main(String[] args) throws Exception {
        // Create an input character stream from a file (give the path to your input file)
        ANTLRFileStream input = new ANTLRFileStream("input");
        // Create a TestLexer that feeds from that stream
        TestLexer lexer = new TestLexer(input);
        // Create a stream of tokens fed by the lexer
        CommonTokenStream tokens = new CommonTokenStream(lexer);
        // Create a parser that feeds off the token stream
        TestParser parser = new TestParser(tokens);
        // Begin parsing at rule decl
        parser.decl();
    }
}

We shall see how to create an AST and walk over the tree in the next blog post...
Happy learning....! :)

Wednesday, 6 June 2012
