#1 shandan97

Using ANTLR to create a grammar.

Posted 24 December 2012 - 08:03 AM

I need some help using ANTLR to create a language grammar. I'm just tinkering with it to see what I can do, and I'm trying to create something simple. I don't quite understand what tokens are in ERNF. I've checked almost every website I could find, and I think I have a little understanding. Could someone please at least summarize what tokens are and what they do?
Thanks!
Shandan

Replies To: Using ANTLR to create a grammar.

#2 sepp2k

Re: Using ANTLR to create a grammar.

Posted 24 December 2012 - 09:43 AM

EBNF¹ doesn't have tokens. That concept is only really used by real-world parsers for implementation reasons, not at the theoretical level. So reading about EBNF won't tell you about tokens.

So what are tokens? Well, in most real-world applications parsing happens in two phases:

1. Tokenizing (a.k.a. lexing). This takes a string of characters and splits it into a list of tokens, which you can think of as the "words" that make up the program. For example, the string "int x = 42;" might be split into the tokens keywordInt, identifier("x"), operatorEquals, number(42), semicolon. The actual parser can process this list more efficiently than it could a string of characters.

2. The actual parsing phase. Here the list of tokens is checked against certain patterns, and certain actions are executed depending on which patterns matched. Usually the parser creates some sort of tree data structure that represents the program. The rest of your compiler or interpreter can then work with that tree.
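
To make the tokenizing phase concrete, here is a minimal hand-written sketch in Python. It is not what ANTLR generates. The token names are taken from the example above, while the `Token` type, the `TOKEN_SPEC` table and the `tokenize` function are names made up for this illustration.

```python
import re
from typing import NamedTuple


class Token(NamedTuple):
    kind: str  # e.g. "identifier"
    text: str  # the characters that were matched


# Hypothetical token definitions for a tiny language.
# Order matters: the keyword must be tried before the identifier pattern.
TOKEN_SPEC = [
    ("keywordInt", r"int\b"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("number", r"\d+"),
    ("operatorEquals", r"="),
    ("semicolon", r";"),
    ("whitespace", r"\s+"),
]
MASTER = re.compile("|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC))


def tokenize(source):
    """Split a string of characters into a list of tokens."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at position {pos}")
        # Whitespace only separates tokens, so it is dropped here.
        if match.lastgroup != "whitespace":
            tokens.append(Token(match.lastgroup, match.group()))
        pos = match.end()
    return tokens


for token in tokenize("int x = 42;"):
    print(token.kind, repr(token.text))
```

The parser never sees spaces or individual characters, only the five tokens, which is what makes the second phase simpler to write.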

When working with a parser generator like ANTLR (or yacc/bison, JavaCC, etc.), you write down a grammar that tells ANTLR what constructs exist in your language and what those constructs look like. You then tell it what to do when it encounters each construct. So you could for example say "My language has function definitions, function calls, variable definitions and variable usages. A function definition looks like this: [...], a variable definition like this: [...] [and so on]. When you encounter a function call, create a FunctionCall object that contains the parsed arguments as members. When you encounter a variable usage, create a VariableUsage object with the variable name as a member. [And so on.]"
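
As a rough idea of what "create a FunctionCall object" and "create a VariableUsage object" can look like, here is a small hand-written parser in Python. A parser generator would produce code like this for you from the grammar, so treat it purely as a sketch. The token kinds (`identifier`, `leftParen`, `rightParen`, `comma`), the syntax `name(arg, arg, ...)` and the `Parser` class are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class VariableUsage:
    name: str


@dataclass
class FunctionCall:
    name: str
    arguments: list


class Parser:
    """Parses: expression := identifier [ '(' [expression {',' expression}] ')' ]"""

    def __init__(self, tokens):
        self.tokens = tokens  # list of (kind, text) pairs from a tokenizer
        self.pos = 0

    def peek(self):
        # Kind of the next token, or "end" when the input is used up.
        return self.tokens[self.pos][0] if self.pos < len(self.tokens) else "end"

    def expect(self, kind):
        if self.peek() != kind:
            raise SyntaxError(f"expected {kind}, found {self.peek()}")
        text = self.tokens[self.pos][1]
        self.pos += 1
        return text

    def parse_expression(self):
        name = self.expect("identifier")
        if self.peek() != "leftParen":
            # A bare name is a variable usage.
            return VariableUsage(name)
        # A name followed by '(' is a function call.
        self.expect("leftParen")
        arguments = []
        if self.peek() != "rightParen":
            arguments.append(self.parse_expression())
            while self.peek() == "comma":
                self.expect("comma")
                arguments.append(self.parse_expression())
        self.expect("rightParen")
        return FunctionCall(name, arguments)


def parse(tokens):
    parser = Parser(tokens)
    tree = parser.parse_expression()
    if parser.peek() != "end":
        raise SyntaxError(f"unexpected {parser.peek()} after expression")
    return tree


# Tokens for the input "f(x, g(y))"
tokens = [
    ("identifier", "f"), ("leftParen", "("), ("identifier", "x"), ("comma", ","),
    ("identifier", "g"), ("leftParen", "("), ("identifier", "y"),
    ("rightParen", ")"), ("rightParen", ")"),
]
print(parse(tokens))
```

The result is a tree: a FunctionCall for `f` whose arguments are a VariableUsage and another FunctionCall. The rest of a compiler or interpreter would work with that tree.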

¹ I'm assuming ERNF was a typo. If it wasn't, I'm not familiar with that abbreviation.