C++ Compiler "grammar"
14 Replies - 3488 Views - Last Post: 07 December 2012 - 01:36 PM
#1
C++ Compiler "grammar"
Posted 28 November 2012 - 06:37 PM
Note: I want to get a little more in-depth knowledge of the C++ grammar before I pick up books on compiler theory or assembly language.
Replies To: C++ Compiler "grammar"
#2
Re: C++ Compiler "grammar"
Posted 28 November 2012 - 06:52 PM

I think you've got it backwards. If anything you want to become comfortable parsing simple languages before delving into C++. I can't think of a harder syntax to work with.
Here's (direct PDF link) a draft of one of the open ISO standards. Look at the grammar section towards the end of the PDF. I think it's pretty much the same thing as the accepted standard. You can buy the latest published standard here.
Some good resources on Compiler writing are: The Dragon Book, Modern Compiler Implementation in C/Java/ML and Engineering a Compiler. There's also a Compilers course on Coursera that runs twice a year.
This post has been edited by blackcompe: 28 November 2012 - 07:36 PM
#3
Re: C++ Compiler "grammar"
Posted 28 November 2012 - 07:05 PM
Also, welcome to the ACC.
#4
Re: C++ Compiler "grammar"
Posted 29 November 2012 - 12:50 AM
If you're into compiler creation, though, you'd do better to start with books that focus on that topic rather than on the syntax of one particular language.
#5
Re: C++ Compiler "grammar"
Posted 29 November 2012 - 07:48 AM
#6
Re: C++ Compiler "grammar"
Posted 29 November 2012 - 08:56 AM
Great suggestion by the way, OP - just google "Bill Campbell compilers java".
#7
Re: C++ Compiler "grammar"
Posted 29 November 2012 - 12:14 PM
#8
Re: C++ Compiler "grammar"
Posted 29 November 2012 - 12:51 PM
Xupicor, on 29 November 2012 - 10:56 AM, said:
I agree 100%, there's so much you can do with the C and C++ languages that makes it very difficult to write a compiler for them. One example is how certain operators do certain things depending on the context. Just a simple one, like the minus sign ('-'), can be either a unary or binary operator depending on the context. Example:
/* Binary minus operator */
int a = 5 - 2; /* a == 3 */

/* Unary minus operator */
int b = -3; /* b == -3 */
If you still want to learn about C/C++ compilers, you may want to look into LLVM (Low Level Virtual Machine) and Clang, which is an "LLVM native" C/C++/Objective-C compiler.
This post has been edited by vividexstance: 29 November 2012 - 12:52 PM
#9
Re: C++ Compiler "grammar"
Posted 29 November 2012 - 12:59 PM
vividexstance, on 29 November 2012 - 02:51 PM, said:
Sure, and the '-' character can also be part of a prefix or postfix decrement operator, or part of the -= operator. But as I recall, there was nothing very difficult about parsing those, and once you've finished parsing them you don't have to worry about it any more (the operators themselves are obviously distinct entities).
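To illustrate why the unary/binary minus case is usually manageable, here is a minimal sketch of a recursive-descent evaluator. The class name `MinusDemoParser`, the helper `evaluate`, and the toy grammar (integers, `+`, `-`, `*`, parentheses) are assumptions made up for this example, not anything from a real compiler. The point is only that position decides the meaning: a `-` where an operand is expected is unary, a `-` after an operand is binary.

```cpp
#include <cctype>
#include <cstddef>
#include <stdexcept>
#include <string>
#include <utility>

// Hypothetical toy evaluator: integers, + - *, parentheses, unary minus.
class MinusDemoParser {
public:
    explicit MinusDemoParser(std::string text) : src_(std::move(text)) {}

    int parse() {
        int value = expression();
        skipSpaces();
        if (pos_ != src_.size()) throw std::runtime_error("trailing input");
        return value;
    }

private:
    // expression := term (('+' | '-') term)*
    int expression() {
        int value = term();
        for (;;) {
            skipSpaces();
            if (match('+')) value += term();
            else if (match('-')) value -= term();  // binary: follows an operand
            else return value;
        }
    }

    // term := unary ('*' unary)*
    int term() {
        int value = unary();
        for (;;) {
            skipSpaces();
            if (match('*')) value *= unary();
            else return value;
        }
    }

    // unary := '-' unary | primary
    int unary() {
        skipSpaces();
        if (match('-')) return -unary();  // unary: an operand is expected here
        return primary();
    }

    // primary := number | '(' expression ')'
    int primary() {
        skipSpaces();
        if (match('(')) {
            int value = expression();
            skipSpaces();
            if (!match(')')) throw std::runtime_error("expected ')'");
            return value;
        }
        if (!atDigit()) throw std::runtime_error("expected number");
        int value = 0;
        while (atDigit()) value = value * 10 + (src_[pos_++] - '0');
        return value;
    }

    bool atDigit() const {
        return pos_ < src_.size() &&
               std::isdigit(static_cast<unsigned char>(src_[pos_]));
    }

    bool match(char c) {
        if (pos_ < src_.size() && src_[pos_] == c) { ++pos_; return true; }
        return false;
    }

    void skipSpaces() {
        while (pos_ < src_.size() &&
               std::isspace(static_cast<unsigned char>(src_[pos_]))) ++pos_;
    }

    std::string src_;
    std::size_t pos_ = 0;
};

int evaluate(const std::string& text) { return MinusDemoParser(text).parse(); }
```

Calling `evaluate("5 - -3")` goes through both branches: the first `-` is consumed in `expression()` as binary, the second in `unary()`. Note that this sketch has no `--` or `-=` tokens at all; a real C or C++ lexer would recognise those as separate tokens before the parser ever sees them.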
#10
Re: C++ Compiler "grammar"
Posted 29 November 2012 - 01:52 PM
#11
Re: C++ Compiler "grammar"
Posted 29 November 2012 - 04:33 PM
Examples of where it gets tricky:
Take the arguments of a template: they can be either types or expressions (which are parsed differently). The only way to know is contextual disambiguation based on the kind of template being used.
What about this?
y * x;
Is that a multiplication, or am I declaring a pointer? Again, this is contextual: it depends on whether y names a type.
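A small sketch of both readings, with the same statement text compiled in two different contexts. The namespace names, the `Counter` type, and the `demo` functions are hypothetical and exist only to make the difference observable.

```cpp
#include <type_traits>

namespace as_declaration {
    struct y {};  // here y names a type

    inline bool demo() {
        y * x;  // declaration: x is an (uninitialized, never read) pointer to y
        return std::is_same<decltype(x), y*>::value;
    }
}

namespace as_expression {
    // A type whose operator* records that it was called.
    struct Counter { int* hits; };

    inline int operator*(const Counter& lhs, const Counter&) {
        ++*lhs.hits;
        return 0;
    }

    inline int demo() {
        int hits = 0;
        Counter y{&hits};
        Counter x{&hits};
        y * x;  // expression statement: y is an object, so this calls operator*
        return hits;
    }
}
```

The parser cannot pick between the two without knowing what `y` was declared as earlier, which is why a C++ front end has to consult symbol information while it is still parsing.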
Now here is a really tricky one:
x<y>::z
Which is it?
((x) < (y)) > (::z)
Or is x<y> a type and z a static member?
Again, the answer is contextual.
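Both readings can be made to compile. The sketch below uses made-up namespaces and a global `z`, purely as an illustration: in one context `x` is a class template, in the other `x` and `y` are plain integers.

```cpp
int z = 0;  // global variable, reachable as ::z

namespace as_template {
    constexpr int y = 3;

    template <int N>
    struct x {
        static constexpr int z = N * 10;
    };

    inline int demo() {
        return x<y>::z;  // x<y> is a type; z is its static member
    }
}

namespace as_comparison {
    inline bool demo() {
        int x = 1;
        int y = 2;
        return x<y>::z;  // parsed as (x < y) > ::z, comparing against the global z
    }
}
```

Compilers may warn about the chained comparison in the second function, which is a fair hint at how rarely anyone writes that form on purpose. The token sequence is identical in both functions; only the declarations in scope differ.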
Templates are the primary source of ambiguities in the language, but by no means the only one.
In C++11 they made it so that '>>' can close two template argument lists, which is a tough one to get right. My best guess is that most implementations will treat it as a single '>' in a template context and put a '>' back into the stream. That requires the parser to have a way to communicate with the lexer or the underlying character stream. Some might place more of the burden on the lexer by simply telling it "if you get a '>>', split it into two '>'s". Some might use a scannerless parser, in which case the parser only ever requests a '>' and never a '>>', but that's a pretty rare case.
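Here is a rough sketch of the first strategy (consume one '>' and leave the other in the stream). Everything in it is an assumption for illustration: the token kinds, `lexAngles`, `TemplateArgParser`, and `templateDepth` are hypothetical names, the grammar only covers `Ident` optionally followed by one bracketed type, and no claim is made that any real compiler is structured this way.

```cpp
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

enum class Tok { Ident, Less, Greater, ShiftRight, Error };

// Maximal-munch lexer: ">>" always comes out as a single ShiftRight token.
std::vector<Tok> lexAngles(const std::string& text) {
    std::vector<Tok> out;
    std::size_t i = 0;
    auto isWordChar = [&](std::size_t k) {
        return std::isalnum(static_cast<unsigned char>(text[k])) || text[k] == '_';
    };
    while (i < text.size()) {
        if (std::isspace(static_cast<unsigned char>(text[i]))) {
            ++i;
        } else if (isWordChar(i)) {
            while (i < text.size() && isWordChar(i)) ++i;
            out.push_back(Tok::Ident);
        } else if (text[i] == '<') {
            out.push_back(Tok::Less);
            ++i;
        } else if (text[i] == '>') {
            const bool twin = i + 1 < text.size() && text[i + 1] == '>';
            out.push_back(twin ? Tok::ShiftRight : Tok::Greater);
            i += twin ? 2 : 1;
        } else {
            out.push_back(Tok::Error);
            ++i;
        }
    }
    return out;
}

class TemplateArgParser {
public:
    explicit TemplateArgParser(std::vector<Tok> toks) : toks_(std::move(toks)) {}

    // type := Ident ('<' type '>')?   Returns nesting depth, or -1 on error.
    int parseType() {
        if (!accept(Tok::Ident)) return -1;
        if (!accept(Tok::Less)) return 0;
        const int inner = parseType();
        if (inner < 0 || !acceptClosingAngle()) return -1;
        return inner + 1;
    }

    bool atEnd() const { return pos_ == toks_.size(); }

private:
    bool accept(Tok t) {
        if (pos_ < toks_.size() && toks_[pos_] == t) { ++pos_; return true; }
        return false;
    }

    // In a template context, a ">>" is taken as one '>' now and one '>' later.
    bool acceptClosingAngle() {
        if (accept(Tok::Greater)) return true;
        if (pos_ < toks_.size() && toks_[pos_] == Tok::ShiftRight) {
            toks_[pos_] = Tok::Greater;  // leave the second '>' in the stream
            return true;
        }
        return false;
    }

    std::vector<Tok> toks_;
    std::size_t pos_ = 0;
};

int templateDepth(const std::string& text) {
    TemplateArgParser parser(lexAngles(text));
    const int depth = parser.parseType();
    return parser.atEnd() ? depth : -1;
}
```

With this, `templateDepth("vector<vector<int>>")` and `templateDepth("vector<vector<int> >")` both give 2, while an unbalanced `a<b>>` is rejected because a stray '>' is left over. The split happens only when the parser is looking for a closing angle bracket, so a shift operator elsewhere would be unaffected.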
Add to this that C++ requires non-trivial look-ahead, is context-sensitive, and has a huge grammar (and language, for that matter), and you have one of the hardest languages to parse in mainstream use, if not the hardest.
This post has been edited by ishkabible: 29 November 2012 - 04:45 PM
#12
Re: C++ Compiler "grammar"
Posted 07 December 2012 - 11:18 AM
#13
Re: C++ Compiler "grammar"
Posted 07 December 2012 - 11:44 AM
On the list would actually be C. Not C++, not even modern C, but plain, old, K&R (read THAT book) C. It really is a simple language, free of most modern gotchas, including the ones people are concerned with in C++.
The easiest I can think of would be BASIC, the one with the line numbers. A seriously dead language of my youth, it is lean and practically pre-parsed. With GOTO and line numbers, you can translate a lot of it one-to-one into assembly. I've considered doing this for fun, but never got around to it.
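As a very rough sketch of that one-to-one idea: each line number becomes a label and each GOTO becomes a jump. The function name `translateLine`, the `L<number>` label scheme, and the x86-flavoured `jmp`/`ret` output are all assumptions for illustration, and only GOTO and END are handled here.

```cpp
#include <sstream>
#include <string>

// Translate one line of line-numbered BASIC into a pseudo-assembly line.
// Only "<n> GOTO <m>" and "<n> END" are handled in this sketch.
std::string translateLine(const std::string& line) {
    std::istringstream in(line);
    int lineNumber = 0;
    std::string keyword;
    if (!(in >> lineNumber >> keyword)) {
        return "; unparsed: " + line;
    }

    // The BASIC line number maps directly onto an assembly label.
    const std::string label = "L" + std::to_string(lineNumber) + ":";

    if (keyword == "GOTO") {
        int target = 0;
        if (in >> target) {
            return label + " jmp L" + std::to_string(target);
        }
    } else if (keyword == "END") {
        return label + " ret";
    }
    return label + " ; not handled in this sketch: " + line;
}
```

For example, `translateLine("10 GOTO 30")` yields `L10: jmp L30`. Statements such as LET or PRINT would still need an expression parser, so the one-to-one mapping covers the control flow rather than the whole language.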
#14
Re: C++ Compiler "grammar"
Posted 07 December 2012 - 11:57 AM
#15
Re: C++ Compiler "grammar"
Posted 07 December 2012 - 01:36 PM
This post has been edited by ishkabible: 07 December 2012 - 01:46 PM