QUOTE(tiff88 @ 3 Apr, 2009 - 09:06 AM)

Hi,
can somebody explain to me (maybe with an analogy) what exactly a parser is, does. I mean, I know that it checks for correct syntax and all, but for example, if you send an XML message why do you need an XML parser. Anyways, thanks for your guys' help.
To send an XML message, you do not need a parser. But to receive the message, i.e. to process it, a parser would be "useful". If that satisfies your question, then read no further. Otherwise, I am sure you will already be familiar with a lot of what I will be saying, but for those who are not familiar with the concept of parsing, perhaps the following will be useful:
A language (spoken or programming) will have a specification that defines its syntax and semantics. By syntax I mean the rules on how to form valid sentences of that language and by semantics I mean defining what a sentence means. For example, in English we can say that a sentence consists of a subject followed by a predicate (I know that other possibilities exist). The subject consists of an article followed by an adjective followed by a noun. The predicate consists of a verb followed by an adverb. Again, many other possibilities actually exist for English but if we limit ourselves to just these rules, then we are essentially defining a subset of English which we will call MyEnglish. We can then proceed to enumerate all the possible words that are articles, adjectives, nouns, verbs, and adverbs. A formal specification of MyEnglish might be the following grammar made of productions (AKA rewriting rules):
CODE
(1) <sentence>: <subject> <predicate>
(2) <subject>: <article> <adjective> <noun>
(3) <predicate>: <verb> <adverb>
(4) <article>: The | A | An
(5) <adjective>: pretty | handsome | black | white
(6) <noun>: dog | cat | man | woman
(7) <verb>: talks | runs | swims | eats
(8) <adverb>: quickly | slowly
The symbols enclosed in angle brackets, e.g. <sentence>, are not the actual words that make up MyEnglish but are grammatical elements or "parts of speech" defined in terms of other grammatical elements or words. Production (6) basically says that a noun can be one of the words dog, cat, man, or woman (the vertical bar means "or").
These productions or rewriting rules constitute the syntax rules for MyEnglish. For example, a valid sentence in this language would be:
CODE
The black cat runs quickly.
What the sentence "means" (its semantics) is another issue. We can generate all the possible legal sentences in MyEnglish by starting with the first rewriting rule and applying additional rewriting rules until we are left with nothing but words (there are no more grammatical parts of speech that can be rewritten):
CODE
<sentence> is rewritten as:
<subject> <predicate> is rewritten as:
<article> <adjective> <noun> <predicate> is rewritten as:
The <adjective> <noun> <predicate> is rewritten as:
The black <noun> <predicate> is rewritten as:
The black cat <predicate> is rewritten as:
The black cat <verb> <adverb> is rewritten as:
The black cat runs <adverb> is rewritten as:
The black cat runs quickly
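The rewriting process above can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the dictionary layout and the names GRAMMAR, derive and all_sentences are made up for this post, and the nouns are listed under <noun>. The derive function always rewrites the leftmost grammatical element, which reproduces the steps shown above.

```python
from itertools import product

# MyEnglish as a dictionary: each grammatical element maps to a list of
# alternatives, and each alternative is a list of symbols (hypothetical layout).
GRAMMAR = {
    "<sentence>": [["<subject>", "<predicate>"]],
    "<subject>": [["<article>", "<adjective>", "<noun>"]],
    "<predicate>": [["<verb>", "<adverb>"]],
    "<article>": [["The"], ["A"], ["An"]],
    "<adjective>": [["pretty"], ["handsome"], ["black"], ["white"]],
    "<noun>": [["dog"], ["cat"], ["man"], ["woman"]],
    "<verb>": [["talks"], ["runs"], ["swims"], ["eats"]],
    "<adverb>": [["quickly"], ["slowly"]],
}


def derive(choices):
    """Rewrite the leftmost grammatical element until only words remain.

    `choices` gives, in order, the index of the alternative to pick each
    time a rule offers more than one. Returns every intermediate form.
    """
    form = ["<sentence>"]
    steps = [" ".join(form)]
    picks = iter(choices)
    while any(sym in GRAMMAR for sym in form):
        # Position of the leftmost symbol that can still be rewritten.
        i = next(idx for idx, sym in enumerate(form) if sym in GRAMMAR)
        alternatives = GRAMMAR[form[i]]
        pick = next(picks) if len(alternatives) > 1 else 0
        form[i:i + 1] = alternatives[pick]
        steps.append(" ".join(form))
    return steps


def all_sentences(symbol="<sentence>"):
    """Enumerate every word sequence that `symbol` can be rewritten into."""
    if symbol not in GRAMMAR:
        return [symbol]  # a plain word: nothing left to rewrite
    results = []
    for alternative in GRAMMAR[symbol]:
        for parts in product(*(all_sentences(s) for s in alternative)):
            results.append(" ".join(parts))
    return results


if __name__ == "__main__":
    # The, black, cat, runs, quickly
    for step in derive([0, 2, 1, 1, 0]):
        print(step)
    print(len(all_sentences()), "legal sentences in MyEnglish")
```

Because none of the rules refers back to itself, all_sentences terminates; with a recursive rule it would never finish.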
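MyEnglish is small enough that it produces only a finite number of sentences (3 x 4 x 4 x 4 x 2 = 384), but in general a grammar with recursive rules results in an infinite number of possible sentences. We can also show the grammatical structure of the sentence as a tree, which is a bit difficult to draw here, but I will show the "top" of the tree: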
CODE
        <sentence>
        /        \
<subject>        <predicate>
Here's the punchline: Analyzing a sentence to see if it is legal and to create a syntax tree as above is called parsing. After a sentence has been parsed and you understand what role the various words play, then you can apply the semantics. For example, "Time flies like an arrow", which is unfortunately ambiguous in English, can be translated into French in various ways depending upon whether "Time" is a noun or a verb, whether "like" is a preposition or a verb, etc., etc.
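To make the idea concrete, here is a minimal sketch of a parser for MyEnglish in Python. It is hand-written for this one tiny grammar, not a general parsing algorithm; the names WORDS and parse_sentence and the nested-tuple tree format are my own inventions for illustration, and the nouns are assumed to be dog, cat, man and woman.

```python
# Words allowed for each part of speech in MyEnglish (hypothetical layout).
WORDS = {
    "article": {"The", "A", "An"},
    "adjective": {"pretty", "handsome", "black", "white"},
    "noun": {"dog", "cat", "man", "woman"},
    "verb": {"talks", "runs", "swims", "eats"},
    "adverb": {"quickly", "slowly"},
}


def parse_sentence(text):
    """Check `text` against the MyEnglish rules and return its syntax tree.

    The tree is a nested tuple whose first item names the grammatical
    element. Raises ValueError if the sentence is not legal.
    """
    tokens = text.rstrip(".").split()
    pos = 0

    def word(kind):
        # Consume one token, insisting that it is a word of the given kind.
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] not in WORDS[kind]:
            found = tokens[pos] if pos < len(tokens) else "end of sentence"
            raise ValueError(f"expected {kind}, found {found!r}")
        node = (kind, tokens[pos])
        pos += 1
        return node

    # <subject>: <article> <adjective> <noun>
    subject = ("subject", word("article"), word("adjective"), word("noun"))
    # <predicate>: <verb> <adverb>
    predicate = ("predicate", word("verb"), word("adverb"))
    if pos != len(tokens):
        raise ValueError(f"unexpected extra word {tokens[pos]!r}")
    # <sentence>: <subject> <predicate>
    return ("sentence", subject, predicate)


if __name__ == "__main__":
    print(parse_sentence("The black cat runs quickly."))
    try:
        parse_sentence("The cat runs quickly.")  # adjective is missing
    except ValueError as err:
        print("not legal MyEnglish:", err)
```

The returned tuple mirrors the tree drawn above: the outer "sentence" node has a "subject" child and a "predicate" child, and each of those holds the tagged words.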
As far as XML is concerned, its rules can easily be defined by a grammar such as the one for MyEnglish. For such grammars, there are well-known parsing algorithms that can be used to check the syntax and build a tree representation (a tree representation is very natural for XML). Once the tree is built, it's very easy to apply semantics (for example, convert the tree into Java classes). So to send an XML message, it is very easy to "hand code" it and a parser is, in fact, not used. But to process (i.e. receive) the message, the parser is needed.
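The same split between sending and receiving can be shown with Python's standard xml.etree.ElementTree module. This is just a sketch: the message, its element names (order, item, qty) and the helper is_well_formed are invented for the example.

```python
import xml.etree.ElementTree as ET

# Sending: the message is only text, so it can be "hand coded" as a string.
message = "<order id='42'><item>book</item><qty>2</qty></order>"

# Receiving: the parser checks the syntax and builds the tree.
root = ET.fromstring(message)

# Applying semantics: turn the tree into something the program can use.
order = {
    "id": root.get("id"),
    "item": root.findtext("item"),
    "qty": int(root.findtext("qty")),
}


def is_well_formed(text):
    """Return True if `text` parses as XML, False if the parser rejects it."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


if __name__ == "__main__":
    print(order)
    # The <item> tag is never closed, so the parser refuses the message.
    print(is_well_formed("<order><item>book</order>"))
```

Nothing in the sending half needed a parser, while the receiving half could not even find the quantity without one.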