Gerry Rzeppa's Profile
Reputation: 3 (Apprentice)
- Group: New Members
- Active Posts: 6 (0.01 per day)
- Joined: 13-December 13
- Profile Views:
- Last Active: Apr 14 2014 10:38 PM
- OS Preference: Who Cares
- Favorite Browser:
- Favorite Processor: Who Cares
- Favorite Gaming Platform: Who Cares
- Your Car: Who Cares
- Dream Kudos:
- 14 Dec 2013 - 00:26
Posts I've Made
Posted 14 Dec 2013
I hate to be the one to throw cold water on the party, but looking at your documentation, I came across this:
Quote: So you can see that my power is rooted in my simplicity. I parse sentences pretty much the same way you do. I look for marker words — articles, verbs, conjunctions, prepositions — then work my way around them. No involved grammars, no ridiculously complicated parse trees, no obscure keywords.
I'm afraid that's not how you or I parse sentences.
We believe that it is, based on years of study with small children. It appears that we have "buckets" in our heads for the who, what, when, where, why and how of a thought, and when someone speaks to us, we mentally divide the statement at certain marker words (like articles, conjunctions, and prepositions) and attempt to fill up the buckets with the resulting phrases. Which is why children will consistently reply to a statement like, "I'm going to the store" with a question like "When?" -- seeking information for the as-yet unfilled bucket(s).
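The bucket idea described above can be illustrated with a toy Python sketch. The marker list, the bucket names, and the reply logic below are all invented for illustration; they are not taken from Plain English or from any real system:

```python
# Toy sketch of the "marker-word/bucket" idea: split an utterance at marker
# words, drop the resulting phrases into buckets, and ask about any bucket
# left empty (as a child asking "When?" would).

MARKERS = {"to", "the", "a", "an", "at", "on", "and", "but", "with"}

def split_at_markers(sentence):
    """Split a sentence into phrases at marker words, discarding the markers."""
    phrases, current = [], []
    for word in sentence.lower().rstrip(".!?").split():
        if word in MARKERS:
            if current:
                phrases.append(" ".join(current))
                current = []
        else:
            current.append(word)
    if current:
        phrases.append(" ".join(current))
    return phrases

def childs_reply(sentence):
    """Fill who/what and where buckets from the phrases; ask about the rest."""
    phrases = split_at_markers(sentence)
    buckets = {
        "who/what": phrases[0] if phrases else None,
        "where": phrases[1] if len(phrases) > 1 else None,
        "when": None,  # nothing in this toy input fills the "when" bucket
    }
    empty = [name for name, value in buckets.items() if value is None]
    return f"{empty[0].capitalize()}?" if empty else "OK."

print(split_at_markers("I'm going to the store"))  # ["i'm going", "store"]
print(childs_reply("I'm going to the store"))      # When?
```

The point of the sketch is only that phrase-splitting at markers plus empty-bucket questions is mechanically trivial, which is the simplicity being claimed.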
In fact, "ridiculously complicated parse trees" are a vastly simplified version of the models developed to understand natural-language sentences. We don't just "look for marker words"; natural languages have syntax.
Of course they do. But we contend that most of that syntax is idiomatic, learned by rote; while the marker-word/bucket processing is in operation at least a year before a child learns to speak; and many years before a child has a clear concept of nouns, verbs, and other formal language classifications.
So there's a difference between "John hit Mary" and "Mary hit John", to take an obvious case.
In some languages, yes; others use verb inflection to indicate subject and object and word order is less important, etc.
Things get more interesting when we build more involved sentences, like
"This is the boy that John said he'd give one of his socks to, but he didn't."
Who does the final "he" refer to?
What didn't he do? Who was to get the sock? How do we know these things? That's all syntax.
It can be analyzed syntactically, yes. Or it can be interpreted with the marker-word/bucket approach -- which (we argue) is how most people would handle it. How else could the vast majority of Americans -- who have no idea what the word "antecedent" means, and who couldn't delineate a subordinate clause to save their lives -- understand anything?
So if you're not actually parsing sentences, you're not doing natural language.
Or, we're processing natural language as small children do (rather than the way a professional linguist would).
I suspect that you're doing something a lot more like Weizenbaum's Eliza...
Perhaps. But don't forget that Eliza also processes language as humans do. When I'm trying to make conversation with a stranger, for example, I let their words more-or-less wash over me until I hear a term that I think we might have in common; then I reply, as Eliza would, "Tell me about your vacation (or your piano, or your mother)." And I know this is what I do, because I've consciously watched myself doing it.
which is a fun toy but not something you could use for computation.
But we have used this technique for computation, and non-trivial computation at that: a complete development system including desktop, file manager, editor, dumper, native-code-generating compiler/linker, and wysiwyg page layout facility.
Quote: Our parser operates, we think, something like the parsing centers in the human brain. Consider, for example, a father saying to his baby son:
“Want to suck on this bottle, little guy?”
And the kid hears,
“blah, blah, SUCK, blah, blah, BOTTLE, blah, blah.”
No, this is more of a Gary Larson theory of language - it's much like the view that Augustine of Hippo proposed, but I don't think it's ever been proposed as a serious contender for a theory of language acquisition in the modern era. You have a little bootstrapping problem here. The child's input is not a delineated sequence of conveniently marked and isolated tokens, "What" and "to" and "suck" etc. Instead, it's a continuous stream of sounds. The child's task is much more complicated and much more interesting than you're suggesting here. In any case, the point is: this is not how any human brain works on language.
Glad you got the Gary Larson allusion. But I've studied a lot of kids in the process of understanding language, and I have a speaking brain of my own, and it appears to me that we actually do process language along these lines. And it worked, as I mentioned a moment ago, in a practical, real-world programming project.
So just from the point of view of the linguist, my suggestion would be "go study some linguistics".
Have done so, and will continue to do so. But I'll always be a fan of the simplest solution to a problem, and the solution that appears to be most similar to natural systems.
So much for the linguistics. As a programmer, this looks horrible. Why would I want to type "A polygon is a thing with vertices" and hope that my testing reveals all of the errors of interpretation that this allows, when I can use a well-designed language (or even something like VB or ML) and know exactly what I'm going to get? That is, either "A polygon is a thing with vertices" is a precise construct - in which case I might as well learn a less atrocious syntax - or it's not a precise construct - in which case this is completely useless for programming.
Alternately, why would I want to type "Runtime.getRuntime().exec("cls");" instead of "Clear the screen."? Our general position on the matter, as intimated in my original post, is that a program should be written more-or-less like a math book: a natural-language framework punctuated with specialized syntaxes as appropriate.
Quote: We deal with ambiguity as you would when speaking to, say, an employee: when he does what you asked, you're done; when he doesn't, you elaborate.
Seriously? Why not just use a syntax that allows me to state my meaning unambiguously?
Because your unambiguous syntax has to be memorized, while our occasionally-ambiguous syntax is already known. And because your unambiguous syntax doesn't get us closer to creating a HAL 9000, which our natural-language syntax just might.
Quote (jon.kiparsky, 14 December 2013): So from the point of view of the programmer, my suggestion would be that you study language design a little.
Do you really think we could have written the program we're talking about if we hadn't studied, in significant depth, both linguistics and programming language design? I'm afraid you're mistaking our love of simplicity and our out-of-the-box thinking for lack of education. The proof is in the pudding. The thing works, and it has answered the three questions we posed at the outset. We can now say, with years of experience -- and a tangible product -- to back up the assertions:
1. It is easier (at least for us!) to program in a natural language rather than translate our thoughts into specialized syntaxes; and
2. We can parse English in a relatively sloppy, marker-word/bucket fashion and still produce a system that is sufficiently precise to write non-trivial computer programs; and that
3. Low level programs (like compilers) can indeed be conveniently and efficiently written in high-level languages (like English).
Thanks for the challenging remarks.
Posted 14 Dec 2013
How about time and space complexity? Will the programmer have full control over the amount of memory the program uses, as well as the running time of the code?
The programmer has as much control over memory use and execution time as the Windows operating system allows. Global variables are allocated at start-up; local variables are allocated on the stack at the time of each call and discarded on exit; parameters are passed by reference (though you can "privatize" them to get a copy); dynamic variables are allocated and destroyed by the programmer (no automatic garbage collection, though we do provide a handy one-command "deep deallocation" for certain structures). Execution of the code is linear (except for loops and calls) and deterministic, but since Windows multitasks there is no guarantee regarding execution times. It's a very small and efficient program, however, considering the range of its functionality.
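The parameter-passing rules just described can be mimicked in a short Python analogy (this is not the Plain English runtime itself; the function names here are invented). Arguments arrive "by reference", so mutations are visible to the caller, unless the routine "privatizes" its parameter by working on a copy:

```python
import copy

def increment_all(numbers):
    # By-reference behaviour: the caller's list is changed in place.
    for i in range(len(numbers)):
        numbers[i] += 1

def increment_privately(numbers):
    # "Privatized" parameter: deep-copy first, so the caller's list survives.
    numbers = copy.deepcopy(numbers)
    for i in range(len(numbers)):
        numbers[i] += 1
    return numbers

nums = [1, 2, 3]
increment_all(nums)
print(nums)                          # [2, 3, 4] - caller sees the change

orig = [1, 2, 3]
print(increment_privately(orig))     # [2, 3, 4]
print(orig)                          # [1, 2, 3] - caller's copy untouched
```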
I think you'd enjoy the instruction manual, even if you don't want to run the program. Why not download it and see what it has to say?
Posted 14 Dec 2013
Interesting. Are there mechanisms to reason about context?
Only very primitive ones at present. If, for example, you reference a "field" in a "record" without specifying the record-level component, the compiler will look within all the records in the immediate context for a suitably unambiguous field; or if you specify a call with variable types that are not directly supported by any of the existing routines, the compiler will recursively reduce those types to more basic types and attempt to call a compatible lower-level routine. Some string and number conversions also take place automatically. And all variables can be referenced by "nickname" ("the left side", for example, can be specified simply by saying "the left"). But that's all I can think of off the top of my head. We hope to add significant support for pronouns in the next version.
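The unqualified-field lookup just described can be sketched in a few lines of Python. The record and field names below are made up for illustration; the real compiler searches the records actually in scope:

```python
# Toy sketch of unqualified-field resolution: given a bare field name,
# search every record in scope and resolve it only if exactly one record
# owns a field by that name.

RECORDS = {
    "customer": {"name", "address"},
    "invoice": {"number", "total", "address"},
}

def resolve_field(field):
    owners = [rec for rec, fields in RECORDS.items() if field in fields]
    if len(owners) == 1:
        return f"{owners[0]}.{field}"  # unambiguous: fill in the record
    if not owners:
        raise NameError(f"no record has a field named '{field}'")
    raise NameError(f"'{field}' is ambiguous: found in {sorted(owners)}")

print(resolve_field("total"))   # invoice.total
```

Asking for "address" here would raise an error instead, since two records own that field and the reference stays ambiguous.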
Posted 14 Dec 2013
Impressive, I like how you used declarative and procedural semantics. How do you handle ambiguity, though?
We deal with ambiguity as you would when speaking to, say, an employee: when he does what you asked, you're done; when he doesn't, you elaborate. Likewise when we're coding. We code a few lines, then test: if it works as intended, we assume the compiler has arrived at the proper interpretation; if it doesn't work properly, we provide further clarification. We don't see this as a shortcoming. We're convinced that computers of the future will work in a similar way -- after all, eventually we want to get to the point where we "program" our machines simply by talking to them, yes? And sometimes they'll misunderstand, and need clarification, as humans do.
I think you're using a pattern-matching technique. You primarily look for verbs and translate those to functions, and nouns translate to objects. I am only guessing, though, off the top of my head.
We actually make several passes at the code, compiling types first, then global variables, then routine headers. Most of that is standard recursive descent parsing in accord with a standard EBNF definition of the language. See page 11 of the instructions for a summary of that parsing. Then the fun begins. We compile the routine bodies by breaking each sentence into phrases (at article, conjunction, and prepositional boundaries); then we recursively interpret each phrase as a possible (1) string or numeric literal, (2) variable reference, (3) mathematical expression, or (4) irrelevant "noise words", and look for a matching routine header (compiled earlier) to call. I say "recursively interpret" because if one interpretation of the sentence doesn't yield a match, we try another, and another, etc, until we've exhausted the possibilities.
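A drastically simplified Python sketch of that last step may make it concrete. The routine table, the variable set, and the pre-split phrase lists below are invented for illustration (the real compiler matches against routine headers it compiled earlier, and does its own phrase splitting at article/conjunction/preposition boundaries):

```python
# Toy backtracking phrase matcher: try each phrase as a literal keyword, a
# variable reference, or irrelevant noise, and explore interpretations until
# one matches a known routine header "shape".

import itertools

ROUTINES = {
    ("clear", "screen"): "clear_screen",
    ("write", "VAR", "screen"): "write_to_screen",
}
VARIABLES = {"message"}

def interpretations(phrase):
    """All ways to read one phrase: keyword, variable, or noise (skipped)."""
    options = [phrase]           # read it as a literal keyword
    if phrase in VARIABLES:
        options.append("VAR")    # read it as a variable reference
    options.append(None)         # read it as an irrelevant noise word
    return options

def match(phrases):
    """Backtrack over all interpretations until one fits a routine header."""
    for combo in itertools.product(*[interpretations(p) for p in phrases]):
        shape = tuple(x for x in combo if x is not None)
        if shape in ROUTINES:
            return ROUTINES[shape]
    return None  # exhausted every interpretation without a match

# "Write the message on the screen." splits (at articles and prepositions)
# into the phrases below; "message" is recognized as a variable on backtrack.
print(match(["write", "message", "screen"]))  # write_to_screen
print(match(["clear", "screen"]))             # clear_screen
```

`itertools.product` here plays the role of the "try another, and another" loop: each tuple it yields is one complete interpretation of the sentence.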
- Member Title: New D.I.C Head
- Age: 61 years old
- Birthday: July 15, 1953
- Location: Franklin, KY
- Full Name: Gerry Rzeppa
- Years Programming:
- Programming Languages: Fortran, COBOL, Assembly, Basic, Pascal, SQL