Gödel, Escher, Bach: An Eternal Golden Braid
Authors: Douglas R. Hofstadter
(6) describe its actions and their reasons, in English.
It might seem reasonable to break up the overall program into modular subprograms, with one module for each different part of the problem; then, after the modules have been developed separately, to integrate them smoothly. Winograd found that this strategy of developing independent modules posed fundamental difficulties. He developed a radical approach, which challenged the theory that intelligence can be compartmentalized into independent or semi-independent pieces. His program
SHRDLU (named after the old code "ETAOIN SHRDLU", used by linotype operators to mark typos in a newspaper column) did not separate the problem into clean conceptual parts.
The operations of parsing sentences, producing internal representations, reasoning about the world represented inside itself, answering questions, and so on, were all deeply and intricately meshed together in a procedural representation of knowledge. Some critics have charged that his program is so tangled that it does not represent any "theory" at all about language, nor does it contribute in any way to our insights about thought processes.
Nothing could be more wrong than such claims, in my opinion. A tour de force such as SHRDLU may not be isomorphic to what we do (in fact, in no way should you think that in SHRDLU, the "symbol level" has been attained), but the act of creating it and thinking about it offers tremendous insight into the way intelligence works.
The Structure of SHRDLU
In fact, SHRDLU does consist of separate procedures, each of which contains some knowledge about the world; but the procedures have such a strong interdependency that they cannot be cleanly teased apart. The program is like a very tangled knot which resists untangling; but the fact that you cannot untangle it does not mean that you cannot understand it. There may be an elegant geometrical description of the entire knot even if it is physically messy. We could go back to a metaphor from the Mu Offering, and compare it to looking at an orchard from a "natural" angle.
Winograd has written lucidly about SHRDLU. I quote here from his article in Schank and Colby's book:
One of the basic viewpoints underlying the model is that all language use can be thought of as a way of activating procedures within the hearer. We can think of any utterance as a program, one that indirectly causes a set of operations to be carried out within the hearer's cognitive system. This "program writing" is indirect in the sense that we are dealing with an intelligent interpreter, who may take a set of actions which are quite different from those the speaker intended. The exact form is determined by his knowledge of the world, his expectations about the person talking to him, etc. In this program we have a simple version of this process of interpretation as it takes place in the robot. Each sentence interpreted by the robot is converted to a set of instructions in PLANNER. The program that is created is then executed to achieve the desired effect.
PLANNER Facilitates Problem Reduction
The language PLANNER, referred to here, is an AI language whose principal feature is that some of the operations necessary for problem reduction are built in: namely, the recursive process of creating a tree of subgoals, subsubgoals, etc. What this means is that such processes, instead of having to be spelled out time and time again by the programmer, are automatically implied by so-called GOAL-statements. Someone who reads a PLANNER program will see no explicit reference to such operations; in jargon, they are user-transparent. If one path in the tree fails to achieve the desired goal, then the PLANNER program will "backtrack" and try another route. "Backtracking" is the magic word as far as PLANNER is concerned.
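To make the idea of built-in problem reduction concrete, here is a minimal Python sketch. It is not PLANNER and not Winograd's code: the rule base, the goal names, and the function `achieve` are all hypothetical, invented for illustration. Each goal is reduced to a list of subgoals, and when one alternative fails the loop simply moves on to the next, which is the backtracking step.

```python
from typing import Dict, List, Set

# Hypothetical rule base (an assumption for illustration only):
# each goal maps to alternative lists of subgoals.
RULES: Dict[str, List[List[str]]] = {
    "have-tea": [["have-kettle", "have-teabag"], ["buy-tea"]],
    "have-kettle": [["borrow-kettle"]],
}

# Goals that count as achieved outright.
FACTS: Set[str] = {"have-teabag", "buy-tea"}


def achieve(goal: str, trace: List[str], depth: int = 0) -> bool:
    """Try to achieve `goal` by reducing it to subgoals, backtracking on failure."""
    trace.append("  " * depth + goal)  # record the tree of goals as it is explored
    if goal in FACTS:
        return True
    for subgoals in RULES.get(goal, []):
        # all() stops at the first failing subgoal; the for-loop then
        # tries the next alternative, which is the backtracking step.
        if all(achieve(sub, trace, depth + 1) for sub in subgoals):
            return True
    return False


trace: List[str] = []
print(achieve("have-tea", trace))
print("\n".join(trace))
```

The caller only names the top-level goal; the descent into subgoals and the retreat from the failed "have-kettle" branch are never spelled out at the call site, which is roughly the sense in which such operations are "user-transparent".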
Winograd's program made excellent use of these features of PLANNER, or more exactly, of MICROPLANNER, a partial implementation of the plans for PLANNER. In the past few years, however, people with the goal of developing AI have concluded that automatic backtracking, as in PLANNER, has definite disadvantages, and that it will probably not lead to their goal; therefore they have backed off from it, preferring to try other routes to AI.
Let us listen to further comments from Winograd on SHRDLU: "The definition of every word is a program which is called at an appropriate point in the analysis, and which can do arbitrary computations involving the sentence and the present physical situation."
Among the examples which Winograd cites is the following:
The different possibilities for the meaning of "the" are procedures which check various facts about the context, then prescribe actions such as "Look for a unique object in the data base which fits this description", or "Assert that the object being described is unique as far as the speaker is concerned." The program incorporates a variety of heuristics for deciding what part of the context is relevant. 18
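The idea that a word's definition is a procedure can be sketched in a few lines of Python. Everything below is an assumption made for illustration (the toy data base, the function names `the` and `a`, and the dictionary `WORDS`); it captures only the "look for a unique object" behavior mentioned in the quotation, not the heuristics SHRDLU actually used.

```python
from typing import Callable, Dict, List, Optional

# Hypothetical toy data base of objects in a blocks world.
DATABASE: List[Dict[str, str]] = [
    {"name": "B1", "shape": "block", "color": "red"},
    {"name": "B2", "shape": "block", "color": "green"},
    {"name": "P1", "shape": "pyramid", "color": "blue"},
]

Description = Dict[str, str]


def matches(obj: Dict[str, str], description: Description) -> bool:
    """True if the object has every property named in the description."""
    return all(obj.get(key) == value for key, value in description.items())


def the(description: Description) -> Optional[str]:
    """Definite article: succeed only if exactly one object fits."""
    found = [obj for obj in DATABASE if matches(obj, description)]
    return found[0]["name"] if len(found) == 1 else None


def a(description: Description) -> Optional[str]:
    """Indefinite article: any fitting object will do."""
    for obj in DATABASE:
        if matches(obj, description):
            return obj["name"]
    return None


# The "dictionary" maps each word to a program rather than to static data.
WORDS: Dict[str, Callable[[Description], Optional[str]]] = {"the": the, "a": a}

print(WORDS["the"]({"shape": "pyramid"}))  # one pyramid exists, so a referent is found
print(WORDS["the"]({"shape": "block"}))    # two blocks exist, so "the block" has no unique referent
```

Even this toy version shows why "the" is hard: whether the phrase succeeds depends on the current state of the world, not on the word alone.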
It is amazing how deep this problem with the word "the" is. It is probably safe to say that writing a program which can fully handle the top five words of English ("the", "of", "and", "a", and "to") would be equivalent to solving the entire problem of AI, and hence tantamount to knowing what intelligence and consciousness are. A small digression: the five most common nouns in English are, according to the Word Frequency Book compiled by John B. Carroll et al., "time", "people", "way", "water", and "words" (in that order). The amazing thing about this is that most people have no idea that we think in such abstract terms. Ask your friends, and 10 to 1 they'll guess such words as "man", "house", "car", "dog", and "money". And while we're on the subject of frequencies: the top twelve letters in English, in order, according to Mergenthaler, are "ETAOIN SHRDLU".
One amusing feature of SHRDLU which runs totally against the stereotype of computers as "number crunchers" is this fact, pointed out by Winograd: "Our system does not accept numbers in numeric form, and has only been taught to count to ten."19 With all its mathematical underpinning, SHRDLU is a mathematical ignoramus! Just like Aunt Hillary, SHRDLU doesn't know anything about the lower levels which make it up. Its knowledge is largely procedural (see particularly the remark by "Dr. Tony Earrwig" in section 11 of the previous Dialogue).
It is interesting to contrast the procedural embedding of knowledge in SHRDLU with the knowledge in my sentence-generation program. All of the syntactical knowledge in my program was procedurally embedded in Augmented Transition Networks, written in the language Algol; but the semantic knowledge (the information about semantic class membership) was static: it was contained in a short list of numbers after each word. There were a few words, such as the auxiliary verbs "to be", "to have", and others, which were represented totally in procedures in Algol, but they were the exceptions. By contrast, in SHRDLU, all words were represented as programs. Here is a case which demonstrates that, despite the theoretical equivalence of data and programs, in practice the choice of one over the other has major consequences.
Syntax and Semantics
And now, a few more words from Winograd:
Our program does not operate by first parsing a sentence, then doing semantic analysis, and finally by using deduction to produce a response. These three activities go on concurrently throughout the understanding of a sentence. As soon as a piece of syntactic structure begins to take shape, a semantic program is called to see whether it might make sense, and the resultant answer can direct the parsing. In deciding whether it makes sense, the semantic routine may call deductive processes and ask questions about the real world. As an example, in sentence 34 of the Dialogue ("Put the blue pyramid on the block in the box"), the parser first comes up with "the blue pyramid on the block" as a candidate for a noun group. At this point, semantic analysis is done, and since "the" is definite, a check is made in the data base for the object being referred to. When no such object is found, the parsing is redirected to find the noun group "the blue pyramid". It will then go on to find "on the block in the box" as a single phrase indicating a location ... Thus there is a continuing interplay between the different sorts of analysis, with the results of one affecting the others.20
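The interplay described in this passage can be imitated with a deliberately crude Python sketch. It is an illustration under stated assumptions, not a reconstruction of SHRDLU's parser: the world model `WORLD`, the field `rests_on`, the helpers `find_referent` and `interpret`, and the hard-coded candidate lengths are all hypothetical. The "parser" proposes the longer noun group first, and the semantic check against the world decides whether that proposal survives.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical world: the blue pyramid rests on the table, not on a block.
WORLD: Dict[str, Dict[str, str]] = {
    "pyramid1": {"kind": "pyramid", "color": "blue", "rests_on": "table"},
    "block1": {"kind": "block", "color": "red", "rests_on": "box"},
}


def find_referent(noun_group: List[str]) -> Optional[str]:
    """Semantic check for 'the COLOR KIND' or 'the COLOR KIND on the KIND'."""
    color, kind = noun_group[1], noun_group[2]
    support_kind = noun_group[5] if len(noun_group) == 6 else None
    hits = []
    for name, obj in WORLD.items():
        if obj["kind"] != kind or obj["color"] != color:
            continue
        if support_kind is not None:
            support = WORLD.get(obj["rests_on"])
            if support is None or support["kind"] != support_kind:
                continue
        hits.append(name)
    # "the" is definite, so exactly one object must fit.
    return hits[0] if len(hits) == 1 else None


def interpret(words: List[str]) -> Optional[Tuple[str, str]]:
    """Return (referent, location phrase) for 'put <noun group> <location>'."""
    # Candidate noun-group lengths, longest first (hard-coded for this one sentence).
    for length in (6, 3):
        noun_group, rest = words[1:1 + length], words[1 + length:]
        referent = find_referent(noun_group)
        if referent is not None:  # semantics accepts this parse
            return referent, " ".join(rest)
    return None  # no candidate parse made sense in this world


sentence = "put the blue pyramid on the block in the box".split()
print(interpret(sentence))
```

If the world is changed so that the pyramid does rest on a block, the first candidate succeeds and the remaining location phrase becomes "in the box": the same sentence parses differently depending on the state of the world.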
It is extremely interesting that in natural language, syntax and semantics are so deeply intertwined. Last Chapter, in discussing the elusive concept of "form", we had broken the notion into two categories: syntactic form, which is detectable by a predictably terminating decision procedure, and semantic form, which is not. But here, Winograd is telling us that, at least when the usual senses of "syntax" and "semantics" are taken, they merge right into each other in natural language. The external form of a sentence (that is, its composition in terms of elementary signs) does not divide up so neatly into syntactic and semantic aspects. This is a significant point for linguistics.
Here are some final comments on SHRDLU by Winograd.
Let us look at what the system would do with a simple description like "a red cube which supports a pyramid". The description will use concepts like BLOCK, RED, PYRAMID, and EQUIDIMENSIONAL, all parts of the system's underlying categorization of the world. The result can be represented in a flow chart like that in Figure 118. Note that this is a program for finding an object fitting the description. It would then be incorporated into a command for doing something with the object, a question asking something about it, or, if it appeared in a statement, it would become part of the program which was generated to represent the meaning for later use. Note that this bit of program could also be used as a test to see whether an object fit the description, if the first FIND instruction were told in advance to look only at that particular object.
FIGURE 118. Procedural representation of "a red cube which supports a pyramid." Adapted from Roger Schank and Kenneth Colby, Computer Models of Thought and Language (San Francisco: W. H. Freeman, 1973), p. 172.
At first glance, it seems that there is too much structure in this program, as we don't like to think of the meaning of a simple phrase as explicitly containing loops, conditional tests, and other programming details. The solution is to provide an internal language that contains the appropriate looping and checking as its primitives, and in which the representation of the process is as simple as the description. The program described in Figure 118 would be written in PLANNER looking something like what is below:
(GOAL (IS ?X1 BLOCK))
(GOAL (COLOR-OF ?X1 RED))
(GOAL (EQUIDIMENSIONAL ?X1))
(GOAL (IS ?X2 PYRAMID))
(GOAL (SUPPORT ?X1 ?X2))
The loops of the flowchart are implicit in PLANNER's backtrack control structure.
The description is evaluated by proceeding down the list until some goal fails, at which time the system backs up automatically to the last point where a decision was made, trying a different possibility. A decision can be made whenever a new object name or VARIABLE (indicated by the prefix "?") such as "?X1" or "?X2" appears. The variables are used by the pattern matcher. If they have already been assigned to a particular item, it checks to see whether the GOAL is true for that item. If not, it checks for all possible items which satisfy the GOAL, by choosing one, and then taking successive ones whenever backtracking occurs to that point. Thus, even the distinction between testing and choosing is implicit.21
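The evaluation strategy Winograd describes (work down the list of goals, bind variables by pattern matching, and back up to the last choice point on failure) can be approximated with Python generators. This is a sketch under assumptions, not MICROPLANNER: the assertion data base `FACTS` and the functions `match` and `solve` are hypothetical, and real PLANNER goals can also invoke theorems rather than only look up stored assertions.

```python
from typing import Dict, Iterator, List, Optional, Tuple

Pattern = Tuple[str, ...]
Bindings = Dict[str, str]

# Hypothetical data base of assertions about a small blocks world.
FACTS: List[Pattern] = [
    ("IS", "B1", "BLOCK"), ("IS", "B2", "BLOCK"), ("IS", "P1", "PYRAMID"),
    ("COLOR-OF", "B1", "GREEN"), ("COLOR-OF", "B2", "RED"),
    ("EQUIDIMENSIONAL", "B1"), ("EQUIDIMENSIONAL", "B2"),
    ("SUPPORT", "B2", "P1"),
]


def match(pattern: Pattern, fact: Pattern, bindings: Bindings) -> Optional[Bindings]:
    """Match one goal pattern against one fact, extending the bindings."""
    if len(pattern) != len(fact):
        return None
    new = dict(bindings)
    for p, f in zip(pattern, fact):
        if p.startswith("?"):
            # An unbound variable is a choice; a bound one is a test.
            if new.setdefault(p, f) != f:
                return None
        elif p != f:
            return None
    return new


def solve(goals: List[Pattern], bindings: Optional[Bindings] = None) -> Iterator[Bindings]:
    """Yield every set of bindings that satisfies all goals, in order."""
    bindings = {} if bindings is None else bindings
    if not goals:
        yield bindings
        return
    first, rest = goals[0], goals[1:]
    for fact in FACTS:
        extended = match(first, fact, bindings)
        if extended is not None:
            # If the remaining goals fail, the loop resumes with the next
            # fact: this is the automatic backtracking.
            yield from solve(rest, extended)


DESCRIPTION: List[Pattern] = [
    ("IS", "?X1", "BLOCK"),
    ("COLOR-OF", "?X1", "RED"),
    ("EQUIDIMENSIONAL", "?X1"),
    ("IS", "?X2", "PYRAMID"),
    ("SUPPORT", "?X1", "?X2"),
]

print(next(solve(DESCRIPTION)))
```

With this data base the first choice for ?X1 is B1, which fails the color goal, so the search backs up and tries B2. Calling `solve` with a goal that contains no variables, such as `[("IS", "B1", "BLOCK")]`, turns the same code into a pure test, which mirrors the remark that the distinction between testing and choosing is implicit.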
One significant strategy decision in devising this program was to not translate all the way from English into LISP, but only partway, into PLANNER. Thus (since the PLANNER interpreter is itself written in LISP), a new intermediate level, PLANNER, was inserted between the top-level language (English) and the bottom-level language (machine language). Once a PLANNER program had been made from an English sentence fragment, then it could be sent off to the PLANNER interpreter, and the higher levels of SHRDLU would be freed up, to work on new tasks.