Proposal to Use the SUMEX-AIM Resource for Computer Simulation of Language Acquisition

John R. Anderson
Human Performance Center
University of Michigan
Ann Arbor, Michigan

The purpose of this research is to understand language acquisition. There has been a great deal of research on first language acquisition in children, second language learning by adults, and learning of artificial languages by laboratory subjects. The principal goal of this research is not getting more experimental evidence. Rather it is to develop a working computer simulation model that can learn natural languages. The model would attempt to explain the already available set of experimental facts. It is also hoped that such a model would be a contribution to the artificial intelligence goal of developing language understanding systems. Some of the detailed plans of the research are described in the accompanying grant proposal that was awarded by NIMH (grant number 1 RO1 MH26383-01). The period of this award is May 1, 1975 to May 1, 1977.

That proposal states an intention to use augmented transition networks as the basic grammatical formalism. I have already completed some initial learning programs using the augmented transition network formalism. The very earliest of this work is described in the NIMH proposal. More recently I have decided to try to develop a production system formalism as an alternate to the augmented transition network. There are three main reasons for this switch in representational formalism. First, I think it is easier to represent the grammatical knowledge contained in highly inflected languages (e.g., Finnish, Latin) by production systems rather than augmented transition networks. Second, I think it is easier to represent human information processing limitations in terms of production systems. Third, I think production systems serve as a means of representing non-linguistic procedures such as inference-making. Therefore, a theory of induction of production systems for language has the promise of generalizing to the induction of other human cognitive skills.

I have been using the SUMEX facility in a pilot project this summer. I have been bringing up a version of my production system called ACT on this facility. It is hoped that in a few months this program will be in a sufficiently developed form that other SUMEX users may use that production system. It uses an associative network representation as its basic data base. This is a variant of the HAM propositional network that I developed earlier and is described in the accompanying proposal (pp. 23-27). In the ACT system various portions of the network are active at any point in time. The productions look for patterns of activation in the network. If these patterns exist, the productions are executed, causing external actions to be taken, building network structure, and possibly changing the state of activation of the network. Activation spreads associatively through the network and there is also a dampening process which deactivates network structure. A preliminary description of the ACT system is given in the accompanying document "An Overview of ACT." It is a chapter from a forthcoming book. The most relevant section in that chapter is from pages 11 to 25.

It was originally projected that this simulation work would be performed on the Michigan Computer System. However, there are a number of advantages of the SUMEX-AIM facility. All the programming will occur in LISP.
The INTERLISP system on SUMEX, as surmised from my own experimentation, permits programming and debugging to progress at least twice as fast as with Michigan LISP. Also, programs in INTERLISP would be more available to other A.I. users than programs in Michigan LISP. The Michigan computer is isolated from the national A.I. community, whereas I can take advantage of the connections SUMEX-AIM has through the TYMNET and the ARPANET. Finally, the SUMEX-AIM facility provides free computing resources and so will relieve some of the strain on my tight research budget.

It is intended that there will be continued development and testing of this production system formalism as a model of human information processing. There are plans to build substantial ACT production system models for language generation and understanding and for inference making.

Responses to SUMEX-AIM Questionnaire

A.2. Read the accompanying proposal.

C.3. The research is currently supported by a grant from NIMH (grant number 1 RO1 MH26383-01) for the period May 1, 1975 to May 1, 1977. The amount of the award for the first year is $20,000. This is to pay for a programmer, computer time, and rental of a terminal.

C.4. Read the accompanying proposal. It is expected that this research will have some general contribution to make to the development of language understanding systems, the modeling of human cognitive processes, and the development of production systems.

C.5. None.

There should be no difficulty in making my programs generally available to users of SUMEX-AIM. Yes. Yes. Read the next to last paragraph in the accompanying proposal. The INTERLISP language on SUMEX is the principal requirement of my research. I do not anticipate requiring any additional systems programs not already available at SUMEX.

Estimated requirements per month: 100 connect hours, 2 CPU hours, 1500 file pages. The principal times of use in Ann Arbor would probably be 0600-0900 and 1800-2100. I intend to communicate with SUMEX via the TYMNET. I would either use the private node in Ann Arbor or the public node in Detroit. The toll cost to Detroit could be met from my current grant, as could the cost of terminal rental. Not really relevant.

BIOGRAPHICAL SKETCH

NAME AND TITLE: John R. Anderson, Junior Fellow
BIRTHDATE: Aug. 27, 1947
PLACE OF BIRTH: Vancouver, B.C., Canada
NATIONALITY: Canadian (J-1 exchange visitor's visa)

EDUCATION:
University of British Columbia, Vancouver - B.A., 1968, Psychology
Stanford University, Stanford, California - Ph.D., 1972, Psychology

HONORS: 1968 - The Governor-General's Gold Medal (head of graduating classes in Arts and Sciences, University of British Columbia)

MAJOR RESEARCH INTEREST: Language and Human Memory
ROLE IN PROPOSED PROJECT: Principal Investigator

RESEARCH SUPPORT: NSF - Recognition memory for sentences: a process model. Sept. 1, 1973 - Sept. 1, 1975; $40,000 ($20,265 for year 1); 50% of research effort. Grant number GB-40298.

RESEARCH AND/OR PROFESSIONAL EXPERIENCE (starting with present position, list training and experience relevant to area of project; list all or most representative publications):
Junior Fellow, University of Michigan, 1973 - present.
Assistant Professor, Yale University, 1972 - 1973.
Numerous experiments in graduate school in human memory under the supervision of Gordon H. Bower at Stanford University, 1968 - 1972.

Publications:

Reber, A. S. and Anderson, J. R. The perception of clicks in linguistic and non-linguistic messages. Perception & Psychophysics, 1970, 8, 81-89.

Anderson, J. R. and Bower, G. H. On an associative trace for sentence memory. Journal of Verbal Learning and Verbal Behavior, 1971, 10, 673-680.

Anderson, J. R. FRAN: A simulation model of free recall. In G. H. Bower (Ed.), The Psychology of Learning and Motivation, Vol. 5. New York: Academic Press, 1972.

Anderson, J. R. and Bower, G. H. Recognition and retrieval processes in free recall. Psychological Review, 1972, 79, 97-123.

Anderson, J. R. A stochastic model of sentence memory. Doctoral dissertation, Stanford University, June, 1972.

Anderson, J. R. and Bower, G. H. Configural properties in sentence memory. Journal of Verbal Learning and Verbal Behavior, 1972, 11, 594-605.

Anderson, J. R. and Bower, G. H. Human Associative Memory. Washington: Winston & Sons, 1973.

Reder, L. M., Anderson, J. R., & Bjork, R. A. A semantic interpretation of encoding specificity. Journal of Experimental Psychology, 1974, 102, 648-656.

Anderson, J. R. Verbatim and propositional representation of sentences in immediate and long-term memory. Journal of Verbal Learning and Verbal Behavior, in press.

Anderson, J. R. and Bower, G. H. A propositional theory of recognition memory. Memory & Cognition, in press.

Anderson, J. R. and Bower, G. H. Interference in memory for multiple contexts. Memory & Cognition, in press.

Anderson, J. R. Retrieval of propositional information from long-term memory. Cognitive Psychology, in press.

Anderson, J. R. and Hastie, R. Individuation and reference in memory: proper names and definite descriptions. Cognitive Psychology, in press.

Anderson, J. R. Computer simulation of a language-acquisition system, first report. In R. L. Solso (Ed.), Information Processing and Cognition: The Loyola Symposium, in press.

Anderson, J. R. Language acquisition by computer and child. To appear in S. Y. Sedelow & W. A. Sedelow (Eds.), Current Trends in Computer Use for Language Research, in preparation.

Special Note

I am in the second year of an exchange visitor's visa. I can renew the visa for another year. My wife, an American citizen, is currently petitioning to have my status changed to that of a permanent resident. Therefore, I will be able to be at the University of Michigan for the entire period of the proposed research.

COMPUTER SIMULATION OF LANGUAGE ACQUISITION

A. Introduction

1. Direction and goals of the research

Most simply stated, the purpose of this research is to understand language acquisition. There has been a great deal of research on first language acquisition in children, second language learning by adults, and learning of artificial languages by laboratory subjects. This research is not principally concerned with getting more experimental evidence. Rather it is concerned with developing an information-processing model that can be used to explain the already available set of experimental facts.
One of the principal concerns governing the design of this model is just that it be able to learn a natural language. I will show that this, in itself, is a very significant goal. It turns out that algorithms adequate to learn a natural language are quite complex. It is not possible to sit down and simply specify them verbally or with equations. This research makes use of the computer as a tool to develop and test complex models. Therefore, I have been developing a computer simulation model of language acquisition. This model is called LAS (an acronym for Language Acquisition System). Most of the proposed budget is concerned with supporting the development of this program. Input to LAS consists of sentences of the language paired with representations of their meaning. Therefore it simulates language learning in situations where a learner can figure out the meaning of the sentences from context. The simplest case of such a situation would be one in which the learner is presented with simple pictures and sentences describing them. The program constructs a grammar which allows it to go from sentences to representations of their underlying meaning. The grammar can also be used to generate sentences to convey meanings. It is also hoped that this program will make a contribution to the evolution of computer language understanding systems. Thus, the research really has two purposes, one in psychology and one in artificial intelligence.

I became interested in language acquisition through my work with a computer simulation model of human memory, described in a book by myself and Gordon Bower entitled Human Associative Memory. The principal purpose of that research was to develop a memory representation and retrieval system (called HAM) and test it in experiments. A version of HAM is used within LAS. HAM's computer program was an attempt to simulate a simple question understander which was capable of dealing with a subset of English and which was capable of using context to resolve reference. Nevertheless, it had limited abilities compared to the work of Schank (1973). As a result of my own experiences and of studying the work of others, I became pessimistic about the value of trying to represent the unbounded linguistic competence of an adult directly in terms of a fixed computer program.

Outline of Proposal

The concern in this proposal will be primarily with developing a system logically adequate for language acquisition and only secondarily with a system that simulates actual human performance. I do not think the simulation goal can be achieved until we have a characterization of the sort of system that is adequate for natural language acquisition. This emphasis on logical adequacy is clear in the organization of the proposal. I will first review the work that has been done on computer language understanding. This is important because LAS is a language understander as well as a learner. Then I will review the formal results on grammar induction. Then LAS.1 will be described. LAS.1 is a first-pass version of the LAS program adequate to learn simple languages. Then I will propose an extensive set of developments to be added to the program, aimed both at increasing its linguistic powers and making it a realistic simulation. In describing LAS.1 and the proposed extensions, I will review relevant research in the child language literature.
Finally, I will propose a series of experiments with artificial languages to check specific claims LAS makes about language learnability.

2. Computer Language Understanding

Computers have been applied to natural language processing for 25 years. There has been a succession of major reconceptualizations of the problem of language understanding, each of which constitutes a clear advance over the previous conceptions. However, any realistic assessment would concede that we are very far from a general language understanding system of human capability. The argument has been advanced that there are fundamental obstacles that will prevent this goal from ever being realized (Dreyfus, 1972). These arguments are shamefully imprecise and lacking in rigor. The best (e.g., Bar-Hillel, 1962) has to do with the extreme open-endedness of language, that an effectively unbounded variety of knowledge is relevant to the understanding process. It is boldly asserted, without proof, that it is not possible to provide the computer with the requisite background knowledge.

In reviewing the work on natural language systems, I will constantly measure them with respect to the goal of general language understanding. I appreciate that it is a legitimate artificial intelligence goal to develop a language system for some special purpose application. Such attempts are free from the Dreyfus and Bar-Hillel criticisms. However, from any psychological point of view these systems are interesting only as they advance our understanding of how language is understood in general.

Machine Translation

The first intended application of computers to natural language was machine translation. A massive effort turned out to be a dismal failure (ALPAC, 1966). Today, it is fashionable to attribute the failure to the then-current impoverished conception of language (e.g., Simmons, 1970; Wilks, 1973). The early attempts took the form of substitution of equivalent words across languages. This was augmented by use of surface structure and word associations, but at no point was the word abandoned as the principal unit of meaning. Recent work on language understanding (e.g., Schank, 1972; Winograd, 1973) has abandoned the word as the unit of meaning. It remains to be seen whether current attempts (e.g., Wilks, 1973) at machine translation have better success.

Interactive Systems

The now popular task domain for applications of computers to language is in constructing systems that can interact with the user in his own language. Question-answering systems are the most common: the user can interrogate the program about its data base and input new knowledge. The input sentence is parsed into a representation and stored, or the parsed representation is used to guide an interrogation of the data base for the answer. The inference system is critical in the answering of questions, since many answers will not be directly stored but will have to be inferred from what is in memory.

Both parsing and inferencing run into time problems. The central time problem in parsing has to do with the extreme syntactic and lexical ambiguity of natural language. Each word in a sentence admits of m syntactic and semantic interpretations, where m on the average may be as high as 10. If there are n words, m^n interpretations must be considered, although only one is intended; at m = 10, a ten-word sentence admits on the order of 10^10 candidate interpretations. The fact that language is so ambiguous was a surprising discovery of the early machine attempts at parsing (e.g., Kuno, 1965). Thus, there is exponential growth in processing time with sentence length.
To date, no heuristics have been demonstrated that change in general this exponential function of sentence length to something closer to a linear function. The human can use general context to reduce ambiguity to something approximating the linear relation.

There is also an exponential growth factor in the task of inference. Suppose there are m facts in the data base and the desired deduction is n inferences long. Then there is something like m^n possible combinations of facts to consider in finding the desired deduction. This suggests that very deep inference is difficult to achieve, and this is certainly true of our every-day reasoning. However, it also suggests that inference making should become more difficult as we know more facts (i.e., high m), which is clearly not the case. The problem facing inference systems is to select only those facts that are relevant.

Resolution theorem-proving (Robinson, 1965) is the most studied of the mechanical inference systems. It is also here that the most careful work has been done on heuristics for selecting facts from the data base. These methods include semantic resolution (Slagle, 1965), lock resolution (Boyer, 1971), and linear resolution (Loveland, 1970; Luckham, 1970). In practical applications these heuristics have served to considerably reduce the growth in computation time. However, demonstrations of the optimality of these heuristics are task-specific; there are no general theorems about their optimality. I suspect that none deal effectively in general with the problems of exponential growth.

Although there are potentially serious time problems both in parsing and inferencing, these problems have not surfaced in past programs as one might have expected. This is because these programs have all been rather narrowly constrained. Their language systems only need to deal with a small portion of possible syntactic constructions and possible word meanings. Also, because of restrictions in the domain of discourse, only a restricted set of inferences are needed.

Some of the interactive systems (ELIZA - Weizenbaum, 1966; PARRY - Colby) made no serious effort to do a complete job of sentence analysis. Keyword matching was performed to permit success in narrowly circumscribed domains. Sentences were generated by filling in pre-programmed frames. The ambition in programs like Colby's or Weizenbaum's is to give the appearance of understanding. Weizenbaum's program simulated a Rogerian psychotherapist and Colby's a paranoia patient. Because these programs made no serious attempt at language understanding, their apparent successes might just be manifestations of clever tricks.

Other attempts made more serious efforts at language understanding. They avoided the time problems inherent in parsing and inferencing by dealing with restricted task domains. Slagle's DEDUCOM (1965) dealt with simple set inclusion problems; Green, Wolf, Chomsky & Laughery (1963) with baseball questions; Lindsay (1963) with kinship terms; Kellogg (1968) with data management systems; Woods (1968) with airline schedules; Woods (1973) with lunar geology; Bobrow (1964) and Charniak (1969) with word arithmetic problems; Fikes, Hart & Nilsson (1972) with a robot world; Winograd (1973) with a blocks world. Other systems like Green and Raphael (1968), Coles (1969), Schank (1972), Schwarcz, Berger, and Simmons (1969), Anderson and Bower (1973), Rumelhart, Lindsay and Norman (1972), and Quillian (1969) have not been especially designed for specific task domains but nonetheless succeed only because they worked with seriously limited data bases and restricted classes of English input. Because the parser deals with only certain word senses and certain syntactic structures, linguistic ambiguity is much reduced.
Those programs that use general inference procedures like resolution theorem proving are notably inefficient even with restricted data bases. Winograd made extensive use of the facilities of PLANNER for directing inferencing with specific heuristic information. The validity of these heuristics depended critically on the constraints of his blocks-world task domain.

Winograd (1973) has combined good task analyses, programming skill, and the powers of advanced programming languages to create the best extant language understanding system. I have heard it seriously claimed that the Winograd system could be extended to become a general model of language understanding. All that would be needed, supposedly, is to program in all the knowledge of an adult and extend the parsing rules to the point where they handled all English sentences. Admittedly, this would be a big task requiring hundreds of man-years but, it is argued, no greater than the work that goes into writing big operating systems. Clearly, this argument is faulty if only because it does not deal with the time problems in general inferencing and general parsing. Moreover, it is also unclear whether human language understanding can be captured by a fixed program. Further, it is dubious whether it is manageable to do the bookkeeping that is necessary to assure that all the specific pieces of knowledge are properly integrated and interact in the intended ways. Our linguistic competence is not a fixed object. This is clear over the period of years, as we learn new grammatical constructions, new styles, new words, and new ways of thinking. I think this is also true over short spans of time. That is, the way humans deal with the time problems inherent in parsing and inferencing is to adjust the parsing and inferencing according to context.

Language Acquisition as the Road to General Language Understanding

The preceding remarks were meant to suggest how an adaptive language system might provide the solution to the fundamental problems in general language understanding. Rather than defining and hand-programming all the requisite knowledge, we now let the language understanding system discover that knowledge and program itself. The language acquisition system is a mechanized bookkeeping system for integrating all the knowledge required for language understanding. By its very nature it treats linguistic knowledge as a constantly changing object. So we know it would change with a changing linguistic community. We might hope that it could adapt over short periods (like hours) to its current context.

Learning systems are frequently regarded as the universal panacea for all that ails artificial intelligence. Therefore, one should be rightfully suspicious whether LAS will provide a viable route to the creation of a general language understanding system. Certainly, the initial version of LAS falls far short of the desired goal. However, with our current state of knowledge it is just not possible to evaluate LAS's pretensions as an eventual language understanding system. It is only by systematic exploration and development of LAS that we will ever be able to determine the viability of the learning approach.

Whatever the potential of the learning approach in artificial intelligence, clearly it is the only viable psychological means of characterizing human linguistic knowledge. It would be senseless to provide a catalog of all the knowledge used in language understanding. A catalog of everything is a science of nothing (a quote from T. Bever). Rather, we must characterize the mechanism that creates that knowledge and how that mechanism interacts with experience.
The Woods System

The linguistic formalisms used by LAS are very similar to Woods' augmented transition networks. This section on computer language understanding concludes with a description of Woods' system and an exposition of the suitability of his formalisms for the current project. There are three critical features that LAS requires of the formalisms that will express its grammatical knowledge. First, it should be a formalism that can be used with equal facility for language parsing and language generation. This is because it is unreasonable to assume that a child independently learns how to speak and how to understand. Second, we want a formalism for which it is easy to devise a constructive algorithm for inducing grammar. That is to say, some descriptions of grammatical knowledge are computationally easier to induce than others, even though they describe the same language. Third, we want the formalism to be explicit about the assumptions it makes about the interpretative system that uses the grammar for speaking and understanding. This is because that interpretative system is taken as innate. Thus, it is not possible to induce new programs for interpreting the grammatical rules; it is only possible to induce new grammatical rules.

A guiding consideration in this research is that these requirements are reasonably well satisfied by a finite-state transition network representation. The problem is that natural languages are fundamentally more complex than finite state languages. However, Woods has shown a way to keep some of the advantages of the finite state representation but achieve the power of a transformational grammar.

Woods' augmented transition networks are similar to and were suggested by the network grammars of Thorne, Bratley, and Dewar (1968) and Bobrow and Fraser (1970). Transition networks are like finite state grammars except that one permits as labels on arcs not only terminal symbols but also names of other networks. Determination of whether the arc should be taken is evaluated by a subroutine call to another network. This sub-network will analyze a sub-phrase of the linguistic string being analyzed by the network that called it. The recursive, context-free aspect of language is captured by one network's ability to call another. Figure 1 provides an example network taken from Woods' (1970) paper. The first network in Figure 1 provides the "mainline" network for analyzing simple sentences. From this mainline network it is possible to call recursively the second network for analysis of noun phrases or the third network for the analysis of prepositional phrases. Woods (1970) describes how the network would recognize an illustrative sentence:

To recognize the sentence "Did the red barn collapse?" the network is started in state S. The first transition is the aux transition to state q1 permitted by the auxiliary "did." From state q1 we see that we can get to state q3 if the next "thing" in the input string is an NP. To ascertain if this is the case, we call the state NP. From state NP we can follow the arc labeled det to state q6 because of the determiner "the." From here, the adjective "red" causes a loop which returns to state q6, and the subsequent noun "barn" causes a transition to state q7. Since state q7 is a final state, it is possible to "pop up" from the NP computation and continue the computation of the top level S beginning in state q3, which is at the end of the NP arc. From q3 the verb "collapse" permits a transition to the state q4, and since this state is final and "collapse" is the last word in the string, the string is accepted as a sentence (pp. 591-592).

Figure 1. A sample transition network. S is the start state. (From Woods, 1970.)
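To make this control structure concrete, the following is a minimal sketch of a recursive-transition-network recognizer written in present-day Common Lisp. It is illustrative only: the network encoding, state names, and five-word lexicon are my own choices, not Woods' actual code, and the sketch covers just the S and NP networks needed for his example sentence.

;; A network is (name (state arc ...) ...).  An arc is either (pop),
;; marking a final state, or ((cat <class>) <next-state>) /
;; ((push <network>) <next-state>).
(defparameter *lexicon*
  '((did . aux) (the . det) (red . adj) (barn . n) (collapse . v)))

(defparameter *networks*
  '((s  (q0 ((cat aux) q1))
        (q1 ((push np) q3))
        (q3 ((cat v) q4))
        (q4 (pop)))
    (np (np0 ((cat det) q6))
        (q6 ((cat adj) q6)          ; the adjective loop
            ((cat n) q7))
        (q7 (pop)))))

(defun word-category (word) (cdr (assoc word *lexicon*)))
(defun state-arcs (net state)
  (cdr (assoc state (cdr (assoc net *networks*)))))
(defun start-state (net) (car (second (assoc net *networks*))))

(defun walk (net state words)
  ;; Returns every possible remainder of WORDS once this network,
  ;; started in STATE, has accepted a prefix (nondeterminism kept).
  (loop for arc in (state-arcs net state)
        append
        (cond ((equal arc '(pop)) (list words))
              ((eq (caar arc) 'cat)
               (when (and words
                          (eq (word-category (first words)) (cadar arc)))
                 (walk net (cadr arc) (rest words))))
              ((eq (caar arc) 'push)  ; subroutine call to a sub-network
               (loop for rest in (walk (cadar arc)
                                       (start-state (cadar arc)) words)
                     append (walk net (cadr arc) rest))))))

(defun recognize (sentence)
  (and (member nil (walk 's (start-state 's) sentence)) t))

;; (recognize '(did the red barn collapse))  =>  T

Note how the push arc suspends the top-level computation exactly as in Woods' description: the NP sub-network consumes "the red barn," and each way it can pop returns a possible remainder for the calling S network to continue with.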
I have illustrated in Figure 1 what is known as a recursive transition network, which is equivalent to a context-free phrase-structure grammar. Woods' networks are in fact of much stronger computational power - essentially that of a Turing Machine. This is because Woods permits arbitrary conditions and actions on the arcs. This gives the networks the ability of transformational grammars to permute, copy, and delete fragments of a sentence. Thus, with his network formalisms Woods can derive the deep structure of a sentence. The problem with this grammatical representation is that it is too powerful and permits computation of many things that are not part of a speaker's grammatical competence. In the LAS system all conditions and actions on network arcs are taken from a small repertoire of operations possible in the HAM memory system (see Anderson & Bower, 1973). This way some context-sensitive power can be introduced into the language without introducing psychologically unrealistic powers.

In many ways the network formalisms of Woods are isomorphic in their power and behavior to the program grammars of Winograd. However, there is one critical difference. The flow of control is contained in Winograd's program grammars; a particular program is committed to a certain behavior. This is not the case in the network formalism. The flow of control is not contained in the networks; it resides in the interpretative systems which use the grammatical knowledge contained in the networks. Thus, by using different interpretative systems, the same network grammar specification can be used in different ways. This is critical to LAS's success, where three different interpreters use the same grammatical formalisms to guide understanding, generation, and language induction.

3. Research on Grammar Induction

Apparently the modern work on the problem of grammar induction began with the collaboration of N. Chomsky and G. Miller in 1959 (see Miller, 1967). There have been significant formal results obtained in this field and it is essential that we review this research before considering LAS. The approach taken in this field is well characterized by the opening remarks of a recent highly-articulate review chapter by Biermann and Feldman (1972):

The grammatical inference problem can be described as follows: a finite set of symbol strings from some language L and possibly a finite set of strings from the complement of L are known, and a grammar for the language is to be discovered. ... Consider a class C of grammars and a machine M. Suppose some G in C and some I (an information sequence) in I(L(G)) are chosen for presentation to the machine M. ... Intuitively, M identifies G if it eventually guesses only one grammar and that grammar generates exactly L(G). (pp. 31-33)

The significant point to note about this statement is that it is completely abstracted away from the problem of a child trying to learn his language. There has been virtually no concern for algorithms that will efficiently induce the subset of grammars that generate natural languages. The problem is posed in general terms.
The concern is with inducing a characterization of the well-formed strings of the language. However, this is not the task which the child faces. He must learn the mappings between conceptualizations and strings of the language. He must understand what is spoken to him and learn how to speak. For him, a characterization of the well-formed strings is only a by-product of the mapping between sentences and meanings. In the formal work on language induction, there has been virtually no concern about the contribution that semantics might make.

The grammatical inference problem as characterized by Biermann and Feldman is without any practical solutions. Workable solutions do not exist because the set of a priori possible languages is too unrestricted. Workable solutions are possible to practical problems only when it is possible to greatly restrict the candidate languages or because important clues exist which guide the induction. Chomsky (1965) argued essentially for this view with respect to the problem of a child learning his first language. He suggested that the child could take advantage of linguistic universals which greatly restricted the possible languages. I will argue that such universals exist in the form of strong constraints between the structure of a sentence and the semantic structure of the referent. These constraints provide critical cues for the induction problem.

Gold's Work

Probably the most influential paper in the field is by Gold (1967). He provided an explicit criterion for success in a language induction problem and proceeded to formally determine which learner-teacher interactions could achieve that criterion for which languages. Gold considers a language identified in the limit if after some finite time the learner discovers a grammar that generates the strings of the language. He considers two information sequences: in the first the learner is presented with all the sentences of the language, and in the second the learner is presented with all strings, each properly identified as sentence or non-sentence. Then Gold asks this question: Suppose the learner can assume the language comes from some formally characterized class of languages; can he identify in the limit which language it is? Gold considers the classical nesting of language classes - finite cardinality languages, regular (finite state), context-free, context-sensitive, and primitive recursive. His classic result is that if the learner is only given positive information about the language (i.e., the first information sequence), then he can only identify finite cardinality languages. However, given positive and negative information (i.e., the second information sequence), he can learn primitive recursive languages.

The proof that the finite state class is not identifiable with only positive information is deceptively simple. Among the finite state languages are all languages of finite cardinality (i.e., with only finitely many strings). At every finite point in the information sequence the learner cannot know whether the language is generated by one of the infinitely many finite cardinality grammars which include the sample or by an infinite finite-state grammar which includes the sample.

It is similarly easy to prove that any language in the primitive recursive class can be induced given positive and negative information. It is possible to enumerate all possible primitive recursive grammars. Assume an algorithm that proceeds through this countably infinite enumeration, one grammar after another, until it finds the correct grammar, staying with a grammar as long as the information sequence is consistent with it. An incorrect grammar G will be rejected in finite time by the information sequence - either because the sequence will contain as a non-sentence a string generated by G, or as a positive instance a string not generated by G. Since the correct grammar has some finite position in the enumeration, the algorithm will eventually consider it and stay with it.
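The enumeration argument can be rendered concretely in miniature. In the following Common Lisp sketch (my own illustration, not Gold's notation) the "grammars" are simply predicates over strings, enumerated in a fixed order, and the learner always guesses the first grammar consistent with every labeled string seen so far:

;; A toy rendering of identification by enumeration.  The three
;; candidate languages and the example data are hypothetical.
(defparameter *enumeration*
  (list (cons 'all-as      (lambda (s) (every (lambda (w) (eq w 'a)) s)))
        (cons 'even-length (lambda (s) (evenp (length s))))
        (cons 'anything    (lambda (s) (declare (ignore s)) t))))

(defun consistentp (grammar data)
  ;; DATA is a list of (string . in-language-p) pairs from the informant.
  (every (lambda (d)
           (eq (if (funcall (cdr grammar) (car d)) t nil) (cdr d)))
         data))

(defun identify (information-sequence)
  ;; Re-scan the enumeration after each datum: blind, but convergent.
  (let ((data '()))
    (dolist (example information-sequence)
      (push example data)
      (format t "after ~a guess ~a~%" example
              (car (find-if (lambda (g) (consistentp g data))
                            *enumeration*))))))

;; (identify '(((a a) . t) ((a b) . t) ((a) . nil)))
;; after ((A A) . T)   guess ALL-AS
;; after ((A B) . T)   guess EVEN-LENGTH
;; after ((A) . NIL)   guess EVEN-LENGTH   ; settled, in the limit

Because the correct language occupies some finite position in the enumeration and no later datum can refute it, the learner eventually reaches it and stays there. The sketch also makes the impracticality vivid: nothing guides the search except refutation by the data.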
There are somewhat faster enumeration algorithms than the above, but these do not change the basic character of the result. The algorithm outlined in the second proof is hopelessly slow. For instance, the position of English would be astronomical in any reasonable ordering of all possible context-sensitive languages on its terminal symbols. However, Gold also proved that there is no generally more effective technique. That is to say, given any algorithm that is more effective for some languages, one can pick some language for which the enumeration algorithm will be faster.

So, Gold leaves us with two very startling results that we must live with. First, only finite cardinality languages can be induced without use of negative information. This is startling because children get little negative feedback and make little use of what negative feedback they do get (Brown, 1973). Second, no induction technique is generally better than blind enumeration. This is startling because blind enumeration is hopeless as a practical induction algorithm for natural language. We will see how natural language can be induced despite Gold's results, but first let us review some other research of the same ilk.

One of the early attempts to provide a constructive algorithm was made by Solomonoff. That is, he attempted to define an algorithm which would construct bit by bit the correct grammar rather than enumerating possible grammars. LAS is a constructive algorithm. His ideas were never programmed and had their own logical flaws exposed by Shamir and Bar-Hillel (1962) and by Horning (1969). In part, Solomonoff has served as a straw man used to justify the enumerative approach over the constructive (e.g., Horning, 1969).

Feldman and his students have carried the Gold analyses further. Feldman (1970) provided some further definitions of language identifiability and proved Gold-like results for these. Feldman considered not only the task of inferring a grammar that generated the sample, but also the task of inducing the most simple grammar. Grammar complexity was measured in terms of number of rules and the complexity of sentence derivations. Horning (1969) provided procedures for inducing grammars whose rules have different probabilities. Biermann (1972) provided a number of efficient constructive algorithms for inducing finite state grammars when the number of states is known. This is a relatively tractable problem first formulated in 1956 by Moore; however, Moore's algorithms are much less efficient than Biermann's.

Pao (1969) formalized an algorithm for finite state grammar induction that did not require the number of states to be known in advance. A sample set of sentences was provided which utilized all the rules in the grammar. A minimal finite state network was constructed that generated exactly those sentences. Then an attempt was made to generalize by merging nodes in the network. The algorithm checked the consequences of potential generalizations by asking the teacher whether particular strings were in the target language. Pao represented these induction networks in a manner similar to Woods' networks. She found that such induction was much more efficient if she provided punctuation of where embedded networks occur - in effect, information about the sentence's surface structure. Ruff (1963) similarly found that grammar induction proceeds more easily when surface structure information is provided. Crespi-Reghizzi's program was given information about the surface structure of sample sentences to aid in the induction of operator-precedence languages, a subclass of context-free languages. For a special subset of operator-precedence languages he was able to define an algorithm that worked with only positive information. Except for the trivial result for finite cardinality languages, this is the only available result of success with just positive information.
Pao and Crespi-Reghizzi have shown that relatively efficient, constructive algorithms are possible for interesting language classes if the algorithms have access to information about the sentence's surface structure. The problem with their work is that this information is provided in an ad hoc manner. It has the flavor of cheating and certainly is not the way things happen with respect to natural language induction. Still, I think the work of Pao and of Crespi-Reghizzi is promising because, as I will show, the surface structure of the sentence may be inferred by comparing it to its semantic referent. Crespi-Reghizzi has also shown how the properties of a restricted subclass of languages can be used to reduce the reliance on negative information. While natural languages certainly have aspects that can be best captured with context-sensitive grammatical formalisms, most context-sensitive languages are ridiculous candidates for a natural language. An efficient induction algorithm should not become bogged down, as does Gold's enumeration technique, considering these absurd languages.

Grammar as a Mapping Between Sentence and Conception

There is one sense in which all the preceding work is irrelevant to the task of inducing a natural language. It has as its goal the induction of a correct syntactic characterization of a target language. But this is not what natural language learning is about. In learning a natural language the goal is to learn a map that allows us to go from sentences to their corresponding conceptual structures or vice versa. I argue that this task is easier than learning the syntactic structure of a natural language. This is not because there is any magic power in semantics per se, but because natural languages are so structured that they incorporate in a very non-arbitrary manner the structure of their semantic referents.

The importance of semantics has been forcefully brought home to psychologists by a pair of experiments by Moeser and Bregman (1972, 1973) on the induction of artificial languages. They compared language learning in the situation where their subjects only saw well-formed strings of the language versus the situation where they saw well-formed strings plus pictures of the semantic referents of these strings. In either case, the criterion test was for the subject to be able to detect which strings of the language were well-formed - without aid of any referent pictures. After 3000 training trials, subjects in the no-referent condition were at chance in the criterion test whereas subjects in the referent condition were essentially perfect.

The Role of Semantics

Results like those of Moeser and Bregman have left some with the belief that there is some magic power in having a semantic referent. However, there is no necessary advantage to having a semantic referent. The relationship between a sentence and its semantic referent could, in principle, be an arbitrary recursive relation. Inducing this relation is at least as difficult as inducing an arbitrary recursive language. This claim was in need of a proof, which I have provided (Anderson, 1975). It is too technical to reproduce here, but the basic idea is that an algorithm able to induce an arbitrary semantic relation from sentence-referent pairs could be used to identify an arbitrary language. By Gold's work, then, no induction algorithm for the semantic relation can be more effective than the impossible enumeration algorithm for identifying an arbitrary language. Thus, for it to be possible to induce the semantic relation, there must be strong constraints on the possible form of that semantic relation.

How does the semantic referent facilitate grammar induction? There are at least three ways.
First, rules of natural language are not formulated with respect to single words but with respect to word classes like noun or transitive verb which have a common semantic core. So semantics can help determine the word classes. This is much more efficient than learning the syntactic rules for each word separately. Second, semantics is of considerable aid in generalizing rules. A general heuristic employed by LAS is that, if two syntactically similar rules function to create the same semantic structure, then they can be merged into a single rule. Third, there is a non-arbitrary correspondence between the structure of the semantic referent and the structure of the sentence which permits one to punctuate the sentence with surface structure information. The nature of this correspondence will be explained later.

Siklossy's Work

The only attempt to incorporate semantics as a guide to grammar induction was by Siklossy (1971). He attempted to write a program that would be able to learn languages from the language-through-pictures books (e.g., Richards et al., 1961). The books in this series attempt to teach a language by presenting pictures paired with sentences that describe them. Siklossy's program, ZBIE, used general pattern-matching techniques to find correspondences between the pictures (actually hand-coded picture descriptions) and the sentences. The program does use information in the picture encodings to help induce the surface structure of the sentences, in the manner of LAS. However, it remains unclear exactly what use it makes of semantics or what kinds of languages the program can learn. The displayed examples of the program's behavior are very sparse with examples of generalizations. As we will see, a program must have strong generalization powers if it is to learn a language. The few examples of generalization are of the following sort. Suppose ZBIE sees the following three sentences:

1) John walks
2) Mary walks
3) John talks

From these, ZBIE will generalize that "Mary talks" is also an acceptable sentence. It does not appear capable of generalizations much more powerful than these, and Siklossy provides no discussion of how his program's behavior relates to a child's learning of a language.

The one example of an attempt to simulate child language acquisition is Kelley (1967). His program attempted to simulate the child's progress in producing utterances from one word, to two words, to three words. The program was claimed to be making use of semantic information, but Kelley never specifies how it contributed to the program's performance. In general, the details of the program are not explained. For example, the program never gets to the point of producing grammatical sentences and it is unclear whether it could.

4. Rationale

A basic assumption in the LAS project is that a language learner can sometimes figure out the meaning of sentences and that language learning takes place in such circumstances. The specific goal is to explain how the pairing of a sentence with a semantic referent permits language learning - that is, to develop a computer program which can acquire a language from sentences paired with semantic interpretations. Because of the complexity of the required induction algorithms, it is essential that a theory of language acquisition take the form of a computer program. This project does have as an ultimate goal to provide a faithful simulation of child language acquisition. One might question whether a system constructed just to succeed at language learning will have much in common with the child's acquisition system.
I strongly suspect it will, provided we insist that the system have the same information-processing limitations as a child and provided its language learning situation has the same information-processing demands as that of the child. One consideration underlying this optimistic forecast is that learning a natural language imposes very severe and highly unique information-processing demands on any induction system and, consequently, there are very severe limitations on the possible structures for a successful system. A similar argument has been forcefully advanced by Simon (1969) with respect to the information-processing demands of various problem-solving tasks.

The current version of the program, LAS.1, works in an overly simplified domain and makes unreasonable assumptions about the information available to the learner. Nonetheless, it predicts many of the gross features of generalization in child language learning. It is terribly "off" in other aspects. It turns out that many of its failures of simulation can be traced to the unrealistic assumptions it is making about task domain and information-processing abilities. Many of the proposed developments of the program have as their goal the elimination of these unrealistic assumptions. The assumptions were made to make the problem more tractable in a first-pass attempt.

5. The Program LAS.1

This section describes LAS.1, a relatively small program that was put together in eight months. It has achieved success in a non-trivial natural language induction task. The proposed research will be principally concerned with extending it. LAS.1 consists of SPEAK, which uses the network formalisms to generate sentences; UNDERSTAND, which uses the same networks for sentence understanding; BRACKET, which punctuates sentences with their surface structure by comparing them to their semantic referents; SPEAKTEST, which builds an initial network grammar to parse a sentence; and GENERALIZE, which generalizes the initial grammar.

LAS is an interactive program written in Michigan LISP (Hafner & Wilcox, 1975). The program accepts as input lists of words, which it treats as sentences, and scene descriptions encoded in a variant of the HAM propositional language (see Anderson & Bower, 1973). It obeys commands to speak, understand, and learn. The logical structure of LAS is illustrated in Figure 2. Central to LAS is an augmented transition network grammar similar to that of Woods (1970). In response to the command Listen, LAS evokes the program UNDERSTAND. The input to UNDERSTAND is a sentence. LAS uses the information in the network grammar to parse the sentence and obtain a representation of the sentence's meaning. In response to the command Speak, LAS evokes the program SPEAK. SPEAK receives a picture encoding and uses the information in the network grammar to generate a sentence to describe the encoding. Note that LAS is using the same network formalism both to speak and understand. The principal purpose of SPEAK and UNDERSTAND in LAS is to provide a test of the grammar induced by LEARNMORE.

The philosophy behind the LEARNMORE program is to provide LAS with the same information that a child has when he is learning a language through ostension. It is assumed that in this learning mode the adult can both direct the child's attention to what is being described and focus the child on that aspect of the situation which is being described.
Thus, LEARNMORE is provided with a sentence, a HAM description of the scene, and an indication of the main proposition in the sentence. It is to produce as output the network grammar that will be used by SPEAK and UNDERSTAND. It is possible that the picture description provides more information than is in the sentence. This provides no obstacle to LAS's heuristics. In this particular version of LAS it is assumed that LAS already knows the meaning of the content words in the sentence. With this information BRACKET will assign a surface structure to the sentence. SPEAKTEST will determine whether the sentence is handled by the current grammar. If not, additions are made to handle this case. These additions generalize to other cases, so that LAS can understand many more sentences than the ones it was explicitly trained with.

Figure 2. A schematic representation of the major subcomponents of LAS - LEARNMORE, SPEAK, and UNDERSTAND - giving the inputs (sentence, main proposition, picture encoding) and outputs (network grammar, HAM conceptualization, sentence) of each.

The SPEAKTEST program alone would permit LAS to construct a parsing network adequate to handle all the sentences it was presented with. It also makes many low-level generalizations about phrase structures and word classes, which permit LAS to successfully analyze or generate many novel sentences. However, the more powerful generalization embodied in the GENERALIZE program is essential to achieving realistic grammars.

The HAM.2 Memory System

LAS.1 uses a version of the HAM memory system (see Anderson & Bower, 1973) called HAM.2. HAM.2 provides LAS with two essential features. First, it provides a representational formalism for propositional knowledge. This is used for representing the comprehension output of UNDERSTAND, the to-be-spoken input to SPEAK, the semantic information in long-term memory, and syntactic information about word classes. Second, HAM.2 contains a memory search algorithm, MATCH1, which is used to evaluate various parsing conditions. For instance, the UNDERSTAND program requires that certain features be true of a word for a parsing rule to apply. These are checked by MATCH1. The same MATCH1 process is used by the SPEAK program to determine whether the action associated with a parsing rule creates part of the to-be-spoken structure. This MATCH1 process is a variant of the one described in Anderson and Bower (1973, Chs. 9 & 12) and its details will not be discussed here.

However, it would be useful to describe here the representational formalisms used by HAM.2. Figure 3 illustrates how the information in the sentence "A red square is above the circle" would be represented with the HAM.2 network formalisms. There are four distinct propositions about two nodes X and Y: X is red, X is a square, X is above Y, and Y is a circle. Each proposition is represented by a distinct tree structure. The tree structure consists of a root proposition node connected by an S link to a subject node and by a P link to a predicate node. The predicate node may be decomposed into an R link pointing to a relation node and an O link pointing to an object node. The semantics of these representations are to be interpreted in terms of simple set-theoretic notions. The subject is a subset of the predicate. Thus, the individual X is a subset of the red things, the square things, and the things above Y. The individual Y is a subset of the circular things. One other point needs emphasizing about this representation.
There is a distinction made between words and the concepts which they reference. The words are connected to their corresponding ideas by links labelled W. Figure 3 illustrates all the network notation needed in the current implementation of LAS. There are a number of respects in which this representation is simpler than the old HAM representation. There are not the means for representing the situation (time and place) in which a fact is true or for embedding one proposition within another. Thus, we cannot express in HAM.2 such sentences as "Yesterday in my bedroom a red square was above the circle" or "John believes that a red square is above the circle."

Figure 3. An example of a propositional network representation in HAM.2.

These omissions are only consequential beyond the domain of ostension. In learning through ostension, the assumed time and place are the here and now, and concepts like belief which require embedded propositions are too abstract for ostension. In future research LAS will be extended beyond the ostensive domain. At that point, complications will be required in the representations; however, when starting out on a project it is preferable to keep things as simple as possible.

There are a number of motivations for the associative network representation. Anderson and Bower (1973) have combined this representation with a number of assumptions about the psychological processes that use it. Predictions derived from the Anderson and Bower model have proven to be generally true of human cognitive performance, although many aspects of HAM's representation have not been tested. One feature that recommends associative network representations for a computer has to do with the facility with which they can be searched. Another advantage of this representation is particularly relevant to the LAS project. This has to do with the modularity of the representation. Each proposition is coded as a network structure that can be accessed and used independently of other propositions.

So far, I have shown how the HAM.2 representation encodes the episodic information input to SPEAK and output by UNDERSTAND. It can also encode the semantic and syntactic information required by the parsing system. Figure 4 illustrates how HAM.2 would encode the facts that circle and square are both shapes, that red and blue are both colors, and that circle and red belong to the word class *CA while square and blue belong to the word class *CB. Note that the word class information is predicated of the words while the categorical information is predicated of the concepts attached to these words. The categorical information would be used if a syntactic rule applied only to shapes or only to colors. The word class information might be evoked if a language arbitrarily applied one syntactic rule to one word class and another rule to a different word class. Inflections are a common example of syntactic rules which apply to arbitrarily defined word classes.

HAM.2 has a small language of commands which cause various memory links to be built. The following four are all that are currently used:

1. (Ideate X Y) - create a W link from word X to idea Y.
2. (Out-of X Y) - create a proposition node Z; from this root node create an S link to subject X and a P link to predicate Y.
3. (Relatify X Y) - create an R link from X to relation Y.
4. (Objectify X Y) - create an O link from X to object Y.

These commands appear in LAS's parsing networks to create memory structures. Often, rather than memory nodes, variables (denoted X1, X2, etc.) appear in these commands. If the variable has as its value a memory node, that node is used in the structure building. If the variable has no value, a memory node is created and assigned to it, and that node is used in the memory operation.
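As a concrete (and purely illustrative) rendering of these four commands, the following Common Lisp sketch stores links as (LABEL FROM TO) triples. The link store and the use of GENSYM for fresh proposition nodes are my simplifications of HAM.2's actual memory machinery:

;; Links are (LABEL FROM TO) triples kept on a global list.
(defvar *links* '())

(defun add-link (label from to)
  (push (list label from to) *links*)
  to)

(defun ideate (word idea)              ; W link: word -> idea
  (add-link 'w word idea))

(defun out-of (subject predicate)      ; S and P links from a fresh
  (let ((prop (gensym "PROP")))        ; proposition node
    (add-link 's prop subject)
    (add-link 'p prop predicate)
    prop))

(defun relatify (predicate relation)   ; R link
  (add-link 'r predicate relation))

(defun objectify (predicate object)    ; O link
  (add-link 'o predicate object))

;; e.g. encoding "X is red" after linking the word to its idea:
;; (ideate 'red 'idea1)  (out-of 'x 'idea1)

With these definitions in mind, the command listing below can be read directly.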
Tf the nodes, variables (denoted X1, Xe, etc) variable hes as 445 value a memory node that node is used in the structure puilding. if the variable has no value, @ memory node is created and assigned to it and that node is used in the menory operation. To illustrate the use of these comzands, the following is & Listing of the commands that would create the structure in figure 3: 92 Seo C P pf \8 \ fo > *COLOR { J CIRCLE #OA RED SQUARE Hig | BLUE Figure 4. An example of a HAM structure encoding both categorical information and word class information a (Ideate red 1) (Ideate square 2) (Ideate above 3) (Ideate circle hy (Out-of KL) (out-of X 2) (Out-of X 8) (Oojectify & Y) (Relatify 8°3) (Out-oF Y t ply to any m2 st languages will b will also be used to illustrate the SPEAX anc UNDER eribed shortly. The first, GRAMMARL, is a simple artificial grammer. Ta second, GRAMMAR, is a more complex gramzar Tor @ Su aYnAAD D aefined by tne rewrite rules in Table l. GRAMMARL wa mally different fron Englisn word order. Tne sentenc be read as asserting the first noun-phrese nas t last word to the second moun phrase. For purpose il of these languages are English but they need nov oe. GRAMMAR] 1s 4 finite Language without recurssicn. In contrast, in GRAL-MARe the NP element has an amb dann t OTATOS eratnh naam manwnndiareadear aot) MD oeamarctine 2 nntoential infin- Sp eee Se ee MOCUPILV Ry mem y peor cv. meen et constructions. we a ite embedding of uy fo 3 fp In both gremmers, it is assumed that above end below are connected to the idea as are right-of and left-of. Tne words differ in the assigment of their NP arguments to subject and object roles- Tnus the difference between the word pairs is syntactic This is indicated by having the words pelong to two word classes RA and RB. Thus, UNDERSTAND with CRANMAR2 would derive the same HAM representation in Figure 3 for the sentences The red square is evo0ve the circle and The circle is below the rea square. It yould have been pos sible to generate distinct representations For these two sentences. I think this would have Deen less psycholegicetly interesting. Basically, the network ise grangar makes the inferences that A below 3 quivalent to B above A and en- codes the latter. TABLE 1 The Two Test Grammars GRAMMARL GRAMMAR Ss + WP NP RA Ss: +> WNP is ADJ NP NP RB NP is RA Ne NP + SHAPE (COLOR) (SIZE) WP is RB NP SHAPE + square, circle, et. NP + (the,a) NP* CLAUSE» COLOR + red, blue, etc. . ye* +» SHAPE SIZE > large, small, etc. . + ADI SHAPE RA- > above, right-of CLAUSE > that is ADI that is RA NP 27 TART? 4 abbas TABLE 2 continued g +» below, left-ot CLAUSE + thet is RB uP SHAPE + square, circle, euc. ADS + red, bis, blue, ebc. RA + above, right-of RB + below, left-of Figure 5 illustrates the parsing netuorss for the grammars. It should be understood that thes networks have been deliberately written in an inefri- cient manner. For instence, note in CRAVMARL thet there are tyo distincs patns in the main START network. Tae first is for tnose sentences viva RA relations and the second for tnose sentences with 2B relations. If a sentence input to UNDERSTAID nas a RB relavion, UNDERSTAND will first attempe to parse it by the first branch. The tyvo noun phrase branches will succeed bus the relation branch will fail. UNDERSTAND will have to back-up and try the second branc that leads to 23. This costly back-up 25 not really necessary. 
It would have been possible to have constructed the START network in the following form:

START --NP--> S1 --NP--> S2 --RA--> STOP
                         S2 --RB--> STOP

In this form the network does not branch until the critical relation word is reached.

Table 2 provides a formal specification of the information stored in LAS's network grammars. A node either has a number of arcs proceeding out of it (rule 1a) or it is a stop node (rule 1b). In speaking and understanding, LAS will try to find some path through the network ending with a stop node. Each arc consists of some condition that must be true of the sentence if the arc is to be taken in parsing (understanding) the sentence. Each arc also specifies an action to be taken if the condition is met. This action builds a conceptual structure to correspond to the material parsed to that point. Finally, an arc includes a specification of the node to which control should transfer after performing the action. An action consists of zero or more HAM memory commands (rule 3). A condition may consist of zero or more memory commands also (rule 4a); these specify features that must be true of the incoming word. Alternatively, a condition may be a push to an embedded network (rule 4b). For instance, suppose the structure in Figure 3 were to be spoken using GRAMMAR1. The START network would be called to realize the X is above Y proposition. The embedded NP network would be called to realize the X is red and X is square propositions. In pushing to a network two things must be specified -- NODE, which is the embedded network, and VAR, which is the memory node at which the main and embedded propositions intersect. The element t in rule 4b is a place-holder for information that is needed by the control mechanisms of the UNDERSTAND program. The three rules 6a, 6b, and 6c specify three types of arguments that memory commands can have. They can either directly refer to memory nodes, or refer to the current word in the sentence, or refer to variables.

Figure 5. The network grammars used by LAS for GRAMMAR1 and GRAMMAR2.

TABLE 2

NODE -> ARCS                      (1a)
     -> stop                      (1b)
ARC -> CONDITION ACTION NODE      (2)
ACTION -> COMMAND*                (3)
CONDITION -> COMMAND*             (4a)
          -> push VAR t NODE      (4b)
COMMAND -> FUNCTION ARG ARG       (5)
ARG -> memory node                (6a)
    -> word                      (6b)
    -> X1, X2, X3, ...            (6c)
FUNCTION -> out-of, objectify, relatify, ideate   (7)

Table 3 provides the encoding of the networks for GRAMMAR1. Note that there tends to be a one-to-one correspondence between rewrite rules and LAS networks: each network expresses what a single rewrite rule expresses. The correspondence is not quite exact, because there are rules in GRAMMAR1 or GRAMMAR2 that do not have conceptual structures to command them.

These grammar networks have a number of attractive features for sentence comprehension and generation. SPEAK and UNDERSTAND use the same network for both. Thus, LAS is the first extant system to have a uniform grammatical notation for its parsing and generation systems. In this way, LAS has only to induce one set of grammatical rules to do both tasks. Such networks are modular in two senses. First, they are relatively independent of each other. Second, they are independent of the SPEAK and UNDERSTAND programs that use them. This modularity greatly simplifies LAS's task of induction. LAS need only induce the network grammars; the interpretative SPEAK and UNDERSTAND programs represent innate linguistic competences. Finally, the networks themselves are very simple, with limited conditions and actions. Thus, LAS need consider only a small range of possibilities in inducing a network.
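To make the Table 2 format concrete, the arcs can be transcribed directly as Lisp data in the style of the Table 3 listing given shortly. The traversal function below is my own simplified sketch: it takes the first acceptable arc and omits the backup behavior the real programs need.

;;; A node's arcs stored as in (DEFPROP name PATH ...); each arc is
;;; (CONDITION ACTION NEXT-NODE).  Here, the S3 node of GRAMMAR1:

(setf (get 's3 'path)
      '(( ((ideate word x4) (out-of word *ra))   ; CONDITION
          ((relatify x5 x4))                     ; ACTION
          stop )))                               ; NEXT-NODE

(defun walk (node test-condition do-action)
  "Follow arcs from NODE to STOP, using caller-supplied TEST-CONDITION
and DO-ACTION functions.  Returns T on reaching STOP, NIL on a dead
end (where the real UNDERSTAND would back up and try another path)."
  (if (eq node 'stop)
      t
      (let ((arc (find-if (lambda (a) (funcall test-condition (first a)))
                          (get node 'path))))
        (when arc
          (funcall do-action (second arc))       ; run the arc's ACTION
          (walk (third arc) test-condition do-action)))))

;; (walk 's3
;;       (lambda (condition) (declare (ignore condition)) t)
;;       (lambda (action) (print action)))
;; prints ((RELATIFY X5 X4)) and returns T on reaching STOP.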
The network formalism gains its expressive power from the embedding of networks. Because of network modularity, the induction task does not increase with the complexity of embedding.

It might be questioned whether it is really a virtue to have the same representation of grammatical knowledge both for understanding and production. It is a common observation that children's ability to understand sentences precedes their ability to generate sentences. LAS would not seem able to simulate this basic fact of language learning. However, there may be reasons why child production does not mirror comprehension other than that different grammatical competences underlie the two. The child may not yet have acquired the physical mastery needed to produce certain words. This is the case, for instance, with Lenneberg's (1962) anarthric child, who understood language he could not produce. It is also possible that the apparent precedence of comprehension is partly an artifact of the measures of production; Fernald (1972), using different scoring procedures, found the difference between comprehension and production much reduced.

TABLE 3
The Construction of GRAMMAR1

(DEFPROP START PATH
 (((PUSH X1 T NP) ((OUT-OF X1 X5)) S2)
  ((PUSH X1 T NP) ((OBJECTIFY X5 X1)) S4)))
(DEFPROP S2 PATH
 (((PUSH X2 T NP) ((OBJECTIFY X5 X2)) S3)))
(DEFPROP S3 PATH
 ((((IDEATE WORD X4) (OUT-OF WORD *RA)) ((RELATIFY X5 X4)) STOP)))
(DEFPROP S4 PATH
 (((PUSH X2 T NP) ((OUT-OF X2 X5)) S5)))
(DEFPROP S5 PATH
 ((((IDEATE WORD X4) (OUT-OF WORD *RB)) ((RELATIFY X5 X4)) STOP)))
(DEFPROP NP PATH
 ((((IDEATE WORD X4) (OUT-OF X4 *SHAPE)) ((OUT-OF X1 X4)) NP2)))
(DEFPROP NP2 PATH
 (((PUSH X1 T COLOR) NIL NP3)
  (NIL NIL NP3)))
(DEFPROP NP3 PATH
 (((PUSH X1 T SIZE) NIL STOP)
  (NIL NIL STOP)))
(DEFPROP COLOR PATH
 ((((IDEATE WORD X4) (OUT-OF X4 *COLOR)) ((OUT-OF X1 X4)) STOP)))
(DEFPROP SIZE PATH
 ((((IDEATE WORD X4) (OUT-OF X4 *SIZE)) ((OUT-OF X1 X4)) STOP)))

The listing concludes with IDEATE and OUT-OF commands defining the test vocabulary (square, circle, and triangle as members of *SHAPE; red, green, and blue as members of *COLOR; small, medium, and large as members of *SIZE; above and right-of as members of *RA; below and left-of as members of *RB) and with calls to the TALK test routine.

SPEAK is given a HAM network of propositions tagged as to-be-spoken and a topic of the sentence. The topic of the sentence will correspond to the first meaning-bearing element. SPEAK searches through its START network looking for a path which expresses a proposition attached to the topic and which expresses the topic as first element. It determines whether a path accomplishes this by evaluating the actions associated with the path and determining whether they create a structure that appropriately matches the to-be-spoken structure. When it finds such a path it uses it for generation. Generation is accomplished by evaluating the conditions along the path.
If a condition involves a push to an embedded network, SPEAK is recursively called to speak some sub-phrase expressing a proposition attached to the main proposition. The arguments for a recursive call of PUSH are the embedded network and the node that connects the main proposition and the embedded proposition. If the condition does not involve a PUSH it will contain a set of memory commands specifying that some features be true of a word. SPEAK will use these features to determine what the word is. The word so determined will be added to the sentence.

As an example, consider how SPEAK would generate a sentence corresponding to the HAM structure in Figure 6 using the English-like GRAMMAR2 of Figure 5. Figure 6 contains a set of propositions about three objects denoted by the nodes G246, G195, and G182. Of node G246 it is asserted that it is a triangle and that G195 is right of it. Of G195 it is asserted that it is a square and that it is above G182. Of G182 it is asserted that it is square, small, and red. Figure 7 illustrates the generation of a sentence from this structure with GRAMMAR2. LAS enters the START network intent on producing an utterance about G195. Thus, the topic is G195 (it could have been G246 or G182). The first path through the network involves predicating an adjective of G195, but there is nothing in the adjective class to assert of it. The second path through the START network corresponds to what the network has to say about G195 -- it is above G182. Therefore, LAS picks this as the main proposition. First, it must find some noun phrase to express G195. The substructure under G195 in Figure 7 reflects the construction of this noun phrase. The NP network is called, which prints the and calls NP1, which retrieves square, and calls CLAUSE, which prints that, is, and right-of and then recursively calls NP to express the triangle. Similarly, recursive calls are made on the NP1 network to express G182 as the small red square. The actual sentence generated is dependent on the choice of topic in the START network. Given the same to-be-spoken network, but the topic G246, SPEAK generated A triangle is left-of a square that is above a small red square. Given the topic G182 it generated A square that is below a square that is right-of a triangle is small. Note how the choice of the relation words left-of vs. right-of and above vs. below is dependent on choice of topic.

It is interesting to inquire what is the linguistic power of LAS as a speaker. Clearly it can generate any context-free language since its transition networks correspond, in structure, to a context-free grammar. However, it turns out that LAS has certain context-sensitive aspects because its productions are constrained by the requirement that they express some well-formed HAM conceptual structure. Consider two problems that Chomsky (1957) regarded as not handled well by context-free grammars. The first is agreement of number between a subject NP and verb. This is hard to arrange in a context-free grammar because the NP is already built by the time the choice of verb number must be made. The solution is trivial in LAS -- when the NP and verb are spoken, their number is determined by inspection of whatever concept in the to-be-spoken structure underlies the subject. The other Chomsky example involves the identity of selectional restrictions for active and passive sentences. This is also achieved automatically in LAS, since the restrictions in both cases are regarded simply as reflections of restrictions in the semantic structure from which both sentences are spoken.
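The generation cycle just described can be made concrete with a toy sketch. The condition forms, the flat lexicon, and the function names below are my simplifications, not the program's actual representation; the real conditions are HAM memory commands.

;;; Toy sketch of SPEAK's evaluation of a chosen path.

(defvar *lexicon*
  '((g246 shape triangle) (g195 shape square)
    (g182 shape square) (g182 color red) (g182 size small))
  "Toy memory: (node class word) triples.")

(defun word-for (node class)
  (third (find-if (lambda (entry)
                    (and (eq (first entry) node) (eq (second entry) class)))
                  *lexicon*)))

(defun speak (path)
  "Evaluate the conditions along PATH, collecting the words spoken."
  (loop for condition in path
        append (ecase (first condition)
                 (push (speak (second condition)))  ; recurse on embedded path
                 (say  (list (second condition)))   ; function word
                 (word (list (word-for (third condition)
                                       (second condition)))))))

;; (speak '((say the) (word shape g182) (say that) (say is) (word color g182)))
;; => (THE SQUARE THAT IS RED)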
While LAS can handle those features of natural language suggestive of context-sensitive rules, it cannot handle languages of the form a^n b^n c^n, which require context-sensitive grammars. It is interesting, however, that it is hard to find natural language sentences of this structure. The best I can come up with are respectively-type sentences, e.g., John and Bill hit and kissed Jane and Mary, respectively. This sentence is of questionable acceptability.

Figure 6. The to-be-spoken HAM network for the SPEAK program.

Figure 7. A tree structure showing the network calls and words emitted in generating a sentence about G195 which expresses the information contained in Figure 6.

The UNDERSTAND program requires a more complex control structure. It must consider the possibility that a failure may occur at a later point on a path, so that it is sometimes necessary to back into a network a second time. Perhaps an English example would be useful to motivate the needed control structure. Compare the two sentences The Democratic party hopes to win in '76 and The Democratic party hopes are high for '76. A main parsing network would call a noun-phrase network to identify the first noun phrase. Suppose UNDERSTAND identifies The Democratic party. Later elements in the second sentence would indicate that this parse is wrong. Therefore, the main network would have to re-enter the noun-phrase network with a different parsing to retrieve The Democratic party hopes. On re-entering the noun-phrase network, UNDERSTAND must remember which parsings it tried the first time so that it does not retrieve the same old parsing. The complexities of this control structure are described in a more complete report (Anderson, 1975). Here I will give only the general structure of the program. UNDERSTAND attempts to find some path through the START network which will provide a complete parsing of the sentence. It evaluates the acceptability of a path by evaluating the conditions associated with that path. Conditions can demand that certain features be true of words in the sentence. This is determined by checking memory. Alternatively, a condition can require a push to an embedded network. This network must parse some subphrase of the sentence. When LAS finds an acceptable path through a network, it will collect the actions along that path to create a temporary memory structure representing the meaning of what it has parsed. Thus, for instance, given the sentence The square that is right-of the triangle is above the square, LAS would parse it into the form illustrated in Figure 6.

It is also of interest to consider the power of LAS as an acceptor of languages. It is clear that LAS as presently constituted can accept exactly the context-free languages. This is because, unlike in Woods' (1970) system, actions on arcs cannot influence the results of conditions on arcs and therefore play no role in determining whether a string is accepted or not. What is interesting is that LAS's behavior as a language understander is relatively little affected by its limitations in grammatical power. Consider an example where it might seem that LAS would need a context-sensitive grammar: in English noun phrases, it seems we can have an arbitrary number of adjectives.

General Conditions for Language Acquisition

Having now reviewed how LAS.1 understands and produces sentences, I will present the three aspects of the induction program: BRACKET, SPEAKTEST, and GENERALIZE.
Before doing so, it is wise to state briefly the conditions under which LAS learns a language. It is assumed that LAS.1 already has concepts attached to the words of the language. That is, lexicalization is complete. The task of LAS.1 is to learn the grammar of the language -- that is, how to go from a string of words to a representation of their combined meaning. Because LAS.1 is not concerned with learning meanings, it cannot be a very realistic model of first language learning; it is perhaps closer to second language learning, where many concepts can transfer from the first language. I will propose extensions of LAS.1 concerned with learning meanings in LAS.2.

A second restriction on LAS.1 is that it works in a particularly limited semantic domain. It is presented with pictures indicating relations and properties of geometric objects. These pictures are actually encoded into a HAM propositional network representation. Along with these pictures LAS receives sentences describing the pictures and an indication of the aspect of the picture which corresponds to the main proposition of the sentence. From this input, a network grammar is constructed. The semantic domain is simple, but the goal is to be able to learn any natural or natural-like language which may describe that domain.

A major aspect of the LAS project is the BRACKET program. This is an algorithm for taking a sentence of an arbitrary language and a HAM conceptual structure and producing a bracketing of the sentence that indicates its surface structure. This surface structure prescribes the hierarchy of networks required to parse the sentence. For BRACKET to succeed, four conditions must be satisfied by the information given to it:

Condition 1. All content words in the sentence correspond to elements of the conceptual structure. This amounts to the claim that the teacher is able to lead the learner to conceptualize the information in his sentence. It does not matter to the BRACKET algorithm whether there is more information in the conceptual structure than in the sentence.

Condition 2. The content words in the sentence are connected to the elements in the conceptual structure. Psychologically, this amounts to the claim that lexicalization is complete. That is, the learner knows the meanings of the words.

Condition 3. The surface structure interconnecting the content words is isomorphic in its connectivity to a language-free prototype structure.

Condition 4. The main proposition in the conceptual structure is indicated.

Conditions 3 and 4 require considerable explanation. I will explain each in turn. Consider Panel (a) of Figure 8, which illustrates the prototype structure for the propositions in the English sentence The red square is above the small circle. Panel (b) illustrates a graph deformation of that structure giving the surface structure of the sentence. Note how elements within the same noun phrase are appropriately assigned to the same subtree. Note also that the prototype structure is not specific with respect to which links are above which others and which are right of which others. Although the HAM structure in Panel (a) is set forth in a particular spatial array, the choice is arbitrary. In contrast, the surface structure of a sentence does specify the spatial relation of links. It seems reasonable that all natural languages have as their semantics the same order-free prototype network. They differ from one another in (a) the spatial ordering their surface structure assigns to the network and (b) the insertion of non-meaning-bearing morphemes into the sentence. However, the surface structure of all natural languages is derived from the same graph patterns.
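One way to make the deformation idea computational (my own formulation, with hypothetical function names): a prototype tree can be deformed onto a word order without crossing branches exactly when every subtree covers a contiguous span of word positions.

;;; Sketch of a contiguity check related to the graph deformation
;;; condition.  A tree is nested lists of content words; WORD-POSITIONS
;;; is an alist from words to sentence positions.

(defun leaf-positions (tree word-positions)
  (if (atom tree)
      (let ((p (cdr (assoc tree word-positions))))
        (if p (list p) nil))
      (mapcan (lambda (sub) (leaf-positions sub word-positions)) tree)))

(defun contiguousp (positions)
  "T if POSITIONS (distinct integers) form an unbroken run."
  (= (length positions)
     (1+ (- (reduce #'max positions) (reduce #'min positions)))))

(defun deformablep (tree word-positions)
  "T if TREE can be drawn over the sentence without crossing branches."
  (if (atom tree)
      t
      (and (contiguousp (leaf-positions tree word-positions))
           (every (lambda (sub) (deformablep sub word-positions)) tree))))

;; The red square is above the small circle (content words only):
;; (deformablep '((red square) (above (small circle)))
;;              '((red . 1) (square . 2) (above . 3) (small . 4) (circle . 5)))
;; => T
;; A Panel (d)-style language, where the adjective after the object noun
;; modifies the subject noun, breaks the subject subtree's span:
;; (deformablep '((square small) (above circle))
;;              '((square . 1) (above . 2) (circle . 3) (small . 4)))
;; => NIL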
Panel (c) of Figure 8 shows how the prototype structure of Panel (a) can provide the surface structure for a sentence of the artificial GRAMMAR1. All the sentences of GRAMMAR1 preserve the connectivity of the underlying HAM structure. By this criterion, at least, GRAMMAR1 could be a natural language. Certain conceivable languages would have surface structures which are not deformations of the underlying structure. Panel (d) illustrates such a hypothetical language with the same syntactic structure as English, but with different rules of semantic interpretation. In this language the adjective following the object noun modifies the subject noun. As Panel (d) illustrates, there is no deformation of the prototype structure in Panel (a) that achieves a surface structure for the sentences of this language. No matter how it is attempted, some branches must cross.

LAS will use the connectivity of the prototype network to infer what the connectivity of the surface structure of the sentence must be. The network does not specify the right-left ordering of the branches or the above-below ordering. The right-left ordering can be inferred simply from the ordering of the words in the sentence. However, to specify the above-below ordering, BRACKET needs one further piece of information. Figure 9 illustrates an alternate surface structure that could have been assigned to the string in Figure 8(c). It might be translated into English syntax as The circle that is the small thing is below the red square. Clearly, as these two figures illustrate, the HAM network and the sentence are not enough to specify the hierarchical ordering of subtrees in the surface structure. The difference between the sentences in Figures 8(c) and 9 is the choice of which proposition is principal and which is subordinate. If BRACKET is also given information as to the main proposition, it can then unambiguously retrieve the sentence's surface structure. The assumption that BRACKET is given the main proposition amounts to the claim that the teacher can direct the learner to what is being asserted in the sentence. Thus, in Panel (c), the teacher would direct the learner to the picture of a red square above a small circle. He would have to assume both that the learner properly conceptualized the picture and that he realized the aboveness relation was what was being asserted of the picture.

Figure 8. The surface structures of the sentences in (b) and (c) are graph deformations of the HAM structure in (a); no deformation yields the structure in (d).

Figure 9. Alternate surface structure for the sentence in Figure 8(c).

More on the Graph Deformation Condition

I think that the graph deformation condition reflects a universal property of language. However, to make this work, it is clear that something other than the HAM network will have to serve as the prototype structure. The HAM representation works well when words whose concepts are grouped together in the network also occur closer together in the sentence. Consider the sentence John opened the door with a key. In the HAM structure of Figure 10(a), John and the key are grouped in a causal proposition (John turning the key) while the door and open form the result proposition. The English word order places the door between John and key; there is no deformation of the Figure 10(a) structure onto such a sentence, since branches of the HAM structure must cross. This English sentence therefore violates the deformation condition for the representation in Figure 10(a), while it poses no problem for something like the case structure of Figure 10(b), in which the verb open and its arguments are all equally accessible from a single node. Word orders that the HAM structure forbids are likely to occur in some natural language.
There are two ways to deal with this problem. One could resort to a memory representation like that in Figure 10(b). However, there are significant considerations that motivate the HAM representation in (a). Moreover, representations like (b) finesse one of the hard questions in language acquisition -- how we learn the meanings of multi-argument verbs. To address this question we need a representation which decomposes multi-argument verbs into a structure like (a) that displays the semantic function of the case arguments. Learning the role of such a verb in the language then involves learning how to assign it to a structure like (a). I will sketch later a system to do this.

If we keep the HAM representations, then some changes are required in BRACKET's graph deformation condition. What is characteristic of multi-argument verbs in HAM is that the arguments are interconnected by causal relations as in (a). Thus, BRACKET should be made to treat all the terminal argument structures as defining a single level of nodes in a graph structure connected to a single root node. That is, BRACKET can treat a HAM structure such as (a) as if it were (b) for purposes of utilizing the graph deformation condition. In fact, BRACKET already does this in the current implementation.

Figure 10. Alternative prototype structures for the sentence John opened the door with a key: (a) the HAM causal decomposition, JOHN TURN KEY CAUSE DOOR OPEN; (b) a flat structure connecting JOHN, KEY, OPEN, and DOOR to a single node. The HAM structure in (a) introduces too many distinctions.

The Details of BRACKET's Output

So far, only a description of how one would retrieve the surface structure connecting the content words of the sentence has been given. Suppose BRACKET were given A triangle is left-of a square that is above a small red square. A bracketing structure must be imposed on this sentence which will also include the function words. Given this sentence and the conceptual structure in Figure 6, BRACKET returns (G257 (G246 G247 a triangle) is left-of (G195 G196 a square (G195 G225 that is above (G182 G183 a small (G182 G185 red (G182 G184 square)))))). The main proposition is G257, which is given as the first term in the bracketing. The first bracketed sub-expression describes the subject noun phrase. The first element in the sub-expression, G246, is the node that links this noun phrase to the main proposition; G247 is the proposition corresponding to the first two words, a triangle. The next two words, is left-of, are left at the top level. The rest corresponds to a description of the element G195. The first embedded proposition, G196, asserts that this object is a square, and the second proposition, G225, asserts that G195 is above G182. Note that the G225 proposition is embedded as a sub-expression within the G196 proposition. The last element in the G225 proposition is (G182 G183 a small (G182 G185 red (G182 G184 square))). This expression about G182 has in it three propositions: G183, G185, and G184.

The above example illustrates the output of BRACKET. Abstractly, the output of BRACKET may be specified by the following three rewrite rules:

1. S -> proposition element*
2. element -> word
3. element -> (topic S)

That is, each bracketed output is a proposition node followed by a sequence of elements (rule 1). These elements are either rewritten as words (rule 2) or as bracketed subexpressions (rule 3). A bracketed subexpression begins with a topic node which indicates the connection between the embedded and embedding propositions. The word elements within an expression are either non-meaning-bearing words or content words corresponding to the subject, predicate, relation, and object of the proposition. Note that BRACKET induces a correspondence between a level of bracketing and a single proposition. Each level of bracketing will also correspond to a new network in LAS's grammar.
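BRACKET's output is naturally rendered as nested Lisp data. The sketch below reproduces the bracketing above as a literal structure and adds a small illustrative walker; PROPOSITIONS-IN is my own hypothetical helper, not part of the actual program.

;;; The bracketing above as Lisp data, following the three rewrite rules.

(defparameter *bracketing*
  '(g257 (g246 g247 a triangle) is left-of
         (g195 g196 a square
               (g195 g225 that is above
                     (g182 g183 a small (g182 g185 red (g182 g184 square)))))))

(defun propositions-in (expr &optional embedded)
  "Proposition nodes at each level of bracketing, one per network that
LAS will induce.  At the top level the proposition comes first (rule 1);
embedded expressions begin with a topic node (rule 3)."
  (let ((prop (if embedded (second expr) (first expr)))
        (elements (if embedded (cddr expr) (cdr expr))))
    (cons prop
          (mapcan (lambda (element)
                    (when (consp element)          ; rule 3 subexpression
                      (propositions-in element t)))
                  elements))))

;; (propositions-in *bracketing*)
;; => (G257 G247 G196 G225 G183 G185 G184)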
Because of the modularity of HAM propositions, a corresponding modularity is achieved for the grammatical networks. When a number of embedded propositions are attached to the same node, they are embedded within one another in a right-branching manner.

The insertion of non-meaning-bearing function words into the bracketing is a troublesome problem because there are no semantic features to indicate where they belong. Consider the first word a in the example sentence above. It could have been placed in the top level of bracketing or in the subexpression containing triangle. Currently, all the function words to the right of a content word are placed in the same level as the content word. The bracketing is closed immediately after this content word. Therefore, is is not placed in the noun-phrase bracketing. This heuristic seems to work more often than not. However, there clearly are cases where it will not work. Consider the sentence The boy who Jane spoke to was deaf. The current BRACKET program would return this as ((The boy who Jane spoke) to was deaf). That is, it would not identify to as in the relative clause. Similarly, non-meaning-bearing suffixes like gender would not be retrieved as part of the noun by this heuristic. However, there is a strong cue to make bracketing appropriate in these cases. There tends to be a pause after morphemes like to. Perhaps such pause structures could be called upon to help the BRACKET program decide how to insert the non-meaning-bearing morphemes into the bracketing.

Non-meaning-bearing morphemes pose further problems besides their placement within a noun phrase. These morphemes form strings that, in principle, might constitute an arbitrary language whose semantic referent could provide no cues to aid induction. Were that so, we would be back to the unconstrained grammar-induction task characterized in the introduction. It is comforting to observe that the structure of these strings of non-meaning-bearing morphemes tends to be very simple. There are not many examples of these strings being longer than a single word. Thus, it seems that the languages constituted by these non-meaning-bearing strings are nothing more than very simple finite cardinality languages which pose, in themselves, no serious induction problems. The various stretches of non-meaning-bearing morphemes in a sentence could also have complex interdependencies, thereby posing serious induction problems. Again, it does not seem to be the case that these interdependencies exist. So once again we find that the structure of natural language seems to be simple just at those points where it would have to be simple for a LAS-like induction program to work.

In concluding this section I should point out one example sentence which BRACKET cannot currently handle. These are respectively sentences like John and Bill danced and laughed respectively. The problem with such a sentence is that it has a prototype structure pairing John with dance and Bill with laugh. Thus, John and dance are close together in the structure and so are Bill and laugh. However, the sentence intersperses these elements in just the way that makes bracketing impossible. There are probably other examples like this, but I cannot think of them. Fortunately, this is not an utterance that appears early in child speech, nor is it a particularly simple one for adults. Of all the grammatical constructions, the respectively construction is the one that most suggests the need to have transformational rules in the grammar.
SPEAKTEST

The function of SPEAKTEST is to test whether the current grammar is capable of generating a sentence and, if it is not, to appropriately modify the grammar so that it can. SPEAKTEST is called after BRACKET is complete. It receives from BRACKET a HAM conceptual structure, a bracketed sentence, the main proposition, and the topic of the sentence. As in the SPEAK program, SPEAKTEST attempts to find some path through its network which will express a proposition attached to the topic. If it succeeds, no modifications are made to the network. If it cannot, a new path is built through the network to incorporate the sentence.

The best way to understand the operation of SPEAKTEST is to work through one example. The target language LAS was given to learn is illustrated in Figure 11. This is a very simple language, basically GRAMMAR1 of Table 1 with a smaller vocabulary to make it more tractable. The reason for choosing this language is that it is of just sufficient complexity to illustrate LAS's acquisition mechanisms. In addition, LAS has learned GRAMMAR2, also given in Table 1. Figure 11 illustrates LAS's treatment of the first two sentences to come in. The first sentence is square triangle above, which is returned by BRACKET as (G174 (G115 G116 square) (G148 G149 triangle) above). G174 refers to the main proposition given as an argument. Since this is LAS's first sentence of the language, the START network will, of course, completely fail to parse the sentence. It has no grammar yet. Therefore, it induces the top-level START network in Figure 11. A listing of the exact information induced is given below the graphical illustration in Figure 11. Since the first two elements after G174 in the bracketed sentence are themselves bracketed, the first two arcs in the network will be pushes to subnetworks. The third arc contains a condition on the word above. The restriction is that it be a member of the word class A199. This class was created for this sentence and only contains the word above at this point.

Having now constructed a path through the START network, SPEAKTEST checks the subnetworks in this path to see whether they can handle the bracketed subexpressions in the sentence. This is accomplished by a recursive call to SPEAKTEST. For the first phrase SPEAKTEST is called, taking as arguments the network A195 and (G115 G116 square). In network A195 the word class A211 is created to contain square, and in network A197 the word class A224 is created to contain triangle. Although these two subnetworks should be the same in a final grammar, LAS is not prepared to risk such a generalization at this point.

Note in this example how the bracketing provided by BRACKET completely specified the embedding of networks. The sentence provided by BRACKET was (G174 (G115 G116 square) (G148 G149 triangle) above). The first element, G174, gave the main proposition. The second element, (G115 G116 square), was a bracketed subexpression indicating that a subnetwork should be created. Similarly, the third expression indicated a second subnetwork. The last element, above, was a single word and so could be handled by a memory condition in the main network.

The second sentence is triangle square right-of. This is transformed by BRACKET to (G315 (G246 G247 triangle) (G283 G284 square) right-of). Because of the narrow one-member word classes, this sentence cannot be handled by the current grammar. However, SPEAKTEST does not add new network arcs to handle the sentence. Rather, it expands word class A199 to include right-of, word class A211 to include triangle, and word class A224 to include square.
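A sketch of this class-widening step, under an assumed property-list representation of word classes (the function names here are hypothetical):

;;; Widening existing word classes instead of adding new arcs.

(defun class-members (class) (get class 'members))

(defun widen-class (class word)
  "Expand CLASS to include WORD, as A199 was expanded to take right-of."
  (pushnew word (get class 'members)))

(defun absorb (sentence classes)
  "Handle SENTENCE (a list of words) on an existing path whose arcs
expect the word CLASSES, widening each class as needed."
  (loop for word in sentence
        for class in classes
        do (widen-class class word)))

;; State after sentence 1, then absorbing sentence 2:
;; (setf (get 'a211 'members) '(square)
;;       (get 'a224 'members) '(triangle)
;;       (get 'a199 'members) '(above))
;; (absorb '(triangle square right-of) '(a211 a224 a199))
;; (class-members 'a211) => (TRIANGLE SQUARE)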
The grammar is now at such a stage that LAS could speak or understand the sentences triangle square above or square square right-of and other sentences which it had not studied. Thus, already the first generalizations have been made. LAS can produce and understand novel sentences. This illustrates the type of generalizations that are made with the SPEAKTEST program.

Figure 11. LAS's treatment of the first two sentences in the induction sequence, ((SQUARE) (TRIANGLE) ABOVE) and ((TRIANGLE) (SQUARE) RIGHT-OF), with the induced START network and the word classes A199 = above, right-of; A211 = square, triangle; A224 = triangle, square.

Consider LAS's treatment of the first word of the second sentence. This involved (a) using the same subnetwork A195 that had been created for the first sentence and (b) expanding the word class A211 to include triangle. Both decisions were based on semantic criteria: the network A195 again expressed a proposition attached to the main proposition through its first noun-phrase node, and it was this identity of semantic function that justified the decisions.

In making these generalizations, SPEAKTEST is making a strong assumption about the nature of human language. This assumption is stated as Condition 5:

Condition 5. Words or phrases with identical semantic functions at identical points in a network behave identically syntactically.

This is the assumption of semantics-induced equivalence of syntax. It is another way in which semantic information facilitates grammar induction. It clearly need not be true of an arbitrary language. For instance, decisions made in the subject noun phrase might in theory condition syntactic decisions made in the object noun phrase. LAS, because of its heuristics in SPEAKTEST for generalization, would not be able to learn such a language.

Figure 12 illustrates LAS's network grammar after two more sentences have come in. Sentences 3 and 4 involve the relations below and left-of. LAS treats these as syntactic variants of above and right-of which differ in their assignment of their noun phrase arguments to the logical categories subject and object. Therefore, LAS creates an alternate branch through its START network to accommodate this possibility.

Figure 13 illustrates the course of LAS's learning. Altogether LAS will be presented 14 sentences. Subsequently, it will have to make three extra generalizations to capture the entire target language. Plotted on the abscissa is this learning history, and along the ordinate is the natural logarithm of the number of sentences which the grammar can handle. This is a finite language, unlike GRAMMAR2, and therefore the number of sentences in the language will always be finite. As can be seen from Figure 13, by the fourth sentence LAS's grammar is adequate to handle 16 sentences.

LAS's grammar after the next five sentences is illustrated in Figure 14. These are LAS's first encounters with two-word noun phrases. All five sentences involve the relations right-of and above and therefore result in the elaboration of the A195 and A197 sub-networks. Consider the first sentence, square red triangle blue above, which is returned by BRACKET as (C329 (C270 C271 square (C270 C272 red)) (C303 C304 triangle (C303 C305 blue)) above). Consider the parsing of the first noun phrase. Note that the adjective is embedded within the larger noun phrase.
This is an example of the right embedding which BRACKET always imposes on a sentence. It will cause SPEAKTEST to create a push to an embedded network within its A195 subnetwork. As can be seen in Figure 14, the existing arc containing the A211 word class is kept to handle square. Two alternative arcs are added -- one with a push to the new embedded network for the adjective, and one with a NIL transition. The new subnetworks themselves end in NIL transitions, so that a noun phrase is fully parsed whether or not an adjective follows; within them are new word classes (e.g., C510 = small, blue, large, red) containing the adjectives encountered so far.

Figure 12. LAS's grammar after the first four sentences, with an alternate START branch for the RB relations (word classes: above, right-of; below, left-of; A211, A224 = square, triangle).

Figure 13. The growth of LAS's grammar with its learning history: the number of sentences handled (true sentences, and true sentences plus overgeneralizations) plotted against the number of sentences studied.

Figure 14. Additions to LAS's grammar after studying: 1. SQUARE RED TRIANGLE BLUE ABOVE; 2. TRIANGLE LARGE SQUARE SMALL RIGHT-OF; 3. TRIANGLE RED TRIANGLE RED ABOVE; 4. SQUARE SMALL TRIANGLE RED RIGHT-OF; 5. SQUARE BLUE TRIANGLE LARGE RIGHT-OF.

The principle of semantics-induced equivalence of syntax can, however, lead to overgeneralization, and Figure 15 illustrates how this might occur in natural language. Consider a learner of English plurals. Encountering plural nouns, he would set up a network in which the plural morpheme s may be preceded by any noun (Figure 15). Upon acquiring foot as a noun, he would come to the conclusion that foots is the plural of foot. This is, of course, an overgeneralization that children actually make (Ervin, 1964). What is characteristic of such morphemic rules is that there are a number of alternatives and no semantic basis to choose between them. Because of its principle of semantics-induced equivalence of syntax, LAS will overgeneralize in those situations. Apparently, children are operating under a similar rule.

LAS needs to be endowed with a mechanism to allow it to recover from such overgeneralizations. Therefore, one of the future additions to LAS will have to be a RECOVER program. Consider how it would work with this pluralization example. Suppose LEARNMORE receives the sentence The feet are above the triangle. In attempting to analyze the sentence in SPEAKTEST, the plural foots will be generated but will mismatch the sentence. RECOVER has as its function to note such mismatches. Since it is possible that there are two alternate ways of expressing plurality, RECOVER cannot assume its grammar is wrong. Rather, it will interrupt the information flow and check the acceptability of The foots are above the triangle. That is, RECOVER will explicitly seek negative information. Upon learning that the expression is ungrammatical, RECOVER will take foot out of the word class that is pluralized by s. To accomplish this I would have to put within LAS some mechanism that will segment words into their morphemes.
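Since RECOVER does not yet exist, the following can only be a sketch of the proposed mechanism under assumed representations; the variable and function names are hypothetical.

;;; Sketch of RECOVER: on a mismatch, explicitly seek negative
;;; information, and on a negative answer retract the stem.

(defvar *pluralized-by-s* '(boy toy foot)
  "Hypothetical word class: nouns the grammar pluralizes with s.")

(defun recover (generated heard stem query-fn)
  "On a mismatch between GENERATED and HEARD (word lists), ask QUERY-FN
whether the generated sentence is acceptable; if it is not, retract
STEM from the overextended class."
  (when (and (mismatch generated heard)
             (not (funcall query-fn generated)))
    (setf *pluralized-by-s* (remove stem *pluralized-by-s*))))

;; (recover '(the foots are above the triangle)
;;          '(the feet  are above the triangle)
;;          'foot
;;          (lambda (s) (declare (ignore s)) nil))  ; the tutor says no
;; *pluralized-by-s* => (BOY TOY)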
Figure 15. Some possible network grammars for the English plural.

Every bit as much as LAS, a child logically needs negative information to recover from overgeneralizations. The interesting question is where the negative information comes from in the case of the child. Parents do correct the child in such obvious morphemic overgeneralizations (Brown, 1973). Even today I find myself corrected (not by my parents) for my failures to properly pluralize esoteric words. The child may also use statistical evidence for a negative conclusion. In some manner he may notice that the morphemic form foots is never used by the adult and so conclude that it is wrong. Horning (1969) has formalized an algorithm for detecting such overgeneralizations by assigning probabilities to rules.

Figure 16 illustrates LAS's treatment of the last four sentences of the training sequence. These involve some three-word noun phrases and also expansion of the noun phrases on the branch of the START network for RB relations. As can be seen from Figure 13, at the point of the 14th sentence LAS has developed its grammar to the point where it will handle 616 sentences of the target language. Actually, the grammar has produced some overgeneralizations -- it accepts a total of 750 sentences. LAS has encountered phrases like square, square small, square red, and square red small. From this experience, LAS has generalized to the conclusion that the noun phrases of the language consist of a shape word, followed optionally by either a size or a color word, followed optionally by a size word. Thus the induced grammar includes phrases like square small small, because size words were found to be acceptable in both second and third positions. Interestingly, this mistake will not cause LAS any problems. It will never speak a phrase like square small small because it will never have a to-be-spoken structure with two smalls modifying an object. It will never hear such a phrase, and thus UNDERSTAND cannot make any mistakes. This is a nice example of how an over-general grammar can be successfully constrained by semantic acceptability.

The problem of learning to sequence noun modifiers has turned out to be a source of unexpected difficulty. In part, the ordering of modifiers is governed by pragmatic factors. For instance, one is likely to say small red square when referring to one of many red squares, but red small square when referring to one of many small squares. Differences like these could be controlled by the ordering of links in the HAM memory structure.

GENERALIZE

After taking in 14 sentences LAS has built up a partial network grammar that serves to generate many more sentences than those it originally encountered. However, note that LAS has constructed four copies of a noun phrase grammar. One would like it to recognize that those grammars are the same. The failure to do so with respect to this simple artificial language only amounts to an inelegance. However, the identification of identical networks is critical to inducing languages with recursive rules.
Figure 16. Additions to LAS's grammar after studying: 10. SQUARE BLUE SMALL TRIANGLE ...; 11. TRIANGLE RED SQUARE BLUE LEFT-OF; 12. TRIANGLE SMALL SQUARE RED ...; 13. SQUARE BLUE TRIANGLE BLUE ...; 14. SQUARE RED LARGE TRIANGLE RED LARGE BELOW. New word classes include D714 = small; D1045 = red, blue, small; D1117 = blue, red; E905 = small, large; E1395 = large.

To see why the identification of networks matters, consider a grammar with noun-phrase networks of the following form:

NP  -> the NOUN
NP  -> the ADJ NP1
NP1 -> NOUN1
NP1 -> ADJ1 NP2
NP2 -> NOUN2
NP2 -> ADJ2 NP3
NP3 -> NOUN3

That is, there are four networks, NP, NP1, NP2, and NP3, whose structure is indicated by the above rewrite rules. It is assumed that LAS has only experienced three consecutive adjectives and therefore SPEAKTEST has only created three embeddings. The critical inductive step for LAS is to recognize that NP1 = NP2. This requires recognizing the identity of the word classes NOUN1 and NOUN2 and of the word classes ADJ1 and ADJ2. This will be done on the criterion of the amount of overlap of the classes. It also requires recognition that network NP2 = NP3. Thus, to identify two networks may require that two other networks be identified. The network NP3 is only a subnetwork of NP2. So, in the recursive identification of networks, GENERALIZE will have to accept a subnetwork relation between one network like NP2 which contains another like NP3. The assumption is that with sufficient experience the embedded network would become filled out to be the same as the embedding network. After NP1 has been identified with NP2, LAS will have a new network structure where NP* represents the amalgamation of NP1, NP2, and NP3:

NP  -> the NOUN
NP  -> the ADJ NP*
NP* -> NOUN*
NP* -> ADJ* NP*

Note that new word classes NOUN* and ADJ* have been created as the union of the word classes NOUN1, NOUN2, NOUN3 and of the classes ADJ1, ADJ2, respectively.

GENERALIZE was called to ruminate over the networks generated after the first fourteen sentences. GENERALIZE succeeded in identifying A195 with A197. As a consequence, network A195 replaced network A197 at the position where it occurred in the START network (see Figure 12). Similarly, B566 was identified with and replaced network B564. Finally, B566 was identified with and replaced A195 throughout the START network. The final effective grammar is illustrated in Figure 17. It now handles all the sentences of the target language. Indeed, it handles more sentences than the grammar that was constructed after the fourteenth sentence. This is because the noun-phrase network B566 has been expanded to incorporate all possible noun phrases. Before the generalizations, none of B564, B566, A195, or A197 was complete.

Figure 17. The final grammar. Word classes: B568 = below, left-of; A199 = above, right-of; B593 = square, triangle; D1117 = blue, red, large, small; E905 = large, small.

LAS's induction procedures rest on two assumptions about the structure of natural languages. The first is the assumption of a correspondence between the surface structure of the language and the semantic structure. This is critical to BRACKET's identification of the surface structure of the sentence, which is, in turn, critical to the proper embedding of parsing networks. Second, there is the assumption of a semantics-induced equivalence of syntax. This played a critical role in the generalizations both of SPEAKTEST and of GENERALIZE.
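The word-class overlap criterion mentioned above can be sketched as follows; the proposal states only that overlap is the criterion, so the ratio and the threshold below are my assumptions.

;;; Sketch of GENERALIZE's merging criterion for word classes.

(defun overlap-ratio (class-a class-b)
  "Proportion of the smaller class's members shared with the other."
  (/ (length (intersection class-a class-b))
     (min (length class-a) (length class-b))))

(defun merge-if-similar (class-a class-b &optional (threshold 1/2))
  "Return the union of the classes when overlap reaches THRESHOLD, as
when NOUN1 and NOUN2 are identified; otherwise NIL."
  (when (>= (overlap-ratio class-a class-b) threshold)
    (union class-a class-b)))

;; (merge-if-similar '(square triangle) '(triangle circle))
;; => (SQUARE TRIANGLE CIRCLE)   ; element order may vary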
It was noted with respect to pluralization that such generalizations can be in error and that children also tend to make such errors. However, I would want to argue that, on the whole, natural language is not perverse. Therefore, most of these generalizations will turn out to be good decisions. Clearly, for languages to be learnable there must be some set of generalizations which are usually safe. The only question is whether LAS has captured the safe generalizations.

The importance of semantics to child language learning has been suggested in various ways recently by many theoreticians (e.g., Bloom, 1970; Bowerman, 1973; Brown, 1973; Schlesinger, 1971; and Sinclair-de Zwart, 1973), but there has been little offered in the way of concrete algorithms to make explicit the contribution of semantics. LAS.1 is a first small step toward making this contribution explicit.

Conclusion

This concludes the explanation of the algorithms used by LAS.1 for language induction. In many ways the task faced by LAS.1 is overly simplistic, and its algorithms are probably too efficient and free from information-processing limitations. Therefore, the acquisition behavior of LAS.1 does not mirror in most respects that of the child. Later versions of this program will attempt a more realistic simulation. Nonetheless, I think LAS.1 is a significant step forward. The following are the significant contributions embodied so far in LAS.1:

1. The transition network formalism has been interfaced with a set of simple and psychologically realistic long-term memory operations. In this way we have bridled the unlimited Turing-computable power of the augmented transition network.

2. A single grammatical formalism has been created for generation and understanding. Thus, LAS only needs to induce one set of grammatical rules.

3. Two important ways were identified in which a semantic referent helps grammar induction. These were stated as the graph deformation condition and the semantics-induced equivalence of syntax condition.

4. Algorithms have been developed that are adequate to learn natural-like languages describing a simple semantic domain.

The general mode of developing the program LAS is as follows. A language learning situation is specified by a set of conditions. In LAS.1 it was specified that LAS already know the meaning of the words and that it be given, as input, sentences with HAM representations of their meaning. The semantic domain was specified to be that constituted by geometric shapes. Once a set of conditions is specified, a set of goals is specified. In LAS.1 there was only one real goal: to learn any natural-like language that described the domain. Once a set of goals is specified, a plan of attack is sketched out. However, the problem is such that the details of that plan only evolve as we attempt to implement it. Indeed, many interesting problems and ideas in LAS.1 were discovered in attempting the implementation. This is the utility of computer simulation in theoretical development. The LAS.1 program operated in a task domain which is similar, but by no means identical, to that of the natural language learning situation. Its behavior was similar to that of a child learning a language, but again by no means identical. In the next two years I propose to create a program, LAS.2, which comes considerably closer to simulating natural language learning. It has a more elaborate set of goals than did LAS.1:

1. The program will incorporate realistic assumptions about short-term memory limitations and left-to-right sentence processing.

2. The program will learn the meanings of words.
3. The program should use semantic and contextual redundancy to partially replace the explicitly provided HAM encoding of pictures.

4. The program should handle sentences in a more complex semantic domain.

5. The program should be elaborated to handle such things as questions and commands as well as declarative sentences.

The general methods for achieving these goals in the LAS.2 program will be sketched out in the proposal section. Also in that section I will propose some experiments to evaluate the LAS program. While it is true that the task faced by LAS.1 is not really natural language learning, it still is a learning task at which human subjects apparently can succeed. The experiments will determine whether humans have the same difficulties in such tasks as does LAS and whether they make the same generalizations. However, I regard these experiments as of secondary importance relative to program development. It is more important to further articulate our understanding of what algorithms are adequate for natural language learning.

It is probably inevitable that the question will be raised whether it is really necessary to expend the considerable effort required to create a running program. Could not the model just be specified on paper? The reason why this is not possible has to do with the complexity of any theory that addresses the details of natural language. There is no other way to test the predictions of the theory or to assure that it is internally consistent. The experience with large transformational grammars of natural language is that they have hidden inconsistencies; these are only exposed by trying to simulate the grammars on a computer (e.g., Friedman, 1971). Consider the description given of LAS.1 in the preceding section. Although lacking in many details, it was complex and lengthy. Could the reader establish for himself from this description whether the model is really internally consistent? A computer program provides a proof of the consistency and a means of determining a model's behavior. The stated goals of this project are to develop explicit algorithms for natural language learning, specify the relevant details of these algorithms, and evaluate empirically the psychological viability of these algorithms. Without the use of computer simulation none of these goals could be achieved.

C. Methods of Procedure

First I will describe the proposed extension of the LAS program. Then I will describe some experimental tests. In reading the specific extensions proposed for LAS, the reader should keep in mind that they have as their intent achieving the goals set forth in the preceding section.

The Semantic Domain

The first matter to settle upon in the new program is some semantic domain. The LAS.1 world of shapes, properties, and geometric relations is too impoverished for further work. The following is proposed as a suggestion, although there is nothing critical about its exact form. It is critical, however, that some semantic domain be chosen. It is only when there is a specified domain that an explicit goal for success in the program can be specified. The program will be regarded as successful if it can learn any natural language describing this domain. I have chosen to look at a world close to that of a young child, although there is perhaps nothing sacred about this domain. This world is set forth in Table 5. There are three people in this world. In addition to these there are four categories of objects -- locations, containers, supporters, and toys. These objects can have four types of properties -- number, color, size, and quality.
Thus, LAS will have to deal seriously with problems of sequencing adjectives. It will also have to deal with number as a property of objects. The objects permit a much richer variety of relations than in the world of LAS.1. This will provide a demanding test for the learning of complex multi-argument relations. There can be sentences like Mommy traded Daddy the car for a ball. In this world, people, containers, supporters, and toys can be in locations. People can change their location and that of toys. People and toys can be on supporters; toys can be in containers. People can possess toys, containers, and supporters.

TABLE 5
Categories in the World of LAS.2

PEOPLE:     Mommy, Daddy, LAS
LOCATIONS:  bedroom, kitchen, den
CONTAINERS: box, closet, dresser
SUPPORTERS: table, chair, bed
TOYS:       dolly, car, ball
NUMBERS:    one, two, three
COLORS:     red, blue, green
SIZES:      big, medium, small
QUALITIES:  dirty, pretty, shiny

Thus the different categories of objects enter differently into different types of relations. This will prove important to the predictive parsing facilities that I will want to introduce into LAS.2.

Left-to-Right Processing

Children learn language auditorily. Thus, their induction algorithms must process incoming material in a left-to-right manner. The current LEARNMORE program does not do this. BRACKET completely processes the sentence before SPEAKTEST even begins to work on it. Clearly, BRACKET and SPEAKTEST should be integrated so that the beginning of the sentence is bracketed and processed by SPEAKTEST before the end of the sentence is considered by either. Introducing this left-to-right processing is a preliminary to introducing short-term memory limitations into the induction situation.

Figure 18 illustrates in highly schematic form the left-to-right algorithm proposed for LEARNMORE. Words are considered as they come in from the sentence. LEARNMORE, as in UNDERSTAND, tries to find a path through its network grammar to parse the sentence. The difference between LEARNMORE and UNDERSTAND is that LEARNMORE has available to it a HAM conceptual structure to enable it to better evaluate various parsing options. Suppose LEARNMORE is at some point in processing the sentence. It will also be at some point in a parsing network. Let us consider how it would process the next word. At box 2 it reads in the word. At box 3 it sets l to the various grammatical options (arcs) at that node in the network. Boxes 4 through 7 are concerned with evaluating whether any of these options can handle the current word. Box 4 checks whether there are any options left. Box 5 sets a to the first option and removes it from the remaining options. Box 6 checks whether the word would be parsed by option a, and box 7 considers whether the action associated with that arc corresponds to a HAM structure. If a passes the tests in 6 and 7, LEARNMORE advances to considering the next word. Otherwise it tries another arc. If it exhausts all arcs, it will call BUILDPATH (box 8) to build a new arc from the current node.

Figure 18. Flowchart of the left-to-right LEARNMORE program.

The work currently assigned to BRACKET will have to be assigned to box 7. That is, box 7 will have to determine when an arc should involve a push to an embedded network and when it should pop back up to an embedding network. This will be done by consulting the information in the semantic structure. It would also be possible to consult the pause structure of the sentence for information about phrase structure boundaries. Note that certain sentences which the old LEARNMORE system could handle will not be handled by this system. For instance, consider the sentence The square that is above the triangle is right-of the square. After the first two words it would not be clear whether this square was the subject or the object of the right-of relation, and so it would not be possible to assign an appropriate action to the path. In the old LEARNMORE the referent of square was resolved by bracketing the complete sentence before dealing with it. Presumably, however, children also cannot learn from such sentences.
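The loop of boxes 2 through 8 can be rendered as a sketch; ARCS-AT, PARSES-P, FITS-HAM-P, and BUILDPATH are hypothetical stand-ins for the operations the flowchart names.

;;; Trivial stand-ins so the sketch runs; the real tests consult the
;;; grammar and the HAM conceptual structure.
(defun arcs-at (node) (get node 'path))
(defun parses-p (arc word) (member word (first arc)))
(defun fits-ham-p (arc) (declare (ignore arc)) t)
(defun buildpath (word node) (list 'new-arc word node))

(defun learnmore-step (word node)
  "Process one incoming WORD at network NODE: try each arc; if none
fits both the word and the HAM structure, build a new arc (box 8)."
  (let ((options (arcs-at node)))            ; box 3: the options
    (loop
      (when (null options)                   ; box 4: any options left?
        (return (buildpath word node)))      ; box 8: build a new arc
      (let ((arc (pop options)))             ; box 5: take the next option
        (when (and (parses-p arc word)       ; box 6: does it parse the word?
                   (fits-ham-p arc))         ; box 7: action matches HAM?
          (return arc))))))                  ; advance to the next word

;; (setf (get 'n1 'path) '(((above right-of) ((relatify x5 x4)) stop)))
;; (learnmore-step 'above 'n1)  => the existing arc
;; (learnmore-step 'circle 'n1) => (NEW-ARC CIRCLE N1)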
Lexicalization

In this system it will not be assumed that LAS knows the meanings of the words. Rather, this will be something that LAS will have to learn from the pairing of sentences with conceptions. First let us discuss the learning of words whose reference is a simple concept or object, e.g., box or mommy, and postpone discussion of complex relational terms like trade. Logically, the task of lexicalization is quite simple, and it would not require complex algorithms to succeed. For instance, consider this algorithm: LAS is given a sentence with n1 words and a conceptualization it describes with m1 concepts. Store with each word the m1 concepts. The next sentence that comes in has n2 words and its conceptualization consists of m2 concepts. If a word in this sentence is new, store with it the m2 concepts. If the word is old, store with it the intersection of the concepts previously stored with it and the new m2 concepts. Eventually, ignoring problems of polysemy, a word will become pared down to zero or one concepts. Those with zero concepts are function words, and those with one concept have that concept as their meaning.

Of course, this algorithm will run into trouble if LAS does not always conceptualize all the concepts referred to by the sentence. This can be handled by having the algorithm wait for a sequence of disconfirming pieces of evidence before rejecting a hypothesized meaning. Incidentally, subjects behave just this way in concept attainment situations (see Bruner, Goodnow & Austin, 1956), not taking negative evidence as having its full logical force about the meaning of the word.

The basic problem with this algorithm is that it makes unreasonable assumptions about the information-processing capacities of humans. In pilot research of my own, I have found that adult subjects can learn the meanings of a number of words in a sentence simultaneously. However, they do suffer difficulties when there is high ambiguity about what a word means. Presumably, children would have even greater difficulties extracting word meanings from complex sentences. Broen (1972) and Ferguson, Peizer, & Weeks (1973) report that new items of vocabulary seem to be introduced through use in set sentence frames such as Where's ..., Here comes ..., There's ..., known as deictic phrases. The noun tends to be heavily stressed and repeated. The parent frequently points to help the child identify the referent.
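The logically simple intersection algorithm described above can be stated exactly; the hash-table bookkeeping below is a sketch of that algorithm, not of any actual LAS code.

;;; Cross-situational lexicalization by intersection.

(defvar *hypotheses* (make-hash-table)
  "Maps each word to the set of concepts still consistent with it.")

(defun observe (words concepts)
  "Pair a sentence (list of WORDS) with the CONCEPTS of the scene."
  (dolist (word words)
    (multiple-value-bind (old presentp) (gethash word *hypotheses*)
      (setf (gethash word *hypotheses*)
            (if presentp (intersection old concepts) concepts)))))

(defun meaning (word)
  "One remaining concept = the meaning; none = a function word."
  (let ((set (gethash word *hypotheses*)))
    (cond ((null set) 'function-word)
          ((null (cdr set)) (car set))
          (t 'undetermined))))

;; (observe '(the ball is red) '(ball red))
;; (observe '(the ball is big) '(ball big))
;; (observe '(the box  is red) '(box red))
;; (meaning 'ball) => BALL    (meaning 'red) => RED
;; (meaning 'the)  => FUNCTION-WORD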
Thus, the program, much as adults appear to, will exploit contrasts between a known grammatical pattern and an unknown word. If the program knows the grammatical rule NP → determiner adjective noun, then when it encounters the phrase the glick box it will suppose that glick refers to some property of the box. Thus, the program will have to acquire its initial vocabulary by means of simple frames, as do young children. With this initial vocabulary information, it can begin to learn grammatical rules. Once in possession of grammatical rules, it will no longer need simple frames to learn new lexical items. One interesting question is how function words are ever identified as non-meaning-bearing in this scheme. Presumably, this is done on the basis of failing to find any semantic feature that survives after repeated hypothesized meanings for the word have been disconfirmed.

So far I have assumed that all concepts are constructed before language acquisition takes place and that the only problem is to link up these concepts with words. But this is very unrealistic. Consider the verb give in the sentence Mommy gives the dolly to Daddy. The meaning of give is something like causing one person to cease to possess an object and someone else to come to possess that object. It seems very implausible that a child comes to the learning situation with such a concept ready made. What probably happens is that he sees Mommy pushing the doll to Daddy or Mommy handing the ball to baby. With these experiences he hears sentences like Mommy gives the dolly to Daddy or Mommy gives the ball to baby. From these examples he induces the appropriate meaning of give. Concept attainment in these situations can be achieved by using the sort of concept identification used by Winston (1970) for inducing geometric concepts. That is, each use of the word give is paired with a HAM network structure giving the meaning of the sentence. Winston's heuristics allow us to extract what these network structures have in common. The concept give, as verb, is then attached to this common structure. For this sort of algorithm to succeed, LAS must be set to regard certain configurations of propositions, interlinked by causal terms, as being associated with a single relational term in the language. Note also that the meaning of a complex relational term like give can be represented by the same sort of HAM structure that UNDERSTAND sets up for a sentence.

4. Telegraphic Speech

A well-known fact about language development is that the child first speaks in one-word utterances, then in two and three word utterances. In these early constructions it appears that children omit most function words. One explanation of the origin of telegraphic speech that is appealing from the point of view of LAS is the following: Suppose that LAS did not receive as input to its learning routines complete sentences but rather telegraphic sentences. It would naturally induce a telegraphic grammar. It seems reasonable that the young child cannot process the total sentence he hears. If so, then his learning routines would receive telegraphic sentences as their basic input. Evidence for this hypothesis comes from studies of child imitation of adult speech. These imitations, while longer than the child's own spontaneous utterances, are also telegraphic in nature (e.g., Brown & Fraser, 1963). Blasdell and Jensen (1970) found that children tend to repeat those words which are stressed and words which occur in terminal positions. The same words tend to be stressed in adult speech. Scholes (1969, 1970) found that children tended to omit words that had unclear semantic correlates. What I find striking is that these are just the variables known to govern immediate memory for a sentence: stress, meaningfulness, and serial position. Of course, these are well established effects in experiments on immediate memory. I propose to introduce telegraphic input through an aspect of LEARNMORE called BADEAR. The BADEAR program will simulate the effects of stress, meaningfulness, and serial position in providing LAS with a depleted version of the sentence.
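A minimal sketch of how BADEAR's depletion might operate, anticipating the three omission factors detailed in the next paragraph (lack of stress, unknown meaning, and a capacity limit on new words). The Word record and the badear function are my illustrative inventions, and the way factors (b) and (c) are combined into a single capacity test is an assumption made for brevity.

    from collections import namedtuple

    # A word as heard: its form, whether it carried stress, and whether
    # LAS already knows its meaning.
    Word = namedtuple("Word", ["form", "stressed", "meaning_known"])

    def badear(sentence, capacity=2):
        """Return the depleted version of the sentence passed on to LAS.
        capacity is the critical number of new words (one or two, per the
        estimate below); because it fills left to right, early new words
        are favored, which yields the primacy effect discussed in the text."""
        retained, new_words = [], 0
        for word in sentence:
            if not word.stressed:           # factor (a): unstressed words slip away
                continue
            if not word.meaning_known:      # factors (b) and (c), combined here:
                if new_words >= capacity:   # only a critical number of unknown
                    continue                # words survive to reach BUILDPATH
                new_words += 1
            retained.append(word.form)
        return retained

    heard = [Word("where's", False, True), Word("the", False, True),
             Word("glick", True, False), Word("dolly", True, True)]
    print(badear(heard))                    # ['glick', 'dolly']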
The locus of the effect of BADEAR will be between boxes 4 and 8 in the flowchart of Figure 18. Basically it will not pass all words on to BUILDPATH. Rather, some words will "slip from consciousness" after failing to be parsed. It will tend to omit words when (a) they are unstressed, (b) their meaning is not known, or (c) a critical number of new words in the sentence have already been passed to BUILDPATH. I suspect this critical number is something like one or two. Factors (a) and (b) would generate the effects of stress and meaningfulness. Factor (c) would yield good memory for the first words of the sentence. The good memory children do show for the last words of a sentence may be attributed to short-term acoustic memory.

An interesting feature of this proposal is that, as LAS's processing capacity expanded, it would be able to receive more of the sentence. Its constructions and imitations would grow as does a child's. This would be an explicit mechanism for an idea suggested by Braine (1971) and Olson (1973).

Inducing a grammar from degenerate sentences presents an interesting problem: How is it that the child ever abandons his rules for generating telegraphic speech? As his capacity grows he merely receives fuller sentences; a fuller sentence does not logically disconfirm the telegraphic rules, which remain workable if less adequate means for expressing the same thoughts. Some mechanism must be incorporated that will strengthen some rules relative to others. Rules would be strengthened when they are successfully used in understanding and generation and weakened when they fail. The arcs out of a node of a parsing network would be ordered in a stack according to their relative strengths. Ineffective arcs, such as the original rules for one and two word utterances, would descend to the bottom of the stack and so become unavailable. This strength mechanism is the same as that used to order links in the HAM memory model. This is a different way to bring disconfirming information to bear in grammar induction: rather than seeking explicit disconfirmation of rules, the rules are crowded out of existence as more adequate rules take over the role they used to occupy in sentence understanding and generation.

Another needed mechanism is optimization of the grammar. Suppose LAS had acquired a network grammar in which, after the initial NP, the network branches according to the relation expressed, with a separate NP arc along each branch. This grammar requires considerable backup if the sentence does not have an RA relation. As suggested earlier, it would be more efficient if LAS were given the power to transform the grammar into a form in which the NP arcs are merged and the relation is tested only after the second NP has been parsed. Given that there are serious time problems (see the introduction of the proposal) in parsing, it is critical that methods be incorporated in the learning program for optimizing the grammar. The merging of arcs, besides making the grammar more efficient, would be another form of generalization. It could be used to further merge and build up word classes.

There are two further ways that semantics can be used to aid language induction. First, semantics can aid the merging of word classes. For instance, two word classes might each contain color names. Currently LAS merges word classes according to the amount of overlap between their memberships; it could also merge them when their members share a semantic property. The second use of semantics would be to lessen the ambiguity of interpretations of sentences. It should sometimes be possible to use the conceptual constraints of the domain to choose among interpretations. For instance, suppose a sentence can receive two parsings. Because of the conceptual constraints on which objects can enter into which relations, LAS may be able to guess the intended connection. These conceptual constraints of the domain could also be used by UNDERSTAND to predict interpretations, on the model of Schank's (1972) system. That is, instead of handling a sentence purely by use of syntactic information, the system uses conceptual constraints to predict what the interpretation should be. The prediction can then be checked for syntactic consistency with the grammar. It would be profitable to try to realize a predictive parsing system like Schank's within the rigors of the network grammar formalism.
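Returning to the strength mechanism proposed above for crowding out telegraphic rules: the sketch below shows one way the arcs out of a node might be kept on a strength-ordered stack. The class name, method names, numerical increments, and depth cutoff are my own illustrative choices, not the values used in HAM.

    class StrengthStack:
        """Arcs out of one node, ordered by strength."""
        def __init__(self):
            self.arcs = []                  # list of [strength, arc] pairs

        def add(self, arc, strength=1.0):
            self.arcs.append([strength, arc])
            self._reorder()

        def succeed(self, arc):             # arc used successfully
            self._bump(arc, +1.0)

        def fail(self, arc):                # arc tried and failed
            self._bump(arc, -0.5)

        def _bump(self, arc, delta):
            for entry in self.arcs:
                if entry[1] is arc:
                    entry[0] += delta
            self._reorder()

        def available(self, depth=3):
            """Only the top few arcs are ever tried; rules for one- and
            two-word utterances eventually sink below this depth."""
            return [arc for _, arc in self.arcs[:depth]]

        def _reorder(self):
            self.arcs.sort(key=lambda entry: -entry[0])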
A Procedural Semantics

So far LAS has been principally concerned with representing the meaning conveyed by a declarative sentence. However, language has other purposes than just to communicate meanings from one speaker to another; consider commands and questions. For instance, consider the sentence Put the dolly in the box. Currently, UNDERSTAND might retrieve this sentence's meaning as the proposition that LAS put the dolly in the box. This is the declarative meaning of the sentence. However, in addition LAS should evoke an action, or at least an action to decide whether to comply. This is the procedural meaning of the sentence. The procedural meaning of declaratives is very simple: store this sentence. This is already part of LAS's treatment of the sentence. However, the procedural meanings underlying other types of sentences are more complex. A large part of the success of Winograd's system is that it was adequately able to deal with the procedural aspects of various sentences' semantics. It is important that LAS begin to deal with these too.

What this would mean, in terms of LAS's network grammars, is enriching the set of actions that can be stored on the arcs. Currently, the only actions are ones that result in the creation of pieces of HAM structure, i.e., declarative knowledge. LAS will have to have other internal actions that specify what it does with the declarative knowledge. These will include commands to answer the question or obey the order. HAM already has commands that direct it to answer a question, but executing orders would be something new. As part of the HAM project, I am working on methods for incorporating procedural knowledge into a network system. It is unclear yet what success I will have here.

There are also aspects of language whose semantics are procedural in nature. Consider, for instance, the definite article the, which signals the listener that the object referred to should be one he can identify. This procedural character is particularly clear for a word like you, whose reference varies with speaker and context. Since the referent of you completely changes with speaker, a child would be lost if he tried to associate its meaning with some fixed HAM memory node. He must be prepared to treat it as having as meaning a procedure for determining the referent.

Provided that LAS has the facilities for representing and evaluating procedures, there seem no difficulties in learning those aspects of language which are heavily imbued with procedural semantics. Language learning will continue to arise from pairing sentences with semantic interpretations. However, semantic interpretations will now contain a procedural as well as a declarative aspect. Again language learning will consist of learning mappings between sentences and the now-enriched semantic representations.
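The enrichment amounts to attaching a procedural action to each sentence type. A schematic rendering follows, with an invented Memory class standing in for the HAM data base; its method names are assumptions for illustration, not actual HAM commands.

    class Memory:
        """Stand-in for the HAM data base; all methods are illustrative."""
        def __init__(self):
            self.facts = []
        def store(self, structure):
            self.facts.append(structure)
        def answer(self, query):
            return [f for f in self.facts if query.items() <= f.items()]
        def willing(self, order):
            return True            # placeholder for deciding whether to comply
        def comply(self, order):
            print("executing:", order)

    def execute(sentence_type, structure, memory):
        """Dispatch on the procedural meaning of the sentence type."""
        if sentence_type == "declarative":
            memory.store(structure)          # procedural meaning: store it
        elif sentence_type == "question":
            return memory.answer(structure)  # HAM can already answer questions
        elif sentence_type == "command":
            if memory.willing(structure):    # executing orders would be new
                memory.comply(structure)

    m = Memory()
    execute("declarative", {"rel": "in", "obj": "dolly", "loc": "box"}, m)
    print(execute("question", {"obj": "dolly"}, m))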
Experimentation

As stated before, I do not think that experimentation should be the principal focus of the project; there is still much further research that needs to be done in the way of specifying algorithms that are capable of language induction. Nonetheless, in parallel with this research, I would like to perform experiments to get some initial assessments of the viability of the proposed algorithms. The type of information relevant to evaluating LAS can only be acquired by looking at artificial languages. With these artificial languages it is possible to test LAS's predictions about language learnability and generalization, though it is too early to fix what the full program of experimental research should be.

Criticisms of Experiments with Artificial Languages

For ethical reasons it is not possible to expose young children, just learning their first language, to an artificial language which LAS had identified as degenerate and probably not learnable. This means that all experimentation with artificial languages must be done on older children already well established in their first language or on adults. Consequently, the first language may be mediating acquisition of the second language. There is evidence (see Lenneberg, 1967) that there is a critical initial period during which languages can be learned much more successfully than in later years. Lenneberg speculates that there is a physiological basis for this critical period. Thus, one might wonder whether the same processes are being studied with older subjects as in the young child. Personally, I doubt that the mechanisms of language acquisition are entirely the same with the young child in first language learning as with the older subject in second language learning. However, it does not seem that the differences are so great as to make experiments with older subjects uninformative.

Other criticisms (e.g., those of Slobin, 1971; Miller, 1967) of studies with artificial languages rest on the fact that these languages are much less complicated than a natural language and that an artificial laboratory language does not serve the complex functions of natural speech; thus one cannot safely generalize from laboratory phenomena to natural language acquisition. Another valid criticism of those studies is that the languages lacked a semantic referent. Clearly, this restricts the class of algorithms a subject can employ. The heuristics used by LAS would be useless without semantics. Moeser and Bregman (1972, 1973) have shown that the existence of a semantic referent has a huge effect on language acquisition. Except for control conditions, my experiments will involve a semantic referent.

Language Learnability

A critical assumption of the LAS induction algorithm is the graph deformation condition, which constrains the relation between the surface structure of the sentence and the underlying semantic structure. That is, the surface structure must preserve the original connectivity of concepts. In Section A5 we described languages which violated this assumption. Consider the following language:

    S → NP NP RELATION
    NP → NOUN (COLOR) (SIZE) (CLAUSE)
    CLAUSE → te NP RELATION
    NOUN → square, circle, triangle, diamond
    COLOR → red, blue
    SIZE → big, small
    RELATION → above, below, right-of, left-of

This is an expanded version of GRAMMAR1 described in Table 1. (The element te serves the function of a relative pronoun like that.) An example of a sentence in this language is Square red te triangle big above circle blue small right-of. The experiment I will do compares four conditions of learning for this language:

(1) No reference. Here subjects simply study strings of the language, trying to infer their grammatical structure.

(2) Bad semantics. Here a picture of the sentence's referent will be presented along with the sentence. However, the relationship between the sentence's semantic referent and the surface structure will violate LAS's constraints. The adjectives associated with the ith noun phrase will modify the (n + 1 - i)th shape in the sentence (where n is the number of noun phrases). For example, the adjectives associated with the first noun phrase will modify the last shape. Similarly, the relations will be permuted, so that the relation stated between the first pair of shapes will actually hold of the last pair. The picture for the example sentence is given in Figure 19a.

(3) Good semantics. Here the picture will correctly depict the sentence's referent, so that the relation between the semantic referent and the surface structure satisfies LAS's constraints (Figure 19b).

(4) Good semantics plus main proposition.

Figure 19. Different semantic referents for the sentence Square red te triangle big above circle blue small right-of.
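To convey the feel of the experimental materials, here is a sketch of a sentence generator for this artificial language. The expansion probabilities and the depth limit on relative clauses are my own arbitrary choices, not part of the grammar.

    import random

    NOUNS = ["square", "circle", "triangle", "diamond"]
    COLORS = ["red", "blue"]
    SIZES = ["big", "small"]
    RELATIONS = ["above", "below", "right-of", "left-of"]

    def noun_phrase(depth):
        """NP -> NOUN (COLOR) (SIZE) (CLAUSE), options taken at random."""
        words = [random.choice(NOUNS)]
        if random.random() < 0.5:
            words.append(random.choice(COLORS))
        if random.random() < 0.5:
            words.append(random.choice(SIZES))
        if depth > 0 and random.random() < 0.3:   # CLAUSE -> te NP RELATION
            words += ["te"] + noun_phrase(depth - 1) + [random.choice(RELATIONS)]
        return words

    def sentence():
        """S -> NP NP RELATION."""
        return " ".join(noun_phrase(1) + noun_phrase(1) + [random.choice(RELATIONS)])

    print(sentence())
    # e.g. "square red te triangle big above circle blue small right-of"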
The picture in this condition will be the same as in 3 but the two shapes in the main proposition will be highlighted (Figure 19c). In this condition LAS would be guaranteed of successfully bracketing the sentence because the main proposition is given.

In some ways this experiment is like Moeser and Bregman's. However, here English words are used so that the subjects do not need to induce the language's lexicon, only its grammar. This corresponds to the situation faced by LAS, which is given the meanings of the content words. If the English words were replaced by nonsense syllables, subjects would have to induce word meanings as well. Using English words is a simplification of the language-learning situation intended to make induction tractable. The predictions of LAS are, of course, that best learning occurs in Condition 4, next best in 3, and failure of any learning in 1 and 2. It would not be surprising to see subjects perform better in 1 than in 2, since in 2 they might persist in trying to relate the misleading pictures to the sentences.

The procedure would have subjects in all conditions study the same sequence of sentences but vary the accompanying semantic information according to condition. After a study phase they would be tested for grammaticality judgments about a set of sentences, some of which violate one of the rules for generation. Since the syntax of the language is the same in all four conditions, the same sentences will be grammatical in all four conditions. Even though the syntactic information given during study will be the same in all conditions, marked differences in syntactic knowledge should appear across conditions. I will alternate sequences of study trials with sequences of test trials. A subject would study six sentences, with the semantic information appropriate to his condition (if any). Then he would see six test pairs, one sentence of each pair violating some syntactic rule. For each pair of sentences he would choose the grammatically correct one. By frequently alternating study and test, it would be possible to carefully monitor the growth of syntactic information in the various conditions.

Many readers may not be surprised by the prediction of better learning in Conditions 3 and 4. Hopefully, the significance of such an outcome would be clear. It would show that semantics is important to induction of the structure of a natural language. However, it would also show that semantics is useless if the relation between the semantic referent and the syntactic structure is arbitrary. The surface structure of the sentence must be a graph deformation of the underlying semantic structure. Failure to appreciate the contribution of semantics to language induction, and failure to understand the nature of this contribution, have been fundamental to the stagnation of attempts to understand the algorithms permitting language induction. These facts may be obvious when pointed out, but they have been unavailable to the linguistic theorists for fifteen years.

The constraints embodied in LAS serve the same purpose as the linguistic universals proposed by Chomsky. That is, they restrict the class of possible languages so that the target language can be identified. However, the constraints embodied by LAS are not the same as those suggested by Chomsky. For instance, Chomsky proposed that transformations which reversed the order of words in a sentence would be unacceptable. This is because such a rule does not refer to the sentence's constituent structure. However, a language which contained sentences of a natural language and their reversals would be learnable by LAS. It would just develop one set of rules for sentences in one order and another independent set for reverse-order sentences.
It would be interesting to see whether human subjects could learn such a language. In the example of the induction of GRAMMAR1 we found that it was hard for LAS to detect non-semantic contingencies between syntactic choices in the first noun phrase and in the second noun phrase pushed to in the main network. For instance, it is possible that a morphemic embellishment of the adjectives in the second noun phrase may depend on the choice of morphemic embellishment of the noun in the first noun phrase. Human subjects should also find it hard to detect such syntactic contingencies.

There is another set of predictions, besides those concerned with language learnability, which it will be useful to explore. LAS makes predictions about the situations under which humans will tend to generalize rules and when they will not. Suppose LAS learned the following grammar:

    S → VERB NP NP
    NP → (PREPP) N1 (ADJ)
    PREPP → PREP N2
    N1 → boy, girl, etc.
    N2 → room, bank, etc.
    ADJ → tall, nice, etc.
    PREP → in, near, etc.
    VERB → like, hit, etc.

A typical sentence in this language would be Like in room boy tall girl nice, which means The tall boy in the room likes the nice girl. This language is given English terms only to make its semantics clearer. Suppose, in fact, words in the language were das meaning man, jir meaning woman, fos meaning boy, and tuk meaning girl. Suppose the subject studies the following pair of sentences:

    1. Like das tuk.
    2. Like fos jir.

Then it is interesting to consider his judgments of the acceptability of sentences like:

    3. Like das tuk.
    4. Like das jir.
    5. Like jir das.

Accepting (3) involves only recalling sentence (1), but accepting (4) or (5) involves generalization. LAS would currently generalize to accept (4), merging das with fos and tuk with jir into word classes partly because of their semantic similarity. It would not accept (5), which puts the words into the opposite slots; the words could, for example, require a different inflection when they appear in first position than when they appear in second position in this artificial language.

Suppose next that the subject had studied sentences in which only the first noun phrase was expanded, as in (6). Would he accept sentences like (7), in which the second noun phrase is expanded?

    6. Like in room boy tall girl
    7. Like girl in room boy tall

That is, will rules generalize from the subject noun phrase to the object noun phrase? As LAS is currently constituted, such generalizations would not occur until it had built up fairly stable noun phrases.

Again, suppose LAS had initially only encountered simple sentences such as (8):

    8. Like boy woman

From sentences such as (8) LAS would learn the class of nouns that go in the first and second noun phrase slots. Suppose then sentence (9) was studied. On the basis of it, would sentence (10) be accepted as grammatical? That is, would the prepositional phrase in bank generalize to other nouns in the same class as woman?

    9. Like boy in bank woman
    10. Like girl in bank man

This would be an example of right generalization, which does not occur in LAS. In contrast, LAS does perform left generalization. That is, after studying (11) LAS would accept (12):

    11. Like boy woman nice
    12. Like boy man nice
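The predicted pattern of judgments for sentences (3) through (5) can be made concrete with a toy model of positional word classes. The representation below is an illustrative simplification of LAS's word-class machinery; the semantic similarity that also drives merging in LAS is left implicit.

    # Word classes induced from the studied sentences (1) and (2):
    # words are grouped by the slot they occupied.
    studied = [("like", "das", "tuk"), ("like", "fos", "jir")]

    first_slot = {s[1] for s in studied}      # das, fos
    second_slot = {s[2] for s in studied}     # tuk, jir

    def acceptable(sentence):
        """Judge a three-word sentence against the induced classes."""
        verb, first, second = sentence
        return verb == "like" and first in first_slot and second in second_slot

    print(acceptable(("like", "das", "tuk")))  # (3) True:  recalled directly
    print(acceptable(("like", "das", "jir")))  # (4) True:  within-class generalization
    print(acceptable(("like", "jir", "das")))  # (5) False: the classes do not swap slots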
The LAS project serves two purposes, one concerned with psychology and one concerned with artificial intelligence. I think this mixed purpose is fruitful, permitting cross-fertilization of ideas from the two fields. There is no guarantee that LAS, in the end, will ever achieve the goal of an adequate simulation of the acquisition of language. However, a certain outcome is a better understanding of the information-processing demands of language acquisition and of the role of a semantic referent in grammar induction. At the least, we will learn what is wrong with one explicit set of proposals. Even that would be a significant contribution to the current state of theory development in a field rich in data but almost totally lacking explicit information-processing theories.

I hope, of course, that the processes uncovered in the LAS project will be the same as those used by humans in language learning. A successful simulation program would constitute an enormous advance in our understanding of cognitive development. The contributions of LAS to the artificial intelligence field are less certain and more distant. Nonetheless, generality in language understanding systems is an important goal and one for which a learning system approach seems ideal. It is therefore important to understand the contribution language learning systems can make in this field. It would be a significant advance to know in detail why a learning system approach was not the answer to language understanding, or at least why LAS was not the right sort of learning system. Of course, if LAS does prove to be the basis for a viable language understanding system, its contribution to artificial intelligence will also be of considerable importance.

Facilities Available

I shall have available the entire facilities of the Human Performance Center, University of Michigan. My current computing arrangement can be extended for one to three years. My programs run under the Michigan Terminal System, which supports a rich variety of programs. Most of the programming will be performed in Michigan LISP (Hafner & Wilcox, 1974), which is a relatively economical and error-free LISP.

References

ALPAC. Language and machines: Computers in translation and linguistics. Washington: National Academy of Sciences, 1966.

Anderson, J. R. Computer simulation of a language-acquisition system. In R. L. Solso (Ed.), Information processing and cognition: The Loyola Symposium. Washington: Lawrence Erlbaum, 1975.

Anderson, J. R. and Bower, G. H. Human associative memory. Washington: Winston and Sons, 1973.

Bar-Hillel, Y. Language and information. Reading, Mass.: Addison-Wesley, 1964.

Bever, T. G. The cognitive basis for linguistic structures. In J. R. Hayes (Ed.), Cognition and the development of language. New York: Wiley, 1970.

Biermann, A. W. An interactive finite-state language learner. First USA-Japan Computer Conference, 1972.

Biermann, A. W. and Feldman, J. A. A survey of results in grammatical inference. In Frontiers of pattern recognition. New York: Academic Press, 1972.

Blasdell, R. and Jensen, P. Stress and word position as determinants of imitation in first-language learners. Journal of Speech and Hearing Research, 1970, 13, 193-202.

Bloom, L. One word at a time. The Hague: Mouton, 1973.

Bobrow, D. G. A question-answering system for high school algebra word problems. AFIPS Conference Proceedings, 1964, 26, 577-589.

Bowerman, M. Early syntactic development. Cambridge, England: Cambridge University Press, 1973.

Boyer, R. S. Locking: A restriction of resolution. Ph.D. dissertation, University of Texas at Austin, 1971.

Braine, M. D. S. On learning the grammatical order of words. Psychological Review, 1963, 70, 323-348.

Braine, M. D. S. On two types of models of the internalization of grammars. In D. I. Slobin (Ed.), The ontogenesis of grammar. New York: Academic Press, 1971, 153-188.

Broen, P. The verbal environment of the language-learning child. Monographs of the American Speech & Hearing Association, 1972, 17.

Brown, R. A first language. Cambridge, Mass.: Harvard University Press, 1973.

Brown, R. and Fraser, C. The acquisition of syntax. In C. N. Cofer & B. S. Musgrave (Eds.), Verbal behavior and learning: Problems and processes. New York: McGraw-Hill, 1963.

Bruner, J. S., Goodnow, J., and Austin, G. A. A study of thinking. New York: Wiley, 1956.
Charniak, E. Computer solution of word problems. Proceedings of the International Joint Conference on Artificial Intelligence. Washington, D.C., 1969, 303-316.

Chomsky, N. Aspects of the theory of syntax. Cambridge, Mass.: MIT Press, 1965.

Clark, E. V. Non-linguistic strategies and the acquisition of word meanings. Cognition: International Journal of Cognitive Psychology, 1974, in press.

Colby, K. M. and Enea, H. Inductive inference by intelligent machines. Scientia, 103, 669-720 (Jan.-Feb. 1968).

Coles, L. S. Talking with a robot in English. Proceedings of the International Joint Conference on Artificial Intelligence. Washington, D.C., 1969, 587-596.

Crespi-Reghizzi, S. The mechanical acquisition of precedence grammars. Report No. UCLA-ENG-7054, School of Engineering and Applied Science, University of California at Los Angeles, 1970.

Dreyfus, H. L. What computers can't do. New York: Harper and Row, 1972.

Ervin, S. M. Imitation and structural change in children's language. In E. H. Lenneberg (Ed.), New directions in the study of language. Cambridge, Mass.: MIT Press, 1964, 163-189.

Feldman, J. A. Some decidability results on grammatical inference and complexity. A.I. Memo No. 93.1, Computer Science Department, Stanford University, 1970.

Ferguson, C. A., Peizer, D. B., & Weeks, T. E. Model-and-replica phonological grammar of a child's first words. Lingua, 1973, 31, 35-55.

Fernald, C. Children's active and passive knowledge of syntax. Paper presented to the Midwestern Psychological Association, 1970.

Fikes, R. E., Hart, P. E. and Nilsson, N. J. Some new directions in robot problem solving. Stanford Research Institute, August, 1972.

Fillmore, C. J. The case for case. In E. Bach and R. T. Harms (Eds.), Universals in linguistic theory. New York: Holt, Rinehart and Winston, 1968.

Fraser, C., Bellugi, U., and Brown, R. Control of grammar in imitation, comprehension, and production. Journal of Verbal Learning and Verbal Behavior, 1963, 2, 121-135.

Friedman, J. A computer model of transformational grammar. New York: American Elsevier, 1971.

Green, C. and Raphael, B. Research on intelligent question answering systems. Proceedings of the ACM 23rd National Conference. Princeton, 1968, 169-181.

Hafner, C. and Wilcox, B. LISP/MTS programmer's manual. Mental Health Research Institute Communication 302, University of Michigan, 1974.

Horning, J. J. A study of grammatical inference. Technical Report No. CS 139, Computer Science Department, Stanford University, August, 1969.

Kelley, K. L. Early syntactic acquisition. Report P-3719, The Rand Corporation, Santa Monica, California, 1967.

Kellogg, C. H. A natural language compiler for on-line data management. Proceedings of the 1968 Fall Joint Computer Conference, 473-492.

Kuno, S. The predictive analyzer and a path elimination technique. Communications of the ACM, 1965, 8, 453-462.

Lenneberg, E. H. Understanding language without ability to speak: A case report. Journal of Abnormal and Social Psychology, 1962, 65, 419-425.

Lenneberg, E. H. Biological foundations of language. New York: Wiley, 1967.

Lindsay, R. K. Inferential memory as a basis of machines which understand natural language. In E. A. Feigenbaum and J. Feldman (Eds.), Computers and thought. New York: McGraw-Hill, 1963.

Loveland, D. W. A linear format for resolution. Proceedings of the IRIA Symposium on Automatic Demonstration. New York: Springer-Verlag, 1970, 147-162.

Luckham, D. Refinements in resolution theory. Proceedings of the IRIA Symposium on Automatic Demonstration. New York: Springer-Verlag, 1970, 163-190.
Miller, G. A. The psychology of communication. New York: Basic Books, 1967.

Minsky, M. (Ed.), Semantic information processing. Cambridge, Mass.: MIT Press, 1968.

Moeser, S. D. and Bregman, A. S. The role of reference in the acquisition of a miniature artificial language. Journal of Verbal Learning and Verbal Behavior, 1972, 11, 759-769.

Moore, E. F. Gedanken experiments on sequential machines. Automata Studies, Princeton, 1956.

Olson, G. M. Developmental changes in memory and the acquisition of language. In T. E. Moore (Ed.), Cognitive development and the acquisition of language. New York: Academic Press, 1973.

Pao, T. W. L. A solution of the syntactic induction-inference problem for a non-trivial subset of context-free languages. Report No. 70-19, The Moore School of Electrical Engineering, University of Pennsylvania, 1969.

Quillian, M. R. The teachable language comprehender. Communications of the Association for Computing Machinery, 1969, 12, 459-476.

Reber, A. S. Transfer of syntactic structure in synthetic languages. Journal of Experimental Psychology, 1969, 81, 115-119.

Richards, I. A., Jasuilko, E. and Gibson, C. Russian through pictures, Book I. New York: Washington Square Press, 1961.

Robinson, J. A. A machine-oriented logic based on the resolution principle. Journal of the ACM, 1965, 12, 23-41.

Rumelhart, D. E., Lindsay, P. and Norman, D. A. A process model for long-term memory. In E. Tulving and W. Donaldson (Eds.), Organization of memory. New York: Academic Press, 1972.

Saporta, S., Blumenthal, A. L., Lackowski, P. and Reiff, D. G. Grammatical models of language learning. In R. J. DiPietro (Ed.), Monograph Series on Languages and Linguistics, Vol. 16. Report of the 14th Annual Round Table Meeting on Linguistics and Language Studies, 1963, 133-142.

Schank, R. C. Conceptual dependency: A theory of natural-language understanding. Cognitive Psychology, 1972, 3, 552-631.

Schlesinger, I. M. Production of utterances and language acquisition. In D. I. Slobin (Ed.), The ontogenesis of grammar. New York: Academic Press, 1971.

Scholes, R. J. The role of grammaticality in the imitation of word strings by children and adults. Journal of Verbal Learning and Verbal Behavior, 1969, 8, 225-228.

Scholes, R. J. On functors and contentives in children's imitations of word strings. Journal of Verbal Learning and Verbal Behavior, 1970, 9, 167-170.

Schwarcz, R. M., Burger, J. F. and Simmons, R. F. A deductive question-answerer for natural language inference. Communications of the Association for Computing Machinery, 1970, 13, 167-183.

Shamir, E. and Bar-Hillel, Y. Review 2476. Computing Reviews, 1962, 3, 5.

Siklossy, L. A language-learning heuristic program. Cognitive Psychology, 1971, 2, 479-495.

Simmons, R. F. Natural language question-answering systems: 1969. Communications of the Association for Computing Machinery, 1970, 13, 15-30.

Simmons, R. F. Semantic networks: Their computation and use for understanding English sentences. In R. C. Schank and K. M. Colby (Eds.), Computer models of thought and language. San Francisco: Freeman, 1973.

Simon, H. A. The sciences of the artificial. Cambridge, Mass.: MIT Press, 1969.

Sinclair-de Zwart, H. Language acquisition and cognitive development. In T. E. Moore (Ed.), Cognitive development and the acquisition of language. New York: Academic Press, 1973.

Slagle, J. R. Experiments with a deductive question-answering program. Communications of the Association for Computing Machinery, 1965, 8, 792-798.

Slagle, J. R. Automatic theorem proving with renamable and semantic resolution. Journal of the ACM, 1967, 14, 687-697.

Slobin, D. I. (Ed.), The ontogenesis of grammar. New York: Academic Press, 1971.

Solomonoff, R. J. A formal theory of inductive inference, Part II. Information and Control, 1964, 7, 224-254.
Weizenbaum, J. ELIZA - A computer program for the study of natural language communication between man and machine. Communications of the ACM, 1966, 9, 36-45.

Wilks, Y. The Stanford MT and understanding project. In R. C. Schank and K. M. Colby (Eds.), Computer models of thought and language. San Francisco: Freeman, 1973.

Winograd, T. Understanding natural language. Cognitive Psychology, 1972, 3, 1-191.

Winston, P. H. Learning structural descriptions from examples. MIT Artificial Intelligence Laboratory Project AI-TR-231, 1970.

Woods, W. A. Procedural semantics for a question-answering machine. Proceedings of the 1968 Fall Joint Computer Conference, 457-471.

Woods, W. A. Progress in natural language understanding: An application to lunar geology. AFIPS Proceedings, 1973 National Computer Conference and Exposition.

Woods, W. A. Transition network grammars for natural language analysis. Communications of the ACM, 1970, 13, 591-606.