Figure 15. Some possible network grammars. [Network diagrams omitted.]

Every bit as much as LAS, a child logically needs negative information to recover from overgeneralizations. The interesting question is where the negative information comes from in the case of the child. Parents do correct the child in such obvious morphemic overgeneralizations (Brown, 1973). Even today I find myself corrected (not by my parents) for my failures to properly pluralize esoteric words. The child may also use statistical evidence for a negative conclusion. In some manner he may notice that the morphemic form foots is never used by the adult and so conclude that it is wrong. Horning (1969) has formalized an algorithm for detecting such overgeneralizations by assigning probabilities to rules.
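A minimal sketch of this statistical idea follows. It is not Horning's actual probabilistic formalism; the rule encoding, the usage threshold, and all names are illustrative assumptions:

    from collections import Counter

    def flag_overgeneralizations(grammar_rules, observed_parses, min_uses=5):
        """Flag rules the induced grammar licenses but adult speech never
        (or almost never) exercises: statistical negative evidence.
        Each parse is represented as the list of rule names it used."""
        usage = Counter()
        for parse in observed_parses:
            usage.update(parse)
        return [rule for rule in grammar_rules if usage[rule] < min_uses]

    rules = ["PLURAL -> foot+s", "PLURAL -> feet", "PLURAL -> dog+s"]
    parses = [["PLURAL -> feet"], ["PLURAL -> dog+s"]] * 10
    print(flag_overgeneralizations(rules, parses))  # ['PLURAL -> foot+s']

A rule such as PLURAL -> foot+s, though licensed by the induced grammar, accumulates no uses over substantial exposure and so is flagged as a probable overgeneralization.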
Figure 16. Additions to LAS's grammar after sentences 11-14 (e.g., TRIANGLE RED SQUARE BLUE LEFT-OF; SQUARE RED LARGE TRIANGLE RED LARGE BELOW). [Network listings and word-class tables omitted.]

Figure 16 illustrates LAS's treatment of the last four sentences of the training sequence. These involve some three-word noun phrases and also expansion of the noun phrases on one branch of the START network. As can be seen from Figure 16, at the point of the 14th sentence LAS has developed its grammar to the point where it will handle 616 sentences of the target language. Actually, the grammar has produced some overgeneralizations; it accepts a total of 750 sentences. LAS has encountered phrases like square, square small, square red, and square red small. From this experience, LAS has generalized to the conclusion that the noun phrases of the language consist of a shape, followed optionally by either a size or a color, followed optionally by a size. Thus the induced grammar includes phrases like square small small, because size words were found to be acceptable in both second and third positions. Interestingly, this mistake will not cause LAS any problems. It will never speak a phrase like square small small, because it will never have a HAM structure with two smalls modifying an object. It will never hear such a phrase, and thus UNDERSTAND can not make any mistakes. This is a nice example of how an over-general grammar can be successfully constrained by the requirement of semantic acceptability.

The problem of learning to sequence noun modifiers has turned out to be a source of unexpected difficulty. In part, the ordering of modifiers is governed by pragmatic factors. For instance, one is likely to say small red square when referring to one of many red squares, but red small square when referring to one of many small squares. Differences like these could be controlled by the ordering of links in the HAM memory structure.

GENERALIZE

After taking in 14 sentences LAS has built up a partial network grammar that serves to generate many more sentences than those it originally encountered. However, note that LAS has constructed four copies of a noun-phrase grammar. One would like it to recognize that these grammars are the same. The failure to do so with respect to this simple artificial language only amounts to an inelegance. However, the identification of identical networks is critical to inducing languages with recursive rules. Suppose LAS's grammar contained the following rewrite rules:

NP  → the NOUN
NP  → the ADJ NP1
NP1 → NOUN1
NP1 → ADJ1 NP2
NP2 → NOUN2
NP2 → ADJ2 NP3
NP3 → NOUN3

That is, there are four networks, NP, NP1, NP2, and NP3, whose structure is indicated by the above rewrite rules. It is assumed that LAS has only experienced three consecutive adjectives, and therefore SPEAKTEST has only created three embeddings. The critical inductive step for LAS is to recognize NP1 = NP2. This requires recognizing the identity of the word classes NOUN1 and NOUN2 and the word classes ADJ1 and ADJ2. This will be done on the criterion of the amount of overlap of the classes. It also requires recognition that network NP2 = NP3. Thus, to identify two networks may require that two other networks be identified. The network NP3 is only a subnetwork of NP2. So, in the recursive identification of networks, GENERALIZE will have to accept a subnetwork relation between one network, like NP2, which contains another, like NP3. The assumption is that with sufficient experience the embedded network would become filled out to be the same as the embedding network. After NP1 has been identified with NP2, HAM will have a new network structure in which NP* represents the amalgamation of NP1, NP2, and NP3:

NP  → the NOUN
NP  → the ADJ NP*
NP* → NOUN*
NP* → ADJ* NP*

Note that new word classes NOUN* and ADJ* have been created as the union of the word classes NOUN1, NOUN2, NOUN3 and of the classes ADJ1, ADJ2, respectively.

GENERALIZE was called to ruminate over the networks generated after the first fourteen sentences. GENERALIZE succeeded in identifying A195 with A197. As a consequence, network A195 replaced network A197 at the position where it occurred in the START network. Similarly, B566 was identified with and replaced network B564. Finally, B566 was identified with and replaced A195 throughout the START network. The final effective grammar is illustrated in Figure 17.

Figure 17. The final grammar. Word classes include B568 = below, left-of; A199 = above, right-of; B593 = square, triangle; D1117 = blue, red, large, small; E905 = large, small. [Network listings omitted.]

The final grammar now handles all the sentences of the target language. It handles more sentences than the grammar that was constructed after the fourteenth sentence. This is because the noun-phrase network B566 has been expanded to incorporate all possible noun phrases. Before the generalizations, none of B564, B566, A195, or A197 was complete.

Two assumptions have proven critical to LAS's ability to induce these types of languages. The first is the assumption of the correspondence between the surface structure of the language and the semantic structure. This is critical to BRACKET's identification of the surface structure of the sentence, which is, in turn, critical to the proper embedding of parsing networks. Second, there is the assumption of a semantics-induced equivalence of syntax. This played a critical role both in the generalization of SPEAKTEST and in that of GENERALIZE.
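A minimal sketch of the two criteria just described, class overlap and recursive network identification, under assumed data structures (LAS itself is a LISP program; the arc representation and the 0.5 threshold are illustrative assumptions):

    def classes_match(c1, c2, threshold=0.5):
        """Identify two word classes when their overlap is large enough."""
        return len(c1 & c2) / min(len(c1), len(c2)) >= threshold

    def networks_match(net1, net2, grammar, seen=None):
        """Recursively identify two networks: word-class arcs must match by
        overlap, push arcs by (recursive) network identity."""
        seen = seen if seen is not None else set()
        if (net1, net2) in seen:        # assume identity on re-entry
            return True
        seen.add((net1, net2))
        # zip stops at the shorter arc list, so a subnetwork relation
        # (e.g., NP3 inside NP2) is tolerated, as GENERALIZE requires.
        for (kind1, val1), (kind2, val2) in zip(grammar[net1], grammar[net2]):
            if kind1 != kind2:
                return False
            if kind1 == "class" and not classes_match(val1, val2):
                return False
            if kind1 == "push" and not networks_match(val1, val2, grammar, seen):
                return False
        return True

    # Hypothetical fragment: NP1 pushes to NP2, which pushes to NP3.
    grammar = {
        "NP1": [("class", {"square", "triangle"}), ("push", "NP2")],
        "NP2": [("class", {"square", "circle"}),   ("push", "NP3")],
        "NP3": [("class", {"square", "circle"})],
    }
    print(networks_match("NP1", "NP2", grammar))  # True -> amalgamate into NP*

Note how identifying NP1 with NP2 forces the recursive comparison of NP2 with NP3, mirroring the chain of identifications described above.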
It was noted with respect to pluralization that such generalizations can be in error and that children also tend to make such errors. However, I would want to argue that, on the whole, natural language is not perverse. Therefore, most of those generalizations will turn out to be good decisions. Clearly, for languages to be learnable there must be some set of generalizations which are usually safe. The only question is whether LAS has captured the safe generalizations.

The importance of semantics to child language learning has been suggested in various ways recently by many theoreticians (e.g., Bloom, 1970; Bowerman, 1973; Brown, 1973; Schlesinger, 1971; and Sinclair-de Zwart, 1973), but there has been little offered in the way of concrete algorithms to make explicit the contribution of semantics. LAS.1 is a first small step toward making this contribution explicit.

Conclusion

This concludes the explanation of the algorithms to be used by LAS.1 for language induction. In many ways the task faced by LAS.1 is overly simplistic, and its algorithms are probably too efficient and too free from information-processing limitations. Therefore, the acquisition behavior of LAS.1 does not mirror in most respects that of the child. Later versions of this program will attempt a more realistic simulation. Nonetheless, I think LAS.1 is a significant step forward. The following are the significant contributions embodied so far in LAS.1:

1. The transition network formalism has been interfaced with a set of simple and psychologically realistic long-term memory operations. In this way we have bridled the unlimited Turing-computable power of the augmented transition network.

2. A single grammatical formalism has been created for generation and understanding. Thus, LAS only needs to induce one set of grammatical rules.

3. Two important ways were identified in which a semantic referent helps grammar induction. These were stated as the graph deformation condition and the semantics-induced equivalence of syntax condition.

4. Algorithms have been developed adequate to learn natural language.

The general mode of developing the program LAS is as follows. A language learning situation is specified by a set of conditions. In LAS.1 it was specified that LAS already know the meaning of the words and that it be given, as input, sentences with HAM representations of their meaning. The semantic domain was specified to be that constituted by geometric shapes. Once a set of conditions is specified, a set of goals is specified. In LAS.1 there was only one real goal: to learn any natural-like language that described the domain. Once a set of goals is specified, a plan of attack is sketched out. However, the problem is such that the details of that plan only evolve as we attempt to implement it. Indeed, many interesting problems and ideas in LAS.1 were discovered in attempting an implementation. This is the utility of computer simulation in theoretical development.

The LAS.1 program operated in a task domain which is similar, but by no means identical, to that of the natural language learning situation. Its behavior was similar to that of a child learning a language, but again by no means identical. In the next two years I propose to create a program, LAS.2, considerably closer to simulating natural language learning. It will pursue a more elaborate set of goals than did LAS.1:

1. The program will incorporate realistic assumptions about short-term memory limitations and left-to-right sentence processing.

2. The program will learn the meanings of words.
3. The program should use semantic and contextual redundancy to partially replace explicitly provided HAM encodings of pictures.

4. The program should handle sentences in a more complex semantic domain.

5. The program should be elaborated to handle such things as questions and commands as well as declarative sentences.

The general methods for achieving these goals in the LAS.2 program will be sketched out in the proposal section. Also in that section I will propose some experiments to evaluate the LAS program. While it is true that the task faced by LAS.1 is not really natural language learning, it still is a learning task at which human subjects apparently can succeed. The experiments will determine whether humans have the same difficulties in such tasks as does LAS and whether they make the same generalizations. However, I regard these experiments as of secondary importance relative to program development. It is more important to further articulate our understanding of what algorithms are adequate for natural language learning.

It is probably inevitable that someone will ask whether it is really necessary to expend the effort to construct a computer program. Could not the model just be specified on paper? The reason why this is not possible has to do with the complexity of any theory that addresses the details of natural language. There is no other way to test the predictions of the theory or to assure that it is internally consistent. The experience with large transformational grammars of natural language is that they have hidden inconsistencies; these are only exposed by trying to simulate the grammars on a computer (e.g., Friedman, 1971). Consider the description given of LAS.1 in the preceding section. Although lacking in many details, it was complex and lengthy. Could the reader establish for himself from this description whether the model is really internally consistent? A computer program provides a proof of the consistency and a means of determining a model's behavior. The stated goals of this project are to develop explicit algorithms for natural language learning, to specify the relevant details of these algorithms, and to evaluate empirically the psychological viability of these algorithms. Without the use of computer simulation, none of these goals could be achieved.

C. Methods of Procedure

First I will describe the proposed extension of the LAS program. Then I will describe some experimental tests. In reading the specific extensions proposed for LAS, the reader should keep in mind that they have as their intent achieving the goals set forth in the preceding section.

The Semantic Domain

The first matter to settle upon in the new program is some semantic domain. The LAS.1 world of shapes, properties, and geometric relations is too impoverished for further work. The following is proposed as a suggestion, although there is nothing critical about its exact form. It is critical, however, that some semantic domain be chosen. It is only when there is a specified domain that an explicit goal for success in the program can be specified. The program will be regarded as successful if it can learn any natural language describing this domain. I have chosen to look at a world close to that of a young child, although there is perhaps nothing sacred about this domain. This world is set forth in Table 5. There are three people in this world. In addition to these there are four categories of objects: locations, containers, supporters, and toys. These objects can have four types of properties: number, color, size, and quality.
Thus, LAS will have to deal seriously with problems of sequencing adjectives. It will also have to deal with number as a property of objects. The objects permit a much richer variety of relations than in the world of LAS.1. This will provide a demanding test for the learning of complex multi-argument relations. There can be sentences like Mommy traded Daddy the car for a ball. In this world, people, containers, supporters, and toys can be in locations. People can change their location and that of toys. People and toys can be on supporters; toys can be in containers. People can possess toys, containers, and supporters.

TABLE 5
Categories in the World of LAS.2

PEOPLE    LOCATIONS    CONTAINERS    SUPPORTERS
Mommy     bedroom      box           table
Daddy     kitchen      closet        chair
LAS       den          dresser       bed

TOYS      NUMBERS      COLORS        SIZES      QUALITIES
dolly     one          red           big        dirty
car       two          blue          medium     pretty
ball      three        green         small      shiny

Thus the different categories of objects enter differently into different types of relations. This will prove important to the predictive parsing facilities that I will want to introduce into LAS.2.

Left-to-Right Processing

Children learn language auditorily. Thus, their induction algorithms must process incoming material in a left-to-right manner. The current LEARNMORE program does not do this. BRACKET completely processes the sentence before SPEAKTEST even begins to work on it. Clearly, BRACKET and SPEAKTEST should be integrated so that the beginning of the sentence is bracketed and processed by SPEAKTEST before the end of the sentence is considered by either. Introducing this left-to-right processing is a preliminary to introducing short-term memory limitations into the induction situation.

Figure 18 illustrates in highly schematic form the left-to-right algorithm proposed for LEARNMORE. Words are considered as they come in from the sentence. LEARNMORE, as in UNDERSTAND, tries to find a path through its network grammar to parse the sentence. The difference between LEARNMORE and UNDERSTAND is that LEARNMORE has available to it a HAM conceptual structure to enable it to better evaluate various parsing options. Suppose LEARNMORE is at some point in processing the sentence. It will also be at some point in a parsing network. Let us consider how it would process the next word. At box 2 it reads in the word. At box 3 it would set l to the various grammatical options (arcs) at that node in the network. Boxes 4 through 7 are concerned with evaluating whether any of these options can handle the current word. Box 4 checks whether there are any options left. Box 5 sets a to the first option and removes it from the remaining options. Box 6 checks whether the word would be parsed by arc a, and box 7 considers whether the action associated with that arc corresponds to the HAM structure. If a passes the tests in boxes 6 and 7, LEARNMORE advances to considering the next word. Otherwise it tries another arc. If it exhausts all arcs, it will call BUILDPATH (box 8) to build a new arc from the current node.

Figure 18. Flowchart of the left-to-right LEARNMORE algorithm. [Diagram omitted.]

The work currently assigned to BRACKET will have to be assigned to box 7. That is, box 7 will have to determine when an arc should involve a push to an embedded network and when it should pop back up to an embedding network. This will be done by consulting the information in the semantic structure. It would also be possible to consult the pause structure of the sentence for information about phrase-structure boundaries.
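A schematic rendering of boxes 2 through 8 follows, under an assumed arc representation; all names are illustrative (LAS is a LISP program, and Figure 18 is itself only schematic):

    def learnmore(words, grammar, start, ham_ok, buildpath):
        """Left-to-right LEARNMORE loop (Figure 18, boxes 2-8, paraphrased).
        grammar maps node -> list of (word_class, action, next_node) arcs;
        ham_ok(action) checks an arc's action against the HAM structure;
        buildpath builds a new arc when every existing one fails."""
        node = start
        for word in words:                                   # box 2: read word
            for word_class, action, nxt in grammar.get(node, []):  # boxes 3-5
                if word in word_class and ham_ok(action):          # boxes 6-7
                    node = nxt
                    break
            else:                                  # all arcs exhausted
                node = buildpath(grammar, node, word)        # box 8: new arc
        return node

    grammar = {"S": [({"square", "triangle"}, "shape", "S1")]}
    def ham_ok(action): return True
    def buildpath(g, node, word):
        g.setdefault(node, []).append(({word}, "learned", node + "'"))
        return node + "'"
    print(learnmore(["square", "red"], grammar, "S", ham_ok, buildpath))

Here the known word square traverses an existing arc, while the unparsed word red falls through to the BUILDPATH step, as in the flowchart.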
Note that certain sentences which the old LEARNMORE system could handle will not be handled by this system. For instance, consider the sentence The square that is above the triangle is right-of the square. After the first two words it would not be clear which square is being referenced, the object or the subject of the sentence, and so it would not be possible to assign an appropriate action to the path. In the old LEARNMORE, this ambiguity about the referent of square was resolved by processing the complete sentence before dealing with it. Presumably, however, a child also learns little from such sentences.

Lexicalization

In this system it will not be assumed that LAS knows the meanings of the words. Rather, this will be something that LAS will have to learn from the pairing of sentences with conceptions. First let us discuss the learning of words whose referent is a simple concept or object, e.g., box or mommy, and postpone discussion of complex relational terms like trade. Logically, the task of lexicalization is quite simple, and it would not require complex algorithms to succeed. For instance, consider this algorithm: LAS is given a sentence with n1 words and a conceptualization it describes with m1 concepts. Store with each word the m1 concepts. The next sentence that comes in has n2 words and its conceptualization consists of m2 concepts. If a word in this sentence is new, store with it the m2 concepts. If the word is old, store with it the intersection of the concepts previously stored with it and the new m2 concepts. Eventually, ignoring problems of polysemy, a word will become pared down to zero or one concepts. Those with zero concepts are function words, and those with one concept have that concept as their meaning.

Of course, this algorithm will run into trouble if LAS does not always conceptualize all the concepts referred to by the sentence. This can be remedied by having the algorithm wait for a sequence of disconfirming pieces of evidence before rejecting a hypothesized meaning. Incidentally, subjects behave just this way in concept-attainment situations (see Bruner, Goodnow & Austin, 1956), not taking negative evidence as having its full logical force about the meaning of the word.
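A minimal sketch of the intersection algorithm just described (the set representation is assumed, and the wait-for-disconfirmation refinement is omitted):

    def update_lexicon(lexicon, sentence_words, concepts):
        """One pass of the intersection algorithm. Words pared down to one
        concept get that concept as meaning; words pared down to zero
        concepts are function words."""
        for word in sentence_words:
            if word in lexicon:
                lexicon[word] &= concepts      # old word: intersect
            else:
                lexicon[word] = set(concepts)  # new word: all current concepts
        return lexicon

    lex = {}
    update_lexicon(lex, ["the", "box", "is", "red"], {"BOX", "RED"})
    update_lexicon(lex, ["the", "dolly", "is", "big"], {"DOLLY", "BIG"})
    update_lexicon(lex, ["the", "box", "is", "big"], {"BOX", "BIG"})
    print(lex["box"])   # {'BOX'}  -- pared down to its meaning
    print(lex["the"])   # set()   -- zero concepts: a function word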
Thus, the progres, much a8 adults appear to, will contrasts between grammatical pattern and a e program knows the grammatical rule NP - determiner p> a Oo rh oct Le rt Pe [3 adjecti . the phrase the ¢lick box it will suppose thet glick rerers to some property of the box. Thus, the progren will have to acquire its initial vocabulary by means of simple frames, 85 do young chilcren. With this initial vyocebulary information, it can begin to learn grammatical rules. Once in possession of grammatical rules, it will no longer nesd simple frames +o learn new lexical items. One interesting question is how function words are ever identified as non- meaning-pearing in this scheme, Presumably, 31 is done on the vasis of failing a d and any semantic yeature. FHLS esses had been associated with a word, So far I have assumed that all concepts are constructed before language acquisition takes place and that the only problem is to link up these concepts with words. But this is very unrealistic. Consider the verd give in the sen- ives the dolly to Daddy. The meaning of give is something 1ike ng one to cease to vossess 3h object end someon’ elise as ooject. Tt seens very implausible that a child comes learning situation wits sucn a concept ready made. What probably — he sees Mommy pushing the Goll to Daddy or Momny handing the ball to vany. With these experiences he hears sentences like Mommy gives the golly to Daddy or Mommy gives the pall to baby. From these examples he induces the appropriate meaning of give. Cancept attainment in these situations can be achieved by using the sort of concept jaentification used py Winston (1970) for inducing geometric concepts. That is, each use of the word give is paired with e EAM network structure given the meaning oF the sentence. Winston's heuristics allow. us to extract what these network structures Pe mon. ‘Tne concept give, es verb, is then attached to For this sort of algorithm to succeed, LAS must be set to regerd certain con~ figurations of propositions, interlinked by causal terms, 4&5 being associated - with a single relational term in the langu2ege. 62 Note also that the meaning of complex re ~ % ~ abe as e bae sentence moOmm yA TT co ERSTAND set upd, na 2 caild is that first the child eh has been in two and three word ueterances. es ib appears that children have omitt ad mo function + eonstructions. One explanation of the origin of telegraph e pealing from the point of view of LAS is the following: Suppose that LAS did not receive as input to its Teena routine complete sentences lesrapnic sentences. liy induce a te [t seexs reasoneole otel sentence he . If so, then his be receiv as their basic celesraghic Shis necothesis com fron studies of chiid imitation of adult mid bhay these tmdtati ons. amnile tanger than tha chiid's awm iso telesraphic in nature (e.g., Srown 2 Fraser, 1964). Blas- 1970) found that childre tend to repezt those words which are words which occur in terminal positions. The seme annem eng to be stressed in adult speech. Scholes (1969, 970) en tended to omit words that had unclear senantic 2 oes or What I find striking 1s th these ere just the veariebdles "ranch sentence--a language of serial pos one per- es C fectly. Of course, wh Tm tablished effects in eaningfulness el ments on immediave memory. ough an aspect “I propose to introduce telepraphic i the variables Ss ¢ D of LEARNMORE called BADEAR. The BADEAR program will simulate of stress, meaning fulness, and serial position in orovidings LAS with a depleted version of the sentence. 
So far I have assumed that all concepts are constructed before language acquisition takes place and that the only problem is to link up these concepts with words. But this is very unrealistic. Consider the verb give in the sentence Mommy gives the dolly to Daddy. The meaning of give is something like causing one person to cease to possess an object and someone else to come to possess that object. It seems very implausible that a child comes to the language learning situation with such a concept ready-made. What probably happens is that he sees Mommy pushing the dolly to Daddy or Mommy handing the ball to baby. With these experiences he hears sentences like Mommy gives the dolly to Daddy or Mommy gives the ball to baby. From these examples he induces the appropriate meaning of give. Concept attainment in these situations can be achieved by using the sort of concept identification used by Winston (1970) for inducing geometric concepts. That is, each use of the word give is paired with a HAM network structure giving the meaning of the sentence. Winston's heuristics allow us to extract what these network structures have in common. The concept give, as a verb, is then attached to this common structure. For this sort of algorithm to succeed, LAS must be set to regard certain configurations of propositions, interlinked by causal terms, as being associated with a single relational term in the language. Note also that the meaning of complex relational terms can be induced in this way only if UNDERSTAND can set up the HAM structure for sentences like Mommy gives the dolly to Daddy.

A striking fact about the child is that his first speech consists of two- and three-word utterances in which function words and other grammatical constructions are omitted. One explanation of the origin of telegraphic speech, appealing from the point of view of LAS, is the following. Suppose that LAS did not receive as input to its learning routines complete sentences but telegraphic sentences; it would naturally induce a telegraphic grammar. It seems reasonable that the child does not hear the total sentence. If so, then the sentences he receives as input would be telegraphic in their basic structure. This hypothesis receives support from studies of child imitation of adult speech: these imitations, while longer than the child's own utterances, are also telegraphic in nature (e.g., Brown & Fraser, 1963). Blasdell and Jensen (1970) found that children tend to repeat those words which are stressed and those which occur in terminal positions; the same words tend to be stressed in adult speech. Scholes (1969, 1970) found that children tended to omit words that had unclear semantic reference. What I find striking is that these are just the variables (stress, meaningfulness, and serial position) that have well-established effects on immediate memory.

I propose to introduce telegraphic input through an aspect of LEARNMORE called BADEAR. The BADEAR program will simulate the effects of stress, meaningfulness, and serial position in providing LAS with a depleted version of the sentence. The locus of the effect of BADEAR will be between boxes 4 and 8 in the flowchart of Figure 18. Basically, it will not pass all words on to BUILDPATH. Rather, some words will "slip from consciousness" after failing to be parsed. It will tend to omit words when (a) they are unstressed, (b) their meaning is not known, or (c) a critical number of new words in the sentence have already been passed to BUILDPATH. I suspect this critical number is something like one or two. Factors (a) and (b) would generate the effects of stress and meaningfulness. Factor (c) would yield good memory for the first words of the sentence. The good memory children do show for the last words of a sentence may be attributable to short-term acoustic memory.
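A sketch of how BADEAR might deplete a sentence along factors (a) through (c); the stress marking, the particular way the factors are combined, and the capacity of two are assumptions consistent with the text:

    def badear(sentence, lexicon, capacity=2):
        """Deplete the input the way a child's ear/memory might: drop a
        word if it is unstressed and unknown, or if the quota of new
        words has already been spent."""
        kept, new_words = [], 0
        for word, stressed in sentence:
            known = word in lexicon
            if not stressed and not known:
                continue                  # (a)+(b): slips from consciousness
            if not known:
                if new_words >= capacity: # (c): critical number exceeded
                    continue
                new_words += 1
            kept.append(word)
        return kept

    lexicon = {"dolly", "box"}
    sent = [("put", True), ("the", False), ("dolly", True),
            ("in", False), ("the", False), ("box", True)]
    print(badear(sent, lexicon))   # ['put', 'dolly', 'box'] -- telegraphic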
An interesting feature of this proposal is that as BADEAR's capacities are expanded, LAS would be able to receive more of the sentence. Thus its productions and imitations would grow as do a child's. This would be an explicit mechanism for an idea suggested by Braine (1971) and Olson (1973).

Inducing a grammar from degenerate sentences presents an interesting problem: how is it that the child eventually abandons his rules for generating telegraphic speech? He is merely exposed to fuller sentences; it does not follow that he should abandon rules which are workable means for expressing the same thoughts. There must be some mechanism incorporated that will strengthen some rules relative to others. The rules to be strengthened would be those that participate in successful UNDERSTANDings and in sentences successfully spoken. The arcs out of a parsing network would be ordered according to their relative strengths, like the contents of a stack. Ineffective rules, like the original rules for one-word utterances, would descend toward the bottom of the stack and so become unavailable. This strength mechanism is the same as that used to order links in the HAM memory model. It is a different way to bring negative information to bear in grammar induction than that proposed for LAS.1: rather than seeking explicit disconfirmation of rules, the rules are forced out of existence as more adequate rules take over the function they used to occupy in sentence understanding and generation.

There is also a need for mechanisms that optimize the grammar. Recall that the induced start network parses a complete noun phrase along one arc for sentences with an RA relation and along a separate arc for sentences with an RB relation. This grammar requires considerable backup if the sentence does not have an RA relation. As suggested earlier, it would be more efficient if LAS were given the power to transform the grammar into a form in which the two paths share a single initial noun-phrase arc, so that the choice between RA and RB is deferred until the relation word is encountered. Given that there are serious time problems (see the introduction of this proposal) in parsing, it is critical that methods be incorporated in the learning program for optimizing the grammar. The merging of arcs, besides making the grammar more efficient, would be another form of generalization. It could be used to further merge and build up word classes.

There are two further ways that semantics can be used to aid language learning. One is in the formation of word classes: semantically similar words, for example those which all name colors, tend to enter into the same syntactic constructions. Currently, word classes are merged solely according to the amount of overlap between their memberships; reference to the same type of property could serve as an additional criterion. The second use of semantics would be to lessen the number of possible interpretations of sentences. It should sometimes be possible to rule out senseless interpretations. For instance, suppose a sentence contains two concepts that can be related in only one way within the semantic domain; because of the conceptual constraints it is possible to guess their connection. This use of conceptual constraints from the domain could also be employed by UNDERSTAND, on the model of Schank's (1972) system. That is, rather than handling a sentence purely by use of syntactic information, UNDERSTAND would use conceptual constraints to predict what an interpretation should be. The prediction can then be checked for syntactic consistency with the network grammar. It would be profitable to try to capture a predictive parsing system like Schank's within the rigors of the network formalism.

A Procedural Semantics

So far LAS has been principally concerned with representing the meaning conveyed by a declarative sentence. However, language has other purposes than just to communicate meanings from one speaker to another. Consider commands and questions. For instance, consider the sentence Put the dolly in the box. Currently, UNDERSTAND might retrieve the sentence's meaning as a request of LAS that it put the dolly in the box. This is the declarative meaning of the sentence. However, in addition LAS should evoke an action, at the least an action to decide whether to comply. This is the procedural meaning of the sentence. The procedural meaning of declaratives is very simple: store this sentence. This is already LAS's treatment of the sentence. However, the procedural meanings underlying other types of sentences are more complex. A large part of the success of Winograd's (1972) system is that it was adequately able to deal with the procedural aspects of various sentences' semantics. It is important that LAS begin to handle these too.

What this would mean, in terms of LAS's network grammars, is enriching the set of actions that can be stored. Currently, the only actions are ones that result in the creation of pieces of HAM structure, i.e., declarative knowledge. LAS will have to have other internal actions that specify what it does with the declarative knowledge. These will include commands to answer the question or obey the order. HAM already has commands that direct it to answer a question, but executing orders would be something new. As part of the HAM project, I am working on methods for incorporating procedural knowledge into a network system. It is unclear yet what success I will have here.

There are many words in the language whose semantics is procedural in character. Consider, for example, the definite article the: it signals that the referent is an object which the listener should be able to identify. Or consider the pronoun you, whose referent is relative to speaker and context. Since the referent of you completely changes with the speaker, a child would be lost if he tried to associate its meaning with some fixed HAM memory node. He must be prepared to treat it as having as its meaning a procedure for determining the referent.

Provided that LAS has the facilities for representing and evaluating procedures, there seem to be no difficulties in learning those aspects of language which are heavily imbued with procedural semantics. Language learning will continue to arise from pairing sentences with semantic interpretations. However, the semantic interpretations will now contain a procedural as well as a declarative aspect. Again, language learning will consist of learning mappings between sentences and the now-enriched semantic representations.
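A minimal sketch of what the enriched action set might look like, as a dispatch on sentence mood (the mood labels and the action repertoire are illustrative assumptions, not LAS's or HAM's actual commands):

    def respond(parse, memory):
        """Declaratives are stored; questions are answered from the
        HAM-like memory; commands evoke an action to comply."""
        mood, proposition = parse
        if mood == "declarative":
            memory.append(proposition)            # store this sentence
            return "stored"
        if mood == "question":
            return proposition in memory          # answer the question
        if mood == "command":
            return f"executing: {proposition}"    # obey the order

    memory = []
    print(respond(("declarative", "dolly in box"), memory))  # stored
    print(respond(("question", "dolly in box"), memory))     # True
    print(respond(("command", "put dolly in box"), memory))  # executing: ...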
Experimentation

As stated before, I do not think that experimental research should yet be the principal focus of the project. There is still much further research that needs to be done in the way of specifying algorithms that are capable of language induction. Nonetheless, in parallel with this research, I would like to perform experiments to get some initial assessments of the viability of the proposed algorithms. The type of information relevant to evaluating LAS is only acquired by looking at artificial languages. With these artificial languages it is possible to test LAS's predictions about language learnability and generalization.

Criticisms of Experiments with Artificial Languages

For ethical reasons it is not possible to expose young children, just learning their first language, to an artificial language which LAS had identified as degenerate and probably not learnable. This means that all experimentation with artificial languages must be done on older children already well established in their first language, or on adults. Consequently, the first language may be mediating acquisition of the second language. There is evidence (see Lenneberg, 1967) that there is a critical initial period during which languages can be learned much more successfully than in later years. Lenneberg speculates that there is a physiological basis for this critical period. Thus, one might wonder whether the same processes are being studied with older subjects as in the young child. Personally, I doubt that the mechanisms of language acquisition are entirely the same with the young child in first-language learning as with the older subject in second-language learning. However, it does seem that the two cases are similar enough for experiments with older subjects to be informative.

Other criticisms (e.g., those of Slobin, 1971; Miller, 1967) of studies with artificial languages rest on the fact that these languages are far less complicated than natural language: an artificial laboratory language lacks the complex functions of natural speech. Another shortcoming of those studies is the absence of a semantic referent. Clearly, this limits the kinds of algorithms a subject can employ; the heuristics used by LAS would be useless without semantics. Moeser and Bregman (1972, 1973) have shown that the existence of a semantic referent has a huge effect on language acquisition. Except for control conditions, my experiments will involve a semantic referent.

Language Learnability

A cornerstone of LAS's induction algorithm is the graph deformation condition on the relation between the surface structure of the sentence and the semantic structure. That is, the surface structure must preserve the original connectivity of concepts. In Section A5 we described languages which violated this assumption. Consider the following language:

S → NP NP RELATION
NP → NOUN (COLOR) (SIZE) (CLAUSE)
CLAUSE → te NP RELATION
NOUN → square, circle, triangle, diamond
COLOR → red, blue
SIZE → small, big
RELATION → above, below, right-of, left-of

This is an expanded version of GRAMMAR1 described in Table 1. (The element te serves the function of a relative pronoun like that.) An example of a sentence in this language is Square red te triangle big above circle blue small right-of. The experiment I will do compares four conditions of learning for this language:

1. No reference. Here subjects simply study strings of the language, trying to infer their grammatical structure.

2. Bad semantics. Here a picture of the sentence's referent will be presented along with the sentence. However, the relationship between the sentence's semantic referent and the surface structure will violate LAS's constraints. The adjectives associated with the ith noun phrase will modify the (n + 1 - i)th shape in the sentence (where n is the number of noun phrases). For example, the adjectives associated with the first noun phrase will modify the last shape. Similarly, the jth relation will be depicted as holding between the shapes of the (n + 2 - j)th pair; for instance, the relation stated first will be depicted between the last pair of shapes. A picture for the example sentence is given in Figure 19a.

3. Good semantics. Here the picture will correctly depict the sentence's referent, in accord with LAS's constraints.

4. Good semantics plus main proposition. The picture in this condition will be the same as in 3, but the two shapes in the main proposition will be highlighted. In this condition LAS would be guaranteed of successfully bracketing the sentence because the main proposition is given.

Figure 19. Different semantic referents for the sentence Square red te triangle big above circle blue small right-of. [Panels (a)-(c) omitted.]
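The Condition 2 scrambling of adjectives is easy to state exactly; here is a sketch under an assumed noun-phrase representation:

    def bad_semantics_picture(noun_phrases):
        """Condition 2 mapping: the adjectives of the i-th noun phrase are
        depicted as modifying the (n + 1 - i)-th shape (1-indexed),
        destroying the graph-deformation relation."""
        n = len(noun_phrases)
        shapes = [shape for shape, _ in noun_phrases]
        return [(shapes[n - 1 - i], adjs)
                for i, (_, adjs) in enumerate(noun_phrases)]

    # "Square red te triangle big above circle blue small right-of"
    nps = [("square", ["red"]),
           ("triangle", ["big"]),
           ("circle", ["blue", "small"])]
    print(bad_semantics_picture(nps))
    # [('circle', ['red']), ('triangle', ['big']), ('square', ['blue', 'small'])]

The first noun phrase's adjective (red) ends up depicted on the last shape (the circle), exactly the violation the condition is designed to produce.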
In some ways this experiment is like Moeser and Bregman's. However, here English words are used so that the subjects do not need to induce the meanings of the language's words along with its grammar. Condition 4 corresponds to the situation faced by LAS. If the English words were replaced by nonsense syllables, the language would be complicated to the point that its induction might no longer be tractable. The predictions of LAS are, of course, that the best learning occurs in Condition 4, next best in 3, and failure of any learning in 1 and 2. It would not be surprising to see subjects perform better in 1 than in 2, since in 2 they might persist in trying to use the misleading semantic referent.

The procedure would have subjects in all conditions study the same sequence of sentences but vary the accompanying semantic information according to condition. After a study phase they would be tested for grammaticality judgments about a set of sentences, some of which violate one of the rules for generation. Since the syntax of the language is the same in all four conditions, the same sentences will be grammatical in all four conditions. Even though the syntactic information given during study will be the same in all conditions, marked differences in syntactic knowledge should appear across conditions. I will alternate sequences of study trials with sequences of test trials. A subject will study six sentences, with the semantic information appropriate to his condition (if any). Then he will see six test pairs, one sentence of each pair violating some syntactic rule. For each pair of sentences he will choose the grammatically correct member. By frequently alternating study and test, it will be possible to carefully monitor the growth of information in the various conditions.

Many readers may not be surprised by the prediction of better learning in Conditions 3 and 4. Hopefully, the significance of such an outcome would be clear. It would show that semantics is important to induction of the structure of a natural language. However, it would also show that semantics is useless if the relation between the semantic referent and the syntactic structure is arbitrary. The surface structure of the sentence must be a graph-deformation of the underlying semantic structure. Failure to appreciate the contribution of semantics to language induction, and failure to understand the nature of that contribution, have been fundamental in the stagnation of attempts to understand the algorithms permitting language induction. These facts may seem obvious when pointed out, but they have been unavailable to linguistic theorists for fifteen years.

The constraints LAS places on possible languages serve the same purpose as Chomsky's proposed linguistic universals. That is, they restrict the set of possible languages so that the target language can be identified. However, the constraints imposed by LAS are not the same as those suggested by Chomsky. For instance, Chomsky proposed that transformations which reversed the order of words in a sentence would be unacceptable. This is because such a rule does not refer to the sentence's constituent structure. However, a language which contained sentences of a natural language and their reversals would be learnable by LAS. It would just develop one set of rules for sentences in one order and another independent set for reverse-order sentences.
It would be interesting to see whether human subjects could learn such a language.

In the example of the induction of GRAMMAR1, we found that it was hard for LAS to detect non-semantic contingencies between syntactic choices in the first noun phrase and in the second noun phrase pushed to in the main network. For instance, it is possible that a morphemic embellishment of the adjectives in the second noun phrase may depend on the choice of morphemic embellishment of the noun in the first noun phrase. Human subjects should also find it hard to detect such syntactic contingencies.

There is another set of predictions, besides those concerned with language learnability, which it will be useful to explore. LAS makes predictions about the situations under which humans will tend to generalize rules and when they will not. Suppose LAS learned the following grammar:

S → VERB NP NP
NP → (PREPP) N1 (ADJ)
PREPP → PREP N2
N1 → boy, girl, etc.
N2 → room, bank, etc.
ADJ → tall, nice, etc.
PREP → in, near, etc.
VERB → like, hit, etc.

A typical sentence in this language would be Like in room boy tall girl nice, which means The tall boy in the room likes the nice girl. This language is given English terms only to make its semantics clearer. Suppose, in fact, words in the language were das meaning man, jir meaning woman, fos meaning boy, and tuk meaning girl. Suppose the subject studies the following pair of sentences:

1. Like das tuk.
2. Like fos jir.

Then it is interesting to consider his judgments of the acceptability of sentences like:

3. Like das tuk.
4. Like das jir.
5. Like jir das.

Accepting (3) involves only recalling sentence (1), but accepting (4) and (5) involves generalization. LAS would currently make these generalizations, merging das and fos, and jir and tuk, into common word classes because of their semantic similarity. The words could, for example, take a different inflection when they appear in subject position than when they appear in object position in this artificial language. Would the subject accept sentences like:

6. Like in room boy tall girl
7. Like girl in room boy tall

That is, will rules generalize from the subject noun phrase to the object noun phrase? As LAS is currently constituted, such generalizations would not occur until it had built up fairly stable noun phrases.

Again, suppose LAS had initially only encountered simple sentences such as (8):

8. Like boy woman

From sentences such as (8), LAS would learn the class of nouns that go in the first and second noun-phrase slots. Suppose then sentence (9) was studied. On the basis of it, would sentence (10) be accepted as grammatical? That is, would the prepositional phrase in bank generalize to other nouns in the same class as woman?

9. Like boy in bank woman
10. Like girl in bank man

This would be an example of right generalization, which does not occur in LAS. In contrast, LAS does perform left generalization. That is, after studying (11), LAS would accept (12):

11. Like boy woman nice
12. Like boy man nice

This project has two purposes, one concerned with psychology and one concerned with artificial intelligence. I think this mixed purpose is fruitful, permitting cross-fertilization of ideas between the two fields. There is no guarantee that LAS will ever achieve the goal of an adequate simulation of the acquisition of language. However, a certain outcome is a better understanding of the information-processing demands of language learning and of the role of a semantic referent in grammar induction. At the least, we will learn what is wrong with one explicit set of ideas. Even that would be a significant contribution to the current state of theoretical development in a field rich in data but almost totally lacking explicit information-processing theories.
I hope, of course, that the processes uncovered in the LAS project will be the same as those used by humans in language learning. A successful simulation program would constitute an enormous advance in our understanding of cognitive development. The contributions of LAS to the artificial intelligence field are less certain and more distant. Nonetheless, generality in language understanding systems is an important goal, and one for which a learning-system approach seems ideal. It is therefore important to understand the contribution language learning systems can make in this field. It would be a significant advance to know in detail why a learning-system approach was not the answer to language understanding, or at least why LAS was not the right sort of learning system. Of course, if LAS does prove to be the basis for a viable language understanding system, its contribution to artificial intelligence will also be of considerable importance.

E. Facilities Available

I shall have available the entire facilities of the Computing Center, University of Michigan. My current computing allocation is adequate but can be extended for one to three years. My programs run under the Michigan Terminal System, which supports a rich variety of programs. Most of the programming will be performed in Michigan LISP (Hafner & Wilcox, 1974), which is a relatively economical and error-free version of LISP.

References

ALPAC. Language and machines: Computers in translation and linguistics. Washington: National Academy of Sciences, 1966.

Anderson, J. R. Computer simulation of a language acquisition system. In R. L. Solso (Ed.), Information processing and cognition: The Loyola Symposium. Washington: Lawrence Erlbaum, 1975.

Anderson, J. R., & Bower, G. H. Human associative memory. Washington: Winston and Sons, 1973.

Bar-Hillel, Y. Language and information. Reading, Mass.: Addison-Wesley, 1964.

Bever, T. G. The cognitive basis for linguistic structures. In J. R. Hayes (Ed.), Cognition and the development of language. New York: Wiley, 1970.

Bierman, A. W. An interactive finite-state language learner. First USA-Japan Computer Conference, 1972.

Bierman, A. W., & Feldman, J. A. A survey of results in grammatical inference. In Frontiers of pattern recognition. New York: Academic Press, 1972.

Blasdell, R., & Jensen, P. Stress and word position as determinants of imitation in first-language learners. Journal of Speech and Hearing Research, 1970, 13, 193-202.

Bloom, L. One word at a time. The Hague: Mouton, 1973.

Bobrow, D. G. A question-answering system for high school algebra word problems. AFIPS Conference Proceedings, 1964, 26, 577-589.

Bowerman, M. Early syntactic development. Cambridge, England: University Press, 1973.

Boyer, R. S. Locking: A restriction of resolution. Ph.D. dissertation, University of Texas at Austin, 1971.

Braine, M. D. S. On learning the grammatical order of words. Psychological Review, 1963, 70, 323-348.

Braine, M. D. S. On two types of models of the internalization of grammars. In D. I. Slobin (Ed.), The ontogenesis of grammar. New York: Academic Press, 1971, 153-188.

Broen, P. The verbal environment of the language-learning child. Monographs of the American Speech & Hearing Association, 1972, 17.

Brown, R. A first language. Cambridge, Mass.: Harvard University Press, 1973.

Brown, R., & Fraser, C. The acquisition of syntax. In C. N. Cofer & B. S. Musgrave (Eds.), Verbal behavior and learning: Problems and processes. New York: McGraw-Hill, 1963.

Bruner, J. S., Goodnow, J., & Austin, G. A. A study of thinking. New York: Wiley, 1956.
Charniak, E. Computer solution of word problems. Proceedings of the International Joint Conference on Artificial Intelligence. Washington, D.C., 1969, 303-316.

Chomsky, N. Aspects of the theory of syntax. Cambridge, Mass.: MIT Press, 1965.

Clark, E. V. Non-linguistic strategies and the acquisition of word meanings. Cognition: International Journal of Cognitive Psychology, 1974, in press.

Colby, K. M., & Enea, H. Inductive inference by intelligent machines. Scientia, 1968, 103, 669-720.

Coles, L. S. Talking with a robot in English. Proceedings of the International Joint Conference on Artificial Intelligence. Washington, D.C., 1969, 587-596.

Crespi-Reghizzi, S. The mechanical acquisition of precedence grammars. Report No. UCLA-ENG-7054, School of Engineering and Applied Science, University of California at Los Angeles, 1970.

Dreyfus, H. L. What computers can't do. New York: Harper and Row, 1972.

Ervin, S. M. Imitation and structural change in children's language. In E. H. Lenneberg (Ed.), New directions in the study of language. Cambridge, Mass.: MIT Press, 1964.

Feldman, J. A. Some decidability results on grammatical inference and complexity. A.I. Memo No. 93.1, Computer Science Department, Stanford University, 1970.

Ferguson, C. A., Peizer, D. B., & Weeks, T. E. Model-and-replica phonological grammar of a child's first words. Lingua, 1973, 31, 35-55.

Fernald, C. Children's active and passive knowledge of syntax. Paper presented to the Midwestern Psychological Association, 1970.

Fikes, R. E., Hart, P. E., & Nilsson, N. J. Some new directions in robot problem solving. Stanford Research Institute, August 1972.

Fillmore, C. J. The case for case. In E. Bach & R. T. Harms (Eds.), Universals in linguistic theory. New York: Holt, Rinehart and Winston, 1968.

Fraser, C., Bellugi, U., & Brown, R. Control of grammar in imitation, comprehension, and production. Journal of Verbal Learning and Verbal Behavior, 1963, 2, 121-135.

Friedman, J. A computer model of transformational grammar. New York: American Elsevier, 1971.

Green, C., & Raphael, B. Research on intelligent question-answering systems. Proceedings of the ACM 23rd National Conference, 1968, 169-181.

Hafner, C., & Wilcox, B. LISP/MTS programmer's manual. Mental Health Research Institute Communication 302 and Information Processing Paper, University of Michigan, 1974.

Horning, J. J. A study of grammatical inference. Technical Report No. CS-139, Computer Science Department, Stanford University, August 1969.

Kelley, K. L. Early syntactic acquisition. Report P-3719, The Rand Corporation, Santa Monica, California, 1967.

Kellogg, C. H. A natural language compiler for on-line data management. Proceedings of the 1968 Fall Joint Computer Conference, 473-492.

Kuno, S. The predictive analyzer and a path elimination technique. Communications of the ACM, 1965, 8, 453-462.

Lenneberg, E. H. Understanding language without ability to speak: A case report. Journal of Abnormal and Social Psychology, 1962, 65, 419-425.

Lenneberg, E. H. Biological foundations of language. New York: Wiley, 1967.

Lindsay, R. K. Inferential memory as a basis of machines which understand natural language. In E. A. Feigenbaum & J. Feldman (Eds.), Computers and thought. New York: McGraw-Hill, 1963.

Loveland, D. W. A linear format for resolution. Proceedings of the IRIA Symposium on Automatic Demonstration. New York: Springer-Verlag, 1970, 147-162.

Luckham, D. Refinements in resolution theory. Proceedings of the IRIA Symposium on Automatic Demonstration. New York: Springer-Verlag, 1970, 163-190.
Miller, G. A. The psychology of communication. New York: Basic Books, 1967.

Minsky, M. (Ed.), Semantic information processing. Cambridge, Mass.: MIT Press, 1968.

Moeser, S. D., & Bregman, A. S. The role of reference in the acquisition of a miniature artificial language. Journal of Verbal Learning and Verbal Behavior, 1972, 11, 759-769.

Moore, E. F. Gedanken-experiments on sequential machines. In Automata studies. Princeton: Princeton University Press, 1956.

Olson, G. M. Developmental changes in memory and the acquisition of language. In T. E. Moore (Ed.), Cognitive development and the acquisition of language. New York: Academic Press, 1973.

Pao, T. W. L. A solution of the syntactical induction-inference problem for a non-trivial subset of context-free languages. Report No. 70-19, The Moore School of Electrical Engineering, University of Pennsylvania, 1969.

Quillian, M. R. The teachable language comprehender. Communications of the Association for Computing Machinery, 1969, 12, 459-476.

Reber, A. S. Transfer of structure in synthetic languages. Journal of Experimental Psychology, 1969, 81, 115-119.

Richards, I. A., Jasuilko, E., & Gibson, C. Russian through pictures, Book I. New York: Washington Square Press, 1961.

Robinson, J. A. A machine-oriented logic based on the resolution principle. Journal of the ACM, 1965, 12, 23-41.

Rumelhart, D. E., Lindsay, P. H., & Norman, D. A. A process model for long-term memory. In E. Tulving & W. Donaldson (Eds.), Organization of memory. New York: Academic Press, 1972.

Saporta, S., Blumenthal, A. L., Lackowski, P., & Reiff, D. G. Grammatical models of language learning. In R. J. DePietro (Ed.), Monograph Series on Languages and Linguistics, Vol. 16. Report of the 14th Annual Round Table Meeting on Linguistics and Language Studies, 1963, 133-142.

Schank, R. C. Conceptual dependency: A theory of natural-language understanding. Cognitive Psychology, 1972, 3, 552-631.

Schlesinger, I. M. Production of utterances and language acquisition. In D. I. Slobin (Ed.), The ontogenesis of grammar. New York: Academic Press, 1971.

Scholes, R. J. The role of grammaticality in the imitation of word strings by children and adults. Journal of Verbal Learning and Verbal Behavior, 1969, 8, 225-228.

Scholes, R. J. On functors and contentives in children's imitations of word strings. Journal of Verbal Learning and Verbal Behavior, 1970, 9, 167-170.

Schwarcz, R. M., Burger, J. F., & Simmons, R. F. A deductive question-answerer for natural language inference. Communications of the Association for Computing Machinery, 1970, 13, 167-183.

Shamir, E., & Bar-Hillel, Y. Review 2476. Computing Reviews, 1962, 3, 5.

Siklossy, L. A language-learning heuristic program. Cognitive Psychology, 1971, 2, 479-495.

Simmons, R. F. Natural language question-answering systems: 1969. Communications of the Association for Computing Machinery, 1970, 13, 15-30.

Simmons, R. F. Semantic networks: Their computation and use for understanding English sentences. In R. C. Schank & K. M. Colby (Eds.), Computer models of thought and language. San Francisco: Freeman, 1973.

Simon, H. A. The sciences of the artificial. Cambridge, Mass.: MIT Press, 1969.

Sinclair-de Zwart, H. Language acquisition and cognitive development. In T. E. Moore (Ed.), Cognitive development and the acquisition of language. New York: Academic Press, 1973.

Slagle, J. R. Experiments with a deductive question-answering program. Communications of the Association for Computing Machinery, 1965, 8, 792-798.

Slagle, J. R. Automatic theorem proving with renamable and semantic resolution. Journal of the ACM, 1967, 14, 687-697.

Slobin, D. I. (Ed.), The ontogenesis of grammar. New York: Academic Press, 1971.

Solomonoff, R. J. A formal theory of inductive inference, Part II. Information and Control, 1964, 7, 224-254.
Weizenbaum, J. ELIZA: A computer program for the study of natural language communication between man and machine. Communications of the ACM, 1966, 9, 36-45.

Wilks, Y. The Stanford MT and understanding project. In R. C. Schank & K. M. Colby (Eds.), Computer models of thought and language. San Francisco: Freeman, 1973.

Winograd, T. Understanding natural language. Cognitive Psychology, 1972, 3, 1-191.

Winston, P. H. Learning structural descriptions from examples. MIT Artificial Intelligence Laboratory Project AI-TR-231, 1970.

Woods, W. A. Procedural semantics for a question-answering machine. Proceedings of the 1968 Fall Joint Computer Conference, 457-471.

Woods, W. A. Transition network grammars for natural language analysis. Communications of the ACM, 1970, 13, 591-606.

Woods, W. A. Progress in natural language understanding: An application to lunar geology. AFIPS Proceedings, 1973 National Computer Conference and Exposition.