[Figure 4. An example of a HAM structure encoding both categorical information and word class information. Recoverable node labels: *COLOR, CIRCLE, RED, SQUARE, BLUE.]

(Ideate red 1) (Ideate square 2) (Ideate above 3) (Ideate circle 4) (Out-of X 1) (Out-of X 2) (Out-of X [...]) (Objectify [...] Y) (Relatify [...] 3) (Out-of Y [...])

[...] will also be used to illustrate the SPEAK and UNDERSTAND programs described shortly. The first, GRAMMAR1, is a simple artificial grammar. The second, GRAMMAR2, is a more complex grammar for a [...]. Both are defined by the rewrite rules in Table 1. GRAMMAR1 was [...]mally different from English word order. Its sentences [are to] be read as asserting that the first noun phrase has t[he relation named by the] last word to the second noun phrase. For purpose[s of ...] the words of these languages are English, but they need not be. GRAMMAR1 is a finite language without recursion. In contrast, in GRAMMAR2 the NP element has an [...] potential infinite embedding of constructions.

In both grammars, it is assumed that above and below are connected to the same idea, as are right-of and left-of. The words differ in the assignment of their NP arguments to subject and object roles. Thus the difference between the word pairs is syntactic. This is indicated by having the words belong to two word classes, RA and RB. Thus, UNDERSTAND with GRAMMAR2 would derive the same HAM representation in Figure 3 for the sentences The red square is above the circle and The circle is below the red square. It would have been possible to generate distinct representations for these two sentences. I think this would have been less psychologically interesting. Basically, the network grammar makes the inference that A below B is equivalent to B above A and encodes the latter.
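The inference that A below B is encoded as B above A can be illustrated with a minimal sketch. The function name `encode` and the triple representation are assumptions made for illustration only; just the word classes RA and RB and the converse pairs come from the text, and this is not LAS's actual HAM encoding.

```python
# Hypothetical illustration: RB words are stored as their RA converse
# with the two noun-phrase arguments swapped.
CONVERSE = {"below": "above", "left-of": "right-of"}  # RB word -> RA word


def encode(subject, relation, obj):
    """Return a (subject, relation, object) triple in canonical RA form."""
    if relation in CONVERSE:
        # RB word: swap the arguments and store the RA relation instead.
        return (obj, CONVERSE[relation], subject)
    # RA word: the arguments are already in canonical order.
    return (subject, relation, obj)
```

Under these assumptions, The circle is below the red square and The red square is above the circle both map to the same triple, which is the sense in which the difference between the word pairs is purely syntactic.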
TABLE 1
The Two Test Grammars

GRAMMAR1
  S     → NP NP RA
        → NP NP RB
  NP    → SHAPE (COLOR) (SIZE)
  SHAPE → square, circle, etc.
  COLOR → red, blue, etc.
  SIZE  → large, small, etc.
  RA    → above, right-of
  RB    → below, left-of

GRAMMAR2
  S      → NP is ADJ
         → NP is RA NP
         → NP is RB NP
  NP     → (the, a) NP* (CLAUSE)
  NP*    → SHAPE
         → ADJ SHAPE
  CLAUSE → that is ADJ
         → that is RA NP
         → that is RB NP
  SHAPE  → square, circle, etc.
  ADJ    → red, big, blue, etc.
  RA     → above, right-of
  RB     → below, left-of

Figure 5 illustrates the parsing networks for the grammars. It should be understood that these networks have been deliberately written in an inefficient manner. For instance, note in GRAMMAR1 that there are two distinct paths in the main START network. The first is for those sentences with RA relations and the second for those sentences with RB relations. If a sentence input to UNDERSTAND has an RB relation, UNDERSTAND will first attempt to parse it by the first branch. The two noun phrase branches will succeed but the relation branch will fail. UNDERSTAND will have to back up and try the second branch that leads to RB. This costly back-up is not really necessary. It would have been possible to have constructed the START network in the following form:

[Diagram: START, NP, NP, then a branch to STOP; largely illegible in the source.]

[...] not branch until the critical re[lation word ...].

Table 2 provides a formal specification of the information stored in LAS's network grammars. A node either has a number of arcs proceeding out of it (1a) or it is a stop node (1b). In speaking and understanding, LAS will try to find some path through the network ending with a stop node. Each arc consists of some condition that must be true of the sen[tence ...] to be [...]ed in parsing (understanding) the sentence. [...] be taken if the condition is met. This acti[on ...] conceptual structure to correspond to the m[eaning ...] that point. Finally, an arc includes speci[fication of the node to which] control should transfer after performing the [action. An action consists of] zero or more HAM memory commands (rule 3).
[A condition can consist of zero] or more memory commands also (rule 4a). These [must be] true of the incoming word. Alternatively, [a condition can be a] push to an embedded network (rule 4b). For instance, [suppose the structure] in Figure 3 were to be spoken using GRAMMAR1. The START ne[twork would] be called to realize the X is above Y proposition. The embedded NP network would be called to realize the X is red and X is square propositions. In pushing to a network two things must be specified: NODE, which is the embedded network, and VAR, which is the memory node at which the main and embedded propositions intersect. The element t in rule 4b is a place-holder for information that is needed by the control mechanisms of the UNDERSTAND program. The three rules 6a, 6b, and 6c specify three types of arguments that memory commands can have. They can either directly refer to memory nodes, or refer to the current word in the sentence, or refer to variable[s ...].

[Figure 5. The network grammars used by LAS. Panels: networks for GRAMMAR1 and networks for GRAMMAR2.]

TABLE 2
[Title illegible in the source.]

  NODE      → ARC*                          (1a)
            → stop                          (1b)
  ARC       → CONDITION ACTION NODE         (2)
  ACTION    → COMMAND*                      (3)
  CONDITION → COMMAND*                      (4a)
            → push VAR t NODE               (4b)
  COMMAND   → FUNCTION ARG ARG              (5)
  ARG       → memory node                   (6a)
            → word                          (6b)
            → X1, X2, [...], X5             (6c)
  FUNCTION  → out-of, objectify, relatify, ideate   (7)

Table 3 provides the encoding of the network for GR[AMMAR1]. Note that there tends to be a 1-[1 correspondence between ...] and LAS network[s ...] each network expresses just [...] calls one [...] to express [...]. [This correspon]dence is not quite [...] in GRAMMAR1 or G[RAMMAR2 ...].

These grammar networks have a number of [... for se]ntence comprehension and generation. [...] SPEAK
and UNDERSTAND use the same network for sent[ence ...]. Thus, LAS is the first extant system to have a uniform grammatical notation for its parsing and generation systems. In this way, LAS has only to induce one set of grammatical rules to do both tasks. Such networks are modular in two senses. First, they are relatively independent of each other. Second, they are independent of the SPEAK and UNDERSTAND programs that use them. This modularity greatly simplifies LAS's task of induction. LAS need only induce the network grammars; the interpretative SPEAK and UNDERSTAND programs represent innate linguistic competences. Finally, the networks themselves are very simple, with limited conditions and actions. Thus, LAS need consider only a small range of possibilities in inducing a network. The [for]malism gains its expressive power by the embedding of networks. [Because of] network modularity, the induction task does not increase with the complexity of embedding.

It might be questioned whether it is really a virtue to have the same representation for the grammatical knowledge both for understanding and production. It is a common observation that children's ability to understand sentences precedes their ability to generate sentences. LAS would not seem to be able to simulate this basic fact of language learning. However, there may be reasons why child production does not mirror comprehension other than that different grammatical competences underlie the two. The child may not yet have acquired the physical mastery to produce certain [...]. [This] is the case, for instance, with Lenneberg's (1962) anarthric child who understood [...] instruction, but [...]. [The remainder of this paragraph is illegible in the source. Recoverable fragments: "of production", "possibility is that the", "not understand passives".]

TABLE 3
The construction of GRAMMAR1

  (DEFPROP START PATH
    (((PUSH X1 T NP) ((OUT-OF X1 X5)) S2)
     ((PUSH X1 T NP) ((OBJECTIFY X5 X1)) S4)))
  (DEFPROP S2 PATH
    (((PUSH X2 T NP) ((OBJECTIFY X5 X2)) S3)))
  (DEFPROP S3 PATH
    ((((IDEATE WORD X4) (OUT-OF WORD *RA)) ((RELATIFY X5 X4)) STOP)))
  (DEFPROP S4 PATH
    (((PUSH X2 T NP) ((OUT-OF X2 X5)) S5)))
  (DEFPROP S5 PATH
    ((((IDEATE WORD X4) (OUT-OF WORD *RB)) ((RELATIFY X5 X4)) STOP)))
  (DEFPROP NP PATH
    ((((IDEATE WORD X4) (OUT-OF X4 *SHAPE)) ((OUT-OF X1 X4)) NP2)))
  (DEFPROP NP2 PATH
    (((PUSH X1 T COLOR) NIL NP3)
     (NIL NIL NP3)))
  (DEFPROP NP3 PATH
    (((PUSH X1 T SIZE) NIL STOP)
     (NIL NIL STOP)))
  (DEFPROP COLOR PATH
    ((((IDEATE WORD X4) (OUT-OF X4 *COLOR)) ((OUT-OF X1 X4)) STOP)))
  (DEFPROP SIZE PATH
    ((((IDEATE WORD X4) (OUT-OF X4 *SIZE)) ((OUT-OF X1 X4)) STOP)))
  (TALK)
  ((IDEATE SQUARE X1) (IDEATE CIRCLE X2))
  ((OUT-OF X1 *SHAPE) (OUT-OF X2 *SHAPE))
  ((IDEATE RED X3) (IDEATE GREEN X4))
  ((OUT-OF X3 *COLOR) (OUT-OF X4 *COLOR))
  (LISP SETQ X1 NIL)
  ((IDEATE SMALL X5) (IDEATE LARGE X1))
  ((OUT-OF X5 *SIZE) (OUT-OF X1 *SIZE))
  NIL
  (TALK)
  ((IDEATE TRIANGLE X1) (IDEATE BLUE X2) (IDEATE MEDIUM X3))
  ((OUT-OF X1 *SHAPE) (OUT-OF X2 *COLOR) (OUT-OF X3 *SIZE))
  (LISP SETQ X1 NIL)
  (LISP SETQ X2 NIL)
  ((IDEATE RIGHT-OF X1) (IDEATE ABOVE X2))
  ((OUT-OF RIGHT-OF *RA) (OUT-OF ABOVE *RA))
  ((OUT-OF LEFT-OF *RB) (OUT-OF BELOW *RB))
  ((IDEATE LEFT-OF X1) (IDEATE BELOW X2))
  NIL
[The opening of this passage is illegible in the source. Recoverable fragments: "between subject", "when asked to", "Similarly", "precedes", "producing the", "the measures of produc[tion]".]

[SPEAK starts] with a HAM network of propositions tagged as to-be-spoken and a topic o[f the] sentence. The topic of the sentence will correspond to the first meaning-bearing [element ...]. SPEAK searches through its START network looking for a [path that expresses a to-be-spo]ken proposition attached to the topic and which expresses [the to]pic as first element. It determines whether a path accomplishes this by evaluating the actions associated with a path and determining if they created a structure that appropriately matches the to-be-spoken structure. When it finds such a path it uses it for generation.

Generation is accomplished by evaluating the conditions along the path. If a condition involves a push to an embedded network, SPEAK is recursively called to speak some sub-phrase expressing a proposition attached to the main proposition. The arguments for a recursive call of PUSH are the embedded network and the node that connects the main proposition and the embedded proposition. If the condition does not involve a PUSH it will contain a set of memory commands specifying that some features be true of a word. [SPEAK] will use these features to determine what the word is. The word so determined will [...].

As an example, consider how SPEAK [would ge]nerate a sentence corresponding to the HAM structure in Figure 6 using the English-like grammar in Figure 5. Figure 6 contains a set of propositions about three [objects] denoted by the nodes G246, G195, and G182. Of node G246 it is ass[erted that] it is a triangle, and that G195 is right of it. Of G195 it is ass[erted that] it is a square and that it is above G182. Of G182 it is asserted [that it is] square, small, and red. Figure 7 illustrates the generation of this from GRAMMAR2. LAS enters the START network intent on producing an utterance about G195.
Thus, the topic is G195 (it could have been G246 or G182). The first path through the network involve[s predicating an adjective] of G195, but the[re is nothing in the adjective class ...]. The second path through the START network corresponds [to something LAS can] say about G195: it is above G182. Therefore, LAS [takes this as the] main proposition. First, it must find some noun phrase to express G195. The substructure under G195 in Figure 7 reflects the construction of this [...] subnetwork. The NP network is called, which prints the and calls NP1, which retrieves square and calls CLAUSE, which prints that, is, and right-of and then recursively calls NP to print the triangle. Similarly, recursive calls are made on the NP1 network to express G182 as the small red square.

The actual sentence generated is dependent on choice of topic [for the] START network. Given the same to-be-spoken network, but the topic G246, SPEAK generated A triangle is left-of a square that is above a small red square. Given the topic G182 it generated A [...] square that is below a square that is right-of a triangle is small. Note how the choice of the relation words left-of vs. right-of and of above vs. below is dependent on choice of topic.

It is interesting to inquire what is the linguistic power of LAS as a speaker. Clearly it can generate any context-free language since its transition networks correspond, in structure, to a context-free grammar. However, it turns out that LAS has certain context-sensitive aspects because its productions are constrained by the requirement that they express some well-formed HAM conceptual structure. Consider two problems that Chomsky (1957) regarded as not handled well by context-free grammars. The first is agreement of number between a subject NP and verb. This is hard to arrange in a context-free grammar because the NP is already built by the time the choice of verb number must be made. The solution is trivial in LAS: when the NP and verb are spoken their
number is determined by inspection of whatever concept in the to-be-spoken structure underlies the subject. The other Chomsky example involves the identity of selectional restrictions for active and passive sentences. This is also achieved automatically in LAS, since the restrictions in both cases are regarded simply as reflections of restrictions in the semantic structure from which both sentences are spoken.

While LAS can handle those features of natural language suggestive of context-sensitive rules, it cannot handle examples like languages of the form [formula illegible in the source] which require context-sensitive grammars. It is interesting, however, that it is hard to find natural language sentences of this structure. The best I can come up with are respectively-type sentences, e.g., John and Bill hit and kissed Jane and Mary, respectively. This sentence is of questionable acceptability.

[Figure 6. The to-be-spoken HAM network for the SPEAK program.]

[Figure 7. A tree structure showing the network calls and word output in generating a sentence about G195 which expressed the information contained in Figure 6.]

[...] consider the possibility that the failure may [...] on that path [...]. It is possible to have to back into a network a second time to [...]. [...] the UNDERSTAND prog[ram ... con]trol structure [...]. Perhaps an English example would be useful to motivate the ne[ed for this] control structure. [Compare t]he two sentences The Democratic party hopes to win in '76 with The Democratic party hopes are high for '76. A main parsing network would call a noun phrase network to identify the first noun phrase. Suppose UNDERSTAND identi[fied The Democ]ratic party.
[...] elements in the second sentence would indicate [... T]herefore, the main network would have to re-enter the [noun-phrase network for a d]ifferent parsing to retrieve The Democratic [party hopes. When it re-ent]ered the noun-phrase network to retrieve [another parsing,] it must remember which parsings it tried the first time so that it does [not retr]ieve the same old parsing. The complexities of this control struct[ure are descr]ibed in a more complete report (Anderson, 1975). Here I will [just give] a general [description of the] structure of the [pro]gr[am. UNDERSTAND tries] to find some path [through the] START network which will [result in a complet]e parsing of the sentence. [It] evaluates the acceptability o[f a path by] evaluating the conditions associated with that path. [A c]ond[iti]on ma[y require] that certain features be true of words in the sentence. This is determined by checking memory. Alternatively, a condition can require a push to an embedded network. This network must parse some subphrase of the sentence. When LAS finds an acceptable path through a network it will collect the actions along that path to create a temporary memory structure to rep[resent the ...] that LAS has parsed. Thus, for instance, given the sentence, The square that is right-of the triangle is ab[ove ...], LAS would parse it in the form illustrated for Figure 6. In [...] LAS.1's understanding [...] first displayed examp[le ...] (1973) comes closest [... to this] analysis.

It is also of interest to consider the power of LAS as an acceptor of languages. It is clear that LAS as presently constituted can accept exactly the context-free languages. This is because, unlike Woods' (1970) system, actions on arcs cannot influence the results of conditions on arcs and, therefore, play no role in determining whether a string is accepted or not. However, what is interesting is that LAS's behavior as a language understander is relatively little affected by its limitations on grammatical powers. Consider the following example of where it might seem that LAS would need a context-sensitive grammar. In English noun phrases, it seems we can have an arbitrary number of adjectives.
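The path-finding behavior described above, including the back-up from the RA branch to the RB branch of GRAMMAR1, can be sketched as a small recognizer. This is only an illustrative sketch under stated assumptions: the arc encoding, the names `NETWORKS`, `traverse` and `accepts`, and the small vocabulary are hypothetical; the node names follow the Table 3 listing; and actions (the HAM memory commands) are omitted entirely, since they play no role in acceptance.

```python
# Word classes for GRAMMAR1 (vocabulary abbreviated; the table says "etc.").
WORD_CLASSES = {
    "SHAPE": {"square", "circle", "triangle"},
    "COLOR": {"red", "blue"},
    "SIZE": {"large", "small"},
    "RA": {"above", "right-of"},
    "RB": {"below", "left-of"},
}

# node -> list of arcs; each arc is (kind, label, next_node).
# kind is "word" (class test), "push" (embedded network) or None (empty arc).
NETWORKS = {
    "START": [("push", "NP", "S2"), ("push", "NP", "S4")],
    "S2": [("push", "NP", "S3")],
    "S3": [("word", "RA", "STOP")],
    "S4": [("push", "NP", "S5")],
    "S5": [("word", "RB", "STOP")],
    "NP": [("word", "SHAPE", "NP2")],
    "NP2": [("word", "COLOR", "NP3"), (None, None, "NP3")],
    "NP3": [("word", "SIZE", "STOP"), (None, None, "STOP")],
}


def traverse(node, words, i):
    """Yield every word position at which a path from `node` reaches STOP."""
    if node == "STOP":
        yield i
        return
    for kind, label, nxt in NETWORKS[node]:
        if kind == "word":
            # Condition on the incoming word: membership in a word class.
            if i < len(words) and words[i] in WORD_CLASSES[label]:
                yield from traverse(nxt, words, i + 1)
        elif kind == "push":
            # Parse a subphrase with the embedded network, then continue.
            for j in traverse(label, words, i):
                yield from traverse(nxt, words, j)
        else:
            # Empty arc: move on without consuming a word.
            yield from traverse(nxt, words, i)


def accepts(sentence):
    """True if some path through START consumes the whole sentence."""
    words = sentence.split()
    return any(j == len(words) for j in traverse("START", words, 0))
```

With an RB sentence such as square circle below, the first START arc parses both noun phrases and then fails at S3, so the generator falls through to the second arc and re-parses the noun phrases. That repeated work is the costly back-up that a merged START network would avoid.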
General Conditions for Language Acquisition

Having now reviewed how LAS.1 understands and produces sentences, I will present the three aspects of the induction program: BRACKET, SPEAKTEST, and GENERALIZE. Before doing so, it is wise to briefly state the conditions under which LAS learns a language. It is assumed that LAS.1 already has concepts attached to the words of the language. That is, lexicalization is complete. The task of LAS.1 is to learn the grammar of the language, that is, how to go from a string of words to a representation of their combined meaning. Because [it is not] concerned with learning meanings, it cannot be a very realistic [...] econd l[anguage] learning where many concepts can transfer from the [first lang]uage. I will propose extensions of LAS.1 concerned [with ...].

[Another restriction on] LAS.1 is that it works in a particularly restricted semantic [domain. It] is presented with pictures indicating relations and properties [of sm]all geometric objects. These pictures are actually encoded into [... proposi]tional network representation. Along with these pictures LAS [is given sent]ences describing the picture and an indication of that aspect [of the picture] which corresponds to the main proposition of the sentence. From [this i]nput, a network grammar is constructed. The semantic domain [is ver]y simple, but the goal is to be able to learn any natural or natural-like language which may describe that domain.

A major aspect of the LAS project is the BRACKET program. This is an algorithm for taking a sentence of an arbitrary language and a HAM conceptual structure and producing a bracketing of the sentence that i[ndicates its surface structure. T]his surface structure prescribes the hierarchy [of n]etworks required to parse the sentence. For BRACKET to succeed, four conditions must be satisfied by the infor[mation ...]:

Condition 1. All content words in the sentence correspond to element[s in the con]ceptual structure. This amounts to the claim that the teacher is a[ble to direct] the learner to conceptualize the information in his sentence. It does not
matter to the BRACKET algorithm whether there is more information in the conceptual structure than in the sentence.

Condition 2. The content words in the sentence are connected to the elements in the conceptual structure. Psychologically, this amounts to the c[laim t]hat lexicalization is complete. That is, the learner knows the meanings of the wor[ds].

Condition 3. [Th]e surface structure interconnecting the content words is isomorphic in its connectivity to a language-free prototype structure.

Condition 4. The main proposition [in the] conceptual structure is indicated.

[Conditions 3 and 4 re]quire considerab[le ...] prototype [...]. I will explain why [...]. Consider Panel (a) of Figure 8 which ill[ustrates the prototype] structure for the series of propositions in the English sentence The red square is above the small circle. Panel (b) illustrates a graph deformation of that structure giving the surface structure of the sentence. [Note] how elements within the same noun phrase are appropriately assigned to the same subtree. Note that the prototype structure is not specific with respect to which links are above which others and which are right of which others. Although the HAM structure in Panel (a) is set forth in a particular spatial array, the choice is arbitrary. In contrast, the surface structure of a sentence does specify the spatial relation of links. It seems reasonable that all natural languages have as their semantics the same order-free prototype network. They differ from one another in (a) the spatial ordering their surface structure assigns to the network and (b) the insertion of non-meaning-bearing morphemes into the sentence. However, the surface structure of all natural languages is derived from the same graph patterns.

Panel (c) of Figure 8 shows how the prototype structure of Panel (a) can provide the surface structure for a sentence of the artificial GRAMMAR1. All the sentences of GRAMMAR1 preserve the connectivity of the underlying HAM structure.
By this criterion, at least, GRAMMAR1 could be a natural language. [However, cer]tain conceivable languages would have surface structures which [could not be deformat]ions of the underlying structure. Panel (d) illustrates such a hypothetical language with the same syntactic structure as English, but with different rules of semantic interpretation. In this language the adjective preceding the object noun modifies the subject noun. As Panel (d) illustrates, there is no deformation of the prototype structure in Panel (a) to achieve a surface structure for the sentences in the language. No matter how it is attempted some branches must cross.

LAS will use the connectivity of the prototype network to infer what the connectivity of the surface structure of the sentence must be. The network does not specify the right-left ordering of the branches or the above-below ordering. The right-left ordering can be inferred simply from the ordering of the words in the sentence. However, to specify the above-below ordering, BRACKET needs one further piece of information. Figure 9 illustrates an alternate surface structure that could have been assigned to the string in Figure 8 (c). It might be translated into English syntax as [...] the small thing [...] below the red square. Clearly, as these two [fig]ures illustrate, the HAM network and the sentences are not enough to specify the hierarchical ordering of subtrees in the surface structure. The difference between the sentences in Figure 8 (c) and 9 is the choice of which proposition is principal and which is subordinate. If BRACKET is also given information as to the main proposition it can then unambiguously retrieve the sentence's surface structure.

The assumption that BRACKET is given the main propo[sition amounts] to the claim that the teacher can direct the learne[r to what is being] asserted in the sentence. Thus, in Panel (c), the teacher would direct the learner to the picture of a red triangle above a small circle.
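The graph deformation condition can be made concrete with a small sketch. It rests on a simplifying assumption that is not the BRACKET algorithm itself: the prototype structure is written as a tree rooted at the main proposition, each leaf is the sentence position of a content word, and "no branches cross" is tested as "every subtree covers a contiguous run of positions". The names `leaves` and `is_deformation` are hypothetical.

```python
def leaves(tree):
    """Collect the word positions under a subtree (an int is a single word)."""
    if isinstance(tree, int):
        return [tree]
    positions = []
    for sub in tree:
        positions.extend(leaves(sub))
    return positions


def is_deformation(tree):
    """True if every subtree spans a contiguous run of word positions.

    A subtree whose words are interrupted by a word belonging elsewhere
    cannot be drawn over the sentence without crossing branches.
    """
    if isinstance(tree, int):
        return True
    pos = sorted(leaves(tree))
    contiguous = pos == list(range(pos[0], pos[-1] + 1))
    return contiguous and all(is_deformation(sub) for sub in tree)
```

For the content words red(0) square(1) above(2) small(3) circle(4), the English reading groups them as `[[0, 1], 2, [3, 4]]` and passes. The hypothetical language of Panel (d), in which small modifies the subject, gives `[[0, 1, 3], 2, [4]]` and fails because above falls inside the subject's span.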
The teacher would have to assume both that the learner properly conceptualized the picture and that he also realized the aboveness relation was what was being asserted in the picture.

[Figure 8. The surface structures of the sentences in (b) and (c) are deformations of the HAM structure in Panel (a) [...]. Recoverable labels: SQUARE, CIRCLE, SMALL, RED, BELOW.]

[Figure 9. Alternate surface structure for the sentence in Figure 8c.]

More on the Graph Deformation Condition

I think that the graph deformation cond[ition ...] of a universal property of language. However, to make [...] it is clear that something other than the HAM network will have to [...]. [It] works well [...] grouped together. [...] are closer together [a]nd open are closer together. If [...] sentences which alternated words [...] there is no deformation of the structure [...] type, LAS could n[ot ...] groups. For ins[tance ...] would provide [...] or John opened with a key the door [...] sentences [...] the HAM structure to cross. This English sentence [...] violate [the] deformation condition for Figure 10 [...] something like the case [... ar]guments are equally accessible from [...] posed by the verb open is one posed b[y ...] and its arguments [...] while it is likely [...] some natural language. There are two ways to deal [with this. One] could resort to a memory representation like (b). [However, there are] significant considerations that motivate the HAM [representation in] (a). Moreover, representations like (b) finesse on[e of the ...] questions in language acquisition: how we learn the [...] verbs. To address this question we need a represen[tation ... mul]ti-argument verbs into a representation like [...] semantic function of the case arguments. Learning the role [... l]anguage then involves learning how to assign it [t]o a structure like (a). I will sketch [a] system to do this [...].

If we keep the HAM representations then some changes are required in BRACKET['s] graph deformation condition.
What is characteristic of multi-argument verbs in HAM is that the arguments are interconnected by causal relations as in (a). Thus, BRACKET should be made to treat all the terminal argument structures as defining a single level of nodes in a graph structure [con]nected to a single root node. That is, BRACKET can treat a HAM structure such as (a) as if it were (b) for purposes of utilizing the graph deformation condition. In fact, BRACKET already does this in the current implementation.

[Figure 10. Alternative prototype structures for the sentence John opened the door with a key. The HAM structure in (a) introduces too many distinctions. Recoverable labels: JOHN, KEY, CAUSE, DOOR, OPEN.]

The Details of BRACKET's Output

So far, only a description of how one would retrieve the surface structure connecting the content words of the sentence has been given. Suppose BRACKET were given A triangle is left-of a square that is above a small red square. A bracketing structure must be imposed on this sentence which will also include the function words. Given this sentence and the conceptual structure in Figure 6, BRACKET returned

(G257 (G246 G247 a triangle) is left-of (G195 G196 a square (G195 G225 that is above (G182 G183 a small (G182 G185 red (G182 G184 square))))))

The main proposition is G257, which is given as the first term in the bracketing. The first bracketed sub-expression describes the subject noun [phrase. The first] element in the sub-expression, G246, is the node that links the [...]. The first two words a [...]. The next two words is [...] propositions corresponding [to] these. The re[s]t corresponds to a description of the element G195. The first embedded proposition, G196, asserts this object is a square and the second proposition, G225, asserts that G195 is above G182. Note that the G225 proposition is embedded as a subexpression within the G196 proposition. The last element in the G225 proposition is (G182 G183 a small (G182 G185 red (G182 G184 square))).
This expression [...] G182 has in it three propositions, G183, G[185, and G184 ...].

The above examp[le illustrates the outp]ut of BRACKET. Abstractly, the output of BRACKET may be specified by the following three rewrite rules:

1. S → proposition element*
2. element → word
3. element → (topic S)

That is, each bracketed output is a proposition node followed by a sequence of elements (rule 1). These elements are either rewritten as words (rule 2) or [as] bracketed subexpressions (rule 3). A bracketed subexpression begins with a topic node which indicates the connection between the embedded and embedding propositions. The elements within an expression are either non-meaning-bearing words or elements corresponding to subject, predicate, relation and object in the proposition. Note that BRACKET induces a correspondence between a level of bracketing and a single proposition. Each level of bracketing will also correspond to a new network in LAS's grammar. Because of the modularity of HAM propositions, modularity is achieved for the grammatical networks. When a number of embedded propositions are attached to the same node, they are embedded within one another in a right-branching manner.

The insertion of function words into the bracketing is a troublesome problem because there are no semantic features to indicate where they belong. Consider the first word a in the example sentence above. It could have been placed in the top level of bracketing or in the subexpression containing triangle. Currently, all the function words are placed in the same level as the content word to their right. The bracketing is closed immediately after this content word. Therefore, is is not placed in the noun-phrase bracketing. This heuristic seems to work more often than not. However, there clearly are cases where it will not work. Consider the sentence The boy who Jane spoke to was deaf. The current BRACKET program would return this as ((The boy who Jane spoke) to was deaf).
That is, it would not identify to as in the relative clause. Similarly, non-meaning-bearing suffixes like gender would not be retrieved as part of the noun by this heuristic. However, there is a strong cue to make bracketing appropriate in these cases. There tends to be a pause after morphemes like to. Perhaps such pause structures could be called upon to help the BRACKET program decide how to insert the non-meaning-bearing morphemes into the bracketing.

[Non-meaning-be]aring morphemes pose further problems besides [the bracketing of] such morphemes in a noun phrase. Thes[e ...] seq[uences t]hat, in principle, might constitute an arbit[rary language. Th]e[ir] semantic referent could provide no cues t[o the structure of this] language. Therefore, we would be back to [the ... in]duction task that [was c]haracterized in the i[ntroduction. It is] comfor[tin]g to observe that the structure of these st[rin]g[s of] non-meaning-bearing morphemes tends to be very simple. There are not many examples of these strings being longer than a single word. Thus, it seems that the languages constituted by these non-meaning-bearing strings are nothing m[ore] than very simple finite cardinality languages which pose, in themselves, no serious induction problems. The various stretches of non-meaning-bearing morphemes in a sentence could also have complex interdependencies, thereby posing serious induction problems. Again it does not seem to be the case that these [interdepe]ndencies exist. So once again we find that the structure of natural language [is simple just] at those points where it would have to be for a LAS-like induction program to work.

In concluding this section I should point out one type of sentence which BRACKET cannot currently handle: respectively sentences like John and Bill danced and laughed respectively. The problem with such a sentence is that [...] is the following prototype structure:

[Diagram: proposition 1 links John and dance; proposition 2 links Bill and laugh.]

Thus, John and dance are close together and so are Bill and laugh. However, the sentence intersperses these elements just in the
way that makes bracketing impossible. There are probably other examples like this, but I cannot think of them. Fortunately, this is not an utterance that appears early in child speech, nor is it a particularly simple one for adults. Of all the grammatical constructions, the respectively construction is the one that most suggests the need to have transformational rules in the grammar.

The function of SPEAKTEST is to test whether its [grammar i]s capable of generating a sentence and, if it is not, to appropriately modify the grammar so that it can. SPEAKTEST is called after BRACKET is complete. It receives from BRACKET a HAM conceptual structure, a bracketed sentence, the main proposition and the topic of the sentence. As in the SPEAK program, SPEAKTEST attempts to find some path through its network which will express a proposition attached to the topic. If it succeeds, no modifications are made to the network. If it cannot, a new path is built through the network to incorporate the sentence.

The best way to understand the operation of [SPEAKTEST is to work] through one example. The target language [LAS] was given to learn is illustrated in [...]. This is a very simple language, basically GRAMMAR1 of Table 1. It has a smaller vocabulary to make it more tractable. The reason for choosing this language is that it is of just sufficient complexity to illustrate LAS's acquisition mechanisms. In addition, LAS has learned GRAMMAR2, also given in Table 1.

Figure 11 illustrates LAS's [... as the first sentences] come in. The first sentence i[s ...] returned by BRACKET as (G174 (G115 G116 square) (G148 [...] triangle) above). G174 refers to the main proposition given as an ar[gument ...]. [Since] this is LAS's first sentence of the language, the START network will, of course, completely fail to parse the sentence. It has no grammar yet. Therefore, it induces the top-level START network in Fi[gure 1]1. A listing of the exact information induced is given below the graphical illustration in Figure 11.
Since the first two elements after G174 in the bracketed sentence are themselves bracketed, the first two arcs in the START network will be pushes to subnetworks. The third arc contains a condition on the word above. The restriction is that it be a member of the word class A199. This class was created for this sentence and only contains the word above at this point.

Having now constructed a path through the START network, SPEAKTEST checks the subnetworks in that path to see whether they can handle the bracketed subexpressions in the sentence. This is accomplished by a recursive call to SPEAKTEST. For the first phrase SPEAKTEST is called, taking as arguments the network A195, the bracketed phrase (G115 G116 square), [...]. In network A195 the word class A211 contains square, and in network A197 the word class A221 contains triangle. These two subnetworks should be the same in the final grammar, but LAS is not prepared to risk such a generalization at this point.

Note in this example how the bracketing provided by BRACKET completely specified the embedding of networks. The sentence provided by BRACKET was (G174 (G115 G116 square) (G148 G149 triangle) above). The first element, G174, was the main proposition. The second element, (G115 G116 square), was a bracketed subexpression indicating that a subnetwork should be created. Similarly, the third expression indicated a subnetwork. The last element, above, was a single word and so could be handled by a memory condition in the main network.

The second sentence is triangle square right-of. This is transformed by BRACKET to (G315 (G246 G247 triangle) (G283 G284 square) right-of). Because of the narrow one-member word classes, this sentence cannot be handled by the current grammar. However, SPEAKTEST does not add new network arcs to handle the sentence. Rather, it expands word class A199 to include right-of, word class A211 to include triangle, and word class A221 to include square.
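The treatment of these two sentences can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not LAS's actual code: the names Arc, is_node, induce_path, same_shape, and widen are hypothetical, HAM node labels are assumed to be recognizable as "G" followed by digits, and the semantic criteria that SPEAKTEST uses are replaced here by a purely positional check that the new sentence has the same bracketing shape as the existing path.

```python
from dataclasses import dataclass, field

@dataclass
class Arc:
    kind: str                                   # "push" to a subnetwork, or "word" for a word-class test
    words: set = field(default_factory=set)     # members of the word class (used by "word" arcs)
    subnet: list = field(default_factory=list)  # arcs of the subnetwork (used by "push" arcs)

def is_node(tok):
    """Assumption: HAM node labels look like G174; everything else is a word or a bracket."""
    return isinstance(tok, str) and tok[:1] == "G" and tok[1:].isdigit()

def induce_path(elements):
    """Build one arc per element: bracketed elements become pushes, bare words get a new word class."""
    path = []
    for el in elements:
        if is_node(el):
            continue                            # node labels name meanings, not words to be spoken
        if isinstance(el, tuple):
            path.append(Arc("push", subnet=induce_path(el)))
        else:
            path.append(Arc("word", words={el}))
    return path

def same_shape(path, elements):
    """True if the bracketed sentence lines up arc-for-arc with an existing path."""
    items = [el for el in elements if not is_node(el)]
    return len(items) == len(path) and all(
        same_shape(arc.subnet, el) if arc.kind == "push" and isinstance(el, tuple)
        else arc.kind == "word" and not isinstance(el, tuple)
        for arc, el in zip(path, items))

def widen(path, elements):
    """Expand word classes along the path instead of adding new arcs."""
    items = [el for el in elements if not is_node(el)]
    for arc, el in zip(path, items):
        if arc.kind == "push":
            widen(arc.subnet, el)
        else:
            arc.words.add(el)

s1 = ("G174", ("G115", "G116", "square"), ("G148", "G149", "triangle"), "above")
s2 = ("G315", ("G246", "G247", "triangle"), ("G283", "G284", "square"), "right-of")

start = induce_path(s1)        # first sentence: no grammar yet, so a path is induced
if same_shape(start, s2):      # second sentence: same shape, so only word classes grow
    widen(start, s2)
```

After both sentences, the relation arc's class holds above and right-of, and each noun phrase subnetwork's class holds square and triangle, which corresponds to the expansion of A199, A211, and A221 described in the text.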
The grammar is now at such a stage that LAS could speak or understand the sentences triangle square above or square square right-of and other sentences which it had not studied. Thus, already the first generalizations have been made. LAS can produce and understand novel sentences.

[Figure 11. LAS's treatment of the first two sentences in the induction sequence. Sentences: ((SQUARE) (TRIANGLE) ABOVE) and ((TRIANGLE) (SQUARE) RIGHT-OF). Word classes: A199 = above, right-of; A211 = square, triangle; A221 = triangle, square.]

This illustrates the type of generalizations that are made within the SPEAKTEST program. For instance, consider the generalization made when SPEAKTEST decided to use the existing network structure for the first phrase of the second sentence. This involved (a) using the same subnetwork A195 that had been created for the first sentence and (b) expanding the word class A211 to include triangle. Both decisions rested on semantic criteria. The network [...] attached to the main proposition [...] of the node G246 [...] this identity of semantic [...].

In making these generalizations, SPEAKTEST is making a strong assumption about the nature of natural language. This assumption is stated as Condition 5.

Condition 5. Words or phrases with identical semantic functions at identical points in a network behave identically syntactically.

This is the assumption of semantics-induced equivalence of syntax. It is another way in which semantic information facilitates grammar induction. It clearly need not be true of an arbitrary language. For instance, decisions made in the subject noun phrase might in theory condition syntactic decisions made in the object noun phrase. LAS, because of its heuristics in SPEAKTEST for generalization, would not be able to learn such a language.
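To make the extent of this first generalization concrete, the following Python sketch enumerates the strings licensed by the three word classes as they stand after the first two sentences. The function name generate and the flat representation of the grammar as an ordered list of word classes are assumptions made for illustration; LAS itself stores this information in its network.

```python
from itertools import product

# Word classes as they stand after the first two sentences
A211 = {"square", "triangle"}    # first noun phrase (subnetwork A195)
A221 = {"triangle", "square"}    # second noun phrase (subnetwork A197)
A199 = {"above", "right-of"}     # relation word on the START network

def generate(*word_classes):
    """Enumerate every string formed by choosing one member from each class, in order."""
    return {" ".join(choice) for choice in product(*word_classes)}

studied = {"square triangle above", "triangle square right-of"}
language = generate(A211, A221, A199)
novel = language - studied

print(len(language), len(novel))   # prints: 8 6
```

Two studied sentences thus yield a grammar covering eight, including triangle square above and square square right-of. The sketch also shows why the generalization is only safe under Condition 5: any language in which the choice in one class constrained the choice in another would be overgenerated by this product.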
Figure 12 illustrates LAS's network grammar after two more sentences have come in. Sentences 3 and 4 involve the relations below and left-of. LAS treats these as syntactic variants of above and right-of which differ in their assignment of their noun phrase arguments to the logical categories subject and object. Therefore, LAS creates an alternate branch through its START network to accommodate this possibility.

Figure 13 illustrates the course of LAS's learning. Altogether LAS will be presented 14 sentences. Subsequently, LAS will have to make three extra generalizations to capture the entire target language. Plotted on the abscissa is this learning history, and along the ordinate we have the natural logarithm of the number of sentences which the grammar can handle. This is a finite language, unlike GRAMMAR2, and therefore the number of sentences in the language will always be finite. As can be seen from Figure 13, by the fourth sentence LAS's grammar is adequate to handle 16 sentences.

LAS's grammar after the next five sentences is illustrated in Figure 14. These are LAS's first encounters with two-word noun phrases. All five sentences involve the relations right-of and above and therefore result in the elaboration of the A195 and A197 subnetworks. Consider the first sentence, square red triangle blue above, which is retrieved by BRACKET as (G329 (G270 G271 square (G270 G272 red)) (G303 G304 triangle (G303 G305 blue)) above). Consider the parsing of the first noun phrase. Note that the adjective is embedded within the larger noun phrase. This is an example of the right embedding which BRACKET always imposes on a sentence. This will cause SPEAKTEST to create a push to an embedded network within its A195 subnetwork. As can be seen in Figure 14, the existing arc containing the A211 word class is kept to handle square. Two alternative arcs are added, one with a push to
[Figure 12. LAS's grammar after sentences 3 and 4. Word classes shown: above, right-of; below, left-of; square, triangle.]

[Figure 13. The growth of LAS's grammar with its learning history. Abscissa: sentences studied (1 to 14), followed by generalizations (1 to 3). Curves: true sentences; true sentences plus overgeneralizations.]

[Figure 14. Additions to LAS's grammar after studying:
1. SQUARE RED TRIANGLE BLUE ABOVE
2. TRIANGLE LARGE SQUARE SMALL RIGHT-OF
3. TRIANGLE RED TRIANGLE RED ABOVE
4. SQUARE SMALL TRIANGLE RED RIGHT-OF
5. SQUARE BLUE TRIANGLE LARGE RIGHT-OF
The new adjective word classes in both noun phrase subnetworks contain small, blue, large, red.]

[...] network [...] with a NIL transition. Within the [...] network the word class [...] the word red. Suppose [...] sequence [...] phrase [...] is fully parsed. [...]

[...] semantics-induced equivalence of syntax [...] illustrate how [...] in natural language. [...] followed by any noun. Suppose [...] The + boy + 's [...] network illustrated in [...] conclusion that foots is the [...]. This is, of course, [...] (e.g., Ervin, 1964). What [...] overgeneralize [...] such morphemic rules is [...] there are a number of alternatives and no semantic basis to choose between them.
Because of its principle of semantics-induced equivalence of syntax, LAS will overgeneralize in those situations. Apparently, children are operating under a similar rule. LAS needs to be endowed with a mechanism to allow it to recover from such overgeneralizations. Therefore, one of the future additions to LAS will have to be a RECOVER program. Consider how it would work with this pluralization example. Suppose LEARNMORE receives the sentence The feet are above the triangle. In attempting to analyze the sentence in SPEAKTEST, the plural foots will be generated but will mismatch the sentence. RECOVER has as its function to note such mismatches. Since it is possible that there are two alternate ways of expressing plurality, RECOVER cannot assume its grammar is wrong. Rather, it will interrupt the information flow and check the acceptability of The foots are above the triangle. That is, RECOVER will explicitly seek negative information. Upon learning that the expression is ungrammatical, RECOVER will take foot out of the word class that is pluralized by 's.

To accomplish this I would have to put within LAS some mechanism that will segment words into their morphemes.
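Since RECOVER is described only as a future addition, the following Python sketch is a guess at its control flow rather than a description of any existing program. The function recover, its return strings, and the ask_informant callback (standing in for the step of checking acceptability with a speaker of the language) are all hypothetical, and the plural is formed here by appending a plain "s" to the stem.

```python
def recover(plural_class, stem, observed, ask_informant):
    """Handle a mismatch between the grammar's predicted plural and the observed word.

    plural_class  -- set of stems the grammar currently pluralizes with the regular suffix
    stem          -- the stem whose plural was generated
    observed      -- the plural form that actually appeared in the input sentence
    ask_informant -- hypothetical callback: returns True if a form is acceptable
    """
    predicted = stem + "s"
    if stem not in plural_class or predicted == observed:
        return "no mismatch"
    # A mismatch alone does not prove the grammar wrong: the language might
    # allow two ways of expressing plurality. So seek negative information.
    if ask_informant(predicted):
        return "both forms acceptable"
    # The predicted form was rejected: retract the overgeneralization.
    plural_class.discard(stem)
    return "removed from class"

regular_plurals = {"foot", "square", "triangle"}
outcome = recover(regular_plurals, "foot", "feet", ask_informant=lambda form: False)
print(outcome, sorted(regular_plurals))
```

The key design point taken from the text is that the word class is changed only after an explicit negative judgment; a mismatch by itself leaves the grammar untouched.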