Proposal to Use the SUMEX-AIM Resource for Computer Simulation of Language Acquisition

John R. Anderson
Human Performance Center
University of Michigan
Ann Arbor, Michigan

The purpose of this research is to understand language acquisition. There has been a great deal of research on first language acquisition in children, second language learning by adults, and learning of artificial languages by laboratory subjects. The principal goal of this research is not getting more experimental evidence. Rather it is to develop a working computer simulation model that can learn natural languages. The model would attempt to explain the already available set of experimental facts. It is also hoped that such a model would be a contribution to the artificial intelligence goal of developing language understanding systems. Some of the detailed plans of the research are described in the accompanying grant proposal that was awarded by NIMH (grant number 1 RO1 MH26383-01). The period of this award is May 1, 1975 to May 1, 1977.

That proposal states an intention to use augmented transition networks as the basic grammatical formalism. I have already completed some initial learning programs using the augmented transition network formalism. The very earliest of this work is described in the NIMH proposal. More recently I have decided to try to develop a production system formalism as an alternate to the augmented transition network. There are three main reasons for this switch in representational formalism. First, I think it is easier to represent the grammatical knowledge contained in highly inflected languages (e.g., Finnish, Latin) by production systems rather than augmented transition networks. Second, I think it is easier to represent human information processing limitations in terms of production systems. Third, I think production systems serve as a means of representing non-linguistic procedures such as inference-making. Therefore, a theory of induction of production systems for language has the promise of generalizing to the induction of other human cognitive skills.

I have been using the SUMEX facility in a pilot project this summer. I have been bringing up a version of my production system called ACT on this facility. It is hoped that in a few months this program will be in a sufficiently developed form that other SUMEX users may use that production system. It uses an associative network representation as its basic data base. This is a variant of the HAM propositional network that I developed earlier and is described in the accompanying proposal (pp. 23-27). In the ACT system various portions of the network are active at any point in time. The productions look for patterns of activation in the network. If these patterns exist, the productions are executed, causing external actions to be taken, building network structure, and possibly changing the state of activation of the network. Activation spreads associatively through the network and there is also a dampening process which deactivates network structure. A preliminary description of the ACT system is given in the accompanying document "An Overview of ACT." It is a chapter from a forthcoming book. The most relevant section in that chapter is from pages 11 to 25.

It was originally projected that this simulation work would be performed on the Michigan Computer System. However, there are a number of advantages of the SUMEX-AIM facility. All the programming will occur in LISP.
The INTERLISP system on SUMEX, as surmised from my own experimentation, permits programming and debugging to progress at least twice as fast as with Michigan LISP. Also, programs in INTERLISP would be more available to other A.I. users than programs in Michigan LISP. The Michigan computer is isolated from the national A.I. community, whereas I can take advantage of the connections SUMEX-AIM has through the TYMNET and the ARPANET. Finally, the SUMEX-AIM facility provides free computing resources and so will relieve some of the strain on my tight research budget.

It is intended that there will be continued development and testing of this production system formalism as a model of human information processing. There are plans to build substantial ACT production system models for language generation and understanding and for inference making.

Responses to SUMEX-AIM Questionnaire

A.2. Read the accompanying proposal.

C.3. The research is currently supported by a grant from NIMH (grant number 1 RO1 MH26383-01) for the period May 1, 1975 to May 1, 1977. The amount of the award for the first year is $20,000. This is to pay for a programmer, computer time, and rental of a terminal.

C.4. Read the accompanying proposal. It is expected that this research will have some general contribution to make to the development of language understanding systems, the modeling of human cognitive processes, and the development of production systems.

C.5. None.

There should be no difficulty in making my programs generally available to users of SUMEX-AIM. Yes. Yes. Read the next to last paragraph in the accompanying proposal. The INTERLISP language on SUMEX is the principal requirement of my research. I do not anticipate requiring any additional systems programs not already available at SUMEX.

Estimated requirements per month: 100 connect hours, 2 CPU hours, 1500 file pages. The principal times of use in Ann Arbor would probably be 0600-0900 and 1800-2100. I intend to communicate with SUMEX via the TYMNET. I would either use the private node in Ann Arbor or the public node in Detroit. The toll cost to Detroit could be met from my current grant, as could the cost of terminal rental. Not really relevant.

BIOGRAPHICAL SKETCH

NAME AND TITLE: John R. Anderson, Junior Fellow
BIRTHDATE: Aug. 27, 1947
PLACE OF BIRTH: Vancouver, B.C., Canada
NATIONALITY: Canadian (J-1 exchange visitor's visa)

EDUCATION:
University of British Columbia, Vancouver - B.A., 1968, Psychology
Stanford University, Stanford, California - Ph.D., 1972, Psychology

HONORS: 1968 - The Governor-General's Gold Medal (head of graduating classes in Arts and Sciences, University of British Columbia)

MAJOR RESEARCH INTEREST: Language and Human Memory
ROLE IN PROPOSED PROJECT: Principal Investigator

RESEARCH SUPPORT: NSF - Recognition memory for sentences: a process model. Sept. 1, 1973 - Sept. 1, 1975; $40,000 ($20,265 for year 1); 50% of research effort. Grant number GB-40298.

RESEARCH AND/OR PROFESSIONAL EXPERIENCE (starting with present position, list training and experience relevant to area of project; list all or most representative publications):
Junior Fellow, University of Michigan, 1973 - present.
Assistant Professor, Yale University, 1972 - 1973.
Numerous experiments in graduate school in human memory under the supervision of Gordon H. Bower at Stanford University, 1968 - 1972.

Publications:

Reber, A. S. and Anderson, J. R. The perception of clicks in linguistic and non-linguistic messages. Perception & Psychophysics, 1970, 8, 81-89.

Anderson, J. R. and Bower, G. H. On an associative trace for sentence memory. Journal of Verbal Learning and Verbal Behavior, 1971, 10, 673-680.

Anderson, J. R. FRAN: A simulation model of free recall. In G. H. Bower (Ed.), The Psychology of Learning and Motivation, Vol. 5. New York: Academic Press, 1972.

Anderson, J. R. and Bower, G. H. Recognition and retrieval processes in free recall. Psychological Review, 1972, 79, 97-123.

Anderson, J. R. A stochastic model of sentence memory. Doctoral dissertation, Stanford University, June, 1972.

Anderson, J. R. and Bower, G. H. Configural properties in sentence memory. Journal of Verbal Learning and Verbal Behavior, 1972, 11, 594-605.

Anderson, J. R. and Bower, G. H. Human Associative Memory. Washington: Winston & Sons, 1973.

Reder, L. M., Anderson, J. R., & Bjork, R. A. A semantic interpretation of encoding specificity. Journal of Experimental Psychology, 1974, 102, 648-656.

Anderson, J. R. Verbatim and propositional representation of sentences in immediate and long-term memory. Journal of Verbal Learning and Verbal Behavior, in press.

Anderson, J. R. and Bower, G. H. A propositional theory of recognition memory. Memory & Cognition, in press.

Anderson, J. R. and Bower, G. H. Interference in memory for multiple contexts. Memory & Cognition, in press.

Anderson, J. R. Retrieval of propositional information from long-term memory. Cognitive Psychology, in press.

Anderson, J. R. and Hastie, R. Individuation and reference in memory: proper names and definite descriptions. Cognitive Psychology, in press.

Anderson, J. R. Computer simulation of a language-acquisition system, first report. In R. L. Solso (Ed.), Information Processing and Cognition: The Loyola Symposium, in press.

Anderson, J. R. Language acquisition by computer and child. To appear in S. Y. Sedelow & W. A. Sedelow (Eds.), Current Trends in Computer Use for Language Research, in preparation.

Special Note

I am in the second year of an exchange visitor's visa. I can renew the visa for another year. My wife, an American citizen, is currently petitioning to have my status changed to that of a permanent resident. Therefore, I will be able to be at the University of Michigan for the entire period of the proposed research.

COMPUTER SIMULATION OF LANGUAGE ACQUISITION

A. Introduction

1. Direction and goals of the research

Most simply stated, the purpose of this research is to understand language acquisition. There has been a great deal of research on first language acquisition in children, second language learning by adults, and learning of artificial languages by laboratory subjects. This research is not principally concerned with getting more experimental evidence. Rather it is concerned with developing an information-processing model that can be used to explain the already available set of experimental facts.
One of the principal concerns governing the design of this model is just that it be able to learn a natural language. I will show that this, in itself, is a very significant goal. It turns out that algorithms adequate to learn a natural language are quite complex. It is not possible to sit down and simply specify them verbally or with equations. This research makes use of the computer as a tool to develop and test complex models. Therefore, I have been developing a computer simulation model of language acquisition. This model is called LAS (an acronym for Language Acquisition System). Most of the proposed budget is concerned with supporting the development of this program. Input to LAS consists of sentences of the language paired with representations of their meaning. Therefore it simulates language learning in situations where a learner can figure out the meaning of the sentences from context. The simplest case of such a situation would be one in which the learner is presented with simple pictures and sentences describing them. The program constructs a grammar which allows it to go from sentences to representations of their underlying meaning. The grammar can also be used to generate sentences to convey meanings. It is also hoped that this program will make a contribution to the evolution of computer language understanding systems. Thus, the research really has two purposes, one in psychology and one in artificial intelligence.

I became interested in language acquisition through my work with a computer simulation model of human memory, described in a book by myself and Gordon Bower entitled Human Associative Memory. The principal purpose of that research was to develop a memory representation and retrieval system (called HAM) and test it in experiments. A version of HAM is used within LAS. HAM's computer program was an attempt to simulate a simple question understander which was capable of dealing with a subset of English and which was capable of using context to resolve reference. Nevertheless, it had limited abilities compared to the work of Schank (1973). As a result of my own experiences and of studying the work of others, I became pessimistic about the value of trying to represent the unbounded linguistic competence of an adult directly in terms of a fixed computer program.

Outline of Proposal

The concern in this proposal will be primarily with developing a system logically adequate for language acquisition and only secondarily with a system that simulates actual human performance. I do not think the simulation goal can be achieved until we have a characterization of the sort of system that is adequate for natural language acquisition. This emphasis on logical adequacy is clear in the organization of the proposal. I will first review the work that has been done on computer language understanding. This is important because LAS is a language understander as well as a learner. Then I will review the formal results on grammar induction. Then LAS.1 will be described. LAS.1 is a first-pass version of the LAS program adequate to learn simple languages. Then I will propose an extensive set of developments to be added to the program, aimed both at increasing its linguistic powers and making it a realistic simulation. In describing LAS.1 and the proposed extensions, I will review relevant research in the child language literature.
Finally, I will propose a series of experiments with artificial languages to check specific claims LAS makes about language learnability.

2. Computer Language Understanding

Computers have been applied to natural language processing for 25 years. There has been a succession of major reconceptualizations of the problem of language understanding, each of which constitutes a clear advance over the previous conceptions. However, any realistic assessment would concede that we are very far from a general language understanding system of human capability. The argument has been advanced that there are fundamental obstacles that will prevent this goal from ever being realized (Dreyfus, 1972). These arguments are shamefully imprecise and lacking in rigor. The best (e.g., Bar-Hillel, 1962) has to do with the extreme open-endedness of language, that an effectively unbounded variety of knowledge is relevant to the understanding process. It is boldly asserted, without proof, that it is not possible to provide the computer with the requisite background knowledge.

In reviewing the work on natural language systems, I will constantly measure them with respect to the goal of general language understanding. I appreciate that it is a legitimate artificial intelligence goal to develop a language system for some special purpose application. Such attempts are free from the Dreyfus and Bar-Hillel criticisms. However, from any psychological point of view these systems are interesting only as they advance our understanding of how language is understood in general.

Machine Translation

The first intended application of computers to natural language was machine translation. A massive effort turned out to be a dismal failure (ALPAC, 1966). Today, it is fashionable to attribute the failure to the then-current impoverished conception of language (e.g., Simmons, 1970; Wilks, 1973). The early attempts took the form of substitution of equivalent words across languages. This was augmented by use of surface structure and word associations, but at no point was the word abandoned as the principal unit of meaning. Recent work on language understanding (e.g., Schank, 1972; Winograd, 1973) has abandoned the word as the unit of meaning. It remains to be seen whether current attempts (e.g., Wilks, 1973) at machine translation have better success.

Interactive Systems

The now popular task domain for applications of computers to language is in constructing systems that can interact with the user in his own language. Question-answering systems are the most common: the user can interrogate the program about its data base and input new knowledge. The input sentence is parsed into a representation and stored, or the parsed representation is used to guide an interrogation of the data base for the answer. The inference system is critical in the answering of questions, since many answers will not be directly stored but will have to be inferred from what is in memory.

Both parsing and inferencing run into time problems. The central time problem in parsing has to do with the extreme syntactic and lexical ambiguity of natural language. Each word in a sentence admits of m syntactic and semantic interpretations, where m on the average may be as high as 10. If there are n words, m^n interpretations must be considered, although only one is intended; at m = 10, a ten-word sentence admits on the order of 10^10 candidate interpretations. The fact that language is so ambiguous was a surprising discovery of the early machine attempts at parsing (e.g., Kuno, 1965). Thus, there is exponential growth in processing time with sentence length.
To date, no heuristics have been demonstrated that change in general this exponential function of sentence length to something closer to a linear function. The human can use general context to reduce ambiguity to something approximating the linear relation.

There is also an exponential growth factor in the task of inference. Suppose there are m facts in the data base and the desired deduction is n inferences long. Then there is something like m^n possible combinations of facts to consider in finding the desired deduction. This suggests that very deep inference is difficult to achieve, and this is certainly true of our every-day reasoning. However, it also suggests that inference making should become more difficult as we know more facts (i.e., high m), which is clearly not the case. The problem facing inference systems is to select only those facts that are relevant.

Resolution theorem-proving (Robinson, 1965) is the most studied of the mechanical inference systems. It is also here that the most careful work has been done on heuristics for selecting facts from the data base. These methods include semantic resolution (Slagle, 1965), lock resolution (Boyer, 1971), and linear resolution (Loveland, 1970; Luckham, 1970). In practical applications these heuristics have served to considerably reduce the growth in computation time. However, demonstrations of the optimality of these heuristics are task-specific; there are no general theorems about their optimality. I suspect that none deal effectively in general with the problems of exponential growth.

Although there are potentially serious time problems both in parsing and inferencing, these problems have not surfaced in past programs as one might have expected. This is because these programs have all been rather narrowly constrained. Their language systems only need to deal with a small portion of possible syntactic constructions and possible word meanings. Also, because of restrictions in the domain of discourse, only a restricted set of inferences are needed.

Some of the interactive systems (ELIZA - Weizenbaum, 1966; PARRY - Colby) made no serious effort to do a complete job of sentence analysis. Keyword matching was performed to permit success in narrowly circumscribed domains. Sentences were generated by filling in pre-programmed frames. The ambition in programs like Colby's or Weizenbaum's is to give the appearance of understanding. Weizenbaum's program simulated a Rogerian psychotherapist and Colby's a paranoia patient. Because these programs made no serious attempt at language understanding, their apparent successes might just be manifestations of clever tricks.

Other attempts made more serious efforts at language understanding. They avoided the time problems inherent in parsing and inferencing by dealing with restricted task domains. Slagle's DEDUCOM (1965) dealt with simple set inclusion problems; Green, Wolf, Chomsky & Laughery (1963) with baseball questions; Lindsay (1963) with kinship terms; Kellogg (1968) with data management systems; Woods (1968) with airline schedules; Woods (1973) with lunar geology; Bobrow (1964) and Charniak (1969) with word arithmetic problems; Fikes, Hart & Nilsson (1972) with a robot world; Winograd (1973) with a blocks world. Other systems like Green and Raphael (1968), Coles (1969), Schank (1972), Schwarcz, Berger, and Simmons (1969), Anderson and Bower (1973), Rumelhart, Lindsay and Norman (1972), and Quillian (1969) have not been especially designed for specific task domains but nonetheless succeed only because they worked with seriously limited data bases and restricted classes of English input. Because the parser deals with only certain word senses and certain syntactic structures, linguistic ambiguity is much reduced.
Those programs that use general inference procedures like resolution theorem proving are notably inefficient even with restricted data bases. Winograd made extensive use of the facilities of PLANNER for directing inferencing with specific heuristic information. The validity of these heuristics depended critically on the constraints of his blocks-world task domain.

Winograd (1973) has combined good task analyses, programming skill, and the powers of advanced programming languages to create the best extant language understanding system. I have heard it seriously claimed that the Winograd system could be extended to become a general model of language understanding. All that would be needed, supposedly, is to program in all the knowledge of an adult and extend the parsing rules to the point where they handled all English sentences. Admittedly, this would be a big task requiring hundreds of man-years but, it is argued, no greater than the work that goes into writing big operating systems. Clearly, this argument is faulty if only because it does not deal with the time problems in general inferencing and general parsing. Moreover, it is also unclear whether human language understanding can be captured by a fixed program. Further, it is dubious whether it is manageable to do the bookkeeping that is necessary to assure that all the specific pieces of knowledge are properly integrated and interact in the intended ways. Our linguistic competence is not a fixed object. This is clear over the period of years, as we learn new grammatical constructions, new styles, new words, and new ways of thinking. I think this is also true over short spans of time. That is, the way humans deal with the time problems inherent in parsing and inferencing is to adjust the parsing and inferencing according to context.

Language Acquisition as the Road to General Language Understanding

The preceding remarks were meant to suggest how an adaptive language system might provide the solution to the fundamental problems in general language understanding. Rather than defining and hand-programming all the requisite knowledge, we now let the language understanding system discover that knowledge and program itself. The language acquisition system is a mechanized bookkeeping system for integrating all the knowledge required for language understanding. By its very nature it treats linguistic knowledge as a constantly changing object. So we know it would change with a changing linguistic community. We might hope that it could adapt over short periods (like hours) to its current context.

Learning systems are frequently regarded as the universal panacea for all that ails artificial intelligence. Therefore, one should be rightfully suspicious whether LAS will provide a viable route to the creation of a general language understanding system. Certainly, the initial version of LAS falls far short of the desired goal. However, with our current state of knowledge it is just not possible to evaluate LAS's pretensions as an eventual language understanding system. It is only by systematic exploration and development of LAS that we will ever be able to determine the viability of the learning approach.

Whatever the potential of the learning approach in artificial intelligence, clearly it is the only viable psychological means of characterizing human linguistic knowledge. It would be senseless to provide a catalog of all the knowledge used in language understanding. A catalog of everything is a science of nothing (a quote from T. Bever). Rather, we must characterize the mechanism that creates that knowledge and how that mechanism interacts with experience.
The Woods System

The linguistic formalisms used by LAS are very similar to Woods' augmented transition networks. This section on computer language understanding concludes with a description of Woods' system and an exposition of the suitability of his formalisms for the current project. There are three critical features that LAS requires of the formalisms that will express its grammatical knowledge. First, it should be a formalism that can be used with equal facility for language parsing and language generation. This is because it is unreasonable to assume that a child independently learns how to speak and how to understand. Second, we want a formalism for which it is easy to devise a constructive algorithm for inducing grammar. That is to say, some descriptions of grammatical knowledge are computationally easier to induce than others, even though they describe the same language. Third, we want the formalism to be explicit about the assumptions it makes about the interpretative system that uses the grammar for speaking and understanding. This is because that interpretative system is taken as innate. Thus, it is not possible to induce new programs for interpreting the grammatical rules; it is only possible to induce new grammatical rules.

A guiding consideration in this research is that these requirements are reasonably well satisfied by a finite-state transition network representation. The problem is that natural languages are fundamentally more complex than finite state languages. However, Woods has shown a way to keep some of the advantages of the finite state representation but achieve the power of a transformational grammar.

Woods' augmented transition networks are similar to and were suggested by the network grammars of Thorne, Bratley, and Dewar (1968) and Bobrow and Fraser (1970). Transition networks are like finite state grammars except that one permits as labels on arcs not only terminal symbols but also names of other networks. Determination of whether the arc should be taken is evaluated by a subroutine call to another network. This sub-network will analyze a sub-phrase of the linguistic string being analyzed by the network that called it. The recursive, context-free aspect of language is captured by one network's ability to call another. Figure 1 provides an example network taken from Woods' (1970) paper. The first network in Figure 1 provides the "mainline" network for analyzing simple sentences. From this mainline network it is possible to call recursively the second network for analysis of noun phrases or the third network for the analysis of prepositional phrases. Woods (1970) describes how the network would recognize an illustrative sentence:

To recognize the sentence "Did the red barn collapse?" the network is started in state S. The first transition is the aux transition to state q1 permitted by the auxiliary "did." From state q1 we see that we can get to state q3 if the next "thing" in the input string is an NP. To ascertain if this is the case, we call the state NP. From state NP we can follow the arc labeled det to state q6 because of the determiner "the." From here, the adjective "red" causes a loop which returns to state q6, and the subsequent noun "barn" causes a transition to state q7. Since state q7 is a final state, it is possible to "pop up" from the NP computation and continue the computation of the top level S beginning in state q3, which is at the end of the NP arc. From q3 the verb "collapse" permits a transition to the state q4, and since this state is final and "collapse" is the last word in the string, the string is accepted as a sentence (pp. 591-592).

Figure 1. A sample transition network. S is the start state. (From Woods, 1970.)
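To make this control structure concrete, the following is a minimal sketch of a recursive-transition-network recognizer written in present-day Common Lisp. It is illustrative only: the network encoding, state names, and five-word lexicon are my own choices, not Woods' actual code, and the sketch covers just the S and NP networks needed for his example sentence.

;; A network is (name (state arc ...) ...).  An arc is either (pop),
;; marking a final state, or ((cat <class>) <next-state>) /
;; ((push <network>) <next-state>).
(defparameter *lexicon*
  '((did . aux) (the . det) (red . adj) (barn . n) (collapse . v)))

(defparameter *networks*
  '((s  (q0 ((cat aux) q1))
        (q1 ((push np) q3))
        (q3 ((cat v) q4))
        (q4 (pop)))
    (np (np0 ((cat det) q6))
        (q6 ((cat adj) q6)          ; the adjective loop
            ((cat n) q7))
        (q7 (pop)))))

(defun word-category (word) (cdr (assoc word *lexicon*)))
(defun state-arcs (net state)
  (cdr (assoc state (cdr (assoc net *networks*)))))
(defun start-state (net) (car (second (assoc net *networks*))))

(defun walk (net state words)
  ;; Returns every possible remainder of WORDS once this network,
  ;; started in STATE, has accepted a prefix (nondeterminism kept).
  (loop for arc in (state-arcs net state)
        append
        (cond ((equal arc '(pop)) (list words))
              ((eq (caar arc) 'cat)
               (when (and words
                          (eq (word-category (first words)) (cadar arc)))
                 (walk net (cadr arc) (rest words))))
              ((eq (caar arc) 'push)  ; subroutine call to a sub-network
               (loop for rest in (walk (cadar arc)
                                       (start-state (cadar arc)) words)
                     append (walk net (cadr arc) rest))))))

(defun recognize (sentence)
  (and (member nil (walk 's (start-state 's) sentence)) t))

;; (recognize '(did the red barn collapse))  =>  T

Note how the push arc suspends the top-level computation exactly as in Woods' description: the NP sub-network consumes "the red barn," and each way it can pop returns a possible remainder for the calling S network to continue with.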
I have illustrated in Figure 1 what is known as a recursive transition network, which is equivalent to a context-free phrase-structure grammar. Woods' networks are in fact of much stronger computational power - essentially that of a Turing Machine. This is because Woods permits arbitrary conditions and actions on the arcs. This gives the networks the ability of transformational grammars to permute, copy, and delete fragments of a sentence. Thus, with his network formalisms Woods can derive the deep structure of a sentence. The problem with this grammatical representation is that it is too powerful and permits computation of many things that are not part of a speaker's grammatical competence. In the LAS system all conditions and actions on network arcs are taken from a small repertoire of operations possible in the HAM memory system (see Anderson & Bower, 1973). This way some context-sensitive power can be introduced into the language without introducing psychologically unrealistic powers.

In many ways the network formalisms of Woods are isomorphic in their power and behavior to the program grammars of Winograd. However, there is one critical difference. The flow of control is contained in Winograd's program grammars; a particular program is committed to a certain behavior. This is not the case in the network formalism. The flow of control is not contained in the networks; it resides in the interpretative systems which use the grammatical knowledge contained in the networks. Thus, by using different interpretative systems, the same network grammar specification can be used in different ways. This is critical to LAS's success, where three different interpreters use the same grammatical formalisms to guide understanding, generation, and language induction.

3. Research on Grammar Induction

Apparently the modern work on the problem of grammar induction began with the collaboration of N. Chomsky and G. Miller in 1959 (see Miller, 1967). There have been significant formal results obtained in this field and it is essential that we review this research before considering LAS. The approach taken in this field is well characterized by the opening remarks of a recent highly-articulate review chapter by Biermann and Feldman (1972):

The grammatical inference problem can be described as follows: a finite set of symbol strings from some language L and possibly a finite set of strings from the complement of L are known, and a grammar for the language is to be discovered. ... Consider a class C of grammars and a machine M. Suppose some G in C and some I (an information sequence) in I(L(G)) are chosen for presentation to the machine M. ... Intuitively, M identifies G if it eventually guesses only one grammar and that grammar generates exactly L(G). (pp. 31-33)

The significant point to note about this statement is that it is completely abstracted away from the problem of a child trying to learn his language. There has been virtually no concern for algorithms that will efficiently induce the subset of grammars that generate natural languages. The problem is posed in general terms.
The concern is with inducing a characterization of the well-formed strings of the language. However, this is not the task which the child faces. He must learn the mappings between conceptualizations and strings of the language. He must understand what is spoken to him and learn how to speak. For him, a characterization of the well-formed strings is only a by-product of the mapping between sentences and meanings. In the formal work on language induction, there has been virtually no concern about the contribution that semantics might make.

The grammatical inference problem as characterized by Biermann and Feldman is without any practical solutions. Workable solutions do not exist because the set of a priori possible languages is too unrestricted. Workable solutions are possible to practical problems only when it is possible to greatly restrict the candidate languages or because important clues exist which guide the induction. Chomsky (1965) argued essentially for this view with respect to the problem of a child learning his first language. He suggested that the child could take advantage of linguistic universals which greatly restricted the possible languages. I will argue that such universals exist in the form of strong constraints between the structure of a sentence and the semantic structure of the referent. These constraints provide critical cues for the induction problem.

Gold's Work

Probably the most influential paper in the field is by Gold (1967). He provided an explicit criterion for success in a language induction problem and proceeded to formally determine which learner-teacher interactions could achieve that criterion for which languages. Gold considers a language identified in the limit if after some finite time the learner discovers a grammar that generates the strings of the language. He considers two information sequences: in the first the learner is presented with all the sentences of the language, and in the second the learner is presented with all strings, each properly identified as sentence or non-sentence. Then Gold asks this question: Suppose the learner can assume the language comes from some formally characterized class of languages; can he identify in the limit which language it is? Gold considers the classical nesting of language classes - finite cardinality languages, regular (finite state), context-free, context-sensitive, and primitive recursive. His classic result is that if the learner is only given positive information about the language (i.e., the first information sequence), then he can only identify finite cardinality languages. However, given positive and negative information (i.e., the second information sequence), he can learn primitive recursive languages.

The proof that the finite state class is not identifiable with only positive information is deceptively simple. Among the finite state languages are all languages of finite cardinality (i.e., with only finitely many strings). At every finite point in the information sequence the learner cannot know whether the language is generated by one of the infinitely many finite cardinality grammars which include the sample or by an infinite finite-state grammar which includes the sample.

It is similarly easy to prove that any language in the primitive recursive class can be induced given positive and negative information. It is possible to enumerate all possible primitive recursive grammars. Assume an algorithm that proceeds through this countably infinite enumeration, one grammar after another, until it finds the correct grammar, staying with a grammar as long as the information sequence is consistent with it. An incorrect grammar G will be rejected in finite time by the information sequence - either because the sequence will contain as a non-sentence a string generated by G, or as a positive instance a string not generated by G. Since the correct grammar has some finite position in the enumeration, the algorithm will eventually consider it and stay with it.
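The enumeration argument can be rendered concretely in miniature. In the following Common Lisp sketch (my own illustration, not Gold's notation) the "grammars" are simply predicates over strings, enumerated in a fixed order, and the learner always guesses the first grammar consistent with every labeled string seen so far:

;; A toy rendering of identification by enumeration.  The three
;; candidate languages and the example data are hypothetical.
(defparameter *enumeration*
  (list (cons 'all-as      (lambda (s) (every (lambda (w) (eq w 'a)) s)))
        (cons 'even-length (lambda (s) (evenp (length s))))
        (cons 'anything    (lambda (s) (declare (ignore s)) t))))

(defun consistentp (grammar data)
  ;; DATA is a list of (string . in-language-p) pairs from the informant.
  (every (lambda (d)
           (eq (if (funcall (cdr grammar) (car d)) t nil) (cdr d)))
         data))

(defun identify (information-sequence)
  ;; Re-scan the enumeration after each datum: blind, but convergent.
  (let ((data '()))
    (dolist (example information-sequence)
      (push example data)
      (format t "after ~a guess ~a~%" example
              (car (find-if (lambda (g) (consistentp g data))
                            *enumeration*))))))

;; (identify '(((a a) . t) ((a b) . t) ((a) . nil)))
;; after ((A A) . T)   guess ALL-AS
;; after ((A B) . T)   guess EVEN-LENGTH
;; after ((A) . NIL)   guess EVEN-LENGTH   ; settled, in the limit

Because the correct language occupies some finite position in the enumeration and no later datum can refute it, the learner eventually reaches it and stays there. The sketch also makes the impracticality vivid: nothing guides the search except refutation by the data.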
There are somewhat faster enumeration algorithms than the above, but these do not change the basic character of the result. The algorithm outlined in the second proof is hopelessly slow. For instance, the position of English would be astronomical in any reasonable ordering of all possible context-sensitive languages on its terminal symbols. However, Gold also proved that there is no generally more effective technique. That is to say, given any algorithm that is more effective for some languages, one can pick some language for which the enumeration algorithm will be faster.

So, Gold leaves us with two very startling results that we must live with. First, only finite cardinality languages can be induced without use of negative information. This is startling because children get little negative feedback and make little use of what negative feedback they do get (Brown, 1973). Second, no induction technique is generally better than blind enumeration. This is startling because blind enumeration is hopeless as a practical induction algorithm for natural language. We will see how natural language can be induced despite Gold's results, but first let us review some other research of the same ilk.

One of the early attempts to provide a constructive algorithm was made by Solomonoff. That is, he attempted to define an algorithm which would construct bit by bit the correct grammar rather than enumerating possible grammars. LAS is a constructive algorithm. His ideas were never programmed and had their own logical flaws exposed by Shamir and Bar-Hillel (1962) and by Horning (1969). In part, Solomonoff has served as a straw man used to justify the enumerative approach over the constructive (e.g., Horning, 1969).

Feldman and his students have carried the Gold analyses further. Feldman (1970) provided some further definitions of language identifiability and proved Gold-like results for these. Feldman considered not only the task of inferring a grammar that generated the sample, but also the task of inducing the most simple grammar. Grammar complexity was measured in terms of number of rules and the complexity of sentence derivations. Horning (1969) provided procedures for inducing grammars whose rules have different probabilities. Biermann (1972) provided a number of efficient constructive algorithms for inducing finite state grammars when the number of states is known. This is a relatively tractable problem first formulated in 1956 by Moore; however, Moore's algorithms are much less efficient than Biermann's.

Pao (1969) formalized an algorithm for finite state grammar induction that did not require the number of states to be known in advance. A sample set of sentences was provided which utilized all the rules in the grammar. A minimal finite state network was constructed that generated exactly those sentences. Then an attempt was made to generalize by merging nodes in the network. The algorithm checked the consequences of potential generalizations by asking the teacher whether particular strings were in the target language. Pao represented these induction networks in a manner similar to Woods' networks. She found that such induction was much more efficient if she provided punctuation of where embedded networks occur - in effect, information about the sentence's surface structure. Ruff (1963) similarly found that grammar induction proceeds more easily when surface structure information is provided. Crespi-Reghizzi's program was given information about the surface structure of sample sentences to aid in the induction of operator-precedence languages, a subclass of context-free languages. For a special subset of operator-precedence languages he was able to define an algorithm that worked with only positive information. Except for the trivial result for finite cardinality languages, this is the only available result of success with just positive information.
Pao and Crespi-Reghizzi have shown that relatively efficient, constructive algorithms are possible for interesting language classes if the algorithms have access to information about the sentence's surface structure. The problem with their work is that this information is provided in an ad hoc manner. It has the flavor of cheating and certainly is not the way things happen with respect to natural language induction. Still, I think the work of Pao and of Crespi-Reghizzi is promising because, as I will show, the surface structure of the sentence may be inferred by comparing it to its semantic referent. Crespi-Reghizzi has also shown how the properties of a restricted subclass of languages can be used to reduce the reliance on negative information. While natural languages certainly have aspects that can be best captured with context-sensitive grammatical formalisms, most context-sensitive languages are ridiculous candidates for a natural language. An efficient induction algorithm should not become bogged down, as does Gold's enumeration technique, considering these absurd languages.

Grammar as a Mapping Between Sentence and Conception

There is one sense in which all the preceding work is irrelevant to the task of inducing a natural language. It has as its goal the induction of a correct syntactic characterization of a target language. But this is not what natural language learning is about. In learning a natural language the goal is to learn a map that allows us to go from sentences to their corresponding conceptual structures or vice versa. I argue that this task is easier than learning the syntactic structure of a natural language. This is not because there is any magic power in semantics per se, but because natural languages are so structured that they incorporate in a very non-arbitrary manner the structure of their semantic referents.

The importance of semantics has been forcefully brought home to psychologists by a pair of experiments by Moeser and Bregman (1972, 1973) on the induction of artificial languages. They compared language learning in the situation where their subjects only saw well-formed strings of the language versus the situation where they saw well-formed strings plus pictures of the semantic referents of these strings. In either case, the criterion test was for the subject to be able to detect which strings of the language were well-formed - without aid of any referent pictures. After 3000 training trials, subjects in the no-referent condition were at chance in the criterion test whereas subjects in the referent condition were essentially perfect.

The Role of Semantics

Results like those of Moeser and Bregman have left some with the belief that there is some magic power in having a semantic referent. However, there is no necessary advantage to having a semantic referent. The relationship between a sentence and its semantic referent could, in principle, be an arbitrary recursive relation. Inducing this relation is at least as difficult as inducing an arbitrary recursive language. This claim was in need of a proof, which I have provided (Anderson, 1975). It is too technical to reproduce here, but the basic idea is that an algorithm able to induce an arbitrary semantic relation from sentence-referent pairs could be used to identify an arbitrary language. By Gold's work, then, no induction algorithm for the semantic relation can be more effective than the impossible enumeration algorithm for identifying an arbitrary language. Thus, for it to be possible to induce the semantic relation, there must be strong constraints on the possible form of that semantic relation.

How does the semantic referent facilitate grammar induction? There are at least three ways.
First, rules of natural language are not formulated with respect to single words but with respect to word classes like noun or transitive verb which have a common semantic core. So semantics can help determine the word classes. This is much more efficient than learning the syntactic rules for each word separately. Second, semantics is of considerable aid in generalizing rules. A general heuristic employed by LAS is that, if two syntactically similar rules function to create the same semantic structure, then they can be merged into a single rule. Third, there is a non-arbitrary correspondence between the structure of the semantic referent and the structure of the sentence which permits one to punctuate the sentence with surface structure information. The nature of this correspondence will be explained later.

Siklossy's Work

The only attempt to incorporate semantics as a guide to grammar induction was by Siklossy (1971). He attempted to write a program that would be able to learn languages from the language-through-pictures books (e.g., Richards et al., 1961). The books in this series attempt to teach a language by presenting pictures paired with sentences that describe them. Siklossy's program, ZBIE, used general pattern-matching techniques to find correspondences between the pictures (actually hand-coded picture descriptions) and the sentences. The program does use information in the picture encodings to help induce the surface structure of the sentences, in the manner of LAS. However, it remains unclear exactly what use it makes of semantics or what kinds of languages the program can learn. The displayed examples of the program's behavior are very sparse with examples of generalizations. As we will see, a program must have strong generalization powers if it is to learn a language. The few examples of generalization are of the following sort. Suppose ZBIE sees the following three sentences:

1) John walks
2) Mary walks
3) John talks

From these, ZBIE will generalize that "Mary talks" is also an acceptable sentence. It does not appear capable of generalizations much more powerful than these, and Siklossy provides no discussion of how his program's behavior relates to a child's learning of a language.

The one example of an attempt to simulate child language acquisition is Kelley (1967). His program attempted to simulate the child's progress in producing utterances from one word, to two words, to three words. The program was claimed to be making use of semantic information, but Kelley never specifies how it contributed to the program's performance. In general, the details of the program are not explained. For example, the program never gets to the point of producing grammatical sentences and it is unclear whether it could.

4. Rationale

A basic assumption in the LAS project is that a language learner can sometimes figure out the meaning of sentences and that language learning takes place in such circumstances. The specific goal is to explain how the pairing of a sentence with a semantic referent permits language learning - that is, to develop a computer program which can acquire a language from sentences paired with semantic interpretations. Because of the complexity of the required induction algorithms, it is essential that a theory of language acquisition take the form of a computer program. This project does have as an ultimate goal to provide a faithful simulation of child language acquisition. One might question whether a system constructed just to succeed at language learning will have much in common with the child's acquisition system.
I strongly suspect it will, provided we insist that the system have the same information-processing limitations as a child and provided its language learning situation has the same information-processing demands as that of the child. One consideration underlying this optimistic forecast is that learning a natural language imposes very severe and highly unique information-processing demands on any induction system and, consequently, there are very severe limitations on the possible structures for a successful system. A similar argument has been forcefully advanced by Simon (1969) with respect to the information-processing demands of various problem-solving tasks.

The current version of the program, LAS.1, works in an overly simplified domain and makes unreasonable assumptions about the information available to the learner. Nonetheless, it predicts many of the gross features of generalization in child language learning. It is terribly "off" in other aspects. It turns out that many of its failures of simulation can be traced to the unrealistic assumptions it is making about task domain and information-processing abilities. Many of the proposed developments of the program have as their goal the elimination of these unrealistic assumptions. The assumptions were made to make the problem more tractable in a first-pass attempt.

5. The Program LAS.1

This section describes LAS.1, a relatively small program that was put together in eight months. It has achieved success in a non-trivial natural language induction task. The proposed research will be principally concerned with extending it. LAS.1 consists of SPEAK, which uses the network formalisms to generate sentences; UNDERSTAND, which uses the same networks for sentence understanding; BRACKET, which punctuates sentences with their surface structure by comparing them to their semantic referents; SPEAKTEST, which builds an initial network grammar to parse a sentence; and GENERALIZE, which generalizes the initial grammar.

LAS is an interactive program written in Michigan LISP (Hafner & Wilcox, 1975). The program accepts as input lists of words, which it treats as sentences, and scene descriptions encoded in a variant of the HAM propositional language (see Anderson & Bower, 1973). It obeys commands to speak, understand, and learn. The logical structure of LAS is illustrated in Figure 2. Central to LAS is an augmented transition network grammar similar to that of Woods (1970). In response to the command Listen, LAS evokes the program UNDERSTAND. The input to UNDERSTAND is a sentence. LAS uses the information in the network grammar to parse the sentence and obtain a representation of the sentence's meaning. In response to the command Speak, LAS evokes the program SPEAK. SPEAK receives a picture encoding and uses the information in the network grammar to generate a sentence to describe the encoding. Note that LAS is using the same network formalism both to speak and understand. The principal purpose of SPEAK and UNDERSTAND in LAS is to provide a test of the grammar induced by LEARNMORE.

The philosophy behind the LEARNMORE program is to provide LAS with the same information that a child has when he is learning a language through ostension. It is assumed that in this learning mode the adult can both direct the child's attention to what is being described and focus the child on that aspect of the situation which is being described.
Thus, LEARNMORE is provided with a sentence, a HAM description of the scene, and an indication of the main proposition in the sentence. It is to produce as output the network grammar that will be used by SPEAK and UNDERSTAND. It is possible that the picture description provides more information than is in the sentence. This provides no obstacle to LAS's heuristics. In this particular version of LAS it is assumed that LAS already knows the meaning of the content words in the sentence. With this information BRACKET will assign a surface structure to the sentence. SPEAKTEST will determine whether the sentence is handled by the current grammar. If not, additions are made to handle this case. These additions generalize to other cases, so that LAS can understand many more sentences than the ones it was explicitly trained with.

Figure 2. A schematic representation of the major subcomponents of LAS - LEARNMORE, SPEAK, and UNDERSTAND - giving the inputs (sentence, main proposition, picture encoding) and outputs (network grammar, HAM conceptualization, sentence) of each.

The SPEAKTEST program alone would permit LAS to construct a parsing network adequate to handle all the sentences it was presented with. It also makes many low-level generalizations about phrase structures and word classes, which permit LAS to successfully analyze or generate many novel sentences. However, the more powerful generalization embodied in the GENERALIZE program is essential to achieving realistic grammars.

The HAM.2 Memory System

LAS.1 uses a version of the HAM memory system (see Anderson & Bower, 1973) called HAM.2. HAM.2 provides LAS with two essential features. First, it provides a representational formalism for propositional knowledge. This is used for representing the comprehension output of UNDERSTAND, the to-be-spoken input to SPEAK, the semantic information in long-term memory, and syntactic information about word classes. Second, HAM.2 contains a memory search algorithm, MATCH1, which is used to evaluate various parsing conditions. For instance, the UNDERSTAND program requires that certain features be true of a word for a parsing rule to apply. These are checked by MATCH1. The same MATCH1 process is used by the SPEAK program to determine whether the action associated with a parsing rule creates part of the to-be-spoken structure. This MATCH1 process is a variant of the one described in Anderson and Bower (1973, Chs. 9 & 12) and its details will not be discussed here.

However, it would be useful to describe here the representational formalisms used by HAM.2. Figure 3 illustrates how the information in the sentence "A red square is above the circle" would be represented with the HAM.2 network formalisms. There are four distinct propositions about two nodes X and Y: X is red, X is a square, X is above Y, and Y is a circle. Each proposition is represented by a distinct tree structure. The tree structure consists of a root proposition node connected by an S link to a subject node and by a P link to a predicate node. The predicate node may be decomposed into an R link pointing to a relation node and an O link pointing to an object node. The semantics of these representations are to be interpreted in terms of simple set-theoretic notions. The subject is a subset of the predicate. Thus, the individual X is a subset of the red things, the square things, and the things above Y. The individual Y is a subset of the circular things. One other point needs emphasizing about this representation.
There is a distinction made between words and the concepts which they reference. The words are connected to their corresponding ideas by links labelled W. Figure 3 illustrates all the network notation needed in the current implementation of LAS. There are a number of respects in which this representation is simpler than the old HAM representation. There are not the means for representing the situation (time and place) in which a fact is true or for embedding one proposition within another. Thus, we cannot express in HAM.2 such sentences as "Yesterday in my bedroom a red square was above the circle" or "John believes that a red square is above the circle."

Figure 3. An example of a propositional network representation in HAM.2.

These omissions are only consequential beyond the domain of ostension. In learning through ostension, the assumed time and place are the here and now, and concepts like belief which require embedded propositions are too abstract for ostension. In future research LAS will be extended beyond the ostensive domain. At that point, complications will be required in the representations; however, when starting out on a project it is preferable to keep things as simple as possible.

There are a number of motivations for the associative network representation. Anderson and Bower (1973) have combined this representation with a number of assumptions about the psychological processes that use it. Predictions derived from the Anderson and Bower model have proven to be generally true of human cognitive performance, although many aspects of HAM's representation have not been tested. One feature that recommends associative network representations for a computer has to do with the facility with which they can be searched. Another advantage of this representation is particularly relevant to the LAS project. This has to do with the modularity of the representation. Each proposition is coded as a network structure that can be accessed and used independently of other propositions.

So far, I have shown how the HAM.2 representation encodes the episodic information input to SPEAK and output by UNDERSTAND. It can also encode the semantic and syntactic information required by the parsing system. Figure 4 illustrates how HAM.2 would encode the facts that circle and square are both shapes, that red and blue are both colors, and that circle and red belong to the word class *CA while square and blue belong to the word class *CB. Note that the word class information is predicated of the words while the categorical information is predicated of the concepts attached to these words. The categorical information would be used if a syntactic rule applied only to shapes or only to colors. The word class information might be evoked if a language arbitrarily applied one syntactic rule to one word class and another rule to a different word class. Inflections are a common example of syntactic rules which apply to arbitrarily defined word classes.

HAM.2 has a small language of commands which cause various memory links to be built. The following four are all that are currently used:

1. (Ideate X Y) - create a W link from word X to idea Y.
2. (Out-of X Y) - create a proposition node Z; from this root node create an S link to subject X and a P link to predicate Y.
3. (Relatify X Y) - create an R link from X to relation Y.
4. (Objectify X Y) - create an O link from X to object Y.

These commands appear in LAS's parsing networks to create memory structures. Often, rather than memory nodes, variables (denoted X1, X2, etc.) appear in these commands. If the variable has as its value a memory node, that node is used in the structure building. If the variable has no value, a memory node is created and assigned to it, and that node is used in the memory operation.
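As a concrete (and purely illustrative) rendering of these four commands, the following Common Lisp sketch stores links as (LABEL FROM TO) triples. The link store and the use of GENSYM for fresh proposition nodes are my simplifications of HAM.2's actual memory machinery:

;; Links are (LABEL FROM TO) triples kept on a global list.
(defvar *links* '())

(defun add-link (label from to)
  (push (list label from to) *links*)
  to)

(defun ideate (word idea)              ; W link: word -> idea
  (add-link 'w word idea))

(defun out-of (subject predicate)      ; S and P links from a fresh
  (let ((prop (gensym "PROP")))        ; proposition node
    (add-link 's prop subject)
    (add-link 'p prop predicate)
    prop))

(defun relatify (predicate relation)   ; R link
  (add-link 'r predicate relation))

(defun objectify (predicate object)    ; O link
  (add-link 'o predicate object))

;; e.g. encoding "X is red" after linking the word to its idea:
;; (ideate 'red 'idea1)  (out-of 'x 'idea1)

With these definitions in mind, the command listing below can be read directly.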
Tf the nodes, variables (denoted X1, Xe, etc) variable hes as 445 value a memory node that node is used in the structure puilding. if the variable has no value, @ memory node is created and assigned to it and that node is used in the menory operation. To illustrate the use of these comzands, the following is & Listing of the commands that would create the structure in figure 3: 92 Seo C P pf \8 \ fo > *COLOR { J CIRCLE #OA RED SQUARE Hig | BLUE Figure 4. An example of a HAM structure encoding both categorical information and word class information a (Ideate red 1) (Ideate square 2) (Ideate above 3) (Ideate circle hy (Out-of KL) (out-of X 2) (Out-of X 8) (Oojectify & Y) (Relatify 8°3) (Out-oF Y t ply to any m2 st languages will b will also be used to illustrate the SPEAX anc UNDER eribed shortly. The first, GRAMMARL, is a simple artificial grammer. Ta second, GRAMMAR, is a more complex gramzar Tor @ Su aYnAAD D aefined by tne rewrite rules in Table l. GRAMMARL wa mally different fron Englisn word order. Tne sentenc be read as asserting the first noun-phrese nas t last word to the second moun phrase. For purpose il of these languages are English but they need nov oe. GRAMMAR] 1s 4 finite Language without recurssicn. In contrast, in GRAL-MARe the NP element has an amb dann t OTATOS eratnh naam manwnndiareadear aot) MD oeamarctine 2 nntoential infin- Sp eee Se ee MOCUPILV Ry mem y peor cv. meen et constructions. we a ite embedding of uy fo 3 fp In both gremmers, it is assumed that above end below are connected to the idea as are right-of and left-of. Tne words differ in the assigment of their NP arguments to subject and object roles- Tnus the difference between the word pairs is syntactic This is indicated by having the words pelong to two word classes RA and RB. Thus, UNDERSTAND with CRANMAR2 would derive the same HAM representation in Figure 3 for the sentences The red square is evo0ve the circle and The circle is below the rea square. It yould have been pos sible to generate distinct representations For these two sentences. I think this would have Deen less psycholegicetly interesting. Basically, the network ise grangar makes the inferences that A below 3 quivalent to B above A and en- codes the latter. TABLE 1 The Two Test Grammars GRAMMARL GRAMMAR Ss + WP NP RA Ss: +> WNP is ADJ NP NP RB NP is RA Ne NP + SHAPE (COLOR) (SIZE) WP is RB NP SHAPE + square, circle, et. NP + (the,a) NP* CLAUSE» COLOR + red, blue, etc. . ye* +» SHAPE SIZE > large, small, etc. . + ADI SHAPE RA- > above, right-of CLAUSE > that is ADI that is RA NP 27 TART? 4 abbas TABLE 2 continued g +» below, left-ot CLAUSE + thet is RB uP SHAPE + square, circle, euc. ADS + red, bis, blue, ebc. RA + above, right-of RB + below, left-of Figure 5 illustrates the parsing netuorss for the grammars. It should be understood that thes networks have been deliberately written in an inefri- cient manner. For instence, note in CRAVMARL thet there are tyo distincs patns in the main START network. Tae first is for tnose sentences viva RA relations and the second for tnose sentences with 2B relations. If a sentence input to UNDERSTAID nas a RB relavion, UNDERSTAND will first attempe to parse it by the first branch. The tyvo noun phrase branches will succeed bus the relation branch will fail. UNDERSTAND will have to back-up and try the second branc that leads to 23. This costly back-up 25 not really necessary. 
It would have been possible to have constructed the START network in the following form:

START --NP--> S1 --NP--> S2 --RA--> STOP
                         S2 --RB--> STOP

In this form the network does not branch until the critical relation word is reached.

Table 2 provides a formal specification of the information stored in LAS's network grammars. A node either has a number of arcs proceeding out of it (rule 1a) or it is a stop node (rule 1b). In speaking and understanding, LAS will try to find some path through the network ending with a stop node. Each arc consists of some condition that must be true of the sentence if the arc is to be taken in parsing (understanding) the sentence. Each arc also specifies an action to be taken if the condition is met. This action builds a conceptual structure to correspond to the material parsed to that point. Finally, an arc includes a specification of the node to which control should transfer after performing the action. An action consists of zero or more HAM memory commands (rule 3). A condition may consist of zero or more memory commands also (rule 4a); these specify features that must be true of the incoming word. Alternatively, a condition may be a push to an embedded network (rule 4b). For instance, suppose the structure in Figure 3 were to be spoken using GRAMMAR1. The START network would be called to realize the X is above Y proposition. The embedded NP network would be called to realize the X is red and X is square propositions. In pushing to a network two things must be specified -- NODE, which is the embedded network, and VAR, which is the memory node at which the main and embedded propositions intersect. The element t in rule 4b is a place-holder for information that is needed by the control mechanisms of the UNDERSTAND program. The three rules 6a, 6b, and 6c specify three types of arguments that memory commands can have. They can either directly refer to memory nodes, or refer to the current word in the sentence, or refer to variables.

Figure 5. The network grammars used by LAS for GRAMMAR1 and GRAMMAR2.

TABLE 2

NODE -> ARCS                      (1a)
     -> stop                      (1b)
ARC -> CONDITION ACTION NODE      (2)
ACTION -> COMMAND*                (3)
CONDITION -> COMMAND*             (4a)
          -> push VAR t NODE      (4b)
COMMAND -> FUNCTION ARG ARG       (5)
ARG -> memory node                (6a)
    -> word                      (6b)
    -> X1, X2, X3, ...            (6c)
FUNCTION -> out-of, objectify, relatify, ideate   (7)

Table 3 provides the encoding of the networks for GRAMMAR1. Note that there tends to be a one-to-one correspondence between rewrite rules and LAS networks: each network expresses what a single rewrite rule expresses. The correspondence is not quite exact, because there are rules in GRAMMAR1 or GRAMMAR2 that do not have conceptual structures to command them.

These grammar networks have a number of attractive features for sentence comprehension and generation. SPEAK and UNDERSTAND use the same network for both. Thus, LAS is the first extant system to have a uniform grammatical notation for its parsing and generation systems. In this way, LAS has only to induce one set of grammatical rules to do both tasks. Such networks are modular in two senses. First, they are relatively independent of each other. Second, they are independent of the SPEAK and UNDERSTAND programs that use them. This modularity greatly simplifies LAS's task of induction. LAS need only induce the network grammars; the interpretative SPEAK and UNDERSTAND programs represent innate linguistic competences. Finally, the networks themselves are very simple, with limited conditions and actions. Thus, LAS need consider only a small range of possibilities in inducing a network.
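To make the Table 2 format concrete, the arcs can be transcribed directly as Lisp data in the style of the Table 3 listing given shortly. The traversal function below is my own simplified sketch: it takes the first acceptable arc and omits the backup behavior the real programs need.

;;; A node's arcs stored as in (DEFPROP name PATH ...); each arc is
;;; (CONDITION ACTION NEXT-NODE).  Here, the S3 node of GRAMMAR1:

(setf (get 's3 'path)
      '(( ((ideate word x4) (out-of word *ra))   ; CONDITION
          ((relatify x5 x4))                     ; ACTION
          stop )))                               ; NEXT-NODE

(defun walk (node test-condition do-action)
  "Follow arcs from NODE to STOP, using caller-supplied TEST-CONDITION
and DO-ACTION functions.  Returns T on reaching STOP, NIL on a dead
end (where the real UNDERSTAND would back up and try another path)."
  (if (eq node 'stop)
      t
      (let ((arc (find-if (lambda (a) (funcall test-condition (first a)))
                          (get node 'path))))
        (when arc
          (funcall do-action (second arc))       ; run the arc's ACTION
          (walk (third arc) test-condition do-action)))))

;; (walk 's3
;;       (lambda (condition) (declare (ignore condition)) t)
;;       (lambda (action) (print action)))
;; prints ((RELATIFY X5 X4)) and returns T on reaching STOP.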
The network formalism gains its expressive power from the embedding of networks. Because of network modularity, the induction task does not increase with the complexity of embedding.

It might be questioned whether it is really a virtue to have the same representation of grammatical knowledge both for understanding and production. It is a common observation that children's ability to understand sentences precedes their ability to generate sentences. LAS would not seem able to simulate this basic fact of language learning. However, there may be reasons why child production does not mirror comprehension other than that different grammatical competences underlie the two. The child may not yet have acquired the physical mastery needed to produce certain words. This is the case, for instance, with Lenneberg's (1962) anarthric child, who understood language he could not produce. It is also possible that the apparent precedence of comprehension is partly an artifact of the measures of production; Fernald (1972), using different scoring procedures, found the difference between comprehension and production much reduced.

TABLE 3
The Construction of GRAMMAR1

(DEFPROP START PATH
 (((PUSH X1 T NP) ((OUT-OF X1 X5)) S2)
  ((PUSH X1 T NP) ((OBJECTIFY X5 X1)) S4)))
(DEFPROP S2 PATH
 (((PUSH X2 T NP) ((OBJECTIFY X5 X2)) S3)))
(DEFPROP S3 PATH
 ((((IDEATE WORD X4) (OUT-OF WORD *RA)) ((RELATIFY X5 X4)) STOP)))
(DEFPROP S4 PATH
 (((PUSH X2 T NP) ((OUT-OF X2 X5)) S5)))
(DEFPROP S5 PATH
 ((((IDEATE WORD X4) (OUT-OF WORD *RB)) ((RELATIFY X5 X4)) STOP)))
(DEFPROP NP PATH
 ((((IDEATE WORD X4) (OUT-OF X4 *SHAPE)) ((OUT-OF X1 X4)) NP2)))
(DEFPROP NP2 PATH
 (((PUSH X1 T COLOR) NIL NP3)
  (NIL NIL NP3)))
(DEFPROP NP3 PATH
 (((PUSH X1 T SIZE) NIL STOP)
  (NIL NIL STOP)))
(DEFPROP COLOR PATH
 ((((IDEATE WORD X4) (OUT-OF X4 *COLOR)) ((OUT-OF X1 X4)) STOP)))
(DEFPROP SIZE PATH
 ((((IDEATE WORD X4) (OUT-OF X4 *SIZE)) ((OUT-OF X1 X4)) STOP)))

The listing concludes with IDEATE and OUT-OF commands defining the test vocabulary (square, circle, and triangle as members of *SHAPE; red, green, and blue as members of *COLOR; small, medium, and large as members of *SIZE; above and right-of as members of *RA; below and left-of as members of *RB) and with calls to the TALK test routine.

SPEAK is given a HAM network of propositions tagged as to-be-spoken and a topic of the sentence. The topic of the sentence will correspond to the first meaning-bearing element. SPEAK searches through its START network looking for a path which expresses a proposition attached to the topic and which expresses the topic as first element. It determines whether a path accomplishes this by evaluating the actions associated with the path and determining whether they create a structure that appropriately matches the to-be-spoken structure. When it finds such a path it uses it for generation. Generation is accomplished by evaluating the conditions along the path.
If a condition involves a push to an embedded network, SPEAK is recursively called to speak some sub-phrase expressing a proposition attached to the main proposition. The arguments for a recursive call of PUSH are the embedded network and the node that connects the main proposition and the embedded proposition. If the condition does not involve a PUSH it will contain a set of memory commands specifying that some features be true of a word. SPEAK will use these features to determine what the word is. The word so determined will be added to the sentence.

As an example, consider how SPEAK would generate a sentence corresponding to the HAM structure in Figure 6 using the English-like GRAMMAR2 of Figure 5. Figure 6 contains a set of propositions about three objects denoted by the nodes G246, G195, and G182. Of node G246 it is asserted that it is a triangle and that G195 is right of it. Of G195 it is asserted that it is a square and that it is above G182. Of G182 it is asserted that it is square, small, and red. Figure 7 illustrates the generation of a sentence from this structure with GRAMMAR2. LAS enters the START network intent on producing an utterance about G195. Thus, the topic is G195 (it could have been G246 or G182). The first path through the network involves predicating an adjective of G195, but there is nothing in the adjective class to assert of it. The second path through the START network corresponds to what the network has to say about G195 -- it is above G182. Therefore, LAS picks this as the main proposition. First, it must find some noun phrase to express G195. The substructure under G195 in Figure 7 reflects the construction of this noun phrase. The NP network is called, which prints the and calls NP1, which retrieves square, and calls CLAUSE, which prints that, is, and right-of and then recursively calls NP to express the triangle. Similarly, recursive calls are made on the NP1 network to express G182 as the small red square. The actual sentence generated is dependent on the choice of topic in the START network. Given the same to-be-spoken network, but the topic G246, SPEAK generated A triangle is left-of a square that is above a small red square. Given the topic G182 it generated A square that is below a square that is right-of a triangle is small. Note how the choice of the relation words left-of vs. right-of and above vs. below is dependent on choice of topic.

It is interesting to inquire what is the linguistic power of LAS as a speaker. Clearly it can generate any context-free language since its transition networks correspond, in structure, to a context-free grammar. However, it turns out that LAS has certain context-sensitive aspects because its productions are constrained by the requirement that they express some well-formed HAM conceptual structure. Consider two problems that Chomsky (1957) regarded as not handled well by context-free grammars. The first is agreement of number between a subject NP and verb. This is hard to arrange in a context-free grammar because the NP is already built by the time the choice of verb number must be made. The solution is trivial in LAS -- when the NP and verb are spoken, their number is determined by inspection of whatever concept in the to-be-spoken structure underlies the subject. The other Chomsky example involves the identity of selectional restrictions for active and passive sentences. This is also achieved automatically in LAS, since the restrictions in both cases are regarded simply as reflections of restrictions in the semantic structure from which both sentences are spoken.
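The generation cycle just described can be made concrete with a toy sketch. The condition forms, the flat lexicon, and the function names below are my simplifications, not the program's actual representation; the real conditions are HAM memory commands.

;;; Toy sketch of SPEAK's evaluation of a chosen path.

(defvar *lexicon*
  '((g246 shape triangle) (g195 shape square)
    (g182 shape square) (g182 color red) (g182 size small))
  "Toy memory: (node class word) triples.")

(defun word-for (node class)
  (third (find-if (lambda (entry)
                    (and (eq (first entry) node) (eq (second entry) class)))
                  *lexicon*)))

(defun speak (path)
  "Evaluate the conditions along PATH, collecting the words spoken."
  (loop for condition in path
        append (ecase (first condition)
                 (push (speak (second condition)))  ; recurse on embedded path
                 (say  (list (second condition)))   ; function word
                 (word (list (word-for (third condition)
                                       (second condition)))))))

;; (speak '((say the) (word shape g182) (say that) (say is) (word color g182)))
;; => (THE SQUARE THAT IS RED)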
While LAS can handle those features of natural language suggestive of context-sensitive rules, it cannot handle languages of the form a^n b^n c^n, which require context-sensitive grammars. It is interesting, however, that it is hard to find natural language sentences of this structure. The best I can come up with are respectively-type sentences, e.g., John and Bill hit and kissed Jane and Mary, respectively. This sentence is of questionable acceptability.

Figure 6. The to-be-spoken HAM network for the SPEAK program.

Figure 7. A tree structure showing the network calls and words emitted in generating a sentence about G195 which expresses the information contained in Figure 6.

The UNDERSTAND program requires a more complex control structure. It must consider the possibility that a failure may occur at a later point on a path, so that it is sometimes necessary to back into a network a second time. Perhaps an English example would be useful to motivate the needed control structure. Compare the two sentences The Democratic party hopes to win in '76 and The Democratic party hopes are high for '76. A main parsing network would call a noun-phrase network to identify the first noun phrase. Suppose UNDERSTAND identifies The Democratic party. Later elements in the second sentence would indicate that this parse is wrong. Therefore, the main network would have to re-enter the noun-phrase network with a different parsing to retrieve The Democratic party hopes. On re-entering the noun-phrase network, UNDERSTAND must remember which parsings it tried the first time so that it does not retrieve the same old parsing. The complexities of this control structure are described in a more complete report (Anderson, 1975). Here I will give only the general structure of the program. UNDERSTAND attempts to find some path through the START network which will provide a complete parsing of the sentence. It evaluates the acceptability of a path by evaluating the conditions associated with that path. Conditions can demand that certain features be true of words in the sentence. This is determined by checking memory. Alternatively, a condition can require a push to an embedded network. This network must parse some subphrase of the sentence. When LAS finds an acceptable path through a network, it will collect the actions along that path to create a temporary memory structure representing the meaning of what it has parsed. Thus, for instance, given the sentence The square that is right-of the triangle is above the square, LAS would parse it into the form illustrated in Figure 6.

It is also of interest to consider the power of LAS as an acceptor of languages. It is clear that LAS as presently constituted can accept exactly the context-free languages. This is because, unlike in Woods' (1970) system, actions on arcs cannot influence the results of conditions on arcs and therefore play no role in determining whether a string is accepted or not. What is interesting is that LAS's behavior as a language understander is relatively little affected by its limitations in grammatical power. Consider an example where it might seem that LAS would need a context-sensitive grammar: in English noun phrases, it seems we can have an arbitrary number of adjectives.

General Conditions for Language Acquisition

Having now reviewed how LAS.1 understands and produces sentences, I will present the three aspects of the induction program: BRACKET, SPEAKTEST, and GENERALIZE.
Before doing so, it is wise to state briefly the conditions under which LAS learns a language. It is assumed that LAS.1 already has concepts attached to the words of the language. That is, lexicalization is complete. The task of LAS.1 is to learn the grammar of the language -- that is, how to go from a string of words to a representation of their combined meaning. Because LAS.1 is not concerned with learning meanings, it cannot be a very realistic model of first language learning; it is perhaps closer to second language learning, where many concepts can transfer from the first language. I will propose extensions of LAS.1 concerned with learning meanings in LAS.2.

A second restriction on LAS.1 is that it works in a particularly limited semantic domain. It is presented with pictures indicating relations and properties of geometric objects. These pictures are actually encoded into a HAM propositional network representation. Along with these pictures LAS receives sentences describing the pictures and an indication of the aspect of the picture which corresponds to the main proposition of the sentence. From this input, a network grammar is constructed. The semantic domain is simple, but the goal is to be able to learn any natural or natural-like language which may describe that domain.

A major aspect of the LAS project is the BRACKET program. This is an algorithm for taking a sentence of an arbitrary language and a HAM conceptual structure and producing a bracketing of the sentence that indicates its surface structure. This surface structure prescribes the hierarchy of networks required to parse the sentence. For BRACKET to succeed, four conditions must be satisfied by the information given to it:

Condition 1. All content words in the sentence correspond to elements of the conceptual structure. This amounts to the claim that the teacher is able to lead the learner to conceptualize the information in his sentence. It does not matter to the BRACKET algorithm whether there is more information in the conceptual structure than in the sentence.

Condition 2. The content words in the sentence are connected to the elements in the conceptual structure. Psychologically, this amounts to the claim that lexicalization is complete. That is, the learner knows the meanings of the words.

Condition 3. The surface structure interconnecting the content words is isomorphic in its connectivity to a language-free prototype structure.

Condition 4. The main proposition in the conceptual structure is indicated.

Conditions 3 and 4 require considerable explanation. I will explain each in turn. Consider Panel (a) of Figure 8, which illustrates the prototype structure for the propositions in the English sentence The red square is above the small circle. Panel (b) illustrates a graph deformation of that structure giving the surface structure of the sentence. Note how elements within the same noun phrase are appropriately assigned to the same subtree. Note also that the prototype structure is not specific with respect to which links are above which others and which are right of which others. Although the HAM structure in Panel (a) is set forth in a particular spatial array, the choice is arbitrary. In contrast, the surface structure of a sentence does specify the spatial relation of links. It seems reasonable that all natural languages have as their semantics the same order-free prototype network. They differ from one another in (a) the spatial ordering their surface structure assigns to the network and (b) the insertion of non-meaning-bearing morphemes into the sentence. However, the surface structure of all natural languages is derived from the same graph patterns.
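One way to make the deformation idea computational (my own formulation, with hypothetical function names): a prototype tree can be deformed onto a word order without crossing branches exactly when every subtree covers a contiguous span of word positions.

;;; Sketch of a contiguity check related to the graph deformation
;;; condition.  A tree is nested lists of content words; WORD-POSITIONS
;;; is an alist from words to sentence positions.

(defun leaf-positions (tree word-positions)
  (if (atom tree)
      (let ((p (cdr (assoc tree word-positions))))
        (if p (list p) nil))
      (mapcan (lambda (sub) (leaf-positions sub word-positions)) tree)))

(defun contiguousp (positions)
  "T if POSITIONS (distinct integers) form an unbroken run."
  (= (length positions)
     (1+ (- (reduce #'max positions) (reduce #'min positions)))))

(defun deformablep (tree word-positions)
  "T if TREE can be drawn over the sentence without crossing branches."
  (if (atom tree)
      t
      (and (contiguousp (leaf-positions tree word-positions))
           (every (lambda (sub) (deformablep sub word-positions)) tree))))

;; The red square is above the small circle (content words only):
;; (deformablep '((red square) (above (small circle)))
;;              '((red . 1) (square . 2) (above . 3) (small . 4) (circle . 5)))
;; => T
;; A Panel (d)-style language, where the adjective after the object noun
;; modifies the subject noun, breaks the subject subtree's span:
;; (deformablep '((square small) (above circle))
;;              '((square . 1) (above . 2) (circle . 3) (small . 4)))
;; => NIL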
Panel (c) of Figure 8 shows how the prototype structure of Panel (a) can provide the surface structure for a sentence of the artificial GRAMMAR1. All the sentences of GRAMMAR1 preserve the connectivity of the underlying HAM structure. By this criterion, at least, GRAMMAR1 could be a natural language. Certain conceivable languages would have surface structures which are not deformations of the underlying structure. Panel (d) illustrates such a hypothetical language with the same syntactic structure as English, but with different rules of semantic interpretation. In this language the adjective following the object noun modifies the subject noun. As Panel (d) illustrates, there is no deformation of the prototype structure in Panel (a) that achieves a surface structure for the sentences of this language. No matter how it is attempted, some branches must cross.

LAS will use the connectivity of the prototype network to infer what the connectivity of the surface structure of the sentence must be. The network does not specify the right-left ordering of the branches or the above-below ordering. The right-left ordering can be inferred simply from the ordering of the words in the sentence. However, to specify the above-below ordering, BRACKET needs one further piece of information. Figure 9 illustrates an alternate surface structure that could have been assigned to the string in Figure 8(c). It might be translated into English syntax as The circle that is the small thing is below the red square. Clearly, as these two figures illustrate, the HAM network and the sentence are not enough to specify the hierarchical ordering of subtrees in the surface structure. The difference between the sentences in Figures 8(c) and 9 is the choice of which proposition is principal and which is subordinate. If BRACKET is also given information as to the main proposition, it can then unambiguously retrieve the sentence's surface structure. The assumption that BRACKET is given the main proposition amounts to the claim that the teacher can direct the learner to what is being asserted in the sentence. Thus, in Panel (c), the teacher would direct the learner to the picture of a red square above a small circle. He would have to assume both that the learner properly conceptualized the picture and that he realized the aboveness relation was what was being asserted of the picture.

Figure 8. The surface structures of the sentences in (b) and (c) are graph deformations of the HAM structure in (a); no deformation yields the structure in (d).

Figure 9. Alternate surface structure for the sentence in Figure 8(c).

More on the Graph Deformation Condition

I think that the graph deformation condition reflects a universal property of language. However, to make this work, it is clear that something other than the HAM network will have to serve as the prototype structure. The HAM representation works well when words whose concepts are grouped together in the network also occur closer together in the sentence. Consider the sentence John opened the door with a key. In the HAM structure of Figure 10(a), John and the key are grouped in a causal proposition (John turning the key) while the door and open form the result proposition. The English word order places the door between John and key; there is no deformation of the Figure 10(a) structure onto such a sentence, since branches of the HAM structure must cross. This English sentence therefore violates the deformation condition for the representation in Figure 10(a), while it poses no problem for something like the case structure of Figure 10(b), in which the verb open and its arguments are all equally accessible from a single node. Word orders that the HAM structure forbids are likely to occur in some natural language.
There are two ways to deal with this problem. One could resort to a memory representation like that in Figure 10(b). However, there are significant considerations that motivate the HAM representation in (a). Moreover, representations like (b) finesse one of the hard questions in language acquisition -- how we learn the meanings of multi-argument verbs. To address this question we need a representation which decomposes multi-argument verbs into a structure like (a) that displays the semantic function of the case arguments. Learning the role of such a verb in the language then involves learning how to assign it to a structure like (a). I will sketch later a system to do this.

If we keep the HAM representations, then some changes are required in BRACKET's graph deformation condition. What is characteristic of multi-argument verbs in HAM is that the arguments are interconnected by causal relations as in (a). Thus, BRACKET should be made to treat all the terminal argument structures as defining a single level of nodes in a graph structure connected to a single root node. That is, BRACKET can treat a HAM structure such as (a) as if it were (b) for purposes of utilizing the graph deformation condition. In fact, BRACKET already does this in the current implementation.

Figure 10. Alternative prototype structures for the sentence John opened the door with a key: (a) the HAM causal decomposition, JOHN TURN KEY CAUSE DOOR OPEN; (b) a flat structure connecting JOHN, KEY, OPEN, and DOOR to a single node. The HAM structure in (a) introduces too many distinctions.

The Details of BRACKET's Output

So far, only a description of how one would retrieve the surface structure connecting the content words of the sentence has been given. Suppose BRACKET were given A triangle is left-of a square that is above a small red square. A bracketing structure must be imposed on this sentence which will also include the function words. Given this sentence and the conceptual structure in Figure 6, BRACKET returns (G257 (G246 G247 a triangle) is left-of (G195 G196 a square (G195 G225 that is above (G182 G183 a small (G182 G185 red (G182 G184 square)))))). The main proposition is G257, which is given as the first term in the bracketing. The first bracketed sub-expression describes the subject noun phrase. The first element in the sub-expression, G246, is the node that links this noun phrase to the main proposition; G247 is the proposition corresponding to the first two words, a triangle. The next two words, is left-of, are left at the top level. The rest corresponds to a description of the element G195. The first embedded proposition, G196, asserts that this object is a square, and the second proposition, G225, asserts that G195 is above G182. Note that the G225 proposition is embedded as a sub-expression within the G196 proposition. The last element in the G225 proposition is (G182 G183 a small (G182 G185 red (G182 G184 square))). This expression about G182 has in it three propositions: G183, G185, and G184.

The above example illustrates the output of BRACKET. Abstractly, the output of BRACKET may be specified by the following three rewrite rules:

1. S -> proposition element*
2. element -> word
3. element -> (topic S)

That is, each bracketed output is a proposition node followed by a sequence of elements (rule 1). These elements are either rewritten as words (rule 2) or as bracketed subexpressions (rule 3). A bracketed subexpression begins with a topic node which indicates the connection between the embedded and embedding propositions. The word elements within an expression are either non-meaning-bearing words or content words corresponding to the subject, predicate, relation, and object of the proposition. Note that BRACKET induces a correspondence between a level of bracketing and a single proposition. Each level of bracketing will also correspond to a new network in LAS's grammar.
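BRACKET's output is naturally rendered as nested Lisp data. The sketch below reproduces the bracketing above as a literal structure and adds a small illustrative walker; PROPOSITIONS-IN is my own hypothetical helper, not part of the actual program.

;;; The bracketing above as Lisp data, following the three rewrite rules.

(defparameter *bracketing*
  '(g257 (g246 g247 a triangle) is left-of
         (g195 g196 a square
               (g195 g225 that is above
                     (g182 g183 a small (g182 g185 red (g182 g184 square)))))))

(defun propositions-in (expr &optional embedded)
  "Proposition nodes at each level of bracketing, one per network that
LAS will induce.  At the top level the proposition comes first (rule 1);
embedded expressions begin with a topic node (rule 3)."
  (let ((prop (if embedded (second expr) (first expr)))
        (elements (if embedded (cddr expr) (cdr expr))))
    (cons prop
          (mapcan (lambda (element)
                    (when (consp element)          ; rule 3 subexpression
                      (propositions-in element t)))
                  elements))))

;; (propositions-in *bracketing*)
;; => (G257 G247 G196 G225 G183 G185 G184)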
Because of the modularity of HAM propositions, a corresponding modularity is achieved for the grammatical networks. When a number of embedded propositions are attached to the same node, they are embedded within one another in a right-branching manner.

The insertion of non-meaning-bearing function words into the bracketing is a troublesome problem because there are no semantic features to indicate where they belong. Consider the first word a in the example sentence above. It could have been placed in the top level of bracketing or in the subexpression containing triangle. Currently, all the function words to the right of a content word are placed in the same level as the content word. The bracketing is closed immediately after this content word. Therefore, is is not placed in the noun-phrase bracketing. This heuristic seems to work more often than not. However, there clearly are cases where it will not work. Consider the sentence The boy who Jane spoke to was deaf. The current BRACKET program would return this as ((The boy who Jane spoke) to was deaf). That is, it would not identify to as in the relative clause. Similarly, non-meaning-bearing suffixes like gender would not be retrieved as part of the noun by this heuristic. However, there is a strong cue to make bracketing appropriate in these cases. There tends to be a pause after morphemes like to. Perhaps such pause structures could be called upon to help the BRACKET program decide how to insert the non-meaning-bearing morphemes into the bracketing.

Non-meaning-bearing morphemes pose further problems besides their placement within a noun phrase. These morphemes form strings that, in principle, might constitute an arbitrary language whose semantic referent could provide no cues to aid induction. Were that so, we would be back to the unconstrained grammar-induction task characterized in the introduction. It is comforting to observe that the structure of these strings of non-meaning-bearing morphemes tends to be very simple. There are not many examples of these strings being longer than a single word. Thus, it seems that the languages constituted by these non-meaning-bearing strings are nothing more than very simple finite cardinality languages which pose, in themselves, no serious induction problems. The various stretches of non-meaning-bearing morphemes in a sentence could also have complex interdependencies, thereby posing serious induction problems. Again, it does not seem to be the case that these interdependencies exist. So once again we find that the structure of natural language seems to be simple just at those points where it would have to be simple for a LAS-like induction program to work.

In concluding this section I should point out one example sentence which BRACKET cannot currently handle. These are respectively sentences like John and Bill danced and laughed respectively. The problem with such a sentence is that it has a prototype structure pairing John with dance and Bill with laugh. Thus, John and dance are close together in the structure and so are Bill and laugh. However, the sentence intersperses these elements in just the way that makes bracketing impossible. There are probably other examples like this, but I cannot think of them. Fortunately, this is not an utterance that appears early in child speech, nor is it a particularly simple one for adults. Of all the grammatical constructions, the respectively construction is the one that most suggests the need to have transformational rules in the grammar.
SPEAKTEST

The function of SPEAKTEST is to test whether the current grammar is capable of generating a sentence and, if it is not, to appropriately modify the grammar so that it can. SPEAKTEST is called after BRACKET is complete. It receives from BRACKET a HAM conceptual structure, a bracketed sentence, the main proposition, and the topic of the sentence. As in the SPEAK program, SPEAKTEST attempts to find some path through its network which will express a proposition attached to the topic. If it succeeds, no modifications are made to the network. If it cannot, a new path is built through the network to incorporate the sentence.

The best way to understand the operation of SPEAKTEST is to work through one example. The target language LAS was given to learn is illustrated in Figure 11. This is a very simple language, basically GRAMMAR1 of Table 1 with a smaller vocabulary to make it more tractable. The reason for choosing this language is that it is of just sufficient complexity to illustrate LAS's acquisition mechanisms. In addition, LAS has learned GRAMMAR2, also given in Table 1. Figure 11 illustrates LAS's treatment of the first two sentences to come in. The first sentence is square triangle above, which is returned by BRACKET as (G174 (G115 G116 square) (G148 G149 triangle) above). G174 refers to the main proposition given as an argument. Since this is LAS's first sentence of the language, the START network will, of course, completely fail to parse the sentence. It has no grammar yet. Therefore, it induces the top-level START network in Figure 11. A listing of the exact information induced is given below the graphical illustration in Figure 11. Since the first two elements after G174 in the bracketed sentence are themselves bracketed, the first two arcs in the network will be pushes to subnetworks. The third arc contains a condition on the word above. The restriction is that it be a member of the word class A199. This class was created for this sentence and only contains the word above at this point.

Having now constructed a path through the START network, SPEAKTEST checks the subnetworks in this path to see whether they can handle the bracketed subexpressions in the sentence. This is accomplished by a recursive call to SPEAKTEST. For the first phrase SPEAKTEST is called, taking as arguments the network A195 and (G115 G116 square). In network A195 the word class A211 is created to contain square, and in network A197 the word class A224 is created to contain triangle. Although these two subnetworks should be the same in a final grammar, LAS is not prepared to risk such a generalization at this point.

Note in this example how the bracketing provided by BRACKET completely specified the embedding of networks. The sentence provided by BRACKET was (G174 (G115 G116 square) (G148 G149 triangle) above). The first element, G174, gave the main proposition. The second element, (G115 G116 square), was a bracketed subexpression indicating that a subnetwork should be created. Similarly, the third expression indicated a second subnetwork. The last element, above, was a single word and so could be handled by a memory condition in the main network.

The second sentence is triangle square right-of. This is transformed by BRACKET to (G315 (G246 G247 triangle) (G283 G284 square) right-of). Because of the narrow one-member word classes, this sentence cannot be handled by the current grammar. However, SPEAKTEST does not add new network arcs to handle the sentence. Rather, it expands word class A199 to include right-of, word class A211 to include triangle, and word class A224 to include square.
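A sketch of this class-widening step, under an assumed property-list representation of word classes (the function names here are hypothetical):

;;; Widening existing word classes instead of adding new arcs.

(defun class-members (class) (get class 'members))

(defun widen-class (class word)
  "Expand CLASS to include WORD, as A199 was expanded to take right-of."
  (pushnew word (get class 'members)))

(defun absorb (sentence classes)
  "Handle SENTENCE (a list of words) on an existing path whose arcs
expect the word CLASSES, widening each class as needed."
  (loop for word in sentence
        for class in classes
        do (widen-class class word)))

;; State after sentence 1, then absorbing sentence 2:
;; (setf (get 'a211 'members) '(square)
;;       (get 'a224 'members) '(triangle)
;;       (get 'a199 'members) '(above))
;; (absorb '(triangle square right-of) '(a211 a224 a199))
;; (class-members 'a211) => (TRIANGLE SQUARE)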
The grammar is now at such a stage that LAS could speak or understand the sentences triangle square above or square square right-of and other sentences which it had not studied. Thus, already the first generalizations have been made. LAS can produce and understand novel sentences. This illustrates the type of generalizations that are made with the SPEAKTEST program.

Figure 11. LAS's treatment of the first two sentences in the induction sequence, ((SQUARE) (TRIANGLE) ABOVE) and ((TRIANGLE) (SQUARE) RIGHT-OF), with the induced START network and the word classes A199 = above, right-of; A211 = square, triangle; A224 = triangle, square.

Consider LAS's treatment of the first word of the second sentence. This involved (a) using the same subnetwork A195 that had been created for the first sentence and (b) expanding the word class A211 to include triangle. Both decisions were based on semantic criteria: the network A195 again expressed a proposition attached to the main proposition through its first noun-phrase node, and it was this identity of semantic function that justified the decisions.

In making these generalizations, SPEAKTEST is making a strong assumption about the nature of human language. This assumption is stated as Condition 5:

Condition 5. Words or phrases with identical semantic functions at identical points in a network behave identically syntactically.

This is the assumption of semantics-induced equivalence of syntax. It is another way in which semantic information facilitates grammar induction. It clearly need not be true of an arbitrary language. For instance, decisions made in the subject noun phrase might in theory condition syntactic decisions made in the object noun phrase. LAS, because of its heuristics in SPEAKTEST for generalization, would not be able to learn such a language.

Figure 12 illustrates LAS's network grammar after two more sentences have come in. Sentences 3 and 4 involve the relations below and left-of. LAS treats these as syntactic variants of above and right-of which differ in their assignment of their noun phrase arguments to the logical categories subject and object. Therefore, LAS creates an alternate branch through its START network to accommodate this possibility.

Figure 13 illustrates the course of LAS's learning. Altogether LAS will be presented 14 sentences. Subsequently, it will have to make three extra generalizations to capture the entire target language. Plotted on the abscissa is this learning history, and along the ordinate is the natural logarithm of the number of sentences which the grammar can handle. This is a finite language, unlike GRAMMAR2, and therefore the number of sentences in the language will always be finite. As can be seen from Figure 13, by the fourth sentence LAS's grammar is adequate to handle 16 sentences.

LAS's grammar after the next five sentences is illustrated in Figure 14. These are LAS's first encounters with two-word noun phrases. All five sentences involve the relations right-of and above and therefore result in the elaboration of the A195 and A197 sub-networks. Consider the first sentence, square red triangle blue above, which is returned by BRACKET as (C329 (C270 C271 square (C270 C272 red)) (C303 C304 triangle (C303 C305 blue)) above). Consider the parsing of the first noun phrase. Note that the adjective is embedded within the larger noun phrase.
This is an example of the right embedding which BRACKET always imposes on a sentence. It will cause SPEAKTEST to create a push to an embedded network within its A195 subnetwork. As can be seen in Figure 14, the existing arc containing the A211 word class is kept to handle square. Two alternative arcs are added -- one with a push to the new embedded network for the adjective, and one with a NIL transition. The new subnetworks themselves end in NIL transitions, so that a noun phrase is fully parsed whether or not an adjective follows; within them are new word classes (e.g., C510 = small, blue, large, red) containing the adjectives encountered so far.

Figure 12. LAS's grammar after the first four sentences, with an alternate START branch for the RB relations (word classes: above, right-of; below, left-of; A211, A224 = square, triangle).

Figure 13. The growth of LAS's grammar with its learning history: the number of sentences handled (true sentences, and true sentences plus overgeneralizations) plotted against the number of sentences studied.

Figure 14. Additions to LAS's grammar after studying: 1. SQUARE RED TRIANGLE BLUE ABOVE; 2. TRIANGLE LARGE SQUARE SMALL RIGHT-OF; 3. TRIANGLE RED TRIANGLE RED ABOVE; 4. SQUARE SMALL TRIANGLE RED RIGHT-OF; 5. SQUARE BLUE TRIANGLE LARGE RIGHT-OF.

The principle of semantics-induced equivalence of syntax can, however, lead to overgeneralization, and Figure 15 illustrates how this might occur in natural language. Consider a learner of English plurals. Encountering plural nouns, he would set up a network in which the plural morpheme s may be preceded by any noun (Figure 15). Upon acquiring foot as a noun, he would come to the conclusion that foots is the plural of foot. This is, of course, an overgeneralization that children actually make (Ervin, 1964). What is characteristic of such morphemic rules is that there are a number of alternatives and no semantic basis to choose between them. Because of its principle of semantics-induced equivalence of syntax, LAS will overgeneralize in those situations. Apparently, children are operating under a similar rule.

LAS needs to be endowed with a mechanism to allow it to recover from such overgeneralizations. Therefore, one of the future additions to LAS will have to be a RECOVER program. Consider how it would work with this pluralization example. Suppose LEARNMORE receives the sentence The feet are above the triangle. In attempting to analyze the sentence in SPEAKTEST, the plural foots will be generated but will mismatch the sentence. RECOVER has as its function to note such mismatches. Since it is possible that there are two alternate ways of expressing plurality, RECOVER cannot assume its grammar is wrong. Rather, it will interrupt the information flow and check the acceptability of The foots are above the triangle. That is, RECOVER will explicitly seek negative information. Upon learning that the expression is ungrammatical, RECOVER will take foot out of the word class that is pluralized by s. To accomplish this I would have to put within LAS some mechanism that will segment words into their morphemes.
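Since RECOVER does not yet exist, the following can only be a sketch of the proposed mechanism under assumed representations; the variable and function names are hypothetical.

;;; Sketch of RECOVER: on a mismatch, explicitly seek negative
;;; information, and on a negative answer retract the stem.

(defvar *pluralized-by-s* '(boy toy foot)
  "Hypothetical word class: nouns the grammar pluralizes with s.")

(defun recover (generated heard stem query-fn)
  "On a mismatch between GENERATED and HEARD (word lists), ask QUERY-FN
whether the generated sentence is acceptable; if it is not, retract
STEM from the overextended class."
  (when (and (mismatch generated heard)
             (not (funcall query-fn generated)))
    (setf *pluralized-by-s* (remove stem *pluralized-by-s*))))

;; (recover '(the foots are above the triangle)
;;          '(the feet  are above the triangle)
;;          'foot
;;          (lambda (s) (declare (ignore s)) nil))  ; the tutor says no
;; *pluralized-by-s* => (BOY TOY)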
Figure 15. Some possible network grammars for the English plural.

Every bit as much as LAS, a child logically needs negative information to recover from overgeneralizations. The interesting question is where the negative information comes from in the case of the child. Parents do correct the child in such obvious morphemic overgeneralizations (Brown, 1973). Even today I find myself corrected (not by my parents) for my failures to properly pluralize esoteric words. The child may also use statistical evidence for a negative conclusion. In some manner he may notice that the morphemic form foots is never used by the adult and so conclude that it is wrong. Horning (1969) has formalized an algorithm for detecting such overgeneralizations by assigning probabilities to rules.

Figure 16 illustrates LAS's treatment of the last four sentences of the training sequence. These involve some three-word noun phrases and also expansion of the noun phrases on the branch of the START network for RB relations. As can be seen from Figure 13, at the point of the 14th sentence LAS has developed its grammar to the point where it will handle 616 sentences of the target language. Actually, the grammar has produced some overgeneralizations -- it accepts a total of 750 sentences. LAS has encountered phrases like square, square small, square red, and square red small. From this experience, LAS has generalized to the conclusion that the noun phrases of the language consist of a shape word, followed optionally by either a size or a color word, followed optionally by a size word. Thus the induced grammar includes phrases like square small small, because size words were found to be acceptable in both second and third positions. Interestingly, this mistake will not cause LAS any problems. It will never speak a phrase like square small small because it will never have a to-be-spoken structure with two smalls modifying an object. It will never hear such a phrase, and thus UNDERSTAND cannot make any mistakes. This is a nice example of how an over-general grammar can be successfully constrained by semantic acceptability.

The problem of learning to sequence noun modifiers has turned out to be a source of unexpected difficulty. In part, the ordering of modifiers is governed by pragmatic factors. For instance, one is likely to say small red square when referring to one of many red squares, but red small square when referring to one of many small squares. Differences like these could be controlled by the ordering of links in the HAM memory structure.

GENERALIZE

After taking in 14 sentences LAS has built up a partial network grammar that serves to generate many more sentences than those it originally encountered. However, note that LAS has constructed four copies of a noun phrase grammar. One would like it to recognize that those grammars are the same. The failure to do so with respect to this simple artificial language only amounts to an inelegance. However, the identification of identical networks is critical to inducing languages with recursive rules.
Figure 16. Additions to LAS's grammar after studying: 10. SQUARE BLUE SMALL TRIANGLE ...; 11. TRIANGLE RED SQUARE BLUE LEFT-OF; 12. TRIANGLE SMALL SQUARE RED ...; 13. SQUARE BLUE TRIANGLE BLUE ...; 14. SQUARE RED LARGE TRIANGLE RED LARGE BELOW. New word classes include D714 = small; D1045 = red, blue, small; D1117 = blue, red; E905 = small, large; E1395 = large.

To see why the identification of networks matters, consider a grammar with noun-phrase networks of the following form:

NP  -> the NOUN
NP  -> the ADJ NP1
NP1 -> NOUN1
NP1 -> ADJ1 NP2
NP2 -> NOUN2
NP2 -> ADJ2 NP3
NP3 -> NOUN3

That is, there are four networks, NP, NP1, NP2, and NP3, whose structure is indicated by the above rewrite rules. It is assumed that LAS has only experienced three consecutive adjectives and therefore SPEAKTEST has only created three embeddings. The critical inductive step for LAS is to recognize that NP1 = NP2. This requires recognizing the identity of the word classes NOUN1 and NOUN2 and of the word classes ADJ1 and ADJ2. This will be done on the criterion of the amount of overlap of the classes. It also requires recognition that network NP2 = NP3. Thus, to identify two networks may require that two other networks be identified. The network NP3 is only a subnetwork of NP2. So, in the recursive identification of networks, GENERALIZE will have to accept a subnetwork relation between one network like NP2 which contains another like NP3. The assumption is that with sufficient experience the embedded network would become filled out to be the same as the embedding network. After NP1 has been identified with NP2, LAS will have a new network structure where NP* represents the amalgamation of NP1, NP2, and NP3:

NP  -> the NOUN
NP  -> the ADJ NP*
NP* -> NOUN*
NP* -> ADJ* NP*

Note that new word classes NOUN* and ADJ* have been created as the union of the word classes NOUN1, NOUN2, NOUN3 and of the classes ADJ1, ADJ2, respectively.

GENERALIZE was called to ruminate over the networks generated after the first fourteen sentences. GENERALIZE succeeded in identifying A195 with A197. As a consequence, network A195 replaced network A197 at the position where it occurred in the START network (see Figure 12). Similarly, B566 was identified with and replaced network B564. Finally, B566 was identified with and replaced A195 throughout the START network. The final effective grammar is illustrated in Figure 17. It now handles all the sentences of the target language. Indeed, it handles more sentences than the grammar that was constructed after the fourteenth sentence. This is because the noun-phrase network B566 has been expanded to incorporate all possible noun phrases. Before the generalizations, none of B564, B566, A195, or A197 was complete.

Figure 17. The final grammar. Word classes: B568 = below, left-of; A199 = above, right-of; B593 = square, triangle; D1117 = blue, red, large, small; E905 = large, small.

LAS's induction procedures rest on two assumptions about the structure of natural languages. The first is the assumption of a correspondence between the surface structure of the language and the semantic structure. This is critical to BRACKET's identification of the surface structure of the sentence, which is, in turn, critical to the proper embedding of parsing networks. Second, there is the assumption of a semantics-induced equivalence of syntax. This played a critical role in the generalizations both of SPEAKTEST and of GENERALIZE.
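The word-class overlap criterion mentioned above can be sketched as follows; the proposal states only that overlap is the criterion, so the ratio and the threshold below are my assumptions.

;;; Sketch of GENERALIZE's merging criterion for word classes.

(defun overlap-ratio (class-a class-b)
  "Proportion of the smaller class's members shared with the other."
  (/ (length (intersection class-a class-b))
     (min (length class-a) (length class-b))))

(defun merge-if-similar (class-a class-b &optional (threshold 1/2))
  "Return the union of the classes when overlap reaches THRESHOLD, as
when NOUN1 and NOUN2 are identified; otherwise NIL."
  (when (>= (overlap-ratio class-a class-b) threshold)
    (union class-a class-b)))

;; (merge-if-similar '(square triangle) '(triangle circle))
;; => (SQUARE TRIANGLE CIRCLE)   ; element order may vary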
It was noted with respect to pluralization that such generalizations can be in error and that children also tend to make such errors. However, I would want to argue that, on the whole, natural language is not perverse. Therefore, most of these generalizations will turn out to be good decisions. Clearly, for languages to be learnable there must be some set of generalizations which are usually safe. The only question is whether LAS has captured the safe generalizations.

The importance of semantics to child language learning has been suggested in various ways recently by many theoreticians (e.g., Bloom, 1970; Bowerman, 1973; Brown, 1973; Schlesinger, 1971; and Sinclair-de Zwart, 1973), but there has been little offered in the way of concrete algorithms to make explicit the contribution of semantics. LAS.1 is a first small step toward making this contribution explicit.

Conclusion

This concludes the explanation of the algorithms used by LAS.1 for language induction. In many ways the task faced by LAS.1 is overly simplistic, and its algorithms are probably too efficient and free from information-processing limitations. Therefore, the acquisition behavior of LAS.1 does not mirror in most respects that of the child. Later versions of this program will attempt a more realistic simulation. Nonetheless, I think LAS.1 is a significant step forward. The following are the significant contributions embodied so far in LAS.1:

1. The transition network formalism has been interfaced with a set of simple and psychologically realistic long-term memory operations. In this way we have bridled the unlimited Turing-computable power of the augmented transition network.

2. A single grammatical formalism has been created for generation and understanding. Thus, LAS only needs to induce one set of grammatical rules.

3. Two important ways were identified in which a semantic referent helps grammar induction. These were stated as the graph deformation condition and the semantics-induced equivalence of syntax condition.

4. Algorithms have been developed that are adequate to learn natural-like languages describing a simple semantic domain.

The general mode of developing the program LAS is as follows. A language learning situation is specified by a set of conditions. In LAS.1 it was specified that LAS already know the meaning of the words and that it be given, as input, sentences with HAM representations of their meaning. The semantic domain was specified to be that constituted by geometric shapes. Once a set of conditions is specified, a set of goals is specified. In LAS.1 there was only one real goal: to learn any natural-like language that described the domain. Once a set of goals is specified, a plan of attack is sketched out. However, the problem is such that the details of that plan only evolve as we attempt to implement it. Indeed, many interesting problems and ideas in LAS.1 were discovered in attempting the implementation. This is the utility of computer simulation in theoretical development. The LAS.1 program operated in a task domain which is similar, but by no means identical, to that of the natural language learning situation. Its behavior was similar to that of a child learning a language, but again by no means identical. In the next two years I propose to create a program, LAS.2, which comes considerably closer to simulating natural language learning. It has a more elaborate set of goals than did LAS.1:

1. The program will incorporate realistic assumptions about short-term memory limitations and left-to-right sentence processing.

2. The program will learn the meanings of words.
3. The program should use semantic and contextual redundancy to partially replace the explicitly provided HAM encoding of pictures.

4. The program should handle sentences in a more complex semantic domain.

5. The program should be elaborated to handle such things as questions and commands as well as declarative sentences.

The general methods for achieving these goals in the LAS.2 program will be sketched out in the proposal section. Also in that section I will propose some experiments to evaluate the LAS program. While it is true that the task faced by LAS.1 is not really natural language learning, it still is a learning task at which human subjects apparently can succeed. The experiments will determine whether humans have the same difficulties in such tasks as does LAS and whether they make the same generalizations. However, I regard these experiments as of secondary importance relative to program development. It is more important to further articulate our understanding of what algorithms are adequate for natural language learning.

It is probably inevitable that the question will be raised whether it is really necessary to expend the considerable effort required to create a running program. Could not the model just be specified on paper? The reason why this is not possible has to do with the complexity of any theory that addresses the details of natural language. There is no other way to test the predictions of the theory or to assure that it is internally consistent. The experience with large transformational grammars of natural language is that they have hidden inconsistencies; these are only exposed by trying to simulate the grammars on a computer (e.g., Friedman, 1971). Consider the description given of LAS.1 in the preceding section. Although lacking in many details, it was complex and lengthy. Could the reader establish for himself from this description whether the model is really internally consistent? A computer program provides a proof of the consistency and a means of determining a model's behavior. The stated goals of this project are to develop explicit algorithms for natural language learning, specify the relevant details of these algorithms, and evaluate empirically the psychological viability of these algorithms. Without the use of computer simulation none of these goals could be achieved.

C. Methods of Procedure

First I will describe the proposed extension of the LAS program. Then I will describe some experimental tests. In reading the specific extensions proposed for LAS, the reader should keep in mind that they have as their intent achieving the goals set forth in the preceding section.

The Semantic Domain

The first matter to settle upon in the new program is some semantic domain. The LAS.1 world of shapes, properties, and geometric relations is too impoverished for further work. The following is proposed as a suggestion, although there is nothing critical about its exact form. It is critical, however, that some semantic domain be chosen. It is only when there is a specified domain that an explicit goal for success in the program can be specified. The program will be regarded as successful if it can learn any natural language describing this domain. I have chosen to look at a world close to that of a young child, although there is perhaps nothing sacred about this domain. This world is set forth in Table 5. There are three people in this world. In addition to these there are four categories of objects -- locations, containers, supporters, and toys. These objects can have four types of properties -- number, color, size, and quality.
Thus, LAS will have to deal seriously with problems of sequencing adjectives. It will also have to deal with number as a property of objects. The objects permit a much richer variety of relations than in the world of LAS.1. This will provide a demanding test for the learning of complex multi-argument relations. There can be sentences like Mommy traded Daddy the car for a ball. In this world, people, containers, supporters, and toys can be in locations. People can change their location and that of toys. People and toys can be on supporters; toys can be in containers. People can possess toys, containers, and supporters.

TABLE 5
Categories in the World of LAS.2

PEOPLE:     Mommy, Daddy, LAS
LOCATIONS:  bedroom, kitchen, den
CONTAINERS: box, closet, dresser
SUPPORTERS: table, chair, bed
TOYS:       dolly, car, ball
NUMBERS:    one, two, three
COLORS:     red, blue, green
SIZES:      big, medium, small
QUALITIES:  dirty, pretty, shiny

Thus the different categories of objects enter differently into different types of relations. This will prove important to the predictive parsing facilities that I will want to introduce into LAS.2.

Left-to-Right Processing

Children learn language auditorily. Thus, their induction algorithms must process incoming material in a left-to-right manner. The current LEARNMORE program does not do this. BRACKET completely processes the sentence before SPEAKTEST even begins to work on it. Clearly, BRACKET and SPEAKTEST should be integrated so that the beginning of the sentence is bracketed and processed by SPEAKTEST before the end of the sentence is considered by either. Introducing this left-to-right processing is a preliminary to introducing short-term memory limitations into the induction situation.

Figure 18 illustrates in highly schematic form the left-to-right algorithm proposed for LEARNMORE. Words are considered as they come in from the sentence. LEARNMORE, as in UNDERSTAND, tries to find a path through its network grammar to parse the sentence. The difference between LEARNMORE and UNDERSTAND is that LEARNMORE has available to it a HAM conceptual structure to enable it to better evaluate various parsing options. Suppose LEARNMORE is at some point in processing the sentence. It will also be at some point in a parsing network. Let us consider how it would process the next word. At box 2 it reads in the word. At box 3 it sets l to the various grammatical options (arcs) at that node in the network. Boxes 4 through 7 are concerned with evaluating whether any of these options can handle the current word. Box 4 checks whether there are any options left. Box 5 sets a to the first option and removes it from the remaining options. Box 6 checks whether the word would be parsed by option a, and box 7 considers whether the action associated with that arc corresponds to a HAM structure. If a passes the tests in 6 and 7, LEARNMORE advances to considering the next word. Otherwise it tries another arc. If it exhausts all arcs, it will call BUILDPATH (box 8) to build a new arc from the current node.

Figure 18. Flowchart of the left-to-right LEARNMORE program.

The work currently assigned to BRACKET will have to be assigned to box 7. That is, box 7 will have to determine when an arc should involve a push to an embedded network and when it should pop back up to an embedding network. This will be done by consulting the information in the semantic structure. It would also be possible to consult the pause structure of the sentence for information about phrase structure boundaries. Note that certain sentences which the old LEARNMORE system could handle will not be handled by this system. For instance, consider the sentence The square that is above the triangle is right-of the square. After the first two words it would not be clear whether this square was the subject or the object of the right-of relation, and so it would not be possible to assign an appropriate action to the path. In the old LEARNMORE the referent of square was resolved by bracketing the complete sentence before dealing with it. Presumably, however, children also cannot learn from such sentences.
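The loop of boxes 2 through 8 can be rendered as a sketch; ARCS-AT, PARSES-P, FITS-HAM-P, and BUILDPATH are hypothetical stand-ins for the operations the flowchart names.

;;; Trivial stand-ins so the sketch runs; the real tests consult the
;;; grammar and the HAM conceptual structure.
(defun arcs-at (node) (get node 'path))
(defun parses-p (arc word) (member word (first arc)))
(defun fits-ham-p (arc) (declare (ignore arc)) t)
(defun buildpath (word node) (list 'new-arc word node))

(defun learnmore-step (word node)
  "Process one incoming WORD at network NODE: try each arc; if none
fits both the word and the HAM structure, build a new arc (box 8)."
  (let ((options (arcs-at node)))            ; box 3: the options
    (loop
      (when (null options)                   ; box 4: any options left?
        (return (buildpath word node)))      ; box 8: build a new arc
      (let ((arc (pop options)))             ; box 5: take the next option
        (when (and (parses-p arc word)       ; box 6: does it parse the word?
                   (fits-ham-p arc))         ; box 7: action matches HAM?
          (return arc))))))                  ; advance to the next word

;; (setf (get 'n1 'path) '(((above right-of) ((relatify x5 x4)) stop)))
;; (learnmore-step 'above 'n1)  => the existing arc
;; (learnmore-step 'circle 'n1) => (NEW-ARC CIRCLE N1)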
Lexicalization

In this system it will not be assumed that LAS knows the meanings of the words. Rather, this will be something that LAS will have to learn from the pairing of sentences with conceptions. First let us discuss the learning of words whose reference is a simple concept or object, e.g., box or mommy, and postpone discussion of complex relational terms like trade. Logically, the task of lexicalization is quite simple, and it would not require complex algorithms to succeed. For instance, consider this algorithm: LAS is given a sentence with n1 words and a conceptualization it describes with m1 concepts. Store with each word the m1 concepts. The next sentence that comes in has n2 words and its conceptualization consists of m2 concepts. If a word in this sentence is new, store with it the m2 concepts. If the word is old, store with it the intersection of the concepts previously stored with it and the new m2 concepts. Eventually, ignoring problems of polysemy, a word will become pared down to zero or one concepts. Those with zero concepts are function words, and those with one concept have that concept as their meaning.

Of course, this algorithm will run into trouble if LAS does not always conceptualize all the concepts referred to by the sentence. This can be handled by having the algorithm wait for a sequence of disconfirming pieces of evidence before rejecting a hypothesized meaning. Incidentally, subjects behave just this way in concept attainment situations (see Bruner, Goodnow & Austin, 1956), not taking negative evidence as having its full logical force about the meaning of the word.

The basic problem with this algorithm is that it makes unreasonable assumptions about the information-processing capacities of humans. In pilot research of my own, I have found that adult subjects can learn the meanings of a number of words in a sentence simultaneously. However, they do suffer difficulties when there is high ambiguity about what a word means. Presumably, children would have even greater difficulties extracting word meanings from complex sentences. Broen (1972) and Ferguson, Peizer, & Weeks (1973) report that new items of vocabulary seem to be introduced through use in set sentence frames such as Where's ..., Here comes ..., There's ..., known as deictic phrases. The noun tends to be heavily stressed and repeated. The parent frequently points to help the child identify the referent.
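The logically simple intersection algorithm described above can be stated exactly; the hash-table bookkeeping below is a sketch of that algorithm, not of any actual LAS code.

;;; Cross-situational lexicalization by intersection.

(defvar *hypotheses* (make-hash-table)
  "Maps each word to the set of concepts still consistent with it.")

(defun observe (words concepts)
  "Pair a sentence (list of WORDS) with the CONCEPTS of the scene."
  (dolist (word words)
    (multiple-value-bind (old presentp) (gethash word *hypotheses*)
      (setf (gethash word *hypotheses*)
            (if presentp (intersection old concepts) concepts)))))

(defun meaning (word)
  "One remaining concept = the meaning; none = a function word."
  (let ((set (gethash word *hypotheses*)))
    (cond ((null set) 'function-word)
          ((null (cdr set)) (car set))
          (t 'undetermined))))

;; (observe '(the ball is red) '(ball red))
;; (observe '(the ball is big) '(ball big))
;; (observe '(the box  is red) '(box red))
;; (meaning 'ball) => BALL    (meaning 'red) => RED
;; (meaning 'the)  => FUNCTION-WORD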
Thus, the program, much as adults appear to, will exploit contrasts between a known grammatical pattern and an unknown word. If the program knows the grammatical rule NP → determiner adjective noun, then when it encounters the phrase the glick box it will suppose that glick refers to some property of the box. Thus, the program will have to acquire its initial vocabulary by means of simple frames, as do young children. With this initial vocabulary information, it can begin to learn grammatical rules. Once in possession of grammatical rules, it will no longer need simple frames to learn new lexical items. One interesting question is how function words are ever identified as non-meaning-bearing in this scheme. Presumably, this is done on the basis of failing to find any semantic feature that survives after repeated hypothesized meanings for the word have been disconfirmed.

So far I have assumed that all concepts are constructed before language acquisition takes place and that the only problem is to link up these concepts with words. But this is very unrealistic. Consider the verb give in the sentence Mommy gives the dolly to Daddy. The meaning of give is something like causing one person to cease to possess an object and someone else to come to possess that object. It seems very implausible that a child comes to the learning situation with such a concept ready made. What probably happens is that he sees Mommy pushing the doll to Daddy or Mommy handing the ball to baby. With these experiences he hears sentences like Mommy gives the dolly to Daddy or Mommy gives the ball to baby. From these examples he induces the appropriate meaning of give. Concept attainment in these situations can be achieved by using the sort of concept identification used by Winston (1970) for inducing geometric concepts. That is, each use of the word give is paired with a HAM network structure giving the meaning of the sentence. Winston's heuristics allow us to extract what these network structures have in common. The concept give, as verb, is then attached to this common structure. For this sort of algorithm to succeed, LAS must be set to regard certain configurations of propositions, interlinked by causal terms, as being associated with a single relational term in the language. Note also that the meaning of a complex relational term like give can be represented by the same sort of HAM structure that UNDERSTAND sets up for a sentence.

4. Telegraphic Speech

A well-known fact about language development is that the child first speaks in one-word utterances, then in two and three word utterances. In these early constructions it appears that children omit most function words. One explanation of the origin of telegraphic speech that is appealing from the point of view of LAS is the following: Suppose that LAS did not receive as input to its learning routines complete sentences but rather telegraphic sentences. It would naturally induce a telegraphic grammar. It seems reasonable that the young child cannot process the total sentence he hears. If so, then his learning routines would receive telegraphic sentences as their basic input. Evidence for this hypothesis comes from studies of child imitation of adult speech. These imitations, while longer than the child's own spontaneous utterances, are also telegraphic in nature (e.g., Brown & Fraser, 1963). Blasdell and Jensen (1970) found that children tend to repeat those words which are stressed and words which occur in terminal positions. The same words tend to be stressed in adult speech. Scholes (1969, 1970) found that children tended to omit words that had unclear semantic correlates. What I find striking is that these are just the variables known to govern immediate memory for a sentence: stress, meaningfulness, and serial position. Of course, these are well established effects in experiments on immediate memory. I propose to introduce telegraphic input through an aspect of LEARNMORE called BADEAR. The BADEAR program will simulate the effects of stress, meaningfulness, and serial position in providing LAS with a depleted version of the sentence.
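A minimal sketch of how BADEAR's depletion might operate, anticipating the three omission factors detailed in the next paragraph (lack of stress, unknown meaning, and a capacity limit on new words). The Word record and the badear function are my illustrative inventions, and the way factors (b) and (c) are combined into a single capacity test is an assumption made for brevity.

    from collections import namedtuple

    # A word as heard: its form, whether it carried stress, and whether
    # LAS already knows its meaning.
    Word = namedtuple("Word", ["form", "stressed", "meaning_known"])

    def badear(sentence, capacity=2):
        """Return the depleted version of the sentence passed on to LAS.
        capacity is the critical number of new words (one or two, per the
        estimate below); because it fills left to right, early new words
        are favored, which yields the primacy effect discussed in the text."""
        retained, new_words = [], 0
        for word in sentence:
            if not word.stressed:           # factor (a): unstressed words slip away
                continue
            if not word.meaning_known:      # factors (b) and (c), combined here:
                if new_words >= capacity:   # only a critical number of unknown
                    continue                # words survive to reach BUILDPATH
                new_words += 1
            retained.append(word.form)
        return retained

    heard = [Word("where's", False, True), Word("the", False, True),
             Word("glick", True, False), Word("dolly", True, True)]
    print(badear(heard))                    # ['glick', 'dolly']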
The locus of the effect of BADEAR will be between boxes 4 and 8 in the flowchart of Figure 18. Basically it will not pass all words on to BUILDPATH. Rather, some words will "slip from consciousness" after failing to be parsed. It will tend to omit words when (a) they are unstressed, (b) their meaning is not known, or (c) a critical number of new words in the sentence have already been passed to BUILDPATH. I suspect this critical number is something like one or two. Factors (a) and (b) would generate the effects of stress and meaningfulness. Factor (c) would yield good memory for the first words of the sentence. The good memory children do show for the last words of a sentence may be attributed to short-term acoustic memory.

An interesting feature of this proposal is that, as LAS's processing capacity expanded, it would be able to receive more of the sentence. Its constructions and imitations would grow as does a child's. This would be an explicit mechanism for an idea suggested by Braine (1971) and Olson (1973).

Inducing a grammar from degenerate sentences presents an interesting problem: How is it that the child ever abandons his rules for generating telegraphic speech? As his capacity grows he merely receives fuller sentences; a fuller sentence does not logically disconfirm the telegraphic rules, which remain workable if less adequate means for expressing the same thoughts. Some mechanism must be incorporated that will strengthen some rules relative to others. Rules would be strengthened when they are successfully used in understanding and generation and weakened when they fail. The arcs out of a node of a parsing network would be ordered in a stack according to their relative strengths. Ineffective arcs, such as the original rules for one and two word utterances, would descend to the bottom of the stack and so become unavailable. This strength mechanism is the same as that used to order links in the HAM memory model. This is a different way to bring disconfirming information to bear in grammar induction: rather than seeking explicit disconfirmation of rules, the rules are crowded out of existence as more adequate rules take over the role they used to occupy in sentence understanding and generation.

Another needed mechanism is optimization of the grammar. Suppose LAS had acquired a network grammar in which, after the initial NP, the network branches according to the relation expressed, with a separate NP arc along each branch. This grammar requires considerable backup if the sentence does not have an RA relation. As suggested earlier, it would be more efficient if LAS were given the power to transform the grammar into a form in which the NP arcs are merged and the relation is tested only after the second NP has been parsed. Given that there are serious time problems (see the introduction of the proposal) in parsing, it is critical that methods be incorporated in the learning program for optimizing the grammar. The merging of arcs, besides making the grammar more efficient, would be another form of generalization. It could be used to further merge and build up word classes.

There are two further ways that semantics can be used to aid language induction. First, semantics can aid the merging of word classes. For instance, two word classes might each contain color names. Currently LAS merges word classes according to the amount of overlap between their memberships; it could also merge them when their members share a semantic property. The second use of semantics would be to lessen the ambiguity of interpretations of sentences. It should sometimes be possible to use the conceptual constraints of the domain to choose among interpretations. For instance, suppose a sentence can receive two parsings. Because of the conceptual constraints on which objects can enter into which relations, LAS may be able to guess the intended connection. These conceptual constraints of the domain could also be used by UNDERSTAND to predict interpretations, on the model of Schank's (1972) system. That is, instead of handling a sentence purely by use of syntactic information, the system uses conceptual constraints to predict what the interpretation should be. The prediction can then be checked for syntactic consistency with the grammar. It would be profitable to try to realize a predictive parsing system like Schank's within the rigors of the network grammar formalism.
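Returning to the strength mechanism proposed above for crowding out telegraphic rules: the sketch below shows one way the arcs out of a node might be kept on a strength-ordered stack. The class name, method names, numerical increments, and depth cutoff are my own illustrative choices, not the values used in HAM.

    class StrengthStack:
        """Arcs out of one node, ordered by strength."""
        def __init__(self):
            self.arcs = []                  # list of [strength, arc] pairs

        def add(self, arc, strength=1.0):
            self.arcs.append([strength, arc])
            self._reorder()

        def succeed(self, arc):             # arc used successfully
            self._bump(arc, +1.0)

        def fail(self, arc):                # arc tried and failed
            self._bump(arc, -0.5)

        def _bump(self, arc, delta):
            for entry in self.arcs:
                if entry[1] is arc:
                    entry[0] += delta
            self._reorder()

        def available(self, depth=3):
            """Only the top few arcs are ever tried; rules for one- and
            two-word utterances eventually sink below this depth."""
            return [arc for _, arc in self.arcs[:depth]]

        def _reorder(self):
            self.arcs.sort(key=lambda entry: -entry[0])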
A Procedural Semantics

So far LAS has been principally concerned with representing the meaning conveyed by a declarative sentence. However, language has other purposes than just to communicate meanings from one speaker to another; consider commands and questions. For instance, consider the sentence Put the dolly in the box. Currently, UNDERSTAND might retrieve this sentence's meaning as the proposition that LAS put the dolly in the box. This is the declarative meaning of the sentence. However, in addition LAS should evoke an action, or at least an action to decide whether to comply. This is the procedural meaning of the sentence. The procedural meaning of declaratives is very simple: store this sentence. This is already part of LAS's treatment of the sentence. However, the procedural meanings underlying other types of sentences are more complex. A large part of the success of Winograd's system is that it was adequately able to deal with the procedural aspects of various sentences' semantics. It is important that LAS begin to deal with these too.

What this would mean, in terms of LAS's network grammars, is enriching the set of actions that can be stored on the arcs. Currently, the only actions are ones that result in the creation of pieces of HAM structure, i.e., declarative knowledge. LAS will have to have other internal actions that specify what it does with the declarative knowledge. These will include commands to answer the question or obey the order. HAM already has commands that direct it to answer a question, but executing orders would be something new. As part of the HAM project, I am working on methods for incorporating procedural knowledge into a network system. It is unclear yet what success I will have here.

There are also aspects of language whose semantics are procedural in nature. Consider, for instance, the definite article the, which signals the listener that the object referred to should be one he can identify. This procedural character is particularly clear for a word like you, whose reference varies with speaker and context. Since the referent of you completely changes with speaker, a child would be lost if he tried to associate its meaning with some fixed HAM memory node. He must be prepared to treat it as having as meaning a procedure for determining the referent.

Provided that LAS has the facilities for representing and evaluating procedures, there seem no difficulties in learning those aspects of language which are heavily imbued with procedural semantics. Language learning will continue to arise from pairing sentences with semantic interpretations. However, semantic interpretations will now contain a procedural as well as a declarative aspect. Again language learning will consist of learning mappings between sentences and the now-enriched semantic representations.
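The enrichment amounts to attaching a procedural action to each sentence type. A schematic rendering follows, with an invented Memory class standing in for the HAM data base; its method names are assumptions for illustration, not actual HAM commands.

    class Memory:
        """Stand-in for the HAM data base; all methods are illustrative."""
        def __init__(self):
            self.facts = []
        def store(self, structure):
            self.facts.append(structure)
        def answer(self, query):
            return [f for f in self.facts if query.items() <= f.items()]
        def willing(self, order):
            return True            # placeholder for deciding whether to comply
        def comply(self, order):
            print("executing:", order)

    def execute(sentence_type, structure, memory):
        """Dispatch on the procedural meaning of the sentence type."""
        if sentence_type == "declarative":
            memory.store(structure)          # procedural meaning: store it
        elif sentence_type == "question":
            return memory.answer(structure)  # HAM can already answer questions
        elif sentence_type == "command":
            if memory.willing(structure):    # executing orders would be new
                memory.comply(structure)

    m = Memory()
    execute("declarative", {"rel": "in", "obj": "dolly", "loc": "box"}, m)
    print(execute("question", {"obj": "dolly"}, m))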
Experimentation

As stated before, I do not think that experimentation should be the principal focus of the project; there is still much further research that needs to be done in the way of specifying algorithms that are capable of language induction. Nonetheless, in parallel with this research, I would like to perform experiments to get some initial assessments of the viability of the proposed algorithms. The type of information relevant to evaluating LAS can only be acquired by looking at artificial languages. With these artificial languages it is possible to test LAS's predictions about language learnability and generalization, though it is too early to fix what the full program of experimental research should be.

Criticisms of Experiments with Artificial Languages

For ethical reasons it is not possible to expose young children, just learning their first language, to an artificial language which LAS had identified as degenerate and probably not learnable. This means that all experimentation with artificial languages must be done on older children already well established in their first language or on adults. Consequently, the first language may be mediating acquisition of the second language. There is evidence (see Lenneberg, 1967) that there is a critical initial period during which languages can be learned much more successfully than in later years. Lenneberg speculates that there is a physiological basis for this critical period. Thus, one might wonder whether the same processes are being studied with older subjects as in the young child. Personally, I doubt that the mechanisms of language acquisition are entirely the same with the young child in first language learning as with the older subject in second language learning. However, it does not seem that the differences are so great as to make experiments with older subjects uninformative.

Other criticisms (e.g., those of Slobin, 1971; Miller, 1967) of studies with artificial languages rest on the fact that these languages are much less complicated than a natural language and that an artificial laboratory language does not serve the complex functions of natural speech; thus one cannot safely generalize from laboratory phenomena to natural language acquisition. Another valid criticism of those studies is that the languages lacked a semantic referent. Clearly, this restricts the class of algorithms a subject can employ. The heuristics used by LAS would be useless without semantics. Moeser and Bregman (1972, 1973) have shown that the existence of a semantic referent has a huge effect on language acquisition. Except for control conditions, my experiments will involve a semantic referent.

Language Learnability

A critical assumption of the LAS induction algorithm is the graph deformation condition, which constrains the relation between the surface structure of the sentence and the underlying semantic structure. That is, the surface structure must preserve the original connectivity of concepts. In Section A5 we described languages which violated this assumption. Consider the following language:

    S → NP NP RELATION
    NP → NOUN (COLOR) (SIZE) (CLAUSE)
    CLAUSE → te NP RELATION
    NOUN → square, circle, triangle, diamond
    COLOR → red, blue
    SIZE → big, small
    RELATION → above, below, right-of, left-of

This is an expanded version of GRAMMAR1 described in Table 1. (The element te serves the function of a relative pronoun like that.) An example of a sentence in this language is Square red te triangle big above circle blue small right-of. The experiment I will do compares four conditions of learning for this language:

(1) No reference. Here subjects simply study strings of the language, trying to infer their grammatical structure.

(2) Bad semantics. Here a picture of the sentence's referent will be presented along with the sentence. However, the relationship between the sentence's semantic referent and the surface structure will violate LAS's constraints. The adjectives associated with the ith noun phrase will modify the (n + 1 - i)th shape in the sentence (where n is the number of noun phrases). For example, the adjectives associated with the first noun phrase will modify the last shape. Similarly, the relations will be permuted, so that the relation stated between the first pair of shapes will actually hold of the last pair. The picture for the example sentence is given in Figure 19a.

(3) Good semantics. Here the picture will correctly depict the sentence's referent, so that the relation between the semantic referent and the surface structure satisfies LAS's constraints (Figure 19b).

(4) Good semantics plus main proposition.

Figure 19. Different semantic referents for the sentence Square red te triangle big above circle blue small right-of.
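To convey the feel of the experimental materials, here is a sketch of a sentence generator for this artificial language. The expansion probabilities and the depth limit on relative clauses are my own arbitrary choices, not part of the grammar.

    import random

    NOUNS = ["square", "circle", "triangle", "diamond"]
    COLORS = ["red", "blue"]
    SIZES = ["big", "small"]
    RELATIONS = ["above", "below", "right-of", "left-of"]

    def noun_phrase(depth):
        """NP -> NOUN (COLOR) (SIZE) (CLAUSE), options taken at random."""
        words = [random.choice(NOUNS)]
        if random.random() < 0.5:
            words.append(random.choice(COLORS))
        if random.random() < 0.5:
            words.append(random.choice(SIZES))
        if depth > 0 and random.random() < 0.3:   # CLAUSE -> te NP RELATION
            words += ["te"] + noun_phrase(depth - 1) + [random.choice(RELATIONS)]
        return words

    def sentence():
        """S -> NP NP RELATION."""
        return " ".join(noun_phrase(1) + noun_phrase(1) + [random.choice(RELATIONS)])

    print(sentence())
    # e.g. "square red te triangle big above circle blue small right-of"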
The picture in this condition will be the same as in 3 but the two shapes in the main proposition will be highlighted (Figure 19c). In this condition LAS would be guaranteed of successfully bracketing the sentence because the main proposition is given.

In some ways this experiment is like Moeser and Bregman's. However, here English words are used so that the subjects do not need to induce the language's lexicon, only its grammar. This corresponds to the situation faced by LAS, which is given the meanings of the content words. If the English words were replaced by nonsense syllables, subjects would have to induce word meanings as well. Using English words is a simplification of the language-learning situation intended to make induction tractable. The predictions of LAS are, of course, that best learning occurs in Condition 4, next best in 3, and failure of any learning in 1 and 2. It would not be surprising to see subjects perform better in 1 than in 2, since in 2 they might persist in trying to relate the misleading pictures to the sentences.

The procedure would have subjects in all conditions study the same sequence of sentences but vary the accompanying semantic information according to condition. After a study phase they would be tested for grammaticality judgments about a set of sentences, some of which violate one of the rules for generation. Since the syntax of the language is the same in all four conditions, the same sentences will be grammatical in all four conditions. Even though the syntactic information given during study will be the same in all conditions, marked differences in syntactic knowledge should appear across conditions. I will alternate sequences of study trials with sequences of test trials. A subject would study six sentences, with the semantic information appropriate to his condition (if any). Then he would see six test pairs, one sentence of each pair violating some syntactic rule. For each pair of sentences he would choose the grammatically correct one. By frequently alternating study and test, it would be possible to carefully monitor the growth of syntactic information in the various conditions.

Many readers may not be surprised by the prediction of better learning in Conditions 3 and 4. Hopefully, the significance of such an outcome would be clear. It would show that semantics is important to induction of the structure of a natural language. However, it would also show that semantics is useless if the relation between the semantic referent and the syntactic structure is arbitrary. The surface structure of the sentence must be a graph deformation of the underlying semantic structure. Failure to appreciate the contribution of semantics to language induction, and failure to understand the nature of this contribution, have been fundamental to the stagnation of attempts to understand the algorithms permitting language induction. These facts may be obvious when pointed out, but they have been unavailable to the linguistic theorists for fifteen years.

The constraints embodied in LAS serve the same purpose as the linguistic universals proposed by Chomsky. That is, they restrict the class of possible languages so that the target language can be identified. However, the constraints embodied by LAS are not the same as those suggested by Chomsky. For instance, Chomsky proposed that transformations which reversed the order of words in a sentence would be unacceptable. This is because such a rule does not refer to the sentence's constituent structure. However, a language which contained sentences of a natural language and their reversals would be learnable by LAS. It would just develop one set of rules for sentences in one order and another independent set for reverse-order sentences.
It would be interesting to see whether human subjects could learn such a language. In the example of the induction of GRAMMAR1 we found that it was hard for LAS to detect non-semantic contingencies between syntactic choices in the first noun phrase and in the second noun phrase pushed to in the main network. For instance, it is possible that a morphemic embellishment of the adjectives in the second noun phrase may depend on the choice of morphemic embellishment of the noun in the first noun phrase. Human subjects should also find it hard to detect such syntactic contingencies.

There is another set of predictions, besides those concerned with language learnability, which it will be useful to explore. LAS makes predictions about the situations under which humans will tend to generalize rules and when they will not. Suppose LAS learned the following grammar:

    S → VERB NP NP
    NP → (PREPP) N1 (ADJ)
    PREPP → PREP N2
    N1 → boy, girl, etc.
    N2 → room, bank, etc.
    ADJ → tall, nice, etc.
    PREP → in, near, etc.
    VERB → like, hit, etc.

A typical sentence in this language would be Like in room boy tall girl nice, which means The tall boy in the room likes the nice girl. This language is given English terms only to make its semantics clearer. Suppose, in fact, words in the language were das meaning man, jir meaning woman, fos meaning boy, and tuk meaning girl. Suppose the subject studies the following pair of sentences:

    1. Like das tuk.
    2. Like fos jir.

Then it is interesting to consider his judgments of the acceptability of sentences like:

    3. Like das tuk.
    4. Like das jir.
    5. Like jir das.

Accepting (3) involves only recalling sentence (1), but accepting (4) or (5) involves generalization. LAS would currently generalize to accept (4), merging das with fos and tuk with jir into word classes partly because of their semantic similarity. It would not accept (5), which puts the words into the opposite slots; the words could, for example, require a different inflection when they appear in first position than when they appear in second position in this artificial language.

Suppose next that the subject had studied sentences in which only the first noun phrase was expanded, as in (6). Would he accept sentences like (7), in which the second noun phrase is expanded?

    6. Like in room boy tall girl
    7. Like girl in room boy tall

That is, will rules generalize from the subject noun phrase to the object noun phrase? As LAS is currently constituted, such generalizations would not occur until it had built up fairly stable noun phrases.

Again, suppose LAS had initially only encountered simple sentences such as (8):

    8. Like boy woman

From sentences such as (8) LAS would learn the class of nouns that go in the first and second noun phrase slots. Suppose then sentence (9) was studied. On the basis of it, would sentence (10) be accepted as grammatical? That is, would the prepositional phrase in bank generalize to other nouns in the same class as woman?

    9. Like boy in bank woman
    10. Like girl in bank man

This would be an example of right generalization, which does not occur in LAS. In contrast, LAS does perform left generalization. That is, after studying (11) LAS would accept (12):

    11. Like boy woman nice
    12. Like boy man nice
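The predicted pattern of judgments for sentences (3) through (5) can be made concrete with a toy model of positional word classes. The representation below is an illustrative simplification of LAS's word-class machinery; the semantic similarity that also drives merging in LAS is left implicit.

    # Word classes induced from the studied sentences (1) and (2):
    # words are grouped by the slot they occupied.
    studied = [("like", "das", "tuk"), ("like", "fos", "jir")]

    first_slot = {s[1] for s in studied}      # das, fos
    second_slot = {s[2] for s in studied}     # tuk, jir

    def acceptable(sentence):
        """Judge a three-word sentence against the induced classes."""
        verb, first, second = sentence
        return verb == "like" and first in first_slot and second in second_slot

    print(acceptable(("like", "das", "tuk")))  # (3) True:  recalled directly
    print(acceptable(("like", "das", "jir")))  # (4) True:  within-class generalization
    print(acceptable(("like", "jir", "das")))  # (5) False: the classes do not swap slots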
The LAS project serves two purposes, one concerned with psychology and one concerned with artificial intelligence. I think this mixed purpose is fruitful, permitting cross-fertilization of ideas from the two fields. There is no guarantee that LAS, in the end, will ever achieve the goal of an adequate simulation of the acquisition of language. However, a certain outcome is a better understanding of the information-processing demands of language acquisition and of the role of a semantic referent in grammar induction. At the least, we will learn what is wrong with one explicit set of proposals. Even that would be a significant contribution to the current state of theory development in a field rich in data but almost totally lacking explicit information-processing theories.

I hope, of course, that the processes uncovered in the LAS project will be the same as those used by humans in language learning. A successful simulation program would constitute an enormous advance in our understanding of cognitive development. The contributions of LAS to the artificial intelligence field are less certain and more distant. Nonetheless, generality in language understanding systems is an important goal and one for which a learning system approach seems ideal. It is therefore important to understand the contribution language learning systems can make in this field. It would be a significant advance to know in detail why a learning system approach was not the answer to language understanding, or at least why LAS was not the right sort of learning system. Of course, if LAS does prove to be the basis for a viable language understanding system, its contribution to artificial intelligence will also be of considerable importance.

Facilities Available

I shall have available the entire facilities of the Human Performance Center, University of Michigan. My current computing arrangement can be extended for one to three years. My programs run under the Michigan Terminal System, which supports a rich variety of programs. Most of the programming will be performed in Michigan LISP (Hafner & Wilcox, 1974), which is a relatively economical and error-free LISP.

References

ALPAC. Language and machines: Computers in translation and linguistics. Washington: National Academy of Sciences, 1966.

Anderson, J. R. Computer simulation of a language-acquisition system. In R. L. Solso (Ed.), Information processing and cognition: The Loyola Symposium. Washington: Lawrence Erlbaum, 1975.

Anderson, J. R. and Bower, G. H. Human associative memory. Washington: Winston and Sons, 1973.

Bar-Hillel, Y. Language and information. Reading, Mass.: Addison-Wesley, 1964.

Bever, T. G. The cognitive basis for linguistic structures. In J. R. Hayes (Ed.), Cognition and the development of language. New York: Wiley, 1970.

Biermann, A. W. An interactive finite-state language learner. First USA-Japan Computer Conference, 1972.

Biermann, A. W. and Feldman, J. A. A survey of results in grammatical inference. In Frontiers of pattern recognition. New York: Academic Press, 1972.

Blasdell, R. and Jensen, P. Stress and word position as determinants of imitation in first-language learners. Journal of Speech and Hearing Research, 1970, 13, 193-202.

Bloom, L. One word at a time. The Hague: Mouton, 1973.

Bobrow, D. G. A question-answering system for high school algebra word problems. AFIPS Conference Proceedings, 1964, 26, 577-589.

Bowerman, M. Early syntactic development. Cambridge, England: Cambridge University Press, 1973.

Boyer, R. S. Locking: A restriction of resolution. Ph.D. dissertation, University of Texas at Austin, 1971.

Braine, M. D. S. On learning the grammatical order of words. Psychological Review, 1963, 70, 323-348.

Braine, M. D. S. On two types of models of the internalization of grammars. In D. I. Slobin (Ed.), The ontogenesis of grammar. New York: Academic Press, 1971, 153-188.

Broen, P. The verbal environment of the language-learning child. Monographs of the American Speech & Hearing Association, 1972, 17.

Brown, R. A first language. Cambridge, Mass.: Harvard University Press, 1973.

Brown, R. and Fraser, C. The acquisition of syntax. In C. N. Cofer & B. S. Musgrave (Eds.), Verbal behavior and learning: Problems and processes. New York: McGraw-Hill, 1963.

Bruner, J. S., Goodnow, J., and Austin, G. A. A study of thinking. New York: Wiley, 1956.
Charniak, E. Computer solution of word problems. Proceedings of the International Joint Conference on Artificial Intelligence. Washington, D.C., 1969, 303-316.

Chomsky, N. Aspects of the theory of syntax. Cambridge, Mass.: MIT Press, 1965.

Clark, E. V. Non-linguistic strategies and the acquisition of word meanings. Cognition: International Journal of Cognitive Psychology, 1974, in press.

Colby, K. M. and Enea, H. Inductive inference by intelligent machines. Scientia, 103, 669-720 (Jan.-Feb. 1968).

Coles, L. S. Talking with a robot in English. Proceedings of the International Joint Conference on Artificial Intelligence. Washington, D.C., 1969, 587-596.

Crespi-Reghizzi, S. The mechanical acquisition of precedence grammars. Report No. UCLA-ENG-7054, School of Engineering and Applied Science, University of California at Los Angeles, 1970.

Dreyfus, H. L. What computers can't do. New York: Harper and Row, 1972.

Ervin, S. M. Imitation and structural change in children's language. In E. H. Lenneberg (Ed.), New directions in the study of language. Cambridge, Mass.: MIT Press, 1964, 163-189.

Feldman, J. A. Some decidability results on grammatical inference and complexity. A.I. Memo No. 93.1, Computer Science Department, Stanford University, 1970.

Ferguson, C. A., Peizer, D. B., & Weeks, T. E. Model-and-replica phonological grammar of a child's first words. Lingua, 1973, 31, 35-55.

Fernald, C. Children's active and passive knowledge of syntax. Paper presented to the Midwestern Psychological Association, 1970.

Fikes, R. E., Hart, P. E. and Nilsson, N. J. Some new directions in robot problem solving. Stanford Research Institute, August, 1972.

Fillmore, C. J. The case for case. In E. Bach and R. T. Harms (Eds.), Universals in linguistic theory. New York: Holt, Rinehart and Winston, 1968.

Fraser, C., Bellugi, U., and Brown, R. Control of grammar in imitation, comprehension, and production. Journal of Verbal Learning and Verbal Behavior, 1963, 2, 121-135.

Friedman, J. A computer model of transformational grammar. New York: American Elsevier, 1971.

Green, C. and Raphael, B. Research on intelligent question answering systems. Proceedings of the ACM 23rd National Conference. Princeton, 1968, 169-181.

Hafner, C. and Wilcox, B. LISP/MTS programmer's manual. Mental Health Research Institute Communication 302, University of Michigan, 1974.

Horning, J. J. A study of grammatical inference. Technical Report No. CS 139, Computer Science Department, Stanford University, August, 1969.

Kelley, K. L. Early syntactic acquisition. Report P-3719, The Rand Corporation, Santa Monica, California, 1967.

Kellogg, C. H. A natural language compiler for on-line data management. Proceedings of the 1968 Fall Joint Computer Conference, 473-492.

Kuno, S. The predictive analyzer and a path elimination technique. Communications of the ACM, 1965, 8, 453-462.

Lenneberg, E. H. Understanding language without ability to speak: A case report. Journal of Abnormal and Social Psychology, 1962, 65, 419-425.

Lenneberg, E. H. Biological foundations of language. New York: Wiley, 1967.

Lindsay, R. K. Inferential memory as a basis of machines which understand natural language. In E. A. Feigenbaum and J. Feldman (Eds.), Computers and thought. New York: McGraw-Hill, 1963.

Loveland, D. W. A linear format for resolution. Proceedings of the IRIA Symposium on Automatic Demonstration. New York: Springer-Verlag, 1970, 147-162.

Luckham, D. Refinements in resolution theory. Proceedings of the IRIA Symposium on Automatic Demonstration. New York: Springer-Verlag, 1970, 163-190.
Miller, G. A. The psychology of communication. New York: Basic Books, 1967.

Minsky, M. (Ed.), Semantic information processing. Cambridge, Mass.: MIT Press, 1968.

Moeser, S. D. and Bregman, A. S. The role of reference in the acquisition of a miniature artificial language. Journal of Verbal Learning and Verbal Behavior, 1972, 11, 759-769.

Moore, E. F. Gedanken experiments on sequential machines. Automata Studies, Princeton, 1956.

Olson, G. M. Developmental changes in memory and the acquisition of language. In T. E. Moore (Ed.), Cognitive development and the acquisition of language. New York: Academic Press, 1973.

Pao, T. W. L. A solution of the syntactic induction-inference problem for a non-trivial subset of context-free languages. Report No. 70-19, The Moore School of Electrical Engineering, University of Pennsylvania, 1969.

Quillian, M. R. The teachable language comprehender. Communications of the Association for Computing Machinery, 1969, 12, 459-476.

Reber, A. S. Transfer of syntactic structure in synthetic languages. Journal of Experimental Psychology, 1969, 81, 115-119.

Richards, I. A., Jasuilko, E. and Gibson, C. Russian through pictures, Book I. New York: Washington Square Press, 1961.

Robinson, J. A. A machine-oriented logic based on the resolution principle. Journal of the ACM, 1965, 12, 23-41.

Rumelhart, D. E., Lindsay, P. and Norman, D. A. A process model for long-term memory. In E. Tulving and W. Donaldson (Eds.), Organization of memory. New York: Academic Press, 1972.

Saporta, S., Blumenthal, A. L., Lackowski, P. and Reiff, D. G. Grammatical models of language learning. In R. J. DiPietro (Ed.), Monograph Series on Languages and Linguistics, Vol. 16. Report of the 14th Annual Round Table Meeting on Linguistics and Language Studies, 1963, 133-142.

Schank, R. C. Conceptual dependency: A theory of natural-language understanding. Cognitive Psychology, 1972, 3, 552-631.

Schlesinger, I. M. Production of utterances and language acquisition. In D. I. Slobin (Ed.), The ontogenesis of grammar. New York: Academic Press, 1971.

Scholes, R. J. The role of grammaticality in the imitation of word strings by children and adults. Journal of Verbal Learning and Verbal Behavior, 1969, 8, 225-228.

Scholes, R. J. On functors and contentives in children's imitations of word strings. Journal of Verbal Learning and Verbal Behavior, 1970, 9, 167-170.

Schwarcz, R. M., Burger, J. F. and Simmons, R. F. A deductive question-answerer for natural language inference. Communications of the Association for Computing Machinery, 1970, 13, 167-183.

Shamir, E. and Bar-Hillel, Y. Review 2476. Computing Reviews, 1962, 3, 5.

Siklossy, L. A language-learning heuristic program. Cognitive Psychology, 1971, 2, 479-495.

Simmons, R. F. Natural language question-answering systems: 1969. Communications of the Association for Computing Machinery, 1970, 13, 15-30.

Simmons, R. F. Semantic networks: Their computation and use for understanding English sentences. In R. C. Schank and K. M. Colby (Eds.), Computer models of thought and language. San Francisco: Freeman, 1973.

Simon, H. A. The sciences of the artificial. Cambridge, Mass.: MIT Press, 1969.

Sinclair-de Zwart, H. Language acquisition and cognitive development. In T. E. Moore (Ed.), Cognitive development and the acquisition of language. New York: Academic Press, 1973.

Slagle, J. R. Experiments with a deductive question-answering program. Communications of the Association for Computing Machinery, 1965, 8, 792-798.

Slagle, J. R. Automatic theorem proving with renamable and semantic resolution. Journal of the ACM, 1967, 14, 687-697.

Slobin, D. I. (Ed.), The ontogenesis of grammar. New York: Academic Press, 1971.

Solomonoff, R. J. A formal theory of inductive inference, Part II. Information and Control, 1964, 7, 224-254.
Weizenbaum, J. ELIZA - A computer program for the study of natural language communication between man and machine. Communications of the ACM, 1966, 9, 36-45.

Wilks, Y. The Stanford MT and understanding project. In R. C. Schank and K. M. Colby (Eds.), Computer models of thought and language. San Francisco: Freeman, 1973.

Winograd, T. Understanding natural language. Cognitive Psychology, 1972, 3, 1-191.

Winston, P. H. Learning structural descriptions from examples. MIT Artificial Intelligence Laboratory Project AI-TR-231, 1970.

Woods, W. A. Procedural semantics for a question-answering machine. Proceedings of the 1968 Fall Joint Computer Conference, 457-471.

Woods, W. A. Progress in natural language understanding: An application to lunar geology. AFIPS Proceedings, 1973 National Computer Conference and Exposition.

Woods, W. A. Transition network grammars for natural language analysis. Communications of the ACM, 1970, 13, 591-606.