RESEARCH PROPOSAL

BY: Division of Engineering Research
    MICHIGAN STATE UNIVERSITY
    East Lansing, Michigan 48824
    Telephone: 517-355-5103

TO: SUMEX-AIM Advisory Group
    SUMEX-AIM Computer Project
    c/o Department of Genetics, S047
    Stanford University Medical Center
    Stanford, California 94305
    ATTN: Elliott Levinthal, Ph.D.
          AIM User Liaison

TITLE: In Support of the Design of Intelligent Speech Prostheses

Principal Investigators:
    Dr. Carl V. Page, Professor, Department of Computer Science
    Dr. John B. Eulenberg, Assistant Professor, Department of Computer Science

ABSTRACT: The proposal requests SUMEX-AIM support of two terminals, a connection to the Tymshare network, and computer services to facilitate research on the design of intelligent speech prostheses. Since the request is for "in kind" support, the proposal is presented as a response to specific questions posed by the SUMEX-AIM Advisory Group.

APPROVALS:
    C.V. Page, Co-principal Investigator
    J.B. Eulenberg, Co-principal Investigator
    H.G. Hedges, Chairman, Department of Computer Science
    J.W. Hoffman, Director, Division of Engineering Research
    J.E. Cantlon, Vice President for Research and Graduate Studies
    H.G. Grider, Director, Research and Contract Administration
    Date: 11/3/76

I. INTRODUCTION

Millions of people around the world experience communication handicaps resulting from causes such as cerebral palsy, stroke, and accidents. Very often communication is by means of a standard letter board, which requires mobilization of the full attention and intelligence of the communicants. It is believed that prostheses possessing some of this intelligence can be designed, thus making communication more fluent.

Existing technology can revolutionize communication opportunities for many people in the future.
It is possible to replace manual letter boards and various types of simple electronic scanners with more sophisticated speech prostheses using microcomputer technology and relatively inexpensive voice generators.

The hypothesis of the group at MSU is that the most efficient means of generating speech must make use of the same knowledge sources that speech understanding requires. Although in some sense generation is the inverse of understanding, partial success in efficient generation would be of great social benefit to many people, while partial speech understanding may be of less value.

II. BACKGROUND

Results of the effort at Michigan State University indicate that access to the SUMEX-AIM Network will prove most useful. With this in mind, SUMEX-AIM is asked to consider supporting 2 silent computer terminals, computer time, and connection to the Tymshare Network at Ann Arbor and Detroit.

Current work of the Michigan State University group involves an experimental group of about ten students (ages 10-25) for whom experimental speech prostheses are being designed. The following results have been achieved:

(a) A research team of about twelve faculty members and twenty students (graduate and undergraduate) is involved in this project in some manner, as well as fifteen people at the location where the students reside and go to school.

(b) 4 Microcomputer-based speech prostheses with joystick input, TV display and voice output have been designed and built. The prototype model was demonstrated at the Communication Enhancement Institute, August 1976, held at Michigan State University.

(c) The use of a voice recognition system (Scope Electronics VDETS) is being explored both as an input means for enhancing nonstandard speech and as a positive feedback means for improving natural but nonstandard speech.
(d) A means of assessing the students' communicative behavior, and changes in it, is being developed. It relies on objective testing, with videotape as one recording medium.

(e) Myoelectric signals from muscles are being interfaced with the VDETS to determine their potential as generators of patterned audio signals.

(f) Software is being designed which will be a stimulus to the general education of our clients, who have had little formal educational opportunity.

III. RESEARCH OBJECTIVES

The major objective is to determine how knowledge may be represented so that speech can be generated in a fluent manner using a low bit rate for input.

Existing scanners (visual output) use probabilistic knowledge or none at all. The Tufts Interactive Communicator, now under development, will use letter frequencies given the previous three letters to control the selection menu.

Intelligent scanners could use a variety of knowledge sources.

    a. Semantic Sources. Semantic nets, conceptual dependency graphs, Bliss symbols, production systems.

    b. Syntactic Sources. Augmented transition nets (Woods) and transformational grammars, production systems.

Both general and person-dependent information could be represented to improve the process of selection.

A secondary objective is to improve the quality of inputs by the use of both statistical pattern recognition and semantic pattern recognition, using knowledge sources which describe input noise processes.

An important criterion is the comparison of the effectiveness of alternative types of knowledge, so that results can be scaled down from SUMEX-AIM prototypes to minicomputer and microcomputer based speech generation systems.

IV. GENERAL INFORMATION

Attachment A is an answer to specific questions as outlined in the "SUMEX-AIM RESOURCE-INFORMATION FOR POTENTIAL USERS", attached.
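The letter-frequency idea mentioned under Research Objectives (ordering a scanner's selection menu by how likely each letter is, given the previous three letters) can be illustrated with a minimal Python sketch. This is not the Tufts Interactive Communicator's design, which the proposal does not describe. The function names `train`, `menu`, and `scan_steps`, the 27-symbol alphabet, the back-off to shorter contexts, and the "one scan step per menu position" cost model are all assumptions made for illustration.

```python
from collections import Counter, defaultdict
import string

# Hypothetical scanner alphabet: 26 letters plus the space character.
ALPHABET = string.ascii_lowercase + " "


def _clean(text):
    """Keep only symbols the hypothetical scanner can select."""
    return "".join(ch for ch in text.lower() if ch in ALPHABET)


def train(corpus, order=3):
    """Count how often each letter follows every context of 0..order letters."""
    counts = defaultdict(Counter)
    text = _clean(corpus)
    for i, ch in enumerate(text):
        for k in range(order + 1):
            if i - k >= 0:
                counts[text[i - k:i]][ch] += 1
    return counts


def menu(counts, typed, order=3):
    """Return the alphabet ordered most-likely-first given the last letters typed.

    Assumed back-off rule: letters seen after the longest known context come
    first, then letters seen after shorter contexts, then the rest of the
    alphabet in its fixed order.
    """
    typed = _clean(typed)
    ranked = []
    for k in range(min(order, len(typed)), -1, -1):
        context = typed[len(typed) - k:]
        for ch, _ in counts.get(context, Counter()).most_common():
            if ch not in ranked:
                ranked.append(ch)
    ranked.extend(ch for ch in ALPHABET if ch not in ranked)
    return ranked


def scan_steps(counts, message, order=3):
    """Total menu positions scanned to spell `message` (1 step per position)."""
    message = _clean(message)
    return sum(
        menu(counts, message[:i], order).index(ch) + 1
        for i, ch in enumerate(message)
    )


if __name__ == "__main__":
    model = train("the cat sat on the mat the hat")
    print(menu(model, "th")[:5])          # most likely next letters after "th"
    print(scan_steps(model, "the cat"))   # cost with the predictive menu
```

Under this cost model, a fixed alphabetical menu costs as many steps as the letter's position in the alphabet, so comparing `scan_steps` against that baseline gives a rough sense of how much input effort a predictive menu might save for a user with a low input bit rate.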
ATTACHMENT A

SUMEX-AIM RESOURCE
RESPONSE TO QUESTIONNAIRE FOR POTENTIAL USERS

COMMUNICATION ENHANCEMENT PROJECT
MICHIGAN STATE UNIVERSITY

DESIGN OF INTELLIGENT SPEECH PROSTHESES

Dr. Carl V. Page
Dr. John B. Eulenberg

A) MEDICAL AND COMPUTER SCIENCE GOALS

1) The proposed research concerns the design of software and hardware for speech prostheses for persons who experience severe communication handicaps.

    (a) The major question is how knowledge can be represented so that speech can be generated in a fluent manner by a person who can provide a relatively low bit rate.

    (b) A secondary question concerns the interpretation of very noisy inputs (verbal and nonverbal) to a speech generation system. We expect to apply syntactic methods as well as statistical pattern recognition to this problem.

    (c) Future research includes the following:

        (i) Studying the educational and therapeutic potential of speech prostheses for people who do not command generally intelligible speech as an aid to learning speech skills.

        (ii) Exploring the use of myoelectric interfacing to a speech prosthesis to achieve a controllable, relatively high bit rate for persons who do not use a standard input keyboard.

        (iii) Development of "talking" software to teach blind people how to write in longhand. (We see this as a spin-off of (b).)

        (iv) Study of the use of a speech prosthesis in encouraging autistic children to communicate.

2) Present support.

Some money has been made available to us for demonstration systems for specific individuals. Much basic research in artificial intelligence underlies the construction of such demonstration systems. Based on our recent experience, we are currently approaching various agencies of state and federal government with proposals for more basic research. Private foundations are also a viable source of support for sophisticated demonstrations.
Our recent support has been from the following sources:

    Wayne County (Detroit) Intermediate School District
        1976        $183,000.
        1977        $217,000.  Pending.

    Arkansas Enterprises for the Blind (Rehabilitation Services Administration (HEW) and The U.S. Civil Service Commission)
        1975-76     $ 36,900.

    Jackson County Intermediate School District, Michigan
        1976        $  8,500.

    State of Michigan Vocational Rehabilitation
        1976        $ 30,000.  Pending.

    United Cerebral Palsy Association of Michigan
        1977        $ 50,000.  Expected.

    United Cerebral Palsy Association (National Office)
        1976        $ 60,000.  Proposal submitted for study of myoelectric inputs to speech prostheses.

3) Relevance of AI approach of SUMEX-AIM as opposed to other computing alternatives.

Whenever possible the systems we build will contain informal knowledge represented by tools of artificial intelligence such as production systems and semantic nets. Our goal of constructing an intelligent speech prosthesis is close in technical content to the problem of constructing good speech understanding systems. The development of systems for representing information so it can be used and extended in complex and unpredictable ways seems central to both AI and our research.

B) COLLABORATIVE COMMUNITY BUILDING

1) Analogous applications.

Most of our work will be applicable to general problems of speech generation. We hope to complement and augment the work of Ken Colby with people with aphasia. We are working to serve the needs of blind computer users and expect some of our effort to be of value to them as well as to the many people who cannot speak.

2) Application programs publicly available.

We would like to consider modifying MYCIN so that it could be a production rule front-end to a speech generation system. When KRL (if that is what Winograd and Bobrow still call it) is available, we would like to use it. We might use microprocessor design facilities if available (or else make them available ourselves).
We would like to experiment with a large world model in some semantic net form (such as Schank's restaurant or supermarket nets) as a means of organizing the sequence of contexts from which a person would generate speech.

We believe we will be using facilities on SUMEX-AIM not available elsewhere.

3) Availability of Programs.

We will be pleased to make our programs available to others with similar needs. Our project is located within the Computer Science Department, and we consider it our professional obligation to produce programs of high quality and complete documentation. Since our work is being done on a variety of computers, portability and high-level language compatibility are important.

4) We would like at some time to discuss our work with the SUMEX-AIM staff, but we already have some familiarity with the system. Dr. Carl V. Page of our project spent his sabbatical at Stanford in the 1974-75 academic year and attended the recent AIM Workshop. Mr. Douglas Appelt, who is our systems analyst for summer 1976, is a Computer Science student at Stanford, providing us valuable liaison. Dr. John B. Eulenberg was a faculty member at Stanford for three years and has been in recent communication with members of the Linguistics Department and the Children's Hospital at Stanford.

5) Other collaborative opportunities.

We believe that collaborative activities are an important part of our work and will be happy to share such information with SUMEX-AIM. Recently an international center for rehabilitation research in special education was established at Michigan State with federal funding. Also, together with the State of Michigan, we sponsored a Communication Enhancement Institute in East Lansing on August 25, 26, and 27, 1976, which brought researchers together with those who fund and need such services.
Organizations participating included the TRACE Center of the University of Wisconsin, Children's Hospital at Stanford, Tufts-New England Medical Center, The National Research Council of Canada, and Ottawa Crippled Children's Hospital.

C) HARDWARE AND SOFTWARE REQUIREMENTS

1) We have a CDC 6500 with about 140K of 60-bit core available to a user for large scale number crunching. We have a NOVA 2/10 (configured to work with a SCOPE Electronics Incorporated VDETS voice recognition system). We also use a PDP 11/40 in our Department of Audiology and Speech Sciences, equipped with one channel of A/D and four channels of D/A.

We do not have available any modern AI languages or programs. We have versions of SNOBOL, LISP and SLIP, but are interested in expanding our access to high-level AI software.

2) Languages to be used.

    INTERLISP
    MAINSAIL or maybe SAIL
    KRL (when implemented)
    Cross compilers and assemblers

We do not expect to require any new system programs. Our work will mainly be creation of prototypes to run on other machines here. Our major application programs are written in FORTRAN and ALGOL and need not be of much concern to your staff.

3) Estimate of machine utilization.

Our peak utilization can be synchronized to your more lightly loaded times, since our students work at all hours. Our major activities will be debugging and documentation, which should require only a small fraction of available CPU cycles. We can live within any reasonable disk space limitation. Our best guesses are:

    a) CPU cycles. (Not too many. Our CDC 6500 is available for problems involving heavy computation.)

    b) Connect time and communications. 3-4 hours/day.

    c) User terminals. We would be able to make good use of 2 silent terminals and connections from East Lansing to the net.

    d) Disk space. 300 blocks.

    e) Off-line outputs. Except for rare instances we have means of making suitable hard copy.
Our work with the cerebral palsied students (never more than three at one session) takes place at their school in the Detroit area. If we wanted to use a program there or do realistic field-site debugging, it would probably be between 0500 and 0900 PST.

4) Communication plans.

The ARPA net is not available at MSU. There are TYMNET nodes in Ann Arbor (59 miles away) and Detroit (85 miles away). It would be essential for us to have a TYMNET connection to East Lansing, since that is where most of our programmers reside. It would be helpful to have available the existing links in both Detroit and Ann Arbor so that small application programs could be run for our clients near these sites. At this time we would like SUMEX-AIM to pay for the connections.

5) We view this as both a research and developmental project which involves considerable software development. We plan to share our work with Ken Colby, who is part of the SUMEX-AIM community. We will be making use of the talents of a computer science department which has several people with more than 20 years of programming experience. Our students won the ACM Mid-East Programming Championship in 1976. (They won the Central Region Championship in 1975.) We expect to produce high quality programs which will be building blocks for AI speech generation efforts of the future.