RUTGERS - THE STATE UNIVERSITY OF NEW JERSEY

PROCEEDINGS OF THE FIRST ANNUAL A.I.M.
(ARTIFICIAL INTELLIGENCE IN MEDICINE) WORKSHOP
JUNE 14-17, 1975

Theme: KNOWLEDGE BASED A.I. SYSTEMS

Sponsored by: The Rutgers Research Resource on Computers in Biomedicine, Department of Computer Science, Rutgers University, New Brunswick, N.J. 08903

SAUL AMAREL, Principal Investigator
C. A. KULIKOWSKI, Organizer
N. S. SRIDHARAN, Technical Director
DEIRDRE SRIDHARAN, Proceedings Editor

The AIM Workshop Series is supported by the Biotechnology Resources Branch of the National Institutes of Health, Grant RR-643.

CONTENTS

I.    Introduction
II.   Schedule of the Workshop
III.  List of Panel Participants and their Affiliations
IV.   Brief Description of Systems Presented at the Workshop
V.    Panels
      A. MEDICAL PERSPECTIVES OF AIM SYSTEMS
      B. ANALYSIS AND COMPARISON OF MEDICAL SYSTEMS
      C. KNOWLEDGE ACQUISITION AND REPRESENTATION
      D. METHODS OF INFERENCE - FORMAL AND CLINICAL PROBLEMS
      E. PROBLEMS OF SYSTEMS DEVELOPMENT
VI.   References
VII.  AIM Organization

I. INTRODUCTION

AIM (Artificial Intelligence in Medicine) is an NIH-supported national project devoted to the development and dissemination of AI applications in biomedicine. The SUMEX-AIM computer facility at Stanford University is the major shared resource of the project. This facility is accessed by several research groups in the national AIM community via TYMNET and ARPANET.

The Rutgers Research Resource on Computers in Biomedicine is one of the major projects in the AIM community. As part of its responsibilities the Rutgers Research Resource, directed by Dr. Saul Amarel, is sponsoring a series of annual AIM Workshops. The first Workshop was held at Rutgers University on June 14-17, 1975. Dr. C. Kulikowski was the Workshop Organizer; Dr. N.S. Sridharan was the Technical Director; Ms. P. Moore and Mr. K. Brown were administrative coordinators. The Rutgers Research Resource is supported by the Biotechnology Resources Branch of the NIH, grant number RR-643. A description of the Resource appears in the SIGART Newsletter (an ACM publication), No. 54 (Oct. 1975).

The Stanford University SUMEX/AIM Project is directed by Dr. Joshua Lederberg. Users of the SUMEX facility are divided for administrative purposes into two groups: 1) those at the Stanford University School of Medicine, and 2) those elsewhere in the United States. The facility resources (computing capacity and consulting support) are allocated in equal portions to the two groups. As Principal Investigator for the SUMEX grant, Dr. Lederberg reviews Stanford medical school projects with the assistance of a local advisory committee. The governance of AIM includes the AIM Executive Committee and the AIM Advisory Group. The membership of these committees is given in Section VII. National users may gain access to the facility resources with the approval of the national Advisory and Executive groups.

The Workshop was designed to provide insight into existing and potential systems that apply methods of Artificial Intelligence to problems of biomedical research and health care. The attendees were selected from a broad range of investigators specializing in chemistry, psychology, medicine and computer science. They were chosen in consultation with an advisory group of AIM investigators and with the approval of the AIM Executive Committee.
The 1975 theme of "Knowledge-Based Systems in Biomedicine" centered around discussions, demonstrations, and hands-on systems experience in:

- medical modeling and decision making for diagnostic/therapeutic consultation;
- psychiatric simulation, psychological modeling, language analysis and common sense reasoning;
- biomolecular characterization of organic molecules on the basis of chemical analysis, protein structure determination and chemical synthesis planning.

No formal papers were prepared for the Workshop. Emphasis was placed on brief presentations of current AIM projects, followed by in-depth discussions of basic issues which underlie AIM activities. Most of the discussions took place in panels, which were recorded. Section V of the Proceedings contains summaries of transcripts of five panels. Many of the key issues and concerns that came up in the Workshop are captured in these panel discussions. Section IV of the Proceedings provides brief descriptions of the systems presented at the Workshop. The list of panel participants and their affiliations is given in Section III.

The Workshop participants were provided continuous access to several working application systems running both on the Rutgers PDP-10 and on the SUMEX PDP-10/TENEX. System access and hands-on experience proved valuable in the dissemination of AI applications in biomedicine and will be a recurring feature of future Workshops in the series.

II. SCHEDULE OF THE FIRST ANNUAL AIM WORKSHOP
Held at Rutgers University, June 14-17, 1975

GENERAL SESSION (Saturday, June 14)

Morning Session:

9:00    Registration
9:15    Introduction to the Workshop (S. Amarel, Rutgers University)

I. KNOWLEDGE-BASED SYSTEMS IN MEDICINE

        MYCIN: Antimicrobial Therapy Consultation System (E. Shortliffe, Stanford University)
10:05   DIALOG: Diagnostic Logic System in Internal Medicine (H. Pople, University of Pittsburgh)
10:30   Model-based Systems for Consultation: CASNET (Causal-Association Network Systems) and other approaches (C. Kulikowski, Rutgers University)
10:50   Break
11:15   Analyzing and Simulating the Present Illness (S. Pauker, Tufts-New England Medical Center & MIT)
12:10   Panel Discussion: Medical Perspectives of AIM Systems
        Moderator: A. Safir, Mt. Sinai School of Medicine
        Panelists: R. Engle, Cornell Medical School & N.Y. Hospital; D. Lindberg, University of Missouri; J. Meyers, University of Pittsburgh; S. Pauker, Tufts-New England Medical Center; W. Yamamoto, George Washington University

Afternoon Session:

II. KNOWLEDGE-BASED SYSTEMS IN PSYCHOLOGY AND PSYCHIATRY

1:15 - 2:15   PARRY: Improving a Simulation of Paranoid Thought Processes (K. Colby, UCLA)
2:15 - 2:40   BELIEVER: Belief Systems Interpretation (C. Schmidt, Rutgers University)

III. KNOWLEDGE-BASED SYSTEMS IN BIOCHEMISTRY

2:40 - 3:05   CONGEN: Constrained Generation of Chemical Structures (B. Buchanan, Stanford University)
3:05 - 3:30   SECS: Organic Synthesis System (T. Wipke, Princeton University)
3:30 - 3:55   Protein Crystallography System (R. Engelmore, Stanford University)
3:55 - 4:15   Break

IV. OVERVIEW OF SYSTEMS AND METHODOLOGY

4:15 - 5:00   Panel Discussion on Artificial Intelligence Methodology in Medicine, Psychology, and Biochemistry. Comparative review of systems and future problems and perspectives. (E. Feigenbaum, Stanford University - Moderator)
              Panelists: S. Amarel, Rutgers University; J. Feldman, University of Rochester; B. McCormick, University of Illinois at Chicago Circle; R. Schank, Yale University
5:00 - 5:40   Panel Discussion on Shared Resources and Computer Networking

Schedule of Technical Sessions of the First Annual AIM Workshop

Sunday, June 15, 1975

Morning Session:

8:30 - 9:45    A. Seminar on the DIALOG System (Pople and Meyers)
               B. Seminar on the BELIEVER System (Schmidt)
9:45 - 10:15   Break
10:15 - 11:30  A. Seminar on Analysis and Simulation of the Present Illness (Pauker)
               B. Seminar on the CONGEN System (Smith/Carhart)

Afternoon Session:

1:00 - 2:15    A. Seminar on CASNET and related systems (Kulikowski and Safir)
               B. Seminar on Protein Crystallography (Engelmore)
2:15 - 2:45    Break
2:45 - 3:40    A. Seminar on the MYCIN System (Shortliffe)
               B. Seminar on the SECS System for Organic Synthesis (Wipke)
4:00 - 5:30    Panel Discussion: Analysis and Comparison of Medical Systems

Dinner: 6:30   Keynote Speech (Dr. Edward Bloustein, President, Rutgers University)
               Guest Speaker (Dr. William Raub, Associate Director, Extramural and Collaborative Programs, National Eye Institute, NIH)

Evening:

8:30 - 10:00   Special Interest Group Meetings and Hands-on Experience with the Systems

Monday, June 16, 1975

Morning Session:

8:30 - 9:45    A. Seminar on the PARRY System (Colby)
               B. Seminar on META-DENDRAL (Buchanan)
9:45 - 10:15   Break
10:15 - 11:30  Special Interest Group Meetings; Hands-on Systems Experience

Afternoon Session:

1:00 - 3:15    Seminar on Artificial Intelligence Systems (FUZZY, PEDAGLOT, MDS and other Knowledge-Based Systems) (B. Bruce - Moderator)
3:15 - 3:45    Break
3:45 - 5:15    Panel Discussions on Analysis and Comparison of Systems:
               A. Biochemistry (Smith - Moderator)
               B. Psychology (Colby - Moderator)

Evening:

7:30 - 9:00    Seminar on Medical Systems:
               MISL Project (McCormick, UICC)
               Digitalis Therapy Advisory Program (Silverman, MIT)
9:00 - 10:00   Special Interest Group Meetings and Hands-on Systems Experience

Tuesday, June 17, 1975

Morning Session:

8:30 - 9:45    Panel Discussion: Methods of Inference - Formal and Clinical Problems (T. Shortliffe - Moderator)
9:45 - 10:15   Break
10:15 - 11:30  Panel Discussion: Knowledge Acquisition and Representation (B. Buchanan - Moderator)

Afternoon Session:

1:15 - 3:15    Panel Discussion: Problems of Systems Development; Issues of Collaboration across Disciplines; Shared Resources and Computer Networking; Methodological Conclusions (S. Amarel - Moderator)
3:30           Break
4:00           Departure
III. LIST OF PANEL PARTICIPANTS AND THEIR AFFILIATIONS

AMAREL, Saul          Principal Investigator, Rutgers Research Resource
AXLINE, Stanton       MYCIN; Stanford Medical Center
BAKER, William        Biotechnology Resources, NIH
BUCHANAN, Bruce       HEURISTIC DENDRAL, Meta-DENDRAL; Stanford Computer Science Department
CARHART, Ray          HEURISTIC DENDRAL; Stanford Computer Science Department
DAVIS, Randy          MYCIN; Stanford Computer Science Department
ENGLE, Ralph          New York Hospital, Cornell University Medical School
FEIGENBAUM, Edward    HEURISTIC DENDRAL; Stanford Computer Science Department
KULIKOWSKI, Casimir   CASNET; Rutgers Computer Science Department
LINDBERG, Don         Chairman, SUMEX/AIM Advisory Committee
MCCORMICK, Bruce      MISL Project; University of Illinois at Chicago Circle
MILLER, Randy         DIALOG; University of Pittsburgh
MEYERS, Jack          DIALOG; University of Pittsburgh
PARKINSON, Roger      PARRY; Stanford AI Lab
PAUKER, Stephen       PRESENT ILLNESS; New England Medical Center
POPLE, Harry          DIALOG; University of Pittsburgh
RINDFLEISCH, Thomas   SUMEX System; Stanford Medical Center
SAFIR, Aran           CASNET; Department of Ophthalmology, Mount Sinai School of Medicine
SAFRAN, Charles       PRESENT ILLNESS; Project MAC, MIT
SCHMIDT, Charles      BELIEVER; Rutgers Psychology Department
SCHWARTZ, William     PRESENT ILLNESS; Tufts-New England Medical Center
SHORTLIFFE, Ted       MYCIN; Stanford Medical Center
SILVERMAN, Howard     Digitalis Therapy Advisory Program; Project MAC, MIT
SMITH, Dennis         HEURISTIC DENDRAL; Stanford Chemistry Department
SRIDHARAN, N.S.       BELIEVER; Rutgers Computer Science Department
SRINIVASAN, C.V.      MDS; Rutgers Computer Science Department
SZOLOVITZ, Peter      PRESENT ILLNESS; Project MAC, MIT
YAMAMOTO, William     Department of Clinical Engineering, George Washington University

IV. BRIEF DESCRIPTION OF SYSTEMS PRESENTED AT THE WORKSHOP

BELIEVER [Rutgers]: This system models how a person in the role of an observer perceives and explains observed or reported actions to others. The goal of the system acting as observer is to answer the question: "Why did person P perform act A at time T?". The question is to be answered by attributing to person P a plan and motives which caused that person to decide to perform action A. Thus the problem is to move from observations to inferences about the internal states (Believes, Expects, Wants, etc.) of person P. This type of causal explanation of observation is similar to reasoning in other knowledge-based problems such as medical or psychiatric diagnosis. The AI framework adopted for this work, called MDS, is being developed at Rutgers and provides a formalism for describing the theory.

CASNET [Rutgers]: This system embodies a causal representation of the processes of dysfunction incorporating four main structural elements: the patient findings (signs, symptoms, and test results); the patho-physiological states that summarize and explain the findings; the disease hypotheses expressed by their component states; and the therapeutic actions which attempt to counteract various aspects of the disease. Such a model has been applied to several dysfunctions, but principally to the glaucomas. Reasoning schemes have been developed for the interpretation of findings, diagnostic decision making, prognosis, therapy selection, and explanation of reasoning in terms of the model and supportive research references. (A schematic sketch of this layered structure follows the DIALOG entry below.)

DIALOG [University of Pittsburgh]: A computer-based system for general medical consultation that incorporates a hypothesis formation system using a medical knowledge base now encompassing a substantial portion of the major diseases of internal medicine. The system thereby exhibits diagnostic behavior and competence comparable to that of the skilled clinician, and handles systematically cases where two or more distinct clinico-pathological entities are present.
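To make the layered structure in the CASNET description concrete, the following is a minimal sketch of a three-level causal model: findings support patho-physiological states, states are linked causally, and diseases are defined as patterns of states. The findings, link weights, and scoring rule here are invented for illustration; they are not CASNET's actual confidence measures or inference rules.

    # Minimal sketch of a CASNET-style three-level causal model.
    # All weights and medical content are invented placeholders.

    # Level 1: observed findings, each linked to states with a weight.
    FINDING_TO_STATE = {
        "elevated_intraocular_pressure": {"angle_block": 0.8},
        "cupping_of_optic_disc":         {"nerve_damage": 0.9},
        "visual_field_loss":             {"nerve_damage": 0.7},
    }

    # Level 2: causal links among patho-physiological states.
    CAUSES = {"angle_block": ["nerve_damage"]}

    # Level 3: disease hypotheses defined as patterns of states.
    DISEASES = {"chronic_glaucoma": ["angle_block", "nerve_damage"]}

    def state_scores(findings):
        """Score each state from its findings, then pass some support
        one step along the causal links (a deliberately crude rule)."""
        scores = {}
        for f in findings:
            for state, w in FINDING_TO_STATE.get(f, {}).items():
                scores[state] = scores.get(state, 0.0) + w
        for cause, effects in CAUSES.items():
            for e in effects:
                scores[e] = scores.get(e, 0.0) + 0.5 * scores.get(cause, 0.0)
        return scores

    def rank_diseases(findings):
        """Rank disease hypotheses by the mean score of their states."""
        scores = state_scores(findings)
        ranking = [(d, round(sum(scores.get(s, 0.0) for s in ss) / len(ss), 2))
                   for d, ss in DISEASES.items()]
        return sorted(ranking, key=lambda p: -p[1])

    print(rank_diseases(["elevated_intraocular_pressure", "cupping_of_optic_disc"]))

Running the sketch ranks the single invented hypothesis with a score of 1.05; the point is only the separation of findings, states, and diseases into distinct layers with explicit links.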
HEURISTIC DENDRAL [Stanford]: The objectives of the Heuristic DENDRAL research program are the development of innovative computer and biomedical analysis techniques for application in medical research and related aspects of investigative patient care. The global aim is to apply the unique analytical capabilities of gas chromatography/mass spectrometry (GC/MS), with the assistance of data-interpreting computer programs utilizing artificial intelligence techniques, to investigate the chemical constituents of human body fluids in a variety of clinical contexts. A set of artificial intelligence programs interpret data and generate plausible molecular structures. The most important program is the constrained structure generator CONGEN, which generates molecular structures within structural limits. These limits (for example, ring size) are either specified by a chemist directly or inferred from mass spectrometry data by another program called the DENDRAL PLANNER. The problems of organizing and developing this complex system are common to many knowledge-based problem solving programs.

META-DENDRAL [Stanford]: Meta-DENDRAL is an induction program for finding rules that characterize the processes that are of interest to the chemist (for example, rules of fragmentation in mass spectrometry). The name Meta-DENDRAL suggests an effort beyond, but not entirely separate from, that of Heuristic DENDRAL, and is a response to the immense task of extracting inferential knowledge from experts and making that knowledge accessible to the Heuristic DENDRAL engine. The number of rules is potentially very large and experts have yet to investigate most of them. Therefore, automating the rule formation process seems essential. As in Heuristic DENDRAL, the heart of the program is a generator of legal solutions, in this case a rule generator called RULEGEN. The generator needs prospective constraints in order to generate plausible rules rather than all possible rules. The planning program for doing this is called INTSUM. The test phase of Meta-DENDRAL under the PLAN-GENERATE-TEST paradigm is a program called RULEMOD, which evaluates and modifies rules in the context of other rules.

MISL [University of Illinois]: The Medical Information Systems Laboratory (MISL) is set up to explore the use of artificial intelligence techniques in clinical decision making and pursues three major activities: clinical research and decision support; construction and modeling of a data base in ophthalmology; and network-compatible data base design. The project explores the inferential relationships between analytic data and the natural history of selected eye diseases, both in treated and untreated forms. SUMEX/AIM is utilized to build a data base to be used as a test bed for the development of clinical decision support algorithms.

MYCIN [Stanford]: A computer program that uses expert clinical knowledge to advise physicians on the diagnosis of bacterial infections and the selection of appropriate therapy. The distinguishing characteristics of this system are: it acquires information through human-engineered interaction; it permits extension of its rule-structured knowledge base; and it explains its reasoning process in response to simple questions posed in English.
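The goal-directed rule invocation behind a consultation of this kind can be sketched as a small backward-chaining interpreter with certainty factors. The rules, facts, and numeric values below are invented placeholders; only the control structure and the positive-case combining function follow the published MYCIN style.

    # Toy backward-chaining interpreter with certainty factors (CFs).
    # Rules and CF values are invented; only the control flow is the point.

    RULES = [
        # (premises, conclusion, rule CF)
        ((("gram_stain", "negative"), ("morphology", "rod")), ("organism", "e_coli"), 0.7),
        ((("site", "blood"), ("organism", "e_coli")), ("therapy", "gentamicin"), 0.8),
    ]

    def cf_combine(a, b):
        """Combine two positive CFs (the MYCIN formula for the positive case)."""
        return a + b * (1 - a)

    def confidence(goal, facts, rules=RULES):
        """Backward-chain: compute the CF for goal, recursing on premises."""
        cf = facts.get(goal, 0.0)              # directly known?
        for premises, conclusion, rule_cf in rules:
            if conclusion != goal:
                continue
            # a rule fires with the strength of its weakest premise
            premise_cf = min(confidence(p, facts) for p in premises)
            if premise_cf > 0.2:               # firing threshold, as in MYCIN
                cf = cf_combine(cf, rule_cf * premise_cf)
        return cf

    facts = {("gram_stain", "negative"): 1.0, ("morphology", "rod"): 0.9,
             ("site", "blood"): 1.0}
    print(round(confidence(("therapy", "gentamicin"), facts), 2))   # -> 0.5

The sketch shows why the reasoning is goal-directed: nothing about the organism is computed until the therapy goal demands it.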
PARRY [UCLA]: An interactive program that simulates the behavior of a paranoid patient during a diagnostic interview in a hospital setting. The conversation is carried out in English. The model consists of a delusional network which operates by detecting flare concepts in the doctor's statements and modifying its own affect states, such as Fear, Anger, Shame and Mistrust, in response. The affect states guide the nature of the responses given by the program. The degree of paranoia can be set at the start of the interview. The model has undergone elaborate validation and sensitivity tests.

THE PRESENT ILLNESS PROGRAM [MIT]: A system which analyzes the history of the present illness for a patient starting with a certain complaint. The knowledge base was developed by analysis of the behavior and declared reasoning of clinicians, and by introspection. The knowledge base is organized into Frames, as defined by Marvin Minsky, that are linked into an associative memory. The memory is partitioned into long-term and short-term types, which permits likely hypotheses to be arrived at rapidly and considers frames that are closely linked to the hypotheses. (A schematic sketch of this organization follows the SECS entry below.)

PROTEIN CRYSTALLOGRAPHY [Stanford]: This system has as its goal the application of AI techniques to the Phase Problem of X-ray crystallography in order to determine the three-dimensional structure of proteins. The system obtains from experts the knowledge and heuristics needed to infer the structure of proteins and to represent them as a cooperative set of processes that can successfully arrive at plausible structure descriptions in a reasonable amount of time. The goals of this project are clearly long term, but are organized in such a way that significant intermediate goals can be realized before the project is completed.

SECS [Princeton]: This is an interactive program for computer-assisted planning of organic chemical syntheses. It is human-engineered and makes extensive use of graphics whenever possible to display chemical structures, synthesis sequences and the solution search graph. SECS has extensive knowledge of chemical transforms and chemical principles, and is designed to let the expert chemist do the major portion of the search guidance interactively. SECS uses an English-like chemical language for describing transforms, which the chemist uses to extend the knowledge base. Current work is centered on developing advanced strategies that exploit the three-dimensional models and the electron structure model that SECS currently knows how to build.
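As a toy rendering of the frame organization described above for the Present Illness Program, the sketch below keeps a short-term set of frames "in focus" and matches new findings only against those, so closely linked hypotheses are considered first. The frame names, slots, and links are invented for illustration.

    # Minimal sketch of frames linked into an associative memory, with a
    # short-term store that keeps attention near current hypotheses.
    # Frame names, findings, and links are invented placeholders.

    FRAMES = {
        "nephrotic_syndrome": {"findings": ["edema", "proteinuria"],
                               "linked":   ["renal_failure", "cirrhosis"]},
        "renal_failure":      {"findings": ["oliguria"],
                               "linked":   ["nephrotic_syndrome"]},
        "cirrhosis":          {"findings": ["edema", "ascites"],
                               "linked":   ["nephrotic_syndrome"]},
    }

    short_term = set()   # frames currently in focus

    def activate(hypothesis):
        """Bring a hypothesis and its closely linked frames into short-term memory."""
        short_term.add(hypothesis)
        short_term.update(FRAMES[hypothesis]["linked"])

    def match(findings):
        """Score only frames in short-term memory, so likely hypotheses come first."""
        scored = [(f, len(set(findings) & set(FRAMES[f]["findings"])))
                  for f in short_term]
        return sorted(scored, key=lambda p: -p[1])

    activate("nephrotic_syndrome")
    print(match(["edema", "proteinuria"]))

The partition into a small active set and a large passive store is the point: matching is cheap because it never scans the whole memory.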
V. PANELS

A. MEDICAL PERSPECTIVES OF AIM SYSTEMS
B. ANALYSIS AND COMPARISON OF MEDICAL SYSTEMS
C. KNOWLEDGE ACQUISITION AND REPRESENTATION
D. METHODS OF INFERENCE - FORMAL AND CLINICAL PROBLEMS
E. PROBLEMS OF SYSTEMS DEVELOPMENT

A. MEDICAL PERSPECTIVES OF AIM SYSTEMS
MODERATOR - ARAN SAFIR

ENGLE: In his stimulating book THE PHILOSOPHY OF "AS IF" (Routledge and Kegan Paul, 1935), H. Vaihinger presents a thesis which relates directly to the application of so-called Artificial Intelligence to the field of medicine. He postulates that we often accept as true the fiction of approximations because of some useful benefits which result. In a sense all of science and mathematics is an approximation of the real world, and there are benefits to be gained if we act as if science were the real world. Similarly, benefits can result from acting as if artificial intelligence were the same as human intelligence, though the term Artificial Intelligence seems a bit presumptuous to some individuals. The full benefit of the use of computers as tools of thought can come only when we learn to dissect intelligence into a portion best suited to the human being and a portion best suited to the computer, and then find a way to mesh the two processes. The science of Artificial Intelligence is concerned with that very important task.

YAMAMOTO: Artificial intelligence as it appears to me is attempting to emulate or imitate the performance of the academic physician working generally with the most severe disease patterns. And when you mention artificial intelligence to a number of physicians you arouse a basic hostility, because you are threatening them in the area they have reserved for themselves. They are willing to give the IVs to the nurse and the drugs to the pharmacologist and the surgical preparations to the OR nurse. But what they reserve for themselves is what they consider the intelligence. One can attend conferences devoted to defining the phrase Artificial Intelligence. I have found that you can reach an innocent ground by calling it Artificial Behavior, because in identifying what intelligence is we use certain phrases which generally define subtypes of behavior. I would like to list those types of behavior one might refer to in determining whether or not someone or something is behaving intelligently, and more specifically those types of behavior that I think physicians would include if they attempted to assess AI.

The first intelligent component is the choice between alternatives where the alternatives are not necessarily mutually conflicting. I think the AI community has done a fair job of answering that. Second is execution of pre-determined processes. That is, physicians learn, as do others, things which are pretty well defined algorithmically or procedurally, which are stored away and then invoked at a select time. The ability to do this very often appears to be intelligent, and I think the AI community has made substantial inroads here. Third is learning facts or knowledge by inductive inference and learning by rote. Learning by rote is what we do in medical school; learning by inductive inference is what we hope the doctor will do when he gets out. There is a questionable level of success here as far as AI is concerned. Fourth is initiative and invention. These two words we associate with intelligence, although there are other components, mainly emotional, that determine the manifestation of it. I think there has been no contribution from AI in this area. Fifth is operating under conflicting policy, where policy covers a broad range like "don't do harm". As far as I know there has been very little activity along this line in AI, although it seems to be an attackable problem. Sixth, self-awareness has to be a component of intelligence, and this of course is a basic philosophical, perhaps epistemological problem which AI probably has not attempted to answer. Seventh is to assign value judgements, or assign value to judgements that the performer executes, or values in the context of a society, that is, in the context not just of a patient but also of that patient's family. This type of extraneous but nevertheless relevant intelligent activity expands the scope of your problem. This is another area in which AI has not made any substantial inroads. Eighth is solving problems. This can range from playing games to more complicated diagnostic games. I think here AI has contributed a number of very interesting and powerful paradigms.
Ninth is recognition of logical consistency, which is something that AI people try to pull into their systems. We cannot say at the present time that AI has a method by which the logical consistency of new systems can be determined, but this is a problem which is not unique to AI. Tenth is operating under tentative decision. Most front line physicians operate under tentative decision circumstances. I think MYCIN is an example of an attempt to go in that direction, and this is necessary to emulate if you are going to imitate the intelligent behavior of the clinician. Eleventh is operating toward an indeterminate or "qualitative end point". That is, intelligence often allows you to say you don't know what the end point is, but you will know when you get there. The ability to operate under that scheme is a manifestation of intelligence. I am sure all of you can think of other forms of behavior which contribute to the definition of intelligence, and until we get a substantial number of these under control we probably will not be able to convince the street physician that AI has a great deal to contribute.

Let me say that as far as disseminating AI in the medical community is concerned, I'm greatly heartened by the interest of major medical physicians in the country like Dr. Meyers and Dr. Schwartz, because the only way there will be a more congenial reception of AI in the medical profession is through clinical leaders becoming interested, and training their students to be aware that thought processes have structure and that structure can be experimented with by using machines.

PAUKER: In the past the clinical importance of computer science in medicine involved both data handling and the dissemination of medical knowledge. Now an additional capability has developed: the ability of the computer to serve as a laboratory to model decision making and to test theories. Our group has explored, as have others, the impact of decision analysis on the decision processes in medicine, both in diagnosis and in treatment. It has made me far more aware of the necessity for being explicit in my decision making processes, after seeking firm and relevant data upon which to base any deduction. Probability theory, and especially Bayes' rule, now form a central part of my diagnostic approach in terms of computer programs. However, our recent studies have emphasized the importance of a richly cross-linked data base of guessing and heuristic approaches. These ideas fit more closely the romantic notion of what clinical expertise is, and to some extent have underlined the need for complex learning and indexing processes. With this new kind of laboratory and approach we are beginning to understand better how to teach students what clinical expertise really is. Having more patterns with which to match and explore, the expert can plunge in and guess, and if he makes a mistake he has rules by which he can back up. And having seen that this is also the procedure of some programs, as a clinician I am pleased to know that there is nothing wrong with exploring in this manner. It works, and because it works perhaps AI has something to learn from medicine in the same way medicine has something to learn from AI.

LINDBERG: First I want to say why I consider the SUMEX/AIM project to be of great significance.
The first reason is that reliable high performance computing, which is required for reasonable AI development, is now available at a reasonable cost, and hence the experiments may succeed or fail on their own merit without the added complexity of inadequate computer resources. There are still some inadequacies in the system, especially in the area of large files. But these aside, it now seems quite possible to test if AI in fact has anything to offer medicine, which I think is the fundamental raison d'etre of the experiment. SUMEX/AIM is significant because its mode of providing computer services to medicine is an attractive alternative to the traditional single, large institutional computer center. Personally, I would like to see it succeed. The SUMEX experiment provides that the cost of maintaining an advanced system be borne by a single group, with other institutions using the facilities. In addition, one might say that the approach allows networking to reduce the programming/hardware compatibility problem.

For what purpose then should one attempt to employ AI techniques in medicine? For me there is absolutely no doubt on this point. I think AI should be used to do in medicine what cannot be done without a computer. Now that would mean that the universe of choices be divided not between forbidden patient care applications and permissible basic research applications, but rather between those things which cannot be done and those things, hopefully, which can. And I have three examples that I would like to mention.

First, we do not presently have a uniform terminology for medicine, let alone a vocabulary, nor do we have a means to create either. It goes without saying there is no meaningful national cumulative data base effort. Therefore there really is no systematic way for clinical records to become the basis for research. It is likely that AI could create a means to build a vocabulary, and I point that out as a problem of major importance.

Secondly, we do not have a general means to test potential causal or non-causal medical associations, a consequence of that being the thalidomide/pregnancy association, for example. If there are such associations to be made today, we are no better prepared to recognize them or be alerted to them by a computer than we were ten years ago. When we speak of early warning systems for drug side-effects or drug interactions, we are hypothesizing the particular effects for that special problem. The more general problem would be to prescribe the way in which such an association is actually recognized. If we could do this we probably would not have to plead so hard every year for data collection systems.

Lastly, I would like to suggest that we cannot as yet manage very large files or large and complex data bases. You may say that this is being done already, but I am suggesting that we really only think we are doing it. Let me give you the file problems I have in mind. First, geographical data systems. There are practically no usable systems which allow medical data observations to retain their geographical structures along with their other attributes. The Lighthill Report for the Science Research Council in England singled out this application area, geographical systems, as the most promising AI application, and I think it is not being followed up. To illustrate the value of such an application I need only remind you of the well known but little understood geographical distribution of multiple sclerosis in the USA.
It is sixteenfold more common in New Orleans than it is in Seattle. Or the varying attack rates of coronary artery insufficiency, which are threefold higher in Georgia than in Lincoln, Nebraska. We do not have any means to recognize these associations. Those particular ones have been made and validated, but how many others are there? The second data base problem I want to mention has to do with the medical record data file. We are doing the computations, but not really managing the information in the files. I think a reasonable solution to that would be to design a system in which the file knew more about what it contained than the inquirer. And that is a problem which I believe is suited to AI methodology.

I want to make a statement about Dr. Meyers' system, because what I've said may seem in conflict with the fact that I very much admire what Meyers and Pople have done. I think it is very sophisticated work aimed at a very important problem. But I do not feel it is important because they are automating the good consultant. We cannot make another Jack Meyers, but American medicine does turn out very good internal medicine consultants nonetheless, who may grow to be as good as he. For me the importance lies in the fact that they are accomplishing in this program something which cannot be done without the computer, by providing a facility whereby diagnostic rules are made accessible and can be applied to a particular case without the presence of the consulting physician.

MEYERS: In spite of Dr. Lindberg's point of view, I still believe that the kinds of programs we are developing using the techniques of AI will continue to have diagnostic application even in the tertiary care institution. Now the number of applications is obviously going to be limited; I thoroughly agree with that. It is probably not so important to develop these AI techniques for routine tasks. No physician by and large needs a program to help diagnose common symptoms. But the paramedical personnel who are taking on care responsibilities may well need this kind of support. My last comment has to do with the educational application of AI techniques. I have mentioned already the use of our data base for educational purposes, but I hope you can see that these kinds of techniques can be used for standard self-education as well. For example, in our program, if a medical student just wants to add "shortness of breath" and stop there, the computer can provide quite a thorough differential diagnosis of shortness of breath. In addition, these systems could be utilized for measuring clinical competence, not only in students but also in graduate physicians. And this is becoming an increasingly important aspect of medical practice.

SAFIR: I believe that developing computer methods for intelligent problem solving in medicine can be accomplished only by close collaboration between the computer scientist and the physician. And a true understanding of the nature of the data and the problem can be achieved only if the computer scientist is exposed to the very long and difficult process of education in medical problems. He has to serve a clinical clerkship, as we call it in medical school, because what one gets out of textbooks and the literature is really just enough to get started. One has to develop a feeling for the complexity and unreliability of the data. Dr. Kulikowski and colleagues have been very involved in observing glaucoma surgery and seeing patients undergoing the measuring process.
As a result, their understanding and interpretation of the literature has changed tremendously. Likewise, the physician who gets involved with the computer scientist cannot just preach medicine. He must learn how the computer scientist imbeds these clinical lessons in some logical structure and manipulates them. These may sound like relatively easy goals, but they require the selection of personalities that are not at all typical of the professions involved. Computer scientists are selected mainly from among those who have a talent for mathematical disciplines and who are encouraged to develop orderly systems of thought that function with predictability and precision. Physicians on the other hand have entered by choice a profession in which disorder and unpredictability are nearly the rule. If someone comes to a mathematical scientist with a problem for which there is yet no solution, there is rarely any pressure placed on him to supply one immediately. Clinical physicians obey a very different mandate. They must solve the problem at the time it is brought to them, no matter how imperfectly, and they are compelled regularly to make crucial decisions in situations that are characterized by inadequate theory and grossly imperfect data. I've often thought that the entire system of medical education is a means of teaching an intelligent and sensitive person to live happily with the intolerable. So computer scientists who can thrive within the disorder of medicine, and physicians who can work happily within the logical and mathematical world of computer science, are, to use doctors' terminology, not rare but distinctly uncommon. And I believe that good work in computing and medicine will result only from such collaborative teams.

SCHWARTZ: The process of developing large systems that are reliable enough to make an impact on clinical research will inevitably require a large investment of resources over the next few decades. And I wonder if society and the funding agencies are willing to wait that long. Quality care is one of the key issues around the country today. And I feel we ought to be able to convince those who are making the financial decisions that this work really has nothing to do with computer programs, but has to do with the development of insights into high quality clinical care and clinical judgement which will allow an enormous upgrading of medical education and the medical curriculum. Most physicians, including fourth year medical students, are already so professionalized and acculturated in the traditional way of learning medicine that their minds are not open to analyzing the structure of their decision making and cognitive processes. I am convinced that we should be teaching problem solving and the nature of the cognitive process in the second year of medical school, before students are so professionalized. We now know enough to be able to do that. As a community we comprise an important resource which can be a force for encouraging the development of a medical curriculum that will emphasize processing of information more than simply acquisition of information. And I believe that is a societal good which a great many people will be able to appreciate and accept on its own merit.

******** END OF PANEL DISCUSSION ********

B. ANALYSIS AND COMPARISON OF MEDICAL SYSTEMS
MODERATOR - HARRY POPLE

POPLE: I have been asked to summarize a paper I submitted to the IEEE in which I compared three of the four systems represented here: MYCIN, CASNET and DIALOG.
All three systems deal with the problem of hypothesis formation, but the hypothesis formation imbedded in MYCIN, as I see it, is a special case of deductive reasoning. The organization of rules takes the form of a tree structure, and the analysis to derive hypotheses is deductive inference. One begins with the goal, which in this case is to prove the occurrence of a disease, and each candidate disease is considered in turn in an attempt to prove the occurrence of that disease by working back to antecedent structures until it is possible to establish a confidence level. The other systems use the alternative reasoning tactic of inductive inference, or reasoning from consequence back to hypothesis and from hypothesis to consequence. DIALOG, for example, has pointers running in two directions, from manifestations to disease entities and from disease entities to manifestations. Going from a manifestation to a hypothesis is what I call the hypothesis formation step, or the abductive step. Working from hypotheses back to resulting predictions is the deductive step, which corresponds exactly to what goes on in MYCIN as I see it. So we are employing two distinctly different forms of logic to achieve the same kinds of activity.
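The two reasoning directions Pople contrasts can be shown with a toy structure that keeps pointers both ways, in the spirit of (but not identical to) DIALOG's representation. The diseases and manifestations below are invented for illustration.

    # Toy illustration of the two reasoning directions.
    # Manifestation -> disease is the abductive (hypothesis-forming) step;
    # disease -> expected manifestations is the deductive (testing) step.
    # The medical content is invented.

    DISEASE_TO_MANIF = {
        "hepatitis":  ["jaundice", "fever", "anorexia"],
        "gallstones": ["jaundice", "colic"],
    }

    # pointers in the reverse direction, built once from the table above
    MANIF_TO_DISEASE = {}
    for d, ms in DISEASE_TO_MANIF.items():
        for m in ms:
            MANIF_TO_DISEASE.setdefault(m, []).append(d)

    def abduce(manifestation):
        """Abductive step: from one manifestation to candidate hypotheses."""
        return MANIF_TO_DISEASE.get(manifestation, [])

    def deduce(disease, observed):
        """Deductive step: which predicted manifestations are confirmed so far."""
        return [m for m in DISEASE_TO_MANIF[disease] if m in observed]

    observed = ["jaundice", "fever"]
    for hypothesis in abduce("jaundice"):
        print(hypothesis, "confirmed:", deduce(hypothesis, observed))

One direction generates candidates; the other tests them against expectations, which is the division of labor the panel goes on to debate.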
SHORTLIFFE: I interpret the underlying logic of MYCIN differently. MYCIN was conceived originally as a consequent theorem system. We work backwards from a goal, and we invoke pieces of knowledge on the basis of what hypotheses we are trying to reach. The introduction of certainty factors into the scheme makes it difficult for me to interpret that as deduction, because we are dealing with antecedent rules. We recently introduced antecedent theorems into the system. As soon as we know the identity of an organism, we immediately determine the gram stain and morphology. And that is a forward looking mechanism that we did not have before. In the past, when we needed a gram stain, we had to find rules that would allow us to deduce it in a very roundabout way. I agree that there are differences among the systems, but your description of those differences, namely deduction vs. induction or abduction, is not an accurate interpretation in my opinion.

We have felt from the outset that the perfect system would be one in which the clinician who needs advice could sit down at the terminal and set the scene with information that we in turn would use to ask the appropriate questions. That is the way patients are presented and discussed in the clinical setting. That, of course, would require adequate natural language understanding in the system. So we look for ways of avoiding natural language within the context of the consultation itself. We needed some natural language processing in order to answer questions and to do some of the explaining, but we at least wanted to let people get advice without having to deal with the frustrations of what is still an unfinished piece of AI research: natural language comprehension. The work we did on natural language understanding cannot be defended in any theoretical sense. It was a stop-gap measure to get something that would work well enough for our purposes. We recognize the need for it, and I believe it is the way these systems should go.

AXLINE: I believe clinicians are more comfortable if they can use the standard format for entering data about a case. But we were interested in simulating the logic process the clinician uses, not necessarily the standard format he uses to gather clinical data, which I consider stilted. Our approach is to collect only that information which is going to be used at that time, rather than to accumulate large amounts of data. So in terms of understanding the logic process our approach has been particularly productive. The approach that Ted is describing, of setting the scene, is of equal merit.

SZOLOVITZ: The clinician combines a highly specialized vocabulary with a set format to enter clinical information. So in this instance designing a natural language system would be much easier than it is when you must contend with totally context-free input. Anything that gives you a structure provides a handle on the problem. And there is a reasonable amount known about parsing, so this is not entirely an impossible problem. I want to comment on the way we model how our expert clinician deals with data that is presented to him. We are very strongly influenced by Bill Schwartz's absolute refusal to listen to a case in any but the standard order of presentation. And there is a methodological point here. If the program is not able to make use of information as it comes in, then what does it mean to say that you are accurately simulating the deductive or logical process of the clinician?

AXLINE: There are several ways of looking at the information gathering procedure of the clinician. The general internist, for example, looks at the whole patient and all the problems he presents. This is different from the procedure followed by the consultant, who is the person we are talking about here. The consultant plays a much different role, in that it is not his function to reproduce all the information related to a case, which in part means that he can collect information for processing in whatever way he wants.

SHORTLIFFE: I'd like to describe the way in which MYCIN's rules have been acquired, to make it clear that we are not necessarily trying to make the program perform the way a clinician does. What we are trying to do is understand well enough the way the consultant analyzes the problem so that we can come up with a representation that works. All the rules we have in our system have been acquired at weekly meetings in which Dr. Axline and Dr. Cohen, the two clinicians most closely associated with our project, took patient charts and, with the end of those charts still unknown to them, began to review them. Those of us unfamiliar with the clinical aspects of what was being discussed would listen and try to pick out the underlying threads of reasoning. We would then code these into rules and use them to run patients' charts. We would then bring back the results to show the expert how the system actually used the rules in order to come up with recommendations. So our concerns were whether or not the rule we used represented a fact that the expert could agree with, whether or not he had ever used it before in that way, and whether or not the results of the program in terms of recommendations agreed with what he would have recommended for that patient. We want the program to derive the right advice, and whatever way we can come up with to do that is all right. So we are looking at something really very different from what Dr. Pauker and Dr. Schwartz have been doing in trying to understand the actual reasoning process that takes place.

KULIKOWSKI: Our system is a vivid example of how, if you want to give advice in a given area, imitating the doctor is often not the way to go.
What you want to have is a number of alternative models, with the simulation of a particular doctor being just one of them. It clearly depends on the scope of your problem and on the knowledge structure of a particular domain.

SZOLOVITZ: All of us are trying to provide people with expert clinical advice, and the methods for doing that can range from simulating the clinician's logic to using a mathematical model. Howie Silverman, for example, started out with what looked like a very large AI project, namely to derive a method for prescribing digitalis therapy. It turns out that the major part of the program is a very nice algorithm that does quite well, and it uses the AI technology when interacting with the real world. So if we could do that for all of internal medicine, perhaps that would be the ultimate solution.

SRIDHARAN: I see a tremendous richness of concepts going into the building of these systems, especially those of the MIT group. And I wonder how you go about deciding whether or not you need to do all this processing? Howie Silverman's project is a clear case. If he had wanted to make it look like a flashy AI program he could have done it. But actually the idea would be not to do it. If you can reduce the processing structure and encode your information in a clean form that will do the job, that is the desirable way to go.

SZOLOVITZ: An example of a very rich and complex theory is Andee Rubin's master's thesis, which is available at the AI lab at MIT. It deals with medical diagnosis. She observed one of the doctors in our team diagnosing Steve Pauker, who pretended to be a patient. The exercise was to go through the resulting transcript and establish the kinds of processes and knowledge involved. And that protocol became the basis for the system. Now unless you have very good models for the underlying medicine, it is very difficult to do much better in terms of dissolving the AI part and being left with the concrete model.

AMAREL: It seems to me in most instances the doctor is the decision maker who draws from certain bodies of knowledge that are for the most part systematic and ever expanding. And I see two components in the projects we are discussing. The first is the richness of the hypothesis space, which varies between systems, and the way each system keeps track of possible hypotheses, evaluates them, partitions them and uses them. MYCIN, for example, has practically no hypothesis formation process. On the other hand, DIALOG is very concerned with the taxonomy of a large number of diseases and syndromes, and searching that space entails deliberate processing of hypotheses. And this is where I think AI comes in much more than in some of the other systems. So the size of the hypothesis space and the kind of tools you bring into searching the hypothesis space are the determining factors. The other component is the extent to which a project is interested in simulating the doctor's decision making process in the clinical setting. Some systems are geared toward doing precisely that, while others draw from specific bodies of knowledge in a particular domain and a variety of strategies for using that knowledge.

POPLE: Our system is an example of a simulation. We did it not because I had any specific interest in trying to simulate Jack Meyers, but because I had no other way to get at the problem, and he seemed to be a good model for going about it. The heuristic I hit upon was the only one I could find that resembled the behavior I saw.
So I think you are right in saying there are different motives represented in these systems, and therefore differences in terms of the way one should look at results and evaluate them.

FEIGENBAUM: The problems being discussed here in the context of AI in medicine are almost identical to those issues and problems that arise in other areas of complex interpretation. This is a group of people who share the same sets of concepts, who read each other's papers as ARPANET messages the day after they've been generated, and so naturally we all share the same sets of concepts. I think everything that people have been talking about has had to do with expectation-driven or model-based systems for analysis; that these are model-based hypothesis formation systems specifically; that the models come in a variety of types, associational, causal and sometimes even statistical; that the knowledge is inconsistent, typically in great quantities; and that the knowledge is represented in a rich repertoire of representations we all know and massage each day. We may not use them all the time, but they represent the common tools and techniques for dealing with this knowledge in a highly flexible way. So everyone has come to realize that inserting the knowledge, deleting it, and modifying it are the critical problems, and we've all invented roughly similar ways of doing it. And coupling all this with these rich inferential processes, we essentially have a kit of techniques that we all can appreciate and explore.

I admire Harry Pople's courage in writing an article comparing these systems. I would say that the easier article to write would be one comparing what we might have heard, say, six years ago at a conference on medical diagnosis with what we are hearing today. There is an incredible difference. For instance, compare the current work with that of Ledley and Lusted of more than a decade ago, with the signs and symptoms matrix and the application of Bayes' theorem comprising the rich inferential rule of that system. Or compare the current work with the techniques on which millions of dollars have been spent in statistical pattern classification or clustering techniques for diagnosis. Or compare the current work with what was supposed to be the solution to all this, the so-called logic tree, which is very static. So the techniques that are being discussed at this conference are light years away from what was being discussed only a few years ago. There is an enormous gap between what we knew then and what we know now.

SAFIR: I am concerned that computer scientists think they are modeling or simulating a process that they view as static. But it may very well be that the process of medical decision making is undergoing changes almost as rapidly as computer science, so that what AI is using as a model today could be the product of medical schools thirty years ago.

SHORTLIFFE: In Dr. Engle's description of the past twenty years of medicine, it struck me that a tremendous amount of work and man-hours have been poured into the problem of medical decision making during this period. And now it can be automated and analyzed. And I wonder, if Dr. Engle gives a talk ten years from now about AI in 1975, whether he will be able to say that AI had the key that had been overlooked for those past twenty years. And I think the challenge we should recognize in all this, and take up at this point, is to keep from becoming obsolete in the near future.

LINDBERG: I haven't heard anyone attempt to measure the magnitude or quality of our accomplishments.
Ted asked what will be said of the work in ten years, but I think in much less time we will look back and realize that some of these diagnoses showed great achievement and the programs really did well, some were very simple and the whole thing was over-instrumented, and in some cases the decisions were wrong. And I think we have to make a serious effort to separate out which of our accomplishments are major and which are minor. They cannot all be of the same quality.

SRINIVASAN: There has been a lot of discussion about the usefulness of various techniques for producing advice in medicine, but I wonder what is going to be next. Is it going to be more of the same, more specialized model building? I tend to think that direction is static. Is this for the doctors the general paradigm, or is there also some concern for planning functions?

MEYERS: I would say good doctors in most circumstances must have a definite therapeutic plan, which may be modified with experience of course. We well recognize in DIALOG that treatment plans are extremely important in the overall scope, but to deal with therapy is as big a problem, if not bigger, than the problem of diagnosis. This is taking two worlds at once, which is just too much. Fortunately, smaller programs like MYCIN or CASNET can deal with this, but we had to put it in second place. And I believe Dr. Pauker is also in the same situation for the most part.

PAUKER: We are to some extent, but I think I have to disagree with your statement. I don't think that the world of diagnosis and the world of therapy are all that separate. I think it is a world of patients, and therefore we never really know until the autopsy that we arrived at the right diagnosis. We are always undertaking a therapeutic plan trying to make the patient better, not being certain of what the diagnosis is. And clearly, knowing what to model in terms of therapy initiation is very dependent upon and strongly influenced by what we mean by arriving at a diagnosis. Often it is very difficult to know when we have reached that stopping point. What that arbitrary stopping point is depends on what we're going to do next, the seriousness of the situation, and the amount of time we have to provide treatment. So we cannot finesse one or the other.

MEYERS: I agree with you. Perhaps I can make my point clearer. Once you get therapy into a system you then perturb the whole system, and the data base becomes radically changed by the very presence of the treatment. And that is a very complex change which I am sure causes as big a problem as the original data base.

PAUKER: I would have to agree with that. If you are studying a case in which you find treatment was already prescribed, it changes the whole issue of consistency. You needed a certain finding which is no longer there, because a doctor took it away. But the problem of therapy does not go away just because we ignore it. In dealing with two diseases, one can mask the manifestations of the other, in which case we are back in the same ball park. As opposed to being cured, we might say that the therapeutic intervention of a physician at some level represents another disease.

SZOLOVITZ: There is also the problem of history. When we study a case history, what does it mean to say that a person has had a certain disease for three months? He did not all of a sudden have it. He had a lot of different symptoms which in retrospect amount to this particular disease.
Now if he still has this disease in addition to some new disease, we are in exactly the same situation as we are when we initiate treatment. Because in order to understand the historical information, we have to cope with this question of how diseases behave in time and with other diseases, what our expectations were as opposed to what actually happened, and how we form hypotheses to account for them.

MCCORMICK: One of the great thrusts of decision making was cost-effectiveness, developed by people in the Department of Defense, which as far as I can see has practically strangled the community for the past ten years. The problems we are focusing on in the medical area are not that different from what is required for good decision making in other areas, including planning in government or business. From among the various techniques we have developed to solve our problems, could we find a more flexible mechanism to replace cost-effectiveness as the standard criterion for judging the progress and development in a field? The closest any group has come to dealing with that in the context of management is Bill Martin at MIT, who is building systems for management decision making.

FEIGENBAUM: I would like to discuss another potential application in that area. When you try to do hypothesis formation, you often reach critical points in the analysis where you need some sophisticated piece of data that is extremely costly to obtain, and you must decide whether or not it is worth the investment of time and effort to get that information. Right now we give over these decisions to human analysts. One of the things we know about these knowledge based systems is that they are extremely systematic in their application of a body of knowledge, and often much better at it than the human experts who build up the rule base in the first place. Could we use these systems for making those decisions, as opposed to trusting the opinion of the physician, who may not be as systematic? There have been other types of model builders which have considered this problem, but those discussed here are much richer in terms of the knowledge employed, and I wonder if it ought to be pursued.

******** END OF PANEL DISCUSSION ********

C. KNOWLEDGE ACQUISITION AND REPRESENTATION
MODERATOR - BRUCE BUCHANAN

BUCHANAN: This panel will discuss the acquisition and representation of knowledge in computer programs. The critical issue is how to transfer knowledge into the program. And as that depends partly on what representation one chooses, both issues are closely related. With DENDRAL we tried to custom craft the system. We worked with chemists many hours, putting their knowledge into LISP code. In the long run it somehow begins to work, but the stability of the project is crucial in this method of collaboration, because it is slow and tedious. Another approach is to move knowledge from the heads of experts into a program by an interactive dialogue system. We tried it with DENDRAL and we are pursuing it more with MYCIN. My own bias is that both methods are inefficient. We are therefore pushing the META-DENDRAL effort, which tries to take the knowledge directly and infer the rules that are needed for the program, thereby removing the expert from the picture.

DAVIS: With regard to acquisition, one thing we've found very useful in MYCIN is acquisition in context. That is, not only the knowledge but the reason for entering that knowledge is put into the program; for instance, entering a rule in response to a bug.
With this approach you get a step up on the problem of assessing the impact of a particular rule on the knowledge base of the system. One of the constraints on the premise of a rule that has been given in the context of a bug is that it is going to have to evaluate to true in the context of the current consultation. Otherwise the rule simply is not going to fix the bug. This kind of knowledge is in the system I'm developing. It will accept any rule you give it, but if it is in the process of trying to fix something and the rule will not be useful in that context, it will say so and request another. This is all clearly predicated on the assumption that working with an expert and putting his knowledge in the form you find suitable is the right way to go. In our system the form happens to be a rule, and we draw the knowledge out of the expert in this way. This method presents the problem of how to deal with the ramifications of a new data structure on the system. Giving the system some understanding of its own representation seems to help. That is, give the system some capability of dealing with its own data types, and of being able to follow along some of these implications just by the structure of the types. We've done this and it helps. There are semantic implications that I don't know how to handle automatically. At the moment the user has to guide the system.

SCHMIDT: The methodology in our BELIEVER system involves developing a model of the thought process that an expert, or anyone for that matter, uses to solve a problem. With that model we try to generate the response we think the expert will come up with in a given context. We then compare our model's response to the expert's actual response in that situation. We have found that unless we decompose or categorize the information in the same way the expert or subject has, it is difficult to extend the system further.

SRIDHARAN: I would like to show that three of the issues being discussed have a common solution. The first is that of designing the knowledge based system and putting formal knowledge into a predesigned, simple and uniform knowledge structure. This could be greatly facilitated if a natural form of representation were used. Second, there is no bug free system and there is no knowledge base that doesn't have problems. So it's not enough just to design a representation. It has to be designed with the idea in mind that you're going to be changing it constantly. Third, it is not enough for the system to produce right answers. It has to be able to give reasons for those answers, in some sense explain its own processes. It has to be credible. Again, my contention is that a solution to these problems can be found in representing the knowledge in a natural form. Roger Schank's group is doing work in this area using the notion of computable semantics. Srinivasan's Meta-Description System, which we are implementing partially in our BELIEVER system, is also founded on this idea of a natural representation. The problem we are all experiencing in trying to explain our own systems and understand others' could also be alleviated if the knowledge were represented in a more readable, natural form. It would make it much easier to get down to the concrete stuff of the system and follow its reasoning processes.

PAUKER: These issues of knowledge acquisition and representation depend heavily on how much knowledge you are talking about.
The domain of Internal Medicine is representative of the real world in its complexity and in the number of facts one has to know and work with. The process of acquiring all those facts in a data base is one problem; maintaining consistency in that data base is horrendous; and finding the errors in that data base is impossible. I don't know how you are going to go about doing it. Finding them instance by instance in any reasonable period of time is not possible. What we do with doctors is to produce what we think may be a reasonable approximation, send him out, and when he kills a patient we do a CPC, an autopsy, find out what went wrong and correct it. Just acquiring, let us say, all 210,000 facts contained in the text on Internal Medicine by Harrison is not expertise. The medical student who memorizes it all is not a doctor yet. He has to be able to apply that knowledge in the right circumstances, to organize that knowledge at run time, not just at the time of system formation. We can each chop out our own neatly constrained problem where each of our own approaches works. But applying these in real world situations is another problem.

DAVIS: There are two points here. One is, the text does not contain 210,000 unrelated facts. So we are dealing with an order of n, not n-factorial, when we talk about facts. I don't think we ought to be intimidated by raw numbers of facts. Clearly, there are levels of organization one can work with.

PAUKER: Our experience in developing the Digitalis program has been that we cannot compute all possible implications of a fact we put into the system in a reasonable period of time because of the number of chains it produces. As your system grows, an added fact becomes harder to deal with.

COMMENT: Perhaps Samuel's checkers playing program offers a useful approach for handling a large medical data base. He found that the best way to debug the checkers data base was to have it run through masters' games, and any time the program generated a move that wasn't the next move in the master's game, it adjusted its heuristics so it would generate that move. So a possible way to debug a medical data base might be to have the program run through CPCs and see if it is generating the same decision at each point, and if not, adjust itself. (A sketch of this idea appears at the end of this exchange.)

PAUKER: As an approximation it might be interesting to try that approach. But the problem with the CPC is that the input and conclusions are in arbitrary order. Some of the conclusions are even wrong, and there are no intermediate markers. So finding out where in the chain you went wrong is a problem. In addition, the uncertainty remains that perhaps the CPC came up with the wrong diagnosis. The characteristic of medicine is that the data input is incomplete, and part of the game with the CPC is that the doctor is led down the wrong path because all the data is not given to him.

FEIGENBAUM: I'd like to throw out some numbers also. Simon estimated from some experiments in chess perception that a chess master holds between 50,000 and 100,000 facts about chess. The estimated number of words in a typical adult vocabulary is somewhere between 10,000 for the average person and 100,000 for the extremely intellectual person. Newell estimated that if he were to put together a model of the whole man, he would have about a million production rules. Now the question is, is a million a hard number to manage? I think we would all agree that it is, given the kinds of mechanisms for representation we have been using.
One thing to consider is, could we cause the necessary evocation to happen in one machine cycle by using active electronics in nets of demons instead of search electronics? Each demon would be realized in an integrated circuit that would poke its head up when something came by. Now the cost of such a thing, if you consider something like a ten property demon, might be about a dollar in the current state of electronics. So for a million dollars you have a million demon memory which would evoke what is necessary to evoke in one machine cycle. And that's not absurd.

SZOLOVITZ: But that doesn't solve the problem of what you are going to put into the representation and how you are going to debug, not the methodology, but the actual content. For instance, it is nearly an impossible task for a panel of clinicians to revise Harrison's text so there are no errors in it. How can we overcome this problem of working in a domain in which we cannot certify that some new fact we add to the system is in fact correct?

DAVIS: I think we are in trouble if we reach that stage of simply putting things into the system without having any idea of whether or not they're correct. Steve said earlier that it is a near impossible task to follow down all the implications of a newly added fact. The alternative is not to follow it down, but to put it in anyway and wait for something to break.

PAUKER: Let me say something about the nature of the medical data base. It is not factual; it has errors in it. It evolves; it is self-contradictory. When students enter medical school they are told that half of what they will learn is wrong. The problem is that no one knows which half. So given that real world constraint, the data base must be inconsistent. Unless we can deal with that, we're in trouble.

FEIGENBAUM: Who cares if there are inconsistencies? The processor can be set up to take care of it. Take Pople's scheme for example. It could be that some critical observation is an outlier and extremely important to the hypothesis. But because it is an outlier the inference scheme doesn't deal with it, and that's a mistake. Fortunately, there is enough evidence redundantly available so that the inference of the correct hypothesis doesn't get demoted too much. So the inference scheme can be very tolerant of failures, of bugs, in spite of the fact that you don't check it all the way through in the data base. You sound like a bunch of mathematicians when you say if the system breaks that's it, you can't prove the theorem.

DAVIS: But that has been our experience in programming. A subtle mistake in one place leads to very strange things further down the line.

SRIDHARAN: I would like to suggest that the solution to handling multiple facts and finding contradictions in new occurrences lies in developing multiple representations. We should be able to put abstract concepts into the machine along with specific instances of those concepts so it can relate to them. For example, there is the kind of representation coming up in natural language work called scripts or scenarios, which are really concrete instances of those schemes of inference which one immediately invokes in order to assimilate a new fact. These are all heuristic vehicles for handling this complex issue of representation. So the solution is not to design the best representation but to have at your disposal a variety of methods for looking at the various aspects of the same knowledge base.
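[Editor's note: The Samuel-style debugging loop suggested in the preceding exchange can be made concrete. The sketch below is purely illustrative, written in present-day Python rather than the LISP of the systems discussed; RuleBase, diagnose, and the weight-adjustment step are hypothetical stand-ins, not part of any system presented at the workshop.]

    # A minimal sketch of the suggested approach: run the program through
    # recorded CPC cases and nudge the finding-diagnosis weights whenever
    # the program's conclusion disagrees with the accepted one.
    from collections import defaultdict

    class RuleBase:
        def __init__(self):
            # weights[(finding, diagnosis)] -> strength of association
            self.weights = defaultdict(float)

        def diagnose(self, findings):
            # Score each diagnosis by summing the weights of the rules
            # evoked by the observed findings; pick the best-scoring one.
            scores = defaultdict(float)
            for (finding, diagnosis), w in self.weights.items():
                if finding in findings:
                    scores[diagnosis] += w
            return max(scores, key=scores.get) if scores else None

        def adjust(self, findings, toward, away_from, step=0.1):
            # Strengthen links to the accepted diagnosis; weaken links
            # to the one the program wrongly produced.
            for finding in findings:
                self.weights[(finding, toward)] += step
                if away_from is not None:
                    self.weights[(finding, away_from)] -= step

    def debug_against_cpcs(kb, cpc_cases):
        # cpc_cases: (findings, accepted_diagnosis) pairs from CPCs.
        for findings, accepted in cpc_cases:
            predicted = kb.diagnose(findings)
            if predicted != accepted:
                kb.adjust(findings, toward=accepted, away_from=predicted)
        return kb

[As Pauker notes above, a CPC lacks intermediate markers, so a loop of this kind could only correct final conclusions, not the chain of reasoning that produced them.]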
SAFRAN: We've talked a bit about representation, acquisition and numbers of facts, but very little about the eventual use of these systems. How many representations do we need to effect any kind of use?

BUCHANAN: There are many uses of knowledge. Each task domain has its own specific uses, and if the representation depends critically on the use and there are no general principles to work with, then we are going to remain in this custom crafting mode, building separate systems for each task. There is a MYCIN experiment in which we tried using its framework in another task domain: diagnosing and recommending therapy for bugs in a Pontiac horn. People at SRI used the MYCIN structure to build a consultant for helping novice mechanics put together an air compressor and fix bugs in it. So we are finding the structure of the system useful in other domains. This was our first venture into a totally different domain, and there is no claim that it was a grand success. We did it just to see what kind of things we had hidden away in the program that were purely medical that we wanted to clean out.

POPLE: I'd like to point out that the process we are talking about is something that in most professional education is considered to be, if not unteachable, then at least the most difficult thing to try to convey to the student. The process of course is using the knowledge of a given discipline to solve real world problems. I think we have given the various professions some good insight into this process that they may use effectively. There is now a transfer from the computer programs back into the classroom that can take place. And it is not at the level of facts but rather at this process level, something that has been very difficult to articulate to students in the past.

********END OF PANEL DISCUSSION********

D. METHODS OF INFERENCE - FORMAL AND CLINICAL PROBLEMS

TED SHORTLIFFE - MODERATOR

SHORTLIFFE: The topic for this panel is methods of inference. I have a list of issues we could address in this session that come under the general heading of hypothesis generation and testing. The first issue is how to quantify inferences. They may be causal or associational, but we've all found a need to put a number on them. This includes knowledge that has been given to the system rather than what is actually derived during the process of reaching inferences. A related problem is the accumulation of quantification numbers for the hypothesis. We've all had to handle the problem of relating positive and negative evidence, as well as the clanker in diagnosis, that is, the one thing that seems to be against everything else. We have all had to design functions or algorithms for combining the numbers that have been accumulating in order to reach decisions. Another issue relates to validating our models. If we start perturbing the numbers that we have from the outset, does this really affect performance? If the numbers were available, could we use statistical theory, or are we dealing with issues that seem to go beyond statistics? Can we define testing procedures that will convince ourselves and the observer that the kinds of techniques we are using for measuring inference are reasonable and justifiable, at least in a theoretical sense? To what extent are we trying to avoid issues of independence of evidence in favor of the hypothesis? We try to keep our rules separate and individually executable to avoid having to relate them explicitly to one another.
And I think many of us have come up with schemes that allow us to skirt this issue, mainly because we just don't know how to handle it.

KULIKOWSKI: There is a certain amount of uniformity among the clinical projects in dealing with quantification. Obviously, we have relied heavily on the clinicians' judgment in acquiring these weights. One important issue in work of this kind is to relate these weights of evidence to some of the more objective statistical measures that one could obtain, say, from a data base. Part of the problem in all of our systems is that they are over-determined in some sense. We have a lot of redundancy in them, quite deliberately, because we attempt to explain the structure of hypotheses in alternative ways. As a result, if you want to validate or test one of our systems or acquire new knowledge, in some sense what one has to do is to freeze the part of the system that is under examination. And that is a very difficult job because we have often skirted the issue of interdependency, as Ted has suggested. In our project we are very interested in seeing how far we can get with the independence assumption and where it breaks down. We haven't yet done any formal study of this. On the other hand, we are reaching a point where we often do need rather complex combinations of events to give us a useful clinical state to reason with. As we learn more about the necessary description of diseases and ways to reason about them, we will be able to extract those parts of the description that need strong interdependencies from those which do not. We've found in glaucoma that as long as you stick to a relatively vague description you can maintain a very simple causal flow. The moment you want to characterize more precisely some of the interactions, you find that many things are not just a simple sequence of cause and effect but rather a set of interdeterminers in some form. As for accumulation of quantification, all of us fall back on the notion of independence. But I think there are significant differences between composing things along a causal chain and composing on a purely taxonomic basis. On the whole, I would say there is more arbitrariness in a taxonomy than in a causal scheme, although we must be certain that the causal scheme is really causal and not just something we imagine to be causal, which is part of our problem in the medical domain.

BUCHANAN: The DENDRAL program does not present many of these problems of uncertainty. In the chemistry domain the inference mechanisms are largely stochastic processes. Essentially, we are able to get predictive rules from the chemist. These are probabilistic, so there is some weight associated with them. The chemical structure is described and you expect to see evidence for certain actions. As a result of the action, new situations are produced for which there is some evidence and data. Now all of that can be run in a straightforward predictive way, and there is really no inference problem there. The problem comes when you try to read those rules backwards. That is, from the evidence derive the processes and the fundamental situations from which those processes arose. In the Meta-DENDRAL program we are working with the same packets of information, but they are arranged differently. Given some collection of evidence and a global structural description, namely the whole molecule, infer the rules that one needs to use or test the program in either a predictive or inferential way.
We tried discontinuous scales for our inference rules and found that they didn't work. The problem was that in different contexts "strong" or "weak" weights meant different things. We found we could do better on a more or less continuous scale.

MILLER: I would contrast DENDRAL with the medical systems, because to replace the clinical experts with a meta-rule-forming system would be giving machines more responsibility than they can handle, at least in the medical domain. I think that in this area the computer is not out on its own to derive clinical expertise, because humans already have that. The problem is to apply the human expertise which is already in the system, and I think most of the medical systems have done this.

PARKINSON: We have had a problem which is common to belief systems, and that is knowing what to do with negation and reciprocal belief. What we've done is to use both. For instance, PARRY has the following beliefs:

    The doctor desires to harm PARRY.
    The doctor desires to help PARRY.

At first one might approach this by assuming that if he's not harming then he wants to help, and if he doesn't want to help then he wants to harm. But that is not the case. We have to add to each of these its negation, since the belief that the doctor doesn't want to harm PARRY still says nothing about his desire to help. Likewise, the belief that the doctor desires to harm PARRY still says nothing about his desire to help. Another problem is that on a scale of 0 to 10 we start out with a belief system that contains zero information, except for the initial assumption that he's probably the doctor and that he probably does want to help. We have found in our model, and we believe this happens with humans in real world situations, that as one gathers evidence to affirm a certain belief, in this case that the doctor does want to help, it tends to get believed strongly enough so that any further evidence that might challenge that belief gets explained away. So a belief can start from 0 and rise up to 10, and if counter evidence comes into play it may have little effect on it. We believe for humans that if there isn't too much counter evidence to challenge a belief, it probably does not change unless it is really important. Likewise, in our system we look at the importance to the model of inferring that belief. For example, it is fairly important to find out if the doctor is trying to help us. It is very important to find out if he is trying to harm us. So if we decide that he really doesn't want to harm us, and then some counter evidence appears and indeed the doctor starts attacking us, then certainly this is important enough to alter the initial belief. There is also a problem when the positive and negative evidence, say for the doctor's desire to help, is of equal weight, so that neither one is believed. I have one last comment about the strategy in the system itself. All these mechanisms are related to the original reason for proving or disproving the belief, and that is self-interest. In order to make certain actions possible we have to find out if the environment allows it. And at that point we try to infer belief. It is not as if the program tries to prove everything it can. It wants to do something, and it makes these inferences to find out if indeed it can.

PAUKER: Something interesting happened with a program we developed. A wrong number was accidentally inserted, and about a month later I discovered it, but the program had worked anyway.
I changed the number, put in a different one, and it still worked! And that really raises the issue of whether a specific number really does make a difference. Perhaps there is a simpler mechanism. To some extent the method of inference is embodied more in the links than it is in the measures you put on the links. I think if you have the appropriate links, the appropriate structure of the data, the exact quantification that goes on there is probably insensitive within a reasonable range. It is strange that we could put relatively arbitrary untested numbers in it and still have it work. As a physician, my view is that the key to the program's performance is experts. It is not more facts or numbers; it is the doctor using more interconnections and heuristic rules. And I don't think these kinds of heuristics can be built into numbers. So the right numbers or algorithms really don't make any difference.

SAFRAN: I think this gets us into the issue of the credibility of any medical system. Given a data base and a set of numbers, it is very important to be able to explain to a physician who is using them how these numbers were arrived at, their relative importance, and how the system goes about reaching a decision or a hypothesis. The arbitrary assigning of numbers leads you away from credibility.

SCHMIDT: I would like to reinforce that statement. There is very little you can say to the expert when he wants to know how you came up with your answer. And I think that is something worth considering if we hope to attract other experts to the system. They are the responsible persons in this case, and they must have all the necessary information with which to evaluate the system. As a psychologist I use numbers all the time, but I've avoided using them in common-sense reasoning because I find I need something more symbolic. Typically, I'm working in a world of partial matches; the evidence only partially matches the entire rule I'm looking for. Substituting symbolic rules means I do have a residual after that partial match, whereas with a number I just have a difference. There is no further computation I can do in my system with that difference between the number I would have liked, say probability 1, and the number I got of probability .8. So I think if you want to organize very complex evidence you probably will do well to stay away from numbers.

SHORTLIFFE: Certainly your first comment is a potential problem in our system, and probably one for all these systems. There are numbers that guide our rules, and we've gone to great lengths to implement some capability to explain the reasoning. The expert may ask us how the program reached the conclusion, and we can list for him the six rules. And each one of those rules may look just great to him, but he simply cannot accept the conclusion.

MILLER: In our system we have found the actual number in any particular instance, plus or minus 1, doesn't make a lot of difference. But I think doing away with numbers or saying numbers aren't important is something we really can't do. I ran an experiment whereby I wiped out DIALOG's evoking strengths and gave it equal weights in terms of confirming a diagnosis. I then used this altered version of DIALOG on a case it had solved previously. Its behavior was very different, and it didn't perform the way you would expect a physician to perform in terms of coming to a diagnostic conclusion. So these experiments showed us that the numbers do matter quite a bit.
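[Editor's note: The two experiments just described, Pauker's accidentally wrong number and Miller's flattening of DIALOG's evoking strengths, amount to a simple sensitivity test. A minimal sketch follows, assuming a diagnose(weights, findings) function supplied by the caller; the names are hypothetical, and the code is illustrative Python, not drawn from DIALOG or any other workshop system.]

    # Sketch of a weight-sensitivity experiment: re-run previously solved
    # cases with (a) randomly jittered weights and (b) all weights set
    # equal, and count how often the original conclusion survives.
    import random

    def jittered(weights, noise=0.2):
        # Perturb each weight by up to +/- 20% (relative), as in the
        # "arbitrary untested numbers" observation above.
        return {k: w * (1 + random.uniform(-noise, noise))
                for k, w in weights.items()}

    def flattened(weights):
        # The Miller-style experiment: equal weight everywhere.
        return {k: 1.0 for k in weights}

    def sensitivity_test(diagnose, weights, solved_cases):
        # solved_cases: (findings, established_answer) pairs.
        for name, variant in (("jittered", jittered(weights)),
                              ("flattened", flattened(weights))):
            agree = sum(diagnose(variant, findings) == answer
                        for findings, answer in solved_cases)
            print(name, ":", agree, "of", len(solved_cases), "unchanged")

[On Pauker's account, one would expect the jittered run to agree almost everywhere; on Miller's account, the flattened run should not.]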
SILVERMAN: As computer scientists we learn to deal with numbers quite a bit, and I think there is an over-propensity toward looking at numbers for answers. Once we began to define our model and associate more links between items that were coming in, the actual numbers that we were using became unimportant. What happened was we got it down to a tertiary system, and that seems to work just fine because we have a thorough enough model. So instead of having a range of seven or eight possible values, we have three, along with a great deal of information as to how to choose which is the appropriate one.

SHORTLIFFE: You are saying that a discontinuous three valued scale seems to do very well. If proper associational links between evidence exist, do you think you can simplify the numbers more and more?

PAUKER: Let me add one point to that. When we talk about a discontinuous three valued scale, I think we mean using that to measure strength of belief.

SHORTLIFFE: Yes.

PAUKER: The three by three matrix that Howie talked about, that is, toxic, a little toxic, not at all toxic, is a statement matrix. It is not a level of belief matrix.

SHORTLIFFE: It is to the extent that a set of observations about a patient has got to be mapped into one of those states. So there is some element of belief about which state the person is in, which is reflected in the three values.

POPLE: I had a strong aversion to the use of numbers at first, but it became clear in going through cases that Dr. Meyers did use terms that definitely suggested the strength of associations. So we found in the language of the clinician relationships which we eventually had to incorporate into the data structure of DIALOG. And I don't think it is all that difficult to take these numbers and translate them back into the kind of terms or ideas that they were intended to convey in the first place.

KULIKOWSKI: Our approach was slightly different. I started out being quite a lover of numbers, having worked in a number of pattern recognition applications. My motivation in moving away from them was because I found them unsatisfactory for explaining the structure of our reasoning to a clinician, and more significantly because if a numerical method alone doesn't work you are not able to trace back symbolically that residue that Chuck had mentioned. In the early stages of our system we removed the numbers from the causal links but kept the numbers between the evidence and each state. The system worked comparably well in doing that. So I would say if you go from the structure of a subgraph of the causal net to a higher level hypothesis, the numbers can be important if you are dealing with a large number of hypotheses the way Harry is doing. When you are dealing with only a few hypotheses the mapping can be deterministic. We've defined the problem so well by the causal subgraph that it is a one to one mapping.

COMMENT: I don't see how the use or absence of numbers has anything to do with the difference between the CASNET model and the DIALOG model. I feel it is solely due to the degree to which you have been getting close to the metabolism. If the ophthalmologist has a very good understanding of the metabolism, then you have relatively firm linkages that can be described in a binary way. Now in the general case of internal medicine we are very far away from having such detailed understanding, so we approximate a more complex situation statistically by linkages to which we assign values.
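[Editor's note: A minimal sketch of the discontinuous three-valued scale discussed above, in illustrative Python with assumed thresholds; it is not taken from any of the systems mentioned.]

    # Collapse a continuous evidence weight in [-1, 1] to three levels,
    # then map a patient's observations into the single state (e.g.
    # toxic / slightly toxic / not toxic) with the highest belief, per
    # Shortliffe's point that observations must be mapped into one state.

    def ternary(weight, low=-0.33, high=0.33):
        # -1 = evidence against, 0 = indifferent, +1 = evidence for.
        # The cutoffs are assumed for illustration only.
        return -1 if weight < low else (1 if weight > high else 0)

    def classify(state_beliefs):
        # state_beliefs: {state: continuous belief in [-1, 1]}.
        # Choose on the collapsed scale; break ties with the raw value.
        return max(state_beliefs,
                   key=lambda s: (ternary(state_beliefs[s]),
                                  state_beliefs[s]))

    # Example: beliefs derived from observations about digitalis toxicity.
    print(classify({"toxic": 0.5, "slightly toxic": 0.4, "not toxic": -0.2}))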
MEYERS: I don't think it is a difference in level of understanding but, as Cas said, the complexity of the problem. I think we can take any subset of internal medicine and follow exactly the same rules that we are talking about. It is the complexity of the problem that requires numbers, so that you can keep your facts straight.

KULIKOWSKI: I would tend to agree with Dr. Meyers. To go between levels one needs numbers. But once you are at some level of understanding you can operate symbolically.

COMMENT: I think everyone would agree that the purpose of AI is to produce machines that will do intelligent things at some level. And if they do things intelligently the way people do them, they inherently run into the same kinds of errors that even experts can produce. So the point of view could be taken that by using a quantification scheme with a consistent numerical process, even though the machine has been up for 48 hours, it is more likely to give a consistent answer than the physician who has been up for the same amount of time and is not at peak performance level. So I think a good argument can be made for a quantification scheme: it does at least have the virtue of being consistent, if nothing else.

PAUKER: At the current level of technology, do machines stay awake that long?

SAFIR: I think we are probably at a stage now of complicating the assumptions and losing the forest for the trees. We go through stages like this where things begin to look more and more complicated, and after a while somebody backs up, looks at it critically, and offers another simplified hypothesis. We are right now at a phase of complicating the science and waiting for that next step. Somebody once said that all diseases come down to the simple phenomenon of a tube getting plugged up somewhere, and it is true. You get very involved in the clinical richness until someone comes along and finds out by electron microscopy that it is a tube getting blocked. Things work in amazingly simple ways, but we organize them in our thinking in ways that are complicated and have nothing to do with what is actually happening. And the numbers don't really exist; they happen to be a good cerebral mechanism for dealing with ideas that we cannot handle otherwise. I think at this stage our representations are quite imperfect, and it would be nice to have been back when we thought we knew what we were doing. That must have been a comfortable time, before we had machines to test these hypotheses.

********END OF PANEL DISCUSSION********

E. PROBLEMS OF SYSTEMS DEVELOPMENT

SAUL AMAREL - MODERATOR

AMAREL: This panel will discuss the management of systems development. We will try to get a feeling for the more practical aspects of managing projects, and share problems, advice and experiences we have had in collaborating across disciplines. It is the counterpart of the project oriented, scientific activities we have discussed so far, and no less important. I want to start by asking Ed Feigenbaum, who has had almost ten years of experience in large projects involving the application of AI in scientific and medical programs, to tell us of his own experiences.

FEIGENBAUM: Let me say something about experts, because they represent the kernel of what it's all about in the knowledge based system design area.
In discussing how one picks an applications area in AI, heuristic programming in particular, aimed specifically at Medicine, I listed as one of the criteria under knowledge base: "Is there in your environment at least one highly knowledgeable, highly motivated, computer oriented and computationally sensitive expert who can serve as an informant, through whom the knowledge base can be acquired?" One can partition the classes of experts into the computer oriented and computationally sensitive experts, and those who are not. The only place I've ever gotten into trouble in work on a knowledge based system is the one place I had an expert who didn't know the first thing about computers. The kind of mental model a person has about what a computer can and cannot do is extremely valuable. Without that it's hard to make any progress at all, as the person has no scale of measurement against which to suggest an idea. The computer oriented and computationally sensitive experts break down into three classifications. There are the area experts who are not computer science oriented, but who understand scientific research. There are the quasi-computer scientists who know a great deal about computer science and technology and could be computer scientists or not, depending on what university they sat in. Examples of such people are Lederberg and Ray Carhart. Those are people about whom one often has guilt feelings. That is, they are so good at what they do, in let's say chemistry, that one feels very guilty for having yanked them that far over into computer science. Then there is the very special brand of expert represented by Ted Shortliffe, who could call himself a professor of computer science or a professor of medicine or a practicing doctor. The equally rare complement to that is the computer scientist who will make the trip more than half way into somebody else's discipline. It is becoming increasingly difficult to find applications oriented computer scientists who are willing to become minor experts in somebody else's domain in order to translate the conceptual terms. They just don't see the payoff. Let me talk about payoff. You have to arrange that the expert you find sees the payoff in what it is you want to do. You may not be able to demonstrate that on day one, but you have to have some way of getting to a point where that person gets to a terminal, or at least a seminar, in which the payoff is made clear to him. So there are two problems. First, get right into the heartland of that expert's domain. Then plan for incremental payoff so that you can get over the first threshold and sustain his interest. He has to see that those first five facts he put in made a difference, or he won't put in the next five after that. After you manage to bootleg some resources to get to stage one of having a credible running program that can serve as the platform for an NIH proposal, plan that the very first renewal application for that proposal involves a study section of at least half that expert's peers. You want a discipline oriented evaluation as well as a computer science evaluation at the very first stage, which is, say, three years down the pipe. Then if you plan to carry this on for more than the period of the first renewal, plan for an almost totally disciplinary evaluation at the end of it for the second renewal. Nothing is guaranteed to make the expert more attentive to getting knowledge into the program than knowing he is going to display it to his own peers.
Finally, the computer scientist and expert must be assured of a very good computing resource with adequate amounts of computing right at the beginning, and a level of sustained funding that is reasonable. And by that I mean nothing less than three years is reasonable for an effort in this area. If you can't get a three year go at it right away, don't bother. It's just not worth getting into if you have to struggle with these problems of resources.

SAFIR: I'm sensitive to the peculiar role of the medical practitioner who gets involved with computer science, because it's happening to me and it's an involuntary act. The doctor who gets involved is likely to be a full time academic doctor who works in a medical college or teaching hospital. He is different from the practitioner who delivers health care in that he is more likely to be involved in trying to improve the applications process, and is not someone who practices application all the time. He is an investigator. And that's a different sort of person from the practitioner. There is a big spectrum of doctors, and you are being exposed to a very biased sample here. I would in no way hold myself up as a typical ophthalmologist. I think freak is probably the best word for people who are interested in something this far from the practice of medicine. The academic investigator doctor finds that if he's going to be a good investigator, he has to learn how the tools work in order to use them intelligently. He finds himself involved in computer science without being able to help it. And if he has the particular cast of mind, it becomes an exciting new discipline and he finds himself putting a foot into that camp. I think medical doctors who start out as practitioners, as I did, are going to be replaced by people who went about it in an orderly and disciplined way, like Ted Shortliffe, who decided to learn both disciplines from the ground up and then put them together in one head. I think that the problems of collaboration between doctors and computer scientists are far more complex than they seem to be. It's not just a question of getting a good doctor interested in computer science and learning the technology. I think you ought to try to capture medical graduate students right at the outset with displays, devices and services that they can understand, that are non-threatening. They may seem terribly mundane to you, but they are things that doctors want and understand. Once a doctor learns that he can get a useful service from the computer through having fun at a terminal, then you've got him. Then you can entice a larger and larger percentage of them into doing something more scientific.

SMITH: There are several questions of collaboration across disciplines, in our case computer science and chemistry. I would use a broad definition of interdisciplinary collaboration and include the various subdivisions of chemistry. There is much collaboration that can go on, and SUMEX, of course, provides one mechanism for doing this. But it hasn't removed all the difficulties. Some people, for example, wouldn't mind so much being users of our programs, but the mention of collaboration conjures up perhaps some interference in their own particular research projects. I think the way to extend the kinds of things we're doing to an outside community is to demonstrate utility, and to provide them with information that is difficult to get in any other way. The traditional method of demonstrating utility has been to publish papers in the literature.
But that method breaks down when you are talking about computer programs which are applied to chemistry. There is no way you can describe a computer program of any complexity well enough to enable another chemist to replicate it. Again, we have the hope that SUMEX will provide a mechanism for removing some of these difficulties. We hope it will allow chemists to get their hands on a program, try it out, and see what it can do on a problem of their own definition.

AMAREL: I would say that a primary challenge for the AIM community, and especially the AIM workshop in terms of goals, is how to transmit current stages of development of a very complex program, not only to collaborators but to other interested people.

YAMAMOTO: Perhaps in this area we need some technique for scientific communication different from the traditional journal article. These articles are based on the fact that the scientist performs essentially a demonstration on an existing natural object. Whereas in computer science, in many cases the science is creating the object on which it is also demonstrating. The problem of publication therefore is simultaneously to report the demonstration and to make available to another interested scientist the opportunity to either verify, perform or alter. And this cannot be done by writing a journal article. The root problem that this community faces, therefore, is the problem of creating a new science, where science is a social activity. And it is in this sense that AIM's sharing facility is a very important component of the future of those who want to continue working in applied areas of AI.

RINDFLEISCH: Through the networking and the direct contact with these programs, people can start to share code, and take these concepts more directly than from published journal articles, which give only conceptual descriptions of what is going on.

AMAREL: Don Lindberg, as chairman of the AIM advisory committee, has raised some very fundamental issues about what this community should be doing and how it should be interacting.

LINDBERG: I want to comment on the two matters that Ed Feigenbaum and Aran Safir raised, as I can agree with everything both of them said. Ed suggested that a permanent alliance must be made between the computer scientist and the medical man. I would extend that by including an alliance between the medical man and bioengineering. And if you can't anticipate working together for five years, it's probably not going to be profitable. It doesn't mean that nothing will come out before that, but it takes at least that much time before an easy working relationship matures. With regard to what Aran said about ophthalmology, if you look at the historical sequencing of medical specialists becoming involved with computing, I think that right now ophthalmology is new to computing, and some of the pizzazz elements that you are after stem from that specialty having just now gotten ready to be interested. Probably the first group of clinical specialists to be users of the computer were the pathologists, because they produce lots of numbers in the laboratories and it made sense. Next was radiology, because they were quantitative people too. They had recording problems and image problems. These two groups have settled down, and they are essentially, as specialties, committed to computing. They're locked in. Their views toward the kind of applications you give them are quite different now than they were fifteen years ago.
Just a week or so ago, my colleagues were dealing with a problem with a wholly automated, magnificent AVL blood gas machine which comes with a little microprocessor on a card. The manual for a technician is such now that all you have to do is hold the tube in your hand and find the hole in the machine. It turns out that they had enough operational problems with it, even though it's a glorious methodology, that they're going back to an IL, which is essentially a semi-automatic machine, and they are making the decision without a backward glance. They are fully able to evaluate the nice technology and trade it off against better performance. They are really launched. The ophthalmologists will be stuck with computers too, and their desires will shift as the association matures. That leads me to the point I wanted to make about collaboration. I would urge all of you developing this collaboration, particularly on the computer side, to be very, very slow about promising working systems and urging your colleagues to use them until you are really ready. A program that can be demonstrated on SUMEX at a meeting like this is an order of magnitude away from a working system. There are all sorts of people arrangements that are necessary to make it work. If your chemist collaborator is using the system and only he observes that it doesn't work, that's one thing. But if you are serving the ophthalmologist who is going out and making promises to his colleagues and his patients, or the pathologist who is running a service for the whole institution, and those go sour, well, that's why people move from one school to another. The partnership is no good unless both parties feel they are winning. Assuming that can be accomplished, and patients and colleagues don't get hurt and feelings are not bruised, then the obvious question is, what is the computer scientist going to get out of it? Well, he wants to get some papers which are of importance to the field of computer science. And often the first impulse is to generalize everything, which is a good scientific approach I guess. But if it takes the form of looking at a simple problem and making it hopelessly complex merely so that it can be described in a jazzy new terminology, my advice is don't do it. When something can be done simply, do it simply, and be proud of having made life simpler and not more complex. So, if you are going to commit yourself, make something that your collaborator is seriously concerned about and is going to use. Use the best methods, and don't make it complicated if you don't have to.

AMAREL: Bill Baker of NIH is involved in the management and administration of all these enterprises.

BAKER: I was a biomedical engineer before I ever went to NIH, and chairman of a biomedical engineering department that was multidisciplinary. So I'm not part of this community; I'm an outsider looking in. Our annual budget at NIH is around 12 million a year. It's very, very small. We also have, in other agencies within HEW, programs that have direct impact on your activities. These have a problem that I'd be concerned about if I were on your side of the fence, in that some of them are sheer impulse functions. They come and they go. Unfortunately, a great many people here are being supported today by these impulse type programs. Other programs at HEW have an instability in the size of the activity. The technology supported by the National Center for Health Services Research seems to have a very indefinite size.
It changes almost week by week. They have a lot of money one week and none the next. These other programs differ from ours in that NIH has a very carefully prescribed area of responsibility in the health enterprise, and that is basic and clinical research, or health knowledge. These others deal more with patient care and the health services system. And NIH is being pressured by Congress right now to move over into what NIH calls disease control and demonstration. It's a very big issue, and when we meet with Dr. Yamamoto it is very frequently the most important thing that we discuss. It has to be worked out so that the nation's needs can be met without diluting or sacrificing any of the effort that is going on in the basic and clinical research activities. We are supporting research that is also supported by sister agencies, and we will continue to do this. The rationalization that I see for doing this is that all these projects have the subset of AI dealing with organizational disease built into them. SUMEX/AIM is not the only nationally shared resource that we deal with. Our concern in such activities is that we have developed a certain capacity of high technology, of complex methodology, that can be shared across the country. But the mechanisms that are most appropriate to marry the collaborators to the system, the financial support for these activities, we are very concerned about and are working on. In fact, I've come up with a new resource called an interface resource, which Saul Amarel really represents. He started out with no equipment and interfaced his collaborators to hardware that was not under his resource. He's quickly adding to the resource capabilities and will go from the interface resource into the regular category. Hopefully, the decision makers of NIH and OMB and Congress will think this is a good idea, and additional funds will be put into our program to bring more attention to this method of work.

FEIGENBAUM: I'd like to make a point that relates Don Lindberg's comment about not letting systems out too soon to some of Bill Baker's comments. It has to do with the nature of the enterprise and our ability to sustain it over a period of time. I think we have failed to persuade society's decision makers that the pursuit of intelligence in machines is of great value to society and ought to be pursued as a long range endeavor, though it's difficult and expensive. There was a time when we had almost no one to make happy as a result of our research, because there was plenty of money to go around for all at NIH and other funding agencies. Now the emerging science is being asked what it has done for society lately, and being given eighteen months in which to answer. Whereas in the sixties we had ten years to answer. In most of what we do, we haven't yet reached the first order consumer, and unless we conduct some kind of field testing for our programs and come up with numbers that justify our work and show that an intellectual task is being performed better or cheaper, with some greater social utility than it was before we did it, the interest of society in sustaining this research is not going to last beyond the first SUMEX/AIM grant. Academic collaborators are accustomed to going about 70% of the way in developing usable programs. We are not good at engineering products. We need to construct a mechanism whereby someone else who likes the task of going the last 30% of the way, and is good at that task, takes it over for us.
As it stands now, when it comes time to make a DENDRAL work for the consumer, we have to do it. We'd like to move on to other things, but to sustain ourselves we have to do it. So we need to invent this other kind of institution that will close that last gap, that will make our systems usable at least up to the stage of some kind of user evaluation.

BAKER: Ed, you should read our guidelines in biomedical engineering resources, because we have that built in. Now when DENDRAL is ready to go from computer science into this phase, you just switch modes with a whole new resource approach. The mechanism really sits in our program to make this transition.

FEIGENBAUM: But who is going to do it?

BAKER: It's a different kind of mode. We have one biomedical engineering resource we support now under this set of guidelines, in microelectronics at Case Western. Its advisory committee has membership from industry to give advice on this. Its mission is to carry out enough collaboration with the prototypes that it develops to take the risks that the interested stockholders of a company would normally take. NIH is willing to go that far.

LINDBERG: I feel totally resonant with Ed's remarks. I don't think that preparing a guideline creates the people out there to do the job. University people are creative, that's their strong suit, and nailing together a finished production system practically never happens.

BAKER: It lets them know where they can get support to do it.

YAMAMOTO: I'd like to make a comment that deals with marketing and the issue of creating a science. Science is sustained by a cloud of people in society who are sympathetic to it, many of whom practice a sub-version of it or identify themselves with it. To build a science you must build a broad pyramid in the society that recognizes that what you are doing is akin to what they are using, and it is the small actions that you take that create a science. If you market an idea using an agency of the government and the good will of a portion of the academic community, you can then begin to create that cadre of sympathetic individuals in the society who use computers for intelligent tasks. There are now at least ten companies that say they are selling AI kinds of things, and there are many hospitals that are using intelligent activities performed by machines. And the next thing we succeed in selling in the medical care area will sell that much more easily. So you really should pay some attention to the marketing of ideas, because it is a very significant portion of scientific endeavor.

FEIGENBAUM: I think you have just highlighted again the need for an institution that likes to do that.

YAMAMOTO: If you should find someone who wants to sell your ideas or promote them, you should not regard them as either inferior or in a different area. You should always be willing to embrace that cloud of other activity around you, or else you will not be embraced by society. I don't like the market. My background is just as esoteric, in a different science, as yours is. But I've come to realize recently the importance of the market in my field.

COMMENT: I wanted to point out that there is some pure basic scientific research that has to be done in AI, and we don't want to alienate those researchers from doing their work in the problem domain of medicine. Even though their product may not be sellable, their work is just as important to this community as producing DENDRALs that are going to be sold.
SAFIR: Marketing implies that you first find out what is needed, and that's much more complex than it seems. If there is a great need for a product in the community of users of medicine, it is not for a diagnostic aid in serious metabolic or internal medical diseases. It is for something so terribly mundane that the academicians are not interested in it. I think that the problem may be almost insolvable, in that the community providing solutions is not interested in studying the problems people want solved.

AMAREL: Before we conclude, I would like to have a more general discussion about the workshop itself, its content, and suggestions for preparing next year's workshop.

RINDFLEISCH: We will put a list of the participants and their addresses and telephone numbers in a file on SUMEX so people can get it right away.

COMMENT: I suggest that for future workshops we have proceedings or some mechanism of publication of people's ideas. It would be helpful to have something that would inform us of what will be discussed beforehand, and that we could then take with us to peruse after having seen the systems and having run them.

AMAREL: We are going to make an effort to put together something which can approximate proceedings and will give a fairly concise account of what went on at this first workshop, and have it on file in SUMEX/AIM. This can be done, and we will try to have it accessible to at least those people who can access SUMEX/AIM, which is a fairly large population. As far as other publications are concerned, I am not certain.

YAMAMOTO: Let me pose a challenge. You are AI people with an AI facility. Your AI report of the proceedings of a workshop like this ought to be some type of mnemonic recollection in your AI systems. It should record in your machine based systems what, as a consequence of this workshop, has happened in your respective AI prototypes. It seems to me that if you have a true AI philosophy, your AI system ought to grow as a result of the encounter. I think that would be the ultimate form of publication for this community, rather than bound proceedings, even though the distribution base in this would be very narrow, perhaps only to those people who have attachments to SUMEX. I don't know how to do it; I can only imagine what it might be like. But it ought to be something uniquely different, and that's a challenge. I think the workshop was too long. I heard on the first day some comments about who is allowed to stay and who is not, and so on. I think you ought to blur that edge next time so it isn't so sharp.

AMAREL: I think the publication is a marvelous challenge, and a formidable one. Your challenge is in five years to get to that point. What do you think, Bill, can we do it?

BUCHANAN: With respect to that issue, there are some things that are going on at a very low level in MYCIN. We try to keep track of the author of a rule, and that is some sort of an acknowledgement of how the system is growing. In SUMEX itself we are trying to work on a bulletin board facility for dissemination of informal ideas, where you can be notified when something of interest to you is posted by someone else.

SRIDHARAN: Bruce mentioned that there is a long lead time between the actual birth of an idea and the time you put it down on paper. And one of the things I was looking forward to at the workshop was hearing nascent ideas that I have not already read. But I think it has worked out quite the contrary.
Therefore, for the next workshop, I really hope that we get the fresher ideas from everyone and not a rehash of what we have already written up or thought about.

SAFIR: Writing up your ideas in an organized form is a difficult task, and busy researchers are not likely to do it unless they get a significant reward, which is generally a recognized publication. If the National Library of Medicine would recognize a file that could be accessed by anyone who had a printout device, so that one could get this electronically stored manuscript, which would have to be suitably refereed, I think you might start an electronic journal that would have value and would be rapidly responsive to people's thoughts.

FEIGENBAUM: To follow up on what Sridharan had to say, one of the consequences of the workshop could have been an open forum in which people suggested to each other the next set of problems to work on, and not just absorbed what has been done already in various projects. There are many problems yet to be tackled in the computer science area, and also problems from the medical domain.

AMAREL: With these suggestions with us, let us bring this first AIM workshop to a close. On behalf of all my colleagues I wish to thank first and foremost Cas Kulikowski, who is the organizer of the workshop, N.S. Sridharan, who worked with him throughout on many different problems, and Saul Levy, who organized the computing activities for the workshop. I also want to thank Pat Moore and Ken Brown, who were vital to its success, and our graduate students here at Rutgers for their valuable help. Lastly, many thanks to the AIM advisory committee, the AIM executive committee, and of course SUMEX/AIM itself, which provided a very useful way of working and planning for this workshop. Thank you very much.

********END OF PANEL DISCUSSION********

VI. REFERENCES

1. Smith, D.H., Masinter, L.M., and Sridharan, N.S. (1974) "Heuristic DENDRAL: Analysis of Molecular Structure", in Computer Representation and Manipulation of Chemical Information (W.T. Wipke, Editor), John Wiley.

2. Buchanan, B.G. (1974) "Scientific Theory Formation by Computer", in Computer-Oriented Learning Processes (Proceedings of NATO ASI, Bonas, France).

3. Wipke, W.T. (1974) "Computer Assisted Three-Dimensional Synthetic Analysis", in Computer Representation and Manipulation of Chemical Information (W.T. Wipke, Editor).

4. Colby, K.M. (1973) "Simulations of Belief Systems", in Computer Models of Thought and Language, W.H. Freeman.

5. Schmidt, C.F. (1975) "Understanding Human Action: Recognizing the Motives", Computers in Biomedicine, Technical Report 45, June 1975, Department of Computer Science, Rutgers University. Also to appear in Cognition and Social Behavior, J.S. Carroll and J. Payne (Eds.), New York: Lawrence Erlbaum Associates, in press.

6. Sridharan, N.S. (1975) "The Architecture of the BELIEVER System, Part I: General Description and Illustration of the Inference Mechanism", Computers in Biomedicine, Technical Report RUCBM-TR-46, June 1975, Department of Computer Science, Rutgers University.

7a. Weiss, S.M. (1974) "A System for Model-Based Computer-Aided Diagnosis and Therapy", Computers in Biomedicine, RUCBM-TR-27-Thesis, Department of Computer Science, Rutgers University, June 1974.

7b. Kulikowski, C., Safir, A., and Weiss, S.M. (1973) "A Representation of Medical Knowledge for Problem Solving: Application to a Model of Glaucoma", Computers in Biomedicine, RUCBM-TR-21, Department of Computer Science, July 1973.

8. Pople, H.E. (1975) "Artificial Intelligence Approaches to Computer-Based Medical Consultation", 1975 IEEE Intercon Conference Record, Session 31, April 1975.
(1975) "Artificial Intelligence Approaches to Computer-Based Medical Consultation", 1975 IEEE Intercon Conference Record, Session 31, April 1975. Shortliffe, E. (1974) "MYCIN: A Rule-Based Computer Program for Advising Physicians Regarding Antimicrobial Therapy Selection", Computer Science Department, Stanford University, STAN-CS-74-465, October 1974, Shortliffe, E.H., Davis, R., Axline, S., Buchanan, B., and Cohen, S. (1975) "Computer-Based Consultations in Clinical Therapeutics: Explanation and Rule Acquisition Capabilities of the MYCIN System", Computers and Biomedical Research, August 1975. ll. 12. Miller, P.B., (1975) "Strategy Selection in Medical Report MAC-TR-153 (M.S.Thesis from MIT), September 1975. Pauker, S., (1975) “Towards the Simulation of Clinical submitted to American Journal of Medicine, June 1975. Diagnosis", Cognition", VII. AIM Organization AIM EXECUTIVE COMMITTEE 1975: LEDERBERG, Dr. Joshua Principal Investigator of the SUMEX/AIM Project Department of Genetics, $331 Stanford University Medical Center Stanford, California 94305 AMAREL, Dr. Saul Principal Investigator of the Rutgers Research Resource Department of Computer Science Rutgers University New Brunswick, New Jersey 08903 BREWER, Dr. Carl R. Biotechnology Resources Branch National Institutes of Health Building 31, Room 5B25 9000 Rockville Pike Bethesda, Maryland 20014 LINDBERG, Dr. Donald 605 Lewis Hall University of Missouri Columbia, Missouri 65201 AIM ADVISORY GROUP 1975: LINDBERG, Dr. Donald [Chairman] 605 Lewis Hall University of Missouri Columbia, Missouri 65201 AMAREL, Dr. Saul Department of Computer Science Rutgers University New Brunswick, New Jersey 08903 BREWER; Dr. Carl R. [Executive Secretary] Biotechnology Resources Branch National Institutes of Health Building 31, Room 5B25 9000 Rockville Pike Bethesda, Maryland 20014 BOBROW, Dr. Daniel G. Xerox Palo Alto Research Center 3333 Coyote Hill Road Palo Alto, California 94304 FEIGENBAUM, Dr. Edward Serra House Department of Computer Science Stanford University Stanford, California 94305 FELDMAN, Dr. Jerome Department of Computer Science University of Rochester Rochester, New York LEDERBERG, Dr. Joshua [Ex-officio] Department of Genetics, S331 Stanford University Medical School Stanford, California 94305 MILLER, Dr. George The Rockefeller University 1230 York Avenue New York, New York 10021 REDDY, DR. D.R. Department of Computer Science Carnegie-Mellon University Pittsburgh, Pennsylvania SAFIR, Dr. Aran Department of Ophthalmology Mount Sinai School of Medicine City University of New York Fifth Avenue and 100th Street New York, New York 10029