Form Approved
O.M.B. 68-R0249

SECTION I

DEPARTMENT OF HEALTH, EDUCATION, AND WELFARE
PUBLIC HEALTH SERVICE
GRANT APPLICATION

LEAVE BLANK: TYPE / PROGRAM / NUMBER / REVIEW GROUP / FORMERLY / COUNCIL (Month, Year) / DATE RECEIVED

TO BE COMPLETED BY PRINCIPAL INVESTIGATOR (Items 1 through 7 and 15A)

1. TITLE OF PROPOSAL (Do not exceed 53 typewriter spaces)
   Biomedical Knowledge Engineering and Infectious Diseases

2. PRINCIPAL INVESTIGATOR
   2A. NAME (Last, First, Initial): COHEN, Stanley N.
   2B. TITLE OF POSITION: Professor of Medicine and Head, Division of Clinical Pharmacology
   2C. MAILING ADDRESS (Street, City, State, Zip Code): Clinical Pharmacology, Medical Center, Stanford University, Stanford, California 94305
   2D. DEGREE: Ph.D.
   2E. SOCIAL SECURITY NO.:
   2F. TELEPHONE DATA: Area Code [illegible], 497-5315
   2G. DEPARTMENT, SERVICE, LABORATORY OR EQUIVALENT (See Instructions): Clinical Pharmacology
   2H. MAJOR SUBDIVISION (See Instructions): Department of Humanities and Sciences

3. DATES OF ENTIRE PROPOSED PROJECT PERIOD (This application)
   FROM 4/1978 THROUGH 3/1983

4. TOTAL DIRECT COSTS REQUESTED FOR PERIOD IN ITEM 3: $1426444

5. DIRECT COSTS REQUESTED FOR FIRST 12-MONTH PERIOD: $170425

6. PERFORMANCE SITE(S) (See Instructions)
   Clinical Pharmacology
   Department of Computer Science
   Stanford University
   Stanford, California 94305

7. Research Involving Human Subjects (See Instructions)
   A. NO   B. YES, Approved   C. YES, Pending Review   Date:   [check mark illegible]

8. Inventions (Renewal Applicants Only - See Instructions)
   A. NO   B. YES, Not previously reported   C. YES, Previously reported   [check mark illegible]

TO BE COMPLETED BY RESPONSIBLE ADMINISTRATIVE AUTHORITY (Items 8 through 13 and 15B)

9. APPLICANT ORGANIZATION(S) (See Instructions)
   Stanford University
   Clinical Pharmacology
   Stanford, California 94305

10. NAME, TITLE, AND TELEPHONE NUMBER OF OFFICIAL(S) SIGNING FOR APPLICANT ORGANIZATION(S)
    c/o Sponsored Projects Office
    Telephone Number(s): (415) 497-2883

11. TYPE OF ORGANIZATION (Check applicable item)
    [ ] FEDERAL   [ ] STATE   [ ] LOCAL   [X] OTHER (Specify): Private, non-profit University

12. NAME, TITLE, ADDRESS, AND TELEPHONE NUMBER OF OFFICIAL IN BUSINESS OFFICE WHO SHOULD ALSO BE NOTIFIED IF AN AWARD IS MADE
    K.D. Creighton, Controller
    Stanford University
    Stanford, California 94305
    Telephone Number:

13. IDENTIFY ORGANIZATIONAL COMPONENT TO RECEIVE CREDIT FOR INSTITUTIONAL GRANT PURPOSES (See Instructions)
    20 School of Humanities and Sciences

14. ENTITY NUMBER (Formerly PHS Account Number): 1941156365A1

15. CERTIFICATION AND ACCEPTANCE. We, the undersigned, certify that the statements herein are true and complete to the best of our knowledge and accept, as to any grant awarded, the obligation to comply with Public Health Service terms and conditions in effect at the time of the award.

SIGNATURES (Signatures required on original copy only. Use ink. “Per” signatures not acceptable.)
A. SIGNATURE OF PERSON NAMED IN ITEM 2A          DATE
B. SIGNATURE(S) OF PERSON(S) NAMED IN ITEM 10    DATE

NIH 398 (FORMERLY PHS 398) Rev. 1/73

SECTION I
RESEARCH OBJECTIVES

NAME AND ADDRESS OF APPLICANT ORGANIZATION
Stanford University, Stanford, California 94305

NAME, SOCIAL SECURITY NUMBER, OFFICIAL TITLE, AND DEPARTMENT OF ALL PROFESSIONAL PERSONNEL ENGAGED ON PROJECT, BEGINNING WITH PRINCIPAL INVESTIGATOR
See attached list.

TITLE OF PROJECT
Biomedical Knowledge Engineering and Infectious Diseases

USE THIS SPACE TO ABSTRACT YOUR PROPOSED RESEARCH. OUTLINE OBJECTIVES AND METHODS. UNDERSCORE THE KEY WORDS (NOT TO EXCEED 10) IN YOUR ABSTRACT.

Knowledge about medical disciplines, such as infectious diseases, changes rapidly. To help researchers and practitioners codify, access, and reason with the knowledge of their domain, we propose developing knowledge-based computer programs as “knowledge engineering” aids.
We base the proposed work on the MYCIN program, which we developed over the last three years. MYCIN stores facts and relations about infectious diseases in a set of inference rules, or production rules. It reasons about complex case histories using this knowledge, and it can explain its reasoning. We have also developed a prototype knowledge acquisition system to aid the experts who modify and extend the knowledge base.

    Name         Soc. Sec. No.   Title                Department
 1. Cohen                        Professor            Medicine
 2. Buchanan                     Adjunct Professor    Computer Science
 3. Davis                        Research Associate   Computer Science
 4. Shortliffe                   Professor            Medicine
 5. Axline                       Professor            Medicine
 6. Wraith                       Research Associate   Medicine
 7. Scott                        Research Associate   Computer Science

SECTION II - PRIVILEGED COMMUNICATION

DETAILED BUDGET FOR FIRST 12-MONTH PERIOD        FROM 4/1978 THROUGH 4/1979

DESCRIPTION (Itemize)                                AMOUNT REQUESTED (Omit cents)
                                                     SALARY    FRINGE BENEFITS    TOTAL
PERSONNEL
  See attached list                                  125859.   24667.             150525.
CONSULTANT COSTS
  Clinical Consultant                                                               3000.
EQUIPMENT
  2 Datamedia display terminals                                                     4800.
SUPPLIES
  Office supplies                                                                   4500.
TRAVEL
  DOMESTIC                                                                          2000.
  FOREIGN                                                                           --
PATIENT COSTS (See instructions)                                                    --
ALTERATIONS AND RENOVATIONS                                                         --
OTHER EXPENSES (Itemize)
  Telephone                                                                         2000.
  Maintenance Contracts                                                              600.
  Postage/publications/miscellaneous                                                3000.
TOTAL DIRECT COST (Enter on Page 1, item 5)

EDUCATION (Begin with baccalaureate training and include postdoctoral)

INSTITUTION AND LOCATION       DEGREE   YEAR CONFERRED   SCIENTIFIC FIELD
College of William and Mary    B.S.     1973             Mathematics
Stanford University            M.S.     1974             Computer Science

HONORS: Phi Beta Kappa
MAJOR RESEARCH INTEREST: Artificial Intelligence
ROLE IN PROPOSED PROJECT: Scientific Programmer
RESEARCH SUPPORT (See instructions)
RESEARCH AND/OR PROFESSIONAL EXPERIENCE (Starting with present position, list training and experience relevant to area of project. List all or most representative publications. Do not exceed 3 pages for each individual.)

1974-present   Scientific Programmer, MYCIN project, Division of Clinical Pharmacology, Department of Medicine, Stanford University

Publication: Scott, AC, Clancey, W, Davis, R and Shortliffe, EH: Explanation capabilities of knowledge-based production systems. (submitted to Amer. J. Linguistics)

RESEARCH PLAN

1 Objectives

The overall objective of the proposed research is the development and evaluation of a computer-based system for codifying the judgmental knowledge of experts in order to improve the effectiveness of medical research and clinical decision making. Our work to date has concentrated on codifying knowledge for the diagnosis and selection of therapy for infectious diseases, and has produced a system (called “MYCIN”) capable of offering consultative advice for certain classes of infections. The development of this system over the past few years has provided a “laboratory” for the elucidation of the informal judgmental criteria used by experts in the field.
In codifying that knowledge and testing it on real cases, we have encouraged the formal specification of what was previously informal knowledge, and have provided an arena in which conflicting judgments from different experts can be tested.

We are requesting support for continued research and development in order to demonstrate MYCIN’s effectiveness as a research tool for biomedical scientists working with infectious diseases, and eventually as a general methodology for “knowledge engineering” in related disciplines. Proposed steps toward that end include:

(a) expand the clinical knowledge base of the system to increase the range of clinical cases for which MYCIN can aid physicians and researchers.

(b) systematize and organize knowledge and decision processes on a rigorous basis using MYCIN techniques so that one researcher can build on another’s research results.

(c) improve the system’s interaction with physicians and researchers.

(d) transfer the system to a dedicated mini-computer to improve response time and make it exportable.

(e) evaluate the research and clinical utility of the system, in part by showing that the system offers an effective forum that encourages experts in the field to reach a medical or technical consensus in their view of the domain.

2 Background and Rationale

2.1 The Knowledge Engineering Problem

Computer programs can provide assistance to working scientists in several different ways. For a number of years, computers were used almost exclusively as numeric problem solvers. They acted as mathematical assistants, performing calculations that were complex, tedious or repetitious. They have been used for manipulating symbolic expressions as well. For example, hospitals and businesses have stored massive amounts of symbolic information in computer files, and have developed intricate programs for retrieval and display of the stored information.
It is also possible to extend the metaphor of the problem-solving assistant into the realm of symbolic information, as demonstrated by several artificial intelligence (AI) programs. For example, the DENDRAL programs (1) assist research chemists with both the combinatorial and inferential aspects of chemical reasoning, both of which can be demanding and tedious for human scientists. The MYCIN program is an outgrowth of nearly a decade of work on DENDRAL. We are building on, and improving, many of the ideas from DENDRAL about representing large amounts of domain-specific knowledge for computer-aided problem solving.

The representation, use and acquisition of knowledge for computer programs has been called “knowledge engineering” [D. Michie, On Machine Intelligence, New York: Wiley, 1974]. The MYCIN and DENDRAL programs are important examples of this branch of AI work. One of the central ideas in this work is the belief that high performance in solving problems arises from a large store of task-specific knowledge -- that is, a “knowledge base” containing information specific to the task at hand. We represent that body of knowledge as a collection of decision rules -- in the case at hand, rules about diagnosis and therapy selection in infectious diseases. These conditional sentences are called “production rules”.

The production rule formalism provides an easily understood representation of facts and relations. However, our experience has shown that writing new rules and integrating them into an existing knowledge base is not as simple as we had hoped. Thus we must provide more tools for the experts who write rules, so that they can see the relationships of new rules to old ones and easily determine the consequences of adding new rules to the program.

(1) See, for example, E. Feigenbaum, B. Buchanan, & J. Lederberg, “On Generality and Problem Solving: A Case Study Using the DENDRAL Program”. Machine Intelligence 6 (eds. Meltzer & Michie), 165-90.
Edinburgh: Edinburgh University Press.

We have already developed primitive mechanisms for checking the syntax of new rules and some aspects of their semantics. For example, the rule models developed by Davis [14] give the system the ability to check the similarities of a new rule with other rules of the same type in order to comment on (and ask about) the differences noticed. We will need to build on these ideas, in effect, to make the system smarter about what it notices.

It would be premature to suggest that a computer program could arbitrate the scientific disagreements among experts and reach a consensus smoothly. This is a super-human task. However, we believe that a program that is able to keep track of the different ways experts express their knowledge can be an important aid to those experts in coming to an agreement. For example, the program can select case histories that highlight the consequences of using different facts and relations.

Our past work has emphasized the use of judgmental knowledge in a high performance program that provides inferential assistance to physicians. We now propose to build on that work to provide knowledge engineering assistance to research scientists, with two long-range goals in mind:

(a) using infectious disease as a case study to develop a methodology of knowledge engineering that will be applicable to building high performance systems in a range of disciplines,

(b) developing techniques for using such systems to provide a forum for formal specification of previously informal knowledge, as a means of encouraging consensus among experts in the field.

2.2 The Medical Problem

A number of recent studies indicate a major need to improve the quality of antimicrobial therapy.
Almost one-half of the total spent on drugs in treating hospitalized patients is spent on antibiotics [1,2], and if results of a number of recent studies are to be believed, a significant part of this therapy is associated with serious misuse [2,3,4,5]. Some of the inappropriate therapy involves incorrect selection of a therapeutic regimen [4], while another serious problem is the incorrect decision to administer any antibiotic [2,4,5]. One recent study concluded that one out of every four people in the United States was given penicillin during a recent year, and nearly 90% of these prescriptions were unnecessary [6].

Other studies have shown that physicians will often reach therapeutic decisions that differ significantly from the decisions that would have been suggested by experts in infectious disease therapy practicing at the same institution. Nonexperts sometimes choose a drug regimen designed to cover for all possibilities, prescribing either several drugs or one of the so-called “broad spectrum” antibiotics, even though appropriate use of clinical data might have led to more rational and less toxic therapy.

Within a hospital environment in which professional resources are often overburdened, and in environments where expert sources are not readily available, a computer-based consultant will be highly useful. Such a system will also have broad fringe benefits in its educational impact on staff physicians and in providing a framework for quality control and peer-review evaluations.

Antimicrobial therapy appears to be an especially suitable area for the initial development of a computer-based system to assist physicians with decisions in clinical therapeutics. The components of the decision making process in antimicrobial therapy are more readily definable than in many other areas of medicine, and the consequences of the physician’s decision can usually be assessed in terms of direct therapeutic action.
Nevertheless, the general approach used here is applicable to other areas of clinical decision making.

The basis of rational antimicrobial therapy decisions is identification of the microorganisms causing the infectious disease. Accurate identification is important because of the specificity of antibiotic action: drugs that are highly effective against certain organisms are often useless against others. The patient’s clinical status and history (including information such as prior infections and treatments) provide data that may be valuable to the physician in identifying the disease-causing organisms. However, bacteriological cultures that use specimens taken from the site of the patient’s infection usually provide the most definitive identifying information.

Initial culture reports from a microbiological laboratory may become available within 12 hours from the time a clinical specimen is obtained from the patient. While the information in these early reports often serves to classify the organism in general terms, it does not often permit precise identification. It may be clinically unwise to postpone therapy until such identification can be made with certainty, a process that usually requires 24 to 48 hours, or longer. Thus it is commonly necessary for the physician to estimate the range of possible infecting organisms, and to start appropriate therapy even before the laboratory is able to identify the offending organism and its antibiotic sensitivities. In this setting MYCIN plays two roles:

(a) providing consultative advice that will assist the physician in making the best therapeutic decision that can be made on the basis of available information, and

(b) by its questioning of the physician, pinpointing the items of clinical data that are necessary to increase the validity of the clinical decision.

2.3 Our Work to Date

A comprehensive review of our work appears in Section 3 of this proposal.
Briefly, we have developed a computer program capable of offering consultative advice on the diagnosis and therapy selection for bacteremia and meningitis, two areas central to the management of infectious disease. This work has been guided by three fundamental objectives.

(1) A major objective of the MYCIN system has been to provide a computer-based therapeutic tool designed to be useful in both clinical and research environments. This requires development of a system that has a medically and scientifically sound knowledge base, and that displays a high level of competence in its field. The program must first convince clinicians of the quality of the information it is providing before they will be willing to use it.

(2) We believe it is important for the computer system to have the ability to explain the reasoning behind its decisions. It should be able to do so in terms that suggest to the physician that the program approaches the problem in much the same way that he does. This permits the user to validate the program’s reasoning, and modify (or reject) the advice if he believes that some step in the decision process is not justified. It also gives the program an inherent instructional capability that allows the physician to learn from each consultation session.

(3) A third major objective is to provide the program with capabilities that enable augmentation or modification of the knowledge base by experts in infectious disease therapy, in order to codify knowledge in the domain, as well as to improve the validity of future consultations. The system therefore requires some capability for acquiring knowledge by interacting with experts in the field, and for incorporating this knowledge into its knowledge base.

Three separate parts of the MYCIN system accomplish these objectives. The consultation system uses the knowledge base, along with patient-related data entered by the physician, to generate therapeutic advice.
The explanation system has the ability to explain the reasoning used during the consultation, and to document the motivation for questions asked or the rationale for conclusions reached. Finally, the knowledge acquisition system enables experts in antimicrobial therapy to update MYCIN’s knowledge base, without requiring that they know how to program a computer.

A principal feature of MYCIN central to these objectives is the format in which its knowledge is encoded. Knowledge used by MYCIN is contained in diagnostic and therapeutic decision rules formulated during extensive discussions of clinical case histories. The MYCIN knowledge base currently consists of approximately 400 such rules. Each rule consists of a set of preconditions (called the “premise”) which, if true, justify the conclusion made in the “action” part of the rule (an example is shown below).

    If   1) the gram stain of the organism is gram negative, and
         2) the morphology of the organism is rod, and
         3) the aerobicity of the organism is anaerobic,
    then there is suggestive evidence (.6) that the identity of the organism is Bacteroides.

Many of the system’s unique and important capabilities are made possible by encoding knowledge in rules like the one above. Such rules form modular “chunks” of knowledge about the domain, represented in a form that is comprehensible to clinicians and researchers.

The consultation system uses its collection of rules to make conclusions about the patient. If, for instance, it is attempting to determine the identity of an organism responsible for a particular infection, it retrieves the entire list of rules which, like the one above, conclude about identity. It then attempts to ascertain whether the conclusion of the first rule is valid, by evaluating in turn each of the clauses of the premise. Thus, for the rule above, the first thing to find out is the gram stain.
If this information is already available in the data base, the program retrieves it. If not, determination of gram stain becomes a new objective, and the program retrieves all rules which conclude about it, and tries to use each of them to obtain the value of gram stain. If, after trying all the relevant rules, the answer still has not been discovered, the program asks the user for the relevant clinical information which will permit it to establish the validity of the premise clause. Thus, the rules “unwind” to produce a succession of goals, and it is the attempt to achieve each goal that drives the consultation.

The use of a rule-based representation of knowledge makes it possible for the system to explain the basis for its recommendations. For example, if asked “How did you determine the identity of the organism?” the program answers by displaying the rules which were actually used, and explaining, if requested, how each of the premises of the rules was established. This is something which people readily understand, and it provides a far more comprehensible and acceptable explanation than would be possible if the program were to use a simple statistical approach to diagnosis.

As work proceeds to expand the program’s knowledge base, new “chunks” are added in much the same way that a clinician in training learns new pieces of knowledge about his field. This rule-based representation of knowledge means that the expert himself can offer new “chunks” of knowledge by expressing them in the same rule-based format. He can thus help make the program more competent, without having to know anything about computer programming.
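The goal-directed use of rules and the “How did you determine ...?” explanation described above can be illustrated with a minimal Python sketch. It is not MYCIN's implementation: the class and method names (`Rule`, `Consultation`, `find_out`, `how`) and the rule name `RULE-A` are hypothetical, the sketch stops at the first rule whose premise holds, and the certainty value (.6) is stored but not combined with other evidence.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    name: str
    premise: tuple      # clauses (parameter, required value); all must hold
    conclusion: tuple   # (parameter, concluded value)
    cf: float           # certainty attached to the conclusion; stored only


# The Bacteroides rule quoted in the text; the rule name is hypothetical.
RULES = [
    Rule(
        name="RULE-A",
        premise=(
            ("gram stain", "gram negative"),
            ("morphology", "rod"),
            ("aerobicity", "anaerobic"),
        ),
        conclusion=("identity", "Bacteroides"),
        cf=0.6,
    ),
]


class Consultation:
    def __init__(self, rules, ask):
        self.rules = rules
        self.ask = ask      # callable(parameter) -> value supplied by the user
        self.known = {}     # parameter -> established value (the "data base")
        self.source = {}    # parameter -> Rule that concluded it, or None if asked

    def find_out(self, parameter):
        """Establish a parameter: data base first, then rules, then the user."""
        if parameter in self.known:
            return self.known[parameter]
        for rule in self.rules:
            if rule.conclusion[0] != parameter:
                continue
            # Each premise clause becomes a new goal, evaluated in turn.
            if all(self.find_out(p) == v for p, v in rule.premise):
                self.known[parameter] = rule.conclusion[1]
                self.source[parameter] = rule
                return self.known[parameter]
        # No relevant rule settled the question, so ask the user.
        self.known[parameter] = self.ask(parameter)
        self.source[parameter] = None
        return self.known[parameter]

    def how(self, parameter):
        """Answer 'How did you determine <parameter>?' as a list of lines."""
        if parameter not in self.known:
            return [f"{parameter} has not been determined"]
        value = self.known[parameter]
        rule = self.source[parameter]
        if rule is None:
            return [f"{parameter} = {value} (supplied by the user)"]
        lines = [f"{parameter} = {value} (concluded by {rule.name}, certainty {rule.cf})"]
        for p, _ in rule.premise:
            lines.extend("  " + line for line in self.how(p))
        return lines


if __name__ == "__main__":
    answers = {"gram stain": "gram negative", "morphology": "rod", "aerobicity": "anaerobic"}
    session = Consultation(RULES, lambda p: answers.get(p, "unknown"))
    print(session.find_out("identity"))
    print("\n".join(session.how("identity")))
```

Because each rule is a separate entry in `RULES`, a new rule can be appended without touching the others, which is the modularity the text attributes to the rule-based format.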
In addition, since the rules are largely independent of one another, and are used by the program as necessary in order to deal with the particular consultation underway, the addition of a new rule or modification of an existing rule requires little alteration of other items in the knowledge base, unlike systems using the decision-tree methodology. Other benefits gained from this approach have been explained in more detail in the references.

2.4 Other Approaches

There are three other approaches to the problem of encoding medical decision making knowledge that have received extensive attention in the literature:

(i) Decision trees - as in [7], in which a sequence of decisions is structured in the form of a tree. Each node represents a particular question, and the answer determines which branch of the tree to follow to get to the next question. Final results are obtained by descending all the way to a leaf of the tree.

(ii) Bayesian techniques - as in [8], in which extensive frequency data make it possible to use Bayes’ theorem as a basis for diagnosis.

(iii) Decision analysis and utility theory - as in [9], in which there is associated with each piece of information a likely cost of obtaining it, and a measure of the benefit to be derived from having it. Information is requested until the projected cost of asking another question (perhaps requiring another lab test or operative procedure) outweighs the benefit (in terms of a more precise diagnosis) to be obtained.

Each of these has a number of attractive aspects, but also encounters some limitations which provided the motivation for our investigation of a rule-based system. Decision trees, for example, offer simple, readily understandable procedures for diagnosing specific ailments. Problems occur, however, if they encounter unexpected data or if test results are unavailable.
The representation of knowledge they offer can be somewhat inflexible, as well, since the attempt to make changes deep down in the tree often requires consideration of all previous decisions made further up the tree.

The Bayesian technique offers an appealing generality and precision, since it is a domain-independent technique based on exact principles. Limitations here arise from the need for extensive amounts of frequency data concerning a priori and conditional probabilities. Where these data exist, the technique can be used quite effectively, but such figures may not often be available [10].

Techniques based on utility theory can present a well-motivated sequence of questions that appears to “zero in” on the underlying ailment. Like the Bayesian approach, however, it requires extensive data on conditional probabilities of symptoms and disease.

Since none of these is intended to be a model of the reasoning process typically employed by clinicians, it can at times prove difficult for a clinician to discover the basis for the conclusions drawn by any of them. While they each present a compact encoding of knowledge that can provide an appealing efficiency to programs based on them, there is an unavoidable loss of comprehensibility to the physician using them. Reasoning which requires several distinct inferential steps by a clinician, for instance, might be expressed in a single value of a conditional probability in the Bayesian method.

One additional technique has received some attention lately, as other researchers (e.g., [11] and [12]) have developed sophisticated models of physiological processes. Where the system involved is sufficiently well-understood and isolatable (e.g., the glaucoma model in [12]), this can be a powerful approach.
But this is not often true. Infectious disease diagnosis and therapy selection (like many other areas) involve a broad range of processes, many of which are only very imperfectly understood.

Finally, we place great emphasis on the flexibility of the knowledge base. A substantial amount of knowledge is required to support a high level of performance, and this means that modification and augmentation of the knowledge base will continue for an extended period. Each modification must therefore be a reasonable task, or the program will soon begin to stagnate.

A flexible knowledge base also means that the system is inherently dynamic in character. It is easily modified to take into account regional variations in practice, new results which arise from progress in medical research, and changes in drug resistance patterns.

Our experience to date suggests that our current approach of codifying individual decision rules offers a large number of advantages, including flexibility and ready comprehensibility. It can provide the basis for a formalism capable of functioning in domains where little statistical data is available, or where information is uncertain or incomplete, and can thus offer a useful extension to existing techniques.

3 Previous Work Done

See Appendix A for a review of work performed under BHSRE funding (grant no. HS-01544).

4 Specific Goals

The primary goals of our proposed work over the next five years are

(i) to increase MYCIN’s abilities to integrate large collections of facts and relations. The content of this knowledge base will be specific to infectious diseases. However, we view this as a case study of the larger problem of developing a methodology of knowledge engineering applicable to a range of disciplines;

(ii) to develop techniques for using this methodology to provide a forum for formal specification of previously informal knowledge, as a means of encouraging consensus among experts in the field.
In keeping with these goals, the five main foci of attention for our work will be to:

(a) increase the system’s competence, i.e., both the breadth and depth of the knowledge base

(b) provide knowledge engineering support tools to help experts codify and test their inferential rules about the domain

(c) provide a number of human engineering features to ensure that the program is faster, easier, and, in general, more attractive to users

(d) transfer the system to a small, dedicated mini-computer to improve response time and enable exportability

(e) establish an on-going evaluation program to monitor the growth and convergence of the knowledge base, with the assistance of collaborating clinicians on the wards.

4.1 Competence

4.1.1 Breadth

The work to be done in the future development of MYCIN is illustrative of the expected needs of knowledge engineering programs in general. Our previous work has resulted in a program that is currently capable of dealing with bacteremia and meningitis, but for several reasons this is too narrow a range if it is to be useful in a research or clinical setting. One problem, for example, is that the physician must decide whether the patient is suffering from either of these before he can determine if MYCIN would be an appropriate source of advice. But a significant part of the diagnostic task is this determination of infection etiology. Requiring the physician to make this decision thus presents a significant barrier to use of the program.

A second problem arises from the interactions of multiple infections. Cases are often complicated by the presence of more than one infection, and it is not in general possible to consider each infection independently. To select precise therapy, MYCIN must be able to sort out the various sources of infection, and determine their influence on one another. In complex situations such as these, printed textbooks usually fail to cover all combinations of dependencies.
Thus a program that reasons about these situations can provide intelligent assistance to the researcher or clinician who wants to have expert-level advice.

Finally, our experience with new users of the system suggests that they can at times overlook explicit instructions concerning the program’s capabilities, and present it with medical problems outside of its competence. It will prove very important for the eventual unsupervised use of the program, then, that MYCIN be able to recognize the limits of its capabilities, and respond appropriately. That is, like the human consultant, the program must be able to say “I don’t know”.

In response to these problems, we intend to work on three specific issues. First, we will extend the system’s range of competence to cover both urinary tract and pulmonary infections. Based on analyses of infection frequencies seen at our medical center, the inclusion of urinary tract and pulmonary infections should permit MYCIN to handle 76% of hospital-acquired bacterial infections and 64% of all bacterial infections. Strategies similar to those employed in our approach to bacteremia will be used to expand the system to include these important areas of infectious disease. In addition, it will be necessary to develop the ability to identify the underlying foci of infection, so that the program can bring to bear the appropriate subset of its knowledge of the field.

Second, we will extend the system’s knowledge base to cover the prophylactic use of antimicrobial agents. This will be an especially useful area, since prophylaxis (defined as the use of antimicrobial agents before disease due to an infectious agent is present or before infection or colonization with an organism has occurred) represents one of the largest categories of use and abuse of antimicrobials.
There are circumstances, such as prevention of endocarditis in patients with underlying heart disease, in which such treatment can be justified. In most cases, however, financial costs and potential drug toxicity exceed the marginal benefits to be achieved, and prophylactic therapy is thus unwise. Kunin [13] reported that 58% of surgical patients in a major university medical center received prophylaxis, but such therapy was judged appropriate in only 38% of these cases. Thus, prophylactic use of antimicrobial agents represents a substantial fraction of antimicrobial misuse, and the inclusion in MYCIN of knowledge about this area would greatly enhance its clinical utility.

The final issue is the further development of the system’s ability to recognize and convey its limitations. The current system has something of this already, and can recognize (in cases of bacteremia and meningitis) those situations when there is too little clinical or lab data available to draw any substantive conclusions about therapy. This will have to be extended to enable the system to recognize the situation in which the problem is not insufficient data, but insufficient knowledge about a medical problem outside of its domain of competence. Such a capability will increase physician confidence in the system, as well, since the physician will know that the system is capable of indicating its inability to advise.

Once MYCIN has this broader range of medical knowledge, along with the ability to select the applicable part of its knowledge base and the ability to recognize its own limitations, the system can be used with confidence. It can integrate a large amount of judgmental knowledge from experts and advise other researchers and clinicians about specific problems on the basis of that knowledge.
This offers a much greater assurance that MYCIN will be playing an effective role in health care research.

4.1.2 Depth

Experience with new users has also suggested that some of the questions asked by the system during the course of the consultation require too much judgment and sophistication on the part of the user. One question, for instance, inquires whether the patient “is febrile due to the infection”. Since determining the source of a fever can be a difficult and subtle problem, this question presumes a great deal of the user. In addition, it was the shortage of exactly this sort of expertise among non-experts that motivated the choice of infectious disease as a domain and the design of MYCIN as a clinical consultant. If the program is to be useful, it should focus on objective data, and be able to rely on its set of rules to supply the judgmental knowledge necessary to make the difficult, subjective judgments.

In practice, this means that concepts like “febrile due to the infection” must be further decomposed to discover the grounds on which such decisions are made, and new rules written to embody those decisions. Each of those rules should be examined in turn, to insure that they do not require unreasonable levels of expertise from the user. In this fashion, the basis on which the program makes its conclusions (and hence the questions which it asks) will move away from ‘softer’, subjective information, and toward more easily quantified objective data.

The point here is not to reduce the physician’s role to that of simply entering data, since some of these subjective judgments are best performed by the physician.
We intend rather simply to increase the system’s judgmental capacity and level of sophistication, as our fundamental aim is to create an effective symbiosis between physician and computer, making the best use of the talents of both. (This movement toward objective bases for decisions would also provide an effective solution to the problem of variations between users of the program. Especially where questions of judgment are concerned, there can be some variation in the answers to MYCIN’s questions given by two clinicians running a consultation about the same patient. We expect that if the program were to request less subjective data, this variation would be much reduced.)

4.1.3 Disease Models

One important capability of a human consultant is the ability to detect and take appropriate action in response to inconsistent information. This appears to be based on a knowledge of what constitutes a ‘normal’ constellation of symptoms for a particular pathology, i.e., a model of the disease. For example, consider the case of a 24-year-old military recruit presenting with meningitis. History taken from the patient reveals that he has been recently exposed to other recruits with meningococcal disease, while physical examination shows areas of purpura over his entire body. However, gram stain of the CSF is interpreted as showing gram negative rods. An infectious disease expert would have the gram stain of the CSF of this particular patient reexamined to ensure that there had been no misinterpretation.

MYCIN currently has a very simple version of consistency checking, in that each individual answer given during a consultation is checked for validity. For instance, the system will challenge a response indicating an age of more than 100 years, or a white blood cell count of more than 30,000. But each of these is an independent test based on the entire possible range of each piece of data.
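The independent, per-answer validity check described above can be illustrated with a minimal Python sketch. It is not MYCIN's implementation: the field names and the lower bounds of zero are assumptions made for the example, and only the two upper limits (100 years, 30,000 white cells) are taken from the text.

```python
# Plausible range for each answer. Only the two upper bounds come from the
# text; the field names and the lower bounds of zero are assumptions.
PLAUSIBLE_RANGE = {
    "age_years": (0, 100),
    "white_cell_count": (0, 30_000),
}


def should_challenge(field, value):
    """Return True when an answer lies outside its plausible range."""
    low, high = PLAUSIBLE_RANGE[field]
    return not (low <= value <= high)


# Each answer is tested on its own, with no reference to the rest of the case.
answers = {"age_years": 24, "white_cell_count": 45_000}
flagged = [name for name, value in answers.items() if should_challenge(name, value)]
print(flagged)
```

Because every test looks at a single answer in isolation, a check of this kind has no way to notice that one finding is implausible given the others.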
The program should have the same sort of disease models that human consultants seem to employ, to allow it to test the plausible validity of each piece of information in the context of the likely pathology. This would add an important capability to the program, if it were faced with a situation in which some particular piece of information seemed to be at variance with the current hypothesis about disease etiology. It could suggest to the clinician the possibility of a technical or clerical error in the lab report, and indicate that the test should be re-run. Where this was impractical due to considerations of time or expense, the inconsistent datum could justifiably be ignored in the remainder of the consultation. This ability to judge the likely validity of information within the context of the clinical situation is an important part of human performance on the task, and will make a significant contribution to MYCIN’s competence.

4.1.4 Sensitivity Analysis

Extensive testing of the program on real cases has suggested two other types of reasoning ability that will markedly enhance the program’s performance as an intelligent assistant. We noted above the program’s ability to recognize the situation in which it has insufficient data to make a recommendation. In a similar situation, a human consultant does not simply indicate the lack of data, but goes on to suggest additional tests to run, and indicates exactly which pieces of information will be required before a conclusion can be reached. This is the first of the additional forms of reasoning the program should have -- it should be able to indicate the source of its inability to reach a conclusion, and determine what information is necessary before it can proceed. There is a large body of work in the field of decision analysis (see, e.g., [15]) that will provide a useful foundation for this.
Second, a human consultant may at times offer a recommendation with the warning that the evidence was contradictory, and even a slight change in the data might make a large difference in the final result. That is, he can indicate how sensitive his final answer is to small changes in the information on which it is based.

The fundamental mechanism on which MYCIN is based is particularly well suited to implementing both of these abilities. Since the system performs a step-by-step analysis of the case, with each decision expressed by one of the rules in the knowledge base (rather than a one-step probabilistic computation, for instance), MYCIN is capable of reviewing its own reasoning process, re-examining it, and making further decisions about it. Thus, if unable to reach a conclusion, it might re-examine the reasoning used to see what missing information prevented it from reaching an answer. Similarly, it might routinely re-examine its results at the conclusion of a consultation, to determine if any are sensitive to slight changes in the information about the case. If so, it might offer the physician a very specific warning, indicating exactly what changes should be made to its current recommendation in response to specific changes in information about the patient. In the example given above, for instance, the system might indicate that the meningitis may, indeed, be of gram negative etiology, but that the validity of this diagnosis is based solely on the results of the gram stain of the CSF. The system would also note that an abundance of clinical data suggests the diagnosis may be meningococcal meningitis and that antibiotic coverage for neisseria-meningitidis should also be considered.
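The re-examination step described above can be sketched in Python as a one-at-a-time perturbation of the case data. This is an illustration under stated assumptions, not the proposed mechanism: `conclude` is a toy stand-in for a rule-based consultation, and the finding names and alternative values are hypothetical and carry no clinical authority.

```python
def conclude(findings):
    """Toy stand-in for a rule-based consultation; not MYCIN's actual rules."""
    if findings["csf_gram_stain"] == "gram-negative rods":
        return "gram-negative etiology"
    if findings["exposed_to_meningococcal_disease"] and findings["purpura"]:
        return "meningococcal meningitis"
    return "no conclusion"


def sensitive_findings(findings, alternatives):
    """List (finding, new value, new conclusion) for every single change
    that would alter the conclusion reached on the original data."""
    baseline = conclude(findings)
    report = []
    for name, values in alternatives.items():
        for value in values:
            if value == findings[name]:
                continue  # not a change
            outcome = conclude({**findings, name: value})
            if outcome != baseline:
                report.append((name, value, outcome))
    return report


case = {
    "csf_gram_stain": "gram-negative rods",
    "exposed_to_meningococcal_disease": True,
    "purpura": True,
}
alternatives = {
    "csf_gram_stain": ["gram-negative rods", "not gram-negative rods"],
    "exposed_to_meningococcal_disease": [True, False],
    "purpura": [True, False],
}
print(sensitive_findings(case, alternatives))
```

In this toy case only the gram stain appears in the report, which mirrors the warning in the text that the conclusion rests solely on that one result.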
4.2 Knowledge Engineering Support Tools

As is clear from the preceding discussion, much of the knowledge engineering work of increasing the system’s competence involves ongoing development of the knowledge base, and requires constant re-testing and evaluation on real cases. We intend to develop several types of support facilities designed to speed this task.

4.2.1 Patient Library

An on-line patient library, for instance, will provide many useful features. It can offer a standard set of cases against which the knowledge base can be tested periodically, to insure that modifications and extensions to improve performance in one area do not inadvertently degrade performance in other areas. It can also offer a ready source of examples on which newly added rules can be tested.

The first step will be to provide efficient cataloging and access facilities, so that library contents are easily surveyed and retrieved. More sophisticated features would include automatic case selection. Since most changes to the knowledge base will have no effect on the majority of cases in the library, appropriate selection of test cases gains importance as the library size increases. With an automatic selection ability, the program would choose a range of relevant cases on which to test the modification, basing its choice on the nature of the particular modification made.

4.2.2 Knowledge Acquisition

A second important tool is the further development of the existing knowledge acquisition capability. The primary aim here is to provide a mechanism to allow the infectious disease expert to ‘educate’ the program directly, and to build a large collection of rules without undue effort. Currently, most changes to the knowledge base are suggested by our clinical experts and effected by the programming staff. There is thus often a delay of a few days between the discovery of a problem and its repair.
By bridging the gap between the clinical expert (who communicates his ideas in English) and the system (which ‘understands’ only programming languages), it becomes possible for the expert to make changes in the knowledge base by himself. He can thus make and test his changes in a few minutes, and see immediately if they improve performance.

A system designed to do this has been constructed, and has demonstrated the utility of acquiring new knowledge directly from the expert, in the context of an existing shortcoming in the knowledge base [14]. But further development of these features is necessary. For instance, we intend to improve the existing record keeping facilities, to include extensive background information about all rules, giving such things as the name of the expert who wrote the rule, the motivation for adding it to the system, references to published literature which corroborate the conclusions it draws, and a history of modifications made to it. For the expert extending the knowledge base, this provides a “scratch pad” of sorts, making the ongoing task of knowledge base development a good deal easier. For the clinician using the system, it means increased confidence in the advice offered, since not only can that advice be explained, but there will be literature references available for each step in that explanation.

A second improvement would be a more powerful ‘rule editor’, that would make it easy for the clinical expert to make any of a number of common changes to rules in the knowledge base. He could then make small changes without going through the more extensive routines necessary to re-write the rule.

4.2.3 Testing the Effect of Adding a New Rule

As the knowledge base grows significantly larger, we will encounter new problems of modifying and using it. Testing the effect of a new rule, for instance, is currently done with empirical techniques, as indicated above, by running the new system on a large number of cases.
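The empirical approach, together with the automatic case selection proposed for the patient library, can be sketched in Python as follows. This is a minimal illustration under stated assumptions: `LibraryCase`, `select_cases`, `failing_cases`, and the toy consultation function are hypothetical names, and selecting cases by the parameters a rule mentions is only one possible reading of "basing its choice on the nature of the particular modification made".

```python
from dataclasses import dataclass


@dataclass
class LibraryCase:
    name: str
    findings: dict
    expected: str  # the recommendation accepted as correct for this case


def select_cases(library, parameters_in_rule):
    """Keep cases that record at least one parameter the modified rule mentions."""
    return [case for case in library
            if any(p in case.findings for p in parameters_in_rule)]


def failing_cases(consult, cases):
    """Run the modified system on each case and name those whose answer changed."""
    return [case.name for case in cases if consult(case.findings) != case.expected]


def toy_consult(findings):
    # Placeholder for the modified system; not a real clinical rule.
    return "therapy-A" if findings.get("site") == "blood" else "therapy-B"


library = [
    LibraryCase("case-1", {"site": "blood"}, "therapy-A"),
    LibraryCase("case-2", {"site": "urine"}, "therapy-A"),
    LibraryCase("case-3", {"age_years": 24}, "therapy-B"),
]
relevant = select_cases(library, {"site"})
print(failing_cases(toy_consult, relevant))
```

Selection skips the case that the modified rule cannot affect, so only two of the three cases are re-run here. The cost of this approach still grows with the number of relevant cases.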
However, it may eventually become impractical to do this as the knowledge base gets very large, since too many cases may have to be tried. Hence the empirical techniques should be supplemented with analytical techniques, in which the system examines its own knowledge base to determine what effect adding the new rule may have. This is, once again, made feasible by the particular rule-based representation of knowledge that we use.

4.2.4 Strategies

Problems in the use of a very large knowledge base may arise because, currently, the system tests every rule for relevance to the patient being discussed. This may eventually become impractical as the knowledge base gets quite large. It may then prove necessary to add to the system a number of strategies which allow it to apply its rules more selectively. We have developed a mechanism for the expression and use of these strategies, and plan to begin assembling and testing a number of them to improve MYCIN’s performance.

4.3 Human Engineering and Clinical Capabilities

The common reluctance of researchers and physicians to accept computers as intelligent assistants presents a challenging design problem. It means that a high level of performance alone is insufficient to assure that a program will have an impact on health care research and practice. We must present the physician with a program that is similar in some respects to the source of advice he is used to, the human consultant. It was this that motivated the explanation facilities in MYCIN, since we recognized early in the program’s development that physicians were unlikely to accept dogmatic advice from a program without further explanation of its basis. We will continue developments of this sort, to insure that the system is not only a competent consultant, but one that is ‘friendly’ and easy to use.
4.3.1 Dose Modification in Renal Failure

As one example of a new development to increase the system’s utility, we will be developing new uses for the routines which modify drug dose in renal failure. They are currently invoked when the therapeutic regimen is printed, near the end of the consultation. But the problem of dose modification in renal failure is a common one, and the computation required is a complex operation. Thus a physician may be reluctant to undertake the necessary computation for a regimen he may have selected on his own. In response, we intend to make the dose modification routines available as a separate option in MYCIN. A physician would be able to request a “mini-consultation” concerned solely with renal failure and dose modification.

This is one example of a more general idea: providing a number of small, but highly useful auxiliary routines that can assist the clinician with many of the necessary tasks he must perform in administering antimicrobial therapy. We believe that the physician’s bias against computers may not be so strong where straightforward mathematical computations are concerned. These simple-to-use utility routines can provide the initial inducement to the physician to use the computer, and may eventually encourage him to view our entire system as a useful tool in patient care and disease management.

4.3.2 Improvements in Data Collection

Our experience with new users of the system also indicates that physicians tend to become impatient with the system’s current approach to data collection. They are used to offering the consultant a brief summary of the case that compacts a great deal of important information into a few sentences. Since the problem of having a computer understand ordinary English is well known to be very difficult, we have instead settled for having the program request each piece of information individually.
We will be making several changes in this process to speed it up. The Progress Report section above mentioned a revision to the organization of the consultation that offers several advantages, including faster, more uniform data entry. This process will be simplified still further, by tabularizing it. That is, instead of answering each of a number of individual questions, the physician will be presented with a table he can complete, one that will have room for the necessary information. This should make the process even easier. The remainder of the consultation will then consist of a relatively few questions that are specific to the case under consideration.

4.3.3 Facilitating Communication

These innovations will speed the process considerably, but the problem of typing ability remains a barrier to convenient use of the system. In response, we have begun to explore new forms of data entry. One possibility is the use of a customized keyboard that would make it possible to enter an answer like pseudomonas-aeruginosa with a single keystroke. Another is the use of a ‘response completion’ feature. Using this, the physician need only type enough of his response to make it unambiguous, and can then indicate that the system should finish it. With this feature, he may only have to type pseu, and can leave the remainder to the system. There is also the possibility of using a more sophisticated type of terminal, perhaps one equipped with a “light pen”, a pointer-like device that allows the user to point to items displayed on the terminal screen. Any or all of these facilities will insure that unfamiliarity with a computer terminal, or lack of typing ability, will not present a problem for persons who use the system.

Techniques like these help to facilitate the communication from the physician to the system. There is an analogous problem of ease and clarity of communication in the other direction, from the computer to the physician.
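On the physician-to-system side, the ‘response completion’ feature mentioned above can be sketched in Python as a unique-prefix lookup. This is only an illustration: `complete_response` and the vocabulary list are hypothetical, and the actual feature may resolve ambiguity differently (for example, by prompting the user).

```python
# Hypothetical vocabulary of expected answers. The first and last entries
# appear in the text; the middle one is added so that "p" is ambiguous.
ORGANISMS = ["pseudomonas-aeruginosa", "proteus-mirabilis", "neisseria-meningitidis"]


def complete_response(typed, vocabulary):
    """Return the single vocabulary entry that starts with `typed`,
    or None when the prefix is ambiguous or matches nothing."""
    prefix = typed.strip().lower()
    matches = [entry for entry in vocabulary if entry.startswith(prefix)]
    return matches[0] if len(matches) == 1 else None


print(complete_response("pseu", ORGANISMS))
print(complete_response("p", ORGANISMS))
```

Returning nothing for an ambiguous prefix keeps the system from guessing; the user would simply type a few more characters.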
We have found that some of the explanations the system offers to validate its conclusions extend to several lines of text. These are occasionally long and verbose enough that reading them can interrupt the flow of the consultation. We will explore the possibility of replacing these text-based responses with answers oriented around graphics capabilities. Given the natural interpretation of the use of rules during a consultation as the exploration of a reasoning tree, we believe this can provide an especially effective means of communication. An answer that requires several lines of text at present could easily be expressed with a simple diagram that made quite clear the system’s motivation for asking a particular question, or the foundation for any particular piece of advice it offered. This would be a significant improvement in the clarity of communication between the system and expert.

In the past we have purposely avoided the use of any specially equipped computer terminals (e.g., those with light pens or graphics capabilities), in order to insure that the final version of MYCIN was easily exportable to a wide range of physician communities. With the progress in technology, however, it has become clear that many advanced features are becoming routinely available on inexpensive terminals that can be used over standard phone lines. We can take advantage of these new developments to make major improvements in the speed and ease of communication with the program, without requiring that each user make any large investment in specialized equipment.

4.4 Exportability of the System

We see, within the five-year time scale of this proposal, a change in the character of our work, resulting from the growing importance of “hands-on” involvement by clinical experts.
In the initial phases we have concentrated on building the basic methodology -- the production rule encoding of decision criteria, along with techniques involved in using them (the consultation, explanation, and knowledge acquisition systems). We are still involved in this phase, and as noted above, will continue to develop these ideas and programs. There is of necessity, therefore, a very close connection between our work and that of the experts who are codifying their knowledge of the field. But our framework has begun to converge on a solid foundation, as changes to the basic methodology have become far fewer and further between. By year 04 of this proposal we foresee having a solid enough foundation that it can be adopted by outside experts as a basis for codifying their knowledge of the field, largely independent of our own continued development work.

This added dimension of the research -- clinicians and researchers working directly with our system to develop agreed-upon collections of judgmental decision criteria -- will put a significant strain on the available resources. We are presently running on the computer at the SUMEX-AIM research facility supported by the Biotechnology Resource Program (under Grant RR-00785), and although the system is loaded close to capacity, we find it an effective facility for our program development research. The clinical experts, however, will need to use the results of our work (i.e., the programs we develop) as the medium for their own research. If their work is to be effective, and their continued involvement assured, they must be given high performance tools that provide a speedy response. We do not believe that SUMEX can now provide that additional level of research support, nor do we believe it is within the scope of the SUMEX-AIM charter to support widespread use of programs in a service mode.
In addition, the long-term impact of our research is not likely to be very widespread if our system is available only on a very large and expensive computer. Given the potentially wide range of applicability of the proposed work, we believe it important in the long run to provide a relatively inexpensive, exportable system.

Finally, we are currently relying on the SUMEX facility for both aspects of our work (methodology development by the computer scientists and knowledge base construction by the clinicians). Even now the latter computational load is large enough that moving it to a separate system would be an important contribution to reducing the burden on SUMEX.

As a result of these problems, we recognize the need for some additional means of exporting the system to the community for whom it is intended. There are two alternatives we are exploring in conjunction with the SUMEX facility staff: machine independent implementation of the programs and moving the programs to a satellite computer with many of the capabilities of a PDP-10.

The MAINSAIL language is currently under development at the SUMEX facility as a machine independent programming language which will make possible wide dissemination of programs. Programs coded in this language will require little conversion effort to run on other computers. As a practical matter, however, this approach seems best suited to the design of new program systems. It does not now appear to be a desirable solution for exporting programs the size and complexity of MYCIN, due to the magnitude of the reprogramming task.

Another alternative, which is still consistent with MAINSAIL implementation, is the use of mini-computers that could be added to SUMEX as satellites. In this approach, one of the “large mini’s” currently under development by DEC would be set up as a peripheral to the main system, sharing the file system and other I/O devices, but with its own memory and CPU to provide additional computing power.
With the cost of such a system currently projected in the range of $250,000, it presents an adequate solution to the problem at a much smaller investment.

If the satellite machine were capable of running INTERLISP (or a close dialect), we would realize several other advantages as well. With a direct hardware connection between machines, and minimal software conversion necessary, the two phases of our work (research on methodology and development of knowledge bases) can proceed in parallel. This also provides an effective mechanism for feedback from the experts on their experience in building the knowledge base, and means we can more quickly incorporate their suggestions and ideas into our research work.

There is of course an unavoidable degree of uncertainty in these plans. There is not now on the market an “off the shelf” mini-system that meets our needs. However, several current and projected developments combine to make this a reasonable prospect. First, work is currently underway at Bolt, Beranek and Newman (Cambridge, MA) on a version of INTERLISP for the PDP-11. This software development is under-written by the Advanced Research Projects Agency (ARPA) of the Department of Defense, and will be available to the ARPA research community, which includes several of the projects at Stanford. In addition, the BBN work includes the development of an augmented PDP-11/45 as a hardware facility for running their system. We believe that an off the shelf system would be more desirable in the long run than the BBN hardware, which is in part “home grown”. But the existence of such a system, and the availability of the software to run it, is an important demonstration of the feasibility of our plans.
In addition, the work underway at MIT on a “LISP machine” (a computer designed specifically to run LISP code, and intended to be price competitive with existing mini-machines) is another demonstration of the practicality of our plans, and another possible source for the facility. Finally, recent reports in the trade press (see Appendix B) indicate that commercial manufacturers will soon be offering a machine of the size and architecture needed to run our system, at a price in the range quoted above.

Thus while we cannot now specify the precise piece of hardware which will provide the facility required, the developments noted indicate that it should be commercially available at about the same time as our projected need for it. We thus feel that, despite the uncertainty of projecting four years ahead, the necessary hardware and software will be available at an attractive price.

4.5 Performance Evaluation

In order to demonstrate the effectiveness of the MYCIN framework for codifying knowledge for research scientists, we will need to demonstrate that a disparate group of scientists can communicate with a growing knowledge base, find their points of disagreement, and reach consensus on a common expression of their knowledge. As a check on whether the experts converge on a correct set of rules, we will also need to demonstrate that the resulting system comes to expert-quality decisions on difficult cases.

Dr. Axline will be our initial outside collaborator. Because he understands the system and was instrumental in developing the bacteremia knowledge base, long distance collaboration will be less difficult than with any other infectious disease expert. He will be a major source of knowledge about prophylactic uses of antibiotics, for which the Stanford group will act as critics.
The Stanford group, on the other hand, will be the primary source of rules about urinary tract and pulmonary infections, which Dr. Axline will then criticize. The knowledge engineering tools that we now have for examining the knowledge base constitute the minimal capabilities we need for long distance interaction. As soon as the University of Arizona and Stanford groups converge on these three initial rule sets, we will extend the collaborative community to experts at other institutions.

Our evaluation activities will concern two areas, paralleling our two central goals.

4.5.1 Evaluation of MYCIN’s Performance in Infectious Disease

During the next three years we will implement a formal program of performance evaluation, to insure the maintenance of a high level of performance in areas currently within MYCIN’s expertise, and to aid in extending that performance.

Maintenance of existing performance levels will depend primarily on the patient library mechanisms described earlier. Using some of the advanced facilities we will be developing, it will be possible to have the program running unattended in the evenings, testing modifications to the knowledge base by selecting cases from the library, and comparing the new answers with the expected results. The system will be able to run and test a number of cases overnight, and make available in the morning a detailed report of the results.

Extensions to the system will be made with the help of infectious disease fellows. By making the program available to them, we hope to profit by their extensive use and testing of it, to suggest new additions to the system and to uncover weak points.

4.5.2 Evaluation of MYCIN’s Impact on Consensus Among Experts in the Field