RESEARCH PROGRAM:

BIOMEDICAL KNOWLEDGE REPRESENTATION

Submitted to the
National Library of Medicine
February 1979

Department of Computer Science
Stanford University
TABLE OF CONTENTS

PART A -- ADMINISTRATIVE PAGES

Section
I. RESEARCH OBJECTIVES
II. BUDGETS
II.A. Administrative and Core Research
II.B. Project 1
II.C. Project 2
II.D. Project 3
III. BUDGET NOTES
IV. CURRICULA VITAE

PART B -- RESEARCH PLAN

Section
I. THE PROGRAM PROPOSED
I.A. Rationale for the Program
I.B. Resources that exist to aid
I.C. Significance
II. PROJECT 1 -- CODIFICATION AND USE OF MEDICAL KNOWLEDGE IN ONCOLOGY
II.A. Introduction
II.B. Specific Aims
II.C. Methods
III. PROJECT 2 -- A WORKBENCH FOR KNOWLEDGE REPRESENTATION
III.A. Objectives of the Research and their Significance
III.B. Background and rationale
III.C. Methods of procedure
III. PROJECT 3 -- CODIFICATION AND USE OF MEDICAL KNOWLEDGE IN CLINICAL LABORATORIES
III.A. Objectives
III.B. Background and rationale
III.C. Methods
IV. CORE RESEARCH
IV.A. Objectives of Research
IV.B. Background and Rationale
IV.C. Methods of Procedure
IV.D. Significance
V. FACILITIES AVAILABLE
V.A. Hardware
V.B. Software and Personnel
VI. COLLABORATIVE ARRANGEMENTS
VII. PRINCIPAL INVESTIGATOR ASSURANCE
VIII. APPENDIX A
IX. APPENDIX B
X. APPENDIX C
REFERENCES

Overview Sec. I.
I. THE PROGRAM PROPOSED

We begin this proposal with a description of the broad
program contemplated, with rationale and justification of need,
and a description of resources and facilities already available
for the purpose.

Herein we propose a five-year program of research on
knowledge representation, and the various problems associated
with it in the design of knowledge-based computer programs. The
Stanford University group will work collaboratively with a group
from the University of Missouri’s Health Care Technology Center,
under the direction of Dr. Donald Lindberg. The program will be
under the general direction of Professor Edward Feigenbaum of
Stanford, who presently serves also as the Principal Investigator
of SUMEX-AIM, the NIH-sponsored National Computer Resource for
research on the application of Artificial Intelligence (AI)
techniques to medicine and biology. This Resource will serve the
computer needs of the proposed program.

The proposed program consists of four activities: three
projects and a core research activity.

Projects One and Three address the problems of knowledge
representation, acquisition, and utilization in specific
medical/hospital settings.

In Project One, the clinical setting is the Oncology Day
Care Clinic. The task that provides specificity and direction to
the research is the construction of a consultation system
regarding experimental protocols and selection of therapy for
clinic outpatients. This project is led by Professor E.H.
Shortliffe of the Stanford Medical School, the original developer
of the MYCIN program for consultations regarding infectious
disease diagnosis and therapy.

In Project Two, the transfer of such expertise to other
places and to other medical applications can be viewed as the
primary goal. One powerful way of cumulating the concepts and
methods of an emerging branch of Computer Science is to cumulate
them in working software packages that are widely applicable and
widely shared. This project aims at developing a number of such
packages or "tools", constituting a computer-program "workbench"
for further research on and application of knowledge-based
systems. The packages emerge as generalizations of work done in
the task-specific projects; constitute a very tangible type of
result therefrom; and serve to amplify and a[...]
efforts. This project is under the direction of Professors Bruce
Buchanan and Douglas Lenat of Stanford.

In Project Three, the setting is the Clinical Laboratory
and the task is one of acquiring and representing the medical
expertise that allows the laboratory expert (e.g. the Laboratory
Director) to interpret test results and discuss these with the
patient’s clinical physician. This is the inter-university
collaboration headed by Dr. Lindberg. An important subgoal of
this project is the transfer of the Stanford expertise in
knowledge-based systems research to the Missouri Center.

The Core Research Activity will investigate a variety of
fundamental research questions whose answers will shape present
and future developments in knowledge representation research.
Such questions involve formalisms and data structures for
representing various types of knowledge; various methods -- some
automatic, some interactive -- for acquiring new knowledge in
systems; new inferential methods for putting this knowledge to
work; strategy-knowledge representations for reasoning about the
domain-specific knowledge; and so on. The Core Research Activity
is under the direction of Professor Feigenbaum. Douglas Lenat of
Stanford.

Lastly, it is an objective of the overall program to
disseminate the findings of the research, and to provide training
opportunities to others. This objective will be accomplished
through publications, presentations of research results at
scientific meetings, by making room in the operational sites and
the core activity for visiting scientists and trainees, and by
participation in a special annual meeting. The meeting to
discuss our research and similar projects in this field will
either be a part of or be coordinated with the annual artificial
intelligence in medicine meetings at Rutgers University. That
is, in years when the Rutgers meeting agenda and housing
facilities can accommodate this group and its audience, we will
join with Rutgers. In years when this is not possible, we will
sponsor a separate meeting addressed to the four principal
objectives of this program.

The administrative arrangements for the Program will be
these:

The Principal Investigators of the various program
activities will collectively constitute an Executive Committee
for the Program, under the chairmanship of the Program Director.
The Executive Committee will meet routinely by telephone
conference and occasionally face-to-face.

An Advisory Group will be formed, consisting of colleagues
at other institutions who share our motivations and scientific
interests. This group will advise the Executive Committee on
major decisions and will offer peer review as necessary. The
kernel of the Advisory Committee will be drawn from the
membership of the SUMEX-AIM Advisory Committee (for which Dr.
Lindberg is currently chairman).

I.A. Rationale for the Program

I.A.1. What do we mean by knowledge?

 

Computer scientists have long recognized that a computer is
a general symbol-manipulating device. Arithmetic constitutes a
special case of this capability—the manipulation of those
symbols that are numbers. In this proposal we will be discussing
non-numeric symbol manipulation by computers. In thinking about
non-numeric computation, it is useful to think about:

a. inference methods (as opposed to calculation and
algorithms)
b. qualitative "lines of reasoning" (as opposed to
quantitative formulations)
c. symbolic facts (not merely numeric parameters and
formulas)
d. decision rules of expertise and judgment (as opposed to
mathematical decision rules)

The use of the term "knowledge" in this proposal is
intended to cover both (c) and (d) above. In common usage, the
term "knowledge" does not usually include (d), because such
judgmental and experiential knowledge is largely tacit knowledge
and therefore not recognized (i.e. the knowledge is "private" and
the expert is not aware of what he/she knows and is using in
problem-solving). The knowledge is private not because the expert
is unwilling to share it, but because he/she is unable to
discover and verbalize it.

It is central to our view that such knowledge—the
knowledge of “expertise"— is critical for competent practice in
medicine and science, in fact constituting the bulk of the
knowledge employed in such practice. We view as a matter of great
importance that such knowledge be codified and given a concrete
(and at least semi-formal) representation, so that it can be
used, stored, transmitted to others, analyzed, discussed, and
taught.

Every activity of this proposed program is aimed at
developing the scientific concepts and methods by which this can
be most expeditiously, carefully, and usefully done.

Symbolic computation, though general and powerful, has
hardly begun to be exploited in real applications. The speciality
within Computer Science that has studied complex methods of
symbolic computation is “Artificial Intelligence Research."

I.A.2. Some Relevant Global and Local History

Early work in artificial intelligence aimed toward the
creation of generalized problem solvers. Work on programs like
GPS [by Newell and Simon] and theorem proving, for instance, was
inspired by the apparent generality of human intelligence and
motivated by the belief that it might prove possible to develop a
single program applicable to all (or most) problems. While this
early work demonstrated that there was a large body of useful
general purpose techniques (such as problem decomposition into
subgoals, and heuristic search in its many forms), these
techniques did not by themselves offer sufficient power for
expert levels of performance. Recent work has instead focused on
the incorporation of large amounts of task-specific knowledge in
what have been called "knowledge-based" systems. Rather than
non-specific problem-solving power, knowledge-based systems have
emphasized high performance based on the accumulation of large
amounts of knowledge about a single domain. A second successful
focus in work on intelligent systems has been the emphasis on the
utility of solving "real world" problems, rather than artificial
problems fabricated in simplified domains. This is motivated by
the belief that artificial problems may prove in the long run to
be more a diversion than a foundation for further work, and by
the belief that the field has developed sufficiently to provide
techniques that can aid working scientists. While artificial
problems may serve to isolate and illustrate selected aspects of
a task, solutions developed for those selected aspects often do
not generalize well to the complete problem.

There are numerous current examples of successful systems
embodying both of these trends, systems which apply task-specific
knowledge to real world problems.

The following are synopses of a variety of knowledge-based
systems developed by the Stanford participants in this program
over the past thirteen years:

DENDRAL: An intelligent assistant to an analytic and
structural chemist. It infers the structures of complex organic
molecules from structural constraints. These constraints are
either supplied interactively by the user from his "private"
knowledge and intuition, or are inferred automatically from
instrument data, such as mass spectral data, nuclear magnetic
resonance data, etc. For those families of molecules for which
the knowledge base has been carefully elaborated, the DENDRAL
program performs at levels equalling or exceeding the best human
experts. The DENDRAL program now has a significant user
community in university laboratories and in industry, and is
being used to solve difficult real problems.

Meta-DENDRAL: This program is focused on the problem of
elaborating DENDRAL’s knowledge base for specific families of
compounds. It infers an empirical theory (a body of
fragmentation rules) of the mass spectrometry of specific
families from recorded mass spectral data. It has not only
"rediscovered" rules previously acquired from chemists, but has
discovered novel rules for certain families -- rules that have
recently warranted publication in the chemical literature.

MYCIN: This program is an intelligent assistant to a
physician diagnosing infectious diseases. In conjunction with
its diagnoses, it recommends therapeutic action. It is capable
of explaining its line-of-reasoning at any (and varying) level of
detail to the user in English. It can accept new decision rules
from the user in English. It keeps an updated model of its own
knowledge base, which it uses to critique the introduction of new
rules into the system. It is capable of acquiring and using
measures of the uncertainty of the knowledge, and produces a
"believability" index with each inference, i.e., it is capable of
approximate implication. A version called EMYCIN, sans
infectious disease knowledge, has been developed to extend the
use of the system to other domains.

HASP: Project scientists working in a classified
environment led the development of a signal-understanding program
for continuous surveillance of certain objects of military
interest. The program ran successfully in a number of highly
varied test situations, and is being further developed in a
currently funded ARPA program. The program used a design for
incremental hypothesis formation that was a modification of the
HEARSAY design for the CMU speech-understanding system. Symbolic
knowledge from a number of sources was used to aid the
interpretations of the primary signal data. Time-dependent
analysis was novel in this system and played an important role.

AM: This remarkable program conjectures "interesting"
mathematical concepts. Its knowledge base encompasses the
(usually private) knowledge of a mathematician as to what
constitutes an "interesting" construct in mathematics. Starting
with the simplest set-theory concepts, and hundreds of rules
defining "interestingness" of mathematical concepts, it has
conjectured such concepts as addition, multiplication,
factorization, primes, unique factorization into primes (the
fundamental theorem of arithmetic), and an almost unstudied
concept in number theory called "maximally divisible numbers."

MOLGEN: (under development) This program is being designed
to be an intelligent assistant to an experimental molecular
geneticist in formulating plans for laboratory experiments
involving the manipulation of short DNA strands with restriction
enzymes. The program is concerned with representing knowledge
about planning and with the automatic formulation of plans to the
level of detail demanded by the user. The program's knowledge
must be represented at various levels—biological, genetic,
topological, and chemical—and these levels must be incorporated
into the reasoning.

CRYSALIS: Crystallographic Image Interpretation: (under
development) This program is being designed to interpret
ambiguous, incomplete three-dimensional image data obtained in
x-ray crystallography of protein structures. The image input data
is the so-called electron density map and the answer desired is
an approximately correct protein molecule (or portion thereof).
As with HASP, many sources of symbolic data support the
interpretation of the primary signal data. The HASP program
organization has been imported as a test of its generality. The
interpretation problem is difficult because the best wavelength
available (x-rays) is too long to resolve atoms and interatomic
separations; hence the need for additional sources of symbolic
knowledge, e.g., the amino acid sequence of the protein.

PUFF: This program interprets data from the pulmonary
function testing laboratory and provides for the Lab Director an
interpretive summary of findings regarding airways obstruction,
lung restriction, and the degree of severity; subtype, such as
bronchitis; the corroborating evidence and its weight; treatment
recommendations; etc. This knowledge-based system was built in
collaboration with a pulmonary physiologist at Pacific Medical
Center, and is in routine daily use.

VM: A program that offers the attending physician or nurse
interpretations of streams of data monitored from a patient in
Intensive Care; signals alarm conditions due to unexpected
patient condition or possible instrument malfunction; and offers
advice regarding the management of the patient’s ventilator
machine assistance. This is another collaboration with Pacific
Medical Center.

SACON: A MYCIN-like consultation system that advises a
structural engineer on the analysis plan necessary to compute the
multitude of structural engineering design parameters needed for
building a complex structure (such as an airplane wing or an
offshore oil drilling platform or a building). Interactively, in
consultation, the user supplies the design specifications. The
system was built in collaboration with structural engineers at
the MARC Analysis Corporation. It was built rapidly using the
EMYCIN package discussed later.

In short, as the capsule sketches above indicate, the main
themes of our work involve: the acquisition and maintenance of
knowledge bases; the utilization of this knowledge in a variety
of ways for data interpretation, problem solving, and planning;
and the representation of this knowledge for computer inference.

I.A.3. Knowledge Representation Issues and Designs -- the MYCIN Experience

In lieu of further general discussion of knowledge
representation, we have chosen to explicate in some depth our
viewpoint and methodology by drawing upon the experience in
design and development of just one of our programs, the well-
known consultation system MYCIN. For us, this work has been
seminal; hence the discussion of it that follows generalizes to
most of the other Stanford-based efforts mentioned above.

I.A.3.a. Background

Several computer programs have been written that attempt to
model a physician’s decision making processes. Some of these
have stressed the diagnostic process itself [27],[17]; others
have been designed principally for use as educational tools
[31],[36],[56]; while still others have emphasized the program’s
role in providing medical consultations [4],[29],[51],[57].
Actually, these applications are inherently interrelated since
any program that is aimed at diagnosing disease has potential use
for educating and counselling those who lack the expertise or
statistical data that have been incorporated into the program.
Consultation programs often include diagnosis as a major
component, although their principal focus involves interactive
use by the physician and/or the determination of appropriate
advice regarding therapy selection.

In general, the educational programs designed for
instruction of medical students and other professionals have met
with more long-term success [68] than has been the case for the
diagnostic and consultation programs. The relative success in
implementing instructional programs may result because they deal
only with hypothetical patients as part of an effort to teach
diagnostic and therapeutic concepts, whereas the consultation
programs attempt to assist the physician in the management of
real patients in the clinical setting. A program making
decisions that can directly affect patient well-being must
fulfill certain responsibilities to the physician if he is to
accept the computer and make use of its knowledge.

Physicians will, in general, reject a computer program
designed for their use in decision making unless it is
accessible, easy to use, forgiving of noncrucial errors from
nonexpert typists, reliable, and fast enough to facilitate the
physician’s task without significantly prolonging the time
required to accomplish it. They also require that the program
function as a tool to the physician, not as an all-knowing
machine that analyzes data and then states its inferences as
dogma without justifying them.

Those who design computer programs to give advice to
physicians must devise solutions to these requirements in an
effort to combat the current lack of acceptance of computer-aided
diagnosis by the medical profession [14],[24]. The physician is
most apt to need advice from such a program when an unusual
diagnostic or therapeutic problem has arisen. However, he may be
unwilling to experiment with a program that does not meet the
general requirements outlined above.

Considerations such as those mentioned here have in large
part motivated the research of our group over the last
half-decade. We felt it was important to devise a consultation
program that was (1) useful, (2) educational when appropriate,
(3) able to explain its advice, (4) able to understand and
respond to simple questions stated in natural language, (5) able
to acquire new knowledge interactively, and (6) able to be
modified easily. Although we recognized that this list of design
considerations was somewhat idealistic in light of the state of
the art in computer science, we did feel that it provided a
useful set of long-range goals. The program we developed, known
as MYCIN, has had considerable success in achieving many of the
goals stated. The current research proposes to build on the
MYCIN experience, both by expanding the basic computer science
methodology to deal with recognized problems as yet unsolved, and
by implementing a consultation system in a clinical setting where
its usefulness and acceptability to physicians can be assessed.

I.A.3.b. The MYCIN Program

As medical knowledge has expanded in recent decades, it has
become evident that the individual practitioner can no longer
hope to acquire enough expertise to manage adequately the full
range of clinical problems that will be encountered in his
practice. Thus when a patient’s problem clearly falls outside
the area of the attending physician’s expertise, consultations
from experts in other subspecialties have become a well accepted
part of medical practice. Such consultations are acceptable to
doctors in part because they maintain the primary physician’s
role as ultimate decision maker. The consultation generally
involves a dialog between the two physicians, with the expert
explaining the basis for his advice and the nonexpert seeking
justification of points he finds puzzling or questionable. A
consultant who offered dogmatic advice he was unwilling to
discuss or defend would find his opinions were seldom sought.

Fig. 1 shows a schematic view of the consultation process.
Appendix A shows a detailed typescript of a sample consultation.
The physician nonexpert gives information about his patient to
the expert in response to questions and, in return, receives
advice and explanations. Thus there are actually three kinds of
information flow between the physician and his consultant. The
MYCIN program models the consultative process by attending to all
three kinds of information. It is our conviction that programs
which ignore the explanation pathway will fail to be accepted by
physicians because they will see in such systems too severe a
departure from the human consultation process (in which the
primary physician is provided with sufficient information to
allow him to decide whether to follow the offered advice).
[Figure 1: schematic in which the PHYSICIAN EXPERT and the PHYSICIAN NONEXPERT are linked by three information flows: DATA ABOUT PATIENT, ADVICE, and EXPLANATIONS.]

Figure 1 - Information Flow Between Physician And Consultant

 

MYCIN is a LISP program designed to serve as a clinical
consultant on the subject of therapy selection for patients with
serious infections. The program may be envisioned as interposed
between the expert and nonexpert in much the way that the large
box is positioned in Fig. 1. The difference is that the human
expert can offer only general knowledge to the program, not
patient-specific decisions. The program thus becomes the
decision maker, using general medical knowledge from experts to
assess a specific patient and to give advice plus explanations
for its judgments.

Fig. 2 details the organization of MYCIN relative to the
human consultation process depicted in Fig. 1. As before, the
nonexpert offers data about his patient and in return receives
both advice and, when desired, information via one of two
internal explanation mechanisms (the general question-answerer or
the reasoning-status checker). The basis for all decisions is
domain-specific knowledge acquired from experts (static
knowledge). A group of computer programs (the rule interpreter)
uses this knowledge, and data about the specific patient, to
generate conclusions and, in turn, therapeutic advice. It
simultaneously keeps a record of what has happened, and this
record is available to the explanation routines if the physician
asks for justification or clarification of some conclusion that
the program has reached. Although Fig. 2 is somewhat
complicated, the following discussion should clarify the
interrelationships among the various system components depicted
in the diagram. Furthermore, Appendix A gives detailed examples
of all the features described below.

Knowledge Representation

Static Knowledge

Static knowledge refers to all data that are constant in
the program and unchanging from one consultation to the next.

Facts About The Domain. Much of the knowledge MYCIN
requires is simple statements of fact about the domain. These
can generally be represented as attribute-object-value triples.


 

  

[Figure 2: block diagram of the KNOWLEDGE-BASED PRODUCTION SYSTEM, positioned between the PHYSICIAN EXPERT and the PHYSICIAN USER. Labeled components: STATIC KNOWLEDGE (FACTS ABOUT THE DOMAIN; PRODUCTION RULES FOR MAKING INFERENCES); DYNAMIC KNOWLEDGE (DATA ABOUT PATIENT; CONCLUSIONS ABOUT PATIENT; CONSULTATION); GENERAL QUESTION ANSWERER (EXPLANATIONS AT ANY TIME, NATURAL LANGUAGE); REASONING STATUS CHECKER (EXPLANATIONS DURING CONSULTATION: "WHY", "HOW", "EXPLAIN"); ADVICE.]

Figure 2 - Schematic Description Of MYCIN Related To Fig. 1

 

Production Rules. (Appendix A - Section I) In addition to
simple facts, MYCIN requires judgmental knowledge acquired from
experts and available for use in analyzing a new patient.
Judgmental knowledge in MYCIN is expressed as production rules
[16] which define certain preconditions (the PREMISE) that allow
a conclusion to be reached (the ACTION) with a specified degree
of confidence (the "certainty factor" [49]). Although such rules
are stored as LISP list structures, a series of routines is
available for translating them into English. For example:

PREMISE: If the stain of the organism is gramneg, and
the morphology of the organism is rod, and
the organism is anaerobic,

ACTION: Then there is suggestive evidence (.7) that
the identity of the organism is bacteroides.

Note that the purpose of this rule is the determination of
organism identity. Rules are classified and accessed in
accordance with their purpose as described below.
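
The rule shown above can be pictured as a small data structure. The following Python sketch is an illustration only: MYCIN itself stores rules as LISP list structures, and the names Rule, to_english, and index_by_purpose, the attribute name "aerobicity", and the single fixed English template are all assumptions introduced here, not part of the program.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Dict, List, Tuple

# A clause is an (attribute, object, value) triple.
Clause = Tuple[str, str, str]


@dataclass(frozen=True)
class Rule:
    premise: Tuple[Clause, ...]  # preconditions; all must hold
    action: Clause               # conclusion drawn when the premise holds
    cf: float                    # certainty factor attached to the conclusion


def to_english(rule: Rule) -> str:
    """Render a rule with one fixed template (a simplification)."""
    conds = ", and ".join(f"the {a} of the {o} is {v}" for a, o, v in rule.premise)
    a, o, v = rule.action
    return (f"PREMISE: If {conds},\n"
            f"ACTION: Then there is evidence ({rule.cf}) that "
            f"the {a} of the {o} is {v}.")


def index_by_purpose(rules: List[Rule]) -> Dict[str, List[Rule]]:
    """Group rules by the attribute they conclude about (their purpose)."""
    index: Dict[str, List[Rule]] = defaultdict(list)
    for rule in rules:
        index[rule.action[0]].append(rule)
    return index


BACTEROIDES_RULE = Rule(
    premise=(("stain", "organism", "gramneg"),
             ("morphology", "organism", "rod"),
             ("aerobicity", "organism", "anaerobic")),  # attribute name assumed
    action=("identity", "organism", "bacteroides"),
    cf=0.7,
)

print(to_english(BACTEROIDES_RULE))
```

Indexing by the concluded attribute is one plausible reading of "classified and accessed in accordance with their purpose": a request for organism identity retrieves exactly the rules whose ACTION concerns identity.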

Dynamic Knowledge

Dynamic knowledge refers to all data that are variable and
change from one run of the program to the next.

Data About The Patient - Acquired From The User. MYCIN
asks questions of the user, driven by a reasoning algorithm
described below. These questions generally ask the user to fill
in the "value" in an attribute-object-value triple (e.g., "What is
the patient's name?"), or to give the truth value of a predicate
(e.g., "Is the patient a compromised host?"). Thus these data may
be represented, once acquired, in precisely the way that facts
about the domain are represented in the static knowledge base
(see above).

 

Data About The Patient - Generated By The Program. When
the preconditions in the PREMISE of a rule are found to hold,
MYCIN executes the ACTION portion of the rule and generates a new
"fact" which can, once again, be represented as an attribute-
object-value triple. As mentioned above, conclusions may also
have a confidence value associated with them, thereby requiring
that the triple be expanded to a quadruple:

 

 

The identity of ORGANISM-1 is bacteroides, with a
certainty factor of .7:
(IDENTITY ORGANISM-1 BACTEROIDES .7)

Predicates may be similarly expanded. Furthermore, by
generalizing this scheme to include representation of data
acquired from the user, the physician may be asked to express his
confidence in the answer he gives when MYCIN asks a question.
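
As an illustration of the triple-to-quadruple expansion, here is a minimal Python sketch. The names Fact and as_list are hypothetical, and the default certainty of 1.0 for facts stated without a confidence value is an assumption of this sketch rather than something the text specifies.

```python
from typing import NamedTuple


class Fact(NamedTuple):
    attribute: str
    obj: str
    value: str
    cf: float = 1.0  # assumed default: full confidence when none is given


def as_list(fact: Fact) -> str:
    """Format a fact in the parenthesized style used in the text."""
    return f"({fact.attribute} {fact.obj} {fact.value} {fact.cf:g})"


# A conclusion generated by the program, carrying the rule's certainty factor.
concluded = Fact("IDENTITY", "ORGANISM-1", "BACTEROIDES", 0.7)
print(as_list(concluded))  # (IDENTITY ORGANISM-1 BACTEROIDES 0.7)
```

The same four-field form serves for data supplied by the user, who would then state a confidence in place of the default.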

Maintenance Of A Record Of The Consultation. A history of
the consultation is the third variety of dynamic knowledge. The
details of representation need not be described here, but these
data include records of which rules succeeded, which rules were
tried but failed, how specific decisions were made, how
information was used, and why questions were asked.

 

The Production System

 

The Rule Interpreter

This series of routines analyzes rules in the static
knowledge base, determines whether they apply to the patient
under consideration, and if so draws the conclusions delineated
in the ACTION portions of the rules. This process would quickly
become unmanageable as system knowledge grew if there were not a
mechanism for selecting only the most relevant rules for a given
patient. This is accomplished by a goal-oriented approach that
we have described in detail [50],[51]. Briefly, as the rule
interpreter examines the PREMISE of a rule, it notes whether the
relevant data needed to determine the truth of each precondition
are already known. If not, it digresses to examine those rules
which make conclusions about the data that are needed by the
first rule. The PREMISE conditions of those rules may, in turn,
invoke additional rules, and in this way a reasoning network
relevant to the first rule is formed. Since rules are classified
according to their purpose, as previously described, it is easy
to identify all rules that may aid in determining the truth of a
specific precondition. The entire process is initiated by
invoking a specific "Goal Rule" which defines MYCIN’s task and is
the only rule necessarily invoked for every consultation. When
MYCIN can find no rules for determining the truth of a
precondition, it asks the user for the relevant data. If the
physician does not know the information either, the invoking rule
is simply ignored.
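
The goal-oriented control strategy described above can be sketched in a few lines of Python. This is a simplified illustration, not MYCIN's interpreter: it handles a single object, ignores certainty factors, stops at the first rule that succeeds, and does not guard against circular rule sets. The names find_out, RULES, ask, and record, and the attribute name "aerobicity", are assumptions made for the sketch. The top-level call plays the part of the Goal Rule.

```python
from typing import Callable, Dict, List, Optional, Tuple

Condition = Tuple[str, str]               # (attribute, required value)
Rule = Tuple[List[Condition], Condition]  # (PREMISE, ACTION conclusion)

RULES: List[Rule] = [
    ([("stain", "gramneg"), ("morphology", "rod"), ("aerobicity", "anaerobic")],
     ("identity", "bacteroides")),
]


def find_out(attribute: str,
             known: Dict[str, Optional[str]],
             rules: List[Rule],
             ask: Callable[[str], Optional[str]],
             record: List[Tuple[str, Condition]]) -> Optional[str]:
    """Determine an attribute: use known data, else rules, else ask the user."""
    if attribute in known:
        return known[attribute]
    # Rules are accessed by purpose: those whose ACTION concludes this attribute.
    relevant = [r for r in rules if r[1][0] == attribute]
    if not relevant:
        # No rule can help, so ask; the user may answer None ("unknown").
        known[attribute] = ask(attribute)
        return known[attribute]
    for premise, conclusion in relevant:
        # Checking a precondition may recursively invoke further rules.
        if all(find_out(a, known, rules, ask, record) == v for a, v in premise):
            record.append(("succeeded", conclusion))
            known[attribute] = conclusion[1]
            return known[attribute]
        record.append(("failed", conclusion))  # the rule is simply ignored
    known[attribute] = None
    return None
```

The record list stands in for the consultation history: it notes which rules succeeded and which were tried but failed, which is the information an explanation routine would later consult.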

Maintenance Of Initiative In The Hands Of The Physician

As was discussed above, a physician is not likely to accept
a system such as MYCIN if the program simply asks a series of
questions and then presents a piece of dogmatic advice as it
terminates execution. The production system has therefore been
provided with a series of "interrupts" that allow the physician
to digress with questions of his own or to demand justification
for the line of questioning on which MYCIN has embarked during
the consultation. Whenever the program asks a question, the user
can temporarily refuse to answer and instead call on the
explanation capabilities described in the next section.

Explanations

The Reasoning-Status Checker (RSC) (Appendix A - Section
IV)

This component of the explanation system deals with most
questions that arise during the consultation session itself.
Because the context of current reasoning about the patient is
well-defined, the physician can be given a great deal of
information on the basis of a few simple commands that do not
require natural language processing. These commands are briefly
described below; the details of their implementation have also
been documented [48]. As shown in Fig. 2, the reasoning status
checker (RSC) uses only the knowledge base of rules and the
current record of the consultation; the general question-answerer
(GQA) described below, on the other hand, has access to all
static and dynamic knowledge.

The WHY Command. Whenever MYCIN asks a question, the
physician may prefer not to answer initially and instead to
inquire about the reasoning underlying the questioning. Thus he
may simply respond with the command WHY (i.e., "Why do you think
that the information you are requesting may be useful?"). Since
all questions MYCIN asks are generated by rules, and since the
rules are selected according to their purpose as previously
mentioned, an English language translation of the rule under
consideration generally serves as an adequate response to the WHY
query. The RSC therefore responds by displaying the current
rule. In addition, it places an identifying number before each
of the preconditions in the PREMISE and indicates whether the
condition is (a) already known to be true, or (b) still under
investigation (note that one of the latter group of preconditions
will have generated MYCIN’s current question to the user). The
physician can in turn inquire why the displayed rule was selected
by asking WHY a second time, and the RSC will accordingly display
the next rule in the reasoning network.

The HOW Command. As mentioned above, when MYCIN displays a
rule in response to the WHY command, it labels each precondition
in the PREMISE with a unique number. The physician may then
respond to the displayed explanation by entering HOW followed by
one of the identifying labels. If the reference condition is one
that MYCIN has already concluded to be true, the RSC assumes that
the physician is asking "HOW did you decide that the specified
precondition is true?" and answers by citing the relevant rules
used to make the decision. If, on the other hand, the cited
condition has not yet been fully investigated, MYCIN assumes the
physician is asking "HOW will you decide if the specified
precondition is true?" and responds by citing the rules it
intends to try, only some of which may actually succeed.
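
The WHY and HOW commands can be illustrated with a small sketch. It assumes a much simplified representation that is not taken from MYCIN itself: each rule is a dictionary, the chain of rules currently under consideration is a list, and every rule and condition name below is hypothetical.

```python
# Hypothetical rules: "premise" lists preconditions, "action" is the conclusion.
RULES = [
    {"name": "RULE_A",
     "premise": ["site_is_sterile", "organism_is_gram_negative"],
     "action": "infection_is_significant"},
    {"name": "RULE_B",
     "premise": ["stain_is_gram_negative"],
     "action": "organism_is_gram_negative"},
    {"name": "RULE_C",
     "premise": ["infection_is_significant"],
     "action": "therapy_is_needed"},
    {"name": "RULE_D",
     "premise": ["culture_site_is_blood"],
     "action": "site_is_sterile"},
]
BY_NAME = {r["name"]: r for r in RULES}


def why(rule_stack, times_asked, known_true):
    """Show the rule reached by asking WHY `times_asked` times (1 = current rule)."""
    rule = rule_stack[-times_asked]
    lines = ["IF"]
    for label, cond in enumerate(rule["premise"], start=1):
        status = "known to be true" if cond in known_true else "under investigation"
        lines.append(f"  [{label}] {cond} ({status})")
    lines.append(f"THEN {rule['action']}")
    return rule["name"], lines


def how(rule, label, known_true, rules):
    """Answer HOW for the precondition carrying the given numeric label."""
    cond = rule["premise"][label - 1]
    if cond in known_true:                       # "HOW did you decide ...?"
        return "decided using", known_true[cond]
    # "HOW will you decide ...?": cite rules that could conclude the condition.
    return "will try", [r["name"] for r in rules if r["action"] == cond]


# RULE_C invoked RULE_A, which generated the current question.
stack = [BY_NAME["RULE_C"], BY_NAME["RULE_A"]]
known_true = {"site_is_sterile": ["RULE_D"]}     # condition -> rules that established it

first_why = why(stack, 1, known_true)
second_why = why(stack, 2, known_true)
how_done = how(BY_NAME["RULE_A"], 1, known_true, RULES)
how_pending = how(BY_NAME["RULE_A"], 2, known_true, RULES)
```

Because the context is fixed by the current question, both commands reduce to lookups in the rule chain and the consultation record, which is why no natural language processing is needed at this stage.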

The General Question-Answerer (GQA) (Appendix A - Section
V)

The general question-answerer (GQA) is a more comprehensive
explanation system which, at any time during or after the
consultation session, has full access to all static and dynamic
knowledge in MYCIN (Fig. 2). Since it cannot make simple
assumptions based on context, as the RSC can do, the GQA must
accept and answer questions expressed in natural language.
MYCIN’s rule-based knowledge representation scheme, and some
techniques borrowed from early work in computational linguistics
[13],[38],[47], permit a straightforward but powerful approach to
interpreting simple English questions without contending with
several of the complex problems of natural language
understanding. The details of this approach have been documented
[76].

Questions About Static Knowledge. The ability to retrieve
information from the static knowledge base gives the GQA a
tutorial capability. Since the static knowledge is acquired from
experts, the GQA can essentially act as an intermediary between
an expert and a physician seeking general information about the
infectious disease field. The user might ask simple questions of
fact (e.g., "Which culture sites are normally sterile?") or
questions regarding judgments stored in rules. Questions of the
second variety are termed "rule-retrieval" questions because they
may be answered simply by identifying and displaying English
versions of relevant rules from the knowledge base. Retrieval
may be keyed to the rule PREMISE (e.g., "How do you use the gram
stain of an organism?"), the ACTION (e.g., "When do you decide an
organism might be a streptococcus?"), or to both the PREMISE and
ACTION (e.g., "Do you ever use the morphology of an organism to
determine its identity?"). Furthermore, a question may deal with
a specific rule (e.g., "What is rule037?"). Note that none of
these questions refers to a specific patient or consultation;
answering them thus requires no access to the dynamic knowledge
base (Fig. 2).

Questions About Dynamic Knowledge. Although the RSC
permits inquiries regarding the dynamic knowledge base, its scope
is limited by the context of the current question being asked by
MYCIN. If the physician wishes to ask more general questions
regarding the status of MYCIN’s reasoning, or if he wishes to
review the program’s decisions after the consultation is complete
and MYCIN is no longer questioning him, the GQA gives him free
access to all information about the specific consultation. Once
again, the user might ask simple questions of fact (e.g., "From
what site was culture-2 obtained?") or questions regarding the
basis for MYCIN’s judgments. The second variety is again a rule-
retrieval question, but is keyed to the consultation record in
dynamic data rather than to the knowledge base of rules in static
data (see Fig. 2). Thus questions may again reference the
PREMISE (e.g., "How did you use the gram stain of organism-1?"),
the ACTION (e.g., "What makes you think that organism-2 might be a
streptococcus?"), or both (e.g., "Did you use the morphology of
organism-1 to determine its identity?"). Note that these
questions parallel the examples given in the previous section but
that they are consultation-specific and thus request the
retrieval not of all relevant rules, but only those that were
actually used successfully in the specified context. Finally,
one may again wish to ask about a specific rule (e.g., "Did you
use rule037 when considering organism-1?").
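
The distinction between the two kinds of rule-retrieval question can be sketched as follows. The sketch assumes the natural language question has already been reduced to the parameters it mentions; the rule names, parameter names, and record format are hypothetical and serve only to show the keying on PREMISE, ACTION, and consultation context.

```python
# Static knowledge: hypothetical rules indexed by the parameters they mention.
KNOWLEDGE_BASE = [
    {"name": "RULE_1", "premise": ["gram_stain", "morphology"], "action": "identity"},
    {"name": "RULE_2", "premise": ["culture_site"], "action": "sterility"},
    {"name": "RULE_3", "premise": ["morphology"], "action": "identity"},
]

# Dynamic knowledge: (rule, context) pairs for rules that succeeded in this consultation.
CONSULTATION_RECORD = [("RULE_3", "organism-1"), ("RULE_2", "culture-2")]


def retrieve(rules, premise_mentions=None, action_mentions=None):
    """Static retrieval: rules whose PREMISE and/or ACTION mention the given parameter."""
    hits = []
    for rule in rules:
        if premise_mentions is not None and premise_mentions not in rule["premise"]:
            continue
        if action_mentions is not None and action_mentions != rule["action"]:
            continue
        hits.append(rule["name"])
    return hits


def retrieve_used(rules, record, context, premise_mentions=None, action_mentions=None):
    """Dynamic retrieval: only rules that actually succeeded in the given context."""
    used = {name for name, ctx in record if ctx == context}
    candidates = retrieve(rules, premise_mentions, action_mentions)
    return [name for name in candidates if name in used]


# "Do you ever use the morphology of an organism to determine its identity?"
general = retrieve(KNOWLEDGE_BASE, "morphology", "identity")

# "Did you use the morphology of organism-1 to determine its identity?"
specific = retrieve_used(KNOWLEDGE_BASE, CONSULTATION_RECORD, "organism-1",
                         "morphology", "identity")
```

The same keys drive both queries; the consultation-specific version merely intersects the result with the record of rules that fired for the named context.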

Knowledge Acquisition

 

The only component of Fig. 2 not yet discussed is the
crucial step of acquiring domain-specific knowledge from experts
and coding it for storage in the static knowledge base. When
MYCIN was first being developed, such knowledge was acquired by
extensive meetings during which infectious disease experts and
computer scientists discussed specific patients and attempted to
analyze and extract the individual facts and rules that they were
using. Recently extensive work has been devoted to the problem
of automating the knowledge acquisition process in sessions
involving clinical experts interacting with MYCIN directly
(Appendix A - Section IX). This problem has been the subject of
a doctoral dissertation by one member of our group [15].

Certainty Factors

Efforts to develop techniques for modeling clinical
decision making have had a dual motivation. Their potential
clinical significance has of course been apparent. The design of
such programs also has required an analytical approach to medical
reasoning that has in turn led to a distillation of decision
criteria that in some cases had never been explicitly stated
before. It is a fascinating and educational process for experts
to reflect on the reasoning steps that they have always used when
providing clinical consultations.

Several programs have successfully modeled the diagnostic
process [27],[28],[55]. Each of these examples has relied upon
statistical decision theory as reflected in the use of Bayes'
Theorem for manipulation of conditional probabilities. Use of
the theorem, however, requires either large amounts of valid
background data or numerous approximations and assumptions. The
successful performance of Gorry and Barnett’s early program [27],
for example, and a similar study by Warner using the same data
[55], depended to a large extent upon the availability of good
data regarding several individuals with congenital heart disease.
Gorry [28] has had similar access to data relating the symptoms
and signs of acute renal failure to the various potential
etiologies.
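
The data requirements mentioned above become concrete in a small sketch of a Bayesian diagnostic calculation. It is not drawn from the cited programs: it assumes mutually exclusive and exhaustive diseases and findings that are conditionally independent given the disease, and all names and probabilities are invented for illustration.

```python
def posterior(priors, likelihoods, findings):
    """Apply Bayes' Theorem for a set of observed findings.

    priors:      {disease: P(disease)}
    likelihoods: {disease: {finding: P(finding | disease)}}
    Assumes diseases are mutually exclusive and exhaustive, and findings are
    conditionally independent given the disease (a strong approximation).
    """
    joint = {}
    for disease, prior in priors.items():
        p = prior
        for finding in findings:
            p *= likelihoods[disease][finding]
        joint[disease] = p
    total = sum(joint.values())
    return {disease: p / total for disease, p in joint.items()}


# Invented numbers, for illustration only.
priors = {"disease_A": 0.5, "disease_B": 0.5}
likelihoods = {
    "disease_A": {"finding_1": 0.8, "finding_2": 0.5},
    "disease_B": {"finding_1": 0.2, "finding_2": 0.5},
}
result = posterior(priors, likelihoods, ["finding_1", "finding_2"])
```

Every entry in the two tables must be estimated from data or supplied by assumption, so the number of required values grows with each disease and finding added.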

Although conditional probability provides useful results in
areas of medical decision making such as those mentioned, vast
portions of medical experience suffer from so little data and so
much imperfect knowledge that a rigorous probabilistic analysis,
the ideal standard by which to judge the rationality of a
physician’s decisions, is not possible. It is nevertheless
instructive to examine models for the less formal aspects of
decision making. Physicians seem to use an ill-defined mechanism
for reaching decisions despite a lack of formal knowledge
regarding the interrelationships of all the variables that they
are considering. This mechanism is often adequate, in well-
trained or experienced individuals, to lead to sound conclusions
on the basis of a limited set of observations.

We have examined the nature of such nonprobabilistic and
unformalized reasoning processes, have considered their
relationship to formal probability theory, and have proposed a
model whereby the incomplete "artistic" side of the practice of
medicine might be quantified. We have had to develop this model
of inexact reasoning in response to MYCIN’s needs; i.e., the goal
has been to permit the opinion of experts to become more
generally available to nonexperts. The model is, in effect, an
approximation to conditional probability. Although conceived with
MYCIN’s problem area in mind, it is potentially applicable to any
domain in which real world knowledge must be combined with
expertise before an informed opinion can be generated. The model
has been described in detail [75] and is based upon a scheme of
weighted numbers we call "certainty factors". Although the model
has been implemented in the MYCIN system, and in EMYCIN (see
below), and although it has allowed the program to demonstrate
impressive decision making performance, we still recognize many
problems with the formalism. The model has generated
considerable attention in the literature [1] and many important
suggestions for further research have been forthcoming.
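
This proposal does not state how certainty factors are combined; reference [75] gives the model. As a hedged illustration only, the sketch below uses a combining function commonly published for MYCIN-style systems, in which each certainty factor lies between -1 and 1. The function name combine_cf and the example values are assumptions, and the formula should be checked against [75] before being relied upon.

```python
def combine_cf(x, y):
    """Combine two certainty factors bearing on the same hypothesis.

    Assumed formula (commonly published for MYCIN-style systems, not quoted
    from this proposal); both inputs are expected to lie in [-1, 1].
    """
    if x >= 0 and y >= 0:                 # two pieces of confirming evidence
        return x + y * (1 - x)
    if x <= 0 and y <= 0:                 # two pieces of disconfirming evidence
        return x + y * (1 + x)
    denom = 1 - min(abs(x), abs(y))       # conflicting evidence
    if denom == 0:
        raise ValueError("certainty factors of +1 and -1 cannot be combined")
    return (x + y) / denom


both_for = combine_cf(0.6, 0.4)       # evidence accumulates without exceeding 1
both_against = combine_cf(-0.6, -0.4)
conflicting = combine_cf(0.6, -0.4)
```

Under this assumed function the result does not depend on the order in which two pieces of evidence are considered, and accumulated support approaches but does not exceed certainty.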

Evaluations Of MYCIN’s Performance

Work on MYCIN to date has concentrated on the infectious
disease subfields of bacteremia and meningitis. Formal
evaluations have been undertaken which show that MYCIN compares
favorably with infectious disease experts in selecting therapy
for patients with bacteremia [62] or meningitis [63]. However,
we have not undertaken a clinical implementation of MYCIN yet,
and do not intend to do so in the near future. The reasons for
this decision are important in that they explain part of the
reason that we have turned from infectious diseases to oncology
at this time.

First, we have felt it is crucial that MYCIN not be placed
on the wards for clinical use if it does not already compare
favorably with other forms of consultative advice available to
primary care physicians. We have learned that this requires that
MYCIN know about essentially all major infectious disease
subfields since the various disease syndromes interrelate
clinically in such important ways. In our evaluations of the
program, it has tended to be in those cases in which a
concomitant infection existed at some other site that MYCIN has
failed to perform adequately. Yet the time required for us to
develop the required knowledge bases for genitourinary
infections, endocarditis, pneumonia, and pelvic infections would
necessarily be at least as long as the period it has required to
acquire and test the system’s knowledge of bacteremia and
meningitis. We therefore anticipate a considerable period of
time before the program will be able to provide consistently
reliable infectious disease consultations and hence be ready for
ward implementation.

There are other problems as well that have been brought out
by the complex decisions involved in infectious disease therapy
selection. First, the truth model we have devised (see
discussion of certainty factors above) has several recognized
inadequacies that will require further research and testing.
Secondly, no computer-based decision making program with which we
are familiar has adequately managed time relationships amongst
variables, and MYCIN is no exception. We see the need for
continued research into the ways in which the production rule
formalism can be suitably adapted to accommodate the need to
represent time dependencies in clinical reasoning and to use such
dependencies to make appropriate decisions. For example, trends
in a fever or white count over time may be much more important in
assessing an infected patient’s illness than the actual values of
these parameters at the precise time when the consultation is
being requested.
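
The point about trends can be made concrete with a short sketch. It is an illustration under stated assumptions, not a proposed mechanism: readings are (day, value) pairs, the trend is taken as a least-squares slope, and the temperature values are invented.

```python
def trend_per_day(readings):
    """Least-squares slope of (day, value) readings.

    Requires at least two readings taken at different times.
    """
    n = len(readings)
    mean_t = sum(t for t, _ in readings) / n
    mean_v = sum(v for _, v in readings) / n
    num = sum((t - mean_t) * (v - mean_v) for t, v in readings)
    den = sum((t - mean_t) ** 2 for t, _ in readings)
    return num / den


# Two hypothetical patients with the same temperature at consultation time.
rising = [(0, 37.0), (1, 37.8), (2, 38.5)]
falling = [(0, 40.0), (1, 39.2), (2, 38.5)]

rising_slope = trend_per_day(rising)     # positive: fever is climbing
falling_slope = trend_per_day(falling)   # negative: fever is resolving
```

A rule that sees only the latest value cannot tell these two patients apart, whereas a rule whose premise may refer to the slope can.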

Finally, in order to expand MYCIN’s infectious disease
knowledge into new problem areas, improved capabilities for
knowledge acquisition would be extremely useful. Although we
have made important initial steps in the development of this kind
of complex capability [15], there is clearly much more to be done
before an infectious disease expert who is a computer novice will
be able to comfortably interact at a computer terminal in order
to "teach" MYCIN the infectious disease judgmental knowledge that
it needs to know.

I.B. Resources that exist to aid this project

 

The research work proposed herein will not stand alone or
apart from other research already under way in the two sites. The
personnel and facilities in place at the University of Missouri’s
Health Care Technology Center are described later in the
appropriate Project section. At Stanford there is an interlocking
set of existing grants and contracts supporting the work of a
large group of scientists and students, the Heuristic Programming
Project of the Stanford Computer Science Department. This group
has, over the years, produced the various systems summarized
earlier.

Historically the most significant sources of funding have
been:

1. contracts from the Defense Advanced Research Projects
Agency, the leading government agency for funding artificial
intelligence research.

2. grants from the Biotechnology Resources Program of NIH
for the SUMEX-AIM computer facility, without which it would have
been very difficult to accomplish what was accomplished.

The other grants have had a short-term character. Some have
been renewed, others not.

The proposed NLM grant is important to this complex of
funding not only because it represents a significant amount of
funding but most importantly because it represents stable funding
over a five year period. It, therefore, like the ARPA funding,
will constitute the stable base of support that will allow the
work to advance steadily without personnel and funding
fluctuations. The NLM-sponsored work will, in turn, benefit from
the other supported work in the usual coordinated and synergistic
way that significantly amplifies the effect of the NLM support.

The grant for the SUMEX-AIM computer resource ends in mid-
1981. There is no reason now to believe that at renewal time the
grant will face trouble. However such large facilities grants are
always subject to a great deal of pressure, not always from peer-
review. The need to service the research activities of an
ongoing five year NLM research project will definitely add
strength to the renewal application.

Finally, a resource of the greatest significance for the
success of this work is the set of collaborative links that we have
built over the years with medical scientists and clinicians at
the Stanford Medical Center, the Pacific Medical Center, and the
University of Missouri. It takes years to make such links work
smoothly, but the resource is indispensable to a project on
biomedical knowledge representation.

I.C. Significance

Collectively, we stand on the threshold of a new era in our
understanding of the nature of medical and scientific knowledge,
its distribution, and its effective use. Superficially, the
cause of this has been the emergence of electronic symbol-
processing and digital communication. More substantially, the
reason for optimism is the emergence of knowledge-based computer
systems research and application as a viable scientific and
technical discipline.

We are now beginning to understand in a scientific and
technical way what practitioners have always understood about
their fields of learning and practice: that the bulk of the
knowledge they employ is not the knowledge of textbooks and
journals, but the informal and judgmental knowledge gained from
long experience and practice. This knowledge is almost never
codified, but is passed from mentor to apprentice by long periods
of training and interaction, such as the internship, residency,
and the Ph.D. graduate program.

In the last decade there have been significant
demonstrations that such heuristic knowledge can be explicated,
represented, and put to use. Needed is an interdisciplinary team
consisting of computer scientists, domain specialists, and
various computer programs and computer-oriented methodology.
Once explicated, this knowledge can participate in the ordinary
processes of cumulation of understanding in a field. For example,
it can be subject to further analysis and be the basis for
empirical studies and experimental investigation. It can be
criticized by peer review. And it can be taught, or disseminated
by library methods (electronic or otherwise).

In addition, the formal knowledge of a field can be coupled
to the informal knowledge to produce computer programs that act
as "intelligent agents" to assist practitioners in solving large
numbers of routine problems, and even some of the more difficult
problems, with which they are faced. Some methods of computer-
based inference are available today to do this, and more are
coming as research in this area matures. The concept is one of
"active knowledge" available to work for users, in contrast to
the passive knowledge of texts and articles (knowledge which is
useless until "discovered" by the practitioner through library
search and reading).

Such a prospect is not visionary. It demands our immediate
attention. We have known for many decades that computers are
general symbol processing devices, not merely calculators. We
have known for two decades how to program them to infer lines-of-
reasoning through complex problems of a symbolic nature. In the
last decade we have learned how to make such reasoning powerful
and useful—by supplying such programs with considerable bodies
of knowledge about the problem domains. And we have had to learn
how to represent the knowledge. Now microelectronics has brought
the time of low-cost computing upon us. The electronic processing
necessary to make the power of symbolic computing available to a
wide community will be available. We should not allow ourselves
to drop behind in the development of the concepts and methods
necessary for the emergence of the applications.

There are also roles for knowledge-based symbolic computing
that are visionary, but must be explored. The kind of "active"
knowledge we have been discussing can be used to assist in the
discovery of new knowledge. The very human process of discovery
of new knowledge is a slow and halting process at best, done by
very few and marked by very rare bursts of creative insight. It
now seems possible (even plausible) that models of certain kinds
of discovery can be formulated that will systematize for computer
application the intertwined activities of inferential search and
literature (i.e. knowledge) search. The Meta-DENDRAL program
(that has formulated new rules of fragmentation in mass
spectrometry) and the AM program (that conjectured some not-so-
new objects and theorems in number theory) are demonstrable
precursors of this type of knowledge-acquiring program.

We envision a National Library of Medicine that will be a
living library of the knowledge of medicine and biology, not
merely the repository of texts, journals, and articles and not
merely the immense file of their electronic images available at
terminals.

Sec. II. Project 1

II. CODIFICATION AND USE OF MEDICAL KNOWLEDGE IN
ONCOLOGY

II.A. Introduction

II.A.1. Objectives

The long term objective of our research effort is the
development of tools for the representation and use of medical
knowledge in computer-based clinical consultation systems. Such
systems will provide useful assistance to primary care physicians
while incorporating features that heighten the acceptability of
the systems to their intended users. We also wish to increase
our understanding of the logic of medical diagnosis and therapy
planning through this work. To that end we propose a five year
research effort with the following goals:

(1) to demonstrate that a rule-based consultation system
with explanation capabilities can be usefully applied and gain
acceptance in a busy clinical environment;

(2) to improve the tools currently available, and to
develop new tools, for building knowledge-based expert systems
for medical consultation;

(3) to establish both an effective relationship with a
specific group of physicians, and a scientific foundation, that
will together facilitate future research and implementation of
computer-based tools for clinical decision making.

The basic research will build on our group’s prior
experience with a computer-based consultant, termed MYCIN, that
uses production rule symbolic reasoning techniques to assist in
therapy selection for patients with serious infections. The
domain we have selected for the first clinical implementation of
these techniques is the management of research therapy protocols
for cancer outpatients at Stanford Medical Center’s new oncology
day-care center.

II.A.2. Background

This research builds on a long history of work on the MYCIN
and EMYCIN projects directed principally by Shortliffe and
Buchanan. Many of the persons developing those systems will be
involved with the research proposed here. These two projects are
described elsewhere and thus need not be described here as well.

II.A.2.a. Stanford Division Of Oncology

 

In the past decade chemotherapy has assumed a more
important role in the treatment of patients with cancer. Some
2,000 patients are under the direct care of the five faculty
physicians of Stanford’s Division of Oncology in the Department
of Medicine. Most patients are receiving care on an outpatient
basis, either at the Debbie Probst Oncology Day Care Center in
Stanford Hospital or at the Division’s twice-weekly clinic at the
Palo Alto Veterans Administration Hospital. Altogether, about
9,000 outpatient visits are made to the Division physicians each
year.

Effective management of cancer often involves more than one
therapeutic technique. Increasingly, the initial course of
treatment utilizes a combined modality approach. Surgery and/or
radiation may be followed by chemotherapy to control any
remaining cancer. However, chemotherapy alone may be curative in
some cases.

Refined programs (protocols) have been developed for the
administration of radiation and chemotherapy for many forms of
cancer. The Division has had particular success with those used
against Hodgkin’s disease (the sixth most common cancer) and
other lymphomas. In designing and carrying out individual
programs of treatment, the physicians of the Division of Oncology
work closely with Stanford specialists in other areas,
particularly radiotherapists, surgeons, pathologists, diagnostic
radiologists, pharmacologists, and immunologists. Stanford’s
expertise in these many disciplines contributes to the high level
of care received by patients in the Division of Oncology.

The Division is of course also involved in educating and
training physicians on all levels, from medical students to
practicing physicians. Among the trainees are nine clinical
fellows in oncology who participate actively in both clinical
research and patient care. Five physician specialists and
private physicians are involved directly with patient care in the
Debbie Probst Day Care Center. Numerous others participate in
the protocol studies.

The Division of Oncology also firmly believes that
excellence in patient care and in teaching programs is best
achieved where there is a continuing pursuit of new knowledge.
Each of the six full-time faculty members in the Division is
actively engaged in cancer research. The clinical research
efforts are concerned with the refinement and development of more
effective methods of treatment. New chemotherapy is being sought
and tested. Better combinations of chemotherapy, and of
chemotherapy with other methods (surgery, radiation,
immunotherapy), are also being developed.

Debbie Probst Oncology Day Care Center

The Division’s new, modern, outpatient clinic was designed
in response to the physical and emotional needs of cancer
patients undergoing chemotherapy. Located on the lower level of
the Stanford Hospital, it is designed as a self-contained unit,
convenient and comfortable for both patients and attending
medical personnel.

Three kinds of treatment rooms are provided, including some
for observation or for lengthy (six to eight hours) infusions
that formerly had required hospitalization. Efficient service to
patients is facilitated by a television monitoring system (see
discussion of Motorola system below), a computer-based medical
record system (see discussion of TOD below), and facilities for
preparing chemotherapy, analyzing blood, and viewing x-rays.

Information Display System

When the Oncology Day Care Center was designed, plans were
made for an automated scheduling and information display system.
This system was developed in conjunction with the Motorola
company and is now in operation in the clinic. The
microprocessor-based system signals alphanumeric video
information to remote locations via video cables. Scheduling
secretaries keep appointment records on an associated floppy
disc, and on any given day four video display monitors in the
oncology conference room are used to display the day’s schedule,
relevant lab test results for the outpatients being seen that
day, room assignments, and the name of the oncologist who will be
attending each patient. At present all data are entered by
secretarial personnel and there is no hands-on interaction
between the physicians and the small computer.

Time-Oriented Databank (TOD)

 

For the last several years the Division of Oncology has
also been using the time-oriented record keeping system (TOD)
originally designed by Dr. J. Fries for use in the Stanford
Immunology Clinic [25],[58]. The data and all TOD programs are
stored in the Stanford campus computer facility, an IBM 370/168.
The emphasis in the design of the TOD system has been the
analysis of large amounts of data on a body of similar patients,
not on interactive record keeping in the clinical setting itself.
Thus there are large amounts of data on Stanford oncology
patients, stored by dates of clinic visits, kept on this distant
computer for retrospective analysis. TOD provides several
programs for statistical analysis of correlations, assessing
prognosis by attribute matching, and assisting with other tasks
that have traditionally required arduous chart review. Since the
data are not currently being used for the care of individual
patients, there may be a time lag of weeks before
transcriptionists extract the relevant data from paper-based
oncology outpatient charts and enter them into the TOD databank.

Oncology Treatment Protocols

As mentioned above, the Division of Oncology is active in
clinical research and has many patients being treated under
research protocols. There are currently about 30 operational
protocols, about half of which are active in the sense that
several patients are enrolled in the treatment plan at any given
time. Many of the protocols are designed and overseen by
Stanford oncologists, but there are also cooperative studies
involving Stanford and several other institutions. In many
cases, the cooperative studies are overseen by the Northern
California Oncology Group (NCOG) which has its headquarters very
near the Stanford campus. Each protocol is described by a
lengthy article, often 45-60 pages, that explains the
justification for the therapeutic approach, outlines criteria for
patient selection for the study, describes therapeutic options,
and details the specific chemotherapy doses, dose modification,
and laboratory and clinical data that must be obtained on each
visit. It is quite impossible for any single individual to know
the details of all 30 protocols. This is a particularly great
problem because the physicians seeing oncology outpatients
include fellows, residents, and medical students; these
individuals have limited oncology experience and, in the case of
house staff and students, generally rotate through oncology for
only 4-8 weeks at a time. (See [41] for discussion of one
approach which emphasizes use by primary care physicians, but has
not emphasized a well-designed human interface.)
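
One small part of the protocol management problem, the data that must be obtained on each visit, can be sketched as structured information. The sketch is hypothetical throughout: PROTOCOL_X, the item names, and the function missing_items are invented for illustration and do not describe any actual Stanford or NCOG protocol.

```python
# Hypothetical protocol descriptions; real protocols run to dozens of pages.
PROTOCOLS = {
    "PROTOCOL_X": {
        "required_each_visit": ["white_count", "platelet_count", "weight"],
    },
}


def missing_items(protocol_id, visit_data):
    """List the protocol-required items not yet recorded for this visit."""
    required = PROTOCOLS[protocol_id]["required_each_visit"]
    return [item for item in required if item not in visit_data]


# A visit for which the platelet count has not yet been entered.
todo = missing_items("PROTOCOL_X", {"white_count": 4.2, "weight": 70})
```

Even this simple check relieves a rotating physician of having to recall which observations a particular protocol demands; dose modification logic would require rules in addition to such lists.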

II.A.3. Rationale

The rationale for the proposed research has largely been
described in previous sections. In short, there has been limited
success of statistical, data retrieval, and decision analysis
programs in dealing with the judgmental knowledge of expert
physicians and the uncertainties of medical data. We have made
encouraging strides in the development of symbolic reasoning
techniques for application to clinical decision making and
believe that the time is now appropriate for the clinical
implementation of such a system. Only then will it be possible
to assess the power of capabilities which have been designed to
make consultation systems acceptable to physicians. Although we
recognize that the short term impact of such systems is limited
by the current state of the art in computer science, the impetus
for appropriate basic research and development of new interactive
techniques will come largely through the lessons learned in
undertaking clinical implementations. Since techniques already
exist that have potential for considerable short-term clinical
impact, we believe it is now appropriate to spend part of our
time on a project for clinical use.

Although our interest is in the development of systems for
offering any kind of subspecialty expertise to primary care
practitioners, the initial application selected has been the
management of complex therapy protocol information in an
outpatient oncology clinic. This domain was selected for a
number of reasons:

(1) There are large amounts of information in the protocols
but relatively little inferential complexity; those problems that
have prevented us from attempting clinical implementation of the
MYCIN System for infectious diseases can therefore largely be
avoided.

(2) There is a small core of faculty members and oncology
fellows who are largely responsible for patients in the day care
center. Hence a relatively small number of individuals will need
to be introduced to the consultation system, and their continuing
roles in the clinic will heighten their chances of becoming
comfortable with computer-based techniques.

(3) There is already an awareness of, and involvement with,
computers in the Oncology Day Care Center (in the form of the
information display system previously described and associated
video display monitors). Thus, although there is not yet hands-
on computer use by oncologists in the clinic, computer-related
hardware is evident and accepted by the clinicians at the outset
of the proposed research. Many fellows and faculty also use the
TOD system for clinical research and thus have limited, but very
positive, experience with computer use.

(4) Although the application of symbolic reasoning
techniques to the protocol management problem will not tax many
of the capabilities we have developed in the MYCIN context, it is
precisely this simplicity which makes the problem appealing as a
first clinical venture. If the information handling task can be
implemented relatively easily within the EMYCIN formalism, as we
expect it can, then we will be able to concentrate initially on
issues of making the system’s reasoning and knowledge base
understandable as well as making the system’s interaction
acceptable to physicians.

(5) The initial investment in establishing a role for
interactive computing in the oncology outpatient setting at
Stanford will have considerable potential for facilitating
interactions between our protocol management system and the
Division of Oncology’s current computer-related efforts (the
information display system, and the time-oriented databank). We
envision some challenging extensions to the consultation program
whereby physicians interacting with the protocol management
system may simultaneously benefit from direct connections between
our computer and the other oncology systems.

II.B. Specific Aims

We propose core research as well as new demonstrations of
the clinical usefulness of present capabilities developed under
MYCIN research.

As has been discussed, we have identified an important
clinical problem in the outpatient oncology clinic at Stanford,
and have begun a collaboration with members of the oncology
division to develop and implement a Protocol Management System
(PMS) for use in the oncology clinic. Our proposal is to
demonstrate that computer-based reasoning and interactive
techniques developed during MYCIN research can be effectively
applied to an important clinical problem, namely the management
of oncology protocol data.

The infectious disease domain with which we have been
involved to date involves complex reasoning and computing
problems that we feel prevent the short term development of a
clinically useful infectious disease consultation system. The
oncology problem, on the other hand, involves large amounts of
knowledge but rather simple reasoning that current techniques
should be able to manage effectively. The complexities of
infectious diseases, however, have provided a particularly
appropriate domain for devising new computing approaches while
analyzing clinical reasoning. These difficult problems remain
major research interests of our group. We propose spending
approximately half our time continuing to work on basic tools for
expert medical consultation systems, using the current content of
the infectious disease knowledge base without any efforts to
extend its scope in the short term.

Specifically, our aims during the five years of proposed
research are:

Artificial Intelligence Objectives

(1) To implement and evaluate recently developed techniques
designed to make computer technology more natural and acceptable
to physicians;

(2) To extend the methods of rule-based consultation
systems to interact with a large database of clinical
information;

(3) To continue basic research into the following problem
areas: mechanisms for handling time relationships, techniques for
quantifying uncertainty and interfacing such measures with a
production rule methodology, approaches to acquiring knowledge
interactively from clinical experts. These are some of the
problems we have identified that have prevented the MYCIN
infectious disease application from being clinically implemented
as yet.

Oncology Clinic Objectives

We plan to develop and implement a Protocol Management
System (PMS), for use in the oncology day care center, with the
following capabilities:

(1) To assist with identification of current protocols that
may apply to a given patient;

(2) To assist with determining a patient’s eligibility for
a given protocol;

(3) To provide detailed information on protocols in
response to questions from clinic personnel;

(4) To assist with chemotherapy dose selection and
attenuation for a given patient;

(5) To provide reminders, at appropriate intervals, of
follow-up tests and films required by the protocol in which a
given patient is enrolled;

(6) To reason about managing current patients in light of
stored data from previous visits of (a) the individual patients
and (b) the aggregate of all "similar" patients.

Advantages over present paper-based protocol files:
(1) Can be kept readily accessible and up-to-date;

(2) Can provide customized patient-specific calculations
and advice not possible with a manual system;

(3) May be augmented to provide important additional
capabilities once interfaced with a patient data base (e.g., the
time-oriented data bank [TOD] already used for retrospective data
analysis by the oncology division);

(4) Can provide customized explanations of protocol
information and the specific recommendations made by the
management system;

(5) Can improve the quality of clinical research by
encouraging enrollment of all patients in an appropriate
protocol, and ensuring that necessary data are obtained to maintain
uniformity of information on patients in the individual study
groups;

(6) Can improve the quality of patient care by:

(6a) Saving time by making protocol information easily
available, thus decreasing the waiting time patients
must now occasionally sustain while physicians track
down necessary protocol information;

(6b) Making certain that important tests are done to
screen for potentially serious toxicity of the powerful
agents used in cancer chemotherapy.

II.C. Methods

II.C.1. Overview

Our general approach to the research will be to emulate the
organizational and technical framework used during development of
several interdisciplinary computing efforts involving Stanford’s
Heuristic Programming Project (HPP), of which Prof. Buchanan is
co-director. The cohesiveness of project workers has always been
facilitated by a weekly group meeting in addition to smaller
working sessions at other times. At group meetings both computer
science and clinical personnel have opportunities to present
their work and give and receive suggestions regarding further
efforts. We believe it is important that the physicians and
computer scientists get to know each other and their motivations
for involvement in the project very well. For example, the
computer scientists working on MYCIN have all learned a great
deal about infectious diseases, and some have even taken formal
courses in microbiology at the medical school. Similarly, the
clinicians have been encouraged to understand the program in
depth and even to try some programming. We would expect similar
relationships to develop among the computer scientists and
oncologists working on the proposed research. Only in this way
can both computer science and clinical concerns be taken
adequately into account during system design and implementation.

In addition to the development of the PMS for the oncology
clinic, we anticipate continued research into the basic science
issues discussed previously. As has been noted, we have already
identified several problems that must be solved before complex
reasoning programs such as MYCIN can be made available for
clinical use. We also anticipate that work in the oncology
domain will uncover new problems, not previously encountered,
that may require significant modification or redesign of the
EMYCIN formalism. Thus we envision two parallel but highly
interrelated efforts:

(1) development of the PMS for the oncology clinic, using
EMYCIN and writing new production rules to embody the protocol
knowledge that will be needed for consultation sessions;

(2) continued mapping of basic science research, from the
core research section of this program, into the oncology problem
domain in order to facilitate complex decision making and
acceptable consultations in the clinical setting.

II.C.2. Oncology Protocol Management System

The first year of research on the PMS will be spent
developing the program before it is made available in the clinic.
Years 2-3 will be devoted to revisions and extensions of the
protocol management system in light of initial experience with a
knowledge base about oncology. Years 4-5 will be devoted to
revisions and extensions of the basic methodology, as well as of
the working system, to facilitate use of a clinical data base for
patient management in oncology and related disciplines. We
expect that the five years will be spent as follows:

(1) We will begin by selecting the 2 or 3 most frequently
used oncology protocols (e.g., oat cell carcinoma of the lung,
Hodgkin’s Disease, non-Hodgkin’s lymphoma). The extensive
knowledge in these documents will be extracted by the oncologists
working closely with those who know the EMYCIN formalism well.
Although much of the knowledge can be represented in typical
EMYCIN production rules, we anticipate that some of the
information may be best contained in alternate representation
schemes. We therefore expect that new techniques for interfacing
EMYCIN production rules with tabular data or algorithmic
structures may be necessary. Most problems that will arise along
these lines should develop during codification of the first few
protocols; since the protocols all follow a similar structured
format, it is unlikely that new problems will arise when the 20th
or 30th protocol is being considered.

(2) EMYCIN’s knowledge acquisition capabilities remain
somewhat rudimentary (see next section), so we expect that most
new rules will be explicitly written by members of the research
group.

(3) Specific attention will be given to extracting
knowledge regarding patient eligibility for a protocol, tests and
films needed at various stages of treatment, therapeutic
alternatives available, and patient-specific indications for
modifying or withholding therapy. We recognize that these are
the protocol details that are often most difficult for the
oncologists to remember or to extract easily from a lengthy
written protocol (an up-to-date copy of which may not even be
readily available in the clinic).

(4) Once the knowledge has been codified, we will begin
internal testing by interfacing the new production rules and
knowledge structures with the EMYCIN program. Of particular
interest will be the adequacy of EMYCIN’s explanation
capabilities when interfaced with this new knowledge base.

(5) Modifications will be made to the EMYCIN system in
response to suggestions made by the clinicians working on the
project as they gain experience with its capabilities. Of
primary concern will be an assurance that the human interface is
sufficiently comfortable that the other Division oncologists will
be willing to experiment with the system once it is introduced in
the clinic.

(6) After these first few protocols are operationally
managed by the PMS as described, the system will be introduced in
the Oncology Day Care Center. Orientation sessions will be given
to the clinic oncologists, and suggestions for further
refinements solicited.

(7) The next 3-5 therapy protocols will then be added to
the system, with appropriate notification to clinic physicians
when a new protocol is available for PMS access.

(8) Based on the experience gathered in codifying the first
several protocols, a protocol-entry system with editor will be
developed. This should greatly facilitate the entry of the
remaining protocols, which we anticipate should be fully codified
by the end of year 2.

(9) Anticipating an interface with the TOD system described
earlier, plus progress in the basic research that we will be
undertaking simultaneously, we will next begin to store patient-
related data in TOD format within the PMS. Much of the
information in the TOD Databank is also required by the PMS, so
there would be minimal if any additional effort required of the
PMS user.

(10) Assuming a breakthrough in the representation and
management of time-dependent variables, we would anticipate that
the PMS capabilities would be greatly augmented by access to
patient data stored in TOD format. During Years 4-5 we would
attempt to begin the implementation of this kind of interface
between TOD and the PMS.

All research described above would occur on a research
computer that could not guarantee reliable service to the
oncology clinic. We therefore recognize that we cannot initially
undertake any tasks crucial to clinic or Division operation. The
clinic must be able to continue to function even when our tool is
unavailable for scheduling or hardware reasons.

Therefore, when the PMS is ready to progress into a more
integral role in clinic operation, we would anticipate, in a
separate proposal, the need for a dedicated machine to permit
reliable clinic service. We recognize that many of the most
interesting and challenging decision making tasks, including
those related to the use of symbolic reasoning techniques in
conjunction with large databases, cannot be made available to
clinicians without a dedicated computer, but that this is beyond
the scope of the present proposal.
Project 2

III. A WORKBENCH FOR KNOWLEDGE REPRESENTATION

III.A. Objectives of the Research and their Significance

Our primary strategy for conducting our investigations has
been to allow the problem to condition the choice of scientific
paths to be explored. Projects One and Three, dealing with
problems in oncology outpatient consultations and with the
clinical laboratory, are the newest examples.

We are also motivated, however, by the importance (to us
and others) of generalizing our techniques and systematizing our
methodology. This is a normal part of the activity of cumulating
the results in our science, in which the experiments we choose to
generalize upon are the experimental systems we construct for
different domains of knowledge. In Computer Science, one
effective method of cumulating our growing understanding is to
construct software packages that are the working manifestation of
what we believe we have come to understand. These packages allow
us to transfer yesterday’s "experimental technique" into
tomorrow’s "tool" for accelerating the research.

These packages also allow investigators in other
institutions to build rather directly upon the results of our
work, thereby amplifying the science as a whole. It is
particularly appropriate to cumulate our knowledge as software
packages in the SUMEX-AIM community in which the users share the
same computer and system.

We have sought to extract from our various projects the
uniformities that have general applicability; to eliminate the
ad-hoc features that accrue in any large-scale programming
effort; and to build helpful "front-end" interfaces that will
allow others to couple smoothly to our work.

A number of such packages are beginning to emerge. We
propose to continue their development and testing, and to merge them
appropriately into a larger software system that (for lack of a
better term) we refer to as the "knowledge representation
workbench".

The Stanford group is fortunate to have the collaboration
of the Missouri group to act as a test-and-evaluation site for
this workbench concept. It is expected that much of the research
of Project Three will be done using the emerging "workbench".

We propose the following major objectives:

1. To develop AI technology as software packages that solve
general classes of problems.

2. To actively disseminate the technology by publication and
by encouraging pilot projects using the technology.

3. To apply these packages to medical applications, forming
collaborations over time as opportunities arise.

III.B. Background and rationale

Artificial intelligence research at the Heuristic
Programming Project has concentrated on programs having real-
world applications. Each program has been a case study for
representing and manipulating the task-specific knowledge for an
application. Feigenbaum [22] has described this case study
approach as essential in building a science for "knowledge
engineering".

Because the cases have been carefully chosen, the
experience from this approach has accumulated. For example, the
GA1 program [53] was developed recently for inferring DNA
structures from enzyme digest data. This program used the
Generate-and-Test paradigm, in which the combinatoric output of
a complete and canonical generator of possible structures is
limited by pruning rules which use the digest data. That basic
approach was pioneered by the DENDRAL [11] program ten years ago.
With DENDRAL as an example, the development of this analogous
program was completed in only two months.
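
The Generate-and-Test paradigm described above can be illustrated with a minimal sketch. It is not the GA1 or DENDRAL algorithm: the toy problem (orderings of fragment lengths), the function names, and the single pruning rule are all assumptions chosen only to show a complete generator whose output is limited by a rule derived from observed data.

```python
from itertools import permutations


def generate(fragments):
    """Complete generator: yield every distinct ordering of the fragments once."""
    seen = set()
    for candidate in permutations(fragments):
        if candidate not in seen:
            seen.add(candidate)
            yield candidate


def cut_sites(candidate):
    """Positions of the internal cuts implied by an ordering of fragment lengths."""
    sites, position = [], 0
    for length in candidate[:-1]:
        position += length
        sites.append(position)
    return sites


def consistent(candidate, observed_sites):
    """Hypothetical pruning rule: every observed cut site must appear in the candidate."""
    return set(observed_sites) <= set(cut_sites(candidate))


def generate_and_test(fragments, observed_sites):
    """Keep only the generated candidates that survive the pruning rule."""
    return [c for c in generate(fragments) if consistent(c, observed_sites)]


# Toy data: three fragments and one cut known to lie at position 3.
survivors = generate_and_test([1, 2, 3], observed_sites=[3])
```

In this sketch the test is applied only to complete candidates. A practical generator would apply pruning rules to partial structures so that whole families of candidates are never built.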

This example shows how the accumulation of theory speeds
the development of new AI programs. Significantly, the Heuristic
Programming Project has also accumulated methods, in the form of
software packages which can perform specific symbolic
computations. These packages are the state-of-the-art tools for
applied artificial intelligence. A trained "knowledge engineer"
can combine these packages to create computer programs for new
applications -- without having to re-program the solution of
standardized subproblems which have been solved before.

EMYCIN (footnote 1) is an example of such a package. It is the domain
independent core of the MYCIN [51] program for the diagnosis of
infectious diseases. EMYCIN provides a framework for building
consultation programs in various domains. It uses a production
rule mechanism and backward-chaining control structure during the
solution phase and has dialogue facilities for acquiring a
production rule knowledge base. An example of an application of
EMYCIN is the PUFF system for diagnosing pulmonary function
disorder. PUFF was the product of a collaboration with the
Pacific Medical Center in San Francisco. The first version of
PUFF was built in the following way. One hundred cases,
carefully chosen to span the variety of disease states, were used
to extract 55 rules. The knowledge base was created with EMYCIN
and then tested with 150 additional cases. Agreement between
PUFF and the human expert was excellent and a later version of
PUFF is now in routine use at PMC. The first version of PUFF was
created in less than 50 hours of interaction with experts at PMC
and with less than 18 man-weeks of effort by the knowledge
engineers. Other applications of EMYCIN will be discussed in
Section III.C.

The example shows that methods, in the form of usable
computer packages, have now been developed. These packages
reflect the commonalities we now perceive among separate
applications. They are the recently available tools of applied
artificial intelligence: programs providing practical symbolic
methods for common problems.

Our current repertoire of "methods" packages also includes
the Unit Package and AGE-1. The EMYCIN program, as discussed
above, is based on production rule technology and has been
successfully applied to diagnosing pulmonary function disorders
and consulting on structural analysis in an engineering
application. The Unit Package [52] is based on the so-called
"frames" approach and is being applied to experiment planning in
molecular genetics. The AGE-1 program is based on the HEARSAY
[28] "cooperating knowledge sources" model and is the product of
experience with the SU/X and SU/P [43] programs.

New applications are currently being developed for each of
these packages. Heiser and Brooks at the University of
California at Irvine are using EMYCIN to develop a
psychopharmacology consultant, termed HEADMED [34]. Blum [5]
has proposed using the Unit Package in a system which will
combine statistical methods and artificial intelligence
techniques to perform studies on a clinical database. Several
other applications have been proposed and are under
consideration.

 

Footnote 1: The name "EMYCIN" comes from "essential MYCIN", the MYCIN
reasoning framework without any domain-specific knowledge.

We propose to continue the development and application of
these packages and to develop new ones as results become
available from core research.

III.B.1. Relating the Workbench to Core Research

Over the five year course of this research, there will be a
movement of topics from core research into developed packages for
the workbench. Our overall strategy has two main thrusts:

1. To expand the problem solving capabilities of the
workbench by developing more sophisticated methods of
symbolic reasoning.

2. To expand the capabilities of existing packages following
core research in other topics -- knowledge acquisition,
knowledge integration, tutoring, and explanation.

This mode of research reflects a bias towards the creation
of systems to perform specific tasks. First an approach to
problem solving is developed and tested in a task domain. Then
research in other topics follows. Three methods of problem
solving are discussed in this proposal and elaborated in the
following. The simplest of these is a backward-chaining
approach, exemplified in EMYCIN, which links together the
premises and conclusions of rules to construct a direct line of
reasoning. The next level of sophistication in these packages is
represented in AGE-1, which is based on the HEARSAY-II [2]
architecture. AGE-1 allows (1) both data-driven and goal-driven
reasoning and (2) reasoning at different levels of abstraction.
This architecture has been used effectively by Stanford
researchers in a signal-processing application [43]. Providing
other AI capabilities, such as explanation or knowledge
acquisition, is more difficult in AGE-1 than in EMYCIN. The
next level of sophistication appears in a proposed "planning
package" which is expected to grow out of on-going research in
the MOLGEN project. This approach to planning formalizes the
selection of what to do next as a choice in any of several
problem-solving "spaces". The viability of the latter problem-
solving method is still being tested and essentially none of the
other system capabilities have been developed.
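
The backward-chaining approach, the simplest of the three methods above, can be sketched in a few lines. This is an illustrative sketch only, not EMYCIN code: the rule representation, the names `RULES` and `prove`, and the sample findings are assumptions. It omits certainty factors and the questioning of the user, and it assumes the rule set contains no cycles.

```python
# Each rule is (premises, conclusion): the conclusion holds if all premises hold.
RULES = [
    (("obstruction", "diffusion_defect", "high_tlc"), "emphysema"),
    (("low_fev1_fvc",), "obstruction"),
    (("low_dlco",), "diffusion_defect"),
]


def prove(goal, facts, rules, trace=None):
    """Return True if goal is a known fact or can be reached by chaining
    backward from the goal through rule conclusions to rule premises.

    Each rule that succeeds is appended to trace, so the finished list
    is the line of reasoning that established the goal.
    """
    if trace is None:
        trace = []
    if goal in facts:
        return True
    for premises, conclusion in rules:
        if conclusion == goal and all(
            prove(p, facts, rules, trace) for p in premises
        ):
            trace.append((premises, conclusion))
            return True
    return False


# Hypothetical findings for one case.
findings = {"low_fev1_fvc", "low_dlco", "high_tlc"}
reasoning = []
established = prove("emphysema", findings, RULES, reasoning)
```

The recorded trace is the kind of information an explanation or knowledge-acquisition facility would need in order to show an expert how a conclusion was reached.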

The following is a list of several AI issues discussed in
this proposal. These will be explored within some formalisms
already developed by us (EMYCIN, AGE-1, and the Unit Package; see
footnote 2) as well as new formalisms, e.g., the Planning
Package, as the need
arises. The planning package is expected to materialize at the
end of some core research which is currently in progress.

Problem Solving
Knowledge Acquisition
Explanation
Tutoring
Knowledge Compiling
Time-Dependence
Meta-Knowledge

III.C. Methods of procedure

This section describes our plan for creating an integrated
collection of well-designed software packages, which can be
combined by a knowledge engineer to meet the needs of a specific
application. In this section we will show examples of each of
the packages and discuss the nature of their applications. We
will also discuss the work proposed for further developing the
packages.

There is a great deal of overlap in the proposed work among
the packages. While the packages reflect different approaches to
problem solving and differ in their state of development,
analogous lines of research are proposed in each. The EMYCIN
package, which is the most developed, uses the simplest
approach to problem solving and has the broadest range of
proposed work following several lines of core research. As
discussed already in Section III.B.1., similar lines of
development are planned later in the grant period for the other
packages.

III.C.1. EMYCIN

The EMYCIN ("Essential MYCIN") project is an attempt to
provide a framework for building consultation programs in various
domains. It uses the domain-independent components of the MYCIN
system, notably the production rule mechanism and backward-chaining
control structure. Then for each particular
consultation domain, the system builder supplies the rules and
parameters of that domain to produce a functioning program. Work
on the EMYCIN project is devoted to providing a useful
environment for the new system builder, with emphasis on speeding
the acquisition and debugging of the knowledge of the new domain.

Footnote 2: The Unit Package is a passive representation package and
does not provide any software for problem-solving. It is being
used, however, as a representation medium for the Planning
Package and can also be used in conjunction with AGE-1.

III.C.1.a. An Example of EMYCIN -- The PUFF Application

The PUFF system interprets laboratory
measurements from the pulmonary function laboratory. The EMYCIN
system was used as the base upon which 60 production rules concerning
the presence of pulmonary disease were created. The data from
over 180 cases were used to create the rules by the pulmonary
physiologist in cooperation with the biomedical engineers who
instrumented the laboratory and Stanford computer scientists who
had previous experience with the MYCIN program.

Figure 1 shows several rules created during the development
of the system. These rules are used to create a complete report
including the input measurements, historical information, and the
measurement interpretation. Figure 2 shows a copy of this report.

IF 0 < DLCO < 80 (DLCO is the measurement of diffusion
capacity for Carbon Monoxide)
THEN "Low diffusing capacity indicates loss of alveolar
capillary surface which is "
IF 70 <= DLCO < 80 THEN "mild"
IF 60 <= DLCO < 70 THEN "moderate"
IF 0 <= DLCO < 60 THEN "severe"

IF The severity of obstructive airways disease of the patient is
greater than or equal to mild, and
The degree of diffusion defect of the patient is greater than
or equal to mild, and
The total lung capacity measured by the body box (TLCB) is
greater than 110 percent of predicted,
THEN "The low diffusing capacity, in combination with
obstruction and a high Total Lung Capacity, would be consistent
with a diagnosis of emphysema."
The subtype of obstructive airways disease is emphysema.

Figure 1. Typical PUFF interpretation rules.
Conclusions are made for internal system use and for
inclusion in the summary.
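
As a rough illustration, the two rules of Figure 1 might be written in executable form as follows. This is a sketch, not PUFF code: the function names are hypothetical, and the thresholds (80, 70, and 60 percent for DLCO, 110 percent for TLCB) are our reading of the figure and should be checked against the original rules.

```python
# Ordering of severity grades used to test "greater than or equal to mild".
SEVERITY = {None: 0, "mild": 1, "moderate": 2, "severe": 3}

EMPHYSEMA_TEXT = (
    "The low diffusing capacity, in combination with obstruction and "
    "a high Total Lung Capacity, would be consistent with a diagnosis "
    "of emphysema."
)


def diffusion_defect(dlco_pct):
    """Grade the diffusion defect from DLCO as a percentage of predicted.

    Returns None when DLCO is 80 percent or more (the rule does not fire).
    """
    if 70 <= dlco_pct < 80:
        return "mild"
    if 60 <= dlco_pct < 70:
        return "moderate"
    if 0 <= dlco_pct < 60:
        return "severe"
    return None


def emphysema_rule(oad_severity, dlco_pct, tlcb_pct):
    """Second rule of Figure 1: return the report sentence if all three
    premises hold, otherwise None."""
    defect = diffusion_defect(dlco_pct)
    if (
        SEVERITY[oad_severity] >= SEVERITY["mild"]
        and SEVERITY[defect] >= SEVERITY["mild"]
        and tlcb_pct > 110
    ):
        return EMPHYSEMA_TEXT
    return None


# Hypothetical case: mild obstruction, DLCO 68% and TLCB 116% of predicted.
summary_line = emphysema_rule("mild", dlco_pct=68, tlcb_pct=116)
```

In the system itself a fired rule both adds text to the report and records a conclusion (the subtype of obstructive airways disease) for use by other rules. The sketch returns only the text.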

PRESBYTERIAN HOSPITAL OF PMC              COE JANE 582
CLAY AND BUCHANAN, BOX 7999               P336666.
SAN FRANCISCO, CA. 94120                  DR. SMITH, JOHN

PULMONARY FUNCTION LAB
WT 56.7 KG, HT 166 CM, AGE 58 SEX F
SMOKING 49 PK YRS,CIG 1.6 PK QUIT 9,PIPE 9 QUIT 4G,
CIGAR 0 QUIT 4
DYSPNEA-W/MILD-MOD. EXER, COUGH-NO, SPUTUM-LT 1 TBS, MEDS-YES
REFERRAL DX-CORONARY ARTERY DISEASE, PRE OP

*************************************************TEST DATE 10-26-78

                                 PREDICTED                  POST DILATION
                                  (+/-SD)    OBSER(%PRED)   OBSER(%PRED)
INSPIR VITAL CAP (IVC)       L   3.1(8.4)    3.9 ( 98)
RESIDUAL VOL (RV)            L   2.1(8.3)    3.9 (149)      3.5 (166)
TOTAL LUNG CAP (TLC)         L   5.2(9.7)    6.9 (116)      6.5 (125)
RV/TLC                       %   49.         49.            53.
FORCED EXPIR VOL (FEV1)      L   2.6(0.3)    2.1 ( 81)      2.1 ( 34)
FORCED VITAL CAP (FVC)       L   3.1(8.4)    2.9 ( 96)      3.9 ( 98)
FEV1/FVC                     %   83.         78.            71.
FORCED EXP FLOW 200-1200   L/S   4.2(8.8)    4.5            4.4
FORCED EXP FLOW 25-75%     L/S   2.9(0.7)    1.5            1.5
FORCED INS FLOW 200-1200   L/S   2.9(8.6)    2.9            2.9
AIRWAY RESIST (RAW) (TLC= 6.0)   1.1(2.5)    1.6 (SIGH)     1.4
DF CAP-HGB=14.4 (TLC= 5.3)       25.         17.2 ( 68)     ( 69% IF TLC= 5.2)

*******************************************************************
INTERPRETATION: Elevated lung volumes indicate
overinflation. In addition, the RV/TLC ratio is increased,
suggesting a mild degree of air trapping. Forced vital capacity
is normal but the FEV1/FVC ratio is reduced, suggesting airway
obstruction of a mild degree. Reduced mid-expiratory flow
indicates mild airway obstruction. Obstruction is indicated
by curvature in the flow-volume loop of a small degree.
Following bronchodilation, the expired flow shows slight
improvement. This is confirmed by the lack of change in
airway resistance. The low diffusing capacity indicates a
loss of alveolar capillary surface, which is moderate.

CONCLUSIONS: The low diffusing capacity, in
combination with obstruction and a high total lung capacity
would be consistent with a diagnosis of emphysema. The
patient’s airway obstruction may be caused by smoking.
Discontinuation of smoking should help relieve the symptoms.

PULMONARY FUNCTION DIAGNOSIS:
1. Mild Obstructive Airways Disease.
Emphysematous type.
Robert Fallat, M.D.
Figure 2. Sample PUFF Report

III.C.1.b. Applications of EMYCIN

To date, EMYCIN has been successfully applied at Stanford
to the domains of pulmonary function (PUFF) [37] and structural
analysis (SACON) [3] (see footnote 3). EMYCIN is also being applied
to clinical psychopharmacology [34] at the University of California
at Irvine.

Footnote 3: SACON (Structural Analysis Consultation): The purpose of
the consultation is to provide advice to a structural engineer
regarding the use of a structural analysis program called MARC.
The MARC program uses finite-element analysis techniques to
simulate the mechanical behavior of objects. The engineer
typically knows what he wants the MARC program to do, e.g.
examine the behavior of a specific structure under expected
loading conditions, but does not know how the simulation program
should be set up to do it. The MARC program offers a large (and,
to the novice, bewildering) choice of analysis methods, material
properties, and geometries that may be used to model the
structure of interest. The user must learn to select from these
options an appropriate subset that will simulate the correct
physical behavior, preserve the desired accuracy, and minimize
the (typically large) computational cost. The goal of the SACON
program is to bridge this gap, by recommending an analysis
strategy. This advice can then be used to direct the MARC user
in the choice of specific input data, e.g. numerical methods and
material properties.

The performance of the SACON program matches that of a human
consultant for the limited domain of structural analysis problems
that was initially selected. To bring the SACON program to its
present level of performance, about two man-months of the
expert’s time were required to explicate his task as a consultant
and formulate the knowledge base, and about the same amount of
time was spent implementing and testing the rules (this estimate
does not include the necessary time devoted to meetings, problem
formulation, demonstrations and report writing).

III.C.1.c. Proposed Work for EMYCIN

SYSTEM-BUILDING TOOLS

1) Acquisition of Knowledge -- Acquire the
framework, vocabulary, and decision rules of the
domain from the expert.

2) Rule Checking -- Check syntax and semantics of
new rules and check for possible conflict with
existing rules.

3) Alternative Models for Reasoning under
Uncertainty -- Provide the system builder with a
fixed set of alternative methods for propagating
degrees of certainty in the reasoning chains.

4) Time-Dependent Features -- Enable the system to
make use of parameters whose values change with time.

5) Meta Knowledge -- Add capabilities for using
meta-rules and other meta-level knowledge.
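
Item 3 above, a fixed set of alternative methods for propagating certainty, could be offered to the system builder as a table of interchangeable combining functions. The sketch below is an assumption about how such a choice might look, not a description of EMYCIN. The first function is the combining rule commonly associated with MYCIN-style certainty factors (values between -1 and 1), which is not necessarily the function in use at the time of this proposal; the second alternative and all names are hypothetical.

```python
def cf_combine(x, y):
    """Combine two certainty factors in the style commonly associated
    with MYCIN-like systems. Inputs and result lie in [-1, 1].

    Undefined (division by zero) when one input is 1 and the other is -1.
    """
    if x >= 0 and y >= 0:
        return x + y * (1 - x)
    if x <= 0 and y <= 0:
        return x + y * (1 + x)
    return (x + y) / (1 - min(abs(x), abs(y)))


def strongest_combine(x, y):
    """A simpler hypothetical alternative: keep the stronger evidence."""
    return x if abs(x) >= abs(y) else y


# The fixed set of methods from which a system builder would choose.
COMBINERS = {
    "certainty-factor": cf_combine,
    "strongest-evidence": strongest_combine,
}


def propagate(evidence, method):
    """Fold a sequence of certainty values into one using the chosen method."""
    combine = COMBINERS[method]
    total = 0.0
    for cf in evidence:
        total = combine(total, cf)
    return total


# Two rules support the same conclusion with certainty 0.6 and 0.5.
by_cf = propagate([0.6, 0.5], "certainty-factor")
by_max = propagate([0.6, 0.5], "strongest-evidence")
```

Keeping the combining function separate from the rule interpreter means the rest of the system need not change when a different model of uncertainty is selected.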

In addition, we propose extending the power and flexibility
of the present system in the following ways:

DOMAIN-INDEPENDENT CONSULTATION SYSTEM

1) Answering Questions -- Incorporate question-
answering capabilities into the system.

2) Tutoring -- Couple the system to a tutoring
program to teach the contents of the knowledge base.

Many of these items involve substantial research before we
understand the best way to add them to the program or even what,
precisely, needs to be added. We present below our best ideas on
the approach we will take, but wish to emphasize that the nature
of the solution may change as our research progresses.

The products of the research will be presented in
scientific papers and in an integrated computer program that can
be used by scientists to encode their own knowledge of their
domains for reasoning about difficult problems.


III.C.1.d. Acquisition of Knowledge

 

The preliminary facilities for acquiring knowledge (called
TEIRESIAS [Davis76]) developed in the context of the MYCIN
application will be incorporated into EMYCIN for use by experts
when building any consultation system. This facility will allow
an expert to specify the major parameters of a consultation.
Then, following a consultation, the system will show the expert
the values of these parameters, and ask for verification that
they are correct. If the values are not correct, the system will
explain to the expert the line of reasoning that led to the
incorrect values. This allows the expert to pinpoint an error in
the system’s rule set, which the expert can then repair by
adding, deleting, or modifying rules.

In addition to incorporating the existing rule-acquisition
facility, we plan to automate the acquisition of a large portion
of the initial knowledge that is required in building a
consultation system. The system will prompt an expert through an
intermediary for the conceptual framework, vocabulary, and major
lines of reasoning of the domain before any rules are entered.
The conceptual framework includes the definition and hierarchy of
objects or states that will be used to structure the reasoning
process (called the "context tree") as well as the attributes and
values of these objects that will be used for writing rules.
Numerous internal pointers needed for correct associations among
concepts will be set up automatically at this time.

Improvements to Teiresias

 

The TEIRESIAS facility, for interactively debugging the
rule base, is most useful when the knowledge base is reasonably
well developed and the necessary changes to the rule and
parameter base are small. This facility is currently being
improved primarily by using the existing question-answering
system to explain the system’s lines of reasoning [48], and by
using a new English parser based on a semantic grammar to
understand any rule additions or changes from the expert [8].

An EMYCIN sketchpad

As a result of our recent experience
eliciting a rule base for structural mechanics [3], we have
found it useful to characterize the knowledge acquisition process
as occurring in a number of distinct phases.

The first phase corresponds to making initial decisions about the
typical advice the consultant will give and the major
reasoning steps the consultant will use.


This is followed by an extended period of defining parameters and
objects and then, using this initial domain vocabulary,
developing a substantial portion of the rule base. This
process, lasting approximately 2 months in the structural
analysis case, captures enough domain expertise to allow the
consultation system to give advice on the large number of
common cases.

In the final phase, further interactions with the expert tend to
refine and adjust the established rule base, primarily to
handle more obscure or complicated cases.

Future research on knowledge acquisition will explore the
design and implementation of interactive facilities to be used
during the early phases of the knowledge base design. In
particular, methods will be developed for rapidly acquiring and
manipulating definitions of the context tree of objects, their
major parameters, as well as the major problem solving strategies
to be used by the consultant.

During the initial passes at defining objects, the system
would begin to acquire some detail about the actual methods (the
rule sets) that will be used to reason about the major parameters
of the consultation. For each of these parameters the expert
typically knows what major factors and subgoals will be relevant
to concluding the parameter. These factors can be specified by
the expert, but need not be acquired in detail until the system
actually must begin gathering the rules for determining these
important parameters. In this manner, the expert can be free to
concentrate on the more general aspects of the problem solving
process without having to be bothered with the specification of
detail.

Using the EMYCIN sketchpad, the expert and intermediary
would develop and acquire substantial portions of the knowledge
base and an explicit representation of the overall reasoning
strategy that the program will use to advise about the user’s
problem. This framework and knowledge of overall strategy can be
used later to motivate explanations of the system’s lines of
reasoning produced by the question-answering system. We intend
to investigate ways that this knowledge about the major
parameters could be used by TEIRESIAS (during the later phases of
the knowledge acquisition process) to explain how and why a
particular incorrect conclusion was made.

Rule Checking

While the production rule format permits any executable
LISP expression as the premise or action of a rule, not all LISP
forms make reasonable rules. Common syntactic errors include
misspellings, misplaced arguments, parenthesis errors and
incorrect classification of the rule; such errors generally
result from inaccurately inputting the rule, and if left
undetected, may cause the rule to fail, or even cause runtime
errors. Semantic errors can result if a new rule is inconsistent
with existing rules, or is incomplete, failing to take into
account all the factors necessary for the conclusion.

We plan to do extensive checking of each new rule entered
into the system. We hope thereby to catch most errors at rule
entry time, rather than finding them during later consultation
runs when it is harder to (a) isolate the effects of a faulty
rule and (b) correct any problems which result.

Syntactic checking is fairly straightforward. The rule
checker needs to know about the syntax of each argument to the
predicates which make up a rule. This knowledge exists in the
form of predicate templates, which have long been used by other
parts of the system to "read" rules. The rule checker’s use for
them is, in effect, to make sure the rules are "readable". For
example, the template for the predicate SAME is

        (SAME CNTXT PARM VALUE),

for which a typical instance from the infectious disease domain
might be

        (SAME CNTXT IDENT E.COLI).

The rule checker knows from this that a call to SAME should
have three arguments: the first must be a legal "context atom",
i.e., a variable used to select a binding in the context tree,
the second must be a parameter, and the third must be a legal
value for that parameter. If any of these is incorrect, the
error is easily detectable, and in many cases correctable.
Simple spelling errors may be corrected by invoking INTERLISP’s
spelling corrector, using an appropriate spelling list; e.g., for
the PARM slot use the list of all parameters, for the VALUE slot
use the list of values legal for the parameter appearing in the
PARM slot. Transposed arguments and spurious extra arguments
(typically a result of parenthesis errors) are also easily
detected by checking against the template.
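
The template check described above can be sketched as follows. This is a minimal illustration in Python rather than INTERLISP; the tiny knowledge base, the helper names (`check_clause`, `closest`), and the use of Python's `difflib` in place of INTERLISP's spelling corrector are all assumptions made for the example, not the system's actual code.

```python
import difflib

# Hypothetical miniature knowledge base (illustrative names only).
CONTEXT_ATOMS = {"CNTXT"}
PARAMETERS = {"IDENT": {"E.COLI", "PSEUDOMONAS"}, "GRAM": {"GRAMPOS", "GRAMNEG"}}
TEMPLATES = {"SAME": ("CNTXT", "PARM", "VALUE")}


def closest(word, candidates):
    """Return the nearest legal spelling, or None if nothing is close."""
    matches = difflib.get_close_matches(word, sorted(candidates), n=1, cutoff=0.6)
    return matches[0] if matches else None


def check_clause(clause):
    """Check one clause such as ("SAME", "CNTXT", "IDENT", "E.COLI").

    Returns (corrected clause or None, list of messages).
    """
    predicate, *args = clause
    template = TEMPLATES.get(predicate)
    if template is None:
        return None, [f"unknown predicate {predicate}"]
    if len(args) != len(template):
        return None, [f"{predicate} expects {len(template)} arguments, got {len(args)}"]
    messages, fixed, parm = [], [], None
    for slot, arg in zip(template, args):
        if slot == "CNTXT":
            legal = CONTEXT_ATOMS
        elif slot == "PARM":
            legal = set(PARAMETERS)
        else:  # VALUE: legal values depend on the parameter in the PARM slot
            legal = PARAMETERS.get(parm, set())
        if arg not in legal:
            guess = closest(arg, legal)
            if guess is None:
                return None, messages + [f"illegal {slot} argument {arg}"]
            messages.append(f"corrected {arg} to {guess}")
            arg = guess
        if slot == "PARM":
            parm = arg
        fixed.append(arg)
    return (predicate, *fixed), messages
```

The spelling list is chosen per slot, as in the text: all parameters for the PARM slot, and only the values legal for the chosen parameter for the VALUE slot.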

Another common syntactic error is incorrect classification
of a rule, i.e., specification of what type of context it may
apply to. In many cases it is possible for a rule checker to
completely determine the correct classification, simply by
observing which parameters appear in the rule and comparing with
the known structure of the context tree. At worst, the checker
could narrow down the possibilities to a small set of nodes of
parallel structure.
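
One way such a classification might be computed is sketched below in Python. The context tree, the parameter-to-context table, and the function names are hypothetical; the sketch only assumes that each parameter is attached to one or more context types and that a rule can apply at any node from which all of its parameters are reachable by walking up the tree.

```python
# Hypothetical context tree: context type -> parent type (None marks the root).
PARENT = {
    "PATIENT": None,
    "CURCULTURE": "PATIENT",
    "PRIORCULTURE": "PATIENT",
    "CURORGANISM": "CURCULTURE",
    "PRIORORGANISM": "PRIORCULTURE",
}

# Hypothetical table: parameter -> context types it can describe.
PARAM_CONTEXTS = {
    "AGE": {"PATIENT"},
    "SITE": {"CURCULTURE", "PRIORCULTURE"},
    "IDENT": {"CURORGANISM", "PRIORORGANISM"},
}


def ancestors_or_self(node):
    """Return the node followed by its ancestors up to the root."""
    chain = []
    while node is not None:
        chain.append(node)
        node = PARENT[node]
    return chain


def classify_rule(parameters):
    """Return the shallowest context types a rule with these parameters can apply to."""
    candidates = set()
    for node in PARENT:
        reachable = set(ancestors_or_self(node))
        if all(PARAM_CONTEXTS[p] & reachable for p in parameters):
            candidates.add(node)
    # Drop any candidate that has another candidate among its ancestors.
    return sorted(
        n for n in candidates
        if not any(a in candidates for a in ancestors_or_self(n)[1:])
    )
```

With these tables, a rule mentioning only AGE is classified uniquely, while a rule mentioning AGE and SITE is narrowed to the two culture nodes of parallel structure, which would then have to be resolved by asking the expert.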

More subtle errors arise from fundamental "semantic" errors
in a new rule, and the processing required to detect such errors
is correspondingly more complex. One major type of semantic
error is inconsistency of a new rule with existing rules. One
rule might subsume another, i.e., one premise is implied by
another. For example, with the two rules

        A -> X

        A & B -> X,
the first subsumes the second. The error here is that if the
second rule succeeds, the first will also, and the information A
is contributing twice to the conclusion X. Our certainty factor
model is predicated on rule premises being independent;
subsumption is a blatant violation of that assumption.
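
A simple subsumption test can be sketched as a subset comparison of premises. The Python below is illustrative only; it assumes each rule is reduced to a set of premise clauses and a single conclusion, which ignores certainty factors and any structure inside the clauses.

```python
from itertools import permutations


def subsumption_pairs(rules):
    """Find pairs (general, specific) where both rules reach the same
    conclusion and the general rule's premise is a proper subset of the
    specific rule's premise.

    rules: dict mapping rule name -> (set of premise clauses, conclusion)
    """
    pairs = []
    for (name1, (prem1, concl1)), (name2, (prem2, concl2)) in permutations(rules.items(), 2):
        if concl1 == concl2 and set(prem1) < set(prem2):
            pairs.append((name1, name2))
    return pairs


# The two rules from the text, plus an unrelated rule (names are hypothetical).
example_rules = {
    "RULE1": ({"A"}, "X"),
    "RULE2": ({"A", "B"}, "X"),
    "RULE3": ({"B"}, "Y"),
}
flagged = subsumption_pairs(example_rules)
```

Here `flagged` holds the single pair (RULE1, RULE2): whenever RULE2 succeeds, RULE1 succeeds as well, so the evidence A would be counted twice toward X.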

Another possibility is that one rule might contradict
another rule or rules. This is trickier. Certainly the two
rules

        A -> X

        A -> ~X
contradict each other. But such obvious contradictions are
fairly unlikely; more subtle interactions can occur. For
example, given a set of rules

        A -> B, B -> C

        A -> D, D -> ~C
it is difficult to determine whether there is a contradiction
except in the special case that all the rules have definite
conclusions (CF=1.0). But if the confidence attached to those
conclusions is less than definite, there may be no direct
contradiction at all, merely conflicting tendencies, perfectly
admissible under our certainty factor model. We plan to
investigate means of analyzing rules to uncover possible
contradictions, measure how great a conflict may exist, and ways
to determine if the conflict is a real problem.
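
The special case of definite conclusions can be checked mechanically by chaining. The following Python sketch is an assumption-laden illustration: rules are restricted to a single premise literal, negation is written with a leading "~", and only rules with CF = 1.0 are followed. It does not attempt the harder problem of measuring conflict between less-than-definite rules.

```python
def negate(literal):
    """Return the negation of a literal written as "C" or "~C"."""
    return literal[1:] if literal.startswith("~") else "~" + literal


def consequences(start, rules):
    """All literals derivable from `start` using only definite (CF = 1.0) rules."""
    derived, frontier = {start}, [start]
    while frontier:
        current = frontier.pop()
        for premise, conclusion, cf in rules:
            if premise == current and cf == 1.0 and conclusion not in derived:
                derived.add(conclusion)
                frontier.append(conclusion)
    return derived


def definite_contradictions(rules):
    """Return sorted (premise, literal) pairs where the premise definitely
    implies both the literal and its negation."""
    found = set()
    for premise in {rule[0] for rule in rules}:
        derived = consequences(premise, rules)
        for literal in derived:
            if not literal.startswith("~") and negate(literal) in derived:
                found.add((premise, literal))
    return sorted(found)


# The rule set from the text, first with definite conclusions ...
definite = [("A", "B", 1.0), ("B", "C", 1.0), ("A", "D", 1.0), ("D", "~C", 1.0)]
# ... and then with one conclusion weakened to a mere tendency.
hedged = [("A", "B", 1.0), ("B", "C", 1.0), ("A", "D", 1.0), ("D", "~C", 0.6)]
```

For `definite` the check reports that A leads to both C and ~C; for `hedged` it reports nothing, which matches the observation that weaker rules give conflicting tendencies rather than a direct contradiction.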

Another type of semantic error may occur if a rule fails to
take into account all the information relevant to a conclusion.
The system can sometimes detect this by means of rule models,
which currently consist of statistical observations of the
correlation of parameter occurrences in existing rules [15].
These rule models are constructed automatically by reading the
rules. As a typical use, if rules mentioning parameter x usually
also mention parameter y, then the system might request
confirmation of a new rule which considers only x. We plan to
increase the richness of the rule model language, to enable
better semantic checking of the user’s rules, especially during
early acquisition phases, when there do not exist sufficient
rules to form useful rule models on purely statistical grounds.
For example, the user might wish to describe in some brief
fashion the sort of rules he is about to enter, and the system
could then make sure the rules are actually consistent with the
user’s model.
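
A purely statistical rule model of the kind described can be sketched as a table of co-occurrence fractions. The Python below is a simplified illustration: rules are reduced to the set of parameters they mention, and the 0.75 threshold and the parameter names are arbitrary assumptions.

```python
from collections import Counter
from itertools import permutations


def build_rule_model(rules):
    """Map (x, y) to the fraction of rules mentioning x that also mention y.

    rules: iterable of sets of parameter names, one set per existing rule.
    """
    single, pair = Counter(), Counter()
    for params in rules:
        single.update(params)
        pair.update(permutations(sorted(params), 2))
    return {(x, y): count / single[x] for (x, y), count in pair.items()}


def missing_parameters(new_rule, model, threshold=0.75):
    """Parameters that usually accompany those in new_rule but are absent from it."""
    return sorted({
        y for (x, y), fraction in model.items()
        if x in new_rule and y not in new_rule and fraction >= threshold
    })


# Hypothetical existing rules, each reduced to the parameters it mentions.
existing = [{"SITE", "GRAM"}, {"SITE", "GRAM", "MORPH"}, {"SITE", "GRAM"}, {"SITE"}]
model = build_rule_model(existing)
```

With this model, a new rule that mentions only SITE would prompt a request for confirmation because GRAM accompanies SITE in three of the four existing rules. With only a handful of rules such fractions are unreliable, which is the motivation for a richer rule model language in the early acquisition phases.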

III.C.1.e. Alternative Models for Reasoning under
Uncertainty

The method developed for ranking MYCIN’s hypotheses based
on measures of certainty is an approximate method. It developed
from a pragmatic need for measuring the degree of confirmation of
a hypothesis based on several non-independent (partially
overlapping) pieces of evidence. The certainty factor (CF) model
discussed above is a means of combining single "certainty
factors" associated with each inference to arrive at a reasonable
measure of how strongly the evidence supports each hypothesis.

It is reasonably simple to understand. However, its main
drawback lies in the difficulty of associating a CF with a single
rule. Because the rules are not independent, the CFs are also
not independent. This means that adding a new rule involves
looking at similar rules in order to decide how high the CF ought
to be set.
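
To make the dependence concrete, the sketch below uses the combining function commonly published for MYCIN-style certainty factors. That this is exactly the function in the system described here is an assumption, and the mixed-sign branch in particular follows a later published revision; the code is in Python purely for illustration.

```python
def combine_cf(x, y):
    """Combine two certainty factors in [-1, 1] bearing on the same hypothesis.

    Assumed form (commonly published for MYCIN-style systems):
      both non-negative:  x + y * (1 - x)
      both non-positive:  x + y * (1 + x)
      mixed signs:        (x + y) / (1 - min(|x|, |y|))
    """
    if not (-1.0 <= x <= 1.0 and -1.0 <= y <= 1.0):
        raise ValueError("certainty factors must lie in [-1, 1]")
    if x >= 0 and y >= 0:
        return x + y * (1 - x)
    if x <= 0 and y <= 0:
        return x + y * (1 + x)
    smaller = min(abs(x), abs(y))
    if smaller == 1.0:
        raise ValueError("definite evidence both for and against: undefined")
    return (x + y) / (1 - smaller)


# Two rules that restate largely the same evidence, each with CF 0.6,
# push the hypothesis well beyond what either supports alone.
double_counted = combine_cf(0.6, 0.6)
```

Because the function treats its inputs as independent contributions, two overlapping rules with CF 0.6 combine to about 0.84. An expert adding the second rule therefore has to lower its CF in light of the first, which is the difficulty noted above.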

For some experts (or problem areas), CFs seem to be more
difficult to use than for others. Thus we propose to offer the
system builder a choice of evidence accumulation methods. One of
them will be the CF scheme already in use. A second will be the
likelihood ratio scheme used in the PROSPECTOR system [18],
although that requires storing two measures with every inference:
P[H/E] and P[H/~E].

A third method will be a very simple additive measure with
thresholding, as proposed by one of the physicians working with
MYCIN. In this model, measures of positive and negative evidence
are added and subtracted into a total for each hypothesis, with
action taken on the hypotheses in the end that lie above the
threshold.
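
The additive scheme is simple enough to sketch in a few lines. The Python below assumes integer evidence weights and a single global threshold; the hypothesis names and weights are invented for illustration.

```python
def select_hypotheses(evidence, threshold):
    """Sum positive and negative evidence per hypothesis and keep those
    whose total lies above the threshold.

    evidence: iterable of (hypothesis, weight); negative weights count against.
    """
    totals = {}
    for hypothesis, weight in evidence:
        totals[hypothesis] = totals.get(hypothesis, 0) + weight
    return sorted(h for h, total in totals.items() if total > threshold)


# Hypothetical evidence gathered during one consultation.
evidence = [
    ("E.COLI", 3), ("E.COLI", 2), ("E.COLI", -1),
    ("KLEBSIELLA", 4), ("KLEBSIELLA", -2),
]
chosen = select_hypotheses(evidence, threshold=3)
```

E.COLI totals 4 and is kept; KLEBSIELLA totals 2 and is dropped.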

Under other funding we are exploring other relationships
between evidence and hypotheses. As measures are found that can
be fit to new problem areas we will find ways of adding them to
the set of available confirmation methods. The important point
here is to give the system builder a choice of evidence
accumulation schemes, any of which can be used in EMYCIN.


Time-Dependent Features

A consultation system built under the current design of
EMYCIN takes a snapshot of the available information about a case
and makes a one-time evaluation of the situation. In cases where
the nature of the diagnosis or repair is strongly dependent on an
understanding of the process of failure over time, this static
approach to the problem is inadequate. No provision is made in
the present system for considering the same case several days
later when more information is available or when the values of
some parameters have changed.

The system also lacks a mechanism for dealing with
parameters whose values vary with time. In many domains, time
considerations may be crucial to the solution of even the
simplest problem. For example, it might be critical to track the
values of various parameters over a period of time, or to check
what value existed at a particular time in the past.

In order to increase the number of domains in which EMYCIN
systems will be useful, we plan to add two new features. The
first is a "restart" mechanism that will allow a user to run a
follow-up consultation on a stored case, adding information that
has become available since the original consultation, and
correcting old answers that are no longer accurate. The second
is to expand the syntax and semantics of rules to deal with
values of parameters changing over time.

Follow-up Consultations

 

The builder of an EMYCIN system should be able to specify
which parameters are likely to change for a given case from one
consultation to the next. In a follow-up consultation, the
system should summarize its knowledge of the case and do the
following three things:

1) ask whether new information is available
for any of the parameters which are subject to
change, and prompt for the new answers;

2) ask whether values are known for any of
the parameters whose values were UNKNOWN at the
time of the previous consultation, and prompt for
the new answers;


3) allow the user to specify changes which
may have occurred in the values of any other
parameters (viz., those which do not usually
change).
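
The three steps above amount to partitioning the stored parameters of a case. A minimal Python sketch, assuming a case is stored as a dictionary of parameter values and that the system builder supplies the set of changeable parameters (all names here are hypothetical):

```python
UNKNOWN = "UNKNOWN"


def follow_up_questions(case, changeable):
    """Group the stored parameters of a case by follow-up step.

    case: dict mapping parameter -> value stored at the previous consultation
    changeable: parameters the system builder marked as likely to change
    Returns (step1, step2, step3) as sorted lists of parameter names.
    """
    step1 = sorted(p for p in case if p in changeable)
    step2 = sorted(p for p in case if case[p] == UNKNOWN and p not in changeable)
    step3 = sorted(p for p in case if p not in step1 and p not in step2)
    return step1, step2, step3


stored_case = {"TEMPERATURE": 101.2, "WBC": UNKNOWN, "IDENT": UNKNOWN, "NAME": "J. DOE"}
steps = follow_up_questions(stored_case, changeable={"TEMPERATURE", "WBC"})
```

A parameter that is both changeable and UNKNOWN is asked about once, in step 1. Parameters in the third group are not prompted for but remain open to correction by the user.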

Extending the Rule Syntax and Semantics to Deal with Time
Relations

 

The builder of an EMYCIN system should be asked to classify
parameters according to their stability over time. A possible
classification scheme is shown below.

1) Constant - value is always the same (e.g., Name and
Sex of medical patients)

2) Regularly changing - new value is available at
regular intervals; there will be several values stored
for the parameter, each with a time (e.g., barometric
pressure at a certain city)

3) Gradually refined - value is likely to change over
time, from unknown to uncertain to definite (e.g.,
Identity of an organism growing on a culture plate)

Parameters of the first type are the typical case that
EMYCIN now handles. For the second type, a time must be kept
with each value-CF pair. The third type of parameter will
typically change from one consultation to the next, and previous
values will be discarded as new information becomes available.

New PREMISE and ACTION functions must be defined so that
EMYCIN rules can handle time-varying parameters. Functions will
be needed to test and conclude (a) the value of a parameter at a
given time, (b) the duration of a particular condition (e.g., it
has been raining for three hours), and (c) trends in the values
of numeric parameters (e.g., the volume of water in the tank has
increased within the last hour). As we test EMYCIN in different
domains, we may discover other types of tests and conclusions
that must be made on time-dependent parameters.
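
The three kinds of test listed above can be sketched against a time-stamped store of values. The Python class below is a hypothetical illustration, not a proposed EMYCIN interface; times are plain numbers (e.g., hours) and each value carries a CF as suggested for regularly changing parameters.

```python
import bisect


class TimedParameter:
    """Values of one regularly changing parameter, kept in time order."""

    def __init__(self):
        self.times = []     # sorted observation times
        self.records = []   # (value, cf) pairs aligned with self.times

    def record(self, time, value, cf=1.0):
        index = bisect.bisect_right(self.times, time)
        self.times.insert(index, time)
        self.records.insert(index, (value, cf))

    def value_at(self, time):
        """(a) Most recent value recorded at or before `time`, else None."""
        index = bisect.bisect_right(self.times, time)
        return self.records[index - 1][0] if index else None

    def duration(self, predicate, now):
        """(b) How long `predicate` has held continuously up to `now`."""
        index = bisect.bisect_right(self.times, now)
        start = None
        for t, (value, _cf) in zip(reversed(self.times[:index]),
                                   reversed(self.records[:index])):
            if not predicate(value):
                break
            start = t
        return 0 if start is None else now - start

    def increased_within(self, window, now):
        """(c) True if the value at `now` exceeds the value `window` earlier."""
        before, after = self.value_at(now - window), self.value_at(now)
        return before is not None and after is not None and after > before


raining = TimedParameter()
for hour, is_raining in [(0, False), (1, True), (2, True), (4, True)]:
    raining.record(hour, is_raining)

tank_volume = TimedParameter()
tank_volume.record(0, 100)
tank_volume.record(1, 120)
```

With these observations, `raining.duration(lambda v: v, now=4)` gives three hours and `tank_volume.increased_within(1, now=1)` is true, mirroring the two examples in the text.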

Add Capabilities for Using Meta-Rules and other Meta-Level
Knowledge

 

Our preliminary research with meta-level knowledge [15] as
well as our preliminary experience with the GUIDON tutorial
program has shown the importance of acquiring, using and teaching
structural and strategic meta-knowledge, as well as the domain
rules. Structural meta-knowledge provides a framework that sets
the context for domain rules, and in tutoring helps make the
rules memorable to a student. It might include patterns and
principles that are made specific by groups of rules. Strategic
meta-knowledge constitutes planning knowledge for using the rules
to solve different problems [19]. This meta-knowledge is written
as meta-rules and takes the form of diagnostic reasoning
strategies and domain-dependent approaches for efficient
consideration of a case.

In our work with EMYCIN, we will explore various kinds of
structural and strategic meta-knowledge that are appropriate to
the production rule representation and useful for explaining
decisions made by the program (to a consultation user or a
student). We will start by implementing in EMYCIN the
capabilities for using the meta-level knowledge described by
Davis: meta-rules to be used for pruning and reordering the
object-level rules, and meta-level models of rule sets that aid
in debugging (and tutoring) the domain knowledge.

Experience with EMYCIN programs like HEADMED and PUFF will
provide us with particularly useful case studies of possible
forms of meta-knowledge.

Incorporating Question-Answering Facilities into the System

In order to make the question-answering facility available
to an EMYCIN consultation system, the system must be provided
with a dictionary of synonyms and a list of definitions of the
important concepts in its domain of expertise. The
dictionary will contain common synonyms in the domain, pointers
between English words and parameters, and common phrases in the
domain that can be given a single specified meaning.

We will provide a facility for automatically constructing a
dictionary from the parameters in the knowledge base. The system
builder will also be able to add synonyms and fill in parts of
the dictionary that cannot be created automatically. This should
provide all the information necessary for answering standard
questions about the consultation system. The kinds of questions
that the system will be able to answer are:

1) the value of a parameter

2) how a parameter was used or concluded in the
consultation

3) how a parameter is used or concluded in general

4) how a rule was used in the consultation

5) why a question was asked during the consultation

6) the translation (into English) of a rule

7) the definition of a concept

These question types will be recognized in a variety of
forms. For example, all of the following will be taken to be
equivalent ways of asking for the value of a parameter:

1) What is the value of X?

2) Is Y the value of X?

3) What is X?

4) Do you know what X is?
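
As a rough illustration of mapping these surface forms onto one question type, the Python sketch below uses ordinary regular expressions. This is a stand-in technique chosen for brevity, not the dictionary-driven or semantic-grammar parsing described in this proposal, and the pattern list and the `VALUE-OF` tag are hypothetical.

```python
import re

# Ordered so that the more specific "value of" form is tried before "What is X?".
VALUE_PATTERNS = [
    re.compile(r"^what is the value of (?P<parm>.+?)\?$", re.IGNORECASE),
    re.compile(r"^is (?P<value>.+?) the value of (?P<parm>.+?)\?$", re.IGNORECASE),
    re.compile(r"^do you know what (?P<parm>.+?) is\?$", re.IGNORECASE),
    re.compile(r"^what is (?P<parm>.+?)\?$", re.IGNORECASE),
]


def parse_value_question(text):
    """Return ("VALUE-OF", parameter) if the text asks for a parameter's value."""
    text = " ".join(text.split())  # normalize whitespace
    for pattern in VALUE_PATTERNS:
        match = pattern.match(text)
        if match:
            return ("VALUE-OF", match.group("parm").upper())
    return None
```

All four example forms reduce to the same request; a real facility would also consult the synonym dictionary to map the matched phrase onto a parameter of the knowledge base.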

The major benefits of providing these capabilities are that
the user of a consultation system can understand the reasoning
and the designer of the system can find the sources of reasoning
errors.

Coupling a Tutorial System to EMYCIN

Work on the idea of automatic "Transfer of Expertise" from
a human expert to a program [22], [15] has led to important
advances in the representation of knowledge within the program.
These advances have allowed the systems to explain their
reasoning process to users, thus providing the basis for a
tutorial program. We have been building an intelligent computer
aided instruction (ICAI) program [12] that guides a subject
through problems in a complex domain with the goal of
transferring the system’s knowledge of the domain to the student.


Current ICAI techniques like planning the discourse,
modelling the student, and teaching problem solving strategies
all take a natural form in our system. In turn, the system
serves as an excellent environment for experimenting with
unsolved problems in the design of computer-based tutoring.

We have demonstrated the feasibility of using the MYCIN
knowledge base for teaching as well as for consultation, and this
aspect of our research will be continuing during the grant period
under separate funding.*

We have not yet demonstrated the generality of the tutorial
program, GUIDON, in other domains; but we have meticulously
avoided introducing any domain-specific knowledge into GUIDON’s
control structure and teaching strategies. We believe that its
design is as general as MYCIN’s. Thus, all that is needed for
tutoring in another domain will be (a) domain rules for EMYCIN to
use on cases which GUIDON can discuss and (b) domain-specific
meta-level knowledge that would be useful for teaching these
rules. Moreover, we must keep the tutoring strategies of GUIDON
coupled to the representation of EMYCIN systems that we wish to
tutor.

III.C.2. AGE-1

 

The basic idea behind AGE-1 is to generalize the ideas
found in specific problem-solving systems and make them available
in a package -- hence the name AGE, for "Attempt to GEneralize".
AGE-1 takes an active role in assisting a knowledge engineer in
constructing a performance system. The specific model that is
incorporated in AGE-1 -- the "cooperating knowledge sources
model" -- was pioneered in the HEARSAYII system ([28], [33]) for
speech understanding. It was further developed by Stanford
researchers in two data interpretation problems -- SU/X and SU/P
(otherwise known as HASP and CRYSALIS) [43].

III.C.2.a. Examples from AGE-1

 

 

The CRYSALIS program [19] is a knowledge-based program
being developed in collaboration with the University of
California at San Diego. Its task is to infer protein structure
from X-ray crystallography data. This program was developed in
close collaboration with the AGE group at Stanford and has been
using a very similar problem-solving model. Currently the top
level of CRYSALIS is being rewritten using the AGE-1 package.
Examples from the CRYSALIS program are used below to illustrate
the problem-solving model in AGE-1.

* Joint proposal to Office of Naval Research, Personnel and
Training Division, and Advanced Research Projects Agency.

The Problem-Solving Model

 

AGE-1 uses a uniform multi-level data structure, termed the
"blackboard", to hold the status of the system. In CRYSALIS, the
blackboard is used to hold various crystallographic data and
structural hypotheses. Separate hierarchically organized panels
of the blackboard correspond to "electron-density" space and
"protein-model" space. These correspond roughly to data space
and hypothesis space except that the electron density space has
two levels of hypotheses above the electron density data. The
protein-model space describes the three-dimensional structure of
the protein at different levels of abstraction from the atomic
level to the large-scale structural features like "beta-sheets".

 

Electron Density Space          Protein Model Space

Skeletal Level                  Stereotypic Level
(backbone -- graph              (helices, beta-sheets)
of density nodes)

Nodal Level                     Superatomic Level
(high intensity points)         (side chains, proline)

Parametric Level                Atomic Level
(electron density data)         (C, N, Fe, etc.)

A set of procedures termed knowledge sources (KSs) are used
to form and link the hypotheses on these panels. In the CRYSALIS
application, these knowledge sources include such domain specific
operations as skeletonization, helix identification, sidechain
identification, bond rotation, sequence identification, cofactor
identification, and heavy atom identification. The knowledge
sources are expressed as production rules. AGE-1 provides a
framework for coordinating the activity of the KSs, mixing goal-driven
and data-driven reasoning as it searches for solutions.

If the KSs had been perfect, the coordination could have been
directed in a goal-driven manner analogous to the production
rules in EMYCIN. However, because of gaps in the theory and
implementation of the individual KSs and noise in the data, they
are individually incomplete and errorful. Like the HEARSAYII
system, AGE-1 uses an algorithm -- a version of the hypothesize
and test paradigm -- which emphasizes cooperation (to help with
incompleteness) and cross-checking (to help with errorfulness).
During the hypothesize part of the cycle, a KS can add a
hypothesis to the blackboard; during the test part of the cycle,
a KS can change the rating of a hypothesis in the blackboard.
This process terminates when a consistent hypothesis is generated
satisfying the requirements of the overall solution or when
knowledge is exhausted.

In AGE-1, the hypothesize-and-test paradigm is formalized
as a control structure with three levels. The first level is the
hypothesis-formation level. KSs on this level make changes to
the blackboard panels. In the hypothesize and test paradigm,
they put hypotheses on the blackboard and test the hypotheses of
other KSs. A rating is associated with each hypothesis to store
the overall judgment. Immediately above the hypothesis-formation
level is the KS-activation level which contains two KSs. The KSs
are called the "event-driver" and the "expectation-driver" and
correspond to data-driven and goal-driven policies for activating
KSs on the first level. The highest level of KSs is called the
strategy level. This level must decide (1) how close the system
is to a solution, (2) how well the KSs on the second level are
performing and (3) when and where to redirect the focus-of-
attention in the data space. KSs on this level can invoke KSs on
the second level.

This problem-solving method is more complex and more
general than the backward-chaining approach used in EMYCIN. It
is designed to tolerate errorfulness in the data and in the KSs
and allows the inferences to be run opportunistically in either
direction. It also allows the inferences to be run at several
levels of abstraction.
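
The three-level control structure can be illustrated with a toy blackboard. The Python below is a deliberately reduced sketch under several assumptions: knowledge sources are plain functions rather than production rules, only the event-driven policy is shown (the expectation-driver is omitted), and the hypothesis names and ratings are invented.

```python
class Blackboard:
    """Holds rated hypotheses and a queue of events describing changes."""

    def __init__(self):
        self.hypotheses = {}   # hypothesis name -> rating
        self.events = []       # names of hypotheses added or re-rated

    def post(self, name, rating):
        self.hypotheses[name] = rating
        self.events.append(name)


# Hypothesis-formation level: two hypothetical knowledge sources.
def propose_helix(bb, event):
    # Hypothesize: a skeleton segment suggests a helix.
    if event == "skeleton-segment" and "helix" not in bb.hypotheses:
        bb.post("helix", 0.5)


def confirm_helix(bb, event):
    # Test: cross-check an existing helix hypothesis and raise its rating.
    if event == "helix" and bb.hypotheses["helix"] < 0.9:
        bb.post("helix", 0.9)


KNOWLEDGE_SOURCES = [propose_helix, confirm_helix]


def event_driver(bb):
    """KS-activation level: offer the oldest pending event to every KS."""
    event = bb.events.pop(0)
    for ks in KNOWLEDGE_SOURCES:
        ks(bb, event)


def strategy(bb, goal, min_rating, max_cycles=20):
    """Strategy level: stop when the goal is rated highly enough
    or when no events remain (knowledge is exhausted)."""
    for _ in range(max_cycles):
        if bb.hypotheses.get(goal, 0) >= min_rating:
            return True
        if not bb.events:
            return False
        event_driver(bb)
    return False


board = Blackboard()
board.post("skeleton-segment", 1.0)
solved = strategy(board, goal="helix", min_rating=0.9)
```

In this run the first cycle hypothesizes a helix from the skeleton segment, the second cycle tests it and raises its rating, and the strategy level then reports success. Starting from an empty blackboard, the same loop stops at once because there is nothing to react to.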

Using AGE-1 to Build a Knowledge-based System

 

 

The purpose of the AGE-1 system is to assist a computer
scientist in building a problem-solving system. AGE-1 is
intended to speed up this process when the task domain can be
cast in the model of cooperating knowledge sources. To this end,
AGE-1 has several software subsystems -- a "TUTOR" subsystem and
several knowledge acquisition subsystems.

The TUTOR is a module for the unfamiliar user which helps
him create an application program. It guides the user through a
top-down design of his system by presenting him with a list of
topics and subtopics at each level. Canned text is available for
explaining the choices at each level. A "browse" option is
available for random perusal of the topics and subtopics.

Knowledge about the parameters of the application program
is acquired by the DESIGN subsystem. The DESIGN subsystem
provides the user with choices at each phase of the construction
of the application program. This construction involves choices
for hypothesis structure, rule acquisition, goals, and
expectations. Thus, the domain dependent particulars for each of
the components of the application program are asked about in
turn. For example, the following items must be acquired for each
KS:

1. preconditions

2. inference levels

3. links

4. hit strategy

5. local variable bindings

The acquisition of each of these items is further broken
into the most primitive elements. The DESIGN module has a
"guided" approach for the novice and an "unguided" approach in
which an expert calls for the knowledge acquisition functions
quickly and directly.

III.C.2.b. Applications of AGE-1

 

The CRYSALIS example illustrates the most comprehensive
application of AGE-1. AGE-1 has also been used on an experimental
basis to create a version of PUFF (Section III.C.1.b.) and on some
cryptography problems (simple code-breaking). These applications
have been used for testing the tutorial and knowledge acquisition
components of AGE-1.


III.C.2.c. Proposed Work for AGE-1

 

In the current version of AGE-1, the DESIGN module provides
choices and explains them with canned text. AGE-1 does not build
up its own knowledge of the user’s application -- only a
knowledge of the design choices that the user makes. It does not
make inferences about the relationships between design choices,
so it does not infer choices for the user even when one set
of choices implies another set.

We plan to move toward a system where AGE-1 will ask the
user about the domain and play a more active role in making the
design decisions. This means that AGE-1 must have a model of
"how to build a system" and that we must encapsulate the reasons
behind the design choices. Our plan is to begin to capture this
information in the form of production rules which relate the form
of the domain knowledge to the design choices of AGE-1 to a
prediction of the performance consequences in the application
program being built.

Accompanying this effort we would like to begin
construction of two explanation subsystems -- one for explaining
the activity in the design phase and one for explaining
performance of the application system. We expect to build on the
explanation work in the EMYCIN system for this.

In the long term, we also plan some work on knowledge
compiling. Our plans for this in the EMYCIN system have already
been discussed. There is some experience in compiling the
knowledge of a cooperating knowledge source system -- notably the
HARPY [39] system which can be seen as a "compiled" approach to
the task performed by HEARSAYII. Much more work is needed before
this could be done automatically.

III.C.3. The Unit Package

The Unit Package is a frame-structured representation
system developed as a tool for building knowledge bases in the
MOLGEN project. Unlike EMYCIN and AGE-1, the Unit Package
provides no problem-solving framework. However, the Unit Package
can be used as a passive representational medium in conjunction
with specific problem-solving approaches. Two approaches to
experiment planning are being developed in this way as part of
research in the MOLGEN project. The Unit Package is also
accessible from within the AGE-1 package. The Unit Package
builds on a substantial amount of work (both here and elsewhere)
on frame-structured languages. A comprehensive description of
this work is available as a technical report [52] which is
included with this proposal.

Knowledge in the Unit Package is organized in a semantic
network of nodes and links. Following other work on frames [42],
the nodes are called "units" [6] and the links are called "slots".
The major software components of the Unit Package are (1) an
interactive editor for adding new information or modifying
existing information, (2) a set of routines for matching and
manipulating descriptions, and (3) a set of access functions
which maintain network relations (such as inheritance of
properties) and provide an extended address space to hold the
semantic network.

III.C.3.a. Examples from the Unit Package

 

The Unit Package is a fairly extensive set of software for
defining the symbolic entities of a domain. It provides a number
of conventions and methods for defining standard kinds of
relationships between the symbols.

There are three main steps in building a knowledge base for a
domain with the Unit Package. The typical user of the Unit
Package is a computer scientist, although four geneticists on the
MOLGEN project routinely use the Unit Package. The main steps,
using the interactive editor, are as follows.

(1) Define the symbols of the domain. These
symbols take the form of units as
illustrated below.

(2) Define the operations which manipulate
these symbols. Operations are procedural
knowledge in the form of production rules
or LISP functions.

(3) Define an approach for problem solving.

The steps are not necessarily performed in this order or by one
person. In an evolving knowledge base, the user uses the editor
both to create new symbols and to modify old ones as his
understanding improves. The expertise to define all of these
things may be spread over several people working on a common
knowledge base.

"Specialization" is a relation which is indicated by a user
when he defines a symbol. It is used to indicate subclasses
among concepts -- e.g., the unit for the restriction enzyme Eco
RI is a specialization of the unit for general restriction
enzymes, which is a specialization of the unit for endonuclease,
which is a specialization of the unit for nuclease, and so on.
General properties of a class are inherited by its
specializations. This is formalized in part by having
descriptions in slots of those units that correspond to classes.
These descriptions delineate legal values for the corresponding
slots in specializations of the class. Descriptions can be
progressively tightened as one proceeds down a specialization
hierarchy. This feature makes the process of specialization
correspond to the addition of non-contradictory new knowledge to
units. A specialization (or generalization) hierarchy of
concepts from a molecular genetics knowledge base is illustrated
below.

LAB-OBJECT
    ANTIBIOTIC
        AMINOGLYCOSIDE
            KANAMYCIN
            NEOMYCIN
        BETA-LACTAM
            AMPICILLIN
    GENE
        APR
        CMR
    ENZYME
        LIGASE
        NUCLEASE
            ENDONUCLEASE
                RESTRICTION-ENZYME
                    ALU1
                    ASU1

    ...

Symbols in the Unit Package are
organized in a generalization hierarchy.
This hierarchy indicates "inheritance paths"
by which symbols acquire the attributes of
their generalizations.
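
The inheritance paths described above can be illustrated with a minimal
sketch in Python. It is not the Unit Package itself, which is a LISP
system with a much richer slot structure. The class name Unit, the
slot names (storage, acts_on, cuts), and their values are hypothetical
placeholders, and the sketch assumes that ENZYME sits directly under
LAB-OBJECT and that each unit has a single generalization.

```python
from typing import Any, Dict, Optional


class Unit:
    """A named unit with local slots and at most one generalization."""

    def __init__(self, name: str, generalization: Optional["Unit"] = None,
                 **slots: Any) -> None:
        self.name = name
        self.generalization = generalization
        self.slots: Dict[str, Any] = dict(slots)

    def get(self, slot: str) -> Any:
        """Return the slot value, searching up the inheritance path."""
        unit: Optional[Unit] = self
        while unit is not None:
            if slot in unit.slots:
                return unit.slots[slot]
            unit = unit.generalization
        raise KeyError(f"{self.name} has no slot {slot!r}")

    def lineage(self) -> list:
        """Names from this unit up to the root of the hierarchy."""
        names, unit = [], self
        while unit is not None:
            names.append(unit.name)
            unit = unit.generalization
        return names


# A fragment of the hierarchy; slot names and values are placeholders.
lab_object = Unit("LAB-OBJECT", storage="unspecified")
enzyme = Unit("ENZYME", lab_object, acts_on="substrate")
nuclease = Unit("NUCLEASE", enzyme, acts_on="nucleic acid")
endonuclease = Unit("ENDONUCLEASE", nuclease, cuts="internally")
restriction_enzyme = Unit("RESTRICTION-ENZYME", endonuclease)
```

In this sketch a value stored lower in the hierarchy shadows the one
stored above it, so restriction_enzyme.get("acts_on") yields the value
given for NUCLEASE rather than the more general one given for ENZYME.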

Each of the symbols in a knowledge base is defined in terms
of "slots". A unit corresponds approximately to a property list
except that (1) the structure of a slot has several explicit
fields for information about such things as modes of inheritance
and datatype and whether the value is stored or computed,° and (2)
the value of a slot can be a description of a value. The
following figure illustrates two units of different complexity.

 

NAME: Endonuclease
DOCUMENTATION: A nuclease that cuts internally in a
               DNA structure.

SITE-TYPE:  One of (MONO, STICKY-HEXA, FLUSH-HEXA,
            PENTA, STICKY-TETRA, FLUSH-TETRA)
3'-END:     One of (P, OH)
5'-END:     One of (P, OH)
MODE:       One of (Processive, Non-processive)
OPTIMAL-PH: RANGE (0 14)


NAME: Rat-Insulin-Problem
DOCUMENTATION: This unit gives the parameters of an experiment
               for cloning the gene for rat-insulin.

GENE:           RAT-INSULIN
GENE-PRECURSOR: RAT-INSULIN-RNA
ORGANISM:       A Bacterium
                Default: E.COLI
VECTOR:         A Vector
GOAL:           A Lab-goal with
                  STATE = A Culture with
                    ORGANISMS = A Bacterium with
                      EXOSOMES = A Vector with
                        HAS-GENES = RAT-INSULIN
                  CONDS = (PURE? ORGANISMS)

 

Two units from a MOLGEN knowledge base.
Each unit is organized as a list of slots.
The slots are filled with values or
descriptions of values. These units are
examples of "symbols" from the molecular
genetics domain.
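
One way to picture a slot that holds a description of a value rather
than a value is as a predicate that candidate values must satisfy.
The following Python sketch makes that assumption; the helper names
one_of, in_range, and violations are hypothetical and are not Unit
Package functions, and the reading of the pH range as 0 to 14 is an
assumption about the figure.

```python
from typing import Any, Callable, Dict, List

Description = Callable[[Any], bool]


def one_of(*choices: Any) -> Description:
    """Description satisfied by any of the listed values."""
    allowed = set(choices)
    return lambda value: value in allowed


def in_range(low: float, high: float) -> Description:
    """Description satisfied by numbers in the closed interval [low, high]."""
    return lambda value: isinstance(value, (int, float)) and low <= value <= high


# Descriptions modeled on three slots of the Endonuclease unit.
ENDONUCLEASE: Dict[str, Description] = {
    "3'-END": one_of("P", "OH"),
    "5'-END": one_of("P", "OH"),
    "OPTIMAL-PH": in_range(0, 14),
}


def violations(descriptions: Dict[str, Description],
               values: Dict[str, Any]) -> List[str]:
    """Return the slots whose proposed value fails the class description."""
    return sorted(slot for slot, value in values.items()
                  if slot in descriptions and not descriptions[slot](value))
```

A specialization could then tighten a description, for example by
replacing in_range(0, 14) with a narrower interval, and violations
would reject any filler that contradicts the class.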

While the Unit Package is not a problem-solving program, it
does provide a large number of routines for creating, modifying,
and matching units in a knowledge base. These routines are
called by problem-solving programs in the MOLGEN project which
are currently being tested. Some of the built-in features --
such as the generalization hierarchy and symbolic descriptions --
seem to be especially useful for problem-solvers that work with
abstractions. For a discussion of other features of the Unit
Package -- such as the various modes of inheritance, set
notation, or the attachment of procedural knowledge -- the reader
is referred to the enclosed technical report.

°See the technical report for details.

III.C.3.b. Applications of the Units Package

 

MOLGEN — Planning Experiments in Molecular Genetics

Molecular genetics is a rich and rapidly growing science.
Several aspects of molecular genetics make it attractive as a
task domain for artificial intelligence. It is a young science
and new techniques and ideas are developed regularly. This makes
it attractive for studying the process of discovery ([38], [23]).
It is a laboratory science and experiments are clearly defined in
terms of laboratory steps and results. This makes it attractive
for studying the processes of planning and plan debugging.
Finally, many kinds of knowledge are used in molecular genetics.
This motivates work on representation in the Unit Package.

Planning research in MOLGEN has focused on two broad
classes of experiments -- structural synthesis and structural
analysis. The synthesis experiments use various laboratory
techniques to build DNA structures. Analysis experiments use
various laboratory techniques to identify an unknown structure.
An analyst seeks to discriminate between competing hypotheses for
the structure of a sample.

Other Applications

In the past few months, several other projects have begun
to use the Unit Package as a representational medium. Dr. Blum
[5] is using it in an application which will combine statistical
methods and AI methods for performing studies on a clinical data
bank at Stanford. The Unit Package is being used to represent a
set of medical models to permit a more sophisticated
interpretation of patient record data in the data base than is
possible using statistical methods alone.

The Unit Package is also being used in a mathematical
application at Stanford and is being tested for a planning
application at the RAND Corporation. Other applications are
expected over the course of this grant period.


III.C.3.c. Proposed Work in the Units Package

 

The proposed work on the Unit Package may be divided into
two main categories — representational work and research-related
work. Barring surprises from the emerging applications of the
Unit Package, most of the work on representational machinery is
finished. There are a few outstanding tasks such as (1)
generalizing the concept hierarchy to be a concept graph so that
units can have more than one generalization and (2) providing
some more flexible forms of inheritance. Since the Unit Package
became operational in June 1977, the rate of change to the system
itself has slowed dramatically. This reflects the need for a
stable system for development of applications and the fact that
the Unit Package has found an important niche for the
applications in the Heuristic Programming Project.

This standstill in development also reflects the current
interest of the research group -- which is to work on the
problem-solving applications of the Unit Package. A great deal
more development will become important as this work is completed.
For example, the Unit Package provides a substantially richer
descriptive language for concepts than is available in MYCIN or
EMYCIN. It lacks, however, substantial facilities for knowledge
acquisition — beyond a simple interactive editor. As
applications of the Unit Package develop, an increased need for a
stronger user interface is expected — incorporating such things
as the natural language interface (BAOBAB [8]).

Another line of work is the development of standard
relationships which appear in many domains. The Unit Package
currently provides only a very small set of built-in
relationships -- such as generalization and specialization --
which are utilized by the semantic network processing functions.
Creating additional relationships is part of the
knowledge-engineering task of applying the Unit Package to a task
domain. Some of these relationships -- such as "part-of" or
"abstraction-of" -- seem to appear in many domains. To the extent
that these relationships have general utility and can be
standardized, they will be made part of the initial knowledge base
for new applications -- thus expanding the apparent power of the
Unit Package and reducing the effort of starting new applications.

III.C.4. Long Term Work and New Packages

 

The development of packages over the next five years will
be opportunistic -- relying on the most usable results from core
research in artificial intelligence. Thus, the following
ideas indicate only our best current thinking for continued
development.

III.C.4.a. Planning Package

One of the areas in which we see future work is in the
general area of planning. The artificial intelligence research
on this problem is currently being performed in the domain of
experiment planning in molecular genetics. Some interesting
ideas are just beginning to emerge from this work which, if
successful, could become the basis of a "planning package".

This research is investigating the viability of a new
approach to planning called "orthogonal planning". The thrust of
this approach is to take the elements of planning out of a
"planning algorithm" and put them into explicit "planning
spaces". Explicit planning operations such as refinement
(mapping from abstract to specific), evaluation, and subgoal
proposing are expressed as operators in a planning space.
Different combinations of these operators can be arranged to
create top-down (goal-driven) planning, bottom-up (opportunistic)
planning, and various hybrid methods. The planning research
seeks to find general methods for deciding when to apply these
different planning operators in order to plan flexibly and
effectively. Currently ten planning operations have been
formalized in the planning space and four strategic operations
have been formalized in an overseeing "strategy space". This
approach is being tested in the domain of experiment planning in
molecular genetics and uses the Unit Package for representing the
symbols and operations in all of the spaces.
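
The separation of planning operators from the strategy that selects
them can be sketched in a few lines of Python. This is only a toy
illustration of the idea, not the MOLGEN planner: the operator names
refine and propose_subgoal, the strategy top_down, the driver run,
and every laboratory step name in REFINEMENTS are hypothetical.

```python
from typing import Callable, List, Optional

Plan = List[str]
Operator = Callable[[Plan], Optional[Plan]]

# Hypothetical refinements from abstract steps to more specific ones.
REFINEMENTS = {
    "obtain-culture": ["build-vector", "insert-into-organism"],
    "build-vector": ["cut", "ligate"],
}


def refine(plan: Plan) -> Optional[Plan]:
    """Planning operator: expand the first abstract step, if any."""
    for i, step in enumerate(plan):
        if step in REFINEMENTS:
            return plan[:i] + REFINEMENTS[step] + plan[i + 1:]
    return None


def propose_subgoal(plan: Plan) -> Optional[Plan]:
    """Planning operator: add a final screening step if it is missing."""
    if plan and plan[-1] != "screen":
        return plan + ["screen"]
    return None


def top_down(plan: Plan) -> List[Operator]:
    """Strategy operator: prefer refinement, then subgoal proposal."""
    return [refine, propose_subgoal]


def run(plan: Plan, strategy: Callable[[Plan], List[Operator]],
        max_steps: int = 20) -> Plan:
    """Repeatedly apply the first applicable operator the strategy offers."""
    for _ in range(max_steps):
        for operator in strategy(plan):
            new_plan = operator(plan)
            if new_plan is not None:
                plan = new_plan
                break
        else:
            return plan  # no operator applies: planning is finished
    return plan
```

Because the ordering of operators lives in the strategy function
rather than in the driver, a different strategy (for example one that
proposes subgoals before refining) can be substituted without
changing any operator.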

III.C.4.b. Time-Oriented Knowledge Representation Package

One important topic in computer-based diagnosis and therapy
programs is the representation of knowledge about situations that
are changing over time. Most current programs have concentrated
on the interpretation of a single instance in the course of the
patient's disease process. As the patient's status changes over
time, a program must be able to modify its representation to
conform to the new situation. The ability to represent trends in
the health of the patient is an important part of the diagnostic
process.

Creation of a package that supports the representation of
changes over time will be important for applications based on
clinical data bases. These data bases typically contain the
results of a variety of tests which were administered at each
patient visit to the clinic. The problem of interpretation of
updated test results has also come up in each of our current
applications: for example, initially negative culture results
that grow out a particular pathogen after several days in our
infectious disease program, or the comparison of new pulmonary
test results with the previous findings. No general purpose
approach has been incorporated into these programs.

A program for a particular dynamic clinical setting --
interpreting measurements from the intensive care unit -- has been
developed at the Heuristic Programming Project. That program,
named the Ventilator Manager (VM) [21], is able to evaluate a
stream of thirty measurements provided on a 2-10 minute basis
by a computer-based physiological monitoring system. The system:
(1) provides a summary of the patient's physiological status
appropriate for the clinician; (2) recognizes untoward events in
the patient/machine system and provides suggestions for
corrective action; (3) suggests adjustments to ventilatory
therapy based on long-term assessment of the patient's status and
therapeutic goals; (4) detects possible measurement errors; and
(5) maintains a set of patient-specific expectations and goals
for future evaluation.

Removing the basic assumption about the regularity of
the changes in the ICU setting is the major area of research in
the development of this package. A typical problem is the
interpretation of a series of test values that are higher than normal
over several testing instances. Specialized knowledge about the
typical rate of change of the underlying disease process is
necessary to determine whether these values represent a trend.
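
As a rough illustration of how knowledge about a typical rate of
change might enter such a judgment, the Python sketch below treats a
series of irregularly spaced readings as a trend only if every reading
is above normal and the overall rise per day reaches a rate supplied
by the domain expert. The function name is_trend, its parameters, and
the decision rule itself are assumptions made for this sketch and are
not taken from VM or any existing program.

```python
from typing import List, Tuple

Reading = Tuple[float, float]  # (time in days, measured value)


def is_trend(readings: List[Reading], upper_normal: float,
             typical_rate: float) -> bool:
    """Return True if the elevated readings rise at a disease-like rate.

    typical_rate is the smallest rise per day that the underlying process
    would be expected to produce (a hypothetical, expert-supplied figure).
    """
    if len(readings) < 2:
        return False
    if any(value <= upper_normal for _, value in readings):
        return False  # not consistently above normal
    (t0, v0), (t1, v1) = readings[0], readings[-1]
    if t1 <= t0:
        return False  # readings must be in time order
    return (v1 - v0) / (t1 - t0) >= typical_rate
```

Because the rate is computed from the actual times of the readings,
the test does not depend on measurements arriving at regular
intervals.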

The representation of dynamic settings also requires a
model of the stages of the disease and treatment process that
best characterize the clinical status of the patient. Often a
particular value of a measurement takes on entirely different
interpretations based on the current context. For example,
critical measurements taken one hour after surgery mean something
quite different from the same measurements taken after three days
of recovery. A rudimentary model of this type based on various
therapeutic regimens is built into the ICU measurement
interpretation system. Additional work is required in the
generalization of this type of modeling process.
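
A minimal way to express context-dependent interpretation is to key
the expected range of a measurement on the current stage of care.
The Python sketch below assumes exactly that; the context names and
the numeric ranges in EXPECTED are invented placeholders and are not
clinical reference values.

```python
from typing import Dict, Tuple

# Hypothetical acceptable ranges for one measurement in two contexts.
# The numbers are placeholders, not clinical reference values.
EXPECTED: Dict[str, Tuple[float, float]] = {
    "one-hour-post-surgery": (60.0, 130.0),
    "three-days-recovery": (60.0, 100.0),
}


def interpret(value: float, context: str) -> str:
    """Classify a value against the range expected in the given context."""
    low, high = EXPECTED[context]
    if value < low:
        return "low"
    if value > high:
        return "high"
    return "expected"
```

With these placeholder ranges the same value of 120 is read as
expected shortly after surgery but as high after three days of
recovery.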

Project 3

Codification and Use of Medical Knowledge
from Clinical Laboratories

ADMINISTRATIVE INFORMATION ONLY

1. TITLE OF PROPOSAL (Do not exceed 53 typewriter spaces)
   Clinical Laboratory Expert Project

2. PRINCIPAL INVESTIGATOR

   2A. NAME (Last, First, Initial)
       Lindberg, Donald A. B.

   2B. TITLE OF POSITION
       Director, Information Science Group
       Director, Health Care Technology Center

   2C. MAILING ADDRESS (Street, City, State, Zip Code)
       University of Missouri
       605 Lewis Hall
       Columbia, Mo. 65211

   2D. DEGREE
       M.D.

   2F. TELEPHONE (Area Code, Telephone Number and Extension)
       314 882-6966

   2G. DEPARTMENT, SERVICE, LABORATORY OR EQUIVALENT (See instructions)
       Health Care Technology Center

   2H. MAJOR SUBDIVISION (See instructions)
       Graduate School

3. DATES OF ENTIRE PROPOSED PROJECT PERIOD (This application)
   FROM [illegible] THROUGH May 31, 1994

4. TOTAL DIRECT COSTS REQUESTED

5. DIRECT COSTS REQUESTED FOR FIRST 12-MONTH PERIOD

   [item label illegible] (See instructions)
   Stanford University
   Stanford, California

7. RESEARCH INVOLVING HUMAN SUBJECTS (See instructions)
   A. NO   B. YES Approved:   C. YES -- Pending Review   Date

8. INVENTIONS (Renewal Applicants Only - See instructions)
   A. NO   B. YES -- Not previously reported   C. YES -- Previously reported

TO BE COMPLETED BY RESPONSIBLE ADMINISTRATIVE AUTHORITY (Items 8 through 13 and 15B)

9. APPLICANT ORGANIZATION(S) (See instructions)
   The Curators of the University of Missouri
   215 University Hall
   Columbia, Mo. 65211

10. NAME, TITLE, AND TELEPHONE NUMBER OF OFFICIAL(S)
    SIGNING FOR APPLICANT ORGANIZATION(S)
    H. Kent Shelton
    Asst. Vice President
    Financial Services
    Telephone Number(s) 314-882-3512

11. TYPE OF ORGANIZATION (Check applicable item)
    FEDERAL   STATE   LOCAL   OTHER (Specify)
    University

12. NAME, TITLE, ADDRESS, AND TELEPHONE NUMBER OF
    OFFICIAL IN BUSINESS OFFICE WHO SHOULD ALSO BE
    NOTIFIED IF AN AWARD IS MADE
    H. Kent Shelton
    Asst. Vice President Financial Services
    215 University Hall
    Columbia, Mo. 65211
    Telephone Number 314-882-3512

13. IDENTIFY ORGANIZATIONAL COMPONENT TO RECEIVE CREDIT
    FOR INSTITUTIONAL GRANT PURPOSES (See instructions)
    Graduate School

14. ENTITY NUMBER (Formerly PHS Account Number)
    43-6003859

PROJECT 3. Codification and Use of Medical Knowledge from Clinical Laboratories

ADMINISTRATIVE INFORMATION ONLY

RESEARCH OBJECTIVES

NAME AND ADDRESS OF APPLICANT ORGANIZATION
   University of Missouri-Columbia

NAME, SOCIAL SECURITY NUMBER, OFFICIAL TITLE, AND DEPARTMENT OF ALL
PROFESSIONAL PERSONNEL ENGAGED ON PROJECT, BEGINNING WITH PRINCIPAL
INVESTIGATOR
   Donald A. B. Lindberg, M.D., Director, Health Care Technology Center and
      Information Science Group; Professor of Pathology
   Robert Abercrombie, Ph.D., Post Doctoral Fellow, Information Science Group
   Paul Blackwell, Ph.D., Professor of Computer Science
   Lamont Gaston, M.D., Professor of Pathology
   Lawrence Kingsland, Senior Electronics Technician, Information Science Group
   W. B. Stewart, M.D., Professor of Pathology, Director of Laboratories
   Henry Taylor, M.D., Professor of Pathology
   John Townsend, M.D., Professor and Chairman, Department of Pathology
   John Yio, Ph.D., Post Doctoral Fellow, Information Science Group

TITLE OF PROJECT
   Clinical Laboratory Expert Project

USE THIS SPACE TO ABSTRACT YOUR PROPOSED RESEARCH. OUTLINE OBJECTIVES AND
METHODS. UNDERSCORE THE KEY WORDS (NOT TO EXCEED 10) IN YOUR ABSTRACT.

A. Objectives
1. To represent within a computer-based information system the knowledge and
procedures of the clinical laboratory expert.
2. To determine how to implement this information system such that benefits
result to the clinical laboratory service which are measurable in terms of:
   a. Increased quality of laboratory determinations
   b. Reduced costs to the laboratory and/or the institution
   c. Increased access to pertinent information by laboratory data providers
      and users.
3. To determine how to interface this information system with the hospital
and clinic services such that benefits result in actual patient care. We
propose to seek "process" measures rather than "outcome" measures.
4. Using this operational testbed, to shed light upon certain important
questions basic to artificial intelligence in medicine research.

These objectives will be pursued by construction of a knowledge representation
system for the domain of the clinical laboratory expert. Subject matter expertise
will be provided by directors of the clinical laboratories of the University of
Missouri Medical Center. Fundamental artificial intelligence methodology and
specialized computational facilities will be provided by the SUMEX Laboratory and the
Department of Computer Science at Stanford University. Management and interfacing
of the project and site-testing will be provided by the Health Care Technology Center
at the University of Missouri-Columbia.


PROJECT 3: The Clinical Laboratory Expert Project

III.A. Objectives

1. To represent within a computer-based information
system the knowledge and procedures of the clinical
laboratory expert.
2. To determine how to implement this information system
such that benefits result to the clinical laboratory
service which are measurable in terms of:
(a) Increased quality of laboratory determinations
(b) Reduced costs to the laboratory and/or the institution
(c) Increased access to pertinent information by laboratory
data providers and users.
3. To determine how to interface this information system
with the hospital and clinic services such that benefits
result in actual patient care. We propose to seek "process"
measures rather than "outcome" measures.
4. To seek through this operational testbed to shed light
upon certain important questions basic to artificial
intelligence in medicine research. These include the following:
(a) How best to retain the power of symbolic representations
traditional to AI techniques while at the same time
obtaining the benefits of the numerical methods which are
traditional to fields such as laboratory management?
(b) How best to set up an information system so as to
accommodate the endless stream of changes which occur
in the operating environment of a system such as the
clinical laboratory?
(c) How to improve, and hopefully optimize, the interface
of the knowledge engineer and the subject matter expert, in
this case the clinical laboratory expert?

III.B. Background and Rationale

Use of artificial intelligence techniques, especially the recent
focus on formal representation of the knowledge of experts, is the latest
and most promising of applications of the computer to medicine. It is
already clear that the techniques are powerful and that the
proof-of-concept and feasibility phases of medical applications have been
successfully passed. This technique has been shown feasible in the areas of
infectious disease (Shortliffe et al., 1973), glaucoma management
(Weiss, Kulikowski, Safir, 1978), patient present illness (Pauker,
Gorry, Kassirer, Schwartz, 1976), and in the general differential
diagnosis in internal medicine (Lawrence, 1978). In many ways the AI
techniques are still in development, but the real question remains:
in what areas of medicine are they most usefully going to be employed?
Some raise the question, in which areas would such techniques even
be accepted?

The clinical laboratories offer the very best application sites
for exploring AI techniques as a basis for biomedical information
systems. The following observations support this contention:

1. The clinical laboratories were the first sites for
successful implementation of computer-based information
systems of any kind (Hicks, 1969; Lindberg, 1965; O'Kane,
Haluska, 1977).

2. There are a host of current computer systems already
disseminated in this field which form a basis for advanced
technological developments.

3. Clinical laboratory services constitute a major part
of hospital expenses (estimates vary from 25-40%).

4. Clinical laboratories, for the most part, are
administered by professional medical personnel who have
training in technological matters, including hardware and
information systems, and who therefore are likely to be
receptive to advances in this kind of methodology.

5. There is an expertise in clinical laboratory operation
and interpretation which is recognized by medical specialty
training.

6. Knowledge in this field is plentiful, and expertise
takes the form of a multitude of tiny empirical pieces of
information, which await unification into an overall
information framework. This situation is compatible with the
way in which formal knowledge systems have been built for
other AI applications.

7. On the other hand, the field does offer an advantage
in another (almost counter) sense: namely, that there are
true and realistic models of the basic data generating
sources. For example, one knows quite surely that impedance
transients in a Coulter Counter are caused by particles,
and that these particles are (for the most part) erythrocytes.
Likewise, the concept of "serum electrolytes" is known
to have a solid basis: namely, that there are actual,
immutable ions of sodium, potassium, chloride, and bicarbonate
(and CO2) within the serum. Furthermore, chemical laws
describe the relationship between many blood constituents.
Curiously, the chemical laws are not used ordinarily as the
basis of laboratory management, and only partially as a
basis for test interpretation and subsequent patient
management. The chemical laws and the physical models
are, however, a potential advantage in building advanced
information systems.

8. The clinical laboratory offers a setting which is
receptive to and safe for development of new information
systems, yet which also offers a home base for extension
out toward the more purely clinical setting. The meeting
ground of the two is clear: it is the interpretation of
the results of laboratory measurements.

For these reasons, we feel that clinical laboratories are in
general a potentially fruitful place for AI in medicine applications.

There are reasons which make us think that the particular
laboratories and group at the University of Missouri are a good
choice among those institutions with excellent clinical laboratory
programs.

1. The school has a long history in lab system developments.
The first automated lab system in the country was built here
in 1962 and has operated continuously since then.

2. The system incorporates all clinical laboratories and all
test results.

3. These results are in computer processible form, indeed
are reported through the computer systems. Consequently test
data is accessible.

4. Experts in clinical laboratory medicine are members of the
team who propose to build the Clinical Laboratory Expert system.

5. The project is sponsored by the Health Care Technology
Center, which has ample experience and capability in the
management and conduct of multi-disciplinary technical
projects. The Center management review of all projects
includes participation of an evaluation team with members from
operations research, medical sociology, economics, health
services management, and medicine.

6. Most important of all, we have a plan to accomplish the
system building, and we have predecessor systems to build
on and to compare with.

III.C. Methods of Procedure

 

We propose to grow the information system beginning with a
nidus or model system and to expand the scope of the system by
adding to it information and values from additional areas. That
is, our strategy will be to begin with what is clearly feasible,
to build our collaborative patterns about an early success, and
then to expand in a systematic fashion to more ambitious goals.
We feel this is not only a good general management strategy but
the best way to build programming systems too.

Eventually, for instance, it would be desirable for the system to
be able to learn from the data. First, however, the system must be
given the logic by which laboratory data are evaluated and understood.
We plan for development of the system in four phases.

Phase One: Incorporate the medical logic which takes into
account the information which is available within the laboratory
itself: e.g., test results, quality control results, methodological
information.

Phase Two: Incorporate the additional medical logic which takes
into account information about the patient: first simple aspects such
as gender, age, race; then more complex concepts such as drug therapy,
operative status, clinical service assignment and provisional
diagnosis.

Phase Three: Incorporate medical logic which includes concerns
for hospital function.

Phase Four: Incorporate medical logic which attempts to link
to considerations which are outside the hospital setting.

Following is a more detailed description of the phased development.

Phase I. The aspect of the lab results which is of primary concern
within the laboratory hinges upon quality control considerations.
These are the first logical aspects which must be represented.
We are referring initially to thinking which currently goes
on strictly in the laboratory, previous to release of a test
result. Subsequently, there may or may not be significant
discussion between the laboratory director and the clinician
concerning further lab work and/or clinical concerns. Previous
to this stage, however, there is a great deal of evaluation done
now within the lab and based on laboratory or only partially
clinical grounds. Not enough evaluation of this sort is possible
with today's high volume instruments. This function can be greatly
enhanced by advanced computational techniques.

We would plan to introduce knowledge into the system
along the following lines:

1. Knowledge of the labs selected (likely we would start
with hematology and clinical chemistry).

2. Knowledge of what tests are done, what methods are used,
what parameters are estimated, what units are used. It
should be noted that there are often multiple extant methods
for a single determination, as well as multiple laboratory
locations throughout the institution at which it might be
done. Methodology and unitage change continually. Since
a referral-type laboratory may do 3,000-5,000 different
determinations, it is a serious problem to choose a
representation which will be amenable to the endless updating.

3. Knowledge of the kinds of patients and hospital locations.

4. Logic permitting an initial evaluation of the test result
for credibility. This naturally includes arithmetic
ranges, formats, etc.

5. Logic permitting evaluation taking into account other
results from examinations performed as a battery.
An example is the well known relationship between hemoglobin
and hematocrit.

6. Logic permitting evaluation of test result taking into
account laboratory quality control procedures and records.
We have recently completed an evaluation of the proposed
Bull statistic for control based on a weighted moving
average of mean corpuscular hemoglobin concentration,
which is a slight but still insufficient improvement on
the traditional method.

This is an example of the need to bring numerical methods
into alignment with the symbolic logic. In essence, this asks
the general question, is it likely the result is valid
considering the quality of the particular "run" or batch
which produced the result?
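
The three kinds of laboratory-level logic just described (a
credibility range, a cross-check within a battery, and a moving
average used for quality control) can be sketched in Python as
follows. These are illustrative assumptions only: the function names,
tolerances, and weights are hypothetical, the hemoglobin/hematocrit
check uses the common rule of thumb that hematocrit in percent is
about three times hemoglobin in g/dL, and the smoother shown is a
generic exponentially weighted moving average rather than the Bull
statistic itself.

```python
from typing import List, Optional


def credible(value: float, low: float, high: float) -> bool:
    """Credibility check: is the value inside the possible range?"""
    return low <= value <= high


def battery_consistent(hemoglobin_g_dl: float, hematocrit_pct: float,
                       tolerance: float = 3.0) -> bool:
    """Battery cross-check: hematocrit is usually about 3 x hemoglobin."""
    return abs(hematocrit_pct - 3.0 * hemoglobin_g_dl) <= tolerance


def weighted_moving_average(values: List[float], weight: float = 0.2,
                            start: Optional[float] = None) -> float:
    """Exponentially weighted moving average of a run of results.

    This is a generic smoother, not the Bull statistic itself.
    """
    if not values:
        raise ValueError("no values")
    average = values[0] if start is None else start
    for value in values:
        average += weight * (value - average)
    return average


def run_in_control(mchc_values: List[float], target: float,
                   limit: float) -> bool:
    """Quality control: has the smoothed MCHC stayed near the target?"""
    smoothed = weighted_moving_average(mchc_values, start=target)
    return abs(smoothed - target) <= limit
```

A symbolic rule could then combine these numerical results, for
example by holding a result for review when the run that produced it
is out of control even though the individual value looks credible.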
The outcome of all the laboratory logic is the resolution
of the following questions:
a. Should the test be repeated using the same blood sample?
b. Is the issue important enough (or specimen identification
sufficiently questionable) that a new specimen must be obtained
from the patient?
c. Should the result be reported to the clinician and
to the chart with some kind of qualification attached?
d. Is there a quality control problem in the laboratory
which requires immediate action?
e. Is there a breakdown in the clinical procedure
(ordering, specimen collection, etc.) which requires
immediate action?

Phase II. There are a number of clinical but relatively elementary
considerations which may be taken into account within the laboratory --
and which certainly should be taken into account by the knowledge-based
system we propose. Examples are:

1. Logic permitting evaluation of test results taking into
account basic information about the patient, i.e., age,
race, sex, and ward location.

2. Logic permitting evaluation of test results taking into
account previous test results in the same patient.

These pieces of information are often of critical
importance in evaluating the credibility or significance
of laboratory reports. Normal ranges, for example, vary
for some tests with age, race, and sex. Previous results
on a patient, to take another example, may be the first
clue to a mismarked specimen: the blood-from-the-wrong-patient
blunder which is so fundamental a problem for all
laboratories.

3. Logic permitting evaluation of test results taking into
account the general nature of the putative diagnosis (e.g.,
admitting diagnosis or treatment regimen).

It should be noted here that we are not proposing that
the system permit or encourage that clinical knowledge of the
patient influence the test result, but only the interpretation
of the result and the handling of the specimen. A general
diagnosis or even a treatment regimen can greatly influence
these matters. Plasma specimens from patients on oral
anticoagulants, for example, usually should not yield normal
prothrombin times; indeed for these patients, "normal" is
abnormal and dangerous. The implication here is for
interpretation of the result, and when to report an "abnormality"
through the stat or emergency systems. Similarly, patients with
leukemias, especially under chemotherapy, often have remarkably
elevated uric acids which have nothing to do with the usual
reasons for hyperuricacidemia.
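
The patient-level considerations above (ranges that depend on the
kind of patient, comparison with the previous result, and
interpretation in light of a treatment regimen) might be sketched in
Python as follows. The function names, the range table, and all
numeric thresholds are hypothetical placeholders chosen for
illustration, not clinical reference values and not part of any
existing laboratory system.

```python
from typing import Optional, Tuple

# Hypothetical reference ranges keyed by sex; placeholders only.
HEMOGLOBIN_RANGE = {"F": (12.0, 16.0), "M": (13.5, 17.5)}


def outside_reference(value: float, sex: str) -> bool:
    """Compare a result with the range for this kind of patient."""
    low, high = HEMOGLOBIN_RANGE[sex]
    return not (low <= value <= high)


def delta_check(value: float, previous: Optional[float],
                max_change: float) -> bool:
    """True if the change from the previous result is suspicious."""
    return previous is not None and abs(value - previous) > max_change


def prothrombin_flag(seconds: float, normal: Tuple[float, float],
                     on_oral_anticoagulant: bool) -> str:
    """Interpret the same prothrombin time differently under treatment."""
    low, high = normal
    if on_oral_anticoagulant:
        # For these patients a time that is not prolonged is the abnormality.
        if seconds <= high:
            return "report: unexpectedly normal"
        return "prolonged as expected"
    return "normal" if low <= seconds <= high else "report: abnormal"
```

Note that none of these functions alters the measured value; in
keeping with the text, patient knowledge changes only how the result
is interpreted and whether it is reported urgently.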

The issues which are relevant at the patient or the clinician's
level hinge upon matters of test interpretation, the possibility
of needing to order further tests, the possibility of new diagnoses.
There is obviously an immense amount of logic which concerns
laboratory test interpretation in the context ofall] of the possible
clinical diagnoses and management problems. We are not proposing
to Include this mountain of knowledge, which really pertains more
reasonably to programs such as Myer's INTERNIST System.

We propose to stop with knowledge which might reasonably be
construed to represent the conversation of the laboratory director
with the patient's clinical physician. It is difficult to specify
precisely this cut-off at the stage when we are only proposing
the system. The best indication of our intent might be provided
by an example.

It frequently happens that the lab director and a clinical
hematologist will discuss a set of lab findings for a patient
(with or without the question of errors in the findings) up to
the point at which it is clear that the findings support the
interpretation "iron deficiency anemia". This stage of reasoning
represents a kind of intermediate between findings and diagnosis
which AI systems sometimes call a concept. The semantic network
system of Kulikowski, Amarel and Weiss, for instance, has such
"concepts" within its logic. From the point of view of the logic
we propose to write, this interpretation would be a proper
termination, wholly supported by lab findings but requiring more
clinical information about the patient than is obtainable from
such paper systems as lab requisitions. The cause of the iron
deficiency anemia would remain for another system to take up.

There are a host of such intermediate pathophysiological
concepts which constitute a kind of proper frontier between clinical
lab reasoning and more purely clinical reasoning. In practical
terms, the resolution frequently is reached either by a telephone
conversation between the lab director and the clinical physician,
or by personal contact on such an occasion as rounds. We are not
eager to automate the personal contact, although time does not
permit enough of these discussions to occur; we would like to
automate at least the decision to make the telephone call or
appointment.

Most test results, even batteries of results, do not
permit an interpretation at the laboratory level. In some cases,
we feel the logic could take us further. The most extreme case and
the most complete logic, we feel, would end with a tentative
pathophysiologic concept (such as anemia) and, in selected important
cases, a decision on the part of the computer system to recommend that the
lab director call the clinician. Because of the limitations of


time, this is not a minor decision. Only the most important cases
should be selected for such conferences, whether telephone or in
person. A system with full and explicit logic should form a good
basis for such a decision. Furthermore, previous experience has
shown us that even our current non-AI lab monitoring systems
must bring together all pertinent (available) information about
a patient before bringing the abnormal report to the attention
of the user. This simple assembling of data aids current decision
making; we anticipate that assembly based on a more extensive logic
will prime a clinically useful discussion.

Phase III. Logic relevant to hospital function primarily concerns institutional
patterns. This includes changes in laboratory patterns, timeliness of
reporting, distribution of costs among services and patients, and
examination of interactions between procedures. For example, do
screening batteries including such tests as LDH's result in an inappropriate
number of repeat kinetic enzyme studies? These matters are derivative
measures of institutional function which are the natural by-products of
semantic understanding of the laboratory transactions. They would not be
examined until after the more fundamental logic in Steps I and II had
been dealt with.

Phase IV. Logic which links to considerations outside the hospital
environment.
It is difficult to detail these linkages ab initio. They are made
up potentially of at least two separate concerns: derivation of facts
of general scientific interest; and the provision of linkages to educational
functions.

It must be emphasized that firm promises for such accomplishments
cannot be made. Still, one should point out some potentially important
implications outside the immediate hospital realm, and should attempt to
make the connections. A more or less modest scientific fact which could
with luck result from the studies is the long awaited multivariate normal
for application to multi-channel screening (Lezotte, 1977; Grams, 1977).

Building of instructional systems is beyond the scope of the
present proposal, but provision of the connections is an inherent part
of our plan. Good AI systems are (partly) characterized by their ability
to defend their decisions. That is, a classification or advice provided
from such an automated system can be challenged, and it can be expected
that the system can recapitulate the rules or criteria which produced its
conclusion. It is precisely this ability which should allow potential
users outside the laboratory to benefit directly from the existence of
such a knowledge-based system. We would hope to allow for this educational
by-product usage by providing suitable means to challenge and converse
with the system.


System building

We have given thought to the architecture of the proposed
system. It should be emphasized that this project is a long
term development in an area of fundamental importance to medicine:
namely, the knowledge which surrounds clinical laboratory testing.
We feel that there exists an adequate base of expertise in this
field at the University of Missouri, acknowledging of course that
we would utilize the full resources of the published literature and
that the knowledge and logic of the system would be subjected
to outside review by consultants as each major step was taken.
We do not, however, have an adequate experience in work in
artificial intelligence techniques per se to undertake the project
alone. It is clear that this competence exists in the group at
Stanford. We feel we have a sufficiently good working relationship
with Professor Feigenbaum and his colleagues that a joint develop-
ment will be successfully concluded.

The form of the actual computer representation has not been
selected. Our lab systems have used table driven assembly code for
years. The HCTC is collaborating with clinicians at UMC and computer
scientists at Rutgers to create a rule-based rheumatology consultant.
We wish to explore with Dr. Feigenbaum the possible appropriateness
of the computational "blackboard" of the Hearsay system.

The knowledge-based system to incorporate clinical laboratory
expertise will be built on the SUMEX machine via the existing
time-sharing network. We have used terminal connections to SUMEX
for five years in connection with operation of the AIM network,
the SUMEX Executive Committee, and smaller experimental projects.
The communications are sufficient to support development of such
a system. At the same time, we recognize that it is inappropriate
(and probably impossible) for the SUMEX computer complex in
California to support a real-time service activity in Missouri.
Fortunately this is not necessary. Testing of the model in its
sequential versions against actual lab data in batches or benchmark
sets can easily be done on a periodic basis. This will not
be a problem. Even the status of the quality control results can
be accessed and included in the model's operation in this fashion.
Since all transactions are recorded, one can accurately recreate
"real time" for any moment.

The issue of implementation of the full model in a real
laboratory setting is a separate problem. The system has not yet
been built, so we cannot say what kind of computer would be needed
to run it. If we are correct in assuming that, as with other systems,
a part of a PDP-10 is capable of running the model, then it
is not unreasonable to expect our laboratories to acquire this
level of computer support. The current lab systems are using a
combination of two PDP-12's, an IBM System 7, substantial
services of an IBM 370/158 (which is being replaced by an Amdahl
machine), and several microprocessors, including M6800's and LSI-11's.
All this does not add up to an AI machine, but we don't want it
to yet. There is a commitment to having computing gear at UMC,
and in most large clinical laboratories. At the same time, one
must acknowledge that the five year duration of the project
will doubtless see a continued reduction in the cost of computing
gear, as well as a continuation of the advances in hardware which
have made AI techniques more realistic in the past. Machines
equivalent to DEC PDP-10's may well come to be offered for small amounts
of money in microforms. This kind of breakthrough is not necessary in order
for us to move toward an AI-based system. What is necessary is that

the system work well and be able to keep up with the changes in
laboratory procedures which have plagued and almost destroyed
previous systems. Our institution is currently supporting six
full time programmers in a vain attempt to keep rigid old programming
systems current with methodological and administrative changes.
If the AI techniques succeed in producing a competent flexible
software system, we feel that ongoing personnel savings will
offset even large one-time hardware costs.

While the major model system is being built, we will naturally
implement as improvements whatever parts of the logic are
reasonable and feasible on the existing hardware. This is not
difficult to imagine, because the current system is somewhat
distributed already. It is through this means that we would
expect to identify and hopefully to achieve cost savings and
quality improvements. We assume that the major advances would
come through implementation of the full new system. These should
be calculated ahead of time. If the savings and improvements are
"there", the project will have been successful and the system will
be implemented as a whole at UMC and elsewhere.


Concepts to be included

There are certain general concepts which are suffused throughout
all elements of laboratory practice. These will necessarily be
incorporated in all phases of the proposed development.

These concepts include the following:

1. Statistical significance of testing, including
sensitivity and specificity of tests. This orientation
is inherent in lab work. Recent reports (Casscells, Schoenberger, and
Graboys, 1978; Ransohoff and Feinstein, 1978) indicate that it
is not well understood by the clinical users of laboratory services.

2. Related to this idea is the concept of normal, which
is very much dependent upon each particular laboratory,
and even upon specific methodologies. The knowledge of normal
ranges regarding the methodology and regarding age, sex, race,
and special circumstances of the test population must be
firmly associated in the system with each test specification.
The system must be able to defend its interpretations, and
hence to inform the user of the laboratory's assumptions
and adjustments to methodology.

3. The concept that automatic error detection is the
essential first step before interpretation of results is
attempted, and that the attempt at error detection must be
vigorous. With the present systems we are able by careful
after-the-fact daily checking to recognize and correct errors in
data which have passed through the computer checks and have actually
been reported to the patient's chart. Two and one half percent of
results are in error. Of these, 0.5% (in retrospect) actually represent


true technician or technologist methodological errors.
The remainder are a very mixed bag of clerical and
administrative errors. Our performance (which is probably
good compared with many wholly manual or semi-automated labs)
is the result of incorporating extensive computer editing
of the data. We long ago, for example, incorporated self-check
digit identification for patient and specimen numbers,
since we had shown that this category alone accounted for
half the errors detected by an earlier system (Lindberg,
Schroeder, Rowland, Saathoff, 1969).

Additional empirical methods of pattern recognition
have been developed for error detection, and will be
incorporated in the proposed system. These include analysis
of electrolyte patterns, creatinine and others (Lindberg,
1968).

The current daily Abnormal Value Rounds in the laboratories
will provide an ideal work setting for the model development
and testing. Presently lab reports are transmitted by and
reviewed by the several computer systems. Special cases,
according to adaptive algorithms, are selected by the systems
for review daily by the chairman of the Department of Pathology,
Dr. Townsend, and his residents and staff. They currently
accept or reject the computer judgments based on their own
internalized judgments and upon additional data about the
patients which is obtained by going to see the patient and/or
the chart. It is this logic which should be represented in
the new programs.


4. Multi-step testing is a practice which has been common
to labs for decades. The logic is not always made explicit
to the user, and we feel there is an advantage in doing so.
The classic example is the serological test for syphilis.
Formerly, laboratories did a VDRL (for sensitivity), followed
in the positive cases by a Mazzini (for specificity). Currently
these have been replaced by the rapid plasma reagin test and
the fluorescent treponemal antibody test. The same practice
is followed (appropriately) with many clinical enzyme tests
such as CPK and LDH, their kinetic counterparts and their
iso-enzyme extensions. Even more dramatic is the multi-step
or branching tree logic which is used by coagulation
laboratories and the special immunology laboratories. The
questions to be addressed by the system include: What test
should be done first? What is available locally? What
subsequent test to do, dependent upon what initial results?
What statistical significance do the results have? What
further testing could be done? If this involves a remote
referral lab, how is the service obtained?

Essentially, this logic is quite subject matter dependent.
It is specific to the limited domains, but because of this,
also quite synonymous with expert behavior.
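
Concepts 1 and 4 above meet in the arithmetic of a sensitive screening test followed by a specific confirmatory test. The following is a minimal Python sketch of that arithmetic using Bayes' rule. The prevalence, sensitivity and specificity figures are illustrative assumptions, not measured values for any test named above, and the two tests are assumed to be conditionally independent given disease status.

```python
from typing import Tuple


def post_test_probability(prior: float, sensitivity: float,
                          specificity: float, positive: bool = True) -> float:
    """Probability of disease after one dichotomous test result (Bayes' rule)."""
    if positive:
        true_pos = sensitivity * prior
        false_pos = (1.0 - specificity) * (1.0 - prior)
        return true_pos / (true_pos + false_pos)
    false_neg = (1.0 - sensitivity) * prior
    true_neg = specificity * (1.0 - prior)
    return false_neg / (false_neg + true_neg)


def two_step(prior: float, screen: Tuple[float, float],
             confirm: Tuple[float, float]) -> Tuple[float, float]:
    """Probability of disease after a positive screen, then a positive confirmation.

    Each test is given as (sensitivity, specificity).
    """
    after_screen = post_test_probability(prior, *screen)
    after_confirm = post_test_probability(after_screen, *confirm)
    return after_screen, after_confirm


# Illustrative figures only: prevalence 1 in 1000, a perfectly sensitive
# screen with 95% specificity, then a confirmatory test with 90%
# sensitivity and 99% specificity.
after_screen, after_confirm = two_step(0.001, (1.0, 0.95), (0.90, 0.99))
```

Under these assumed figures a positive screen alone leaves the probability of disease at only about 2%, which is the kind of result the clinical users of laboratory services are reported to misjudge; the confirmatory step is what makes a positive report meaningful.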

III.D. Significance
The significance of a successful outcome would be:

1. Advances in basic knowledge representation techniques.

2. Formal and public representation of a major field of
medical expertise which will be of interest to all fields
of medicine, health care, and information science.

3. Advances in techniques for remote collaboration on
information system development. That is, we would be much
further along on knowing how to share rare computational
facilities and unique computer science competence with a
broader, perhaps even national, medical community.

4. Improved understanding of evaluation of advanced health
care technology.

The significance of a less than complete success would be
lessened. Undoubtedly some of the representation and testing would
be accomplished, since we will commence with the easiest part.
If one's success were limited to this, the results would be of
real importance but of interest primarily to laboratorians and
computer scientists. These are an important part of the audience,
but not the only ones we see for the complete system. The "downside
risk", in other words, is minimal.

III.E. Facilities available

The Health Care Technology Center can house the computer
component of the project at the University of Missouri-Columbia.
Space is available in a modern office building. The Center
provides library facilities, computer laboratory facilities,
telecommunication, etc. The Department of Pathology will be
providing access to the working laboratories as required. These
include Hematology, Chemistry, Microbiology, Clinical Microscopy,
Coagulation, Immunology and Anatomical Pathology services for
the University Hospital (440 beds), a similar arrangement for the
adjacent Harry S Truman Memorial Veterans Medical Center (480 beds),
the Mid-Missouri Mental Health Center (175 beds), and Rusk
Rehabilitation Center (100 beds). The combined laboratories
process 2,100,053 procedures a year.

Computer hardware per se includes 6 DEC LSI-11's; 3
M6800 systems; 2 DEC PDP-12's (tapes, disks, terminals); a DEC PDP-11/34;
an IBM System 7; and multiple direct connections to the University
Network IBM 370/158 and 370/168 (both to be replaced by Amdahl gear).

The members of the Health Care Technology Center include
45 faculty from 14 University departments in 6 schools of the
Columbia campus.

The professional staff of the Department of Pathology includes
29 faculty and 20 residents and fellows. Only a subset
of the faculty are planned as active members of this project team,
but all are interested in the success of the venture and all are
available as needed for help on specific knowledge areas within their
own subspecialties.

III.F. Collaborative arrangements

The system would be developed jointly with members of
Computer Science at Stanford and the Health Care Technology Center
at the University of Missouri-Columbia. Computer support for the
model system would be provided by the SUMEX computer facility. This
is an NIH supported national resource. Use of local computers at
UMC for data gathering, analysis, and test implementation would be provided
free of charge, except for minor maintenance charges for
HCTC equipment. Telecommunications for approved projects are provided
by the SUMEX contract with TYMNET and ARPANET. Access to Net nodes is
provided by UMC WATS lines. In addition, the project would budget
funds to provide for frequent travel between the two schools.

Results of the project are to be published.

Stanford University is viewed as the primary submitter of the
proposed program project, with the University of Missouri-Columbia
supporting the application and taking responsibility for the Laboratory
Expert Project. Doctor Feigenbaum is the Principal Investigator for
the program project. Doctor Lindberg is viewed as Director of the
Laboratory Project.


PROJECT 3: REFERENCES

Shortliffe, E. H., Axline, S. G., Buchanan, B. G., Merigan, T. C.,
and Cohen, S. N. "An Artificial Intelligence Program to
Advise Physicians regarding Antimicrobial Therapy".
Computers and Biomedical Research, 6 (1973): 1-17.

Weiss, S., Kulikowski, C. A., Safir, A. "Glaucoma Consultation by
Computer". Computers in Biology and Medicine, 8 (1978): 25-40.

Pauker, S. G., Gorry, G. A., Kassirer, J. P., Schwartz, W. B.
"Towards the Simulation of Clinical Cognition: Taking a
Present Illness by Computer". American Journal of Medicine, 60
(June 1976): 981-996.

Lawrence, S. V. "Internist: Computer Program Expressing Clinical
Experience and Judgment of a Master Internist Constitutes a Unique
Resource". Forum on Medicine (April 1978): 44-47.

Hicks, G. P., Evenson, M. A., Gieschen, M. M., Larson, F. C. "On Line
Data Acquisition in the Clinical Laboratory". Computers in
Biomedical Research, Vol. III (Stacey and Waxman). New York:
Academic Press, 1969, pp. 15-53.

Lindberg, D. A. B. "Collection, Evaluation and Transmission of
Hospital Laboratory Data". Proceedings 7th IBM Medical
Symposium (1965): White Plains, New York, IBM, 1965.

O'Kane, K. C., Haluska, E. A. "Perspectives in Clinical Computing".
In Advances in Computers, 16 (1977): Academic Press, 161.

Lezotte, D. C. "A Multivariate Laboratory Data Analysis System:
Introduction". Journal of Medical Systems, 1, No. 3 (1977): 293-98.

Grams, R. R. "Progress Toward a Second Generation Laboratory Information
System (LIS)". Journal of Medical Systems, 1, No. 3 (1977): 263-74.

Casscells, W., Schoenberger, A., Graboys, T. "Interpretation by
Physicians of Clinical Laboratory Results". New England
Journal of Medicine, 299, No. 18 (November 1978): 999-1001.

Ransohoff, D. F., Feinstein, A. R. "Problems of Spectrum and Bias in
Evaluating the Efficacy of Diagnostic Tests". New England
Journal of Medicine, 299, No. 17 (October 26, 1978): 926-30.

Lindberg, D. A. B., Schroeder, J. J., Jr., Rowland, L. R., Saathoff, J.
"Experience with a Computer Laboratory Data System". In
Strandjord, J. (ed), Multiple Laboratory Screening. Academic
Press, New York, 1969, 245-55.

Project 3

The undersigned agrees to accept responsibility for the
scientific and technical conduct of the research project
and for provision of required progress reports if a
grant is awarded as the result of this application.

Date Principal Investigator

 


IV. CORE RESEARCH

IV.A. Objectives of Research

The long term goal of artificial intelligence research at
the Heuristic Programming Project (HPP) is to understand and
build knowledge-based "intelligent agent" programs. Over the past
decade we have studied such systems in the context of scientific
and medical applications where human expertise for solving the
problems was evident and where the difficulty of the problem
seemed to lie just outside the boundaries of current AI methods.
Because of the complexity of the applications, a significant part
of the effort has been to make the expert knowledge of the
problem explicit and to represent it appropriately in a knowledge
base. This perspective has focussed attention on four areas for
research:

(1) Representation -- designing the symbolic structures for
modeling the knowledge about a problem. Presently this phase
is carried out by the system builders; we intend to codify
the knowledge used to make such decisions, both as an aid to
the system builders and ultimately to enable the programs
themselves to choose appropriate representations.

(2) Reasoning -- modeling the appropriate inference mechanisms
for a problem and building systems that incorporate those
models.

(3) Knowledge acquisition -- designing systems that acquire
knowledge by communication with human experts.

(4) Multiple uses of knowledge -- designing systems that use the
symbolic representation of the domain knowledge for
additional purposes such as consensus building (accommodating
conflicting advice from experts whose competence may be equal
but whose "styles" vary), tutoring of human students by
employing the knowledge base (both the information it
contains and the way it is organized), and explanation
(constructing a chain of rules which satisfactorily
rationalizes the system's behavior to an observer).

IV.B. Background and Rationale

Artificial intelligence research at the Heuristic
Programming Project has utilized medical and scientific problems
to focus the research effort. For many different applications
over the last decade this has led to a cycle of research as
follows:

1. Form a collaboration with a scientist to work on a specific
problem in a challenging and interesting area.

2. Propose a method for representing and manipulating the domain
knowledge. This involves acquiring both formal and informal
knowledge and developing a knowledge-based system that
reasons with that knowledge.

3. Test the system. In this phase the method is pushed to its
limits. The relationship between the design and the
performance of the system is used as the basis for future
development.

Both success and failure of a system can lead to further
research steps. When a system fails to solve a problem, the
seeds for further research can sometimes be found in the reasons
for failure. On the other hand, when a knowledge-based system is
successful, the desire to use it effectively uncovers a number of
additional needs. Thus, many of the topics of artificial
intelligence -- such as the ability of a program to acquire
knowledge, or to explain its reasoning, or to manage updates in a
knowledge base -- have grown out of programs that were at first
successful only at problem solving. From this experience has
come not only a set of approaches to building intelligent
systems, but also a broader understanding of what intelligent
systems should be like.

The following sections discuss the background information
about each of our major research areas. We will outline the
progress that has been made on this topic and identify the major
technological tools. Then in Section IV.C. we will discuss
our perception of the outstanding research issues and how we plan
to approach them.

IV.B.1. Representation


One of the trends in our work has been to develop general
purpose approaches for representing a broad range of knowledge in
a knowledge base. This is illustrated by the Unit Package that
has been developed for the MOLGEN project ([40], [53]) for
experiment planning in molecular genetics. In the figure below
are two units from a MOLGEN knowledge base. The first unit
represents the restriction enzyme EcoRI; the second unit
represents a problem-solving goal for an experiment.

 

NAME: ECORI
SITE-TYPE: STICKY-HEXA
3'-END: OH
5'-END: P
MODE: NON-PROCESSIVE
MOLWT: 28500
SUBSTRATE: DNA
RECOGNITION-SITE:
     1  2  3  4  5  6  7  8
        G  A  A  T  T  C
        C  T  T  A  A  G
    16 15 14 13 12 11 10  9

NAME: LAB-GOAL-1
STATE: A CULTURE with
         ORGANISMS = A BACTERIUM with
           EXOSOMES = A VECTOR with
             GENES = RAT-INSULIN
CONDS: (PURE? ORGANISMS CULTURE)

 

The usual way of using the Unit Package is to define
general knowledge before specific knowledge. For example,
general knowledge about enzymes, nucleases, and restriction enzymes
would be entered before the specific knowledge about a particular
restriction enzyme like EcoRI. The Unit Package is designed to
encourage the use of descriptions, such as the description of a
culture in the second unit above. These descriptions are used
for checking new information as it is entered and for pattern-
matching operations that are part of a reasoning step. Reference
[52] describes the Unit Package and compares it to other work on
representation.
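
The general-before-specific organization described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions and not the Unit Package itself: the class `Unit`, the function `matches`, and the choice of which slot lives on which general unit are all hypothetical.

```python
from typing import Any, Dict, Optional


class Unit:
    """A frame-like unit: named slots plus an optional more general parent."""

    def __init__(self, name: str, parent: Optional["Unit"] = None,
                 slots: Optional[Dict[str, Any]] = None) -> None:
        self.name = name
        self.parent = parent
        self.slots: Dict[str, Any] = dict(slots or {})

    def get(self, slot: str) -> Any:
        """Look up a slot locally, then in successively more general units."""
        unit: Optional[Unit] = self
        while unit is not None:
            if slot in unit.slots:
                return unit.slots[slot]
            unit = unit.parent
        raise KeyError(slot)

    def is_a(self, ancestor: "Unit") -> bool:
        """True if this unit is the given unit or one of its specializations."""
        unit: Optional[Unit] = self
        while unit is not None:
            if unit is ancestor:
                return True
            unit = unit.parent
        return False


def matches(unit: Unit, description: Dict[str, Any]) -> bool:
    """True if every slot named in the description has the required value."""
    for slot, wanted in description.items():
        try:
            if unit.get(slot) != wanted:
                return False
        except KeyError:
            return False
    return True


# General knowledge is entered first, specific knowledge last.
# Placing SUBSTRATE and MODE on the general units is an assumption
# made for the example.
enzyme = Unit("ENZYME")
nuclease = Unit("NUCLEASE", enzyme, {"SUBSTRATE": "DNA"})
restriction = Unit("RESTRICTION-ENZYME", nuclease, {"MODE": "NON-PROCESSIVE"})
ecori = Unit("ECORI", restriction, {"SITE-TYPE": "STICKY-HEXA", "MOLWT": 28500})
```

A description such as `{"SUBSTRATE": "DNA", "SITE-TYPE": "STICKY-HEXA"}` can then be checked against `ecori` with `matches`, using both its own slots and those inherited from the more general units.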

The examples above have illustrated the representation of
"object-centered" or "noun-like" knowledge. Every reasoning
program also contains a representation of the inferential
knowledge. In the first version of the DENDRAL program, this
kind of knowledge was represented as a program. This choice of
representation had the consequence that a chemist could not enter
new knowledge into the program (because he could not be presumed
to be an expert programmer). Also, since the program structures
were not understandable by the program itself, facilities for
explanation of DENDRAL’s reasoning had to be built into each part
of the program. In the MYCIN program [51], developed more
recently, the inferential knowledge was moved out of the program
and into a knowledge base represented as production rules. This
representation, because it was closer to the experts'
representation than DENDRAL code was, allowed us to develop
programs that could acquire rules from physicians. It also
allowed the system to generate its own explanations by examining
the rules it had used. Production rules illustrate many of the
themes which run through our work on representation.

(1) Explicitness -- Knowledge is encoded in a knowledge base and
not just in programs. (For example, production rules are
used to make inferential knowledge explicit.) The distinction
between knowledge being in a program or in a knowledge base
is a crucial one for our purposes. Information encoded as a
program can be initially coded, and run, more easily and
quickly. However, as the program grows, it becomes more and
more difficult to add new knowledge: its relationships to
all the other knowledge must be considered and programmed
explicitly. The latter method, storing knowledge in a
separate data structure, a "knowledge base", enables the
pieces of knowledge to be accessed and manipulated just like
data. While their use, their running, may be somewhat
slower, the system builder can now enter data in modular
fashion, without much concern for the rest of the items in
the knowledge base. He can give the system the knowledge it
needs to reason about its own knowledge base.

(2) Modularity -- Knowledge is encoded in independent "chunks" as
far as possible. (Production rules can be added or deleted
from a knowledge base to change its problem-solving
behavior.) The concepts chosen to represent the chunks of
knowledge are those which are natural and useful to a domain
expert. This is useful both if the expert is to input rules
directly, and if he is to be convinced by the system's
explanation of its behavior.

(3) Uniformity -- Knowledge is represented so that it can be
manipulated by general purpose programs. (Production rules
and frames are two of the uniform methods for which we have
general purpose processing routines.)
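
The three themes above can be made concrete with a minimal Python sketch in which rules are plain data, interpreted by one general purpose routine, and the record of rules fired doubles as an explanation. The rule contents are hypothetical examples and not clinical advice. For brevity the interpreter chains forward from the facts, which is a simplification and not MYCIN's backward-chaining control.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set, Tuple


@dataclass(frozen=True)
class Rule:
    name: str
    premises: FrozenSet[str]
    conclusion: str


def run(rules: List[Rule], facts: Set[str]) -> Tuple[Set[str], List[Rule]]:
    """Apply rules until nothing new is concluded.

    Returns the final set of facts and the rules that fired, in order.
    """
    known = set(facts)
    fired: List[Rule] = []
    changed = True
    while changed:
        changed = False
        for rule in rules:
            if rule.conclusion not in known and rule.premises <= known:
                known.add(rule.conclusion)
                fired.append(rule)
                changed = True
    return known, fired


def explain(fired: List[Rule]) -> List[str]:
    """Render the fired rules as text, because rules are data the system can read."""
    return [
        f"{r.name}: IF {' AND '.join(sorted(r.premises))} THEN {r.conclusion}"
        for r in fired
    ]


# Hypothetical rules, for illustration only.
rules = [
    Rule("R1", frozenset({"low hemoglobin"}), "anemia"),
    Rule("R2", frozenset({"anemia", "low MCV", "low ferritin"}),
         "iron deficiency anemia"),
]
facts = {"low hemoglobin", "low MCV", "low ferritin"}
known, fired = run(rules, facts)
```

Deleting `R2` from the list changes the behavior of the system without touching `run`, which is the modularity property; `explain(fired)` recapitulates the chain of rules behind a conclusion.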

Our perception of the outstanding research issues in
representation is discussed in Section IV.C.1. As can be
seen from the examples above, how knowledge is to be used is
important in determining how it should be represented. With more
uses for knowledge -- explanation, tutoring, problem-solving --
come more constraints on its representation.

IV.B.2. Reasoning

The first step in creating a problem-solving system is to
develop and test a method for reasoning. In the DENDRAL
program [11] for inferring chemical structures from mass
spectrometry data, the reasoning framework that we tested was
called the generate-and-test paradigm. This consisted of (1) an
exhaustive generator of all possible solutions (chemical
structures) and (2) a set of pruning rules which used the mass
spectrometry data to eliminate inconsistent answers. One of the
issues that became relevant in studying this reasoning framework
is the combination of possibly contradictory evidence. Data in
many problems are incomplete and errorful; there is seldom a
perfect match between an internal model and empirical data. Even
if DENDRAL had a perfect model of how mass spectrometry data
correspond to chemical structures, the data from any particular
run of a mass spectrometer are erroneous with respect to both
extraneous and missing data. In DENDRAL, an overall
domain-specific matching function was used which reflected a priori
probabilities of errors in the data. Recently we have reexamined
this problem in the context of the GA1 program [53] which solves
an analogous problem from molecular genetics.
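
The generate-and-test paradigm with an error-tolerant matching function can be sketched as follows. This is a toy under stated assumptions, not DENDRAL: the block masses, the fragmentation model in `predict`, and the penalty weights in `score` are all hypothetical stand-ins for domain knowledge.

```python
from itertools import permutations
from typing import Iterable, List, Set, Tuple

# Hypothetical building blocks and their masses.
MASS = {"A": 15, "B": 28, "C": 43}

Candidate = Tuple[str, ...]


def generate(blocks: Iterable[str]) -> List[Candidate]:
    """Exhaustive generator: every distinct ordering of the blocks."""
    return sorted(set(permutations(blocks)))


def predict(candidate: Candidate) -> Set[int]:
    """Toy fragmentation model: a peak at the mass of every prefix."""
    peaks, total = set(), 0
    for block in candidate:
        total += MASS[block]
        peaks.add(total)
    return peaks


def score(predicted: Set[int], observed: Set[int],
          miss_cost: float = 1.0, extra_cost: float = 0.5) -> float:
    """Matching function tolerant of missing and extraneous peaks.

    The two costs stand in for a priori error probabilities.
    """
    missing = len(predicted - observed)
    extraneous = len(observed - predicted)
    return -(miss_cost * missing + extra_cost * extraneous)


def generate_and_test(blocks: Iterable[str], observed: Set[int],
                      threshold: float) -> List[Tuple[Candidate, float]]:
    """Keep the candidates whose score meets the threshold, best first."""
    kept = []
    for candidate in generate(blocks):
        s = score(predict(candidate), observed)
        if s >= threshold:
            kept.append((candidate, s))
    return sorted(kept, key=lambda pair: -pair[1])


# The observed data lack one expected peak (86) and contain a spurious one (99).
survivors = generate_and_test("ABC", {15, 43, 99}, threshold=-2.0)
```

With these assumed data, a strict requirement of a perfect match would reject every candidate; the tolerant score retains the single ordering that explains the data up to one missing and one extraneous peak.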

For the MYCIN program we used backwards-chaining as a
reasoning framework. This method develops a line of reasoning by
chaining together MYCIN’s inference rules (production rules)
backwards from the goal of making the diagnosis towards the
available evidence. This particular reasoning framework has
proved especially convenient for developing computer explanations
of the program’s reasoning. To deal with imperfect evidence and
inexact rules of inference, a mathematical model of certainty
based on numeric "certainty factors" was developed. This
constitutes a model of "plausible reasoning". In order to test
the MYCIN approach in other domains, a domain independent
package, EMYCIN (for "Essential MYCIN") has been created and is
being utilized in other applications discussed elsewhere in this
proposal.
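
Backwards-chaining with certainty factors can be sketched as below. This is a minimal illustration, not MYCIN or EMYCIN: the rules are hypothetical, only positive certainty factors are handled, the combining function is the commonly cited form for two positive factors, the 0.2 cutoff is the threshold usually quoted for MYCIN premises, and rule sets are assumed to be free of cycles. A fact that no rule concludes is obtained by calling a supplied `ask` function.

```python
from typing import Callable, Dict, List, Optional, Tuple

# A rule is (premises, conclusion, certainty factor of the rule).
Rule = Tuple[List[str], str, float]


def combine(cf1: float, cf2: float) -> float:
    """Combine two positive certainty factors for the same conclusion."""
    return cf1 + cf2 * (1.0 - cf1)


def backchain(goal: str, rules: List[Rule], ask: Callable[[str], float],
              known: Optional[Dict[str, float]] = None) -> float:
    """Certainty of `goal`, chaining backwards from it toward the evidence."""
    if known is None:
        known = {}
    if goal in known:
        return known[goal]
    relevant = [rule for rule in rules if rule[1] == goal]
    if not relevant:
        # Nothing concludes this fact, so it must be supplied on demand.
        known[goal] = ask(goal)
        return known[goal]
    total = 0.0
    for premises, _, rule_cf in relevant:
        # A conjunction is only as certain as its weakest premise.
        premise_cf = min(backchain(p, rules, ask, known) for p in premises)
        if premise_cf > 0.2:
            total = combine(total, premise_cf * rule_cf)
    known[goal] = total
    return total


# Hypothetical rules and answers, for illustration only.
rules: List[Rule] = [
    (["gram-negative", "rod"], "enterobacteriaceae", 0.8),
    (["hospital-acquired"], "enterobacteriaceae", 0.4),
]
answers = {"gram-negative": 1.0, "rod": 0.9, "hospital-acquired": 1.0}
asked: List[str] = []


def ask(fact: str) -> float:
    asked.append(fact)
    return answers[fact]


cf = backchain("enterobacteriaceae", rules, ask)
```

Because each question arises from a specific rule being pursued toward a specific goal, the trace of goals and rules is also the raw material for an explanation of why a question was asked.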

When MYCIN is chaining back through its inference rules and
discovers a need for information that cannot be inferred, it
stops and asks for it. This approach is appropriate only when
there is a way of supplying data as needed by the reasoning
program. For some applications, such as signal interpretation,
it is better for the program to make use of whatever it knows,
because there is little chance that specific items of information
can be supplied on demand. Further limitations of a simple
backwards-chaining model are (1) it is unidirectional, hence
cannot mix top-down and bottom-up processing and (2) it is
exhaustive, hence less efficient than approaches that reason
hierarchically by working with abstractions.

An alternative reasoning model which does not have these
limitations is the "cooperating knowledge sources" model
developed for the HEARSAY-II [28] system and incorporated in our
AGE-I program. This model consists of (1) the "blackboard", a
global data structure which holds the system’s hypotheses, and
(2) a set of "knowledge sources" (KSs) which contain the
inference rules for the system. Because of gaps in the theory
and implementation of the individual KSs and noise in the data,
the KSs are individually incomplete and errorful. A version of
the "hypothesize and test" paradigm is used which emphasizes
cooperation (to help overcome incompleteness in both knowledge
and data) and cross-checking (to help correct errors). During
the hypothesize part of the cycle, a KS can add a hypothesis to
the blackboard; during the test part of the cycle, a KS can
change the rating of a hypothesis in the blackboard. This
process terminates when a consistent hypothesis is generated
satisfying the requirements of the overall solution or when
knowledge is exhausted. The power of the blackboard — over,
say, a uniform QA4 assertional net — is its structure: it is n-
dimensional, where the dimensions have some meaning (time, level
of abstractness, geographic location, etc.). Hence each rule can
know what part(s) of the blackboard to monitor, and each
hypothesis is carefully placed at a meaningful spot on the
blackboard. This is a simple but powerful type of analogic
modelling of the domain.
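
The blackboard cycle described above can be sketched in a few lines. This is an illustration under stated assumptions: the two levels, the two knowledge sources, and the ratings are invented for the example (a speech-like domain is used only because HEARSAY-II was a speech understanding system), and nothing here is taken from the HEARSAY-II or AGE-1 code.

```python
# Illustrative sketch: levels, knowledge sources, and ratings are invented.

class Blackboard:
    """Global structure holding hypotheses, indexed by a meaningful dimension."""

    def __init__(self):
        self.hypotheses = {}  # (level, name) -> rating

    def add(self, level, name, rating):
        self.hypotheses.setdefault((level, name), rating)

    def rerate(self, level, name, delta):
        self.hypotheses[(level, name)] += delta

    def at_level(self, level):
        """Return a copy of the hypotheses placed at one level."""
        return {n: r for (lv, n), r in self.hypotheses.items() if lv == level}


def ks_hypothesize_word(bb):
    """Hypothesize: propose word-level hypotheses from syllable-level ones."""
    syllables = bb.at_level("syllable")
    if "ti" in syllables and "ger" in syllables:
        bb.add("word", "tiger", 0.5)
    if "ti" in syllables:
        bb.add("word", "tidy", 0.5)


def ks_test_word(bb):
    """Test: cross-check each word against the syllables actually present."""
    expected = {"tiger": ["ti", "ger"], "tidy": ["ti", "dy"]}
    syllables = bb.at_level("syllable")
    for word in bb.at_level("word"):
        missing = [s for s in expected[word] if s not in syllables]
        bb.rerate("word", word, -0.3 if missing else 0.3)


bb = Blackboard()
bb.add("syllable", "ti", 0.9)
bb.add("syllable", "ger", 0.8)
for ks in (ks_hypothesize_word, ks_test_word):  # one hypothesize-and-test cycle
    ks(bb)
words = bb.at_level("word")
best = max(words, key=words.get)
print(best, words)
```

Each knowledge source looks only at the level it monitors, which is the point of giving the blackboard meaningful dimensions.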

Two research programs based on this paradigm have been
developed by our group [43]. One is the CRYSALIS program for
interpreting x-ray crystallography data and the other is a
military signal interpretation program. In these programs the
HEARSAY model was extended by (1) extending the blackboard to
allow for several independent hierarchical relationships among
data and hypotheses and (2) extending the control structure.

In each of the examples above, our study of reasoning
methods always starts in the context of a problem in a scientific
or medical domain. We then generalize the method and package it
for further testing in other domains. When a framework for
reasoning works well enough, research on other artificial
intelligence topics, such as explanation or knowledge
acquisition, often follows. Our perception of open research
issues in reasoning methods is discussed in Section IV.C.2.

IV.B.3. Knowledge Acquisition and Management

 

One characteristic of the domain problems we have studied
is their requirement for a substantial amount of domain
expertise. Goldstein addressed this point in [26]:

Today there has been a shift in
paradigm. The fundamental problem of
understanding intelligence is not the
identification of a few powerful techniques,
but rather the question of how to represent
large amounts of knowledge in a fashion that
permits their effective use and interaction.
This shift is based on a decade of experience
with programs that relied on uniform search
or logistic techniques that proved to be
hopelessly inefficient when faced with
complex problems in large knowledge spaces.

 

The relevant problem solving knowledge includes much formal
and informal expertise of the domain expert; it also includes
many mundane facts and figures that make up the elementary
knowledge of the domain. Before a computer system can solve
problems in the domain, this information must be transferred from
the expert to the computer.

Over the last decade, there has been some encouraging
progress along this dimension. In DENDRAL, the rules of
inference about mass spectrometry had to be put in machine form,
but knowledge acquisition by the program from the chemist was
beyond our technology. Knowledge was added by a painstaking
process in which a computer scientist together with a chemist
learned each other’s terminology and then wrote down the chemical
rules for the simplest kinds of chemical compounds. Then the
computer scientist entered the rules into the computer and tested
them and reported the results back to the chemist. The reward
for this effort over several years was a program with expert-
level performance.

It is interesting to compare the knowledge acquisition
effort of the DENDRAL program with that of a more recent program,
PUFF, the system for diagnosing pulmonary function disorder.
In contrast with DENDRAL, PUFF was created in less than 50 hours
of interaction with experts at PMC and with less than 10 man-
weeks of effort by the knowledge engineers. Part of this
tremendous difference in development time is due to the fact that
the domain of pulmonary function is much simpler than mass
spectrometry. However, the main reason that the development was
so rapid is that PUFF was built with the aid of an interactive
knowledge engineering tool, EMYCIN. When knowledge engineers at
the Heuristic Programming Project started the PUFF project, they
already had a reasoning framework in which to fit the problem and
an "English-like" language for expressing the diagnostic rules.
The facilities that make EMYCIN such a powerful tool are the
direct result of the core research over the last five years on
the MYCIN program.

Another dimension of progress closely related to knowledge
acquisition is knowledge management, that is, management of the
global structure of a knowledge base. A knowledge base is more
than a set of isolated facts: its elements are related to one
another. In the DENDRAL program, all of the knowledge was
represented as programs and LISP data structures. If changing one
part of the program meant that another part had to be changed as
well, the programmer had to know that. As programs or knowledge
bases get large, this kind of effort becomes substantial. A
system becomes too large to maintain when no one can remember all
of the interactions and every change introduces bugs.
TEIRESIAS [15] extends the idea (developed initially in automatic
programming research) that a system can aid substantially in
identifying sources of errors and can take on some of the
responsibility for making changes.

Research issues in knowledge acquisition and management are
discussed in Section IV.C.3.

IV.C. Methods of Procedure

We are interested in exploring the effects of new ideas
about knowledge based programming on a variety of systems to
effectively test the generality of these ideas. Each of the
topics in the core research area will be developed in the context
of more than one example program (see discussions of Projects
1-3).

The expert systems developed at the Heuristic Programming
Project over the last decade can be used as tools for the
development of the core research topics. Each of the biomedical
domains has particular aspects that can be utilized in this work:
the MOLGEN program for molecular genetics research has methods
for representing experiment planning, the MYCIN program for
infectious disease diagnosis and therapy has a well-developed rule
set, the PUFF program for pulmonary function test interpretation
has a small rule set, and the VM program for interpreting
physiological measurements from the Intensive Care Unit has a
knowledge base that emphasizes knowledge that changes over time.

IV.C.1. Representation

In Section IV.B.1. we traced our work from specialized
representations as in the DENDRAL program to representations of
more general applicability, such as our production rule and
frame methodology. Today's representation systems, even the
"general" ones, do not solve all of the problems that we are
encountering in our research. In most science, methods which are
general are also weak. There seems always to be a need to tailor
aspects of a representation to particular problems. The
following representation issues stand out in our work:

Time-based knowledge

Several problems which we are working on involve situations
that evolve over time. In the Ventilator Management (VM) program
[21], time enters as instrument data that varies over time. The
program must correctly track the stages of treatment on the
treatment machines. In the RX program [5] for reasoning from
time-based clinical data bases, statements about disease and
treatment of patients need to be adequately quantified over time.
In the MYCIN [51] work, we want the system to be able to resume a
consultation session about a patient and appropriately update new
knowledge about the patient as treatment progresses. In the
MOLGEN project [48], the experiment planning program must plan a
sequence of steps. It must predict how the laboratory objects
will be changed over time as the manipulations proceed. The
basic issues common to these projects are (1) time-specified
reference to objects and (2) tracking causal changes on objects
over time. While these problems do not seem conceptually
difficult, they do require extensions to the representational
tools which we have available.
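
The two issues named above, time-specified reference and tracking of causal changes, can be illustrated with a small sketch. Everything in it is hypothetical: the class, the attribute name, the time units (minutes), and the recorded events are invented for the example and do not describe the VM, RX, MYCIN, or MOLGEN representations.

```python
# Illustrative sketch; the class, attribute names, and readings are hypothetical.

class TimedObject:
    """An object whose attribute values are recorded with the time they took effect."""

    def __init__(self, name):
        self.name = name
        self.history = {}  # attribute -> list of (time, value, cause), time-ordered

    def record(self, attribute, time, value, cause=""):
        entries = self.history.setdefault(attribute, [])
        entries.append((time, value, cause))
        entries.sort(key=lambda entry: entry[0])

    def value_at(self, attribute, time):
        """Time-specified reference: the value in effect at `time`, if any."""
        current = None
        for start, value, _cause in self.history.get(attribute, []):
            if start <= time:
                current = value
        return current

    def causes(self, attribute):
        """Causal tracking: what changed the attribute, in time order."""
        return [(t, c) for t, _v, c in self.history.get(attribute, []) if c]


patient = TimedObject("patient")
patient.record("ventilation_mode", 0, "controlled", "admitted to ICU")
patient.record("ventilation_mode", 90, "assisted", "began weaning")
patient.record("ventilation_mode", 240, "t_piece", "weaning continued")

print(patient.value_at("ventilation_mode", 120))
print(patient.causes("ventilation_mode"))
```

A statement about the object then always carries a time argument, so that a question such as "what was the mode two hours after admission" has a definite answer.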

Grain Size in Complex Systems

Among the virtues of production rules (see footnote 6) are (1) their
modularity allows easy addition and modification of inferential
knowledge and (2) they can be written in such a way that their
grain size seems appropriate for explanation systems. As we move
toward hierarchical reasoning methods the grain size of
individual production rules seems too small for coherent
explanations. Just as the reasoning methods work with
abstractions to reduce the combinatorics, explanations of this
should also be abstract.

At present, the problem of factoring knowledge is an opaque
art. When a frame-structured representation is used, a knowledge
engineer makes decisions about what facts to group together.
This decision takes into account indexing during problem solving
and the interactions among items in the knowledge base. In
hierarchical reasoning methods knowledge is viewed with a varying
grain size; it starts with an abstract conceptualization at the
beginning of problem solving and moves toward finer detail as the
solution proceeds. Although we have some understanding of how to
organize a body of knowledge hierarchically, much work remains to
be done to make the best use of that organization during
knowledge acquisition and problem solving.

Matching representation methods to problems

In our current systems, a knowledge engineer must learn the
particulars about a problem and then pick or develop an
appropriate representation. We would like to extend current AI
ideas in the design of a system which takes more responsibility
for choice of representation. Such a system will select or
modify its representations combining the knowledge of the limits
and advantages of representations with the knowledge of its own
needs.

IV.C.2. Reasoning

In Section IV.B.2. we traced our research on methods of
reasoning from the Generate-and-Test paradigm (DENDRAL, GA1), to
backwards chaining (MYCIN, EMYCIN, PUFF), to the cooperative
knowledge sources model (CRYSALIS, HASP, AGE-1). In this section
we discuss core issues related to these reasoning models as well
as some ideas for new models.

Incomplete Reasoning

 

Footnote 6: See [16] for a discussion of different ways of using this
formalism.

One of the themes in all of our methods of reasoning is the
treatment of inexact and incomplete knowledge. One of the
difficulties which we have perceived in MYCIN’s simple CF model
is that the representation is inadequate for discriminating
between (1) absence of evidence and (2) evidence of absence.
This example illustrates how the needs of the reasoning program
have to influence the fundamental representations used in the
system.

Reasoning with Abstractions

 

The availability of the Unit Package [52] has broadened our
capabilities for representing abstractions. For example, an
organism can be variously described as "a bacterium", "E. coli
K-12", "a bacterium that is gram-positive", or even "a bacterium
with a vector which has the rat-insulin gene". A reasoning
program can use the descriptions available in the Unit Package as
abstractions in its reasoning process. We are currently using
this idea in the MOLGEN project for reasoning about experiment
planning.

Orthogonal Planning

One of the themes in our representation work is to make
knowledge explicit for general processing. We have carried this
theme into an experimental framework for reasoning being
developed currently in the MOLGEN project. The idea is to make
the reasoning operations, which are carried out by a planner,
explicit in the knowledge base. These operators then implicitly
define an abstract "planning space". Our hope is that this will
provide a computer with a planning method more powerful and
flexible than previous hierarchical planning methods. The
feasibility of this approach is currently being tested.

Matching Reasoning Methods to Problems

 

One of our long term goals in developing and understanding
reasoning methods is to develop a theory for matching reasoning
methods to problems. Such a program would combine knowledge of
the limitations of available reasoning frameworks with the needs
of an application to aid in the design of a knowledge based
system. We have started on this problem with the research of the
AGE project within the HPP.

IV.C.3. Knowledge Acquisition and Management

In Section IV.B.3., we traced our work on knowledge
acquisition from the DENDRAL program, where knowledge was
acquired by a knowledge engineer and then programmed into the
system, to the PUFF example where the EMYCIN package greatly
accelerated the creation of a consultation system for pulmonary
function diagnosis.

Three Phases of Knowledge Acquisition

As a result of our recent experiences with the SACON
program [3], we have found it useful to characterize the
knowledge acquisition process as occurring in three distinct
phases. We have done the most research on the third phase and
plan to work our way towards the first phase.

(1) Framework Identification. The first phase corresponds to
making initial decisions about the typical advice the
consultant will give and the major reasoning steps the
consultant will use.

(2) Acquisition of Fundamental Concepts. This is followed by
an extended period of defining parameters and objects. These
objects form the fundamental vocabulary of the domain. Using
this initial domain vocabulary, a substantial portion of the
rule base is developed. This process captures enough domain
expertise to allow the consultation system to give advice on
the large number of common cases.

(3) Acquisition in a Well-Developed Knowledge Base. In the
final phase, further interactions with the expert tend to
refine and adjust the established rule base, primarily to
handle more obscure or complicated cases. In this phase, the
system can draw on examples from the knowledge base to guide
the acquisition process.

 

Previous work on the TEIRESIAS program [15], which explored
one possible method for handling the "final phase", will provide
the basis for our research in knowledge acquisition. This phase
of the acquisition task utilizes the large body of knowledge to
set the appropriate context for understanding new facts.

Consistency

Developing an understanding of the automatic management of
knowledge during and after its acquisition is an important aspect
of our research aims. The knowledge base consists of the
totality of concepts and relations between concepts that have
been presented to the program. We will investigate methods for
determining the consistency of the aggregate knowledge base.

The quality of the knowledge base is improved through
experimentation. Cases are run (for medical domains) by
selecting a diverse set of patients and comparing the results to
the conclusions of our expert. When the results don’t match, the
knowledge base must be updated to account for the differences. Two
operations are important for this process: (1) determining the
piece or pieces of knowledge that must be changed and (2)
determining that changing the knowledge to correct the results on
one patient will not produce incorrect results when applied to
another patient.
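
The second operation above amounts to rerunning the stored cases after every proposed change. A minimal sketch of such a check follows; the toy rule format, the three cases, and the expert's conclusions are all invented for illustration and are far simpler than a real consultation system.

```python
# Illustrative sketch; the toy rule format, the cases, and the expert's
# conclusions are all invented for the example.

def conclude(rules, findings):
    """Return the first conclusion whose required findings are all present."""
    for required, conclusion in rules:
        if required <= findings:
            return conclusion
    return "no conclusion"


def mismatches(rules, cases):
    """Names of stored cases where the program disagrees with the expert."""
    return [name for name, findings, expert in cases
            if conclude(rules, findings) != expert]


def safe_to_adopt(old_rules, new_rules, cases):
    """Accept a change only if it fixes something and breaks nothing."""
    before = set(mismatches(old_rules, cases))
    after = set(mismatches(new_rules, cases))
    return after < before  # strictly fewer mismatches, none of them new


CASES = [
    ("patient-1", {"fever", "stiff_neck"}, "meningitis"),
    ("patient-2", {"fever", "cough"}, "pneumonia"),
    ("patient-3", {"fever"}, "no conclusion"),
]
OLD = [({"fever", "stiff_neck"}, "meningitis")]
TOO_BROAD = OLD + [({"fever"}, "pneumonia")]         # fixes patient-2, breaks patient-3
NARROW = OLD + [({"fever", "cough"}, "pneumonia")]   # fixes patient-2 only

print(mismatches(OLD, CASES))
print(safe_to_adopt(OLD, TOO_BROAD, CASES), safe_to_adopt(OLD, NARROW, CASES))
```

The over-general rule corrects one patient while spoiling another, which is exactly the situation the check is meant to catch.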

Another possibility is to identify and, in effect, live
with inconsistency, just as people apparently do. Predominantly
rational behavior may be evinced by a system which does not
satisfy consistency requirements. The key test is whether the
elimination of any "inconsistent" rule makes the system behave
better or worse in the long run. This is closely tied to
consensus-formation, as discussed in the next section.

IV.C.4. Multiple Uses of a Knowledge Base

 

We are exploring many additional uses of the knowledge base
beyond the performance aspects for which we acquired the
knowledge. Three areas are of interest: using the knowledge for
explanation of the reasoning steps of the program, using the
knowledge for intelligent teaching about the domain, and using
the knowledge base as a vehicle for building consensus among
experts.

Explanation

The use of explicit inference rules in a knowledge base has
made it possible to generate an explanation of the program’s
reasoning steps. While this has been achieved in the "backwards
chaining" reasoning model, it is more difficult in the reasoning
methods which reason hierarchically. We will examine methods for
modifying the level of explanation based on the abstractions used
by the program and a model of the user.

Tutoring

The act of explaining the knowledge has led to the problem
of using the knowledge base for tutoring purposes. Our initial
experiment with this in the MYCIN framework [12] demonstrates the
potential educational value of this use of the knowledge base.
Under another proposal (pending to ONR & ARPA) we will be
exploring strategies for presenting the contents of a knowledge
base represented as a set of rules. Here we propose to extend
those methods for relating to the user the contents of knowledge
bases stored in other representations.

Consensus Building

We propose to investigate approaches for building consensus
among experts. Because the strength of consultation programs
will in large part lie with their ability to pool knowledge from
several sources, it is important to recognize apparent
differences of opinion among experts and to assist, when
possible, with arriving at a consensus. This represents another
version of the consistency checking problem: comparing the
ramifications of multiple versions of knowledge and providing the
capability to guide an interaction in which such differences are
"ironed out". Of course there may be times when both versions of
the knowledge may need to be stored and appropriately flagged so
that users can select which experts’ opinion they will follow
during a consultation. The experts may wish to select a style of
reasoning (e.g., empirical vs theoretical), rather than a
particular individual’s set of rules. Ultimately, the system
itself may be able to choose from differing advice in its
knowledge base.

All of these areas require some augmentation to the
knowledge base to provide the causal reasoning steps to which
the knowledge is tied. This allows a program to explain why a
particular rule was written in addition to telling how the rule
was used to make a particular conclusion. Similar needs have
been shown in the use of a rule base for tutoring and for
determining consensus among experts [37]. Often, a rule will be
put into the system cast in a much more specific form than that
to which the knowledge truly applies. One task to investigate is
how to generalize to just the proper level. More complex still
are the subtle changes that accompany a rule as it is generalized
(e.g., changing certainty factors).

IV.D. Significance

The significance of this work is twofold:

1. Understanding how to represent inexact and incomplete
knowledge symbolically so that a system can perform complex
intelligent processes -- like diagnosis and explanation.
This work expands the boundaries of what we understand how to
do with computers.

2. Investigating the fundamental questions that underlie the
development of domain-independent tools of AI discussed
elsewhere in this proposal.

One of our ultimate goals is to understand the techniques
employed in building such programs. It has always been difficult
to determine if a particular problem-solving method used in a
particular knowledge-based program is domain-specific or whether
it can generalize easily to other domains. In current knowledge-
based programs, the domain knowledge and the manipulation of it
using AI techniques are often so intertwined that it is difficult
to uncouple them to make a program useful for another domain.
This long range goal, then, is to isolate AI techniques that are
general, to determine the conditions for their use, and to build up a
knowledge base about AI techniques themselves. We will carry out
our research with this question in mind: what are the criteria
determining whether a particular problem-solving framework and
representation system is suitable for a particular application?

V. FACILITIES AVAILABLE

V.A. Hardware

All computing work will be carried out initially on the
SUMEX facility, a dual processor DEC KI-10 system running TENEX.
The system is located at Stanford, but is supported by NIH under
grant RR-9785 as a national resource for the study of
applications of artificial intelligence to problems in biology
and medicine. It has available a wide variety of advanced
programming languages (e.g., INTERLISP, SAIL), and support
programs (e.g., text editors), as well as powerful file handling
and storage management capabilities. Resources available at no
cost to this program include CPU usage and disk storage, while
access is via local dial-up lines and three networks (TYMNET,
TELENET, and ARPANET).

Within the next 18 months the SUMEX installation is also
scheduled to receive a PDP-20/208 system that will be interfaced
with the currently existing PDP-10. The new machine is intended
for service-related applications of artificial intelligence to
medicine, and some of our programs, once operational, would most
appropriately be run on this machine. The machine will be used
by other projects, however, and may occasionally be scheduled for
sole use by one of these. Thus SUMEX can make no commitment to
provide scheduled service to medical personnel wishing to use the
programs routinely. The PDP-20/208 hence will function as a
prototype for the kind of dedicated small machine that may
eventually operate in the clinic.

V.B. Software and Personnel

Our proposal is to build on the knowledge representation
and control techniques developed during work on the MYCIN,
MOLGEN, PUFF, and AGE systems in the Heuristic Programming
Project. New programs and data structures will, of course, be
required. Starting with existing software packages, however, is
a considerable advantage over developing the software, and the
design experience, de novo. The base language will continue to
be INTERLISP.

In addition to the computing power and the large collection
of existing software, access to the SUMEX system also offers the
benefit of being a part of the SUMEX-AIM community. The SUMEX
user community includes a wide range of researchers in artificial
intelligence united by a number of common interests. We have
found our interchanges with them in the past to be very useful,
and expect this to continue.

VI. COLLABORATIVE ARRANGEMENTS

Formal collaboration with Dr. Lindberg’s group at the
University of Missouri is the natural result of many years of
informal exchange. The formal arrangement between the two
institutions is that Dr. Lindberg’s project will be funded as a
subcontract from Stanford, with budget as indicated in the budget
section.

There is a long history of successful collaboration between
the Stanford Medical School and the Computer Science Department.
The SUMEX Computer Facility is a physical demonstration of this
collaboration, while the large number of interdisciplinary
research publications is more evidence. In part, this is due to
the physical proximity of the two groups; but more importantly,
it is due to common interests and common goals. The SUMEX
facility itself has removed many of the communication barriers
which often halt interdisciplinary research.

VII. PRINCIPAL INVESTIGATOR ASSURANCE

The undersigned agrees to accept responsibility for the
scientific and technical conduct of the research project and for
provision of required progress reports if a grant is awarded as
the result of this application.

[month illegible] 30, 1979              [signature]

Date                                    Principal Investigator

 

 

VIII. APPENDICES

VIII.A. APPENDIX A

Annotated MYCIN Typescript

In the following pages we have included many detailed examples of the MYCIN
program in operation. These exemplify both the accomplishments and the limitations
of the work we have done so far. Although we are not proposing expansion of the
program’s infectious disease knowledge at this time, these examples should help
illustrate the kinds of capabilities that we intend to develop in a system for
oncology protocol management.

The examples in this appendix include the following:

Section I - A sample production rule, translated into English.

Section II - Instructions printed for new users if they request assistance when
trying MYCIN for the first time.

Section III - Free-text case summary that may be entered by a physician for
purposes of case identification in the future.

Section IV - Detailed example of a consultation session for a patient with
meningitis; the WHY and HOW commands of the reasoning-status checker (RSC) are also
demonstrated.

Section V - Interactive session with the general question answerer (GQA)
regarding the consultation session in Section IV.

Section VI - Example of MYCIN’s ability to assist with antibiotic dosage
modification in renal failure patients; note that the program can also explain its
decisions at this specialized task.

Section VII - Example of a graphical option we have developed which permits
interested physicians to display a chart estimating the steady state blood levels of
an antibiotic at a variety of regimens for modified dose or dosing interval.

Section VIII - Example of a subsystem of MYCIN in which the user can circumvent
much of the extensive consultation session demonstrated in Section IV. If a
physician is relatively certain of the infection and organisms to be treated, he may
specify these as shown and MYCIN will simply assist with therapy selection.

Section IX - Example of MYCIN’s ability to rerun previously stored patients and
to interact with an expert when a problem in performance is identified. Note that
MYCIN and the expert have a "discussion" in which a missing rule is identified. The
physician tells MYCIN the missing rule (in English) and the program translates it
into its internal LISP representation. The case is then run again to see if the
performance improves with the new rule in place.

I. Sample Rule with Additional Stored Information

RULE 200

[This rule applies to all cultures and suspected infections, and is tried in
order to find out about the organisms (other than those seen on cultures
or smears) which might be causing the infection]

If:   1) The infection which requires therapy is meningitis, and
      2) The patient does have evidence of serious skin or soft tissue
         infection, and
      3) Organisms were not seen on the stain of the culture, and
      4) The type of the infection is bacterial
Then: There is evidence that the organisms (other than those seen on
      cultures or smears) which might be causing the infection is
      staphylococcus-coag-pos (.75) streptococcus-group-a (.5)

Author: YU

Literature: G. Karalazin "Sickle-cell anemia - CLINICAL MANIFESTATIONS
            IN 100 PATIENTS" AmJMedSci 269:51 1975.
            E. Barret-Connor "Acute pulmonary disease and sickle-cell anemia"
            ARRD 104:159 Aug 1971.
            M. Robinson "Pneumococcal meningitis in sickle-cell anemia" NEJM
            274:1006 1966.
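
As a rough illustration of how a stored rule such as the one above might be translated into English, consider the sketch below. It is an assumption-laden toy: the dictionary format, the parameter names, and the phrase table are hypothetical stand-ins and do not reproduce MYCIN's internal LISP representation; only two of the four premises are shown, and the negated premise is omitted.

```python
# Illustrative sketch; the data format, parameter names, and phrase table are
# hypothetical stand-ins, not MYCIN's internal representation.

PHRASES = {
    "infection": "the infection which requires therapy",
    "infection_type": "the type of the infection",
}

RULE = {
    "premise": [("infection", "meningitis"), ("infection_type", "bacterial")],
    "conclude": [("staphylococcus-coag-pos", 0.75), ("streptococcus-group-a", 0.5)],
}


def to_english(rule):
    """Render the premise clauses and weighted conclusions as numbered text."""
    lines = []
    count = len(rule["premise"])
    for number, (parameter, value) in enumerate(rule["premise"], start=1):
        lead = "If:   " if number == 1 else "      "
        joiner = ", and" if number < count else ""
        phrase = PHRASES[parameter]
        sentence = phrase[0].upper() + phrase[1:]
        lines.append(f"{lead}{number}) {sentence} is {value}{joiner}")
    conclusions = " ".join(f"{name} ({cf})" for name, cf in rule["conclude"])
    lines.append(f"Then: {conclusions}")
    return "\n".join(lines)


english = to_english(RULE)
print(english)
```

Keeping the rule as data rather than as program text is what allows the same stored object to be both executed and displayed.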

II. Instructions for Users at the Start (Optional)

MYCIN 3-Jun-78 ...

Special options (type ? for help):
**

Instructions? (Y or N)
** YES

This is a computer program named MYCIN that has been designed to
advise you regarding an appropriate choice of infectious disease
therapy. It is understood that you have a patient who may have an
infection. Please answer the following questions, terminating each
response with RETURN. To correct typing errors, use the DELETE key to
delete single characters, <ctrl>W to delete a word, and <ctrl>C to
delete the whole line.

If you are not certain of your answer, you may modify the response
by inserting a certainty factor (a number from 1 to 10) in
parentheses after your response. Absolute certainty (10) is assumed
for every unmodified answer. It is likely that some of the following
questions can not be answered with certainty.

You may change an answer to a previous question in two ways.
If the program is waiting for a response from you (that is, has typed
"**"), enter CHANGE followed by the number(s) of the question(s)
whose answers will be altered. You may also change a previous
answer at any time (even when the program is not waiting for a
response from you) by typing <ctrl>F (Fix), which will cause the
program to interrupt its computation and ask what you want to change.
(If the response to <ctrl>F is not immediate, try typing the RETURN
key in addition.) Try to avoid going back because the process
requires reconsidering the patient from the beginning and therefore
may be slow.

Note that you may also enter UNK (for UNKnown) if you do not know
the answer to a question, ? if you wish to see a more precise
definition of the question or some examples of recognized responses,
?? if you want to see all recognized responses, the word RULE if
you would like to see the decision rule which has generated the
question being asked, the word WHY if you would like to see a more
detailed explanation of the question, or the letters QA if you would
like to interrupt the consultation in order to ask questions
regarding the decisions made so far in the consultation. If you are
ever puzzled about what options are available to you during a
consultation, enter the word HELP and a list of options will be
listed for you.

Sample Response [user input follows the "**"]
Does the patient have a risk factor for tuberculosis?
** ?
One or more of the following are considered risk factors for tb:
a) positive PPD (5IU), b) history of close contact with a
person having active tb, c) household member with a past
history of active tb, d) chest X-ray showing apical scarring,
e) granulomas seen on biopsy of any organ tissue.

expected responses are: YES NO
Enter HELP for user options.
** YES
SUMMARY:
(type ctrl-O to abort printout)
UNK     - answer not known
?       - Rephrases the question and gives examples of recognized
          responses
??      - prints a list of all recognized responses
RULE    - prints the current decision rule
QA      - program enters question-answering mode
CHANGE  - go back and re-request answer to question number
COMMENT - enter comments about any aspect of the system’s
          performance. Your comments will be forwarded to those in
          charge of the MYCIN program.
WHY     - gives high-level explanation of the current reasoning
          chain that provoked this question.
HOW     - explains HOW the system will achieve a goal referred to
          by number in a previous explanation.
EXPLAIN - provides a more detailed explanation of a previous
          answer given by a WHY command.
FORGET  - resets the explanation of the reasoning chain back to the
          lowest level, as if you never used the WHY/EXPLAIN commands.
STOP    - Halts the program without completing the consultation.
SAVE    - Halts the program, saving the current patient on a disk
          file, retrievable at a later date.
HELP    - Prints this list
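
The instructions above let a user qualify a response with a certainty factor from 1 to 10 in parentheses, with 10 assumed when none is given. A minimal sketch of reading such a response follows. The function name, the regular expression, and the rescaling to a 0-to-1 value are assumptions made for illustration; the typescript does not show how MYCIN itself parses or stores the number.

```python
# Illustrative sketch; the function name and the 0-to-1 scaling are assumptions.
import re

# A value, optionally followed by a whole number in parentheses.
_ANSWER = re.compile(r"^\s*(?P<value>[^()]+?)\s*(?:\(\s*(?P<cf>\d+)\s*\))?\s*$")


def parse_answer(text):
    """Split a response such as 'YES (8)' into its value and certainty."""
    match = _ANSWER.match(text)
    if match is None:
        raise ValueError(f"unrecognized response: {text!r}")
    # Absolute certainty (10) is assumed for every unmodified answer.
    cf = int(match.group("cf")) if match.group("cf") else 10
    if not 1 <= cf <= 10:
        raise ValueError("certainty must be a number from 1 to 10")
    return match.group("value").upper(), cf / 10.0


print(parse_answer("YES (8)"))
print(parse_answer("e.coli"))
```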

III. User’s Description of Patient

This will be patient 538

If you would care to enter a brief summary of PATIENT-538’s case, to be used
for easy future reference to this file, please do so now (type as many lines
as required, terminate with a control-Z):

** 34 yr old white male with severe burn. After 15 days
** of hospitalization pt became disoriented and febrile. On 16th
** hospital day LP performed. PMN 100, PMN% 99, Pro 280, Glu 5 (periph
** blood glu unk). WBC 1900 with 88% PMN and 5% bands. Pt had drug-
** induced aplastic anemia. Current med at time of LP: Cephalothin.
** Final org: Pseudomonas-aeruginosa. Correct Rx: Ideal would have been
** Carb + Tobra (or Genta). Same as pt 504 except gram stain of CSF
** omitted.
** ^Z


IV. Infectious Disease Consultation Session
[User’s responses follow double asterisks]

1) Patient’s name:
** PT538

2) Age:
** 34 YEARS

3) Sex:
** MAEL

4) Race:
** WHY

[i.e. WHY is it important to determine the race of PATIENT-538?]

[1.0] The race of a patient is one of the 5 standard parameters that identify
a patient.
This may be useful later in the consultation. For example, it is
relevant in determining the organisms (other than those seen on cultures
or smears) which might be causing the infection.

[back to question 4...]
** CAUCASIAN

Please supply one answer under each heading. If your response is not
understood, you will be prompted by the standard questioning method.
(enter ? for normal prompting, * for questions that are not applicable).

Please enter the results of positive cultures using the following table.
     site    culture   date        date       method
             number    collected   positive   of collection
                                              [urine,sputum only]
5) **DONE

Please enter the results of pending cultures in the following table.
     site    culture   date        stain      method of
             number    collected   examined   collection
                                              [urine,sputum only]
6)** CSF     1234      30-Jan-77   YES        *

     stain of   morph
     smear
     [type DONE if no organism was seen on the smear]
7) **DONE
     site    cul   date coll.   smear examined?   coll. method
8)**DONE

Please enter negative cultures in the following table.
     site    culture   date        method of
             number    collected   collection
                                   [urine,sputum only]
9) **DONE

10) Do you suspect Pt538 may have an infection at a site from which you
have not obtained culture specimens?
** NO

Please enter information on current antimicrobials.
     drug          route of         date
     name          administration   started
11)** CEPHALOTHIN  IV               23-Jan-77
12)**DONE

Please enter information on prior antimicrobials.
     drug   route of         date      date
     name   administration   started   stopped
13) **DONE

[Considering organisms (other than those isolated from cultures) which might
be present...]

14) Has Pt538 recently had symptoms of persistent headache or other
abnormal neurologic symptoms (dizziness, lethargy, etc.)?
** YES

15) Has Pt538 recently had objective evidence of abnormal neurologic signs
(nuchal rigidity, coma, seizures, etc.) documented by physician
observation or examination?
** YES

The CSF cultures will be considered to be associated with MENINGITIS.

16) Please give the date on which clinical evidence (symptoms, signs, or
laboratory tests) of the meningitis first appeared.
** 29-Jan-77 22:15

17) Does Pt538 have an abnormal chest X-ray?
** NO

18) Does Pt538 have one or more of the following risk factors for tb: a)
positive intermediate strength PPD, b) history of close contact with a
person with active tb, c) a household member with a past history of
active tb, d) apical scarring on chest xray, e) history of granulomas
on biopsy of liver, lymph node, or other organs?
** NO

19) Does Pt538 have any cutaneous lesions or rash on physical examination?

** NO

20) Does Pt538 have evidence of ocular nerve dysfunction?
** NO

21) Is Pt538 a compromised host (e.g. alcoholic, sickle-cell-disease,
immunosuppressed)?
** YES

22) Is Pt538 immunosuppressed (e.g. corticosteroid therapy, cytotoxic drug
therapy, radiation therapy, leukemia, lymphoma)?
** NO

23) For how many days has Pt538 had abnormal neurologic signs?
** 7.2 HOURS

24) For how many days has Pt538 had neurologic symptoms?
** 1 DAYS

25) Has Pt538 had recent evidence of tender or enlarged salivary glands?
** NO

26) Has Pt538 been exposed to any contagious disease recently (e.g.
meningococcal disease, mumps)?
** NO

Please enter CSF findings in the following table
     CSF    Percent   Protein   CSF       Peripheral
     WBC    PMN                 Glucose   Glucose
27)** 100   99        280       5         UNKNOWN

28) Do you know the results of a recent CBC of Pt538?
** YES

Please enter CBC results in the following table
     WBC         Percent   Percent
     in 1000’s   PMN       BANDS
29)** 1.9        UNKNOWN   UNKNOWN

30) Has Pt538 had any injury or insult to, or defect in the central-
nervous-system (excluding neurosurgery)?
** NO

31) Has Pt538 ever undergone any type of surgery?
** YES

32) Does Pt538 have a history of NEUROSURGERY?
** NO

33) Does Pt538 live in a crowded environment?
** NO

34) Do you suspect recent alcoholic history in Pt538?
** NO

35) Is meningitis a hospital-acquired infection?
** WHY

[i.e. WHY is it important to determine whether the infection was acquired
while the patient was hospitalized?]

[2.0] This will aid in determining the organisms (other than those seen on
cultures or smears) which might be causing the infection.
It has already been established that
[2.1] the infection which requires therapy is meningitis, and
[2.2] organisms were not seen on the stain of this pending csf culture,
and
[2.3] the type of the infection is bacterial, and
[2.4] neurosurgery is not one of the relevant items from the history
of pt538, and
[2.5] the age of pt538 is greater than 10 days
Therefore, if
[2.6] the infection was acquired while the patient was hospitalized
then
there is evidence that the organisms (other than those seen on
cultures or smears) which might be causing the infection is e.coli
(.75) staphylococcus-coag-pos (.3) pseudomonas-aeruginosa (.3)
klebsiella-pneumoniae (.5)
Author: Yu
[RULE545]

[back to question 35...]
** HOW 2.3

[i.e. HOW was it established that the type of the infection is bacterial?]

The following rules concluded about:
the likelihood that the type of the infection is bacterial

      Rule      Yes     No
[2.7] RULE148   (.19)
[2.8] RULE500   (.51)
[2.9] RULE501   (.95)
[3.0] RULE502   (.97)
[3.1] RULE526   (.98)
[3.2] RULE504   (.97)
[3.3] RULE524   (.98)

Thus, it has been established that the type of the infection is bacterial.

[back to question 35...]
** HOW 2.5

[i.e. HOW was it established that the age of pt538 is greater than 10 days?]
In answer to question 2 you said that the age of pt538 is 34 years
Thus, it has been established that the age of pt538 is greater than 10 days.

[back to question 35...]
** Y

36) Is Pt538’s clinical history consistent with EPIGLOTTITIS?
** NO

37) Is Pt538’s clinical history consistent with OTITIS-MEDIA?
** NO

38) Has Pt538 ever undergone splenectomy?
** NO

39) Is Pt538 a burn patient?
** WHY

[i.e. WHY is it important to determine whether PATIENT-538 has been seriously
burned?]

[4.0] This will aid in determining the organisms (other than those seen on
cultures or smears) which might be causing the infection.
It has already been established that
[4.1] the infection which requires therapy is meningitis, and
[4.2] organisms were not seen on the stain of this pending csf culture,
and
[4.3] the type of the infection is bacterial
Therefore, if
[4.4] pt538 has been seriously burned
then
there is suggestive evidence (.5) that pseudomonas-aeruginosa is one
of the organisms (other than those seen on cultures or smears)
which might be causing the infection
Author: Yu
[RULE578]
[back to question 39...]
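
The rule display above can be read as a list of premises plus a weighted conclusion. A minimal Python sketch of that reading follows. The class name, field names, and fact keys are illustrative assumptions made for this example; they are not MYCIN's internal representation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

Facts = Dict[str, object]  # hypothetical flat store of established facts


@dataclass
class Rule:
    name: str
    premises: List[Callable[[Facts], bool]]  # every premise must hold
    conclusion: str
    cf: float  # certainty factor attached to the conclusion

    def applies(self, facts: Facts) -> bool:
        """True when all premises are satisfied by the known facts."""
        return all(premise(facts) for premise in self.premises)


# RULE578 as printed in the transcript, transcribed by hand.
# The fact keys below are assumptions chosen for readability.
rule578 = Rule(
    name="RULE578",
    premises=[
        lambda f: f.get("infection") == "meningitis",
        lambda f: f.get("organisms_seen_on_stain") is False,
        lambda f: f.get("infection_type") == "bacterial",
        lambda f: f.get("seriously_burned") is True,
    ],
    conclusion="pseudomonas-aeruginosa might be causing the infection",
    cf=0.5,
)

facts = {
    "infection": "meningitis",
    "organisms_seen_on_stain": False,
    "infection_type": "bacterial",
    "seriously_burned": True,
}

if rule578.applies(facts):
    print(f"{rule578.name}: {rule578.conclusion} ({rule578.cf})")
```

Keeping each premise as a separate predicate mirrors the numbered clauses ([4.1] to [4.4]) that the WHY command prints, so an explanation facility could report which clauses are already established and which one prompted the current question.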
** WHY

[i.e. WHY is it important to determine the organisms (other than those seen
on cultures or smears) which might be causing the infection?]

[5.0] ... in order to determine the therapeutic regimen of PATIENT-538.

If
[5.1] information has been gathered about organisms isolated from the
patient, organisms noted on smears of the patient, negative
cultures of the patient, suspected infections without
microbiological evidence, current drugs of the patient and
prior drugs of the patient, and
[5.2] an attempt has been made to deduce the organisms which require
therapy, and
[5.3] You have given consideration to organisms (other than those
seen in cultures or smears) which might be present, and found
at least one organism which requires therapy.
then
determine the best therapy recommendation from among the drugs likely
to be effective against the significant organisms
[RULE092]

[back to question 39...]
** Y

I have found evidence (RULE545 RULE557) that treatment should cover for E.coli
causing meningitis.
I have found evidence (RULE545 RULE557 RULE578) that treatment should cover
for Pseudomonas-aeruginosa causing meningitis.
I have found evidence (RULE545 RULE557) that treatment should cover for
Klebsiella-pneumoniae causing meningitis.
I have found evidence (RULE507) that treatment should cover for Diplococcus-
pneumoniae causing meningitis.
40) Does Pt538 have focal neurological signs?
** UNKNOWN
[Determining which drugs are desirable for use against the E.coli...]
[Determining which drugs are desirable for use against the Pseudomonas-
aeruginosa...]
[Determining which drugs are desirable for use against the Klebsiella-
pneumoniae...]
[Determining which drugs are desirable for use against the Diplococcus-
pneumoniae...]

My therapy recommendations will be designed to treat for organisms
that either are very likely or, although less likely, would have a
significant effect on therapy selection if they were present. It is
important to cover for the following probable infection(s) and
associated organism(s):

INFECTION-1 is MENINGITIS

+ <ITEM-1> E.COLI [from clinical evidence only]
+ <ITEM-2> PSEUDOMONAS-AERUGINOSA [from clinical evidence only]
+ <ITEM-3> KLEBSIELLA-PNEUMONIAE [from clinical evidence only]
+ <ITEM-4> DIPLOCOCCUS-PNEUMONIAE [from clinical evidence only]
41) Does Pt538 have a clinically significant allergic reaction to any
antimicrobial agent?
** NO
[Considering AMPICILLIN-AND-GENTAMICIN for use against INFECTION-1...]
42) Do you have reason to suspect that Pt538 may have impaired renal
function?
** NO

43) Pt538’s weight in kilograms (or <number> POUNDS):
** 70

[REC-1] My preferred therapy recommendation is as follows:
In order to cover for Items <1 2 3 4>:
Give the following in combination:
1) AMPICILLIN
Dose: 3.5g (28.0 ml) q4h IV [calculated on basis of 50 mg/kg]

2) GENTAMICIN
Dose: 119 mg (3.0 ml, 80mg/2ml ampule) q8h IV [calculated on basis of
1.7 mg/kg] plus consider giving 5 mg q24h Intrathecal

Comments: Monitor serum concentrations

Since high concentrations of penicillins can inactivate
aminoglycosides, do not mix these two antibiotics in the same IV
bottle.
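
The doses in REC-1 follow from the per-kilogram figures quoted in square brackets. As a rough check, the arithmetic can be sketched in Python as below. The function names are hypothetical, and the rounding (nearest mg, nearest 0.1 ml) is an assumption chosen to match the printed figures, not a documented MYCIN behaviour.

```python
def weight_based_dose(mg_per_kg: float, weight_kg: float) -> int:
    """Dose in mg, rounded to the nearest whole milligram (assumed rounding)."""
    return round(mg_per_kg * weight_kg)


def volume_ml(dose_mg: float, mg_per_ml: float) -> float:
    """Volume to draw up, rounded to 0.1 ml (assumed rounding)."""
    return round(dose_mg / mg_per_ml, 1)


weight = 70.0                                       # kg, answer to question 43
gentamicin_mg = weight_based_dose(1.7, weight)      # 1.7 mg/kg
gentamicin_ml = volume_ml(gentamicin_mg, 80 / 2)    # 80 mg per 2 ml ampule
ampicillin_mg = weight_based_dose(50, weight)       # 50 mg/kg

print(gentamicin_mg, gentamicin_ml, ampicillin_mg)
```

For the 70 kg patient this reproduces the 119 mg (3.0 ml) gentamicin dose and the 3.5 g ampicillin dose shown in the recommendation.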

Do you wish to see the next choice therapy?
** YES

Which, if any, of the above drugs do you wish eliminated from consideration?
** AMPICILLIN

[Considering GENTAMICIN for use against INFECTION-1...]
[Considering PENICILLIN for use against INFECTION-1...]

[REC-2] Next best therapy recommendation:
In order to cover for Items <1 2 3>:
Give: GENTAMICIN
Dose: 119 mg (3.0 ml, 80mg/2ml ampule) q8h IV [calculated on basis of
1.7 mg/kg] plus consider giving 5 mg q24h Intrathecal
Comments: Monitor serum concentrations

In order to cover for Item <4>:
Give: PENICILLIN
Dose: 3.5 million units (7.0 ml) q4h IV [calculated on basis of 50000
units/kg]

RECOMMENDATION-2 is equivalent to RECOMMENDATION-1.

Would you like to see details of this comparison?
** YES
Looking at the coverage of the most likely items, we see that:

RECOMMENDATION-1 consists of 1 first choice drug (AMPICILLIN for ITEM-1
and ITEM-4) and 1 third choice drug (GENTAMICIN for ITEM-1, ITEM-2 and
ITEM-3)

RECOMMENDATION-2 consists of 1 first choice drug (PENICILLIN for ITEM-4)
and 1 third choice drug (GENTAMICIN for ITEM-2, ITEM-1 and ITEM-3)

ORGANISMS       RECOMMENDATION-1                   RECOMMENDATION-2
"most likely"   Drug -- Choice Ranking             Drug -- Choice Ranking
ITEM-1          AMPICILLIN-AND-GENTAMICIN -- 1st   GENTAMICIN -- 3rd
ITEM-2          GENTAMICIN -- 2nd                  GENTAMICIN -- 2nd
ITEM-3          GENTAMICIN -- 3rd                  GENTAMICIN -- 3rd
ITEM-4          AMPICILLIN -- 1st                  PENICILLIN -- 1st

(The desirability of a drug is defined to be its lowest ranking for the items
it covers.)
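
The parenthetical definition can be illustrated with a few lines of Python. This is a sketch only: it reads "lowest ranking" as the worst (numerically largest) choice ranking, an interpretation that agrees with the transcript calling GENTAMICIN a third choice drug. The dictionary layout and function name are hypothetical.

```python
# Choice rankings (1 = first choice) copied from the RECOMMENDATION-2
# column of the comparison table in the transcript.
rankings = {
    "GENTAMICIN": {"ITEM-1": 3, "ITEM-2": 2, "ITEM-3": 3},
    "PENICILLIN": {"ITEM-4": 1},
}


def desirability(drug: str) -> int:
    """Worst (numerically largest) ranking among the items the drug covers."""
    return max(rankings[drug].values())


for drug in rankings:
    print(drug, desirability(drug))
```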

Do you wish to see the next choice therapy?
** NO

V. Answers to User’s Questions about this Consultation and
About the Knowledge Base

Please ask any simple-English questions regarding the above
consultation session or the overall knowledge of the consultation
system. When discussing the consultation above, please refer to
organisms, cultures, etc. by their numeric names (eg. ORGANISM-1).
Questions which were asked during the consultation may be specified
by the numbers that preceded them above, (e.g., “WHY DID YOU ASK
QUESTION 17?”). Control-O will abort type-out, or “cancel” the current
question. Type HELP for a list of the options available. Enter the
word STOP when you have no further questions.

Type HELP for the list of question options.
** HELP

QA-OPTIONS:
(Type ctrl-O to abort printout)
ADVICE - prints suggestions for using the QA system effectively
QT - prints examples of question types recognized by the system
EQ <numbers> - prints an explanation of the specified
question(s) from the consultation, e.g., EQ 8 13
LP <object type> - prints list of attributes associated
with object type, e.g., LP CULTURE. (Object types are
patient, infection, culture, organism, and drug.)
PR <numbers> - prints specified rule(s), e.g., PR 373 49 87 2
EXPLTHER - prints a short description of the therapy program
REC - prompts user for alternative therapy RECommendation and
compares it to mycin’s recommendation
RX <infection> <organism identities> - prints treatment lists for
specified organisms causing the indicated infection, e.g.,
RX MENINGITIS ENTEROCOCCUS
JUSTIFICATION - when rules are printed out, include their
justifications
LITERATURE - when rules are printed out, include their literature
references
NOJUST - stop giving justification and/or literature with rules
TERSE - print rules in terse format
VERBOSE - turns off terse printing of rules
STOP - escape from QA-module
HELP - prints this list

** JUSTIFICATION
** LITERATURE
** WHY DID YOU TREAT E.COLI?

ITEM-1 is E.coli associated with meningitis. I decided to treat ITEM-1 on
the basis of clinical evidence only.

The following rules concluded about:
the likelihood that e.coli is one of the organisms (other than those
seen on cultures or smears) which might be causing the infection

RULE545 was used to conclude that e.coli is one of the organisms (other than
those seen on cultures or smears) which might be causing the infection
(.74). The last question asked before the conclusion was made was 35.

RULE557 was used to conclude that e.coli is one of the organisms (other than
those seen on cultures or smears) which might be causing the infection
(.93). The last question asked before the conclusion was made was 27.
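
The transcript reports one certainty factor per rule but does not print the combined value. As an illustration only, the sketch below merges the two figures with the combining function usually given for two positive MYCIN certainty factors, CF1 + CF2 * (1 - CF1). That the program applied exactly this function at this point is an assumption, and the function name is hypothetical.

```python
from functools import reduce


def combine_positive(cf1: float, cf2: float) -> float:
    """Combine two positive certainty factors: cf1 + cf2 * (1 - cf1)."""
    if not (0.0 <= cf1 <= 1.0 and 0.0 <= cf2 <= 1.0):
        raise ValueError("this sketch only handles factors in [0, 1]")
    return cf1 + cf2 * (1.0 - cf1)


# Certainty factors reported for e.coli by RULE545 and RULE557.
evidence = [0.74, 0.93]
combined = reduce(combine_positive, evidence)
print(round(combined, 4))
```

Under this function each additional supporting rule moves the total closer to 1 without exceeding it, and the order in which the rules fire does not change the result.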

The parameters most frequently used in these rules were:
the infection which requires therapy
whether a smear of the culture was examined
whether organisms were seen on the stain of the culture
the type of the infection

Would you like a summary of ALL the parameters used by these rules? ** YES
Parameter (rules with this parameter)

the white count from the patient’s peripheral CBC (in thousands) (557)
whether the results of a recent CBC of the patient are available (557)
whether the infection was acquired while the patient was hospitalized (545)
the age of the patient (545)

the relevant items from the history of the patient (545)

the type of the infection (545, 557)

whether organisms were seen on the stain of the culture (545, 557)

whether a smear of the culture was examined (545, 557)

the infection which requires therapy (545, 557)

** WHY DIDN’T YOU GIVE TETRACYCLINE FOR E.COLI IN REC-1?
Shall I describe the therapy algorithm for you?** YES
DESCRIPTION OF THERAPY SELECTION
[Type ctrl-O to abort printout]

MYCIN is designed to provide optimal therapy recommendations. The
program uses the following three steps:

The first step ranks the drugs for each organism that requires
therapy. Rules for doing this take into account in vitro (lab) sensitivities
and the patient’s response to prior and current therapies. In the absence of
other information, default lab sensitivities are used. Preference is given
to current therapies if certain criteria are satisfied.

Next, 1 or 2 drugs are selected to cover for the organisms which are
most likely to be causing the infection. As many of the "less likely"
organisms as possible are treated, but they are disregarded altogether if
3 or more drugs are necessary. Throughout, preference is given to
drugs that appear higher in the treatment lists. (These lists can be
examined via the RX option.)

Finally, the set of drugs is tested for patient factors that preclude
their administration, such as allergies, age, and renal function.
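
The three steps described above can be sketched as a small search over drug combinations. The Python below is a simplified illustration, not the program's algorithm: the treatment lists are hypothetical placeholders (not MYCIN's lists), the "less likely" organisms are ignored, and patient contraindications are applied as an up-front filter rather than as a final test.

```python
from itertools import combinations
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical treatment lists: organism -> drugs in order of preference.
TREATMENT_LISTS: Dict[str, List[str]] = {
    "e.coli": ["ampicillin", "gentamicin"],
    "pseudomonas": ["carbenicillin", "gentamicin"],
    "pneumococcus": ["penicillin", "ampicillin"],
}


def select_therapy(most_likely: List[str],
                   contraindicated: Set[str]) -> Optional[Tuple[str, ...]]:
    """Pick a 1- or 2-drug set that covers every most-likely organism."""
    # Step 1: ranked drugs per organism (the lists are already ranked here).
    ranked = {org: TREATMENT_LISTS[org] for org in most_likely}
    # Step 3, applied early for brevity: drop drugs the patient cannot take.
    candidates = sorted({d for drugs in ranked.values() for d in drugs}
                        - contraindicated)

    def covers(combo: Tuple[str, ...]) -> bool:
        return all(any(d in ranked[o] for d in combo) for o in most_likely)

    def cost(combo: Tuple[str, ...]) -> int:
        # Sum over organisms of the list position of the best drug chosen.
        return sum(min(ranked[o].index(d) for d in combo if d in ranked[o])
                   for o in most_likely)

    # Step 2: prefer fewer drugs, then drugs higher in the treatment lists.
    for size in (1, 2):
        covering = [c for c in combinations(candidates, size) if covers(c)]
        if covering:
            return min(covering, key=cost)
    return None  # 3 or more drugs would be needed


organisms = ["e.coli", "pseudomonas", "pneumococcus"]
print(select_therapy(organisms, contraindicated=set()))
print(select_therapy(organisms, contraindicated={"carbenicillin"}))
```

Trying single drugs before pairs reflects the stated preference for small regimens, and the cost function stands in for "preference is given to drugs that appear higher in the treatment lists."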
TETRACYCLINE was not prescribed for ITEM-1
TETRACYCLINE was not prescribed for ITEM-1 in RECOMMENDATION-1 because:
TETRACYCLINE was discounted for ITEM-1 because there is evidence that
this e.coli is not sensitive to it.

Would you like to see some details?** NO

** HOW DO YOU USE THE TYPE OF THE INFECTION?
The following rules use:
the type of the infection

154, 578, 574, 572, 570, 565, 564, 559, 557, 553, 551, 549, 547, 545, 543, 541,
539, 537, 535, 533, 531, 529, 523, 520, 519, 517, 515, 513, 511, 507, 300,
289, 288, 287, 280, 174, 167

The parameter most frequently concluded by these rules is:
the organisms (other than those seen on cultures or smears) which might
be causing the infection

Would you like a summary of ALL the parameters concluded by these rules?
** NO

Which of the rules do you wish to see?
** 280

RULE 280

[This rule applies to all cultures, and is tried in order to find out about
the organisms (other than those seen on cultures or smears) which might
be causing the infection]

If: 1) The infection which requires therapy is meningitis, and
2) The type of the infection is fungal, and
3) Organisms were not seen on the stain of the culture, and
4) The patient is not a compromised host, and
5) The patient has been to an area that is endemic for coccidiomycoses,
and
6) The race of the patient is one of: black asian indian, and
7) The cryptococcal antigen in the csf was not positive
Then: There is suggestive evidence (.5) that cryptococcus is not one of the
organisms (other than those seen on cultures or smears) which
might be causing the infection
Author: YU
Justification: Dark-skinned races, especially Filipino, Asian, and Black (in
that order) have an increased susceptibility to coccidiomycoses
meningitis.
Literature: Stevens, D.A. et al. Miconazole in Coccidiomycosis. Am J Med,
60:191-202, Feb 1976.

** HOW DO YOU TREAT PNEUMOCOCCAL PNEUMONIA

The MYCIN system currently can treat the following infections: BACTEREMIA
MENINGITIS CYSTITIS. Because the knowledge base is incomplete, all other
infections will be treated like bacteremia.

For treatment of DIPLOCOCCUS-PNEUMONIAE in BACTEREMIA
the drugs of choice (ranked by relative efficacy, assuming sensitivity) are:
1st choice:
PENICILLIN
AMPICILLIN
2nd choice:
NAFCILLIN
CEPHALOTHIN
ERYTHROMYCIN
CLINDAMYCIN
3rd choice:
TETRACYCLINE
CHLORAMPHENICOL
VANCOMYCIN

Duration of therapy: 28 days

** WHAT DOSE OF ERYTHROMYCIN DO YOU RECOMMEND?
When ERYTHROMYCIN is given for BACTEREMIA, the recommended dosage is:
if age is > 12 yrs then 12 mg/kg q6h IV
else if age is > .08 yrs then 5 mg/kg q6h IV
else if age is > .02 yrs then 5 mg/kg q12h IV

[Different dosages are recommended for different infections.]
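
The age-banded erythromycin rule quoted in this answer translates directly into a chain of comparisons. The Python sketch below is illustrative only; the function name and return shape are hypothetical, and the behaviour below the youngest band is left undefined because the transcript gives no rule for it.

```python
from typing import Optional, Tuple


def erythromycin_bacteremia_dose(age_years: float) -> Optional[Tuple[float, str]]:
    """Return (mg_per_kg, schedule) for the age bands quoted in the transcript."""
    if age_years > 12:
        return 12.0, "q6h IV"
    if age_years > 0.08:
        return 5.0, "q6h IV"
    if age_years > 0.02:
        return 5.0, "q12h IV"
    return None  # no rule is given for younger patients


print(erythromycin_bacteremia_dose(34))
print(erythromycin_bacteremia_dose(0.05))
```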

** HOW DO YOU TREAT CYSTITIS

In general, treatment for an infection depends on the causative organism.
Please include in your question the organism of interest.

** STOP

VI. Dose Calculation Consultation Session

MYCIN 3-Jun-78 ...

Special options (type ? for help):
** DOSE

Instructions? (Y or N)
** YES

This program makes dosage recommendations for a patient with an
infectious disease. You will be asked a few facts about the patient
which will be used to determine the appropriate dosages. Since the
recommended dose of an antibiotic may differ with infection, you will
first be asked to enter the infection for which the drugs are to be
given. Then you will be prompted for an antibiotic and will be told
what dosage to give. You may then request dosing information for
another antibiotic (type RETURN if you are not interested in other
drugs). You may also request dosing information for antibiotics to
treat another infection (type RETURN when you are done).

 

--------PATIENT-541--------

1) Patient’s name:
** OSCAR LOVEJOY

2) Age:
** 34

3) Sex:
** M

4) Do you have reason to suspect that Oscar Lovejoy may have impaired
renal function?
** Y

5) What is the most recent creatinine clearance in ml/min which you feel
is a true representation of Oscar Lovejoy’s renal function?

ee OY

6) What is the most recent serum creatinine of Oscar Lovejoy (in
mg/100ml)?
** 1.9

7) What is the previous serum creatinine of Oscar Lovejoy (in mg/100ml)?
** 1.8

8) Number of days between Oscar Lovejoy’s two most recent serum
creatinines:
** 2

9) Oscar Lovejoy’s weight in kilograms (or <number> POUNDS):
** 70

10) Oscar Lovejoy’s height in centimeters (or <number> INCHES):
** 175

Estimated body surface area is 1.9 sq. meters.
Dosage adjustments will be based on the calculated creatinine clearance of

42.7 ml/min/1.73 sq. meters (adjusted to average body surface
area.)

Infection: MENINGITIS
Drug: GENTAMICIN
After a loading dose of:
112 mg (2.8 ml, 80mg/2ml ampule) IV [calculated on basis
of 1.6 mg/kg],
give:
70 mg (1.8 ml, 80mg/2ml ampule) q8h IV [calculated on
basis of 1.0 mg/kg] plus consider giving 5 mg q24h
Intrathecal

Or, after a loading dose of:
140 mg (3.6 ml, 80mg/2ml ampule) IV [calculated on basis
of 2.0 mg/kg],
give:
119 mg (3.0 ml, 80mg/2ml ampule) q14h IV [calculated on
basis of 1.7 mg/kg] plus consider giving 5 mg q24h
Intrathecal
[normal dose is (1.7 mg/kg q8h IV plus consider giving 5 mg
q24h Intrathecal)]
New interval (or CR)
** 12
After a loading dose of:
133 mg (3.4 ml, 80mg/2ml ampule) IV [calculated on basis
of 1.9 mg/kg],
give:
105 mg (2.6 ml, 80mg/2ml ampule) q12h IV [calculated on
basis of 1.5 mg/kg] plus consider giving 5 mg q24h
Intrathecal
Drug: CHLORAMPHENICOL
Give: 1.75g (17.6 ml) q6h IV [calculated on basis of 25 mg/kg]
Drug:
Infection:

Please ask any simple-English questions regarding the above
consultation session or the overall knowledge of the consultation
system. When discussing the consultation above, please refer to
organisms, cultures, etc. by their numeric names (eg. ORGANISM-1).
Questions which were asked during the consultation may be specified
by the numbers that preceded them above, (e.g., ‘WHY DID YOU ASK
QUESTION 17?’). Control-O will abort type-out, or ‘cancel’ the current
question. Type HELP for a list of the options available. Enter the
word STOP when you have no further questions.

Type HELP for the list of question options.
** JUSTIFICATION
** LITERATURE
** HOW DID YOU CALCULATE THE DOSE FOR GENTAMICIN?

When GENTAMICIN is given for MENINGITIS, the recommended dosage is:
if age is > 2 yrs then 1.7 mg/kg q8h IV plus consider giving 5 mg q24h
Intrathecal
else 2.3 mg/kg q8h IV plus consider giving 2.5-4 mg/day intrathecal

The normal dose for Oscar Lovejoy is: 119 mg (3.0 ml, 80mg/2ml ampule) q8h IV
[calculated on basis of 1.7 mg/kg] plus consider giving 5 mg q24h
Intrathecal.

GENTAMICIN is excreted by the kidneys, so its dosage must be modified in
renal failure.

The following table shows how the patient’s renal function was determined:

Identifier   Value   Definition

SCR1         1.9     the most recent serum creatinine of oscar lovejoy
                     (in mg/100ml)
SCR2         1.8     the previous serum creatinine of oscar lovejoy (in
                     mg/100ml)
CCr          46.9    oscar lovejoy’s most recent creatinine clearance (in
                     ml/min)
CCr(f)       42.7    oscar lovejoy’s creatinine clearance, adjusted for
                     normal body surface area (ml/min/1.73 sq.
                     meters)
CCr(n)       100     the average normal creatinine clearance for oscar
                     lovejoy’s age group
CCr(min)     80      the minimum normal creatinine clearance for oscar
                     lovejoy’s age group

Since CCr(f) < CCr(min), PATIENT-541 has renal impairment.
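
The adjusted clearance in the table is consistent with scaling the measured clearance to a standard body surface area of 1.73 sq. meters. The Python sketch below checks that arithmetic. The function name is hypothetical, the scaling formula is inferred from the printed values rather than stated in the transcript, and the CCr(min) value of 80 is a reading of a partly illegible table entry.

```python
STANDARD_BSA_M2 = 1.73  # reference body surface area used in the transcript


def adjust_ccr(ccr_ml_min: float, bsa_m2: float) -> float:
    """Scale a creatinine clearance to the standard body surface area."""
    return ccr_ml_min * STANDARD_BSA_M2 / bsa_m2


ccr_f = adjust_ccr(46.9, 1.9)   # CCr and estimated BSA from the transcript
ccr_min = 80.0                  # assumed reading of the garbled table entry
renal_impairment = ccr_f < ccr_min

print(round(ccr_f, 1), renal_impairment)
```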

To modify the dosage of GENTAMICIN, the following pharmacokinetic parameters
were used:

Identifier   Value   Definition

a            7.0     the percentage hourly loss of a drug due to nonrenal
                     processes
b            .22     the slope of the plot of the elimination rate
                     constant of a drug versus the patient’s
                     creatinine clearance
Q            .57     the drug’s elimination rate fraction: elimination
                     rate of the drug as a fraction of the drug’s
                     "normal" elimination rate.

Q was calculated using the formula:
Q = [a + b * CCr(f)] / [a + b * CCr(n)]

My first dosage recommendation decreased the amount of GENTAMICIN and left
the dosing interval unchanged. The new amount was calculated by
multiplying the old amount by Q.

The second recommendation increased the dosing interval and left the amount
of GENTAMICIN unchanged. The new interval was calculated by dividing
the old interval by Q.
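
The Q formula and the two adjustment strategies can be checked numerically. The Python sketch below is a worked example under stated assumptions: a = 7.0 and b = 0.22 are readings of partly illegible table entries (chosen because they reproduce the printed Q of .57), and the function name is hypothetical.

```python
def elimination_rate_fraction(a: float, b: float,
                              ccr_f: float, ccr_n: float) -> float:
    """Q = [a + b * CCr(f)] / [a + b * CCr(n)]."""
    return (a + b * ccr_f) / (a + b * ccr_n)


# a and b are assumed readings of the garbled table; CCr values are as printed.
q = elimination_rate_fraction(a=7.0, b=0.22, ccr_f=42.7, ccr_n=100.0)

normal_dose_mg = 119.0      # 1.7 mg/kg for a 70 kg patient
normal_interval_h = 8.0     # q8h

reduced_dose_mg = normal_dose_mg * q         # same interval, smaller amount
extended_interval_h = normal_interval_h / q  # same amount, longer interval

print(round(q, 2), round(reduced_dose_mg, 1), round(extended_interval_h, 1))
```

The extended interval rounds to the 14 hours shown in the second recommendation. The reduced dose comes out a little under the 70 mg (1.0 mg/kg) printed in the first recommendation, which suggests the program rounds the per-kilogram figure; that rounding step is not reproduced here.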

** HOW DID YOU DETERMINE THE PATIENT’S BODY SURFACE AREA?

RULE104 was used to conclude that oscar lovejoy’s estimated body surface area
is 1.9. The last question asked before the conclusion was made was 10.
** PR 104

RULE 104
[This rule applies to any patient, and is tried in order to find out about the
patient’s estimated body surface area]

If: 1) The weight of the patient is known, and
2) The height of the patient is known
Then: Use Boyd’s algorithm to calculate body surface area from weight and
height [Boyd, E. The Growth of the Surface Area of the Human
Body, 1935 pp 1 - 133.]
Author: Wraith
Justification: A comparative evaluation with statistical analysis of the
Boyd method with the DuBois formula demonstrated that the Boyd formula is
more accurate for abnormal body types. [Sendroy J et al., Determination
of human body surface area from height and weight, Applied Physiology,
7(1):1-12, July 1954.]
Literature: Boyd, E, The growth of the surface area of the human body, c.
1935, pp 1-133.
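
RULE104 names Boyd's algorithm without printing it. The Python sketch below uses the commonly cited form of the Boyd formula (weight in grams, height in centimeters, result in square meters). Whether the program used exactly these constants is an assumption, and the function name is hypothetical.

```python
import math


def boyd_bsa_m2(weight_kg: float, height_cm: float) -> float:
    """Body surface area by the commonly cited form of the Boyd formula."""
    weight_g = weight_kg * 1000.0
    exponent = 0.7285 - 0.0188 * math.log10(weight_g)
    return 0.0003207 * (height_cm ** 0.3) * (weight_g ** exponent)


bsa = boyd_bsa_m2(70, 175)  # weight and height from questions 9 and 10
print(f"{bsa:.2f} sq. meters")
```

For the 70 kg, 175 cm patient this gives roughly 1.85 sq. meters, consistent with the 1.9 reported in the consultation once rounded to one decimal place.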

VII. Graph of Blood Levels of Antibiotics

The graph below provides an estimate of the steady state blood levels
of gentamicin over time for the three regimens suggested by Mycin.
A graph of this form may aid the physician in the selection of the most
appropriate regimen.

Blood level of GENTAMICIN [mcg/ml]

[Character plot not recoverable from the scan. It shows the blood level of
GENTAMICIN (horizontal axis, legible tick values 2.7, 3.7, 4.6, 5.6 and 6.5
mcg/ml) against time in hours (vertical axis, 2.50 to 20.00), with one curve
for each regimen in the key and markers for MIC and TL.]

KEY
D - drug level for modified dose [70.0 mgm every 8.0 hours]
I - drug level for modified interval [119.0 mgm every 14.0 hours]
B - drug level for modified interval and dose [98.0 mgm every 12.0 hours]
MIC - minimum inhibitory concentration [mcg/ml]
TL - toxic level [mcg/ml]
* - indicates overlapping curves

VIII. Therapy Recommendation Consultation Session
MYCIN 23-Jun-78 ...

Special options (type ? for help):
** REC

Instructions? (Y or N)
** YES

This program recommends therapy for a patient with infectious
diseases. You will be asked a few facts about the patient which will
be used to determine the dosages and appropriateness of giving
certain antibiotics. Then you will be prompted for the infection(s)
that you want treated and the organism(s) which you feel may be
responsible for each infection.

MYCIN cannot make a recommendation without knowing the
infection(s) and organism(s) to be treated. Other information is
requested for the purpose of refining therapy selection, but is not
essential; you may answer UNKNOWN to any of these questions.

 

Case 2, AA, 47-24-31, VMC, A 3 year old female with one day history
of headache, lethargy, fever to 104. L.P.: Protein 25, glucose 95, 110 cells
with 96% polys. PE: Meningismus, petechiae on arms and shoulders.

28-NOV-77 11:05:51
[consultation of 23-JUN-76 12:00]

1) Patient’s name: (first-last)
** PT683

2) Age:

** 2.5

3) Sex:
** FEMALE
4) Is Pt683 currently receiving therapy with any antimicrobial agent?
** NO
5) During the present illness, has Pt683 been treated with an
antimicrobial agent which is no longer being administered?
** NO
Infection: MENINGITIS
Organism: VIRUS
[Determining which drugs are desirable for use against the Virus...]
Are there any other organisms which might be causing this meningitis?
** YES
Organism: HEMOPHILUS-INFLUENZAE
6) Has the lab reported the in-vitro antibiotic sensitivities of the
hemophilus-influenzae causing the meningitis (ITEM-2)?
** N
[Determining which drugs are desirable for use against the Hemophilus-
influenzae...]
7) Does Pt683 have a clinically significant allergic reaction to any
antimicrobial agent?
** NO
Are there any other organisms which might be causing this meningitis?
** YES
Organism: NEISSERIA-MENINGITIDIS
8) Has the lab reported the in-vitro antibiotic sensitivities of the
neisseria-meningitidis causing the meningitis (ITEM-3)?
**

[Determining which drugs are desirable for use against the Neisseria-
meningitidis...]
Are there any other organisms which might be causing this meningitis?
** NO

Are there additional infections for which you want therapy recommendations?
** NO

INFECTION-1 is MENINGITIS

+ <ITEM-1> VIRUS Supportive therapy.
+ <ITEM-2> HEMOPHILUS-INFLUENZAE
+ <ITEM-3> NEISSERIA-MENINGITIDIS

[Considering AMPICILLIN-AND-CHLORAMPHENICOL for use against INFECTION-1...]
9) What is the most recent creatinine clearance in ml/min which you feel
is a true representation of Pt683’s renal function?
** UNKNOWN

10) What is the most recent serum creatinine of Pt683 (in mg/100ml)?
** UNKNOWN

11) Pt683’s weight in kilograms (or <number> POUNDS):
** 17.7
[REC-1] My preferred therapy recommendation is as follows:
In order to cover for Items <2 3>:
Give the following in combination:
1) AMPICILLIN
Dose: 886 mg (7.0 ml) q4h IV [calculated on basis of 50 mg/kg]
[The dose should be modified in SEVERE renal failure. I can adjust
the dose only if a creatinine clearance or a serum creatinine
is obtained, and the patient’s weight is known.]

2) CHLORAMPHENICOL
Dose: 443 mg (4.4 ml) q6h IV [calculated on basis of 25 mg/kg]

Please ask any simple-English questions regarding the above
consultaticn session or the overall knowledge of the consultation
system. When discussing the consultation above, please refer to
organisms, cultures, etc. by their numeric names (eg. ORGANISM~1).
Questions which were asked during the consultation may be specified
by the numbers that preceded them above, (e.g-, “WHY DID YOU ASK
QUESTION 17°). Control-O will abort type-out, or ‘cancel’ the current
question. Type HELP for a list of the options available. Enter the
word STOP when you have no further questions.

Type HELP for the list of question options.

135
Sec.

VIII.A. Appendix A.

*k JUSTIFICATION

*# LITERATURE

** WHY DID YOU GIVE AMPICILLIN AND CHLORAMPHENICOL IN REC-1?

Shall I describe the therapy algorithm for you?** NO

AMPICILLIN-AND-CHLORAMPHENICOL was not prescribed for ITEM-1 or ITEM-3
For ITEM-1 no drug was prescribed.
For ITEM-3 AMPICILLIN was prescribed.

AMPICILLIN-AND-CHLORAMPHENICOL was prescribed for ITEM-2 in RECOMMENDATION-1
because:

-- AMPICILLIN-AND-CHLORAMPHENICOL is used for treating
   hemophilus-influenzae in meningitis
-- AMPICILLIN-AND-CHLORAMPHENICOL was considered for treating ITEM-2
   because it is more appropriate than AMPICILLIN for this patient. [RULE066]
-- ITEM-2 is sensitive to AMPICILLIN-AND-CHLORAMPHENICOL
-- there were no contraindications for it
-- it was part of the recommendation which covers for all of the items,
   using the fewest number of drugs.

Would you like to see some details?** YES

The following decision(s) are relevant:

The drugs considered for the hemophilus-influenzae ranked according to
desirability are ampicillin-and-chloramphenicol (.95) chloramphenicol
(.89) carbenicillin (.79) tetracycline (.76) erythromycin (.75)
cephalothin (.7) [RULE101 RULE066]

The drugs to which the hemophilus-influenzae is sensitive are
ampicillin-and-chloramphenicol (1.0) chloramphenicol (1.0) carbenicillin (1.0)
tetracycline (1.0) erythromycin (1.0) cephalothin (1.0) [RULE098 RULE445]
For the conclusion concerning ampicillin-and-chloramphenicol see RULE445.

The meningitis is one of the infections for which
ampicillin-and-chloramphenicol is appropriate [RULE441]

** PR 66

RULE066

If:   1) Ampicillin is one of the drugs considered for the item ranked
         according to desirability, and
      2) The diagnosis of the infection is meningitis, and
      3) The age of the patient is greater than or equal to 1 week, and
      4) The identity corresponding to the item is hemophilus-influenzae, and
      5) The item is one of the most likely (top cluster) identities causing
         an infection in the patient, and
      6) It is not known whether ampicillin is one of the drugs to which the
         item is sensitive in vitro, and
      7) A: The patient is not allergic to one or more antibiotics, or
         B: Chloramphenicol is not one of the drugs to which the patient is
            allergic
Then: It is definite (1.0) that the combination of ampicillin with
      chloramphenicol is a more appropriate therapy for use against the
      item

Author: Wraith

Justification: Due to the increasing number of Ampicillin-resistant
Hemophilus influenzae isolated in the last few years a combination of
ampicillin and chloramphenicol is recommended until sensitivities of the
organism are known. At that time, chloramphenicol should be discontinued
unless the organism is ampicillin-resistant, in which case, ampicillin is
discontinued.

Literature: 1. Katz SL: Ampicillin-resistant Hemophilus influenzae type B:
A status report. Pediat. 55:66, 1975.

2. Katz SL et al.: Ampicillin-resistant strains of Hemophilus
influenzae type b, Pediat. 55:145, 1975.

3. Nelson JD: Should ampicillin be abandoned for treatment of
Hemophilus influenzae disease? JAMA 229:322, 1974.

** REC

For each item in turn, enter the drug you would have prescribed in
RECOMMENDATION-1.
In order to minimize the number of drugs in your recommendation, you may not
want to prescribe therapy for every item. Items which represent the most
likely organisms are indicated with a plus sign (+).

+ ITEM-1 -- the virus **
+ ITEM-2 -- the hemophilus-influenzae ** CARBENICILLIN
+ ITEM-3 -- the neisseria-meningitidis ** CARBENICILLIN

[Checking for contraindications...]
[Considering CARBENICILLIN for use against INFECTION-1...]
[No contraindications found...]

[Now comparing your prescription to MYCIN’s...]

ORGANISMS        MYCIN’s regimen                           Your regimen
"most likely"    Drug -- Choice Ranking                    Drug -- Choice Ranking
ITEM-2           AMPICILLIN-AND-CHLORAMPHENICOL -- 1st     CARBENICILLIN -- 3rd
ITEM-3           AMPICILLIN -- 1st                         CARBENICILLIN -- 2nd


(The desirability of a drug is defined to be its lowest ranking for the items
it covers.)

Your recommendation for the most likely organisms(s) is not optimal, since
your prescription of 1 third choice drug (CARBENICILLIN for ITEM-3 and ITEM-2)
should be avoided.

[You may refer to your regimen as RECOMMENDATION-2 in later questions.]

** STOP
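
The desirability definition quoted in the transcript above (a drug's desirability is its lowest ranking over the items it covers) can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function name and the dictionary layout are hypothetical and are not taken from MYCIN itself; the rankings are the ones shown in the comparison table for CARBENICILLIN.

```python
# Hypothetical illustration of the "desirability" definition quoted above.
# Rankings are choice positions: 1 = first choice, 2 = second choice, etc.

def desirability(rank_by_item):
    """Return the drug's lowest ranking, i.e. the numerically largest
    choice position, over all the items the drug is meant to cover."""
    return max(rank_by_item.values())

# Rankings for CARBENICILLIN as reported in the comparison table.
carbenicillin = {"ITEM-2": 3, "ITEM-3": 2}

print(desirability(carbenicillin))  # prints 3
```

Under this reading, a single drug covering two items is judged by its weakest ranking, which is why the transcript counts CARBENICILLIN as a third-choice drug for both items.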
IX. Running a Consultation of a Stored Patient in Summary Form
with Rule Acquisition to Correct Diagnosis

Case 10, TS, WMC, 9-49-13-77, A 73 year old female with rheumatoid
arthritis, congestive heart failure, and gastritis on chronic prednisone 15
mg/day and coumadin. Admitted for a GI bleed. Progressive obtundation began
while in hospital. LP revealed: protein 158, glucose 20, 28 cells with 34%
polys, 66% lymphs. Gram stain and India Ink prep revealed budding yeast-like
cells. Treatment: Begun on Amphotericin B IV and IT as well as 5-FC. Final
dx: Cryptococcal meningitis.

 

29-NOV-77 01:45:12
[consultation of 9-OCT-76 12:00]

Pt709 is a 73 year old female, caucasian.
Patient-709 is not an alcoholic.

Patient-709 is a compromised host.

Patient-709 is immunosuppressed.

Patient-709 does not live in a crowded environment.

Past Medical History:

Patient-709 is not allergic to one or more antibiotics.
Patient-709 has not undergone surgery.

Patient-709 does not have a tb risk factor.

Patient-709 has not recently been exposed to a contagious disease.

Recent Medical History:

The csf has not been tested for cryptococcus antigen.

Patient-709 has not shown symptoms of mumps.

Otitis-media is not one of the diagnoses which are consistent with the
patient’s clinical history.

Epiglottitis is not one of the diagnoses which are consistent with the
patient’s clinical history.

Patient-709 has not had an injury or insult to, or defect in the CNS.
Patient-709 has had recent neurologic signs.

The duration of the neurological signs is 4 days.

Patient-709 has had recent neurologic symptoms.

The duration of the neurological symptoms is 2 days.

Physical:

The weight of PATIENT-709 is 68.1 kgms.

The height of PATIENT-709 is 165.1 cms.

Patient-709 is febrile.

Patient-709 has not been seriously burned.

Patient-709 does not have a rash or cutaneous lesions.

Patient-709 has not focal neurological signs.

Patient-709 does not have evidence of ocular nerve dysfunction.
Patient-709’s X-ray is abnormal.

Active-tb is not one of the diseases which the patient’s chest X-ray
suggests.

Lobar-pneumonia is not one of the diseases which the patient’s chest X-ray
suggests.

CULTURE-1: This pending csf culture was drawn on 9-OCT-76 12:00.
A smear of this pending csf culture was examined.

-------- ORGANISM-1 --------
The stain of ORGANISM-1 is indiaink.
The CSF cultures will be considered to be associated with MENINGITIS.
The number of days since clinical evidence of the infection was first
noted is 4 days.
[Considering organisms (other than those isolated from cultures) which might
be present...]
Lymphoma is not one of the diagnoses of the patient.
Leukemia is not one of the diagnoses of the patient.
The CSF protein is 158.
The WBC from the CSF is 38.
The percent of pmns in the csf is 34.
The csf glucose value is 20.
The simultaneous blood glucose value is 95.
The results of a recent CBC of pt709 are available.
The white count from pt709’s peripheral CBC (in thousands) is 9.4.
Pt709 has received corticosteroids just prior to or during this infection.
The infection was acquired while the patient was hospitalized.

I have found evidence (RULE564) that treatment should also cover for VIRUS
causing MENINGITIS.
[Determining which drugs are desirable for use against the virus...]

My therapy recommendations will be designed to treat for organisms
that are either very likely or, although less likely, would have a
significant effect on therapy selection if they were present. It is
important to cover for the following probable infection(s) and
associated organism(s):

INFECTION-1 is MENINGITIS
+ ... VIRUS [from clinical evidence only] Supportive therapy.

Please repeat the LP in 8 - 24 hours for cultures (bacterial, fungal,
Tb, and viral), cells, and chemistries.

Blood cultures should be drawn.

Is this correct?
** NO

I was unable to make any conclusion about the identity of ORGANISM-1

Is that correct?
** NO

Should it have been possible to conclude a value?
** YES

Which one(s)?
** CRYPTOCOCCUS

Let’s look at the CRYPTOCOCCUS ...


Do you want to
1] track down the problem yourself, or
2] allow me to guide the debugging, or
3] ignore it for now and fix it another time?
Type 1, 2 or 3
** 2
The following applicable rule(s) were unable to conclude about whether the
identity of ORGANISM-1 is cryptococcus

RULE 378
If:   1) The infection which requires therapy is meningitis, and
      2) The stain of the organism is acidfast
Then: 1) There is suggestive evidence (.7) that the identity of the
         organism is mycobacterium-tb, and
      2) There is evidence that the organisms (other than those seen on
         cultures or smears) which might be causing the infection is
         coccidioides (-.8) cryptococcus (-.8)

Clause 2 of RULE378 [the stain of the organism is acidfast] was already known
to be false for ORGANISM-1, so the rule was never tried.

Is this rule correct?

** YES

Should its premise have succeeded for this case?
** NO

Is there a rule missing?
** YES

Ok, then I need a rule that would allow me to deduce that the identity of
ORGANISM-1 is cryptococcus

Would you like to give me such a rule now?

** YES

Good...

The new rule will be called RULE260

If: 1 - THE INFECTION WHICH REQUIRES THERAPY IS MENINGITIS
and 2 - THE STAIN IS INDIAINK
and 3 -

Then: 1 - THE IDENTITY IS CRYPTOCOCCUS (.7)
and 2 -

This may take a few moments.

This is my understanding of your rule:
RULE 260

If:   1) The infection which requires therapy is meningitis, and
      2) The stain of the organism is indiaink
Then: There is suggestive evidence (.7) that the identity of the organism
      is cryptococcus

Okay? (Yes or No)
** YES


This rule may be best described as a rule which:
1 - applies to all organisms
2 - applies to organisms from positive cultures
3 - applies to organisms seen on smears of specimens for pending cultures
**

Please type your last name- BENNETT

Please describe briefly why you added (or changed) this rule.

Type control-Z when done.

** THE SYSTEM FAILED TO CONCLUDE CRYPTOCOCCUS FOR AN INDIAINK STAIN AND
** MENINGITIS AS THE INFECTION

**

RULE260 has now been added to the knowledge base.

I will now re-run the consultation to check the effectiveness of your new
rule. Get comfortable, this may take a few moments.

 

29-NOV-77 01:53:52
[consultation of 9-OCT-76 12:00]

-------- ORGANISM-1 --------

The CSF cultures will be considered to be associated with MENINGITIS.
[Considering organisms (other than those isolated from cultures) which might
be present...]

I have found evidence (RULE564) that treatment should also cover for VIRUS
causing MENINGITIS.
[Determining which drugs are desirable for use against the cryptococcus...]
[Determining which drugs are desirable for use against the virus...]

My therapy recommendations will be designed to treat for organisms
that are either very likely or, although less likely, would have a
significant effect on therapy selection if they were present. It is
important to cover for the following probable infection(s) and
associated organism(s):

INFECTION-1 is MENINGITIS

+ <ITEM-1> CRYPTOCOCCUS [ORGANISM-1]
... VIRUS [from clinical evidence only] Supportive therapy.

[My recommendations will cover for all of the most likely items (indicated
by a "+" above). Less likely items will be covered unless this would
require an excessive number of drugs.]

Well, it appears that things are straightened out.

Is this correct now?

** YES

Good, then the problem’s been fixed.
...continuing with therapy selection now.

[Considering 5-FLUOROCYTOSINE-AND-AMPHOTERICIN-B for use against INFECTION-1...]
Estimated body surface area is 1.8 sq. meters.
Dosage adjustments will be based on the calculated creatinine clearance of
85.4 ml/min/1.73 sq. meters (adjusted to average body surface
area.)
Normal doses of all drugs will be given

[REC-1] My preferred therapy recommendation is as follows:
Give the following in combination:
1) 5-FLUOROCYTOSINE
Dose: 2.56g (10 250mg-capsules) q6h PO [calculated on basis of 37.5
mg/kg]
Comments: Gradually increase dose from 12 mg/kg q6h to the above dose
as tolerated.

2) AMPHOTERICIN-B
Dose: 20 mg (200.0 ml) q24h IV [calculated on basis of .3 mg/kg]
Comments: Begin with 1 mg IV on day 1, 5 mg on day 2, 10 mg on day 3,
and then gradually increase dose by 5-10 mg/day until the above
dose is administered. Infuse slowly over 3-6 hrs. Plus,
consider giving 0.5mg intrathecally every other day. Greater
than 15 mg intrathecally is rarely indicated.

Please obtain samples for fungal, TB, and viral cultures, cytology, VDRL
(blood and CSF), coccidioides complement-fixation (blood and CSF),
cryptococcal antigen (blood and CSF), viral titers (blood and CSF). An
infectious disease consult may be advisable.

Blood cultures should be drawn.


APPENDIX B:

KNOWLEDGE ENGINEERING FOR MEDICAL DECISION MAKING:
A Review of Computer-Based Clinical Decision Aids

1 INTRODUCTION

As early as the 1950s it was recognized that computers could conceivably
assist with clinical decision making [57], and both physicians and computer
scientists began to analyze medical diagnosis with a view to the potential role
of automated decision aids in that domain [55]. A variety of techniques have
been applied to computer-aided clinical decision making, accounting for at least
800 references in the clinical and computing literature [104]. In this article
we review several methodologies and attempt to identify the important issues
that account for both the multiplicity of approaches to the problem and the
limited clinical success of most of the systems developed to date. Although
there have been previous reviews of computer-aided diagnosis [42], [86], [106],
our emphasis here will be somewhat different. We will focus on the
representation and utilization of knowledge, termed "knowledge engineering," and
the inadequacies of data-intensive techniques which have led to the exploration
of novel symbolic reasoning approaches during the last decade.

1.1 Reasons For Attempting Computer-Aided Medical Decision Making

It is generally recognized that accelerated growth in medical knowledge has
necessitated greater sub-specialization among physicians and more dependence
upon assistance from other experts when a patient presents with a complex
problem outside one’s own area of expertise. The primary care physician who
sees the patient initially has thousands of tests available with a wide range of
costs (both fiscal and physical) and potential benefits (i.e., arrival at a
correct diagnosis or optimal therapeutic management). Even the experts in a
field may reach very different decisions regarding the management of a specific
case [122]. Diagnoses that are made, and upon which therapeutic decisions are
based, have been shown to vary widely in their accuracy [22], [77], [82].
Furthermore, medical decision making has traditionally been learned by medical
students in an unstructured way, largely through observing and emulating the
thought processes they perceive to be used by their clinical mentors [48].

Thus the motivations for attempts to understand and automate the process of
clinical decision making have been numerous [106]. They are directed both at
diagnostic models and at assisting with patient management decisions. Among the
reasons for attempting such work are the following:

(1) To improve the accuracy of clinical diagnosis through
approaches that are systematic, complete, and able to utilize data
from diverse sources;

(2) To improve the reliability of clinical decisions by avoiding
unwarranted influences of similar but not identical cases (a common
source of bias among physicians), and by making the criteria for
decisions explicit, and hence reproducible;

(3) To make the selection of tests and therapies efficient in
that optimal decisions are reached while the expense of time or funds
is minimized before definitive action is taken;

(4) To improve our understanding of clinical decision making,
both so that future physicians can have better teaching in this area,
and so that the computer programs we develop will be more effective
and easier to understand by the physicians for whom they are designed.

1.2 The Distinction Between Data And Knowledge

The models on which computer systems base their clinical advice range from
data-intensive to knowledge-intensive approaches. If there is a chronology to
the field over the last 20 years, it is that there has been progressively less
dependence on "pure," observational data and more emphasis on higher-level
symbolic knowledge inferred from primary data. We include with domain knowledge
a category of "judgmental knowledge" which reflects the experience and opinions
of an expert regarding an issue about which the formal data may be fragmentary
or nonexistent. Since many decisions made in clinical medicine depend upon this
kind of judgmental expertise, it is not surprising that investigators should
begin to look for ways to capture and utilize the knowledge of experts in
decision making programs. Another reason to move away from purely
data-intensive programs is that in medicine the primary data available to
decision makers are far from objective [14]. They include subjective reports
from patients, and error-prone observations [23]. Also, the terminology used in
the reports is not standardized [7] and the classifications often overlap. Thus
decision making aids must be knowledgeable about the unreliability of the data
as well as the uncertainty of the inference.


For example, data-intensive programs include medical record systems which
accumulate large databanks to assist with decision making. There is little
knowledge per se in the databank, but there are large amounts of data which can
help with decisions and be analyzed to provide new knowledge. A program that
retrieves a patient’s record for review, or even one that retrieves the records
of several patients matching some set of descriptors, is performing a data
management task with minimal "knowledge engineering" involved [32], [80]. On
the other hand, there is knowledge contained in the conditional probabilities
generated from such a databank and utilized for Bayesian analysis. At the other
extreme are systems that attempt to understand and utilize the kind of expert
knowledge which cannot be easily gleaned from databanks or literature reviews
[62], [95]. Systems that model human reasoning or emphasize education of users
tend to fall towards this end of the data-knowledge continuum.

We use the term "knowledge engineering", then, to refer to computer-based
symbolic reasoning issues such as knowledge representation, acquisition, and
explanation [15]. It is along these dimensions that the programs differ most
sharply from conventional calculations. For example, these programs can solve
problems by pursuing a line of reasoning; the individual inference steps and the
whole chain of reasoning may also form the basis for explanations of decisions.
A major concern in knowledge engineering is clear separation of the medical
knowledge in a program from the inference mechanism that applies that knowledge
to individual cases. One goal of this paper is to identify, in the strengths and
weaknesses of earlier work, those issues which have motivated several current
research groups to investigate the knowledge engineering approach to the
automation of clinical decision aids.
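
The separation of medical knowledge from the inference mechanism can be made concrete with a minimal Python sketch. It is an illustration only, not the mechanism of any system reviewed here. The rule content is borrowed from RULE260 in the sample consultation of Appendix A; the `Rule` class, the `infer` function, the flat dictionary of facts, and the single pass over the rules are all assumptions made for brevity.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Rule:
    """Medical knowledge held as data (hypothetical layout)."""
    name: str
    premises: tuple   # (attribute, required value) pairs; all must hold
    attribute: str    # attribute the rule concludes about
    value: str        # value concluded
    cf: float         # certainty attached to the conclusion

# Knowledge base: content taken from RULE260 in Appendix A.
RULES = [
    Rule("RULE260",
         (("infection", "meningitis"), ("stain", "indiaink")),
         "identity", "cryptococcus", 0.7),
]

def infer(facts, rules):
    """Generic interpreter: it contains no medical knowledge, only the
    procedure for applying whatever rules it is handed."""
    conclusions = []
    for rule in rules:
        if all(facts.get(attr) == val for attr, val in rule.premises):
            # Keeping the rule name lets a conclusion be explained later.
            conclusions.append((rule.attribute, rule.value, rule.cf, rule.name))
    return conclusions

facts = {"infection": "meningitis", "stain": "indiaink"}
for attribute, value, cf, name in infer(facts, RULES):
    print(f"{attribute} is {value} ({cf}) because {name} succeeded")
```

Because the interpreter never mentions meningitis or stains, adding a rule changes the knowledge base without touching the inference code, and the recorded rule name gives a starting point for explaining a conclusion.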

1.3 Parameters For Assessing Work In The Field

The barriers to successful implementation of computer-based diagnostic
systems have been analyzed on several occasions [7], [19], [99] and these need
not be reviewed in detail here. However, in assessing programs it is pertinent
to examine several parameters that affect the success and scope of a particular
system in light of its intended users and application:

(1) How accurate is the program?¹

(2) What is the nature of the knowledge in the system and how is it
generated or acquired?

(3) How is the clinical knowledge represented, and how does it
facilitate the performance goals of the system described?

(4) How are knowledge and clinical data utilized and how does this
impact on system performance?

(5) Is the system accepted by the users for whom it is intended? Is
the interface with the user adequate? Does the system function outside
of a research setting and is it suitable for dissemination?

(6) What is the size of the required computing resource?

(7) What are the limitations of the approach?

¹Although this is important it is not the only measure of clinical
effectiveness. For example, the effects on morbidity, mortality, and
length of hospital stay may also be important parameters. As we shall
show, few systems have reached a stage of implementation where these
parameters could be assessed.

One issue we have chosen not to address is the cost of a system. Not only
is information on this question scanty for most of the programs, but expenses
generated in a research and development environment do not realistically reflect
the costs one would expect from a system once it is operating for service use.

1.4 Overview Of This Paper

An exhaustive review of computer-aided diagnosis will not be attempted in
light of the vastness of the field, and we have therefore chosen to review the
methodologies by discussing several representative examples of systems that have
been described. The seven principal examples we have selected are not
necessarily the best nor the most successful; however, they illustrate the
issues we wish to discuss and encompass most of the major methodologies that
have been applied to computer-based medical decision making. In several cases
we have referenced other closely related systems, and the bibliography should
therefore guide the reader who wishes to pursue a particular topic more
thoroughly. Any attempt to categorize programs in this way is inherently
fraught with problems in that several systems appropriately lay claim to more
than one methodology. Thus we have occasionally felt obligated to simplify a
topic for clarity in light of the overall purposes of this review and the
limitations of the space available to us.

Finally, certain kinds of decision making tools have been intentionally
deleted from discussion here. These include medical systems that are designed
primarily for use by researchers [25], [45], [59], [84]; advanced automated
instrumentation techniques such as computerized tomography²; signal processing
techniques such as programs for EKG analysis [73] or patient monitoring [Lee];
and programs designed largely for data storage and retrieval with the actual
analysis and decision making left largely to the clinician [32], [52], [116]. We
have also chosen to discuss working computer programs rather than theories
suitable for automation or early reports of work in progress.

²See Kak’s article in this issue of the PROCEEDINGS.

 

2 Clinical Algorithms and Automation
2.1 Overview

Clinical algorithms, or protocols, are structured decision making
flowcharts to which a diagnostician or therapist can refer when deciding how to
manage a patient with a specific clinical problem [90]. In general these
algorithms have been designed by expert physicians for use by physicians’
assistants or nurse practitioners who are substituting for physicians in the
performance of certain routine clinical-care tasks³. The methodology has been
developed in part because of a desire to define basic medical logic concisely so
that detailed training in pathophysiology would not be necessary for ancillary
practitioners. Experience has shown that intelligent high school graduates,
selected in large part because of poise and warmth of personality, can provide
excellent care guided by protocols after only 4-8 weeks of training. This care
has been shown to be equivalent to that given by physicians for the same limited
problems, and to be accepted by physicians and patients alike for such diverse
clinical situations as diabetes management [51], [60], pharyngitis [24],
headache [33], and other disease categories [97], [103].

The role of the computer in such applications has been limited, however.
In fact, several groups initially experimented with computer representation of
the algorithms but have since abandoned the efforts and resorted to prepared
paper forms [51], [103]. In these cases the computer had originally guided the
physician assistant’s collection of data and had specified precisely what
decisions should be made or actions taken, in accordance with the clinical
algorithm. However, since the algorithmic logic is generally simple, and can
often be represented on a single sheet of paper, the advantages of an automated
approach over a manual system have not been clearly demonstrated. In one study
Vickery showed that, although the computer system entirely eliminated errors in
data collection (since the program demanded all relevant data at the appropriate
time), supervising physicians could detect no significant difference between the
performance of physicians’ assistants using automated versus manual systems
[103]. Furthermore, the computer could not, of course, decide whether the actual
observations entered by the physicians’ assistant were correct; yet this kind of
inaccuracy was one of the most common reasons that supervisors occasionally
found an assistant’s performance unsatisfactory.

³Clinical algorithms have also been prepared for use by physicians
themselves, but Grimm has found that they are generally less well-accepted by
doctors [34]. He showed, however, that physician performance could improve when
protocols were used in certain settings.

There are two other ways in which the computer has been utilized in the
setting of clinical algorithms. One has been in the use of mathematical
techniques to analyze signs and symptoms of diseases and thereby to identify
those that should most appropriately be referenced in a clinical algorithm that
is being prepared for the management of that disease [26], [50], [105]. The
process for distilling expert knowledge in the form of a clinical algorithm can
be an arduous and imperfect one [90]; formal techniques to assist with this task
may prove to be very valuable.

Finally, some researchers in this area continue to use computers to assist
with audit of performance by comparing actual actions taken by a physicians’
assistant with those recommended by the algorithm itself. Sox et al. [97] have
described a system in which the assistant’s checklist for a patient encounter
was sent to a central computer and analyzed for evidence of deviation from the
accepted protocol. Computer-generated reports then served as feedback to the
physicians’ assistant and to the supervising physician.
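
The audit idea described above, comparing the actions actually taken with those the algorithm recommends, reduces to a comparison of two sets. The Python sketch below is a hypothetical illustration: the function name, the report layout, and the placeholder action names are assumptions and do not describe the system of Sox et al.

```python
def audit(recommended, taken):
    """Report deviations between protocol-recommended actions and the
    actions recorded on an encounter checklist (hypothetical layout)."""
    recommended, taken = set(recommended), set(taken)
    return {
        "omitted": sorted(recommended - taken),        # recommended, not done
        "unrecommended": sorted(taken - recommended),  # done, not recommended
    }

# Placeholder action names, not drawn from any real protocol.
report = audit(recommended=["action-A", "action-B", "action-C"],
               taken=["action-A", "action-C", "action-D"])
print(report)  # prints {'omitted': ['action-B'], 'unrecommended': ['action-D']}
```

A report of this shape could serve as the feedback sent to the assistant and the supervising physician, although it cannot detect whether the recorded observations themselves were accurate.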

2.2 Example

We have selected for discussion a project that differs from those
previously cited in that (1) computer techniques are still being utilized, and
(2) the clinical algorithms are designed for use by primary care physicians
themselves. This is the cancer chemotherapy system developed in Alabama by
Mesel et al. [64]. The algorithms were developed in response to a desire to
allow private practitioners, at a distance from the regional tertiary-care
center, to manage the complex chemotherapy for their cancer patients, without
routinely referring them to the central oncologists. Mesel et al. have
described a "consultant-extender system" that enables the primary physician to
treat patients with Hodgkin’s Disease under the supervision of a regional
specialist. Five oncologists developed a care protocol for the treatment of
Hodgkin’s Disease, and this algorithm was placed on-line. Once patients had
been entered in the study, their private physicians would prepare encounter
forms at the time of each office visit. These forms would document pertinent
interval history, physical findings, and lab data, as well as chemotherapy
administered. The form would then be sent to the regional center where it was
analyzed by the computer and a customized clinical algorithm was produced to
assist the private physician with the management of that patient during the next
appointment. Thus the computer program would take into account the ways in
which the individual patient’s disease might progress or improve and would
prepare an appropriate clinical algorithm. This protocol was sent back to the
physician in time for it to be available at the next office visit. The private
practitioner was encouraged to call the regional specialist directly if the
protocol seemed in some way inadequate or additional questions arose. The
authors present data suggesting that their system was well-accepted by
physicians and patients, and that excellent care was delivered. This is an
interesting result in light of Grimm’s experience [34]. Perhaps physicians were
more accepting of the algorithmic approach in Mesel’s case because it allowed
them to perform tasks that they would previously not have been able to undertake
at all. Retrospective review of cases that were treated at the referral center,
but without the use of the protocols, showed a 16% rate of variance from the
management guidelines specified in the algorithms; there was no such variance
when the protocols were utilized directly. Thus algorithms may be effective
tools for the administration of complex specialized therapy in circumstances
such as those described.

2.3 Discussion of the Methodology

Although clinical algorithms are among the most widespread and accepted of
the decision aids described in this article, the simplicity of their logic makes
it clear why the technique cannot be effectively applied in most medical
domains. Decision points in the algorithms are generally binary (i.e., a given
sign or symptom is or is not present), and there tend to be many circumstances
that can arise for which the user is advised to consult the supervising
physician (or specialist). Thus the complex decision tasks are left to experts,
and there is generally no formal algorithm for managing the case from that point
on. It is precisely the simplicity of the algorithmic logic, and the
supervising expert "escape valve", which has permitted many algorithms to be
represented on one or two sheets of paper and has obviated the need for direct
computer use in most of the systems. The contributions of clinical algorithms
to the distribution and delivery of health care, to the training of paramedics,
and to quality care audit, have been impressive and substantial. However, the
methodology is not suitable for extension to the complex decision tasks to be
discussed in the following sections.
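
The structure just described, binary decision points plus an "escape valve" to the supervising physician, can be sketched in a few lines of Python. The flowchart below is entirely hypothetical: the node names, the questions ("finding A", "finding B"), and the actions are placeholders with no clinical meaning, chosen only to show the shape of the logic.

```python
REFER = "REFER: consult the supervising physician"

# Hypothetical flowchart. Each node is (question, next_if_yes, next_if_no);
# any value that is not a node name is a terminal action.
FLOWCHART = {
    "n1": ("finding A present?", "n2", "action: routine care"),
    "n2": ("finding B present?", REFER, "action: protocol treatment"),
}

def run_protocol(flowchart, answers, start="n1"):
    """Follow yes/no answers through the flowchart until an action is reached.
    A question with no recorded answer triggers the escape valve."""
    node = start
    while node in flowchart:
        question, if_yes, if_no = flowchart[node]
        if question not in answers:
            return REFER
        node = if_yes if answers[question] else if_no
    return node

print(run_protocol(FLOWCHART, {"finding A present?": True,
                               "finding B present?": False}))
# prints: action: protocol treatment
```

The whole protocol fits in a small table, which is consistent with the observation that such algorithms can often be carried on one or two sheets of paper; everything harder is delegated by the referral branch.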

 

3 Databank Analysis for Prognosis and Therapy Selection
3.1 Overview

Automation of medical record keeping and the development of computer-based
patient databanks have been major research concerns since the earliest davs of
medical computing. Most such systems bave attempted to avoid direct interaction
between the computer and the physician recording the data, with the systems of
Weed [115], [116] and Greenes [32] being notable exceptions. Although the
earliest systems were designed werely as record-keeping devices, there have been
several recent attempts to create programs that could also provide analyses of
the information stored in the computer databank. Some early systems {32}, [47]
had retrieval modules that identified all patient records matching a Poolean
combination of descriptors; however, further analyses of these records for
decision making purposes was left to the investigator. Weed has not stressed an
analytical component in his automated problem-oriented record {ll@}, but others
have developed decision aids which use medical record systems fashioned after
his [96].

The systems for databank analysis all depend on the development of a
complete and accurate medical record system. If such a system is developed, a
number of additional capabilities can be provided: (1) correlations among
variables can be calculated, (2) prognostic indicators can be measured, and (3)
the response to various therapies can be compared. A physician faced with a
complex management decision can look to such a system for assistance in
identifying patients in the past who had similar clinical problems and can then
see how those patients responded to various therapies. A clinical investigator
keeping the records of his study patients on such a system can utilize the
program’s statistical capabilities for data analysis. Hence, although these
applications are inherently data-intensive, the kinds of "knowledge" generated
by specialized retrieval and statistical routines can provide valuable
assistance for clinical decision makers. For example, they can help physicians
avoid the inherent biases that result when the individual practitioner bases his
decisions primarily on his own anecdotal experience with one or two patients
having a rare disease or complex of symptoms.

There are many excellent programs in this category, one of which is
discussed in some detail in the next section. Several others warrant mention,
however. The HELP System at the University of Utah [109], [111], [112] utilizes
a large data file on patients in the Latter-Day Saints Hospital. Clinical experts
formulate specialized "HELP sectors" which are collections of logical rules that
define the criteria for a particular medical decision. These sectors are
developed by an interactive process whereby the expert proposes important
criteria for a given decision and is provided with actual data regarding that
criterion based on relevant patients and controls from the computer databank.
The criteria in the sector are thus adjusted by the expert until adequate
discrimination is made to justify using the sector’s logic as a decision tool4.
The sectors are then utilized for a variety of tasks throughout the hospital.

Another system of interest is that of Feinstein et al. at Yale [17]. They
had specific patient management decisions in mind when they developed their
interactive system for estimating prognosis and guiding management in patients
with lung cancer. Similarly, Rosati et al. have developed a system at Duke
University which utilizes a large databank on patients who have undergone
coronary arteriography [82]. New patients can be matched against those in the
databank to help determine patient prognosis under a variety of management
alternatives.

3.2 Example

One of the most successful projects in this category is the ARAMIS system
of Fries [20]. The approach was designed originally for use in an outpatient
rheumatology clinic, but then broadened to a general clinical database system
(TOD) [118], [119] so that it became transferable to clinics in oncology,
metabolic disease, cardiology, endocrinology, and certain pediatric
subspecialties. All clinic records are kept in a flow-charting format in which
a column in a large table indicates a specific clinic visit and the rows
indicate the relevant clinical parameters that are being followed over time.

 

4This process might be seen as a tool to assist with the formulation of
clinical algorithms as discussed in the previous section. Another approach
using databank analysis for algorithm development is described in [26].

These charts are maintained by the physicians seeing the patient in clinic, and
the new column of data is later transferred to the computer databank by a
transcriptionist; in this way time-oriented data on all patients are kept
current. The defined database (clinical parameters to be followed) is
determined by clinical experts, and in the case of rheumatic diseases has now
been standardized on a national scale [26].

The information in the databank can be utilized to create a prose summary
of the patient’s current status, and there are graphical capabilities which can
plot specific parameters for a patient over time [118]. However, it is in the
analysis of stored clinical experience that the system has its greatest
potential utility [21]. In addition to performing search and statistical
functions such as those developed in databank systems for clinical investigation
[45], [59], ARAMIS offers a prognostic analysis for a new patient when a
management decision is to be made. Using the consultative services of the
Stanford Immunology Division, an individual practitioner may select clinical
indices for his patient that he would like matched against other patients in the
databank. Based on 2 to 5 such descriptors, the computer locates relevant prior
patients and prepares a report outlining their prognosis with respect to a
variety of endpoints (e.g., death, development of renal failure, arthritic
status, pleurisy, etc.). Therapy recommendations are also generated on the
basis of a response index that is calculated for the matched patients. A prose
case analysis for the physician’s patient can also be generated; this readable
document summarizes the relevant data from the databank and explains the basis
for the therapeutic recommendation.
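
The matching step described above can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the record layout, the descriptor names, the therapy labels, and the single binary outcome are hypothetical, and the per-therapy fraction computed here is only a stand-in for the response index, whose actual definition is not given in this review.

```python
from dataclasses import dataclass


@dataclass
class PatientRecord:
    # Hypothetical record layout; the real databank schema is not described here.
    patient_id: str
    descriptors: dict      # e.g. {"sex": "F", "proteinuria": True}
    therapy: str
    outcome_good: bool     # one illustrative endpoint only


def match_patients(databank, query):
    """Return prior patients whose descriptors agree with every item in the query."""
    return [r for r in databank
            if all(r.descriptors.get(k) == v for k, v in query.items())]


def response_by_therapy(matched):
    """Fraction of matched patients with a good outcome, grouped by therapy."""
    totals, good = {}, {}
    for r in matched:
        totals[r.therapy] = totals.get(r.therapy, 0) + 1
        good[r.therapy] = good.get(r.therapy, 0) + (1 if r.outcome_good else 0)
    return {t: good[t] / totals[t] for t in totals}


# Invented toy databank, for illustration only.
databank = [
    PatientRecord("p1", {"sex": "F", "proteinuria": True}, "drug_a", True),
    PatientRecord("p2", {"sex": "F", "proteinuria": True}, "drug_a", False),
    PatientRecord("p3", {"sex": "F", "proteinuria": True}, "drug_b", True),
    PatientRecord("p4", {"sex": "M", "proteinuria": False}, "drug_b", False),
]

matched = match_patients(databank, {"sex": "F", "proteinuria": True})
rates = response_by_therapy(matched)
```

With so few matched patients per therapy the computed fractions are unstable, which is the motivation given in the text for pooling data across a national network.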

The rheumatologic databank generated under ARAMIS has now been expanded to
involve a national network of immunologists who are accumulating time-oriented
data on their patients. This national project seeks in part to accumulate a
large enough databank so that groups of retrieved patients will be sizable and
thus control for some observer variability and make the system’s recommendations
more statistically defensible.

3.3 Discussion of the Methodology

The databank analysis systems described have powerful capabilities to offer
to the individual clinical decision maker. Furthermore, medical computing
researchers recognize the potential value of large databanks in supporting many
of the other decision making approaches discussed in subsequent sections. There
are important additional issues regarding databank systems, however, which are
discussed below.

(1) Data acquisition remains a major problem. Many systems have avoided
direct physician-computer interaction but have then been faced with the expense
and errors of transcription. The developers of one well accepted record system
still express their desire to implement a direct interface with the physician
for these reasons, although they recognize the difficulties encountered in
encouraging hands-on use of a computer system by doctors [100].

(2) Analysis of data in the system can be complicated by missing values
that frequently occur, outlying values, and poor reproducibility of data across
time and among physicians.

(3) The decision aids provided tend to emphasize patient management rather
than diagnosis. Feinstein’s system [17] is only useful for patients with lung
cancer, for example, and the ARAMIS (TOD) prognostic routines, which are
designed for patient management, assume that the patient’s rheumatologic
diagnosis is already known.

(4) There is no formal correlation between the way expert physicians
approach patient management decisions and the way the programs arrive at
recommendations. Feinstein and Koss felt that the acceptability of their system
would be limited by a purely statistical approach, and they therefore chose to
mimic human reasoning processes to a large extent [53], but their approach
appears to be an exception.

(5) Data storage space requirements can be large since the decision aids of
course require a comprehensive medical record system as a basic component.

Slamecka has distinguished between structured and empirical approaches to
clinical consulting systems [96], pointing out that databanks provide a largely
empirical basis for advice whereas structured approaches rely on judgmental
knowledge elicited from the literature or the minds of experts. It is important
to note, however, that judgmental knowledge is itself based on empirical
information. Even the expert "intuitions" that many researchers have tried to
capture are based on that expert practitioner’s own observations and "data
collection" over years of experience. Thus one might argue that large,
complete, and flexible databanks could form the basis for large amounts of
judgmental knowledge that we now have to elicit from other sources. Some
researchers have indicated a desire to experiment with methods for the automatic
generation of medical decision rules from databanks, and one component of the
research on Slamecka’s MARIS system is apparently pointed in that direction
[96]. Indeed, some of the most exciting and practical uses of large databanks
may be found precisely at the interface with those knowledge engineering tasks
that have most confounded researchers in medical symbolic reasoning [5].

 

4 Mathematical Models of Physical Processes
4.1 Overview

Pathophysiologic processes can be well-described by mathematical formulae
in a limited number of clinical problem areas. Such domains have lent
themselves well to the development of computer-based decision aids since the
issues are generally well-defined. The actual techniques used by such programs
tend to reflect the details of the individual applications, the most celebrated
of which have been in pharmacokinetics (specifically digitalis dosing), acid-
base/electrolyte disorders, and respiratory care [63].

One or two cooperating experts in the field generally assist with the
definition of pertinent variables and the mathematical characterization of the
relationships among them. Often an interactive program is then developed which
requests the relevant data, makes the appropriate computations, and provides a
clinical analysis or recommendation for therapy based upon the computational
results. Some of the programs have also involved branched-chain logic to guide
decisions about what further data are needed for adequate analysis5.

Programs to assist with digitalis dosing have progressed to the inclusion
of broader medical knowledge over the last ten years. The earliest work was
Jelliffe’s [43] and was based upon his considerable experience studying the
pharmacokinetics of the cardiac glycosides. His computer program used
mathematical formulations based on parameters such as therapeutic goals (e.g.,
desired predicted blood levels), body weight, renal function, and route of
administration. In one study he showed that computer recommendations reduced
the frequency of adverse digitalis reactions from 25% to 12% [44]. Later,
another group revised the Jelliffe model to permit a feedback loop in which the
digitalis blood levels obtained with initial doses of the drug were considered

 

5"Branched-chain" logic refers to mechanisms by which portions of a
decision network can be considered or ignored depending upon the data on a given
case. For example, in an acid-base program the anion gap might be calculated
and a branch-point could then determine whether the pathway for analyzing an
elevated anion gap would be required. If the gap were not elevated, that whole
portion of the logic network could be skipped.

in subsequent therapy recommendations [72], [89]. More recently, a third group
in Boston, noting the insensitivity of the first two approaches to the kinds of
nonnumeric observations that experts tend to use in modifying digitalis therapy,
augmented the pharmacokinetic model with a patient-specific model of clinical
status [31]. Running their system in a monitoring mode, in parallel with actual
clinical practice on a cardiology service, they found that each patient in the
trial in whom toxicity developed had received more digitalis than would have
been recommended by their program.

4.2 Example

Perhaps the best known program in this category is the interactive system
developed at Boston’s Beth Israel Hospital by Bleich. Originally designed as a
program for assessment of acid-base disorders [2], it was later expanded to
consider electrolyte abnormalities as well [3], [4]. The knowledge in Bleich’s
program is a distillation of his own expertise regarding acid-base and
electrolyte disorders. The system begins by collecting initial laboratory data
from the physician seeking advice on a patient’s management. Branched-chain
logic is triggered by abnormalities in the initial data so that only the
pertinent sections of the extensive decision pathways created by Bleich are
explored. Essentially all questions asked by the program are numerical
laboratory values or "yes-no" questions (e.g., "Does the patient have pitting
edema?"). Depending upon the complexity and severity of the case, the program
eventually generates an evaluation note that may vary in length from a few lines
to several pages. Included are suggestions regarding possible causes of the
observed abnormalities and suggestions for correcting them. Literature
references are also provided.
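
The branched-chain idea can be sketched in a few lines. This is not Bleich's logic, which is not reproduced in this review; it is a minimal illustration that assumes the conventional anion gap formula (sodium minus the sum of chloride and bicarbonate) and an illustrative cutoff of 12 mEq/L. The function names, the dictionary keys, and the cutoff are all hypothetical choices.

```python
def anion_gap(sodium, chloride, bicarbonate):
    """Anion gap in mEq/L, computed as Na - (Cl + HCO3)."""
    return sodium - (chloride + bicarbonate)


def evaluate(labs, gap_threshold=12.0):
    """Build a short evaluation note, branching on the computed anion gap.

    gap_threshold is an illustrative cutoff (an assumption), not a value
    taken from any of the programs discussed in the text.
    """
    notes = []
    gap = anion_gap(labs["na"], labs["cl"], labs["hco3"])
    notes.append(f"anion gap = {gap:.0f} mEq/L")
    if gap > gap_threshold:
        # Only in this branch would the elevated-gap pathway be explored.
        notes.append("elevated gap: explore the high-anion-gap pathway")
    else:
        # The whole elevated-gap portion of the network is skipped.
        notes.append("gap not elevated: skip the high-anion-gap pathway")
    return notes


note_elevated = evaluate({"na": 140, "cl": 100, "hco3": 10})
note_normal = evaluate({"na": 140, "cl": 105, "hco3": 25})
```

A real program would hang many further questions and computations off each branch; the point of the sketch is only that a computed value decides which portion of the decision network is visited.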

Although the program was made available at several East Coast institutions,
few physicians accepted it as an ongoing clinical tool. Bleich points out that
part of the reason for this was the system’s inherent educational impact;
physicians simply began to anticipate its analysis after they had used it a few
times [3]. More recently he has been experimenting with the program operating
as a monitoring system6, thereby avoiding direct interaction with the physician.

The system’s lack of sustained acceptance by physicians is probably due to
more than its educational impact, however. For example, there is no feedback in

the system; every patient is seen as a new case and the program has no concept

 

6Personal communication with Dr. Bleich, 1975.

of following a patient’s response to prior therapeutic measures. Furthermore,
the program generates differential diagnosis lists but does not pursue specific
etiologies; this can be particularly bothersome when there are multiple
coexistent disturbances in a patient and the program simply suggests parallel
lists of etiologies without noting or pursuing the possible interrelationships.
Finally, the system is highly individualized in that it contains
consideration of specific relationships only when Bleich specifically thought to
include them in the logic network. Of course human consultants also give
personalized advice which may differ from that obtained from other experts.
However, a group of researchers in Britain [79] who analyzed Bleich’s program
along with four other acid-base/electrolyte systems, found total agreement among
the programs in only 20% of test cases when these systems were asked to define
the acid-base disturbance and the degree of compensation present. Their
analysis does not reveal which of the programs reached the correct decision,
however, and it may be that the results are more an indictment of the other four
programs than a valid criticism of the advice from Bleich’s acid-base component.

4.3 Discussion of the Methodologies

The programs mentioned in this section are very different in several
respects, and each tends to overlap with other methodologies we have discussed.
Bleich’s program, for example, is essentially a complicated clinical algorithm
interfaced with mathematical formulations of electrolyte and acid-base
pathophysiology. As such it suffers from the weaknesses of all algorithmic
approaches, most importantly its highly structured and inflexible logic which is
unable to contend with unforeseen circumstances not specifically included in the
algorithm. The digitalis dosing programs all draw on mathematical techniques
from the field of biomedical modeling (not discussed here), but have recently
shown more reliance on methods from other areas as well. In particular these
have included symbolic reasoning methods that allow clinical expertise to be
captured and utilized in conjunction with mathematical techniques [21]. The
Boston group that developed this most recent digitalis program is interested in
similarly developing an acid-base/electrolyte system so that judgmental
knowledge of experts can be interfaced with the mathematical models of
pathophysiology7.

 

7Personal communication, 1978, with Prof. Peter Szolovits.

5 Statistical Pattern Matching Techniques
5.1 Overview

Pattern matching techniques define the mathematical relationship between
measurable features and classifications of objects [12], [46]. In medicine, the
presence or absence of each of several signs and symptoms in a patient may be
definitive for the classification of the patient as "abnormal" or into the
category of a specific disease. They are also used for prognosis [1], or
predicting disease duration, time course, and outcomes. These techniques have
been applied to a variety of medical domains, such as image processing and
signal analysis, in addition to computer-assisted diagnosis.

In order to find the diagnostic pattern, or discriminant function, the
method requires a training set of objects, for which the correct classification
is already known, as well as reliable values for their measured features. If
the form and parameters are not known for the statistical distributions
underlying the features, then they must be estimated. Parametric techniques
focus on learning the parameters of the probability density functions, while
non-parametric (or "distribution-free") techniques make no assumptions about the
form of the distributions. After training, then, the pattern can be matched to
new, unclassified objects to aid in deciding the category to which the new
object belongs8.

There are numerous variations on this general methodology, most notably in
the mathematical techniques used to extract characteristic measurements (the
features) and to find and refine the pattern classifier during training. For
example, linear regression analysis is a commonly used technique for finding the
coefficients of an equation that defines a recurring pattern or category of
diagnostic or prognostic interest. Recent work emphasizes structural
relationships among sets of features more than statistical ones.

Three of the best known training criteria for the discriminant function
are:

(a) Bayes’ criterion: choose the function that has the minimum cost associated
with incorrect diagnoses9;

(b) clustering criterion: choose the function that produces the tightest
clusters;

(c) least-squared-error criterion: choose the function that minimizes the
squared differences between predicted and observed measurement values.
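
As an illustration of criterion (c), the sketch below fits a straight line to a handful of points by the closed-form least-squares solution. The one-feature setting, the function names, and the data values are assumptions chosen for brevity; the medical applications cited in the text use many features and more elaborate models.

```python
def least_squares_line(xs, ys):
    """Return (intercept, slope) minimizing the sum of squared errors."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def squared_error(xs, ys, intercept, slope):
    """Sum of squared differences between predicted and observed values."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


# Invented measurements, for illustration only.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.1, 2.9, 5.2, 6.8]

intercept, slope = least_squares_line(xs, ys)
fitted_error = squared_error(xs, ys, intercept, slope)
# Any other line, such as y = 1 + 2x, cannot have a smaller squared error.
other_error = squared_error(xs, ys, 1.0, 2.0)
```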

 

8It is possible to detect patterns, even without a known classification for
objects in the training set, with so-called "unsupervised learning" techniques.
Also, it is possible to work with both numerical and non-numerical measurements.

%See Section § for further discussion.

Ten commonly used mathematical models based on these criteria have been
shown to produce remarkably similar diagnostic results for the same data [7].

5.2 Example

There are numerous papers reporting on the use of pattern recognition methods in
medicine. Armitage [1] discusses three examples of prognostic studies, with an
emphasis on regression methods. Siegel et al. [27] discuss uses of cluster
analysis. One recent diagnostic application using Bayes’ criterion [67]
classifies patients having chest pains into three categories: D_1: acute
myocardial infarction (MI); D_2: coronary insufficiency; and D_3: non-cardiac
causes of chest pain. The need for early diagnosis of heart attacks without
laboratory tests is a prevalent problem, yet physicians are known to misclassify
about one third of the patients in categories D_1 and D_2 and about 80% of those
in D_3. In order to determine the correct classification, each patient in the
training set was classified after 3 days, based on laboratory data including
electrocardiogram (ECG) and blood data (cardiac enzymes). There remained some
uncertainty about several patients with "probable MI." Seventeen variables were
selected from many: 9 features with continuous values (including age, heart
rates, white blood count, and hemoglobin) and 8 features with discrete values
(sex and 7 ECG features).

The training data were measurements on 247 patients. The decision rule was
chosen using Bayes’ theorem to compute the posterior probabilities of each
diagnostic class given the feature vector X (X = [x_1, x_2, ..., x_17])10.
Then a decision rule was chosen to minimize the probability of error, that is,
to adjust the coefficients on the feature vector X11 such that for the correct
class D_i:

    P(D_i \mid X) = \max \{ P(D_1 \mid X), P(D_2 \mid X), P(D_3 \mid X) \}

The class conditional probability density functions must be estimated initially,
and the performance of the decision rule depends on the accuracy of the assumed
model.
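
A maximum-posterior decision rule of this kind can be sketched as follows. The sketch assumes independent Gaussian class-conditional densities, which is one possible "assumed model" and not necessarily the one used in the study cited; the two features, the class parameters, and the priors are invented for illustration.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def posteriors(x, priors, params):
    """P(D_i | X) for each class, assuming independent Gaussian features."""
    joint = {}
    for cls, prior in priors.items():
        likelihood = 1.0
        for value, (mean, std) in zip(x, params[cls]):
            likelihood *= gaussian_pdf(value, mean, std)
        joint[cls] = prior * likelihood
    total = sum(joint.values())
    return {cls: j / total for cls, j in joint.items()}


def classify(x, priors, params):
    """Choose the class with the largest posterior probability."""
    post = posteriors(x, priors, params)
    return max(post, key=post.get)


# Hypothetical (mean, std) pairs for two features: age and heart rate.
params = {
    "D1": [(65.0, 10.0), (100.0, 15.0)],
    "D2": [(60.0, 10.0), (85.0, 15.0)],
    "D3": [(45.0, 10.0), (75.0, 15.0)],
}
priors = {"D1": 1 / 3, "D2": 1 / 3, "D3": 1 / 3}

label = classify((65.0, 100.0), priors, params)
```

If the true densities are far from Gaussian, or the features are strongly dependent, the estimated posteriors and hence the chosen class can be badly wrong, which is the dependence on the assumed model noted in the text.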

Using the same 247 patients for testing the approach, the trained

 

10The posterior probability of a diagnostic class, represented as P(D_i | X),
is the probability that a patient falls in diagnostic category D_i given that the
feature vector X has been observed.

11See [56] for a study in which the coefficients are reported because of
their medical import.

classifier averaged 80% correct diagnoses over the three classes, using only
data available at the time of admission. Physicians, using more data than the
computer, averaged only 50.5% correct over these three categories for the same
patients. Training the classifier with a subset of the patients, and using the
remainder for testing, produced nearly as good results.

5.3 Discussion of the Methodology

The number of reported medical applications of pattern recognition
techniques is large, but there are also numerous problems associated with the
methodology. The most obvious difficulties are choosing the set of features in
the first place, collecting reliable measurements on a large sample, and
verifying the initial classifications among the training data. Current
techniques are inadequate for problems in which trends or movement of features
are important characteristics of the categories. Also the problems for which
existing techniques are accurate are those that are well characterized by a
small number of features ("dimensions of the space").

As with all techniques based on statistics, the size of the sample used to
define the categories is an important consideration. As the number of important
features and the number of relevant categories increase, the required size of
the training set also increases. In one test [7], pattern classifiers trained
to discriminate among 20 disease categories from 50 symptoms were correct 51% -
64% of the time. The same methods were used to train classifiers to
discriminate between 2 of the diseases, from the same 50 symptoms, and produced
correct diagnoses 92% - 98% of the time.

The context in which a local pattern is identified raises problems related
to the issue of utilizing medical knowledge. It is difficult to find and use
classifiers that are best for a small decision, such as whether an area of an X-
ray is inside or outside the heart, and integrate those into a global
classifier, such as one for abnormal heart volume.

Accurate application of a classifier in a hospital setting also requires
that the measurements in that clinical environment are consistent with the
measurements used to train the classifier initially. For example, if diseases
and symptoms are defined differently in the new setting, or if lab test values
are reported in different ranges -- or different lab tests used -- then
decisions based on the classification are not reliable.

Pattern recognition techniques are often misapplied in medical domains in
which the assumptions are violated. Some of the difficulties noted above are
avoided in systems that integrate structural knowledge into the numerical
methods and in systems that integrate human and machine capabilities into
single, interactive systems. These modifications will overcome one of the major
difficulties seen in completely automated systems, that of providing the system
with good "intuitions" based on an expert’s a priori knowledge and experience
[46].

6 Bayesian Statistical Approaches

6.1 Overview

More work has been done on Bayesian approaches to computer-based medical
decision making than on any of the other methodologies we have discussed. The
appeal of Bayes’ Theorem12 is clear: it potentially offers an exact method for
computing the probability of a disease based on observations and data regarding
the frequency with which these observations are known to occur for specified
diseases. In several domains the technique has been shown to be exceedingly
accurate, but there are also several limitations to the approach which we
discuss below.

In its simplest formulation, Bayes’ Theorem can be seen as a mechanism to
calculate the probability of a disease, in light of specified evidence, from the
a priori probability of the disease and the conditional probabilities relating
the observations to the diseases in which they may occur. For example, suppose
disease D_i is one of n mutually exclusive diagnoses under consideration and E is
the evidence or observations supporting that diagnosis. Then if P(D_i) is the a
priori probability of the ith disease:

    P(D_i \mid E) = \frac{P(D_i) \, P(E \mid D_i)}{\sum_{j=1}^{n} P(D_j) \, P(E \mid D_j)}

The theorem can also be represented or derived in a variety of other forms,
including an odds/likelihood ratio formulation. We cannot include such details
here, but any introductory statistics book or Lusted’s classic volume [58]

presents the subject in considerable detail.
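
A small numerical illustration may help fix the formula. The three diseases, their a priori probabilities, and the conditional probabilities of the evidence are invented values chosen so the arithmetic is easy to follow; the function name is likewise hypothetical.

```python
def bayes_posterior(priors, likelihoods):
    """Posterior P(D_i | E) from priors P(D_i) and likelihoods P(E | D_i).

    Assumes the diseases are mutually exclusive and exhaustive, so the
    denominator is the sum of P(D_j) * P(E | D_j) over all diseases.
    """
    joint = {d: priors[d] * likelihoods[d] for d in priors}
    total = sum(joint.values())
    return {d: j / total for d, j in joint.items()}


# Invented numbers, for illustration only.
priors = {"A": 0.1, "B": 0.3, "C": 0.6}          # P(D_i)
likelihoods = {"A": 0.9, "B": 0.5, "C": 0.1}     # P(E | D_i)

posterior = bayes_posterior(priors, likelihoods)
# Joint terms are 0.09, 0.15 and 0.06, summing to 0.30, so the posteriors
# are approximately 0.3, 0.5 and 0.2 for A, B and C respectively.
```

Disease C is the most probable before the evidence is seen, yet B becomes the most probable afterwards, because the evidence is five times as likely under B as under C.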

 

12Also often referred to as Bayes’ rule, discriminant, or criterion.

Among the most commonly recognized problems with the utilization of a
Bayesian approach is the large amount of data required to determine all the
conditional probabilities needed in the rigorous application of the formula.
Chart review or computer-based analysis of large databanks occasionally allows
most of the necessary conditional probabilities to be obtained. A variety of
additional assumptions must be made. For example: (1) the diseases under
consideration are assumed mutually exclusive and exhaustive (i.e., the patient
is assumed to have one of the n diseases), (2) the clinical observations are
assumed to be conditionally independent over a given disease13, and (3) the
incidence of the symptoms of a disease is assumed to be stationary (i.e., the
model generally does not allow for changes in disease patterns over time).

One of the earliest Bayesian programs was Warner’s system for the diagnosis
of congenital heart disease [107]. He compiled data on 23 patients and generated
a symptom-disease matrix consisting of 53 symptoms (attributes) and 725 disease
entities. The diagnostic performance of the computer, based on the presence or
absence of the 53 symptoms in a new patient, was then compared to that of two
experienced physicians. The program was shown to reach diagnoses with an
accuracy equal to that of the experts. Furthermore, system performance was
shown to improve as the statistics in the symptom-disease matrix stabilized with
the addition of increasing numbers of patients.

In 1968 Gorry and Barnett pointed out that Warner’s program had required
making all 53 observations for every patient to be diagnosed, a situation which
would not be realistic for many clinical applications. They therefore utilized
a modification of Bayes’ Theorem in which observations are considered
sequentially. Their computer program analyzed observations one at a time,
suggested which test would be most useful if performed next, and included
termination criteria so that a diagnosis could be reached, when appropriate,
without needing to make all the observations [28]. Decisions regarding tests
and termination were made on the basis of calculations of expected costs and
benefits at each step in the logical process14. Using the same symptom-disease
matrix developed by Warner, they were able to attain equivalent diagnostic

 

13The purest form of Bayes’ Theorem allows conditional dependencies, and
the order in which evidence is obtained, to be explicitly considered in the
analysis. However, the number of required conditional probabilities is so
unwieldy that conditional independence of observations, and non-dependence on
the order of observations, is generally assumed [101].

14See the decision theory discussion in Section 7.

performance using only 6.9 tests on average15. They pointed out that, because
the costs of medical tests may be significant (in terms of patient discomfort,
time expended, and financial expense), the use of inefficient testing sequences
should be regarded as ineffective diagnosis. Warner has also more recently
included Gorry and Barnett’s sequential diagnosis approach in an application
regarding structured patient history-taking [110].
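
The flavor of sequential diagnosis can be conveyed by a simplified sketch. It is not Gorry and Barnett's procedure: their program chose the next test and the stopping point from expected costs and benefits, whereas the sketch below assumes a fixed test order and stops as soon as one posterior probability exceeds a threshold. It also assumes binary findings that are conditionally independent given the disease. All names and numbers are hypothetical.

```python
def update(posterior, finding, present, cond):
    """One Bayesian update for a single binary finding.

    cond[d][finding] is P(finding present | disease d); findings are assumed
    conditionally independent given the disease.
    """
    joint = {}
    for d, p in posterior.items():
        p_present = cond[d][finding]
        joint[d] = p * (p_present if present else 1.0 - p_present)
    total = sum(joint.values())
    return {d: j / total for d, j in joint.items()}


def sequential_diagnosis(priors, cond, observe, order, threshold=0.95):
    """Apply findings one at a time and stop once a disease is probable enough.

    Simplification (an assumption): a fixed order and a probability threshold
    stand in for cost-benefit based test selection and termination.
    """
    posterior = dict(priors)
    used = 0
    for finding in order:
        if max(posterior.values()) >= threshold:
            break
        posterior = update(posterior, finding, observe(finding), cond)
        used += 1
    best = max(posterior, key=posterior.get)
    return best, posterior[best], used


# Invented two-disease, three-finding example.
priors = {"A": 0.5, "B": 0.5}
cond = {
    "A": {"f1": 0.9, "f2": 0.8, "f3": 0.5},
    "B": {"f1": 0.1, "f2": 0.2, "f3": 0.5},
}
patient = {"f1": True, "f2": True, "f3": False}

best, prob, used = sequential_diagnosis(
    priors, cond, patient.get, ["f1", "f2", "f3"]
)
```

In this example the threshold is reached after two findings, so the third observation is never requested, which is the kind of saving in tests that the sequential approach is intended to provide.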

The medical computing literature now includes many examples of Bayesian
diagnosis programs, most of which have used the nonsequential approach, in
addition to the necessary assumptions of symptom independence and mutual
exclusivity of disease as discussed above. One particularly successful research
effort has been chosen for discussion.

6.2 Example

Since the late 1960’s deDombal and associates, at the University of Leeds
in England, have been studying the diagnostic process and developing computer-
based decision aids using Bayesian probability theory. Their area of
investigation has been gastrointestinal diseases, originally acute abdominal
pain [10] with more recent analyses of dyspepsia [39] and gastric carcinoma
[125].

Their program for assessment of acute abdominal pain was evaluated in the
emergency room of their affiliated hospital [10]. Emergency physicians filled
out data sheets summarizing clinical and laboratory findings on 304 patients
presenting with abdominal pain of acute onset. The data from these sheets
became the attributes that were subjected to Bayesian analysis; the required
conditional probabilities had been previously compiled from a large group of
patients with one of 7 possible diagnoses16. Thus the Bayesian formulation
assumed each patient had one of these diseases and would select the most likely
on the basis of recorded observations. Diagnostic suggestions were obtained in
batch mode and did not require direct interaction between physician and
computer; the program could generate results in 30 seconds to 15 minutes
depending upon the level of system use at the time of analysis [38]. Thus the
computer output could have been made available to the emergency room physician,
on average, within 5 minutes after the data form was completed and handed to the
technician assisting with the study.

 

15Tests for determining attributes were defined somewhat differently than
they had been by Warner. Thus the maximum number of tests was 21 rather than
the 53 observations used in the original study.

16Appendicitis, diverticulitis, perforated ulcer, cholecystitis, small
bowel obstruction, pancreatitis, and non-specific abdominal pain.

During the study [10], however, these computer-generated diagnoses were
simply saved and later compared to (a) the diagnoses reached by the attending
clinicians, and (b) the ultimate diagnosis verified at surgery or through
appropriate tests. Although the clinicians reached the correct diagnosis in
only 65%-80% of the 304 cases (with accuracy depending upon the individual’s
training and experience), the program was correct in 91.8% of cases.
Furthermore, in 6 of the 7 disease categories the computer was proved more
likely than the senior clinician in charge of a case to assign the patient to
the correct disease category. Of particular interest was the program’s accuracy
regarding appendicitis - a diagnosis which is often made incorrectly. In no
cases of appendicitis did the computer fail to make the correct diagnosis, and
in only six cases were patients with non-specific abdominal pain incorrectly
classified as having appendicitis. Based on the actual clinical decisions,
however, over 20 patients with non-specific abdominal pain were unnecessarily
taken to surgery for appendicitis, and in six cases patients with appendicitis
were "watched" for over eight hours before they were finally taken to the
operating room.

These investigators also performed a fascinating experiment in which they
compared the program’s performance based on data derived from 600 real patients,
with the accuracy the system achieved using "estimates" of conditional
probabilities obtained from experts [54].17 As discussed above, the program was
significantly more effective than the unaided clinician when real-life data were
utilized. However, it performed significantly less well than clinicians when
expert estimates were used. The results supported what several other observers
have found, namely that physicians often have very little idea of the “true”
probabilities for symptom-disease relationships.

Another Leeds study of note was an analysis of the effect of the system on
the performance of clinicians [11]. The trial we have mentioned that involved
304 patients was eventually extended to 552 before termination. Although the
computer’s accuracy remained in the range of 91% throughout this period, the
performance of clinicians was noted to improve markedly over time. Fewer
negative laparotomies were performed, for example, and the number of acute
appendices that perforated (ruptured) also declined. However, these data
reverted to baseline after the study was terminated, suggesting that the
constant awareness of computer monitoring and feedback regarding system
performance had temporarily generated a heightened awareness of intellectual
processes among the hospital’s surgeons.

17 Such estimates are referred to as "subjective" or "personal"
probabilities, and some investigators have argued that they should be utilized
in Bayesian systems when formally derived conditional probabilities are not
available [58].

6.3 Discussion of the Methodology

The ideal matching of the problem of acute abdominal pain and Bayesian
analysis must also be emphasized; the methodology cannot necessarily be as
effectively applied in other medical domains where the following limitations of
the Bayesian approach may have a greater impact.

(1) The assumption of conditional independence of symptoms usually does not
apply and can lead to substantial errors in certain settings [66]. This has led
some investigators to seek new numerical techniques that avoid the independence
assumption [8]. If a pure Bayesian formulation is utilized without making the
independence assumption, however, the number of required conditional
probabilities becomes prohibitive for complex real world problems [161].

(2) The assumption of mutual exclusivity and exhaustiveness of disease
categories is usually false. In actual practice concurrent and overlapping
disease categories are common. In deDombal’s system, for example, many of the
abdominal pain diagnoses missed were outside the seven "recognized"
possibilities; if a program starts with an assumption that it need only consider
a small number of defined likely diagnoses, it will inevitably miss the rare or
unexpected cases - precisely the ones with which the clinician is most apt to
need assistance.

(3) In many domains it may be inaccurate to assume that relevant
conditional probabilities are stable over time (e.g., the likelihood that a
particular bacterium will be sensitive to a specific antibiotic). Furthermore,
diagnostic categories and definitions are constantly changing, as are
physicians’ observational techniques, thereby invalidating data previously
accumulated. A similar problem results from variations in a priori
probabilities depending upon the population from which a patient is drawn. Some
observers feel that these are major limitations to the use of Bayesian
techniques [13].
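
The first limitation above can be made concrete with a small counting sketch.
It assumes every finding is binary (present or absent), which is a
simplification introduced here; the function names and the choice of 7
diseases and 20 findings are illustrative only.

```python
def independent_parameter_count(n_diseases, n_findings):
    """Conditional probabilities needed when findings are assumed independent.

    One value of P(finding present | disease) per finding per disease.
    """
    return n_diseases * n_findings


def joint_parameter_count(n_diseases, n_findings):
    """Conditional probabilities needed without the independence assumption.

    Each disease needs a full distribution over all 2**n combinations of
    binary findings, of which 2**n - 1 values are free.
    """
    return n_diseases * (2 ** n_findings - 1)


# Illustrative sizes: 7 disease categories and 20 binary findings.
with_independence = independent_parameter_count(7, 20)    # 140
without_independence = joint_parameter_count(7, 20)       # 7340025
```

Under these assumptions the independence assumption needs 140 probabilities,
while the unrestricted formulation needs over seven million, which is the
sense in which the requirement becomes prohibitive.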

In general, then, a purely Bayesian approach can so constrain problem
formulation as to make a particular application unrealistic and hence
unworkable. Furthermore, even when diagnostic performance is excellent such as
in deDombal’s approach to abdominal pain evaluation, clinical implementation and
system acceptance will generally be difficult.


 

7 Decision Theoretical Approaches
7.1 Overview

Bayes’ Theorem is only one of several techniques used in the larger field
of decision analysis, and there has recently been increasing interest in the
ways in which decision theory might be applied to medicine and adapted for
automation. Several excellent reviews of the field are available in basic
reviews [40], textbooks [78], and medically-oriented journal articles [61],
[87], [102]. In general terms, decision analysis can be seen as any attempt to
consider values associated with choices, as well as probabilities, in order to
analyze the processes by which decisions are made or should be made. Schwartz
identifies the calculation of "expected value" as central to formal decision
analysis [87]. Ginsberg contrasts medical classification problems (e.g.,
diagnosis) with broader decision problems (e.g., "What should I do for this
patient?"), and asserts that most important medical decisions fall in the latter
category and are best approached through decision analysis [25]. The following
topics are among the central issues in the field.

(1) Decision Trees. The decision making process can be seen as a sequence
of steps in which the clinician selects a path through a network of plausible
events and actions. Nodes in this tree-shaped network are of two kinds:
decision nodes, where the clinician must choose from a set of actions, and
chance nodes, where the outcome is not directly controlled by the clinician but
is a probabilistic response of the patient to some action taken. For example, a
physician may choose to perform a certain test (decision node) but the
occurrence or nonoccurrence of complications may be largely a matter of
statistical likelihood (chance node). By analyzing a difficult decision process
before taking any actions, it may be possible to delineate in advance all
pertinent chance and decision nodes, all plausible outcomes, plus the paths by
which these outcomes might be reached. Furthermore, data may exist to allow
specific probabilities to be associated with each chance node in the tree.

(2) Expected Values. In actual practice physicians make sequential
decisions based on more than the probabilities associated with the chance node
that follows. For example, the best possible outcome is not necessarily sought
if the costs associated with that "path" far outweigh those along alternate
pathways (e.g., a definitive diagnosis may not be sought if the required testing
procedure is expensive or painful and patient management will be unaffected;
similarly, some patients prefer to "live with" an inguinal hernia rather than
undergo a surgical repair procedure). Thus anticipated "costs" (financial,
complications, discomfort, patient preference) can be associated with the
decision nodes. Using the probabilities at chance nodes, the costs at decision
nodes, and the "value" of the various outcomes, an "expected value" for each
pathway through the tree (and in turn each node) can be calculated. The ideal
pathway, then, is the one which maximizes the expected value.

(3) Eliciting Values. Obtaining from physicians and patients the costs and
values they associate with various tests and outcomes can be a formidable
problem, particularly since formal analysis requires expressing the various
costs in standardized units. One approach has been simply to ask for value
ratings on a hypothetical scale, but it can be difficult to get the physician or
patient to keep the values18 separate from their knowledge of the probabilities
linked to the associated chance nodes. An alternate approach has been the
development of lottery games. Inferences regarding values can be made by
identifying the odds, in a hypothetical lottery, at which the physician or
patient is indifferent regarding taking a course of action with certain outcome
and betting on a course with preferable outcome but with a finite chance of
significant negative costs if the "bet" is lost. In certain settings this
approach may be accepted and provide important guidelines in decision making
[71].

(4) Test Evaluation. Since the tests which lie at decision nodes are
central to clinical decision analysis, it is crucial to know the predictive
value of tests that are available. This leads to consideration of test
sensitivity, specificity, receiver operator characteristic curves, and
sensitivity analysis. Such issues are discussed by Komaroff et al. in this
issue of the PROCEEDINGS and have also been summarized elsewhere in the clinical
literature [62].
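
The decision tree and expected value ideas in items (1) and (2) can be
sketched as a small recursive calculation. This is only an illustration under
stated assumptions: the class names, the convention of subtracting a cost at
each decision option, and all numbers in the hernia example are hypothetical
and are not drawn from any of the cited studies.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass
class Outcome:
    value: float  # value assigned to a final outcome


@dataclass
class Chance:
    branches: List[Tuple[float, "Node"]]  # (probability, child node)


@dataclass
class Decision:
    options: List[Tuple[str, float, "Node"]]  # (action name, cost, child node)


Node = Union[Outcome, Chance, Decision]


def expected_value(node):
    """Fold the tree back from the leaves to the given node."""
    if isinstance(node, Outcome):
        return node.value
    if isinstance(node, Chance):
        # Chance node: probability-weighted average of the children.
        return sum(p * expected_value(child) for p, child in node.branches)
    # Decision node: choose the option with the best value net of its cost.
    return max(expected_value(child) - cost for _, cost, child in node.options)


def best_action(node):
    """Name of the option that maximizes expected value at a decision node."""
    return max(node.options, key=lambda o: expected_value(o[2]) - o[1])[0]


# Hypothetical version of the hernia choice mentioned in the text.
tree = Decision(options=[
    ("live_with", 0.0, Chance([(0.9, Outcome(80.0)), (0.1, Outcome(40.0))])),
    ("repair", 10.0, Chance([(0.95, Outcome(100.0)), (0.05, Outcome(0.0))])),
])
```

With these made-up numbers, living with the hernia has an expected value of
76 and repair has 95 less a cost of 10, so `best_action(tree)` returns
"repair". A patient who attaches a higher cost to surgery would shift the
result, which is why the elicitation problem in item (3) matters.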

Many of the major studies of clinical decision analysis have not
specifically involved computer implementations. Schwartz et al. examined the
workup of renal vascular hypertension, developing arguments to show that for
certain kinds of cases a purely qualitative theoretical approach was feasible
and useful [87]. However, they showed that for more complex clinically
challenging cases the decisions could not be adequately sorted out without the
introduction of numerical techniques. Since it was impractical to assume that
clinicians would ever take the time to carry out a detailed quantitative
decision analysis by hand, they pointed out the logical role for the computer in
assisting with such tasks and accordingly developed the system we discuss as an
example below [29].

18 Also termed "utilities" in some references; hence the term "utility
theory" [ ].

Other colleagues of Schwartz at Tufts have been similarly active in
applying decision theory to clinical problems. Pauker and Kassirer have
examined applications of formal cost-benefit analysis to therapy selection [68]
and Pauker has also looked at possible applications of the theory to the
management of patients with coronary artery disease [70]. An entire issue of
the New England Journal of Medicine has also been devoted to papers on this
methodology [41].

7.2 Example

Computer implementations of clinical decision analysis have appeared with
increasing frequency since the mid-1960’s. Perhaps the earliest major work was
that of Ginsberg at Rand Corporation [24], with more recent systems reported by
Pliskin and Beck [74] and Safran et al. [85].

We will briefly describe here the program of Gorry et al., developed for
the management of acute renal failure [29]. Drawing upon Gorry’s experience
with the sequential Bayesian approach previously mentioned [28], the
investigators recognized the need to incorporate some way of balancing the
dangers and discomforts of a procedure against the value of the information to
be gained. They divided their program into two parts: phase I considered only
tests with minimal risk (e.g., history, examination, blood tests) and phase II
considered procedures involving more risk and inconvenience. The phase I
program considered 14 of the most common causes of renal failure and utilized a
sequential test selection process based on Bayes’ Theorem and omitting more
advanced decision theoretical methodology [28]. The conditional probabilities
utilized were subjective estimates obtained from an expert nephrologist and were
therefore potentially as problematic as those discussed by Leaper et al. [54]
(see Section 6.2). The researchers found that they had no choice but to use
expert estimates, however, since detailed quantitative data were not available
either in databanks or in the literature.

It is in the phase II program that the methods of decision theory were
employed because it was in this portion of the decision process that the risks
of procedures became important considerations. At each step in the decision
process this program considers whether it is best to treat the patient
immediately or to first carry out an additional diagnostic test. To make this
decision the program identifies the treatment with the highest current expected
value (in the absence of further testing), and compares this with the expected
values of treatments that could be instituted if another diagnostic test were
performed. Comparisons of the expected values are made in light of the risk of
the test in order to determine whether the overall expected value of the test is
greater than that of immediate treatment. The relevant values and probabilities
of outcomes of treatment were obtained as subjective estimates from
nephrologists in the same way that symptom-disease data had been obtained. All
estimates were gradually refined as the investigators gained experience using
the program, however.
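
The treat-now versus test-first comparison described above can be sketched as
follows. This is not the authors' program; it is a minimal illustration that
assumes two candidate diseases, two treatments, a single test with two
possible results, and a test risk expressed in the same units as the treatment
values. All names and numbers are hypothetical.

```python
def best_treatment_value(disease_probs, treatment_values):
    """Highest expected value over treatments for the current probabilities."""
    return max(
        sum(disease_probs[d] * values[d] for d in disease_probs)
        for values in treatment_values.values()
    )


def value_of_testing(disease_probs, treatment_values, result_given_disease, test_risk):
    """Expected value of testing first, then giving the best treatment."""
    results = next(iter(result_given_disease.values())).keys()
    total = 0.0
    for r in results:
        # Probability of observing this result over all diseases.
        p_r = sum(disease_probs[d] * result_given_disease[d][r] for d in disease_probs)
        if p_r == 0:
            continue
        # Bayes' Theorem: revise the disease probabilities given the result.
        posterior = {
            d: disease_probs[d] * result_given_disease[d][r] / p_r
            for d in disease_probs
        }
        total += p_r * best_treatment_value(posterior, treatment_values)
    # The risk of the test is charged against the expected value.
    return total - test_risk


# Hypothetical inputs.
disease_probs = {"disease_a": 0.4, "disease_b": 0.6}
treatment_values = {
    "treatment_x": {"disease_a": 100.0, "disease_b": 20.0},
    "treatment_y": {"disease_a": 30.0, "disease_b": 90.0},
}
result_given_disease = {
    "disease_a": {"positive": 0.9, "negative": 0.1},
    "disease_b": {"positive": 0.2, "negative": 0.8},
}

treat_now = best_treatment_value(disease_probs, treatment_values)
test_first = value_of_testing(
    disease_probs, treatment_values, result_given_disease, test_risk=5.0
)
```

With these numbers immediate treatment is worth 66 and testing first is worth
77.8, so the test would be chosen; raising `test_risk` to 20 reverses the
decision.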

The program was evaluated on 18 test cases in which the true diagnosis was
uncertain but two expert nephrologists were willing to make management
decisions. In 14 of the cases the program selected the same therapeutic plan or
diagnostic test as was chosen by the experts. For three of the four remaining
cases the program’s decision was the physicians’ second choice and was, they
felt, a reasonable alternative plan of action. In the last case the physicians
also accepted the program’s decision as reasonable although it was not among
their first two choices.

7.3 Discussion of the Methodology

The excellent performance of Gorry’s program, despite its reliance on
subjective estimates from experts, may serve to emphasize the importance of the
clinical analysis that underlies the decision theoretical approach. The
reasoning steps in managing clinical cases have been dissected in such detail
that small errors in the probability estimates are apparently much less
important than they were for deDombal’s purely Bayesian approach [54]. Gorry
suggests this may be simply because the decisions made by the program are based
on the combination of large aggregates of such numbers, but this argument should
apply equally for a Bayesian system. It seems to us more likely that
distillation of the clinical domain in a formal decision tree gives the program
so much more knowledge of the clinical problem that the quantitative details
become somewhat less critical to overall system operation. The explicit
decision network is a powerful knowledge structure; the "knowledge" in
deDombal’s system lies in conditional probabilities alone and there is no larger
scheme to override the propagation of error as these probabilities are
mathematically manipulated by the Bayesian routines.

The decision theory approach is not without problems, however. Perhaps the
most difficult problem is assigning numerical values (e.g., dollars) to a human
life or a day of health, etc. Some critics feel this is a major limitation to
the methodology [112]. Overlapping or coincident diseases are also not well-
managed, unless specifically included in the analysis, and the Bayesian
foundation for many of the calculations still assumes mutually exclusive and
exhaustive disease categories. Problems of symptom conditional dependence still
remain, and there is no easy way to include knowledge regarding the time course
of diseases. Gorry points out that his program was also incapable of
recognizing circumstances in which two or more actions should be carried out
concurrently. Furthermore, decision theory per se does not provide the kind of
focusing mechanisms that clinicians tend to use when they assume an initial
diagnostic hypothesis in dealing with a patient and discard it only if
subsequent data make that hypothesis no longer tenable. Other similar
strategies of clinical reasoning are becoming increasingly well-recognized [48]
and account in large part for the applications of symbolic reasoning techniques
to be discussed in the next section.

 

8 Symbolic Reasoning Approaches
8.1 Overview

In the early 1970’s researchers at several institutions simultaneously
began to investigate the potential applications to clinical decision making of
symbolic reasoning techniques drawn from the branch of computer science known as
artificial intelligence (AI). The field is well-reviewed in a recent book by
Winston [120]. Although the term "artificial intelligence" has never been
uniformly defined, it is generally accepted to include those computer
applications in which the tasks require largely symbolic inference rather than
numeric calculation. Examples include programs that reason about mineral
exploration, organic chemistry, or molecular biology; programs that converse in
English and understand spoken sentences; and programs that generate theories
from observations.

Such programs gain their power from qualitative, experimental judgments -
codified in so-called "rules-of-thumb" or "heuristics" - in contrast to
numerical calculation programs whose power derives from the analytical equations
used. The heuristics focus the attention of the reasoning program on parts of
the problem that seem most critical and parts of the knowledge base that seem
most relevant. They also guide the application of the domain knowledge to an
individual case by deleting items from consideration as well as focusing on
items. The result is that these programs pursue a line of reasoning as opposed
to following a sequence of steps in a calculation. Among the earliest symbolic
inference programs in medicine was the diagnostic interviewing system of
Kleinmuntz [49]. Other early work included Wortman’s information processing
system, the performance of which was largely motivated by a desire to understand
and simulate the psychological processes of neurologists reaching diagnoses
[121].

It was a landmark paper by Gorry in 1973, however, that first critically
analyzed conventional approaches to computer-based clinical decision making and
outlined his motivation for turning to newer symbolic techniques [30]. He used
the acute renal failure program discussed in Section 7.2 [29] as an example of
the problems arising when decision analysis is used alone. In particular, he
analyzed some of the cases on which the renal failure program had failed but the
physicians considering the cases had performed well. His conclusions from these
observations include the following four points.

(1) Clinical judgment is based less on detailed knowledge of
pathophysiology than it is on gross chunks of knowledge and a good deal of
detailed experience from which rules of thumb are derived.

(2) Clinicians know facts, of course, but their knowledge is also largely
judgmental. The rules they learn allow them to focus attention and generate
hypotheses quickly. Such heuristics permit them to avoid detailed search
through the entire problem space.

(3) Clinicians recognize levels of belief or certainty associated with many
of the rules they use, but they do not routinely quantitate or utilize these
certainty concepts in any formal statistical manner.

(4) It is easier for experts to state their rules in response to perceived
misconceptions in others than it is for them to generate such decision criteria
a priori.

In the renal failure program medical knowledge had been embedded in the
structure of the decision tree. This knowledge was never explicit, and
additions to the experts’ judgmental rules had generally required changes to the
tree itself.


Based on observations such as those above, Gorry identified at least three
important problems for investigation:

(1) Concept Formation. Clinical decision aids had traditionally
had no true "understanding" of medicine. Although explicit decision
trees had given the decision theory programs a greater sense of the
pertinent associations, medical knowledge and the heuristics for
problem solving in the field had never been explicitly represented nor
utilized. So-called "common sense" was often clearly lacking when the
programs failed, and this was often what most alienated potential
physician users.

(2) Language Development. Both for capturing knowledge from
collaborating experts, and for communicating with physician users,
Gorry argued that further research on the development of computer-
based linguistic capabilities was crucial.

(3) Explanation. Diagnostic programs had seldom emphasized an
ability to explain the basis for their decisions in terms
understandable to the physician. System acceptability was therefore
inevitably limited; the physician would often have no basis for
deciding whether to accept the program’s advice, and might therefore
resent what could be perceived as an attempt to dictate the practice
of medicine.

Gorry’s group at MIT and Tufts developed new approaches to examining the
renal failure problem in light of these observations [69].

Due to the limitations of the older techniques, it was perhaps inevitable
that some medical researchers would turn to the AI field for new methodologies.
Major research areas in AI include knowledge representation, heuristic search,
natural language understanding and generation, and models of thought processes
-- all topics clearly pertinent to the problems we have been discussing.
Furthermore, AI researchers were beginning to look for applications to which
they could apply some of the techniques they had developed in theoretical
domains. This community of researchers has grown in recent years, and a recent
issue of Artificial Intelligence was devoted entirely to applications of AI to
biology, medicine and chemistry [ ].19

 

19 Many of the systems described in this issue were developed on the
SUMEX-AIM computing resource, a nationally shared system devoted entirely to
applications of AI to the biomedical sciences. The SUMEX-AIM computer is
physically located at Stanford University but is used by researchers nationwide
via connections to the TYMNET. The resource is funded by the Division of
Research Resources, Biotechnology Branch, National Institutes of Health.


Among the programs using symbolic reasoning techniques are several systems
that have been particularly novel and successful. Pople and Myers have
developed a system called INTERNIST that assists with test selection for the
diagnosis of all diseases in internal medicine [75]. This awesome task has been
remarkably successful to date, with the program correctly diagnosing a large
percentage of the complex cases selected from clinical pathologic conferences in
the major medical journals.20 The program utilizes a hierarchic disease
categorization, an ad hoc scoring system for quantifying symptom-disease
relationships, plus some clever heuristics for focusing attention,
discriminating between competing hypotheses, and diagnosing concurrent diseases
[76]. The system currently has an inadequate human interface, however, and is
not yet implemented for clinical trials.

At Rutgers University Weiss, Kulikowski, and Safir have developed a model
of ophthalmologic reasoning regarding disease processes in the eye, specifically
glaucoma [117]. In this specialized application area it has been possible to map
relationships between observations, pathophysiologic states, and disease
categories. The resulting causal associational network (termed CASNET) forms
the basis for a reasoning program that gives advice regarding disease states in
glaucoma patients and generates management recommendations.

For the AI researchers the question of how best to manage uncertainty in
medical reasoning remains a central issue. All the programs mentioned have
developed ad hoc weighting programs and avoided formal statistical approaches.
Others have turned to the work of statisticians and philosophers of science who
have devised theories of approximate or inexact reasoning. For example,
Wechsler [114] describes a program that is based upon Zadeh’s fuzzy set theory
[124]. Shortliffe and Buchanan [94] have turned to confirmation theory for their
model of inexact reasoning in medicine.

8.2 Example

The symbolic reasoning program selected for discussion is the MYCIN System
at Stanford University [95]. The researchers cited a variety of design
considerations which motivated the selection of AI methodologies for the
consultation system they were developing [92]. They primarily wanted it to be
useful to physicians and therefore emphasized the selection of a problem domain
in which physicians had been shown to err frequently, namely the selection of
antibiotics for patients with infections.

20 Data communicated by Drs. Pople and Myers at the Second Annual A.I.M.
Workshop, Rutgers University, June 1976.

They also cited human issues that they felt were crucial to make the system
acceptable to physicians:

(1) it should be able to explain its decisions in terms of a line of reasoning that
a physician can understand;

(2) it should be able to justify its performance by responding to questions
expressed in simple English;

(3) it should be able to "learn" new information rapidly by interacting directly
with experts;

(4) its knowledge should be easily modifiable so that perceived errors can be
corrected rapidly before they recur in another case; and

(5) the interaction should be engineered with the user in mind (in terms of
prompts, answers, and information volunteered by the system as well as by
the users).

All these design goals were based on the observation that previous computer
decision aids had generally been poorly accepted by physicians, even when they
were shown to perform well on the tasks for which they were designed. MYCIN’s
developers felt that barriers to acceptance were largely conceptual and could be
counteracted in large part if a system were perceived as a clinical tool rather
than a dogmatic replacement for the primary physician’s own reasoning.

Knowledge of infectious diseases is represented in MYCIN as production
rules, each containing a "packet" of knowledge obtained from collaborating
experts [95].21 A production rule is simply a conditional statement which
relates observations to associated inferences that may be drawn. For example, a
MYCIN rule might state that "if a bacterium is a gram positive coccus growing in
chains, then it is apt to be a streptococcus." MYCIN’s power is derived from
such rules in a variety of ways:

(1) it is the program that determines which rules to use and how they should be
chained together to make decisions about a specific case;22

(2) the rules can be stored in a machine-readable format but translated into
English for display to physicians;

(3) by removing, altering, or adding rules, the system’s knowledge structures
can be rapidly modified without explicitly restructuring the entire
knowledge base; and

(4) the rules themselves can often form a coherent explanation of system
reasoning if the relevant ones are translated into English and displayed in
response to a user’s question.

21 Production rules are a methodology frequently employed in AI research
[9] and effectively applied to other scientific problem domains [4].

22 The control structure utilized is termed "goal-oriented" and is similar
to the consequent-theorem methodology used in Hewitt’s PLANNER [37].

Associated with all rules and inferences are numerical weights reflecting
the degree of certainty associated with them. These numbers, termed certainty
factors, form the basis for the system’s inexact reasoning in this complex task
domain [94]. They allow the judgmental knowledge of experts to be captured in
rule form and then utilized in a consistent fashion.
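
The combination of production rules, goal-oriented chaining, and certainty
factors can be sketched in a few lines. This is a deliberately simplified
illustration and not MYCIN's actual code: it handles only positive evidence,
takes the minimum over a rule's premises, and merges rules that reach the same
conclusion with the commonly described formula CF = a + b(1 - a). The rule
contents, weights, and finding names are hypothetical, including the
placeholder second rule.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Rule:
    premises: Tuple[str, ...]  # all premises must hold (conjunction)
    conclusion: str
    cf: float                  # certainty factor attached to the rule


def combine(cf_a, cf_b):
    """Merge two positive certainty factors for the same conclusion."""
    return cf_a + cf_b * (1.0 - cf_a)


def certainty(goal, rules, findings, _seen=frozenset()):
    """Work backward from a goal to the findings that support it."""
    if goal in findings:
        return findings[goal]      # directly observed, with its own certainty
    if goal in _seen:
        return 0.0                 # guard against circular rule chains
    total = 0.0
    for rule in rules:
        if rule.conclusion != goal:
            continue
        # A conjunction is only as certain as its weakest premise.
        premise_cf = min(
            certainty(p, rules, findings, _seen | {goal}) for p in rule.premises
        )
        if premise_cf > 0:
            total = combine(total, rule.cf * premise_cf)
    return total


# Hypothetical rule base, loosely following the example in the text.
rules = [
    Rule(("retains_crystal_violet",), "gram_positive", 1.0),
    Rule(("gram_positive", "coccus", "grows_in_chains"), "streptococcus", 0.7),
    Rule(("placeholder_supporting_finding",), "streptococcus", 0.5),
]
findings = {
    "retains_crystal_violet": 1.0,
    "coccus": 1.0,
    "grows_in_chains": 0.8,
    "placeholder_supporting_finding": 1.0,
}

streptococcus_cf = certainty("streptococcus", rules, findings)
```

Because each rule is a separate data item, removing or adding one changes the
result without restructuring the rest of the program, which is the
modifiability property claimed in item (3) above.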

The MYCIN System has been evaluated regarding its performance at therapy
selection for patients with either septicemia [123] or meningitis [122]. The
program performs comparably with experts in these two task domains, but as yet
it has no rules regarding the other infectious disease problem areas. Further
knowledge base development will therefore be required before MYCIN is made
available for clinical use; hence questions regarding its acceptability to
physicians cannot yet be assessed. However, the required implementation stages
have been delineated [93], attention has been paid to all the design criteria
mentioned above, and the program does have a powerful explanation capability
[ ].

8.3 Discussion of the Methodology

Symbolic reasoning techniques differ from the other methodologies mentioned
in this article in that the computer techniques themselves are as yet
experimental and rapidly changing. Whereas the computations involved in Bayes’
Theorem, for example, involve straightforward application of computing
techniques already well-developed, basic researchers in computer science
continue to develop new methodologies for knowledge representation, language
understanding, heuristic search, and the other symbolic reasoning problems we
have mentioned. Thus the AI programs tend to be developed in highly
experimental environments where short term practical results are often unlikely
to be found. The programs typically require large amounts of space and tend to
be slow, particularly in time-sharing environments. As has been true for most
of the methodologies discussed, AI researchers have still not developed adequate
methods for handling concurrent diseases, assessing the time course of disease,
or acquiring adequate structured knowledge from experts. Furthermore, inexact
reasoning techniques tend to be developed and justified largely on intuitive
grounds.

Despite these significant limitations, the techniques of artificial
intelligence do provide a way to respond to many of Gorry’s observations
regarding the inadequacies of prior methodologies as described above [30].
There are now several programs responsive to his criticisms. Szolovits and
Pauker have recently reviewed some applications of AI to medicine and have
attempted to weigh the successes of this young field against the very real
problems that lie ahead [101]. They identify several serious deficiencies of
current systems. For example, termination criteria are still poorly understood.
Although INTERNIST can diagnose simultaneous diseases, it also pursues all
abnormal findings to completion, even though a clinician often ignores minor
unexplained abnormalities if the rest of a patient’s clinical status is well
understood. In addition, although some of these programs now cleverly mimic
some of the reasoning styles observed in experts [14], [48], it is less clear how
to keep the systems from abandoning one hypothesis and turning to another one as
soon as new information suggests another possibility. Programs that operate
this way appear to digress from one topic to another -- a characteristic that
decidedly alienates a user regardless of the validity of the final diagnosis or
advice.

9 Conclusions

This review has shown that there are two recurring issues to confront in
considering the field of computer-based clinical decision making:

(1) How can we design systems that reach better, more reliable decisions in a
broad range of applications, and

(2) How can we more effectively encourage the use of such systems by physicians
or other intended users?

We shall summarize by reviewing these points separately.

Performance Issues

Central to assuring a program’s adequate performance is a matching of the
most appropriate technique with the problem domain. We have seen that the
structured logic of clinical algorithms can be effectively applied to triage
functions and other primary care problems, but they would be less naturally
matched with complex tasks such as the diagnosis and management of acute renal
failure. Good statistical data may support an effective Bayesian program in
settings where diagnostic categories are small in number, non-overlapping, and
well-defined, but the lack of higher level domain knowledge limits the
effectiveness of the Bayesian approach in more complex patient management or
diagnostic environments. A mathematical approach may support decision making in
certain well-described fields in which observations are typically quantified
and related by functional expressions. These examples, and others, demonstrate
the need for thoughtful consideration of the technique most appropriate for
managing a clinical problem. In general the simplest effective methodology is
to be preferred, but acceptability issues must also be considered as discussed
below.
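
The Bayesian setting described above (a small number of non-overlapping, well-defined diagnostic categories) can be illustrated with a minimal Python sketch. It assumes that findings are conditionally independent given the disease, which is a simplification. The disease names, findings, and every probability below are hypothetical and chosen only for illustration.

```python
def posterior(priors, likelihoods, findings):
    """Apply Bayes' theorem over mutually exclusive diagnostic categories.

    priors:      {disease: P(disease)}
    likelihoods: {disease: {finding: P(finding | disease)}}
    findings:    observed findings, assumed conditionally independent
    """
    joint = {}
    for disease, prior in priors.items():
        p = prior
        for finding in findings:
            p *= likelihoods[disease][finding]
        joint[disease] = p
    total = sum(joint.values())
    # Normalize so the posteriors over all categories sum to one.
    return {disease: p / total for disease, p in joint.items()}


# Hypothetical numbers, for illustration only.
priors = {"appendicitis": 0.2, "nonspecific pain": 0.8}
likelihoods = {
    "appendicitis": {"rebound tenderness": 0.8, "nausea": 0.7},
    "nonspecific pain": {"rebound tenderness": 0.1, "nausea": 0.4},
}
result = posterior(priors, likelihoods, ["rebound tenderness", "nausea"])
```

The sketch holds no knowledge beyond the probability tables, which is the limitation noted above for more complex patient management tasks.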

It is also always appropriate to ask whether computer-based approaches are
needed at all for a given decision making task. The clinical algorithm
developers, for example, have almost uniformly discarded the machine, and
Schwartz et al. pointed out that a useful decision analysis can often be
accomplished in a qualitative manner using paper and pencil [87].

Finally, it is important to consider the extent to which a program’s
"understanding" of its task domain will heighten its performance, particularly
in settings where knowledge of the field tends to be highly judgmental and
poorly quantified. We use the term "understanding" here to refer to the degree
of judgmental or structural knowledge (as opposed to data) that is contained in
the program. Analyses of human clinical decision making [14], [48] suggest that
as decisions move from simple to complex, a physician’s reasoning style becomes
less algorithmic and more heuristic, with qualitative judgmental knowledge and
the conditions for invoking it coming increasingly into play. It is likely that
medical computing researchers will similarly have to become "knowledge
engineers" in the sense that they will look for effective ways to match the
knowledge structures that they use to the complexity of the tasks they are
undertaking.

Acceptability Issues

A recurring observation as one reviews the literature of computer-based
medical decision making is that essentially none of the systems has been
effectively utilized outside of a research environment, even when its
performance has been shown to be excellent! This suggests that it may be an
error to concentrate our research effort primarily on improving the decision
making performance of computers when there is evidently much more required
before these systems will have clinical impact. It is tempting to conclude that
the biases of medical personnel against computers are so strong that systems
will inevitably be rejected, regardless of performance, and in fact there are
some data to support this view [99]. However, we are beginning to see examples
of applications in which initial resistance to automated techniques has
gradually been overcome through the incorporation of adequate system benefits
[113].

Perhaps one of the most revealing lessons on this subject is an observation
regarding the system of Mesel et al. that we described earlier [64]. Despite
documented physician resistance to clinical algorithms in other settings [34],
the physicians in Mesel’s study accepted the guidance of protocols for the
management of chemotherapy in their cancer patients. It is likely that the key
to acceptance in this instance is the fact that these physicians had previously
had no choice but to refer their patients with cancer to the tertiary care
center in Birmingham where all complex chemotherapy was administered. The
introduction of the protocols permitted these physicians to undertake tasks that
they had previously been unable to do, and it simultaneously allowed maintenance
of close doctor-patient relationships and helped the patients avoid frequent
long trips to the center. The motivation for the physician to use the system is
clear in this case. It is reminiscent of Rosati’s assertion that physicians
will first welcome computer decision aids when they become aware that colleagues
who are using the machine have a clear advantage in their practice [81].

A heightened awareness of "human engineering" issues among medical
computing researchers is also apt to help improve acceptance of computers by
physicians. Fox has recently reviewed this field in detail [18]. The issues
range from the mechanics of interaction at a computer terminal to program
characteristics designed to make the system appear as a tool for the physician
rather than a dogmatic advice-giving machine.

Adequate attention must also be given to the severe time constraints
perceived by physicians. Ideally they would like programs to take no more time
than they currently spend when accomplishing the same task on their own. Time
and schedule pressures are similarly likely to explain the greater resistance to
automation among interns and residents than among medical students or practicing
physicians in Startsman’s study [99].

Finally, it must be noted that acceptability issues should generally be
considered from the outset in a system's design because they may dictate the
choice of methodology as much as the task domain itself does. The role of
formal knowledge structures to facilitate explanation capabilities, for example,
may argue in favor of using symbolic reasoning techniques even when a somewhat
less complex methodology might have been adequate for the decision task.

In summary, the trend towards increased use of knowledge engineering
techniques for clinical decision programs has been in response to desires for
both improved performance and improved acceptance of such systems. As greater
experience is gained with these techniques and they become better known
throughout the medical computing community, it is likely that we will see
increasingly powerful unions between symbolic reasoning and the alternate
methodologies we have discussed. One lesson to be drawn lies in the recognition
that there is basic computer science research to be done in medical computing,
and that the field is more than the application of established computing
techniques in medical domains.

Acknowledgments

We wish to thank R. Blum, L. Fagan, J. King, J. Kunz, H. Sox, and G. Wiederhold
for their thoughtful advice in reviewing earlier drafts of this paper.


References

1. Armitage, P. and Gehan, E.A. "Statistical methods for the identification
     and use of prognostic factors." Int. J. Cancer 13, pp. 16-36 (1974).

2. Bleich, H.L. "Computer evaluation of acid-base disorders." J. Clin.
     Invest. 48, pp. 1689-1696 (1969).

3. Bleich, H.L. "The computer as a consultant." N. Eng. J. Med. 284,
     pp. 141-147 (1971).

4. Bleich, H.L. "Computer-based consultation: electrolyte and acid-base
     disorders." Amer. J. Med. 53, pp. 285-291 (1972).

5. Blum, R.L. and Wiederhold, G. "Inferring knowledge from clinical data
     banks: utilizing techniques from artificial intelligence." Proc. 2nd Ann.
     Symp. on Comp. Applic. in Med. Care, IEEE, Washington D.C., November 1978,
     pp. 305-307.

6. Buchanan, B.G. and Feigenbaum, E.A. "Dendral and Meta-Dendral: their
     applications dimension." Artificial Intelligence 11, pp. 5-24 (1978).

7. Croft, D.J. "Is computerized diagnosis possible?" Comp. Biomed. Res. 5,
     pp. 351-367 (1972).

8. Cumberbatch, J. and Heaps, H.S. "A disease-conscious method for sequential
     diagnosis by use of disease probabilities without assumption of symptom
     independence." Int. J. Biomed. Comput. 7, pp. 61-78 (1976).

9. Davis, R. and King, J. "An overview of production systems." In Machine
     Representation of Knowledge (E.W. Elcock and D. Michie, eds.), New York:
     Wiley, 1976.

10. deDombal, F.T., Leaper, D.J., Staniland, J.R., et al. "Computer-aided
     diagnosis of acute abdominal pain." Brit. Med. J. 2, pp. 9-13 (1972).

11. deDombal, F.T., Leaper, D.J., Horrocks, J.C., et al. "Human and
     computer-aided diagnosis of abdominal pain: further report with emphasis
     on performance of clinicians." Brit. Med. J. 1, pp. 376-380 (1974).

12. Duda, R.O. and Hart, P.E. Pattern Classification and Scene Analysis. New
     York: Wiley, 1973.

13. Edwards, W. "N=1: diagnosis in unique cases." In Computer Diagnosis and
     Diagnostic Methods (J.A. Jacquez, ed.), Springfield, Ill.: Charles C.
     Thomas, 1972, pp. 139-151.

14. Elstein, A.S., Shulman, L.S., and Sprafka, S.A. Medical Problem Solving:
     An Analysis of Clinical Reasoning. Cambridge, Mass.: Harvard Univ. Press,
     1978.

15. Feigenbaum, E.A. "The Art of Artificial Intelligence: Themes and case
     studies of knowledge engineering." AFIPS Conference Proc., NCC 1978, Vol.
     47. Montvale, N.J.: AFIPS Press, 1978, p. 227.


16. Feinstein, A.R. "Quality of data in the medical record." Comput. Biomed.
     Res. 3, pp. 426-435 (1970).

17. Feinstein, A.R., Rubinstein, J.F., and Ramshaw, W.A. "Estimating prognosis
     with the aid of a conversational mode computer program." Anns. Int. Med.
     76, pp. 911-921 (1972).

18. Fox, J. "Medical computing and the user." Int. J. Man-Machine Studies 9,
     pp. 669-686 (1977).

19. Friedman, R.B. and Gustafson, D.H. "Computers in clinical medicine:
     a critical review." Comp. Biomed. Res. 8, pp. 199-204 (1977).

20. Fries, J.F. "Time-oriented patient records and a computer databank." J.
     Amer. Med. Assoc. 222, pp. 1536-1542 (1972).

21. Fries, J.F. "A data bank for the clinician?" (editorial). N. Eng. J. Med.
     294, pp. 1400-1402 (1976).

22. Garland, L.H. "Studies on the accuracy of diagnostic procedures." Amer. J.
     Roentgen. 82, pp. 25-38 (1959).

23. Gill, P.W., Leaper, D.J., Guillou, P.J., et al. "Observer variation
     in clinical diagnosis - a computer-aided assessment of its magnitude
     and importance." Meth. Inform. Med. 12, pp. 108-113 (1973).

24. Ginsberg, A.S. Decision Analysis in Clinical Patient Management With an
     Application to the Pleural Effusion Syndrome. The Rand Corporation, R-751,
     July 1971.

25. Ginsberg, A.S. "The diagnostic process viewed as a decision problem." In
     Computer Diagnosis and Diagnostic Methods (J.A. Jacquez, ed.),
     Springfield, Ill.: Charles C. Thomas, 1972.

26. Gleser, M.A. and Collen, M.F. "Towards automated medical decisions." Comp.
     Biomed. Res. 5, pp. 180-189 (1972).

27. Goldwyn, R.M., Friedman, H.P., Siegel, J.H. "Iteration and interaction in
     computer data bank analysis: a case study in the physiologic
     classification and assessment of the critically ill." Comp. Biomed.
     Res. 6 (1973).

28. Gorry, G.A. and Barnett, G.O. "Experience with a model of
     sequential diagnosis." Comp. Biomed. Res. 1, pp. 490-507 (1968).

29. Gorry, G.A., Kassirer, J.P., Essig, A., and Schwartz, W.B.
     "Decision analysis as the basis for computer-aided management of acute
     renal failure." Amer. J. Med. 55, pp. 473-484 (1973).

30. Gorry, G.A. "Computer-assisted clinical decision making." Meth. Inform.
     Med. 12, pp. 45-51 (1973).

31. Gorry, G.A., Silverman, H., and Pauker, S.G. "Capturing
     clinical expertise: a computer program that considers clinical responses
     to digitalis." Amer. J. Med. 64, pp. 452-460 (1978).

32. Greenes, R.A., Barnett, G.O., Klein, S.W., et al. "Recording, retrieval,
     and review of medical data by physician-computer interaction." N. Eng. J.
     Med. 282, pp. 307-315 (1970).

33. Greenfield, S., Komaroff, A.L., and Anderson, B. "A headache protocol for
     nurses: effectiveness and efficiency." Arch. Intern. Med. 136,
     pp. 1111-1116 (1976).

34. Grimm, R.H., Shimoni, K., Harlan, W.R., and Estes, E.H. "Evaluation
     of patient-care protocol use by various providers." N. Eng. J. Med. 292,
     pp. 507-511 (1975).

35. Groner, G.F., Clark, R.L., Berman, R.A., and De Land, E.C. "BIOMOD -
     an interactive computer graphics system for modeling." Proc. Fall
     Joint Computer Conference, pp. 369-378, 1971.

36. Hess, E.V. "A uniform database for rheumatic diseases." Arthritis
     and Rheumatism 19, pp. 645-648 (1976).

37. Hewitt, C. Description and Theoretical Analysis (Using Schemata)
     of PLANNER: A Language for Proving Theorems and Manipulating Models in a
     Robot. Ph.D. Dissertation, Department of Mathematics, Massachusetts
     Institute of Technology, Cambridge, Mass., 1972.

38. Horrocks, J.C., McCann, A.P., Staniland, J.R., et al. "Computer-aided
     diagnosis: description of an adaptable system, and
     operational experience with 2,034 cases." Brit. Med. J. 2, pp. 5-9 (1972).

39. Horrocks, J.C. and deDombal, F.T. "Computer-aided diagnosis
     of dyspepsia." Amer. J. Diges. Dis. 20, pp. 397-406 (1975).

40. Howard, R.A. (ed.). "Special Issue on Decision Analysis."
     IEEE Transactions on Systems Science and Cybernetics, vol. SSC-4(3),
     Sept. 1968.

41. Ingelfinger, F.J. "Decision in medicine" (editorial). N. Eng. J.
     Med. 293, pp. 254-255 (1975).

42. Jacquez, J.A. Computer Diagnosis and Diagnostic Methods, Springfield,
     Ill.: Charles C. Thomas, 1972.

43. Jelliffe, R.W., Buell, J., Kalaba, R., et al. "A computer program
     for digitalis dosage regimens." Math. Biosci. 9, pp. 179-193 (1970).

44. Jelliffe, R.W., Buell, J., and Kalaba, R. "Reduction of digitalis toxicity
     by computer-assisted glycoside dosage regimens." Anns. Int. Med. 77,
     pp. 891-906 (1972).

45. Johnson, D.C. and Barnett, G.O. "MEDINFO - a medical information system."
     Comp. Prog. in Biomed. 7, pp. 191-201 (1977).

46. Kanal, L.N. "Patterns in Pattern Recognition: 1968-1974." IEEE Trans. on
     Information Theory, vol. IT-20, no. 6 (1974).

47. Karpinski, R.H.S. and Bleich, H.L. "MISAR: a miniature information storage
     and retrieval system." Comp. Biomed. Res. 4, pp. 655-660 (1971).

48. Kassirer, J.P. and Gorry, G.A. "Clinical problem solving: a
     behavioral analysis." Anns. Int. Med. 89, pp. 245-255 (1978).

49. Kleinmuntz, B. and McLean, R.S. "Diagnostic interviewing by
     digital computer." Behav. Sci. 13, pp. 75-80 (1968).

50. Knapp, R.G., Levi, S., Lurie, D., and Westphal, M. "A computer-generated
     diagnostic decision guide: a comparison of statistical diagnosis and
     clinical diagnosis." Comput. Biol. Med. 7, pp. 223-230 (1977).

51. Komaroff, A.L., Black, W.L., Flatley, M., et al. "Protocols for physician
     assistants: management of diabetes and hypertension." N. Eng. J. Med.
     290, pp. 307-312 (1974).

52. Korein, J., Lyman, M., and Tick, J.L. "The computerized medical
     record." Bulletin New York Academy of Medicine, Vol. 47, pp. 824-826
     (1971).

53. Koss, N. and Feinstein, A.R. "Computer-aided prognosis: II. development of
     a prognostic algorithm." Arch. Intern. Med. 127, pp. 448-459 (1971).

54. Leaper, D.J., Horrocks, J.C., Staniland, J.R., and deDombal,
     F.T. "Computer-assisted diagnosis of abdominal pain using
     'estimates' provided by clinicians." Brit. Med. J. 4, pp. 350-354 (1972).

55. Ledley, R.S. and Lusted, L.B. "Reasoning foundations of
     medical diagnosis." Science 130, pp. 9-21 (1959).

56. Levi, S., Frant, J.R., Westphal, M.C., and Lurie, D. "Development of a
     decision guide - optimal discriminations for meningitis determined by
     statistical analysis." Meth. Inform. Med. 15 (2), pp. 87-90 (1976).

57. Lipkin, M. and Hardy, J.D. "Mechanical correlation of data in differential
     diagnosis of hematologic diseases." J. Amer. Med. Assoc. 166,
     pp. 113-125 (1958).

58. Lusted, L.B. Introduction to Medical Decision Making. Springfield, Ill.:
     Charles C. Thomas, 1968.

59. Mabry, J.C., Thompson, F.K., Hopwood, M.D., and Baker, W.R. "A prototype
     data management and analysis system (CLINFO): system description and user
     experience." In MEDINFO 77, Amsterdam: North-Holland Publishing Co., 1977,
     pp. -75.

60. McDonald, C., Bhargava, B., and Jeris, D. "A clinical information system
     (CIS) for ambulatory care." Proc. of the 1975 NCC, AFIPS Press, vol. 44
     (1975), pp. 749-756.

61. McNeil, B.J., Keeler, E., and Adelstein, S.J. "Primer on certain elements
     of medical decision making." N. Eng. J. Med. 293, pp. 211-215 (1975).

62. McNeil, B.J. and Adelstein, S.J. "Determining the value of diagnostic and
     screening tests." J. Nucl. Med. 17, pp. 439-448 (1977).

63. Menn, S.J., Barnett, G.O., Schmechel, D., et al. "A computer program to
     assist in the care of acute respiratory failure." J. Amer. Med. Assoc. 223,
     pp. 308-312 (1973).

64. Mesel, E., Wirtschafter, D.D., Carpenter, J.T., et al. "Clinical algorithms
     for cancer chemotherapy - systems for community-based consultant-extenders
     and oncology centers." Meth. Inform. Med. 15, pp. 168-173 (1976).


65. Nordyke, R.A., Kulikowski, C.A., and Kulikowski, C.W. "A comparison
     of methods for the automated diagnosis of thyroid dysfunction." Comp.
     Biomed. Res. 4, pp. 374-389 (1971).

66. Norusis, M.J. and Jacquez, J.A. "Diagnosis. I. Symptom nonindependence in
     mathematical models for diagnosis." Comp. Biomed. Res. 8,
     pp. 156-172 (1975).

67. Patrick, E.A. "Pattern Recognition in Medicine." Systems, Man
     and Cybernetics Review 6, p. 4 (1977).

68. Pauker, S.G. and Kassirer, J.P. "Therapeutic decision making: a cost-
     benefit analysis." N. Eng. J. Med. 293, pp. 229-234 (1975).

69. Pauker, S.G., Gorry, G.A., Kassirer, J.P., and Schwartz, W.B. "Towards the
     simulation of clinical cognition: taking a present illness by computer."
     Amer. J. Med. 60, pp. 981-996 (1976).

70. Pauker, S.G. "Coronary artery surgery: the use of decision
     analysis." Anns. Int. Med. 85, pp. 8-18 (1976).

71. Pauker, S.P. and Pauker, S.G. "Prenatal diagnosis: a directive approach to
     genetic counseling using decision analysis." Yale J. Biol. Med. 50,
     pp. 275-289 (1977).

72. Peck, C.C., Sheiner, L.B., Martin, C.M., et al. "Computer-assisted digoxin
     therapy." N. Eng. J. Med. 289, pp. 441-446 (1973).

73. Pipberger, H.V. "Clinical application of a second generation
     electrocardiography computer program." Amer. J. Electrocardiology 35,
     pp. 597-608 (1975).

74. Pliskin, J.S. and Beck, C.H. "Decision analysis in individual
     clinical decision making: a real-world application in treatment of
     renal disease." Meth. Inform. Med. 15, pp. 43-46 (1976).

75. Pople, H.E., Myers, J.D., and Miller, R.A. "DIALOG: A model of diagnostic
     logic for internal medicine." Proc. 4th Int. Joint Conf. on Artif.
     Intell., MIT, Cambridge, Mass., 1975.

76. Pople, H. "The formation of composite hypotheses in diagnostic problem
     solving: an exercise in synthetic reasoning." Proc. of 5th Intl. Joint
     Conf. on Artif. Intelligence, Cambridge, Mass., 1977, pp. 1030-1037.

77. Prutting, J. "Lack of correlation between antemortem and
     postmortem diagnosis." N.Y. J. Med. 67, pp. 2081-2084 (1967).

78. Raiffa, H. Decision Analysis: Introductory Lectures on Choices
     Under Uncertainty. Reading, Mass.: Addison-Wesley, 1968.

79. Richards, B. and Goh, A.E.S. "Computer assistance in the treatment
     of patients with acid-base and electrolyte disturbances." MEDINFO 77,
     Amsterdam: North-Holland Publishing Company, 1977, pp. 407-410.

80. Rodnick, J. and Wiederhold, G. "Review of automated ambulatory medical
     record systems: charting services that are of essential benefit to the
     physician." MEDINFO 77, Amsterdam: North-Holland Publishing Co., 1977,
     pp. 957-961.


81. Rosati, R.A., Wallace, A.G., and Stead, E.A. "The way of the
     future." Arch. Intern. Med. 131, pp. 285-287 (1973).

82. Rosati, R.A., McNeer, J.F., Starmer, C.F., et al. "A new
     information system for medical practice." Arch. Intern. Med. 135,
     pp. 1017-1024 (1975).

83. Rosenblatt, M.B., Teng, P.K., and Kerpe, S. "Diagnostic accuracy in cancer
     as determined by post-mortem examination." Prog. Clin. Cancer 5,
     pp. 71-80 (1973).

84. Rubin, A.D. and Risley, J.F. "The PROPHET system: an experiment
     in providing a computer resource to scientists." MEDINFO 77, Amsterdam:
     North-Holland Publishing Co., 1977, pp. 77-81.

85. Safran, C., Tsichlis, P.N., Bluming, A.Z., and Desforges, J.F. "Diagnostic
     planning using computer-assisted decision making for patients with
     Hodgkin's disease." Cancer 39, pp. 2426-2434 (1977).

86. Schoolman, H. and Bernstein, L. "Computer use in diagnosis, prognosis, and
     therapy." Science 200, pp. 926-931 (1978).

87. Schwartz, W.B., Gorry, G.A., Kassirer, J.P., and Essig, A.
     "Decision analysis and clinical judgment." Amer. J. Med. 55,
     pp. 459-472 (1973).

88. Scott, A.C., Clancey, W., Davis, R., and Shortliffe, E.H.
     "Explanation capabilities of knowledge-based production systems." Amer.
     J. Computational Linguistics, Microfiche 62, 1977.

89. Sheiner, L.B., Halkin, H., Peck, C., et al. "Improved computer-assisted
     digoxin therapy." Anns. Int. Med. 82, pp. 619-627 (1975).

90. Sherman, H., Reiffen, B., and Komaroff, A.L. "Ambulatory care systems." In
     Problem-Directed and Medical Information Systems (M.F. Driggs, ed.), New
     York: Intercontinental Medical Book Corporation, 1973, pp. 143-171.

91. Shimura, M. "Learning procedures in pattern classifiers - introduction
     and survey." Proc. Intl. Joint Conf. on Pattern Recognition, Kyoto, 1978,
     pp. 125-138.

92. Shortliffe, E.H., Axline, S.G., Buchanan, B.G., and Cohen, S.N. "Design
     considerations for a program to provide consultations in clinical
     therapeutics." Proc. 13th San Diego Biomedical Symposium, pp. 311-319, San
     Diego, Calif., February 1974.

93. Shortliffe, E.H. and Davis, R. "Some considerations for the implementation
     of knowledge-based expert systems." SIGART Newsletter, No. 55, pp. 9-12,
     December 1975.

94. Shortliffe, E.H. and Buchanan, B.G. "A model of inexact reasoning in
     medicine." Math. Biosci. 23, pp. 351-379 (1975).

95. Shortliffe, E.H. Computer-Based Medical Consultations: MYCIN, New York:
     Elsevier/North Holland, 1976.

96. Slamecka, V., Camp, H.N., Badre, A.N., and Hall, W.D. "MARIS: a knowledge
     system for internal medicine." Inform. Process. & Man. 13 (1977).


97. Sox, H.C., Sox, C.H., and Tompkins, R.K. "The training of
     physician's assistants: the use of a clinical algorithm system." N. Eng. J.
     Med. 288, pp. 818-824 (1973).

98. Sridharan, N.S. Guest editorial. Artificial Intelligence 11, pp. 1-4
     (1978).

99. Startsman, T.S. and Robinson, R.E. "The attitudes of medical
     and paramedical personnel towards computers." Comp. Biomed. Res. 5,
     pp. 218-227 (1972).

100. Stead, W.W., Brame, R.G., Hammond, W.E., et al. "A computerized obstetric
     medical record." Obstet. & Gyn. 49, pp. 502-509 (1977).

101. Szolovits, P. and Pauker, S.G. "Categorical and probabilistic reasoning in
     medical diagnosis." Artificial Intelligence 11, pp. 115-144 (1978).

102. "Clinical decision analysis." Meth. Inform. Med. 15, pp.

103. Vickery, D.M. "Computer support of paramedical personnel: the question of
     quality control." MEDINFO 74, Amsterdam: North-Holland Publishing Company,
     1974, pp. 281-287.

104. Wagner, G., Tautu, P., and Wolber, U. "Problems of medical diagnosis:
     a bibliography." Meth. Inform. Med. 17, pp. 55-74 (1978).

105. Walsh, B.T., Bookheim, W.W., Johnson, R.C., et al. "Recognition
     of streptococcal pharyngitis in adults." Arch. Int. Med. 135,
     pp. 1493-1497 (1975).

106. Wardle, A. and Wardle, L. "Computer-aided diagnosis: a review
     of research." Meth. Inform. Med. 17, pp. 15-22 (1978).

107. Warner, H.R., Toronto, A.F., and Veasy, L.G. "Experience with
     Bayes’ Theorem for computer diagnosis of congenital heart disease."
     Anns. N.Y. Acad. Sci. 115, pp. 558-567 (1964).

108. Warner, H.R. "Experiences with computer-based patient monitoring." Anes.
     & Analgesia Current Researches 47, pp. 453-461 (1968).

109. Warner, H.R., Olmsted, C.M., and Rutherford, B.D. "HELP - a program for
     medical decision-making." Comp. Biomed. Res. 5, pp. 65-74 (1972).

110. Warner, H.R., Rutherford, B.D., and Houtchens, B. "A sequential approach
     to history taking and diagnosis." Comp. Biomed. Res. 5, pp. 256-262
     (1972).

111. Warner, H.R., Morgan, J.D., Pryor, T.A., et al. "HELP - a self-improving
     system for medical decision making." MEDINFO 74, Amsterdam: North-Holland
     Publishing Company, 1974.

112. Warner, H.R. "Knowledge sectors for logical processing of patient data in
     the HELP system." Proc. of 2nd Ann. Symp. on Computer Applications in
     Medical Care, IEEE, Wash. D.C. (1978), pp. 401-404.

113. Watson, R.J. "Medical staff response to a medical information system with
     direct physician-computer interface." MEDINFO 74, pp. 299-302, Amsterdam:
     North-Holland Publishing Company, 1974.

114. Wechsler, H. "A fuzzy approach to medical diagnosis." Int. J.
     Biomed. Comp. 7, pp. 191-203 (1976).

115. Weed, L.L. "Medical records that guide and teach." N. Eng. J. Med. 278,
     pp. 593-599, 652-657 (1968).

116. Weed, L.L. "Problem-oriented medical records." In Problem-Directed and
     Medical Information Systems (M.F. Driggs, ed.), New York: Intercontinental
     Medical Book Corporation, 1973.

117. Weiss, S.M., Kulikowski, C.A., Amarel, S., and Safir, A. "A model-
     based method for computer-aided medical decision-making."
     Artificial Intelligence 11, pp. 145-172 (1978).

118. Weyl, S., Fries, J., Wiederhold, G., and Germano, F. "A modular
     self-describing clinical databank system." Comp. Biomed. Res. 8,
     pp. 279-293 (1975).

119. Wiederhold, G., Fries, J.F., and Weyl, S. "Structured organization of
     clinical databases." Proc. of the 1975 NCC, AFIPS Press, vol. 44 (1975).

120. Winston, P.H. Artificial Intelligence, Reading, Mass.: Addison-Wesley,
     1977.

121. Wortman, P.M. "Medical diagnosis: an information processing
     approach." Comput. Biomed. Res. 5, pp. 315-328 (1972).

122. Yu, V.L., Fagan, L.M., Wraith, S.M., et al. "Computer-based consultation in
     antimicrobial selection - a comparative evaluation by experts." Stanford
     University School of Medicine. Submitted for publication, November 1978.

123. Yu, V.L., Buchanan, B.G., Shortliffe, E.H., et al. "An evaluation of the
     performance of a computer-based consultant." To appear in Comput. Prog.
     Biomed.

124. Zadeh, L.A. "Fuzzy sets." Information and Control 8, pp. 338-353 (1965).

125. Zoltie, N., Horrocks, J.C., and deDombal, F.T. "Computer-
     assisted diagnosis of dyspepsia - report on transferability of a system,
     with emphasis on early diagnosis of gastric cancer." Meth. Inform. Med. 16,
     pp. 89-92 (1977).

Appendix C

THE ART OF ARTIFICIAL INTELLIGENCE:

I. Themes and case studies of knowledge engineering

Edward A. Feigenbaum

Department of Computer Science,
Stanford University,
Stanford, California, 94305.

Abstract

The knowledge engineer practices the art of
bringing the principles and tools of AI research
to bear on difficult applications problems
requiring experts' knowledge for their solution.
The technical issues of acquiring this knowledge,
representing it, and using it appropriately to
construct and explain lines-of-reasoning, are
important problems in the design of
knowledge-based systems. Various systems that have
achieved expert level performance in scientific and
medical inference illuminate the art of knowledge
engineering and its parent science, Artificial
Intelligence.

INTRODUCTION: AN EXAMPLE
This is the first of a pair of papers that
will examine emerging themes of knowledge
engineering, illustrate them with case studies
drawn from the work of the Stanford Heuristic
Programming Project, and discuss general issues of
knowledge engineering art and practice.

Let me begin with an example new to our
workbench: a system called PUFF, the early fruit
of a collaboration between our project and a group
at the Pacific Medical Center (PMC) in San
Francisco.

A physician refers a patient to PMC's
pulmonary function testing lab for diagnosis of
possible pulmonary function disorder. For one of
the tests, the patient inhales and exhales a few
times in a tube connected to an
instrument/computer combination. The instrument
acquires data on flow rates and volumes, the
so-called flow-volume loop of the patient’s lungs and
airways. The computer measures certain parameters
of the curve and presents them to the
diagnostician (physician or PUFF) for
interpretation. The diagnosis is made along these
lines: normal or diseased; restricted lung disease
or obstructive airways disease or a combination of
both; the severity; the likely disease type(s)
(e.g. emphysema, bronchitis, etc.); and other
factors important for diagnosis.


PUFF is given not only the measured data but
also certain items of information from the patient
record, e.g. sex, age, number of pack-years of
cigarette smoking. The task of the PUFF system is
to infer a diagnosis and print it out in English
in the normal medical summary form of the
interpretation expected by the referring
physician.

Everything PUFF knows about pulmonary
function diagnosis is contained in (currently) 55
rules of the IF...THEN... form. No textbook of
medicine currently records these rules. They
constitute the partly-public, partly-private
knowledge of an expert pulmonary physiologist at
PMC, and were extracted and polished by project
engineers working intensively with the expert over
a period of time. Here is an example of a PUFF
rule (the unexplained acronyms refer to various
data measurements):

 

RULE 31

IF:

1) The severity of obstructive airways
disease of the patient is greater than or
equal to mild, and

2) The degree of diffusion defect of the
patient is greater than or equal to mild,
and

3) The tlc(body box)observed/predicted of
the patient is greater than or equal to 110,
and

4) The observed-predicted difference in
rv/tlc of the patient is greater than or
equal to 10

THEN:

1) There is strongly suggestive evidence
(.9) that the subtype of obstructive airways
disease is emphysema, and

2) It is definite (1.0) that "OAD,
Diffusion Defect, elevated TLC, and elevated
RV together indicate emphysema." is one of
the findings.
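
The IF...THEN... form of Rule 31 can be sketched as a small data structure. The Python below is an illustrative assumption, not PUFF's actual representation: the field names, the numeric coding of severity levels, and the `Rule` class are hypothetical. The two numeric patient values are taken from the sample PATIENT DATA listing; the two severity values are invented for the example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# Hypothetical ordinal coding of severity levels (not given in the source).
SEVERITY = {"none": 0, "mild": 1, "moderate": 2, "moderately-severe": 3, "severe": 4}

Conclusion = Tuple[str, str, float]  # (parameter, value, certainty)


@dataclass
class Rule:
    name: str
    premises: List[Callable[[Dict], bool]]  # conjunction: all must hold
    conclusions: List[Conclusion]

    def fire(self, patient: Dict) -> List[Conclusion]:
        """Return the conclusions if every premise is satisfied, else nothing."""
        if all(premise(patient) for premise in self.premises):
            return self.conclusions
        return []


rule_31 = Rule(
    name="RULE 31",
    premises=[
        lambda p: SEVERITY[p["oad_severity"]] >= SEVERITY["mild"],
        lambda p: SEVERITY[p["diffusion_defect"]] >= SEVERITY["mild"],
        lambda p: p["tlc_body_box_obs_pred"] >= 110,
        lambda p: p["rv_tlc_obs_minus_pred"] >= 10,
    ],
    conclusions=[
        ("oad_subtype", "emphysema", 0.9),
        ("findings",
         "OAD, Diffusion Defect, elevated TLC, and elevated RV "
         "together indicate emphysema.", 1.0),
    ],
)

patient = {
    "oad_severity": "moderately-severe",  # hypothetical
    "diffusion_defect": "moderate",       # hypothetical
    "tlc_body_box_obs_pred": 127,         # from the sample listing
    "rv_tlc_obs_minus_pred": 21,          # from the sample listing
}
fired = rule_31.fire(patient)
```

Keeping each premise as a separate predicate mirrors the numbered clauses of the rule, so a failing clause can be identified individually.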

 
One hundred cases, carefully chosen to span
the variety of disease states with sufficient
exemplary information for each, were used to
extract the 55 rules. As the knowledge emerged, it
was represented in rule form, added to the system

and tested by running additional cases. The
expert was sometimes surprised, sometimes
frustrated, by the occasional gaps and

inconsistencies in the knowledge, and the
incorrect diagnoses that were logical consequences
of the existing rule set. The interplay between
knowledge engineer and expert gradually expanded
the set of rules to remove most of these problems.

As cumulation of techniques in the art
demands and allows, a new tool was not invented
when an old one would do. The knowledge engineers
pulled out of their toolkit a version of the MYCIN
system (to be discussed later), with the rules
about infectious diseases removed, and used it as
the inference engine for the PUFF diagnoses. Thus
PUFF, like MYCIN, is a relatively simple
backward-chaining inference system. It seeks a valid
line-of-reasoning based on its rules and rooted in the
instrument and patient data. With a little more
work at fitting some existing tools together, PUFF
will be able to explain this line-of-reasoning,
just as MYCIN does.
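
A backward-chaining system of this kind starts from a goal and works back through the rules that conclude it, bottoming out in instrument and patient data. The Python sketch below illustrates only that control strategy. It is not MYCIN's or PUFF's code, it ignores certainty factors, and the rule contents and parameter names are hypothetical simplifications.

```python
# Each rule: (premise parameters that must all be established, concluded parameter).
# Contents are hypothetical simplifications for illustration.
RULES = [
    (["obstruction", "diffusion_defect", "elevated_tlc"], "emphysema"),
    (["reduced_flow_rates"], "obstruction"),
    (["low_dlco"], "diffusion_defect"),
]


def prove(goal, data, trace):
    """Return True if `goal` can be established from `data` via RULES.

    `data` maps primitive observations to booleans. `trace` accumulates the
    line of reasoning as (conclusion, premises) pairs, deepest steps first.
    """
    if goal in data:  # primitive datum: look it up directly
        return data[goal]
    for premises, conclusion in RULES:  # otherwise try rules concluding the goal
        if conclusion == goal and all(prove(p, data, trace) for p in premises):
            trace.append((goal, premises))
            return True
    return False


data = {"reduced_flow_rates": True, "low_dlco": True, "elevated_tlc": True}
trace = []
established = prove("emphysema", data, trace)
```

The recorded `trace` is the kind of information an explanation facility could replay to justify a conclusion.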

As it is, PUFF only prints out the final
interpretation, of which the following is an
example:

 

PATIENT DATA:

The degree of dyspnea: MODERATELY-SEVERE
The severity of coughing: MILD
Sputum production: MODERATELY-SEVERE
The number of pack-years of smoking: 48
Referral diagnosis: BRONCHITIS
IVC/IVC-predicted: 80
RV/RV-predicted: 191
FVC/FVC-predicted: 87
TLC (body box)observed/predicted: 127
Predicted FEV1/FVC: 983
TLC (DLCO)observed/predicted: 83
FEV1/FVC ratio: 50
RV/TLC Observed-Predicted: 21
MMF/MMF-predicted: 19
The DLCO/DLCO-predicted: 48
The slope (F50_obs-F25_obs)/FVC_obs: 19

DEGREE OF OBSTRUCTIVE AIRWAYS DISEASE:

OAD degree by SLOPE: (MODERATELY-SEVERE 790)
OAD degree by MMF: (SEVERE 900)
OAD degree by FEV1: (MODERATELY-SEVERE 700)
FINAL OAD DEGREE: (MODERATELY-SEVERE 910) (SEVERE 900)
No conflict. Final degree: (MODERATELY-SEVERE 910)


INTERPRETATION:

Obstruction is indicated by curvature of
the flow-volume loop.

Forced Vital Capacity is normal and peak
flow rates are reduced, suggesting
airway obstruction.

Flow rate from 25-75 of expired volume is
reduced, indicating severe airway
obstruction.

OAD, Diffusion Defect, elevated TLC, and
elevated RV together indicate emphysema.
OAD, Diffusion Defect, and elevated RV
indicate emphysema.

Change in expired flow rates following
bronchodilation shows that there is
reversibility of airway obstruction.

The presence of a productive cough is an
indication that the OAD is of the
bronchitic type.

Elevated lung volumes indicate
overinflation.

Air trapping is indicated by the elevated
difference between observed and predicted
RV/TLC ratios.

Improvement in airway resistance indicates
some reversibility of airway obstruction.

Airway obstruction is consistent with the
patient’s smoking history.

The airway obstruction accounts for the
patient’s dyspnea.

Although bronchodilators were not
useful in this one case, prolonged use may
prove to be beneficial to the patient.

The reduced diffusion capacity indicates
airway obstruction of the mixed
bronchitic and emphysematous types.

Low diffusing capacity indicates loss of
alveolar capillary surface.

Obstructive Airways Disease of mixed types

 

150 cases not studied during the knowledge
acquisition process were used for a test and
validation of the rule set. PUFF inferred a
diagnosis for each. PUFF-produced and
expert-produced interpretations were coded for
statistical analysis to discover the degree of
agreement. Over various types of disease states,
and for two conditions of match between human and
computer diagnoses ("same degree of severity" and
"within one degree of severity"), agreement ranged
between approximately 90% and 100%.
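
The two match conditions can be made concrete with a small sketch. The severity scale, its ordering, and the four example cases below are assumptions made up for illustration; they are not PUFF's coding scheme or its validation data.

```python
# Hypothetical ordinal severity scale (labels and order are assumed).
SEVERITY = ["NONE", "MILD", "MODERATE", "MODERATELY-SEVERE", "SEVERE"]
RANK = {label: i for i, label in enumerate(SEVERITY)}


def agreement(pairs, tolerance=0):
    """Fraction of (program, expert) pairs whose ranks differ by at most `tolerance`."""
    hits = sum(1 for program, expert in pairs
               if abs(RANK[program] - RANK[expert]) <= tolerance)
    return hits / len(pairs)


# Invented (program, expert) diagnoses for four cases.
cases = [
    ("MILD", "MILD"),
    ("SEVERE", "MODERATELY-SEVERE"),
    ("MODERATE", "MODERATE"),
    ("NONE", "MODERATE"),
]

same_degree = agreement(cases, tolerance=0)  # "same degree of severity"
within_one = agreement(cases, tolerance=1)   # "within one degree of severity"
```

The looser condition can only raise the agreement figure, which is why the two conditions bracket a range.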

The PUFF story is just beginning and will be
told perhaps at the next IJCAI. The surprising
punchline to my synopsis is that the current state
of the PUFF system as described above was achieved
in less than 50 hours of interaction with the
expert and less than 10 man-weeks of effort by the
knowledge engineers. We have learned much in the
past decade of the art of engineering
knowledge-based intelligent agents!

In the remainder of this essay, I would like
to discuss the route that one research group, the
Stanford Heuristic Programming Project, has taken,
illustrating progress with case studies, and
discussing themes of the work.

2 ARTIFICIAL INTELLIGENCE & KNOWLEDGE ENGINEERING

 

The dichotomy that was used to classify the
collected papers in the volume
Computers and Thought still characterizes well the
motivations and research efforts of the AI
community. First, there are some who work toward
the construction of intelligent artifacts, or seek
to uncover principles, methods, and techniques
useful in such construction. Second, there are
those who view artificial intelligence as (to use
Newell’s phrase) "theoretical psychology," seeking
explicit and valid information processing models
of human thought.

For purposes of this essay, I wish to focus
on the motivations of the first group, these days
by far the larger of the two. I label these
motivations “the intelligent agent viewpoint” and
here is my understanding of that viewpoint:

"The potential uses of computers by
people to accomplish tasks can be
'one-dimensionalized' into a spectrum
representing the nature of instruction
that must be given the computer to do its
job. Call it the WHAT-TO-HOW spectrum.
At one extreme of the spectrum, the user
supplies his intelligence to instruct the
machine with precision exactly HOW to do
his job, step-by-step. Progress in
Computer Science can be seen as steps away
from the extreme 'HOW' point on the
spectrum: the familiar panoply of assembly
languages, subroutine libraries,
compilers, extensible languages, etc. At
the other extreme of the spectrum is the
user with his real problem (WHAT he wishes
the computer, as his instrument, to do for
him). He aspires to communicate WHAT he
wants done in a language that is
comfortable to him (perhaps English); via
communication modes that are convenient
for him (including perhaps, speech or
pictures); with some generality, some
vagueness, imprecision, even error;
without having to lay out in detail all
necessary subgoals for adequate
performance - with reasonable assurance
that he is addressing an intelligent agent
that is using knowledge of his world to
understand his intent, to fill in his
vagueness, to make specific his
abstractions, to correct his errors, to
discover appropriate subgoals, and


ultimately to translate WHAT he really
wants done into processing steps that
define HOW it shall be done by a real
computer. The research activity aimed at
creating computer programs that act as
"intelligent agents" near the WHAT end of
the WHAT-TO-HOW spectrum can be viewed as
the long-range goal of AI research."
(Feigenbaum, 1974)

Our young science is still more art than
science. Art: "the principles or methods governing
any craft or branch of learning." Art: "skilled
workmanship, execution, or agency." These the
dictionary teaches us. Knuth tells us that the
endeavor of computer programming is an art, in
just these ways. The art of constructing
intelligent agents is both part of and an
extension of the programming art. It is the art of
building complex computer programs that represent
and reason with knowledge of the world. Our art
therefore lives in symbiosis with the other
worldly arts, whose practitioners -- experts of
their art -- hold the knowledge we need to
construct intelligent agents. In most "crafts or
branches of learning" what we call "expertise" is
the essence of the art. And for the domains of
knowledge that we touch with our art, it is the
"rules of expertise" or the rules of "good
judgment" of the expert practitioners of that
domain that we seek to transfer to our programs.

2.1 Lessons of the Past

Two insights from previous work are
pertinent to this essay.

The first concerns the quest for generality
and power of the inference engine used in the
performance of intelligent acts (what Minsky and
Papert [see Goldstein and Papert, 1977] have
labeled "the power strategy"). We must hypothesize
from our experience to date that the problem
solving power exhibited in an intelligent agent’s
performance is primarily a consequence of the
specialist’s knowledge employed by the agent, and
only very secondarily related to the generality
and power of the inference method employed. Our
agents must be knowledge-rich, even if they are
methods-poor. In 1970, reporting the first major
summary-of-results of the DENDRAL program (to be
discussed later), we addressed this issue as
follows:

"...general problem-solvers are too
weak to be used as the basis for building
high-performance systems. The behavior of
the best general problem-solvers we know,
human problem-solvers, is observed to be
weak and shallow, except in the areas in
which the human problem-solver is a
specialist. And it is observed that the
transfer of expertise between specialty
areas is slight. A chess master is
unlikely to be an expert algebraist or an
expert mass spectrum analyst, etc. In
this view, the expert is the specialist,
with a specialist’s knowledge of his area
and a specialist’s methods and
heuristics." (Feigenbaum, Buchanan and
Lederberg, 1971, p. 187)

Subsequent evidence from our laboratory and
all others has only confirmed this belief.

AI researchers have dramatically shifted
their view on generality and power in the past
decade. In 1967, the canonical question about the
DENDRAL program was: "It sounds like good
chemistry, but what does it have to do with AI?"
In 1977, Goldstein and Papert write of a paradigm
shift in AI:

"Today there has been a shift in
paradigm. The fundamental problem of
understanding intelligence is not the
identification of a few powerful
techniques, but rather the question of how
to represent large amounts of knowledge in
a fashion that permits their effective use
and interaction." (Goldstein and Papert,
1977)

The second insight from past work concerns
the nature of the knowledge that an expert brings
to the performance of a task. Experience has
shown us that this knowledge is largely heuristic
knowledge, experiential, uncertain -- mostly "good
guesses" and "good practice," in lieu of facts and
rigor. Experience has also taught us that much of
this knowledge is private to the expert, not
because he is unwilling to share publicly how he
performs, but because he is unable. He knows more
than he is aware of knowing. [Why else is the
Ph.D. or the Internship a guild-like
apprenticeship to a presumed "master of the
craft?" What the masters really know is not
written in the textbooks of the masters.] But we
have learned also that this private knowledge can
be uncovered by the careful, painstaking analysis
of a second party, or sometimes by the expert
himself, operating in the context of a large
number of highly specific performance problems.
Finally, we have learned that expertise is
multi-faceted, that the expert brings to bear many
and varied sources of knowledge in performance. The
approach to capturing his expertise must proceed
on many fronts simultaneously.


2.2 The Knowledge Engineer

The knowledge engineer is that second party
just discussed. [An historical note about the
term. In the mid-60s, John McCarthy, for reasons
obvious from his work, had been describing
Artificial Intelligence as "Applied Epistemology."
When I first described the DENDRAL program to
Donald Michie in 1968, he remarked that it was
"epistemological engineering," a clever but
ponderous and unpronounceable turn-of-phrase that
I simplified into "knowledge engineering."] She
(in deference to my favorite knowledge engineer)
works intensively with an expert to acquire
domain-specific knowledge and organize it for use
by a program. Simultaneously she is matching the
tools of the AI workbench to the task at hand --
program organizations, methods of symbolic
inference, techniques for the structuring of
symbolic information, and the like. If the tool
fits, or nearly fits, she uses it. If not,
necessity mothers AI invention, and a new tool
gets created. She builds the early versions of the
intelligent agent, guided always by her intent
that the program eventually achieve expert levels
of performance in the task. She refines or
reconceptualizes the system as the increasing
amount of acquired knowledge causes the AI tool to
"break" or slow down intolerably. She also refines
the human interface to the intelligent agent with
several aims: to make the system appear
"comfortable" to the human user in his linguistic
transactions with it; to make the system’s
inference processes understandable to the user;
and to make the assistance controllable by the
user when, in the context of a real problem, he
has an insight that previously was not elicited
and therefore not incorporated.

In the next section, I wish to explore (in
summary form) some case studies of the knowledge
engineer’s art.

3 CASES FROM THE KNOWLEDGE ENGINEER’S WORKSHOP

I will draw material for this section from
the work of my group at Stanford. Much exciting
work in knowledge engineering is going on
elsewhere. Since my intent is not to survey
literature but to illustrate themes, at the risk
of appearing parochial I have used as case studies
the work I know best.

My collaborators (Professors Lederberg and
Buchanan) and I began a series of projects,
initially the development of the DENDRAL program,
in 1965. We had dual motives: first, to study
scientific problem solving and discovery,
particularly the processes scientists do use or
should use in inferring hypotheses and theories
from empirical evidence; and second, to conduct
this study in such a way that our experimental
programs would one day be of use to working
scientists, providing intelligent assistance on
important and difficult problems. By 1970, we and
our co-workers had gained enough experience that
we felt comfortable in laying out a program of
research encompassing work on theory formation,
knowledge utilization, knowledge acquisition,
explanation, and knowledge engineering techniques.
Although there were some surprises along the way
(like the AM program), the general lines of the
research are proceeding as envisioned.

THEMES

As a road map to these case studies, it is
useful to keep in mind certain major themes:

Generation-and-test: Omnipresent in our
experiments is the "classical"
generation-and-test framework that has been the
hallmark of AI programs for two decades. This is
not a consequence of a doctrinaire attitude on our
part about heuristic search, but rather of the
usefulness and sufficiency of the concept.

Situation => Action Rules: We have chosen to
represent the knowledge of experts in this form.
Making no doctrinaire claims for the universal
applicability of this representation, we
nonetheless point to the demonstrated utility of
the rule-based representation. From this
representation flow rather directly many of the
characteristics of our programs: for example,
ease of modification of the knowledge, ease of
explanation. The essence of our approach is that
a rule must capture a "chunk" of domain knowledge
that is meaningful, in and of itself, to the
domain specialist. Thus our rules bear only a
historical relationship to the production rules
used by Newell and Simon (1972) which we view as
"machine-language programming" of a
recognize => act machine.

The Domain-Specific Knowledge: It plays a critical
role in organizing and constraining search. The
theme is that in the knowledge is the power. The
interesting action arises from the knowledge
base, not the inference engine. We use knowledge
in rule form (discussed above), in the form of
inferentially-rich models based on theory, and in
the form of tableaus of symbolic data and
relationships (i.e. frame-like structures).
System processes are made to conform to natural
and convenient representations of the domain-
specific knowledge.

Flexibility to modify the knowledge base: If the
so-called "grain size" of the knowledge
representation is chosen properly (i.e. small
enough to be comprehensible but large enough to
be meaningful to the domain specialist), then the
rule-based approach allows great flexibility for
adding, removing, or changing knowledge in the
system.

Line-of-reasoning: A central organizing principle
in the design of knowledge-based intelligent
agents is the maintenance of a line-of-reasoning
that is comprehensible to the domain specialist.
This principle is, of course, not a logical
necessity, but seems to us to be an engineering
principle of major importance.

Multiple Sources of Knowledge: The formation and
maintenance (support) of the line-of-reasoning
usually require the integration of many disparate
sources of knowledge. The representational and
inferential problems in achieving a smooth and
effective integration are formidable engineering
problems.

Explanation: The ability to explain the
line-of-reasoning in a language convenient to the
user is necessary for application and for system
development (e.g. for debugging and for extending
the knowledge base). Once again, this is an
engineering principle, but very important. What
constitutes "an explanation" is not a simple
concept, and considerable thought needs to be
given, in each case, to the structuring of
explanations.

CASE STUDIES

In this section I will try to illustrate
these themes with various case studies.

3.1 DENDRAL: Inferring Chemical Structures

3.1.1 Historical Note

Begun in 1965, this collaborative project
with the Stanford Mass Spectrometry Laboratory has
become one of the longest-lived continuous efforts
in the history of AI (a fact that in no small way
has contributed to its success). The basic
framework of generation-and-test and rule-based
representation has proved rugged and extendable.
For us the DENDRAL system has been a fountain of
ideas, many of which have found their way, highly
metamorphosed, into our other projects. For
example, our long-standing commitment to
rule-based representations arose out of our
(successful) attempt to head off the imminent
ossification of DENDRAL caused by the rapid
accumulation of new knowledge in the system around
1967.

3.1.2 Task

To enumerate plausible structures (atom-bond
graphs) for organic molecules, given two kinds of
information: analytic instrument data from a mass
spectrometer and a nuclear magnetic resonance
spectrometer; and user-supplied constraints on the
answers, derived from any other source of
knowledge (instrumental or contextual) available
to the user.

3.1.3 Representations

Chemical structures are represented as
node-link graphs of atoms (nodes) and bonds (links).
Constraints on search are represented as subgraphs
(atomic configurations) to be denied or preferred.
The empirical theory of mass spectrometry is
represented by a set of rules of the general form:

    Situation: Particular atomic configuration
               (subgraph)
        |
        |   Probability, P, of occurring
        v
    Action:    Fragmentation of the particular
               configuration (breaking links)

Rules of this form are natural and expressive to
mass spectrometrists.

3.1.4 Sketch of Method

DENDRAL’s inference procedure is a heuristic
search that takes place in three stages, without
feedback: plan-generate-test.

“Generate” (a program called CONGEN) is a
generation process for plausible structures. Its
foundation is a combinatorial algorithm (with
mathematically proven properties of completeness
and non-redundant generation) that can produce all
the topologically legal candidate structures.
Constraints supplied by the user or by the "Plan"
process prune and steer the generation to produce
the plausible set (i.e. those satisfying the
constraints) and not the enormous legal set.

"Test" refines the evaluation of
plausibility, discarding less worthy candidates
and rank-ordering the remainder for examination by
the user. "Test" first produces a "predicted" set
of instrument data for each plausible candidate,
using the rules described. It then evaluates the
worth of each candidate by comparing its predicted
data with the actual input data. The evaluation
is based on heuristic criteria of goodness-of-fit.
Thus, "Test" selects the "best" explanations of
the data.

"Plan" produces direct (i.e. not chained)
inference about likely substructure in the
molecule from patterns in the data that are
indicative of the presence of the substructure.
(Patterns in the data trigger the left-hand-sides
of substructure rules). Though composed of many
atoms whose interconnections are given, the
substructure can be manipulated as atom-like by
"Generate." Aggregating many units entering into a
combinatorial process into fewer higher-level
units reduces the size of the combinatorial search
space. "Plan" sets up the search space so as to be
relevant to the input data. "Generate" is the
inference tactician; "Plan" is the inference
strategist. There is a separate "Plan" package
for each type of instrument data, but each package
passes substructures (subgraphs) to "Generate."
Thus, there is a uniform interface between "Plan"
and "Generate." User-supplied constraints enter
this interface, directly or from user-assist
packages, in the form of substructures.
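
The division of labor among the three stages can be illustrated with a deliberately tiny sketch. Everything in it is an assumption made for illustration: "molecules" are linear chains of atoms written as strings, the single fragmentation rule breaks one link and records the mass of the left-hand piece, and the generator is brute-force enumeration rather than CONGEN's non-redundant constructive algorithm. The function names are hypothetical.

```python
from itertools import permutations

MASS = {"C": 12, "N": 14, "O": 16}  # nominal atomic masses


def predict(chain):
    """Toy fragmentation rule: breaking any one link yields the left-hand fragment."""
    peaks, total = set(), 0
    for atom in chain[:-1]:
        total += MASS[atom]
        peaks.add(total)
    return peaks


def plan(observed):
    """Direct inference from data: a peak equal to one atom's mass names a possible first atom."""
    return [atom for atom, mass in MASS.items() if mass in observed]


def generate(atoms, first_atoms):
    """Enumerate each distinct chain once, keeping only those that satisfy the plan's constraint."""
    for chain in sorted({"".join(p) for p in permutations(atoms)}):
        if not first_atoms or chain[0] in first_atoms:
            yield chain


def evaluate(candidates, observed):
    """Rank candidates by fit: matched peaks minus predicted peaks that were not observed."""
    def score(chain):
        predicted = predict(chain)
        return len(predicted & observed) - len(predicted - observed)
    return sorted(candidates, key=lambda chain: (-score(chain), chain))


observed = {14, 26, 38}                              # made-up spectrum of the chain N-C-C-O
legal = list(generate("CCNO", []))                   # every legal chain, no planning
plausible = list(generate("CCNO", plan(observed)))   # plan constrains generation
ranked = evaluate(plausible, observed)               # test rank-orders the survivors
```

Even in this toy the constraint from the planning step cuts the candidates handed to the test from twelve to three before any predicted spectrum is computed.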

3.1.5 Sources of Knowledge

The various sources of knowledge used by the
DENDRAL system are:

Valences (legal connections of atoms);
stable and unstable configurations of atoms; rules
for mass spectrometry fragmentations; rules for
NMR shifts; expert’s rules for planning and
evaluation; user-supplied constraints
(contextual).

3.1.6 Results

DENDRAL’s structure elucidation abilities
are, paradoxically, both very general and very
narrow. In general, DENDRAL handles all molecules,
cyclic and tree-like. In pure structure
elucidation under constraints (without instrument
data), CONGEN is unrivaled by human performance. In
structure elucidation with instrument data,
DENDRAL’s performance rivals expert human
performance only for a small number of molecular
families for which the program has been given
specialist’s knowledge, namely the families of
interest to our chemist collaborators. I will
spare this computer science audience the list of
names of these families. Within these areas of
knowledge-intensive specialization, DENDRAL’s
performance is usually not only much faster but
also more accurate than expert human performance.

The statement just made summarizes thousands
of runs of DENDRAL on problems of interest to our
experts, their colleagues, and their students. The
results obtained, along with the knowledge that
had to be given to DENDRAL to obtain them, are
published in major journals of chemistry. To date,
25 papers have been published there, under a
series title "Applications of Artificial
Intelligence for Chemical Inference: <specific
subject>" (see references).

The DENDRAL system is in everyday use by
Stanford chemists, their collaborators at other
universities and collaborating or otherwise
interested chemists in industry. Users outside
Stanford access the system over a commercial
computer/communications network. The problems
they are solving are often difficult and novel.
The British government is currently supporting
work at Edinburgh aimed at transferring DENDRAL to
industrial user communities in the UK.

3.1.7 Discussion

Representation and extensibility. The
representation chosen for the molecules,
constraints, and rules of instrument data
interpretation is sufficiently close to that used
by chemists in thinking about structure
elucidation that the knowledge base has been
extended smoothly and easily, mostly by chemists
themselves in recent years. Only one major
reprogramming effort took place in the last 9
years -- when a new generator was created to deal
with cyclic structures.

Representation and the integration of
multiple sources of knowledge. The generally
difficult problem of integrating various sources
of knowledge has been made easy in DENDRAL by
careful engineering of the representations of
objects, constraints, and rules. We insisted on a
common language of compatibility of the
representations with each other and with the
inference processes: the language of molecular
structure expressed as graphs. This leads to a
straightforward procedure for adding a new source
of knowledge, say, for example, the knowledge
associated with a new type of instrument data. The
procedure is this: write rules that describe the
effect of the physical processes of the instrument
on molecules using the situation => action form
with molecular graphs on both sides; any special
inference process using these rules must pass its
results to the generator only(!) in the common
graph language.

It is today widely believed in AI that the
use of many diverse sources of knowledge in
problem solving and data interpretation has a
strong effect on quality of performance. How
strong is, of course, domain-dependent, but the
impact of bringing just one additional source of
knowledge to bear on a problem can be startling.
In one difficult (but not unusually difficult)
mass spectrum analysis problem*, the program using
its mass spectrometry knowledge alone would have
generated an impossibly large set of plausible
candidates (over 1.25 million!). Our engineering
response to this was to add another source of data
and knowledge, proton NMR. The addition of a
simple interpretive theory of this NMR data, from
which the program could infer a few additional
constraints, reduced the set of plausible
candidates to one, the right structure! This was
not an isolated result but showed up dozens of
times in subsequent analyses.

* the analysis of an acyclic amine with formula
C20H45N.


DENDRAL and data. DENDRAL’s robust models
(topological, chemical, instrumental) permit a
strategy of finding solutions by generating
hypothetical "correct answers" and choosing among
these with critical tests. This strategy is
opposite to that of piecing together the
implications of each data point to form a
hypothesis. We call DENDRAL’s strategy largely
model-driven, and the other data-driven. The
consequence of having enough knowledge to do
model-driven analysis is a large reduction in the
amount of data that must be examined since data is
being used mostly for verification of possible
answers. In a typical DENDRAL mass spectrum
analysis, usually no more than about 25 data
points out of a typical total of 250 points are
processed. This important point about data
reduction and focus-of-attention has been
discussed before by Gregory (1968) and by the
vision and speech research groups, but is not
widely understood.

Conclusion. DENDRAL was an early herald of
AI’s shift to the knowledge-based paradigm. It
demonstrated the point of the primacy of
domain-specific knowledge in achieving expert
levels of performance. Its development brought to
the surface important problems of knowledge
representation, acquisition, and use. It showed
that, by and large, the AI tools of the first
decade were sufficient to cope with the demands of
a complex scientific problem-solving task, or were
readily extended to handle unforeseen difficulties.
It demonstrated that AI’s conceptual and
programming tools were capable of producing
programs of applications interest, albeit in
narrow specialties. Such a demonstration of
competence and sufficiency was important for the
credibility of the AI field at a critical juncture
in its history.

3.2 META-DENDRAL: inferring rules of mass
spectrometry

3.2.1 Historical note

The META-DENDRAL program is a case study in
automatic acquisition of domain knowledge. It
arose out of our DENDRAL work for two reasons:
first, a decision that with DENDRAL we had a
sufficiently firm foundation on which to pursue
our long-standing interest in processes of
scientific theory formation; second, by a
recognition that the acquisition of domain
knowledge was the bottleneck problem in the
building of applications-oriented intelligent
agents.

3.2.2 Task

META-DENDRAL’s job is to infer rules of
fragmentation of molecules in a mass spectrometer
for possible later use by the DENDRAL performance
program. The inference is to be made from actual
spectra recorded from known molecular structures.
The output of the system is the set of
fragmentation rules discovered, a summary of the
evidence supporting each rule, and a summary of
contra-indicating evidence. User-supplied
constraints can also be input to force the form of
rules along desired lines.

3.2.3 Representations

The rules are, of course, of the same form
as used by DENDRAL that was described earlier.

3.2.4 Sketch of Method

META-DENDRAL, like DENDRAL, uses the
generation-and-test framework. The process is
organized in three stages: reinterpret the data
and summarize evidence (INTSUM); generate
plausible candidates for rules (RULEGEN); test and
refine the set of plausible rules (RULEMOD).

INTSUM: gives every data point in every
spectrum an interpretation as a possible (highly
specific) fragmentation. It then summarizes
statistically the "weight of evidence" for
fragmentations and for atomic configurations that
cause these fragmentations. Thus, the job of
INTSUM is to translate data to DENDRAL subgraphs
and bond-breaks, and to summarize the evidence
accordingly.

RULEGEN: conducts a heuristic search of the
space of all rules that are legal under the
DENDRAL rule syntax and the user-supplied
constraints. It searches for plausible rules, i.e.
those for which positive evidence exists. A search
path is pruned when there is no evidence for rules
of the class just generated. The search tree
begins with the (single) most general rule
(loosely put, "anything" fragments from
"anything") and proceeds level-by-level toward
more detailed specifications of the "anything."
The heuristic stopping criterion measures whether
a rule being generated has become too specific, in
particular whether it is applicable to too few
molecules of the input set. Similarly there is a
criterion for deciding whether an emerging rule is
too general. Thus, the output of RULEGEN is a set
of candidate rules for which there is positive
evidence.
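
A general-to-specific search of this kind can be sketched as follows. The sketch is only an illustration under stated assumptions: a "rule" is a set of attribute = value conditions on a bond's two neighboring atoms, evidence is a made-up list of bond environments marked as broken or not, and the two thresholds (`min_positive` for "too specific", `min_precision` for "too general") are hypothetical stand-ins for the heuristic criteria, which the text does not specify.

```python
def matches(rule, features):
    """A rule (set of (attribute, value) conditions) matches if every condition holds."""
    return all(features.get(attr) == value for attr, value in rule)


def rulegen(instances, min_positive=2, min_precision=0.75):
    """Level-by-level search from the empty (most general) rule toward specializations."""
    conditions = sorted({item for features, _ in instances for item in features.items()})
    frontier, seen, accepted = [frozenset()], set(), []
    while frontier:
        next_frontier = []
        for rule in frontier:
            covered = [broke for features, broke in instances if matches(rule, features)]
            positives = sum(covered)
            if positives < min_positive:
                continue  # prune: too specific, or no positive evidence on this path
            if positives / len(covered) >= min_precision:
                accepted.append(rule)  # no longer too general: keep it, stop descending
                continue
            used = {attr for attr, _ in rule}
            for cond in conditions:  # specialize by adding one more condition
                child = rule | {cond}
                if cond[0] not in used and child not in seen:
                    seen.add(child)
                    next_frontier.append(child)
        frontier = next_frontier
    return accepted


# Invented evidence: (bond environment, did the bond break?)
instances = [
    ({"left": "N", "right": "C"}, True),
    ({"left": "N", "right": "C"}, True),
    ({"left": "N", "right": "O"}, True),
    ({"left": "C", "right": "C"}, False),
    ({"left": "C", "right": "C"}, False),
    ({"left": "C", "right": "O"}, False),
]
rules = rulegen(instances)
```

On this data the search returns both "left = N" and the more specific "left = N and right = C", which rest on overlapping evidence; removing that kind of redundancy is the sort of clean-up left to a later refinement stage.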

RULEMOD: tests the candidate rule set using
more complex criteria, including the presence of
negative evidence. It removes redundancies in the
candidate rule set; merges rules that are
supported by the same evidence; tries further
specialization of candidates to remove negative
evidence; and tries further generalization that
preserves positive evidence.


3.2.5 Results

META-DENDRAL produces rule sets that rival
in quality those produced by our collaborating
experts. In some tests, META-DENDRAL recreated
rule sets that we had previously acquired from our
experts during the DENDRAL project. In a more
stringent test involving members of a family of
complex ringed molecules for which the mass
spectral theory had not been completely worked out
by chemists, META-DENDRAL discovered rule sets for
each subfamily. The rules were judged by experts
to be excellent and a paper describing them was
recently published in a major chemical journal
(Buchanan, Smith, et al, 1976).

In a test of the generality of the approach,
a version of the META-DENDRAL program is currently
being applied to the discovery of rules for the
analysis of nuclear magnetic resonance data.

3.3 MYCIN and TEIRESIAS: Medical Diagnosis

3.3.1 Historical note

MYCIN originated in the Ph.D. thesis of E.
Shortliffe (now Shortliffe, M.D. as well), in
collaboration with the Infectious Disease group at
the Stanford Medical School (Shortliffe, 1976).
TEIRESIAS, the Ph.D. thesis work of R. Davis,
arose from issues and problems indicated by the
MYCIN project but generalized by Davis beyond the
bounds of medical diagnosis applications (Davis,
1976). Other MYCIN-related theses are in
progress.

3.3.2 Tasks

The MYCIN performance task is diagnosis of
blood infections and meningitis infections and the
recommendation of drug treatment. MYCIN conducts
a consultation (in English) with a physician-user
about a patient case, constructing
lines-of-reasoning leading to the diagnosis and
treatment plan.

The TEIRESIAS knowledge acquisition task can
be described as follows:

In the context of a particular consultation,
confront the expert with a diagnosis with which he
does not agree. Lead him systematically back
through the line-of-reasoning that produced the
diagnosis to the point at which he indicates the
analysis went awry. Interact with the expert to
modify offending rules or to acquire new rules.
Rerun the consultation to test the solution and
gain the expert’s concurrence.

3.3.3 Representations:

MYCIN’s rules are of the form:
IF <conjunctive clauses> THEN <implication>

Here is an example of a MYCIN rule for blood
infections.

 

RULE 85

IF:

1) The site of the culture is blood, and

2) The gram stain of the organism is
gramneg, and

3) The morphology of the organism is
rod, and

4) The patient is a compromised host

THEN:

There is suggestive evidence (.6) that
the identity of the organism is
pseudomonas-aeruginosa

 

TEIRESIAS allows the representation of
MYCIN-like rules governing the use of other
rules, i.e. rule-based strategies. An example
follows.

 

METARULE 2

IF:
1) the patient is a compromised host, and
2) there are rules which mention in their
premise pseudomonas, and
3) there are rules which mention in their
premise klebsiellas

THEN:
There is suggestive evidence (.4) that the
former should be done before the latter.
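
The effect of such a metarule on the order in which
object-level rules are tried might be sketched as
follows. This is an assumption-laden illustration:
the function name apply_metarule_2, the (name,
organisms) pair format, and the choice to leave
unrelated rules between the two groups are all
invented here, and the sketch ignores the (.4)
strength attached to the metarule.

```python
def apply_metarule_2(candidate_rules, patient_is_compromised_host):
    """Reorder candidate rules as METARULE 2 suggests.

    candidate_rules: list of (rule_name, organisms_mentioned_in_premise)
    pairs, where the second element is a set of strings (hypothetical format).
    """
    if not patient_is_compromised_host:
        # The metarule's own premise fails, so the order is left alone.
        return list(candidate_rules)

    def rank(entry):
        _, mentioned = entry
        if "pseudomonas" in mentioned:
            return 0  # "the former": tried first
        if "klebsiella" in mentioned:
            return 2  # "the latter": tried afterwards
        return 1      # assumption: unrelated rules sit in between

    # sorted() is stable, so rules with equal rank keep their original order.
    return sorted(candidate_rules, key=rank)
```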

 


3.3.4 Sketch of method

MYCIN employs a generation-and-test
procedure of a familiar sort. The generation of
steps in the line-of-reasoning is accomplished by
backward chaining of the rules. An IF-side clause
is either immediately true or false (as determined
by patient or test data entered by the physician
in the consultation); or is to be decided by
subgoaling. Thus, "test" is interleaved with
"generation" and serves to prune out incorrect
lines-of-reasoning.
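
The control structure just described can be
sketched in a few lines of Python. This is a
simplified, yes/no illustration under stated
assumptions, not MYCIN's procedure: the function
name backward_chain, the (attribute, value) clause
format, and the ask callback standing in for the
physician are hypothetical, certainty factors are
omitted, and there is no protection against
circular rule sets.

```python
def backward_chain(goal, rules, facts, ask):
    """Try to establish goal, an (attribute, value) pair.

    rules: list of (premise, conclusion); premise is a list of
           (attribute, value) clauses, conclusion is one such pair.
    facts: dict attribute -> value, filled in as the consultation proceeds.
    ask:   callable(attribute) -> value, standing in for the physician.
    """
    attribute, value = goal
    if attribute in facts:
        # The clause is immediately true or false from known data.
        return facts[attribute] == value

    relevant = [rule for rule in rules if rule[1][0] == attribute]
    if not relevant:
        # No rule concludes about this attribute: request the datum.
        facts[attribute] = ask(attribute)
        return facts[attribute] == value

    for premise, conclusion in relevant:
        # all() stops at the first failing clause, so testing prunes the
        # line of reasoning before the remaining clauses are pursued.
        if conclusion == goal and all(
            backward_chain(clause, rules, facts, ask) for clause in premise
        ):
            facts[attribute] = value
            return True
    return False
```

Because a failing clause ends the evaluation of a
premise at once, later clauses of that rule never
generate subgoals or questions.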

Each rule supplied by an expert has
associated with it a "degree of certainty"
representing the expert's confidence in the
validity of the rule (a number from 1 to 10).
MYCIN uses a particular ad-hoc but simple model of
inexact reasoning to cumulate the degrees of
certainty of the rules used in an inference chain
(Shortliffe and Buchanan, 1975).
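
The text does not spell out the cumulation
function. As an illustration, the sketch below uses
the combining step commonly quoted for the
certainty-factor model when two rules both support
the same conclusion. It assumes certainties
expressed on a 0-to-1 scale, as in the rule
examples above (.6, .4), rather than 1 to 10; the
function names are invented for this example.

```python
def combine_supporting(cf_a, cf_b):
    """Combine two certainty factors that support the same conclusion.

    Assumes both values lie in [0, 1]; evidence against a conclusion
    is outside the scope of this sketch.
    """
    for cf in (cf_a, cf_b):
        if not 0.0 <= cf <= 1.0:
            raise ValueError("certainty factors must lie in [0, 1]")
    # Each new piece of evidence closes part of the remaining gap to 1.
    return cf_a + cf_b * (1.0 - cf_a)


def cumulate(certainties):
    """Fold a sequence of supporting certainty factors into one value."""
    total = 0.0
    for cf in certainties:
        total = combine_supporting(total, cf)
    return total
```

With this step, two supporting rules of strength .6
and .4 cumulate to .6 + .4 x (1 - .6) = .76, and the
result does not depend on the order in which the
rules fire.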

It follows that there may be a number of
"somewhat true" lines-of-reasoning -- some
indicating one diagnosis, some indicating another.
All (above a threshold) are used by the system as
sources of knowledge indicating plausible lines-
of-reasoning.

TEIRESIAS' rule acquisition process is based
on a record of MYCIN's search. Rule acquisition is
guided by a set of rule models that dictate the
form and indicate the likely content of new rules.
Rule models are not given in advance, but are
inferred from the knowledge base of existing
rules.
3.3.5 Language of Interaction

The language used looks like it might be
English but is actually the dialect "Doctor-ese"
used by members of the tribe of healing arts
practitioners. Reasonably simple language
processing methods suffice. When ambiguities of
interpretation are encountered, they are fed back
to the user for decisions about meaning.

3.3.6 Example diagnosis and treatment plan

The following is an example of a diagnosis
reported by MYCIN to one of its users following a
consultation. An example consultation will not be
shown, but one can be found in Shortliffe's book
(Shortliffe, 1976).

 

My therapy recommendations will be designed
to treat for organisms that are either very
likely or, although less likely, would have
a significant effect on therapy selection
if they were present. It is important to
cover for the following probable
infection(s) and associated organism(s):

INFECTION-1 is CYSTITIS
<Item 1> PSEUDOMONAS-CEPACIA [ORGANISM-5]
<Item 2> CITROBACTER-DIVERSUS [ORGANISM-4]
<Item 3> E.COLI [ORGANISM-3]

INFECTION-2 is UPPER-RESPIRATORY-INFECTION
<Item 4> LISTERIA [ORGANISM-2]

INFECTION-3 is BACTEREMIA
<Item 5> ENTEROCOCCUS [ORGANISM-1]

[REC-1] My preferred therapy recommendation
is as follows:

In order to cover for Items <1 2 3 4 5>:

Give the following in combination:

1) KANAMYCIN
Dose: 750 mg (7.5 mg/kg) q12h IM (or IV)
for 28 days
Comments: Modify dose in renal failure

2) PENICILLIN
Dose: 2,500,000 units (25000 units/kg)
q4h IV for 28 days

 

3.3.7 Results

MYCIN is currently being prepared for
clinical evaluation. In a preliminary evaluation,
done outside of the clinical setting, a panel of
expert judges was asked to rate MYCIN's
performance along a number of dimensions (e.g.
infecting organism identification, organism
significance). In 90% of the cases submitted to
the judges, a majority of the judges said that the
program's decisions were the-same-as or as-good-as
the decisions they would have made.

3.3.8 Explanation

MYCIN can expose its line of reasoning by
offering explanations of various kinds. "Why" and
"How" questions can be asked by the user during
the consultation. "Why" is the user's way of
inquiring, "Why do you want to know that?" "How"
is his way of asking, "How did you conclude
that...?" In a post-consultation, he can similarly
inquire about how the diagnostic conclusions were
reached, and inquire about the reasons why certain
other conclusions were ruled out. Examples of the
MYCIN-TEIRESIAS explanation capabilities follow:


 

MYCIN: Is the patient's illness with
ORGANISM-1 a hospital-acquired infection?

USER: WHY

TEIRESIAS: (i.e. WHY is it important to
determine whether or not the infection with
ORGANISM-1 was acquired while the patient
was hospitalized?)

[1.0] This will aid in determining the
category of ORGANISM-1.

It has already been established that

[1.1] the gram stain of ORGANISM-1 is
gramneg, and

[1.2] the morphology of ORGANISM-1 is rod,
and

[1.3] the aerobicity of ORGANISM-1 is
facultative

Therefore, if

[1.4] the infection with ORGANISM-1 was not
acquired while the patient was hospitalized

then

there is strongly suggestive evidence (.8)
that the category of ORGANISM-1 is
enterobacteriaceae

Also: there is weakly suggestive evidence
(.1) that the identity of
ORGANISM-1 is pseudomonas-aeruginosa

[RULE 37]

USER: HOW 1.3

TEIRESIAS: [i.e. HOW was it established
that the aerobicity of ORGANISM-1 is
facultative?]

The following were used:

[3.1] RULE 27 indicated there is strongly
suggestive evidence (.8) that the
aerobicity of ORGANISM-1 is facultative

[3.2] RULE 27 indicated there is weakly
suggestive evidence (.2) that the
aerobicity of ORGANISM-1 is anaerobic

Since this gave a cumulative CF of (.8) for
facultative, and (.2) for anaerobic, it has
been established that the aerobicity of
ORGANISM-1 is facultative.

USER: WHY DIDN'T YOU CONSIDER
STREPTOCOCCUS AS A POSSIBILITY FOR
ORGANISM-1?

TEIRESIAS: The following rule could have
been used to determine that the identity of
ORGANISM-1 was streptococcus: RULE 33

But clause 2 ("the morphology of the
organism is coccus") was already known
to be false for ORGANISM-1, so the rule was
never tried.
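
A "How" answer of the kind shown above can be
produced from a simple record of which rules
contributed to which conclusions. The sketch below
is a hypothetical illustration of that idea only;
the class name Trace, its methods, and the wording
of the generated sentence are assumptions and do
not reproduce the MYCIN-TEIRESIAS implementation.

```python
class Trace:
    """Record of rule firings, kept so conclusions can be explained later."""

    def __init__(self):
        self._support = {}  # attribute -> list of (rule_name, value, cf)

    def record(self, rule_name, attribute, value, cf):
        """Note that rule_name gave evidence cf that attribute is value."""
        self._support.setdefault(attribute, []).append((rule_name, value, cf))

    def how(self, attribute):
        """Return one English line per rule that bore on the attribute."""
        return [
            f"{rule_name} indicated evidence ({cf}) that {attribute} is {value}"
            for rule_name, value, cf in self._support.get(attribute, [])
        ]
```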

 

3.3.9 Discussion

 

A few short comments aimed at knowledge
engineering are in order.

English is the language most natural and
comfortable for our experts and users, and was
therefore the language chosen for interactive
consultation, explanation, and external
representation of the rules (the internal format
is INTERLISP). This situation is not peculiar to
doctors; in most areas of application of
intelligent agents I believe that English (i.e.
natural language) will be the language of choice.
Programming an English language processor and
front-end to such systems is not a scary
enterprise because:

a) the domain is specialized, so that
possible interpretations are constrained.

b) specialist-talk is replete with standard
jargon and stereotyped ways of expressing
knowledge and queries -- just right for text
templates, simple grammars and other simple
processing schemes.

c) the ambiguity of interpretation resulting
from simple schemes can be dealt with easily by
feeding back interpretations for confirmation. If
this is done with a pleasant "I didn't quite
understand you..." tone, it is not irritating to
the user.

English may be exactly the wrong language
for representation and interaction in some
domains. It would be awkward, to say the least, to
represent DENDRAL's chemical structures and
knowledge of mass spectrometry in English, or to
interact about these with a user.

Simple explanation schemes have been a part
of the AI scene for a number of years and are not
hard to implement. Really good models of what
explanation is as a transaction between user and
agent, with programs to implement these models,
will be the subject (I predict) of much future
research in AI.


Without the explanation capability, I
assert, user acceptance of MYCIN would have been
nil, and there would have been a greatly
diminished effectiveness and contribution of our
experts.

MYCIN was the first of our programs that
forced us to deal with what we had always
understood: that experts’ knowledge is uncertain
and that our inference engines had to be made to
reason with this uncertainty. It is less important
that the inexact reasoning scheme be formal,
rigorous, and uniform than it is for the scheme to
be natural to and easily understandable by the
experts and users.

All of these points can be summarized by
saying that MYCIN and its TEIRESIAS adjunct are
experiments in the design of a see-through system,
whose representations and processes are almost
transparently clear to the domain specialist.
"Almost" here is equivalent to "with a few minutes
of introductory description." The various pieces
of MYCIN -- the backward chaining, the English
transactions, the explanations, etc. -- are each
simple in concept and realization. But there are
great virtues to simplicity in system design; and
viewed as a total intelligent agent system,
MYCIN/TEIRESIAS is one of the best engineered.

3.4 SU/X: signal understanding

3.4.1 Historical note

SU/X is a system design that was tested in
an application whose details are classified.
Because of this, the ensuing discussion will
appear considerably less concrete and tangible
than the preceding case studies. This system
design was done by H.P. Nii and me, and was
strongly influenced by the CMU Hearsay II system
design.

3.4.2 Task

SU/X's task is the formation and continual
updating, over long periods of time, of hypotheses
about the identity, location, and velocity of
objects in a physical space. The output desired is
a display of the "current best hypotheses" with
full explanation of the support for each. There
are two types of input data: the primary signal
(to be understood); and auxiliary symbolic data
(to supply context for the understanding). The
primary signals are spectra, represented as
descriptions of the spectral lines. The various
spectra cover the physical space with some spatial
overlap.
3.4.3 Representations

The rules given by the expert about objects,
their behavior, and the interpretation of signal
data from them are all represented in the
situation => action form. The "situations"
constitute invoking conditions and the "actions"
are processes that modify the current hypotheses,
post unresolved issues, recompute evaluations,
etc. The expert's knowledge of how to do analysis
in the task is also represented in rule form.
These strategy rules replace the normal executive
program.

The situation-hypothesis is represented as a
node-link graph, tree-like in that it has distinct
"levels," each representing a degree of
abstraction (or aggregation) that is natural to
the expert in his understanding of the domain. A
node represents an hypothesis; a link to that node
represents support for that hypothesis (as in
HEARSAY II, "support from above" or "support from
below"). "Lower" levels are concerned with the
specifics of the signal data. "Higher" levels
represent symbolic abstractions.

3.4.4 Sketch of method

The situation-hypothesis is formed
incrementally. As the situation unfolds over time,
the triggering of rules modifies or discards
existing hypotheses, adds new ones, or changes
support values. The situation-hypothesis is a
common workspace ("blackboard," in HEARSAY jargon)
for all the rules.

In general, the incremental steps toward a
more complete and refined situation-hypothesis can
be viewed as a sequence of local generate-and-test
activities. Some of the rules are plausible move
generators, generating either nodes or links.
Other rules are evaluators, testing and modifying
node descriptions.

In typical operation, new data is submitted
for processing (say, N time-units of new data).
This initiates a flurry of rule-triggerings and
consequently rule-actions (called "events"). Some
events are direct consequences of the data; other
events arise in a cascade-like fashion from the
triggering of rules. Auxiliary symbolic data also
cause events, usually affecting the higher levels
of the hypothesis. As a consequence, support-
from-above for the lower level processes is made
available; and expectations of possible lower
level events can be formed. Eventually all the
relevant rules have their say and the system
becomes quiescent, thereby triggering the input of
new data to re-energize the inference activity.
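
The cycle of triggering rules against a shared
workspace until nothing more fires can be sketched
as below. Because the SU/X application is
classified, everything concrete here is invented
for illustration: the blackboard keys ("signal",
"sources", "objects"), the two rules, and the
function run_until_quiescent are assumptions, and
real rules would carry far richer situations and
actions.

```python
def has_new_lines(bb):
    """Situation: unprocessed spectral lines are waiting."""
    return bool(bb["signal"])


def group_lines(bb):
    """Action: gather waiting lines into a lower-level source node."""
    bb["sources"].append({"lines": list(bb["signal"]), "explained": False})
    bb["signal"].clear()


def has_unexplained_source(bb):
    """Situation: some source node has no higher-level hypothesis yet."""
    return any(not source["explained"] for source in bb["sources"])


def hypothesize_object(bb):
    """Action: post an object hypothesis supported from below by a source."""
    for source in bb["sources"]:
        if not source["explained"]:
            bb["objects"].append({"support": source})
            source["explained"] = True


RULES = [
    ("group-lines", has_new_lines, group_lines),
    ("hypothesize-object", has_unexplained_source, hypothesize_object),
]


def run_until_quiescent(blackboard, rules, max_cycles=100):
    """Fire every triggered rule, cycle after cycle, until none triggers."""
    events = []
    for _ in range(max_cycles):
        fired = False
        for name, situation, action in rules:
            if situation(blackboard):
                action(blackboard)
                events.append(name)  # each rule-action is an "event"
                fired = True
        if not fired:
            break  # quiescent: ready for the next batch of data
    return events
```

The max_cycles bound is a safety net for this
sketch; it guards against rule sets whose actions
never make their own situations false.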

The system uses the simplifying strategy of
maintaining only one "best" situation-hypothesis
at any moment, modifying it incrementally as
required by the changing data. This approach is
made feasible by several characteristics of the
domain. First, there is the strong continuity over
time of objects and their behaviors
(specifically, they do not change radically over
time, or behave radically differently over short
periods). Second, a single problem (identity,
location and velocity of a particular set of
objects) persists over numerous data gathering
periods. (Compare this to speech understanding in
which each sentence is spoken just once, and each
presents a new and different problem.) Finally,
the system's hypothesis is typically "almost
right," in part because it gets numerous
opportunities to refine the solution (i.e. the
numerous data gathering periods), and in part
because the availability of many knowledge sources
tends to over-determine the solution. As a result
of all of these, the current best hypothesis
changes only slowly with time, and hence keeping
only the current best is a feasible approach.

Of interest are the time-based events. These
rule-like expressions, created by certain rules,
trigger upon the passage of specified amounts of
time. They implement various "wait-and-see"
strategies of analysis that are useful in the
domain.
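
One simple way to realize events that trigger after
a specified delay is a queue ordered by due time.
The sketch below is an assumption made for
illustration, not a description of SU/X: the class
TimedEvents, its schedule and advance methods, and
the use of plain numbers for time are all
hypothetical.

```python
import heapq
import itertools


class TimedEvents:
    """Queue of deferred actions that fire once enough time has passed."""

    def __init__(self):
        self._queue = []
        self._order = itertools.count()  # tie-breaker for equal due times

    def schedule(self, now, delay, action):
        """Arrange for action() to run once the clock reaches now + delay."""
        heapq.heappush(self._queue, (now + delay, next(self._order), action))

    def advance(self, now):
        """Run every action whose due time has arrived; return their results."""
        results = []
        while self._queue and self._queue[0][0] <= now:
            _, _, action = heapq.heappop(self._queue)
            results.append(action())
        return results
```

A rule that wants to "wait and see" would call
schedule with an action that re-examines the
hypothesis in question; the main loop would call
advance each time new data arrives.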

3.4.5 Results

In the test application, using signal data
generated by a simulation program because real
data was not available, the program achieved
expert levels of performance over a span of test
problems. Some problems were difficult because
there was very little primary signal to support
inference. Others were difficult because too much
signal induced a plethora of alternatives with
much ambiguity.

A modified SU/X design is currently being
used as the basis for an application to the
interpretation of x-ray crystallographic data, the
CRYSALIS program mentioned later.

3.4.6 Discussion

The role of the auxiliary symbolic sources
of data is of critical importance. They supply a
symbolic model of the existing situation that is
used to generate expectations of events to be
observed in the data stream. This allows flow of
inferences from higher levels of abstraction to
lower. Such a process, so familiar to AI
researchers, apparently is almost unrecognized
among signal processing engineers. In the
application task, the expectation-driven analysis
is essential in controlling the combinatorial
processing explosion at the lower levels, exactly
the explosion that forces the traditional signal
processing engineers to seek out the largest
possible number-cruncher for their work.

The design of appropriate explanations for
the user takes an interesting twist in SU/X. The
situation-hypothesis unfolds piecemeal over time,
but the "appropriate" explanation for the user is
one that focuses on individual objects over time.
Thus the appropriate explanation must be
synthesized from a history of all the events that
led up to the current hypothesis. Contrast this
with the MYCIN-TEIRESIAS reporting of rule
invocations in the construction of a reasoning
chain.

Since its knowledge base and its auxiliary
symbolic data give it a model-of-the-situation
that strongly constrains interpretation of the
primary data stream, SU/X is relatively
unperturbed by errorful or missing data. These
data conditions merely cause fluctuations in the
credibility of individual hypotheses and/or the
creation of the "wait-and-see" events. SU/X can be
(but has not yet been) used to control sensors.
Since its rules specify what types and values of
evidence are necessary to establish support, and
since it is constantly processing a complete
hypothesis structure, it can request "critical
readings" from the sensors. In general, this
allows an efficient use of limited sensor
bandwidth and data acquisition processing
capability.
3.5 OTHER CASE STUDIES

Space does not allow more than just a brief
sketch of other interesting projects that have
been completed or are in progress.

3.5.1 AM: mathematical discovery

AM is a knowledge-based system that
conjectures interesting concepts in elementary
mathematics. It is a discoverer of interesting
theorems to prove, not a theorem proving program.
It was conceived and executed by D. Lenat for his
Ph.D. thesis, and is reported by him in these
proceedings ("An Overview of AM").

AM's knowledge is basically of two types:
rules that suggest possibly interesting new
concepts from previously conjectured concepts; and
rules that evaluate the mathematical
"interestingness" of a conjecture. These rules
attempt to capture the expertise of the
professional mathematician at the task of
mathematical discovery. Though Lenat is not a
professional mathematician, he was able
successfully to serve as his own expert in the
building of this program.

AM conducts a heuristic search through the
space of concepts creatable from its rules. Its
basic framework is generation-and-test. The
generation is plausible move generation, as
indicated by the rules for formation of new
concepts. The test is the evaluation of
"interestingness." Of particular note is the
method of test-by-example that lends the flavor of
scientific hypothesis testing to the enterprise of
mathematical discovery.

Initialized with concepts of elementary set
theory, it conjectured concepts in elementary
number theory, such as "add," “multiply” (by four
distinct paths!), “primes,” the unique
factorization theorem, and a concept similar to
primes but previously not much studied called
“maximally divisible numbers."

3.5.2 MOLGEN: planning experiments in molecular
genetics

MOLGEN, a collaboration with the Stanford
Genetics Department, is work in progress.
MOLGEN's task is to provide intelligent advice to
a molecular geneticist on the planning of
experiments involving the manipulation of DNA. The
geneticist has various kinds of laboratory
techniques available for changing DNA material
(cuts, joins, insertions, deletions, and so on);
techniques for determining the biological
consequences of the changes; various instruments
for measuring effects; various chemical methods
for inducing, facilitating, or inhibiting changes;
and many other tools.

MOLGEN will offer planning assistance in
organizing and sequencing such tools to accomplish
an experimental goal. In addition MOLGEN will
check user-provided experiment plans for
feasibility; and its knowledge base will be a
repository for the rapidly expanding knowledge of
this specialty, available by interrogation.

Current efforts to engineer a knowledge-base
management system for MOLGEN are described by
Martin et al. in a paper in these proceedings. This
subsystem uses and extends the techniques of the
TEIRESIAS system discussed earlier.

In MOLGEN the problem of integration of many
diverse sources of knowledge is central since the
essence of the experiment planning process is the
successful merging of biological, genetic,
chemical, topological, and instrument knowledge.
In MOLGEN the problem of representing processes is
also brought into focus since the expert's
knowledge of experimental strategies -- proto-
plans -- must also be represented and put to use.

3.5.3 CRYSALIS: inferring protein structure from
electron density maps

CRYSALIS, too, is work in progress. Its task
is to hypothesize the structure of a protein from
a map of electron density that is derived from
x-ray crystallographic data. The map is three-
dimensional, and the contour information is crude
and highly ambiguous. Interpretation is guided
and supported by auxiliary information, of which
the amino acid sequence of the protein's backbone
is the most important. Density map interpretation
is a protein chemist's art. As always, capturing
this art in heuristic rules and putting it to use
with an inference engine is the project's goal.

The inference engine for CRYSALIS is a
modification of the SU/X system design described
above. The hypothesis formation process must deal
with many levels of possibly useful aggregation
and abstraction. For example, the map itself can
be viewed as consisting of "peaks," or "peaks and
valleys," or "skeleton." The protein model has
"atoms," "amide planes," "amino acid sidechains,"
and even massive substructures such as "helices."
Protein molecules are so complex that a systematic
generation-and-test strategy like DENDRAL's is not
feasible. Incremental piecing together of the
hypothesis using region-growing methods is
necessary.

The CRYSALIS design (alias SU/P) is
described in a recent paper by Nii and Feigenbaum
(1977).

4 SUMMARY OF CASE STUDIES

Some of the themes presented earlier need no
recapitulation, but I wish to revisit three here:
generation-and-test; situation => action rules;
and explanations.

4.1 Generation and Test

 

Aircraft come in a wide variety of sizes,
shapes, and functional designs and they are
applied in very many ways. But almost all that fly
do so because of the unifying physical principle
of lift by airflow; the others are described by
exception. So it is with intelligent agent
programs and, the information processing
psychologists tell us, with people. One unifying
principle of "intelligence" is generation-and-
test. No wonder that it has been so thoroughly
studied in AI research!

In the case studies, generation is
manifested in a variety of forms and processing
schemes. There are legal move generators defined
formally by a generating algorithm (DENDRAL's
graph generating algorithm); or by a logical rule
of inference (MYCIN's backward chaining). When
legal move generation is not possible or not
efficient, there are plausible move generators (as
in SU/X and AM). Sometimes generation is
interleaved with testing (as in MYCIN, SU/X, and
AM). In one case, all generation precedes testing
(DENDRAL). One case (META-DENDRAL) is mixed, with
some testing taking place during generation, some
after.
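
The difference between testing after generation and
testing during generation can be shown on a toy
problem. The sketch below is not drawn from any of
the case-study programs; the function names, the
alphabet of symbols, and the "no two adjacent
symbols equal" test are assumptions chosen only to
make the contrast concrete.

```python
from itertools import product


def no_adjacent_repeats(sequence):
    """Toy test: acceptable if no symbol is immediately repeated."""
    return all(a != b for a, b in zip(sequence, sequence[1:]))


def generate_then_test(alphabet, length, test):
    """All generation precedes testing: enumerate everything, then filter."""
    return [c for c in product(alphabet, repeat=length) if test(c)]


def generate_with_interleaved_test(alphabet, length, test, prefix=()):
    """Test partial candidates as they grow, pruning bad ones early."""
    if len(prefix) == length:
        return [prefix]
    results = []
    for symbol in alphabet:
        candidate = prefix + (symbol,)
        if test(candidate):  # a failing prefix is never extended
            results.extend(
                generate_with_interleaved_test(alphabet, length, test, candidate)
            )
    return results
```

Both functions return the same candidates here, but
the interleaved version never constructs
extensions of a prefix that has already failed,
which is the pruning effect described for MYCIN.
Interleaving of this kind is valid only when a
failing partial candidate cannot be repaired by
later additions, as is true of this toy test.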

Test also shows great variety. There are
simple tests (MYCIN: "Is the organism aerobic?";
SU/X: "Has a spectral line appeared at position
P?") Some tests are complex heuristic evaluations
(AM: "Is the new concept 'interesting'?"; MOLGEN:
"Will the reaction actually take place?")
Sometimes a complex test can involve feedback to
modify the object being tested (as in META-
DENDRAL).

The evidence from our case studies supports
the assertion by Newell and Simon that generation-
and-test is a law of our science (Newell and
Simon, 1976).

4.2 Situation => Action rules

Situation => Action rules are used to
represent experts' knowledge in all of the case
studies. Always the situation part indicates the
specific conditions under which the rule is
relevant. The action part can be simple (MYCIN:
conclude presence of particular organism; DENDRAL:
conclude break of particular bond). Or it can be
quite complex (MOLGEN: an experimental procedure).
The overriding consideration in making design
choices is that the rule form chosen be able to
represent clearly and directly what the expert
wishes to express about the domain. As
illustrated, this may necessitate a wide variation
in rule syntax and semantics.

From a study of all the projects, a
regularity emerges. A salient feature of the
Situation => Action rule technique for
representing experts' knowledge is the modularity
of the knowledge base, with the concomitant
flexibility to add or change the knowledge easily
as the experts' understanding of the domain
changes. Here too one must be pragmatic, not
doctrinaire. A technique such as this can not
represent modularity of knowledge if that
modularity does not exist in the domain. The
virtue of this technique is that it serves as a
framework for discovering what modularity exists
in the domain. Discovery may feed back to cause
reformulation of the knowledge toward greater
modularity.

Finally, our case studies have shown that
strategy knowledge can be captured in rule form.
In TEIRESIAS, the metarules capture knowledge of
how to deploy domain knowledge; in SU/X, the
strategy rules represent the experts' knowledge of
"how to analyze" in the domain.

4.3 Explanation

Most of the programs, and all of the more
recent ones, make available an explanation
capability for the user, be he end-user or system
developer. Our focus on end-users in applications
domains has forced attention to human engineering
issues, in particular making the need for the
explanation capability imperative.

The Intelligent Agent viewpoint seems to us
to demand that the agent be able to explain its
activity; else the question arises of who is in
control of the agent's activity. The issue is not
academic or philosophical. It is an engineering
issue that has arisen in medical and military
applications of intelligent agents, and will
govern future acceptance of AI work in
applications areas. And on the philosophical level
one might even argue that there is a moral
imperative to provide accurate explanations to
end-users whose intuitions about our systems are
almost nil.

Finally, the explanation capability is
needed as part of the concerted attack on the
knowledge acquisition problem. Explanation of the
reasoning process is central to the interactive
transfer of expertise to the knowledge base, and
it is our most powerful tool for the debugging of
the knowledge base.

5 EPILOGUE

What we have learned about knowledge
engineering goes beyond what is discernible in the
behavior of our case study programs. In the next
paper of this two-part series, I will raise and
discuss many of the general concerns of knowledge
engineers, including these:

What constitutes an "application" of AI
techniques?

    There is a difference between a serious
    application and an application-flavored toy
    problem.

What are some criteria for the judicious
selection of an application of AI techniques?

What are some applications areas worthy of
serious attention by knowledge engineers?

    For example, applications to science, to
    signal interpretation, and to human
    interaction with complex systems.

How to find and fascinate an Expert.

    The background and prior training of the
    expert.

    The level of commitment that can be
    elicited.

    Designing systems that "think the way I
    do."

    Sustaining attention by quick feedback
    and incremental progress.

    Focusing attention to data and specific
    problems.

    Providing ways to express uncertainty of
    expert knowledge.

    The side benefits to the expert of his
    investment in the knowledge engineering
    activity.

Gaining consensus among experts about the
knowledge of a domain.

    The consensus may be a more valuable
    outcome of the knowledge engineering effort
    than the building of the program.

Problems faced by knowledge engineers today:

    The lack of adequate and appropriate
    computer hardware.

    The difficulty of export of systems to
    end-users, caused by the lack of properly-
    sized and -packaged combinations of hardware
    and software.

    The chronic absence of cumulation of AI
    techniques in the form of software packages
    that can achieve wide use.

    The shortage of trained knowledge
    engineers.

    The difficulty of obtaining and
    sustaining funding for interesting knowledge
    engineering projects.

6 ACKNOWLEDGMENT

The work reported herein has received long-
term support from the Defense Advanced Research
Projects Agency. The National Institutes of Health
has supported DENDRAL, META-DENDRAL, and the
SUMEX-AIM computer facility on which we compute.
The National Science Foundation has supported
research on CRYSALIS and MOLGEN. The Bureau of
Health Sciences Research and Evaluation has
supported research on MYCIN. I am grateful to
these agencies for their continuing support of our
work.

I wish to express my deep admiration and
thanks to the faculty, staff and students of the
Heuristic Programming Project, and to our
collaborators in the various worldly arts, for the
creativity and dedication that has made our work
exciting and fruitful. My particular thanks for
assistance in preparing this manuscript go to
Randy Davis, Penny Nii, Reid Smith, and Carolyn
Taynat.


7 REFERENCES

General

Feigenbaum, E.A., "Artificial Intelligence
Research: What is it? What has it achieved?
Where is it going?," invited paper, Symposium on
Artificial Intelligence, Canberra, Australia,
1974.

Goldstein, I. and S. Papert, "Artificial
Intelligence, Language, and the Study of
Knowledge," Cognitive Science, Vol. 1, No. 1, 1977.

Gregory, R., "On How so Little Information
Controls so Much Behavior," Bionics Research
Report No. 1, Machine Intelligence Department,
University of Edinburgh, 1968.

Newell, A. and H.A. Simon, Human Problem Solving,
Prentice-Hall, 1972.

Newell, A. and H.A. Simon, "Computer Science as
Empirical Inquiry: Symbols and Search," Comm ACM,
19, 3, March, 1976.

DENDRAL and META-DENDRAL

 

Feigenbaum, E.A., Buchanan, B.G. and J. Lederberg,
"On Generality and Problem Solving: a Case Study
Using the DENDRAL Program," Machine Intelligence
6, Edinburgh Univ. Press, 1971.

Buchanan, B.G., Duffield, A.M. and A.V. Robertson,
"An Application of Artificial Intelligence to the
Interpretation of Mass Spectra," Mass
Spectrometry Techniques and Applications, G.W.A.
Milne, Ed., John Wiley & Sons, Inc., p. 121,
1971.

Michie, D. and B.G. Buchanan, "Current Status of
the Heuristic DENDRAL Program for Applying
Artificial Intelligence to the Interpretation of
Mass Spectra," Computers for Spectroscopy, R.A.G.
Carrington, ed., London: Adam Hilger, 1974.

Buchanan, B.G., "Scientific Theory Formation by
Computer," Nato Advanced Study Institutes Series,
Series E: Applied Science, 14:515, Noordhoff-
Leyden, 1976.

Buchanan, B.G., Smith, D.H., White, W.C., Gritter,
R.J., Feigenbaum, E.A., Lederberg, J. and C.
Djerassi, "Applications of Artificial
Intelligence for Chemical Inference XXII.
Automatic Rule Formation in Mass Spectrometry by
Means of the Meta-DENDRAL Program," Journal of
the ACS, 98:6168, 1976.

MYCIN

 

Shortliffe, E. Computer-based Medical
Consultations: MYCIN, New York, Elsevier, 1976.

Davis, R., Buchanan, B.G. and E.H. Shortliffe,
"Production Rules as a Representation for a
Knowledge-Based Consultation Program," Artificial
Intelligence, 8, 1, February, 1977.

Shortliffe, E.H. and B.G. Buchanan, "A Model of
Inexact Reasoning in Medicine," Mathematical
Biosciences, 23:351, 1975.

TEIRESIAS

Davis, R., "Applications of Meta Level Knowledge
to the Construction, Maintenance and Use of Large
Knowledge Bases," Memo HPP-76-7, Stanford
Computer Science Department, Stanford, CA, 1976.

Davis, R., "Interactive Transfer of Expertise I:
Acquisition of New Inference Rules," these
Proceedings.

Davis, R. and B.G. Buchanan, "Meta-Level
Knowledge: Overview and Applications," these
Proceedings.

SU/X

Nii, H.P. and E.A. Feigenbaum, "Rule Based
Understanding of Signals," Proceedings of the
Conference on Pattern-Directed Inference Systems,
1977 (forthcoming), also Memo HPP-77-7, Stanford
Computer Science Department, Stanford, CA, 1977.

AM

Lenat, D., "AM: An Artificial Intelligence
Approach to Discovery in Mathematics as Heuristic
Search," Memo HPP-76-8, Stanford Computer Science
Department, Stanford, CA, 1976.

MOLGEN

Martin, N., Friedland, P., King, J., and M.
Stefik, "Knowledge Base Management for Experiment
Planning in Molecular Genetics," these
Proceedings.

CRYSALIS

Engelmore, R. and H.P. Nii, "A Knowledge-Based
System for the Interpretation of Protein X-Ray
Crystallographic Data," Memo HPP-77-2, Department
of Computer Science, Stanford, CA, 1977.


References

Adams, J.B. A probability model of medical reasoning and
the MYCIN model. Math. Biosci. 32,177-186 (1976).

2. Anderson, R.H., Gallegos, M., Gillogly, J.J., Greenberg,
R., and Villanueva, R. RITA Reference Manual, Report
R-1808-ARPA, The Rand Corporation, Santa Monica, CA.,
September 1977.

3. Bennett J.S., Creary L.G., Engelmore R.S., Melosh R.B.,
A knowledge-based consultant for structural analysis,
forthcoming.

4. Bleich, H.L. The computer as a consultant. New Eng. J.
Med. 284,141-147 (1971).

5. Blum, Robert L. and Wiederhold, Gio: Inferring Knowledge
from Clinical Data Banks Utilizing Techniques from
Artificial Intelligence. "Proc. 2nd Annual Symp. on Comp.
Applic. in Med. Care," pp. 303-307, IEEE, Washington D.C.,
Nov. 5-9, 1978.

6. Bobrow D.G., Winograd T., An Overview of KRL, a Knowledge
Representation Language, Cognitive Science 1:1 (1977).

7. Bobrow D.G., Winograd T., Experience with KRL-0, One cycle
of a knowledge representation language, Proceedings of the
5th International Joint Conference on Artificial
Intelligence, Cambridge, Mass. (August 1977).

8. Bonnet A., BAOBAB, A parser for a rule-based system using
a semantic grammar, Technical Report HPP-78-10, Heuristic
Programming Project, Stanford, California (September 1978).

9. Brown, J.S., Steps toward a Theoretic Foundation for
Complex, Knowledge-Based CAI. BBN No. 3135.

10. Brown, J.S., Collins, A., and Harris, G.
Artificial Intelligence and Learning Strategies. To appear
in Learning Strategies (ed. Harry O'Neil), Academic
Press, New York, 1978.

11. Buchanan, Bruce G. and Feigenbaum, Edward A. DENDRAL and
Meta-DENDRAL: Their Applications Dimension, Artificial
Intelligence, 11:5 (1978).

12. Clancey, W. "The Structure of a Case Method Dialogue", to
appear in Int. Jnl. of Man Machine Studies, Fall, 1978.

13. Colby, K.M., Weber, S., and Hilf, F. Artificial
paranoia. Artificial Intelligence 2,1-25 (1971).

14. Croft, D.J. Is computerized diagnosis possible? Comput.
Biomed. Res. 5,351-367 (1972).

15. Davis, R. Applications of Meta Level Knowledge to
the Construction, Maintenance, and Use of Large Knowledge
Bases. Doctoral dissertation, Stanford University; Memo
HPP-76-7, Stanford Computer Science Department, 1976.

16. Davis, R. and King, J. An overview of production
systems. Machine Intelligence 8: Machine Representations of
Knowledge (eds. E.W. Elcock and D. Michie), John Wiley,
April 1977.

17. de Dombal, F.T., Leaper, D.J., Staniland, J.R., McCann,
A.P., Horrocks, J.C. Computer aided diagnosis of acute
abdominal pain. Brit. Med. J. II,9-13 (1972).

18. Duda, R.O., Hart, P., Nilsson, N. & Sutherland, G.
"Semantic network representations in rule-based inference
systems", in Pattern Directed Inference Systems (eds.
Waterman and Hayes-Roth), Academic Press, New York, 1978.

19. Engelmore R.S., Nii H.P., A knowledge-based system for the
interpretation of protein x-ray crystallographic data,
Heuristic Programming Project Memo HPP-77-2 (February
1977).

20. Erman L.D., Lesser V.R., A multi-level organization for
problem solving using many, diverse, cooperating sources of
knowledge, in Proceedings of the 4th International Joint
Conference on Artificial Intelligence, Tbilisi, Russia
(1975).

21. Fagan L.M., Ventilator Manager: A program to provide
on-line consultative advice in the intensive care unit,
Heuristic Programming Project Memo HPP-78-16 (Working
Paper), Computer Science Department, Stanford University
(September 1978).

22. Feigenbaum E.A., The art of artificial intelligence: I.
Themes and case studies of knowledge engineering,
Proceedings of the 5th International Joint Conference on
Artificial Intelligence, Cambridge, Mass. (August 1977).

23. Feitelson J., Stefik M., A case study of the reasoning in a
genetics experiment, Heuristic Programming Project Report
77-18 (working paper), Computer Science Department,
Stanford University (April 1977).

24. Friedman, R.B. and Gustafson, D.H. Computers in
clinical medicine: a critical review. Comput. Biomed. Res.
10,199-204 (1977).

25. Fries, J. Time-oriented patient records and a
computer databank. J. Amer. Med. Assoc. 222,1536-1542
(1973).

26. Goldstein, I., Papert, S. Artificial Intelligence,
Language, and the study of knowledge. Cognitive Science 1:1
(1977).

27. Gorry, G.A. and Barnett, G.O. Experience with a model
of sequential diagnosis. Comput. Biomed. Res. 1,490-507
(1968).

28. Gorry, G.A., Kassirer, J.P., Essig, A., and Schwartz,
W.B. Decision analysis as the basis for computer-aided
management of acute renal failure. Amer. J. Med.
55,473-484 (1973).

29. Gorry, G.A., Silverman, H., and Pauker, S.G. Capturing
clinical expertise: a computer program that considers
clinical responses to digitalis. Amer. J. Med. 64,452-460
(1978).

30. Green, B.F., Wolf, A.K., Chomsky, C., and Laughery, K.
BASEBALL: An automatic question-answerer. In Computers and
Thought (eds. E.A. Feigenbaum and J. Feldman), pp. 207-216,
McGraw-Hill, San Francisco, 1962.

31. Harless, W.G., Drennon, G.G., Marxer, J.J., Root, J.A.,
Wilson, L.L., and Miller, G.E. CASE - a natural language
computer model. Comput. Biol. Med. 3,227-246 (1973).

32. Hart, P.E. Progress on a computer-based consultant. AI
Technical Note 99, Stanford Research Institute, Menlo Park,
CA., January 1975.

33. Hayes-Roth F., Lesser V.R., Focus of attention in the
HEARSAY-II speech understanding system, Proceedings of the
5th International Joint Conference on Artificial
Intelligence, Cambridge, Mass. (August 1977).

34. Heiser J.F., Brooks R.E., Ballard J.P., "Progress Report: A
Computerized Psychopharmacology Advisor", Proceedings of
the 11th Collegium Internationale
Neuro-Psychopharmacologicum, Vienna, 1978.

35. Heiser, J.F. and Brooks, R.E. A computerized
psychopharmacology advisor. Proceedings of the 4th Annual
AIM Workshop, Rutgers University, June 1978.

36. Hoffer, E.P. Experience with the use of computer
simulation models in medical education. Comput. Biol. Med.
3,269-279 (1973).

37. Kunz J.C., Fallat R.J., McClung D.H., Osborn J.J., Votteri
B.A., Nii H.P., Aikins J.S., Fagan L.M., Feigenbaum E.A., A
physiological rule based system for interpreting pulmonary
function test results, Heuristic Programming Project Memo
HPP-78-19, Stanford University, 1978.

38. Lenat D.B., The ubiquity of discovery, Artificial
Intelligence 9:3 (1977).

39. Lowerre B.T., The HARPY speech recognition system, Doctoral
thesis, Department of Computer Science, Carnegie-Mellon
University (April 1976).

40. Martin N., Friedland P., King J., Stefik M., Knowledge Base
Management for Experiment Planning, Proceedings of the 5th
International Joint Conference on Artificial Intelligence,
Cambridge, Mass. (August 1977).

41. Mesel, E., Wirtschafter, D.D., Carpenter, J.T., Durant,
J.R., Henke, C., and Gray, E.A. Clinical Algorithms for
Cancer Chemotherapy -- Systems for Community-Based
Consultant-Extenders and Oncology Centers. Meth. Inform.
Med. 15:3, 168-73 (1976).

42. Minsky M., A framework for representing knowledge, in The
Psychology of Computer Vision (ed. P. Winston),
McGraw-Hill, New York (1975).

43. Nii H.P., Feigenbaum E.A., Rule-based understanding of
signals, in Pattern-Directed Inference Systems (eds.
Waterman and Hayes-Roth), Academic Press, New York, 1978.

44. Osborn, J.J., Kunz, J.C., and Fagan, L.M.
PUFF/VM: interpretation of physiological measurements in
the pulmonary function laboratory and the intensive care
unit. Proceedings of the 4th Annual AIM Workshop, Rutgers
University, June 1978.

45. Pauker, S.G., Gorry, G.A., Kassirer, J.P., and Schwartz,
W.B. Towards the simulation of clinical cognition: taking a
present illness by computer. Amer. J. Med. 60,981-996
(1976).

46. Pople, H.E., Myers, J.D., Miller, R.A. DIALOG (INTERNIST):
a model of diagnostic logic for internal medicine.
Proceedings of the 4th International Joint Conference on
Artificial Intelligence, pp. 849-855, Tbilisi, Russia,
1975.

47. Quillian, M.R. Semantic memory. In Semantic
Information Processing (ed. M. Minsky), pp. 227-270, M.I.T.
Press, Cambridge, MA., 1968.

48. Scott, A.C., Clancey, W.J., Davis, R., and Shortliffe,
E.H. Explanation capabilities of knowledge-based production
systems. Amer. J. Computational Linguistics, Microfiche 62,
1977.

49. Shortliffe, E.H. and Buchanan, B.G. A model of inexact
reasoning in medicine. Math. Biosci. 23,351-379 (1975).

50. Shortliffe, E.H., Davis, R., Axline, S.G., Buchanan, B.G.,
Green, C.C., and Cohen, S.N. Computer-based consultations
in clinical therapeutics: explanation and rule-acquisition
capabilities of the MYCIN system. Comput. Biomed. Res.
8, 303-320 (1975).

51. Shortliffe, E.H. Computer-Based Medical Consultations:
MYCIN. Elsevier/North Holland, New York, 1976.

52. Stefik M., An examination of a frame-structured
representation system, Stanford Heuristic Programming
Project Memo HPP-78-13 (working paper) (September 1978).

53. Stefik M., Inferring DNA structures from segmentation
data, Artificial Intelligence 11 (1978).

54. Van Melle, W. Would you like advice on another horn?
MYCIN project internal working paper, Stanford University,
Stanford, California, December 1974.

55. Warner, H.R., Toronto, A.F., and Veasy, L.G. Experience
with Bayes’ theorem for computer diagnosis of congenital
heart disease. Anns. N.Y. Acad. Sci. 115,558-567 (1964).

56. Weinberg, A.D. CAI at the Ohio State University College
of Medicine. Comput. Biol. Med. 3,299-305 (1973).

57. Weiss, S.M., Kulikowski, C.A., and Safir, A. Glaucoma
consultation by computer. Comput. Biol. Med. 8, 25-40
(1978).

58. Weyl, S., Fries, J., Wiederhold, G., and Germano, F. A
modular self-describing clinical databank system. Comput.
Biomed. Res. 8,279-293 (1975).

59. Woods, W.A. et al. The lunar sciences natural
language information system: final report, BBN Report 2378,
Bolt, Beranek and Newman, Cambridge, MA., June 1972.

60. Wooster, H. and Lewis, J.F. Distribution of
computer-assisted instruction materials in biomedicine through the
Lister Hill Center Experimental Network. Comput. Biol.
Med. 3,319-323 (1973).

61. Wortman, P.M. Medical diagnosis: an information
processing approach. Comput. Biomed. Res. 5,315-328
(1972).

62. Yu, V.L., Buchanan, B.G., Shortliffe, E.H., Wraith, S.M.,
Davis, R., Scott, A.C., and Cohen, S.N. Evaluating the
performance of a computer-based consultant. To appear in
Computer Programs in Biomedicine, 1978.

63. Yu, V.L., Fagan, L.M., Wraith, S.M., Clancey, W.J., Scott,
A.C., Hannigan, J., Blum, R.L., Buchanan, B.G., and Cohen,
S.N. Computer-based consultation in antimicrobial selection
- a comparative evaluation by experts. Submitted for
publication, September 1978.

The appropriate programmatic and administrative personnel
of each institution involved in this grant application are
aware of the NIH consortium grant policy and are prepared to
establish the necessary inter-institutional agreement(s)
consistent with that policy.
