Resource Progress

context of MRS. We designed a method for representing temporal
knowledge in ONCOCIN. Finally, Cooper's Ph.D. thesis on representing and
using causal and probabilistic knowledge was published in this year.

[See KSL technical memos KSL-84-9, KSL-84-10, KSL-84-18, KSL-84-31,
KSL-84-41, KSL-85-5.]

2. Advanced Architectures and Control: What kinds of software tools and
system architectures can be constructed to make it easier to implement
expert programs with increasing complexity and high performance? How
can we design flexible control structures for powerful problem solving
programs?

Much of our research in the past year has involved investigations with the
Blackboard architecture begun in previous years. We have implemented our
design in a working system called BB1.

[See KSL technical memos KSL-84-11, KSL-84-12, KSL-84-14, KSL-84-16,
KSL-84-36.]

3. Knowledge Acquisition: How is knowledge acquired most efficiently -- from
human experts, from observed data, from experience, and from discovery?
How can a program discover inconsistencies and incompleteness in its
knowledge base? How can the knowledge base be augmented without
perturbing the established knowledge base?

Three Ph.D. theses (Fu, Greiner, and Dietterich) in the area of knowledge
acquisition were completed in this year. Fu's work develops methods for
learning by induction, where the target rules may have some associated
degrees of uncertainty and may contain names of intermediate concepts.
This work was demonstrated in the context of diagnosing causes of jaundice.
Greiner's work examines learning by analogy. Dietterich’s work elucidates
methods needed in learning programs to deal with state variables and with
problems of using a partially learned theory to interpret new data that will
be used to learn new elements of the theory. In addition, we implemented
the first parts of a program that can learn by watching an expert. And we
implemented a prototype system that learns control heuristics from an expert
using a problem solving program written in BB1.

[Preliminary results have been published in KSL-84-10, KSL-84-18,
KSL-84-24, KSL-84-38, KSL-84-45, KSL-84-46, KSL-85-2, KSL-85-4.]

4. Knowledge Utilization: By what inference methods can many sources of
knowledge of diverse types be made to contribute jointly and efficiently
toward solutions? How can knowledge be used intelligently, especially in
systems with large knowledge bases, so that it is applied in an appropriate
manner at the appropriate time?

We completed the design of a system using Dempster's rule of propagating
uncertainty, and we examined several other issues regarding the use of
probabilistic information in expert systems. Dr. Jean Gordon, a
mathematician and Stanford medical student, collaborated with Dr. Shortliffe
on work that examines inexact inference using the Dempster-Shafer theory
of evidence, demonstrating its relevance to a familiar expert system domain,
namely the bacterial organism identification problem that lies at the heart
of the MYCIN system, and presenting a new adaptation of the D-S approach
that is computationally efficient and permits the management of
evidential reasoning within an abstraction hierarchy.
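
As a rough illustration of the evidence-pooling step that Dempster's rule provides, the following minimal Python sketch combines two mass functions defined over subsets of a hypothesis set. It shows only the textbook combination rule, not the hierarchical adaptation developed by Gordon and Shortliffe. The function names (combine, belief), the organism labels, and the numeric masses are hypothetical and chosen purely for illustration.

```python
from itertools import product


def combine(m1, m2):
    """Combine two mass functions with Dempster's rule.

    Each mass function maps a frozenset of hypotheses (a focal element)
    to its mass; the masses in each function are assumed to sum to 1.
    """
    combined = {}
    conflict = 0.0
    for (a, mass_a), (b, mass_b) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence: the product of masses goes to the intersection.
            combined[common] = combined.get(common, 0.0) + mass_a * mass_b
        else:
            # Disjoint focal elements contribute to the conflict mass.
            conflict += mass_a * mass_b
    if conflict >= 1.0:
        raise ValueError("the two bodies of evidence are completely contradictory")
    scale = 1.0 - conflict
    # Renormalize so that the remaining masses sum to 1.
    return {subset: mass / scale for subset, mass in combined.items()}


def belief(m, hypothesis):
    """Total mass committed to subsets of the given hypothesis set."""
    return sum(mass for subset, mass in m.items() if subset <= hypothesis)


# Hypothetical frame of discernment: three candidate organisms.
FRAME = frozenset({"staph", "strep", "e_coli"})
GRAM_POSITIVE = frozenset({"staph", "strep"})

# One piece of evidence supports the gram-positive subset,
# another supports a single organism (illustrative numbers only).
stain_evidence = {GRAM_POSITIVE: 0.6, FRAME: 0.4}
morphology_evidence = {frozenset({"staph"}): 0.7, FRAME: 0.3}

pooled = combine(stain_evidence, morphology_evidence)
```

With these illustrative inputs there is no conflict, so the pooled masses are about 0.70 for the single organism, 0.18 for the gram-positive subset, and 0.12 left on the whole frame; belief(pooled, GRAM_POSITIVE) is then about 0.88. Evidence attached to a subset (here, a class of organisms) is what makes the approach a natural fit for reasoning within an abstraction hierarchy.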

We examined the use of counter-factual conditionals in logic-based systems
and completed an analysis of how procedural hints can be used by a
problem solver.

Privileged Communication 101 E. H. Shortliffe

[See KSL technical memos KSL-84-11, KSL-84-17, KSL-84-21, KSL-84-30,
KSL-84-31, KSL-84-35, KSL-84-41, KSL-84-42, KSL-84-43.]

5. Software Tools: How can specific programs that solve specific problems be
generalized to more widely useful tools to aid in the development of other
programs of the same class?

We have continued the development of new software tools for expert system
construction and the distribution of packages that are reliable enough and
documented so that other laboratories can use them. These include the old
rule-based EMYCIN system, MRS, and AGE. Progress has been made in
making the BB1 instantiation of the blackboard architecture domain-
independent. We have begun constructing and editing subsystems and have
completed a first implementation of an explanation subsystem.

[See KSL technical memos KSL-84-16, KSL-84-39.]

6. Explanation and Tutoring: How can the knowledge base and the line of
reasoning used in solving a particular problem be explained to users? What
constitutes a sufficient or an acceptable explanation for different classes of
users? How can knowledge in a system be transferred effectively to students
and trainees?

A program for inferring a model of users was designed and implemented in
the context of a tutoring system that aids in teaching algebra. A second
user-modelling program was implemented in the context of NEOMYCIN to
help understand how an expert solves problems. A survey of explanation
capabilities in medical consultation programs was published.

A new project on knowledge-based explanations in a decision analysis
environment is getting underway as the thesis research of Dr. Glenn
Rennels. This work is actually a synthesis of artificial intelligence, decision
analysis and statistics. The work concerns medical management, not
diagnosis; diagnostic decisions identify underlying mechanisms of the illness,
and group the patient's problems under a diagnostic label, whereas
management decisions plan actions that will prevent undesirable outcomes
and restore health. The intelligent behavior we want to emulate is (a) the
identification of studies relevant to a given clinical case, and (b)
interpretation of those studies for decision-making assistance.

[See KSL technical memos KSL-84-12, KSL-84-27, KSL-84-29.]

7. Planning and Design: What are reasonable and effective methods for
planning and design? How can symbolic knowledge be coupled with
numerical constraints? How are constraints propagated in design problems?

A major paper on skeletal planning was published in this year. And we
published in the biochemistry literature some results of applying skeletal
planning to experiment design in genetic engineering.

[See KSL technical memos KSL-84-33, KSL-85-6.]

8. Diagnosis: How can we build a diagnostic system that reflects any of
several diagnostic strategies? How can we use knowledge at different levels
of abstraction in the diagnostic process?

Research on using causal models in a medical decision support system
(NESTOR) was published in this year. Using the domain of hypercalcemic
disorders, NESTOR attempts to use knowledge-based methods within a
formal probability theory framework. The system is able to score
hypotheses with causal knowledge guiding the application of sparse
probabilistic knowledge; search for the most likely hypothesis without
exploring the entire hypothesis space; and critique and compare hypotheses
which are generated by the system, volunteered by the user, or both.

A second medical diagnosis program that uses causal models of renal
physiology (AI/MM) was also published. In this system, analysis and
explanation of physiological function is based on two kinds of causal
relations: empirical "Type-1" relations based on definitions or on repeated
observation, and mathematical "Type-2" relations that have a basis in
physical law. Inference rules are proposed for making valid qualitative
causal arguments with both kinds of causal basis.

A working implementation of the PATHFINDER system was evaluated and
its diagnostic strategies were analyzed. A taxonomy of diagnostic methods
was completed and integrated into the NEOMYCIN system.

[See KSL technical reports: KSL-84-13, KSL-84-19, KSL-84-48, KSL-85-5.]

Relevant Core Research Publications

HPP 84-9    David H. Hickam, Edward H. Shortliffe, Miriam B. Bischoff,
            A. Carlisle Scott, and Charlotte D. Jacobs; Evaluations of the
            ONCOCIN System: A Computer-Based Treatment Consultant for
            Clinical Oncology, (1) The Quality of Computer-Generated Advice
            and (2) Improvements in the Quality of Data Management, May
            1984.

HPP 84-10   Thomas G. Dietterich; Learning About Systems That Contain State
            Variables, June 1984. In Proceedings of AAAI-84, August 1984.

HPP 84-11   M. Genesereth, and D.E. Smith; Procedural Hints in the Control of
            Reasoning, May 1984.

HPP 84-12   Derek H. Sleeman; UMFE: A User Modelling Front End Subsystem,
            April 1984.

HPP 84-13   Eric J. Horvitz, David E. Heckerman, Bharat N. Nathwani, and
            Lawrence M. Fagan; Diagnostic Strategies in the Hypothesis-Directed
            PATHFINDER System, June 1984, submitted to the First Conference
            on Artificial Intelligence Applications, Denver, CO., December 5-7,
            1984.

HPP 84-14   Vineet Singh, and M. Genesereth; A Variable Supply Model for
            Distributing Deductions, May 1984.

HPP 84-15   Bruce G. Buchanan; Expert Systems, July 1984, Journal of Automated
            Reasoning, Vol. 1, No. 1, Fall, 1984.

HPP 84-16   STAN-CS-84-1034 Barbara Hayes-Roth; BB-1: An Architecture for
            Blackboard Systems That Control, Explain, and Learn About Their
            Own Behavior, December 1984.

HPP 84-17   M.L. Ginsberg; Analyzing Incomplete Information, 1984.

HPP 84-18   William J. Clancey; Knowledge Acquisition for Classification Expert
            Systems, July 1984, Proceedings of ACM-84, 1984.

HPP 84-19   E.H. Shortliffe; Coming to Terms With the Computer, to appear in
            S.R. Reiser, and M. Anbar (eds.), The Machine at the Bedside:
            Strategies for Using Technology in Patient Care, Cambridge
            University Press, 1984.


HPP 84-20   E.H. Shortliffe; Artificial Intelligence and the Future of Medical
            Computing, in Proceedings of a Symposium on Computers in
            Medicine, annual meeting of the California Medical Association,
            Anaheim, CA., February 1984.

HPP 84-21   E.H. Shortliffe; Reasoning Methods in Medical Consultation Systems:
            Artificial Intelligence Approaches (Tutorial), in Computer Programs
            in Biomedicine, January 1984.

HPP 84-22   ONCOCIN Project: Studies to Evaluate the ONCOCIN System; 6
            Abstracts, February 1984.

HPP 84-23   Edward H. Shortliffe; Feature Interview: On the MYCIN Expert
            System, in Computer Compacts, 1:283-289, December 1983/January
            1984.

HPP 84-24   B.G. Buchanan, and E.H. Shortliffe; Rule-Based Expert Systems: The
            MYCIN Experiments of the Stanford Heuristic Programming Project,
            published with Addison-Wesley, Reading, MA., 1984.

HPP 84-25   W.J. Clancey, and E.H. Shortliffe; Readings in Medical Artificial
            Intelligence: The First Decade, published with Addison-Wesley,
            Reading, MA., 1984.

HPP 84-27   Edward H. Shortliffe; Explanation Capabilities for Medical
            Consultation Systems (Tutorial), in D. Lindberg, and M. Collen
            (eds.), Proceedings of AAMSI Congress 84, pp. 193-197, San
            Francisco, May 21-23, 1984.

HPP 84-28   E.H. Shortliffe, and L.M. Fagan; Artificial Intelligence: The Expert
            Systems Approach to Medical Consultation, in Proceedings of the 6th
            Annual International Symposium on Computers in Critical Care and
            Pulmonary Medicine, Heidelberg, Germany, June 4-7, 1984.

HPP 84-29   David C. Wilkins, Bruce G. Buchanan, and William J. Clancey;
            Inferring an Expert's Reasoning by Watching, Proceedings of the
            1984 Conference on Intelligent Systems and Machines, 1984.

HPP 84-30   M.L. Ginsberg; Non-Monotonic Reasoning Using Dempster's Rule,
            June 1984.

HPP 84-31   M.L. Ginsberg; Implementing Probabilistic Reasoning, June 1984.

HPP 84-32   Bruce G. Buchanan; Artificial Intelligence: Toward Machines That
            Think, July 1984, in Yearbook of Science and the Future, pp.
            96-112, Encyclopedia Britannica, Inc., Chicago, 1985.

HPP 84-33   Rene Bach, Yumi Iwasaki, and Peter Friedland; Intelligent
            Computational Assistance for Experiment Design, in Nucleic Acids
            Research, January 1984.

MCS Thesis  Kunz, John C.; Use of Artificial Intelligence and Simple
            Mathematics to Analyze a Physiological Model, Doctoral dissertation,
            Medical Information Sciences, June 1984.

HPP 84-35   Jean Gordon, and Edward Shortliffe; A Method for Managing
            Evidential Reasoning in a Hierarchical Hypothesis Space, September
            1984 and in Artificial Intelligence, 26(3), July 1985.

HPP 84-36   Michael R. Genesereth, Matt Ginsberg, and Jeff S. Rosenschein;
            Cooperation Without Communication, September 1984.

HPP 84-38   Li-Min Fu, and Bruce G. Buchanan; Enhancing Performance of
            Expert Systems by Automated Discovery of Meta-Rules, September 6,
            1984.

HPP 84-39   Paul S. Rosenbloom, John E. Laird, John McDermott, Allen Newell,
            and Edmund Orciuch; R1-Soar: An Experiment in Knowledge-Intensive
            Programming in a Problem-Solving Architecture, to appear
            in the Proceedings of the IEEE Workshop on Principles of
            Knowledge-Based Systems, October 1984.

HPP 84-41   STAN-CS-84-1032 Michael R. Genesereth, Matthew L. Ginsberg, and
            Jeffrey S. Rosenschein; Solving the Prisoner's Dilemma, November
            1984.

HPP 84-42   Matthew L. Ginsberg; Does Probability Have a Place in
            Non-Monotonic Reasoning? submitted to the IJCAI-85, November 1984.

HPP 84-43   STAN-CS-84-1029 Matthew L. Ginsberg; Counterfactuals, submitted
            to the IJCAI-85, December 1984.

HPP 84-45   Devika Subramanian, and Michael R. Genesereth; Experiment
            Generation with Version Spaces, December 1984.

HPP 84-46   Thomas G. Dietterich; Constraint Propagation Techniques for
            Theory-Driven Data Interpretation, PhD Thesis, to be published as
            a book by Kluwer, December 1984.

HPP 84-48   STAN-CS-84-1031 Gregory F. Cooper; NESTOR: A Computer-Based
            Medical Diagnostic Aid That Integrates Causal and Probabilistic
            Knowledge, PhD Thesis, December 20, 1984.

KSL 85-2    STAN-CS-85-1036 Barbara Hayes-Roth, and Michael Hewett;
            Learning Control Heuristics in BB1, submitted to the IJCAI-85,
            January 1985.

KSL 85-4    (Needs Authors Permission) Li-Min Fu, and Bruce G. Buchanan;
            Learning Intermediate Knowledge in Constructing a Hierarchical
            Knowledge Base, submitted to the IJCAI Conference Proceedings for
            1985, January 1985.

KSL 85-5    (Needs Authors Permission) William J. Clancey; Heuristic
            Classification, March 1985.

KSL 85-6    Peter E. Friedland, and Yumi Iwasaki; The Concept and
            Implementation of Skeletal Plans, published in the Journal of
            Automated Reasoning, 1985.

KSL 85-7    Rene Bach, Yumi Iwasaki, and Peter Friedland; Intelligent
            Computational Assistance for Experiment Design, published in
            Nucleic Acids Research, 1985.

KSL 85-8    (Needs Authors Permission) M.G. Kahn, J. Ferguson, E.H. Shortliffe,
            and L. Fagan; An Approach for Structuring Temporal Information in
            the ONCOCIN System, March 1985.

Summary of Core Research Funding Support

We are pursuing a broad core research program on basic AI research issues with support
from not only SUMEX but also DARPA, NASA, NSF, and ONR. SUMEX provides
some salary support for staff and students involved in core research and invaluable
computing support for most of these efforts. Additional salary support comes from the
sources shown starting on page 36.

Interactions with the SUMEX-AIM Resource

Our interactions with the SUMEX-AIM resource involve the facilities -- both hardware
and software -- and the staff -- both technical and administrative. Taken together as a
whole resource, they constitute an essential part of the research structure for the KSL.
Many of the grants and contracts from other agencies have been awarded partly because
of the cost-effectiveness of AI research in the KSL, since many of our
computing needs could be more than adequately met by the SUMEX-AIM resource. In
this way the complementary funding of this work by the NIH and other agencies
provides high leverage for incremental investment in AI research at the SUMEX-AIM
resource.

We rely on the central SUMEX facility as a focal point for all the research within the
KSL, not only for much of our computing, but for communications and links to our
many collaborators as well. As a common communications medium alone, it has
significantly enhanced the nature of our work and the reach of our collaborations. The
existence of the central time-shared facility has allowed us to explore new ideas at very
small incremental cost.

As SUMEX and the KSL acquire a diversity of hardware, including LISP workstations
and smaller personal computers, we rely more and more heavily on the SUMEX staff
for integration of these new resources into the local network system. The staff has
been extremely helpful and effective in dealing with the myriad of complex technical
issues and leading us competently into this world of decentralized, diversified
computing. At the same time, the staff has provided a stable, efficient central time-
shared machine running software that has been developed at many sites over many
years. Without the dedication of the SUMEX staff, the KSL would not be at the
forefront of AI research.


2.1.4.6. Dissemination Activities

Throughout the history of the SUMEX-AIM resource, we have made extensive efforts at
disseminating the AI technology developed here. This has taken the form of many
publications -- over 45 combined books and papers are published per year from the
KSL; wide distribution of our software including systems software and AI application
and tool software, both to other research laboratories and for commercial development;
production of films and video tapes depicting aspects of our work; and significant
project efforts at studying the dissemination of individual applications systems such as
the GENET community (DNA sequence analysis software) and the ONCOCIN resource-
related research project (see 209).

Books and Publications

A sampling of the recent research paper publications of the KSL was given in the
previous section on core AI research progress. The following lists the major books
published in the past 4 years from the KSL:

* Heuristic Reasoning about Uncertainty: An AI Approach, Cohen, Pitman,
  1985.

* Readings in Medical Artificial Intelligence: The First Decade, Clancey and
  Shortliffe, Addison-Wesley, 1984.

* Rule-Based Expert Systems: The MYCIN Experiments of the Stanford
  Heuristic Programming Project, Buchanan and Shortliffe, Addison-Wesley,
  1984.

* The Fifth Generation: Artificial Intelligence and Japan's Computer
  Challenge to the World, Feigenbaum and McCorduck, Addison-Wesley, 1983.

* Building Expert Systems, F. Hayes-Roth, Waterman, and Lenat, eds.,
  Addison-Wesley, 1983.

* System Aids in Constructing Consultation Programs: EMYCIN, van Melle,
  UMI Research Press, 1982.

* Knowledge-Based Systems in Artificial Intelligence: AM and TEIRESIAS,
  Davis and Lenat, McGraw-Hill, 1982.

* The Handbook of Artificial Intelligence, Volume I, Barr and Feigenbaum,
  eds., 1981; Volume II, Barr and Feigenbaum, eds., 1982; Volume III, Cohen
  and Feigenbaum, eds., 1982; Kaufmann.

* Applications of Artificial Intelligence for Organic Chemistry: The
  DENDRAL Project, Lindsay, Buchanan, Feigenbaum, and Lederberg,
  McGraw-Hill, 1980.

Software Distribution

We have widely distributed both our system software and our AI tool software. We
have no accurate records of the extent of distribution of the system codes because their
distribution is not centralized and controlled. The recent programs such as the
TOPS-20 file recognition enhancements, the Ethernet gateway and TIP programs, the
SEAGATE AppleBus to Ethernet gateway, the PUP Leaf server, the SUMACC
development system for Macintosh workstations, and our Lisp workstation programs are
well-distributed throughout the ARPANET community and beyond.


We do have reasonably accurate records of the distribution of our AI tool software
because the recipient community is more directly coupled to us and the distribution is
centralized:

GENET           Prior to the establishment of the BIONET resource at IntelliCorp, we
                distributed 21 copies of the DNA sequence analysis programs and
                databases for both DEC-10 and DEC-20 systems.

EMYCIN          A total of 56 sites have received the EMYCIN [6, 68] package for
                backward-chained, rule-based AI systems.

AGE             The AGE [54] blackboard framework system has been sent out to 35
                sites in versions for several machines.

MRS             The MRS [20] logic-based system for meta-level representation and
                reasoning has been provided to 76 sites.

Other Programs  Smaller numbers of copies of programs such as the SACON [2]
                knowledge base for EMYCIN, the GLISP [57] system (now
                distributed by Gordon Novak at the University of Texas), and the
                new BB1 [28, 27] system have been distributed.

A number of other software packages have been licensed or otherwise made available
for commercial development including DENDRAL (Molecular Designs), MAINSAIL
(Xidak), UNITS (IntelliCorp), and EMYCIN (Teknowledge and Texas Instruments).

Video Tapes and Films

The KSL and the ONCOCIN project have prepared several video tapes that provide an
overview of the research and research methodologies underlying our work and that
demonstrate the capabilities of particular systems. These tapes are available through our
groups, the Fleischmann Learning Center at the Stanford Medical Center, and the
Stanford Computer Forum and copies have been mailed to program offices of our
various funding sponsors. The three tapes include:

* Knowledge Engineering in the Heuristic Programming Project -- This 20-minute
  film/tape illustrates key ideas in knowledge-based system design and
  implementation, using examples from ONCOCIN, PROTEAN, and
  knowledge-based VLSI design systems. It describes the research environment
  of the KSL and lays out the methodologies of our work and the long term
  research goals that guide it.

* ONCOCIN Overview -- This is a 30-minute tape providing an overview of
  the ONCOCIN project. It gives an historical context for the work, discusses
  the clinical problem and the setting in which the prototype system is being
  used, and outlines the plans for transferring the system to run on single-user
  workstations. Brief illustrations of the graphics capabilities of ONCOCIN
  on a Lisp workstation are also provided.

* ONCOCIN Demonstration -- This 1-hour tape provides detailed examples of
  the key components of the ONCOCIN system. It begins with a
  demonstration of the prototype system's performance on a time-shared
  mainframe computer and then shows each of the elements involved in
  transferring the system to Lisp workstations.


The GENET Dissemination Experiment

Beginning in early 1980, the MOLGEN project investigators at Stanford have made a
new set of computing tools available to a national community of molecular biologists
through a guest facility called GENET on the SUMEX-AIM resource. This
experimental subcommunity was started to broaden MOLGEN’s base of scientist
collaborators at institutions other than Stanford and to explore the idea of a SUMEX-
like resource to disseminate sophisticated software tools to a generally computer-naive
community. The enthusiastic response to the very limited announcement of this facility
eventually necessitated SUMEX placing severe restrictions on the scope of services
provided to this community.

Three main programs were offered to assist molecular genetics users: SEQ, a DNA-RNA
sequence analysis program; MAP, a program that assists in the construction of
restriction maps from restriction enzyme digest data; and MAPPER, a simplified and
somewhat more efficient version of the MOLGEN MAP program, written and
maintained by William Pearson of Johns Hopkins University. Some of the other,
more-sophisticated programs being developed through MOLGEN research efforts were
not yet available for novice users. However, GENET users had access to the SUMEX-
AIM programs for electronic messaging, text-editing, file-searching, etc.

The GENET experiment proved so successful that eventually that community was the
single biggest consumer of processor cycles on SUMEX. This overload diverted our
very limited computing resources away from our mainline goal of supporting projects
developing new AI systems in the medical and biological sciences, including molecular
biology. Efforts to secure funds to increase SUMEX capacity for the burgeoning
GENET use failed. Thus, without any fair way to allocate a small resource to the
growing GENET community and in order to restore the necessary emphasis on
biomedical computer science research on SUMEX, it was necessary to phase out the
GENET usage. We closed the GENET account at the end of 1982, with a mandate
from an ad hoc GENET Executive Committee, and phased out all usage by spring of
1983. In the process, we developed procedures by which academic users could obtain
their own copies of the GENET programs used at SUMEX and we provided a list of
alternate sources for GENET-like computing services. As indicated above, SUMEX has
supplied 21 systems to academic users with compatible machines.

Since the phase-out of GENET at SUMEX, IntelliCorp, a commercial AI company,
submitted a proposal to the NIH Division of Research Resources for a BIONET
resource and was successful in obtaining funding. The BIONET resource began
operation in the summer of 1984.


2.1.4.7. Training Activities

The SUMEX resource exists to facilitate biomedical artificial intelligence applications
from program development through testing in the target research communities. This
user orientation on the part of the facility and staff has been a unique feature of our
resource and is responsible in large part for our success in community building. The
resource staff has spent significant effort helping users gain access to the system
and use it effectively. We have also spent substantial effort to develop, maintain, and
facilitate access to documentation and interactive help facilities. The HELP and
Bulletin Board subsystems have been important in this effort to help users get familiar
with the computing environment.

On another front, we have regularly accepted a number of scientific visitors for periods
of several months to a year, to work with us to learn the techniques of expert system
definition and building and to collaborate with us on specific projects. Our ability to
accommodate such visitors is severely limited by the space, computing, and manpower
resources available to support them within the demands of our ongoing research.

And finally, the training of graduate students is an essential part of the research and
educational activities of the KSL. Currently 41 students are working with our projects
centered in Computer Science and another 20 students are working with the Medical
Computer Science program in Medicine. Of the 41 working in Computer Science, 25
are working toward Ph.D. degrees, and 16 are working toward M.S. degrees. A number
of students are pursuing interdisciplinary programs and come from the Departments of
Engineering, Mathematics, Education, and Medicine.

Based on the SUMEX-AIM community environment, we have initiated two special
academic degree programs at Stanford, the Medical Information Sciences program
and the Master of Science in AI, to increase the number of students we produce for
research and industry who are knowledgeable about knowledge-based system techniques.

The Medical Information Sciences (MIS) program is one of the most obvious signs of
the local academic impact of the SUMEX-AIM resource. The MIS program received
recent University approval (in October 1982) as an innovative training program that
offers MS and PhD degrees to individuals with a career commitment to applying
computers and decision sciences in the field of medicine. The MIS training program is
based in the School of Medicine, directed by Dr. Shortliffe, co-directed by Dr. Fagan, and
overseen by a group of nine University faculty that includes several faculty from the
Knowledge Systems Laboratory (Profs. Shortliffe, Feigenbaum, Buchanan, and
Genesereth). It was Stanford's active ongoing research in medical computer science,
plus a world-wide reputation for the excellence and rigor of those research efforts, that
persuaded the University that the field warranted a new academic degree program in the
area. A group of faculty from the medical school and the computer science department
argued that research in medical computing has historically been constrained by a lack
of talented individuals who have a solid footing in both the medical and computer
science fields. The specialized curriculum offered by the new program is intended to
overcome the limitations of previous training options. It focusses on the development
of a new generation of researchers with a commitment to developing new knowledge
about optimal methods for developing practical computer-based solutions to biomedical
needs.

The program accepted its first class of four trainees in the summer of 1983 and a
second class of five entered last summer. A third group of seven students has just been
selected to begin during 1985. The proposed steady state size for the program (which
should be reached in 1986) is 20-22 trainees. Applicants to the program in our first
two years have come from a number of backgrounds (including seven MD's and five
medical students). We do not wish to provide too narrow a definition of what kinds of
prior training are pertinent because of the interdisciplinary nature of the field. The
program has accordingly encouraged applications from any of the following:

* medical students who wish to combine MD training with formal degree work
  and research experience in MIS;

* physicians who wish to obtain formal MIS training after their MD or their
  residency, perhaps in conjunction with a clinical fellowship at Stanford
  Medical Center;

* recent BA or BS graduates who have decided on a career applying computer
  science in the medical world;

* current Stanford undergraduates who wish to extend their Stanford training
  an extra year in order to obtain a "co-terminus" MS in the MIS program;

* recent PhD graduates who wish post-doctoral training, perhaps with the
  formal MS credential, to complement their primary field of training.

In addition, a special one-year MS program is available for established academic
medical researchers who may wish to augment their computing and statistical skills
during a sabbatical break.

With the exception of this latter group, all students spend a minimum of two years at
Stanford (four years for PhD students) and are expected to undertake significant
research projects for either degree. Research opportunities abound, however, and they
of course include the several Stanford AIM projects as well as research in psychological
and formal statistical approaches to medical decision making, applied instrumentation,
large medical databases, and a variety of other applications projects at the medical
center and on the main campus. Several students are already contributing in major
ways to the AIM projects and core research described in this application.

Early evidence suggests that the program already has an excellent reputation due to:

* high quality students, many of whom are beginning to publish their work in
  conference proceedings and refereed journals;

* a rigorous curriculum that includes newly-developed course offerings that are
  available to the University's medical students, undergraduates, and computer
  science students as well as to the program's trainees;

* excellent computing facilities combined with ample and diverse opportunities
  for medical computer science and medical decision science research;

* the program's great potential for a beneficial impact upon health care
  delivery in the highly technologic but cost-sensitive era that lies ahead.

The program has been successful in raising financial and equipment support (almost
$1M in hardware gifts from Hewlett Packard, Xerox, and Texas Instruments; over $200K
in cash donations from corporations and foundations; and an NIH post-doctoral
training grant from the National Library of Medicine).

The Master of Science in Computer Science: Artificial Intelligence (MS:AI) program
is a terminal professional degree offered for students who wish to develop a competence
in the design of substantial knowledge-based AI applications but who do not intend to
obtain a Ph.D. degree. The MS:AI program is administered by the Committee for
Applied Artificial Intelligence, composed of faculty and research staff of the Computer
Science Department. Normally, students spend two years in the program with their
time divided equally between course work and research. In the first year, the emphasis
is on acquiring fundamental concepts and tools through course work and project
involvement. During the second year, students implement and document a substantial
AI application project.

2.1.4.8. Resource Community Management

Early in the design of the SUMEX-AIM resource, an effective management plan was
worked out with the Biotechnology Resources Program (now Biomedical Research
Technology Program) at NIH to assure fair administration of the resource for both
Stanford and national users and to provide a framework for recruitment and
development of a scientifically meritorious community of application projects. This
structure is described in some detail in Section 2.3.3 on page 181 of the renewal plan.
It has continued to function effectively as summarized below.

• The AIM Executive Committee meets regularly by teleconference to advise
on new project applications, discuss resource management policies, plan
workshop activities, and conduct other community business. The Advisory
Group meets together at the annual AIM workshop to discuss general
resource business, and individual members are contacted much more
frequently to review project applications. (See Appendix C on page 307 for
a current listing of AIM committee membership.)

• We have actively recruited new application projects and disseminated
information about the resource. The number of formal projects in the
SUMEX-AIM community still runs at the capacity of our computing
resources. With the development of more decentralized computing resources
within the AIM community outside of Stanford (see below), the center of
mass of our community has naturally shifted toward the growing number of
Stanford applications and core research projects. We still, however, actively
support new applications in the national community where these are not
able to gain access to suitable computing resources on their own.

• With the advice of the Executive Committee, we have awarded pilot project
status to promising new application projects and investigators and, where
appropriate, offered guidance for the more effective formulation of research
plans and for the establishment of research collaborations between
biomedical and computer science investigators.

• We have allocated limited "collaborative linkage" funds as an aid to new
projects or collaborators with existing projects to support terminals,
communications costs, and other justified expenses to establish effective
links to the SUMEX-AIM resource. Executive Committee advice is used to
guide allocation of these funds.

• We have carefully reviewed on-going projects with our management
committees to maintain a high scientific quality and relevance to our
biomedical AI goals and to maximize the resources available for newly
developing applications projects. As a result, several fully authorized and
pilot projects have been encouraged to develop their own computing
resources separate from SUMEX or have been phased off of SUMEX, and
more productive collaborative ties have been established for others.

• We have continued to provide active support for the AIM workshops. The
last one was held at Ohio State University in the summer of 1984 and the
next one will be in Washington, DC, hosted by the National Library of
Medicine under Drs. Lindberg and Kingsland.

• We have continued our policy of no fee-for-service for projects using the
SUMEX resource. This policy has effectively eliminated the serious
administrative barriers that would have blocked our research goals of
broader scientific collaborations and interchange on a national scale within
the selected AIM community. In turn we have responded to the
correspondingly greater responsibilities for careful selection of community
projects of the highest scientific merit.

• We have tailored resource policies to aid users whenever possible within our
research mandate and available facilities. Our approaches to system
scheduling, overload control, file space management, etc. all attempt to give
users the greatest latitude possible to pursue their research goals consistent
with fairly meeting our responsibilities in administering SUMEX as a
national resource.

As indicated above, we have sought to retain SUMEX resources for new projects, those
exploring new areas in biomedical AI applications and those in such an early state of
feasibility that they are unable to afford their own computing resources. This policy
has worked effectively as seen from the following lists of terminated projects and
projects now using their own computing resources at other sites:

Projects Moved All or In Part to Other Machines:
Stanford Projects:
• GENET [Brutlag, Kedes, Friedland - IntelliCorp]
National Projects:
• Acquisition of Cognitive Procedures (ACT) [Anderson - CMU]
• Chemical Synthesis [Wipke - UC Santa Cruz]
• Simulation of Cognitive Processes [Lesgold - Pittsburgh]
• PUFF [Osborne, Feigenbaum, Fagan - Pacific Medical Center]
• CADUCEUS/INTERNIST [Pople, Myers - Pittsburgh]
• Rutgers [Amarel, Kulikowski, Weiss - Rutgers]
• MDX [Chandrasekaran - Ohio State]
• SOLVER [P. Johnson - University of Minnesota]

Completed Projects Summary
Stanford Projects:
• DENDRAL [Lederberg, Djerassi, Buchanan, Feigenbaum]
• MYCIN [Shortliffe, Buchanan]
• EMYCIN [Shortliffe, Buchanan]
• CRYSALIS [Feigenbaum, Engelmore]
• MOLGEN I [Feigenbaum, Brutlag, Kedes, Friedland]
• AI Handbook [Feigenbaum, Barr, Cohen]
• AGE Development [Feigenbaum, Nii]
National Projects:
• Ventilator Management [Osborne, Feigenbaum, Fagan - Pacific Medical
Center]
• Higher Mental Functions [Colby - USC]

2.2. Planned Resource Activities

We have already summarized the overall aims of the SUMEX-AIM resource for the
proposed 5-year renewal period on page 64. This section gives details of our research
plans in pursuit of those aims for the major areas of our resource activities -- core
research and development, collaborative research, service, training and education, and
dissemination. To recap the overall scope and guiding goals of our new work:

• SUMEX-AIM is a national computing resource that develops and provides
advanced computing facilities and expertise to support 1) a long-term
program in basic research in artificial intelligence, 2) applying AI techniques
to a broad range of biomedical problems by collaborative and user projects
at Stanford and other universities around the country, 3) studying and
developing methodologies for disseminating AI systems into the biomedical
community, 4) experimenting with communication technologies to promote
scientific interchange, and 5) developing better tools and facilities to carry
on this research.

• Our applications, core research, and system development will be directed
toward realizing and exploiting the computing environment that will be
routinely available in the late 1980's and early 1990's, based on compact,
decentralized, high-performance personal workstations that take advantage of
the intelligent computing environments beginning to emerge from today's
Lisp workstations. Consistent with these plans, we will immediately
discontinue DRR subsidy for the DEC 2020 demonstration machine and for
the shared VAX 11/780 time-sharing system. Also we will gradually and
responsibly phase out DRR support for the DEC 2060 mainframe system
that has been our chief shared resource and link to the past.

• There are consistent threads through our applications, system dissemination,
core research, and computing environment development work. These threads
are that our research work at all levels is driven by the real-world scientific
applications that we undertake; that we choose applications that have a high
impact on current medical and biological problems and that expose key
underlying AI research issues; and that we seek to maximize the availability
of the facilities for and results of this work in the biomedical community.
This is seen, for example, in the coupling between our core research and
development work and applications projects such as ONCOCIN and
PROTEAN.

• We must continue to provide the computing resources for the growing
Stanford biomedical AI research community and the national projects still
dependent on us, to emphasize nurturing newly started AI applications, to
serve as a communications cross-roads for the large and diverse AIM
community, and to ensure broad dissemination of our research results and
methods.

2.2.1. Core Research and Development

Reasoning in medicine and the biological sciences is knowledge-intensive. A recent
article in Science [12], for example, discusses the role of information in the search for
a cure for cancer. As the rate of explosion of knowledge continues to increase,
clinicians and biomedical scientists must turn to computers for help in managing the
information, and applying it to complex situations.

Artificial intelligence methods are particularly appropriate for aiding in the
management and application of knowledge because they apply to information
represented symbolically, as well as numerically, and to reasoning with judgmental rules
as well as logical ones. They have been focused on medical and biological problems for
over a decade with considerable success. This is because, of all the computing methods
known, AI methods are the only ones that deal explicitly with symbolic information
and problem solving and with knowledge that is heuristic (experiential) as well as
factual.

Expert systems are one important class of applications of AI to complex problems
-- in medicine, science, engineering, and elsewhere. Expert systems draw on the current
stock of ideas in AI, for example, about representing and using knowledge. They are
adequate for capturing problem-solving expertise for many bounded problem areas. But
the current ideas fall short in many ways, necessitating extensive further basic research
efforts. Our core research goals are to analyze the limitations of current techniques, to
investigate the nature of methods for overcoming them, and to develop tools to build
and disseminate new and more effective biomedical expert systems.

Long-term success of computer-based aids in medicine and biology depends on
improving the programming methods available for representing and using domain
knowledge. That knowledge is inherently complex -- it contains mixtures of symbolic
and numeric facts and relations, many of them uncertain; it contains knowledge at
different levels of abstraction and in seemingly inconsistent frameworks; and it links
examples and exception clauses with rules of thumb as well as with theoretical
principles. Current techniques have been successful only insofar as they severely limit
this complexity. As the applications become more far-reaching, computer programs will
have to deal more effectively with richer expressions and much more voluminous
amounts of knowledge.

2.2.1.1. ONCOCIN-Related Core Research

As mentioned earlier in this application, our research plan for the next five years
includes merging the core research activities of the ONCOCIN project with other basic
research activities coordinated by the SUMEX resource. The ONCOCIN project is now
in its sixth year and has involved approximately 40 research staff and students, some of
whom have worked full time on aspects of the program or its knowledge base. It is
accordingly large and has elements that span a variety of basic and applied research
issues. The project's elements have been summarized in some detail elsewhere in this
application and in the SUMEX annual report.

Since 1983 the Biomedical Research Technology Program, through a resource-related
grant (RR-01631), has supported the effort to convert ONCOCIN to run on
professional workstations (the Xerox 1108 Lisp machine). When that grant terminates
in 1986, ongoing research will include a mixture of applied activities (evaluation of the
workstations in the Stanford clinic and experiments to implement ONCOCIN
workstations in private oncology offices in Northern California) and more basic
activities intended to generalize past ONCOCIN results for the AIM community. We
propose to continue the basic aspects of this work as core research under the SUMEX
grant, and use complementary support for the other aspects of the project from the
National Library of Medicine and, if a pending application for a dissemination
experiment is successful, jointly from the National Center for Health Services Research
and the National Cancer Institute.

In this section we summarize the core research activities that we intend to pursue in the
context of ONCOCIN. They fall into four principal categories: implementation of
ONCOCIN workstations in the Stanford clinic, knowledge acquisition research (OPAL),
research to generalize ONCOCIN for application in clinical trial domains other than
medical oncology (E-ONCOCIN), and research on generalized approaches to strategic
therapy planning (ONYX).

Background on The ONCOCIN Program

From the outset, the ONCOCIN research effort has been directed towards both basic
research in artificial intelligence and the development of a clinically useful consultation
tool. We initially sought to apply techniques developed during our earlier work on the
MYCIN system and to extend those methods to interact with a large database of clinical
information. More recently, however, the system has departed from the uniform
production rule approach of MYCIN in several significant ways (e.g., introduction of
heterogeneous knowledge structures and distributed control processes [50] in the
workstation version of ONCOCIN). Our approach to these problems has been greatly
influenced by the Lisp machine technology to which we were first exposed through the
foresight of SUMEX when it acquired such experimental machines in the early 1980's.

The initial version of ONCOCIN, including its clinical implementation in our cancer
clinic, runs on a time-shared DEC-20 computer and uses a customized video display
terminal installed in our oncology clinic. Since May of 1981, the prototype has been
used on a limited experimental basis by oncology faculty and fellows to obtain advice
on the treatment of patients enrolled in protocols for the treatment of Hodgkin's
disease and non-Hodgkin's lymphoma. In the past year, additional protocols for
adjuvant chemotherapy of breast cancer were added to the system.

We are excited by the promise of this prototype version of ONCOCIN. Formal
evaluation of the system has shown that ONCOCIN does very well in suggesting
therapy, even in cases where complex attenuation or changes in drugs are required [33].
It has also had a significant effect on the completeness with which clinical trial data
are captured and made available for analysis [35]. In addition, we are extremely
encouraged by the effectiveness of the interface program we have devised (the
Interviewer) and the speed with which new users have been able to learn to use the
system.

We believe that our current efforts to adapt the existing prototype for use on
professional workstations will increase ONCOCIN's clinical acceptability. The use of a
dedicated computer featuring high resolution graphics and mouse pointing devices to
obviate typing should make the system even more attractive to busy physicians. As is
described in the ONCOCIN progress report elsewhere in this proposal, we expect to
have two Lisp machine (Xerox 1108) workstations in use in the Stanford oncology
clinic by mid-1986. Thus, ONCOCIN research in that clinic
(knowledge base enhancement, software development in response to user feedback, and
evaluations of the impact and acceptance of the workstation technology) will continue
under the SUMEX umbrella after the merger of the SUMEX and ONCOCIN activities
at the beginning of the next grant period. We should emphasize that, because of the
moderate price of these computers, we look forward to transferring ONCOCIN for use
in small clinics and physicians’ offices. This will offer private physicians up-to-date
decision support for the treatment of cancer patients (a recognized area of need) while
allowing randomized clinical trials (RCTs) in oncology the benefit of greatly expanded
access to appropriate patients. A four-year experiment to install and test ONCOCIN in
private offices has been proposed and is awaiting review and a site visit at this time.

Automated Knowledge Acquisition for RCTs

RCTs are based on rigidly structured therapy plans. Oncology protocols demonstrate
this point nicely. RCT protocols are composed of treatment arms, which in the case
of oncology specify sequences of chemotherapy or radiotherapy. There is an explicit
hierarchy of knowledge elements in these protocols which becomes important for
knowledge acquisition. The hierarchy for a typical cancer chemotherapy protocol is
shown in Fig. 6.

[Diagram: a tree with the Protocol at the root, Arms below it, chemotherapy and
radiotherapy treatments below each arm, and individual Drugs at the leaves.]

Figure 6: Sample Chemotherapy Protocol Hierarchy

ONCOCIN uses a variety of internal representations to store protocol knowledge. For
example, in one arm of a protocol for small cell lung cancer, seven different drugs are
used as part of two chemotherapies in a specific sequence over seven weeks. The
sequence of chemotherapies is repeated five times, making the total duration of
treatment 35 weeks. The names of the chemotherapies are POCC and VAM.
Administering POCC requires that the patient make two separate clinic visits to receive
medication during each treatment cycle. Hence, POCC is divided into two sub-cycles:
POCC-A and POCC-B. After the second complete cycle of POCC, the patient is given
cranial irradiation. The computer representation of this entire complex sequence is:

((POCC 1 A) (POCC 1 B) (VAM 1)
 (POCC 2 A) (POCC 2 B)
 (XRT CRANIAL)
 (VAM 2)
 (POCC 3 A) (POCC 3 B) (VAM 3)
 (POCC 4 A) (POCC 4 B) (VAM 4)
 (POCC 5 A) (POCC 5 B) (VAM 5))
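
As a rough illustration of how such a sequence can be handled as plain data, the sketch below transcribes the listing into Python tuples and runs two simple checks against the description in the text. The helper names (visits_per_therapy, step_before) are hypothetical and are not part of ONCOCIN; the tuple layout is only an assumed stand-in for the internal list structure.

```python
from collections import Counter

# Treatment sequence transcribed from the listing above. Each tuple is
# (therapy, cycle number or site, optional sub-cycle letter).
SCHEDULE = [
    ("POCC", 1, "A"), ("POCC", 1, "B"), ("VAM", 1),
    ("POCC", 2, "A"), ("POCC", 2, "B"),
    ("XRT", "CRANIAL"),
    ("VAM", 2),
    ("POCC", 3, "A"), ("POCC", 3, "B"), ("VAM", 3),
    ("POCC", 4, "A"), ("POCC", 4, "B"), ("VAM", 4),
    ("POCC", 5, "A"), ("POCC", 5, "B"), ("VAM", 5),
]


def visits_per_therapy(schedule):
    """Count how many scheduled steps belong to each therapy name."""
    return Counter(step[0] for step in schedule)


def step_before(schedule, target):
    """Return the step immediately preceding the first occurrence of target."""
    index = schedule.index(target)
    return schedule[index - 1] if index > 0 else None


counts = visits_per_therapy(SCHEDULE)
# Five POCC cycles of two visits each, five VAM visits, one irradiation step.
print(counts["POCC"], counts["VAM"], counts["XRT"])   # 10 5 1
# Cranial irradiation follows the second complete POCC cycle.
print(step_before(SCHEDULE, ("XRT", "CRANIAL")))       # ('POCC', 2, 'B')
```

Keeping the sequence as data rather than as program logic is what makes it easy to extract from a protocol document without oncology expertise.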

This purely procedural knowledge can be extracted from protocol documents fairly
easily; one need not understand oncology. However, much of the important knowledge
in ONCOCIN is more judgmental and is represented in the form of production rules.
ONCOCIN currently uses over 400 rules to determine:

• how to adjust specific drug dosages because of treatment-induced low blood
counts or other adverse (toxic) reactions to therapy

• when to delay treatment or abort a therapy cycle

• how to modify therapy in light of a patient's changing clinical conditions or
response to the protocol

• when to order certain laboratory tests and how to interpret their results.

Note that these issues are generic for all clinical trials, and similar rules could be
written to assist with proper administration of treatment for RCTs in other medical
domains.

An example of one such rule, drawn from the ONCOCIN system, is shown in Fig. 7. It
was developed by examining a formal protocol and then further enhancing and
validating the knowledge through discussions between an oncologist and a knowledge
engineer.

To determine the current attenuated dose for patients with all lymphomas
in CHOP chemotherapy for Cytoxan or Adriamycin:
If:   The blood counts warrant dose attenuation
      The patient did not receive chemotherapy
        before the last radiation therapy
      This is the first cycle after significant radiation
      This is not the first visit after an Abort cycle

Then: Conclude that the current dose is 75% of the standard
      dose further attenuated by either the dose attenuation
      for low WBC or the dose attenuation for low platelets,
      whichever is less.

Figure 7: Sample ONCOCIN Rule, Translated to English from Internal Format
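
To make the arithmetic in the rule's conclusion concrete, here is a minimal sketch in Python. It is not ONCOCIN's internal rule format. It assumes that all four premises must hold together, that each attenuation is expressed as the fraction of the dose still to be given, and that "whichever is less" means the smaller of those two fractions. The names VisitContext and attenuated_dose are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VisitContext:
    """Hypothetical container for the facts tested by the rule's premises."""
    counts_warrant_attenuation: bool
    chemo_before_last_radiation: bool
    first_cycle_after_radiation: bool
    first_visit_after_abort: bool


def attenuated_dose(standard_dose: float,
                    wbc_fraction: float,
                    platelet_fraction: float,
                    visit: VisitContext) -> Optional[float]:
    """Return the attenuated dose, or None when the rule does not apply.

    Assumption: the premises are combined with AND, and each *_fraction is
    the proportion of the dose to give (1.0 means no attenuation).
    """
    premises_hold = (
        visit.counts_warrant_attenuation
        and not visit.chemo_before_last_radiation
        and visit.first_cycle_after_radiation
        and not visit.first_visit_after_abort
    )
    if not premises_hold:
        return None
    # 75% of standard, further reduced by the smaller of the two fractions.
    return 0.75 * standard_dose * min(wbc_fraction, platelet_fraction)


visit = VisitContext(True, False, True, False)
# Illustrative numbers only, not clinical values.
print(attenuated_dose(100.0, 0.5, 0.75, visit))   # 37.5
```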

The knowledge engineer then must convert this rule into a representation
understandable by the computer. The rule format for computer use is generally
unreadable to the clinician who helped to develop the rule in the first place. It is the
translation shown in the figure that is created and reviewed by the clinician. The
knowledge engineer's detailed understanding of the manner in which information is
represented in the computer allows him or her to develop the corresponding machine-
understandable format.

Because the knowledge engineering process is cumbersome and inefficient, we have
recently embarked on work to develop a system, termed OPAL, that acquires new
knowledge of oncology protocols directly from physicians while shielding them from
technical details. As part of our SUMEX core research activities, we will seek to
generalize this approach for application in other medical domains in which RCTs are
commonly used. The knowledge contained in protocols for oncology (and for other
RCTs as well) has already been formalized in the protocol document. The most
fundamental problems of conceptualizing and structuring the domain knowledge should
therefore not be an issue in this work.

Detailed discussions with our oncology experts and review of dozens of
protocol documents make it clear that the knowledge in protocols is both predictable
and constrained by the very nature of oncologic clinical trials. For each concept that
appears in oncology protocols, we can anticipate the general nature of most of its
possible values. For example, we can assume that all drugs will have a dose that can
be represented by an integer. All drugs will have a route--intravenous, intramuscular,
or oral. Our knowledge of the field allows us to determine a priori what possible
choices might be appropriate for most concepts. This has great implications for
automated assistance in knowledge acquisition.
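
The point about predictable value sets can be sketched as a small typed record. The Python below is illustrative only: DrugEntry, Route, and route_menu are assumed names, and the dose value in the usage line is a placeholder, not a clinical figure. It shows how knowing the permitted values in advance lets an entry tool offer a fixed menu and reject anything else.

```python
from dataclasses import dataclass
from enum import Enum


class Route(Enum):
    """The three routes of administration named in the text."""
    INTRAVENOUS = "intravenous"
    INTRAMUSCULAR = "intramuscular"
    ORAL = "oral"


@dataclass(frozen=True)
class DrugEntry:
    """Hypothetical record for one drug as entered from a protocol."""
    name: str
    dose: int      # the text assumes an integer dose
    route: Route

    def __post_init__(self):
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(self.dose, bool) or not isinstance(self.dose, int):
            raise ValueError("dose must be an integer")
        if self.dose <= 0:
            raise ValueError("dose must be positive")
        if not isinstance(self.route, Route):
            raise ValueError("route must be one of the known routes")


def route_menu():
    """Choices a menu could offer for the route blank, in a fixed order."""
    return [route.value for route in Route]


print(route_menu())   # ['intravenous', 'intramuscular', 'oral']
entry = DrugEntry("Cytoxan", 100, Route.INTRAVENOUS)   # placeholder dose
```

Because every legal value is enumerated up front, the entry program never has to interpret free text for these fields.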

We have known for some time that it would be ideal to provide an environment so
that the physicians can themselves enter and manipulate knowledge of an RCT protocol
and related medical knowledge. However, since it is generally unrealistic to teach
collaborators to become programmers or knowledge engineers, we are faced with the
traditional problems of getting a computer to understand the meaning underlying
unstructured phrases or sentences entered by a physician. TEIRESIAS had approached
the problem by cleverly manipulating the context of an interaction with an expert,
thereby simplifying the task of understanding entries [13]. However, problems in
computer-based understanding of natural language (still a major research topic in
artificial intelligence) prevented TEIRESIAS from becoming sufficiently robust for
routine use. We have been unwilling to reopen the Pandora’s box of natural language
understanding for the ONCOCIN project, and therefore in the early years have had to
resort to the LISP-based entry of knowledge.

Two factors have accounted for our decision to turn again to the problem of knowledge
acquisition. The first has been a simple matter of need. As we have developed plans
to adapt ONCOCIN for use on single-user machines in physicians’ offices, and have
contemplated the large numbers of protocols that must be available online for practical
use of such a tool, we have been forced to acknowledge the necessity of an enhanced
knowledge acquisition capability. Second, in transferring ONCOCIN to personal
workstations and familiarizing ourselves with this new technology, we have become
aware of the potential for using advanced graphics techniques to avoid problems of
natural language understanding during entry of knowledge by a computer-naive user.
To explore the possible use of the graphics capabilities of LISP machines to facilitate
knowledge acquisition directly from experts, we have recently developed a prototype
system for knowledge entry. OPAL was designed in close collaboration with oncologists
who will be the eventual end users of such a system. To build the prototype version of
OPAL we reviewed all of the concepts that had been required for each of the protocols
that we entered by hand, and explored a large number of existing protocol documents
that we hoped to enter into the completed system.

The OPAL prototype runs on the same professional workstation (the Xerox 1108
"Dandelion") on which the new version of ONCOCIN is being developed. Like the
new ONCOCIN system, OPAL is designed to take advantage of the advanced graphics
capabilities of the workstation and uses a mouse pointing device almost exclusively for
input by the physician.

In developing OPAL, we attempted to organize the information to be entered by the
physician in a manner similar to the structure of typical protocol documents. A
constant consideration was to request knowledge from the physician in a manner
consistent with the way oncologists tend to think about protocols. OPAL guides
protocol entry in a loose fashion; the expert is provided with an ability to change
topics at his or her convenience. However, the program follows an orderly progression,
first asking for general information about the scope of the protocol, principal
investigators, and inclusion and exclusion criteria; next asking for the protocol “schema”
-- a shorthand notation that describes the sequences of treatments; and finally
requesting information on specific drugs, dose modifications, and diagnostic tests
required by the protocol.

The questions for each of these categories are grouped into individual windows on the
graphics display. These windows contain a number of “blanks” on the screen to be
completed in order to provide pertinent protocol information. Most blanks can be
filled in by selecting them with the mouse and then selecting an item from a menu that
is displayed. Only rarely are blanks filled in by typing at the keyboard. The windows
are not all displayed at once but rather are selected one at a time by the physician
working his or her way through a protocol. Selecting a window brings it "into view".
In the present OPAL prototype, most of the major windows are portrayed graphically as
a stack of overlapping “file folders” on the screen. Using the "mouse" to select the
“tab” of one of these folders brings the corresponding window into view. Special menu
windows can be created for the entry of purely numerical data. For example, we have
developed menus, called "registers", that appear either in the format of a 10-key
calculator pad (for free-form digit selection) or else in a columnar format, akin to the
front of an old-style cash register. In either case, the user indicates the appropriate
digits sequentially using the mouse without needing to touch the keyboard. Several
examples of the windows used for protocol entry are provided in the working paper by
Differding included as an Appendix to this application.

The OPAL prototype presumes that the user will have no appreciation for how
knowledge is stored in the computer for use by the reasoning elements in ONCOCIN;
the user need only be able to understand oncology protocol documents. The system
deals with chemotherapy knowledge at such a high level that the user is completely
shielded from issues of knowledge base organization and format. The physician using
OPAL needs to be concerned only with the actual knowledge in the protocol to be
entered.

The preliminary version of OPAL consists of a series of windows that may be displayed
on the screen of the 1108 workstation in any order. Each window represents a series of
questions or blanks to be filled in for a specific portion of a protocol's knowledge.
For example, one window asks questions about the names and standard dosages for the
drugs to be used for a given chemotherapy; another asks what laboratory studies are
required by the protocol; a third inquires what actions to take if certain toxicities
develop.

For each possible “blank” in the window, information is entered automatically by the
system if the corresponding data are already known because of previous responses (e.g.,
if a standard chemotherapy is chosen in one window, the individual drugs involved will
then appear in all of the other windows that ask for drug information). Otherwise,
selecting a blank with the mouse causes a menu with possible completions for that item
to “pop up” on the screen. The mouse is then used to select the desired response from
the menu.
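
The fill-in behavior just described can be sketched as follows. This is a simplified model in Python, not OPAL's implementation: the class ProtocolForm, its method names, and the consequences table are all assumptions made for illustration. A blank that is already known is shown as filled; otherwise the caller gets the menu of possible completions, and a selection may fill other blanks automatically.

```python
class ProtocolForm:
    """Hypothetical model of blanks shared across entry windows."""

    def __init__(self, menus, consequences):
        self.menus = menus                # blank name -> allowed completions
        self.consequences = consequences  # (blank, choice) -> {blank: value}
        self.answers = {}                 # values known so far, all windows

    def open_blank(self, blank):
        """Return ('filled', value) if already known, else ('menu', options)."""
        if blank in self.answers:
            return ("filled", self.answers[blank])
        return ("menu", list(self.menus.get(blank, [])))

    def select(self, blank, choice):
        """Record a menu selection and fill in anything that follows from it."""
        if choice not in self.menus.get(blank, []):
            raise ValueError(f"{choice!r} is not on the menu for {blank!r}")
        self.answers[blank] = choice
        self.answers.update(self.consequences.get((blank, choice), {}))


# CHOP is the standard combination of these four drugs.
chop_drugs = ["Cytoxan", "Adriamycin", "Oncovin", "Prednisone"]
form = ProtocolForm(
    menus={"chemotherapy": ["CHOP"], "drugs": chop_drugs},
    consequences={("chemotherapy", "CHOP"): {"drugs": chop_drugs}},
)

before = form.open_blank("drugs")      # a menu, since nothing is known yet
form.select("chemotherapy", "CHOP")
after = form.open_blank("drugs")       # now filled in automatically
print(before[0], after[0])             # menu filled
```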

The OPAL prototype has been tested by several physicians and all have found the
system easy to use after a few minutes of training. Frequent feedback from our
oncology collaborators has allowed us to make modifications, expanding the options in
certain menus and improving the user interface. These modifications have been
effected by reprogramming parts of the system. However, we plan to be able to make
changes to OPAL eventually by editing data structures, rather than by having to update
the actual computer programs.

When a protocol is entered using OPAL, the knowledge ultimately must be encoded in
an internal form so that ONCOCIN can use it to give advice and manage the protocol
data. We see this encoding occurring in a two-stage process, with an intermediate data
structure serving to insulate the interaction with OPAL from the detailed structure of
the knowledge base. Thus OPAL will be used to enter protocol knowledge, it will be
stored in an intermediate data structure (or IDS), and then further refined into a
knowledge base for use by ONCOCIN. As is outlined in the next section, these ideas
generalize to RCT advice systems in other clinical domains -- a generalized OPAL
might be used to enter RCT guidelines, thereby creating a knowledge base for use by a
generalized version of ONCOCIN.
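
A minimal sketch of the two-stage idea, written in Python under stated assumptions: the field names, the function names opal_to_ids and ids_to_knowledge_base, and the shape of the resulting knowledge base are all hypothetical, since the text does not specify the actual IDS format. The point illustrated is only that the entry side writes to an intermediate structure and a separate step refines it, so either side can change without touching the other.

```python
from dataclasses import dataclass, field


@dataclass
class IntermediateDataStructure:
    """Assumed shape of the IDS: what was entered, in entry-oriented form."""
    protocol_name: str
    schema: list                                  # ordered treatment steps
    dose_modifications: list = field(default_factory=list)


def opal_to_ids(entries):
    """Stage 1: gather the answers collected by the entry windows."""
    return IntermediateDataStructure(
        protocol_name=entries["name"],
        schema=list(entries["schema"]),
        dose_modifications=list(entries.get("dose_modifications", [])),
    )


def ids_to_knowledge_base(ids):
    """Stage 2: refine the IDS into the form the advice program consumes."""
    sequence = [tuple(step) for step in ids.schema]
    rules = [
        {"if": mod["condition"],
         "then": f"give {mod['percent']}% of standard dose"}
        for mod in ids.dose_modifications
    ]
    return {"protocol": ids.protocol_name, "sequence": sequence, "rules": rules}


# Illustrative entries only; names and numbers are placeholders.
entries = {
    "name": "example protocol",
    "schema": [["POCC", 1, "A"], ["POCC", 1, "B"], ["VAM", 1]],
    "dose_modifications": [{"condition": "low WBC", "percent": 75}],
}
knowledge_base = ids_to_knowledge_base(opal_to_ids(entries))
print(knowledge_base["sequence"][0])   # ('POCC', 1, 'A')
```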

Generalization of ONCOCIN: E-ONCOCIN

Most protocols in clinical medicine contain elements in common with oncology trials.
We plan to build on our experience creating OPAL to apply the same methodology to
develop expert systems for RCTs in other medical areas. This research to develop
generalized knowledge acquisition programs like OPAL for other RCTs will be of great
practical importance. Moreover, we recognize that the work will address significant
theoretical issues in the field of medical artificial intelligence. In fact, we expect that
the Meta-OPAL work outlined below will constitute a Ph.D. dissertation for one of our
Medical Information Sciences graduate students (Dr. Mark Musen).

What we propose is a high-level tool for use by knowledge engineers in conjunction
with clinicians to define all the properties of a knowledge acquisition system (KAS)
that may be used subsequently to enter the knowledge for a particular class of clinical
trials. OPAL is an example of a KAS, one that is customized for the class of clinical
trials relevant to clinical oncology. A KAS for another domain, such as hypertension
or epilepsy management, might look very different. Certainly the display windows for
protocol entry would bear little resemblance to those used in the current version of
OPAL. This new high-level tool, Meta-OPAL, will take as its input the complete
specifications for a KAS. It will produce as its output a data structure that will enable
a second program, E-OPAL, to interact with a domain expert to capture and encode a
whole class of new protocols. These encoded protocols can then be used for data
management and consultation by a domain-independent version of ONCOCIN (the
ONCOCIN inference engine, to be termed E-ONCOCIN)[1]. E-OPAL will be a version
of OPAL stripped of all its built-in oncology knowledge. E-OPAL thus will rely on
Meta-OPAL to provide all the information required to perform knowledge acquisition
and management. The relationships among the various modules are diagrammed in Figure 8.
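
The division of labor among the three proposed modules can be sketched as follows. This is only an illustration of the roles described above, not a design: the text names the programs but not their interfaces, so the function signatures, the KASSpec structure, and the example field names are hypothetical.

```python
from dataclasses import dataclass
from typing import Any, Dict, List


# Hypothetical interfaces; the text describes roles, not signatures.
@dataclass(frozen=True)
class KASSpec:
    """What Meta-OPAL would emit: the declared properties of one KAS."""
    domain: str
    fields: List[str]  # items the domain expert must supply for each protocol


def meta_opal(domain: str, fields: List[str]) -> KASSpec:
    """Knowledge engineer and clinician define a KAS; the output is a data structure."""
    return KASSpec(domain=domain, fields=list(fields))


def e_opal(spec: KASSpec, answers: Dict[str, Any]) -> Dict[str, Any]:
    """Acquire one protocol, driven only by the spec (no built-in domain knowledge)."""
    missing = [f for f in spec.fields if f not in answers]
    if missing:
        raise ValueError(f"protocol incomplete, missing: {missing}")
    return {"domain": spec.domain, **{f: answers[f] for f in spec.fields}}


def e_oncocin(protocol: Dict[str, Any], parameter: str) -> Any:
    """Domain-independent consultation: look up a parameter of an encoded protocol."""
    return protocol[parameter]


# Placeholder domain and values for illustration only.
spec = meta_opal("hypertension", ["drug", "visit_interval_weeks"])
protocol = e_opal(spec, {"drug": "drug_x", "visit_interval_weeks": 4})
interval = e_oncocin(protocol, "visit_interval_weeks")
```

In this sketch all domain content lives in the specification passed to e_opal, so switching to another domain means supplying a different KASSpec rather than changing either program.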

The concept of a "knowledge acquisition system for knowledge acquisition systems" is
attractive in many respects. First, many of the problems of a limited "world view" in a
program such as OPAL will be readily overcome because all of the domain assumptions
(e.g., beliefs about oncology, cancer protocols, or chemotherapy) will be explicitly
declared at the Meta-OPAL level. For example, an implicit assumption built into the
present OPAL prototype is that patients are treated with either chemotherapy or
radiotherapy. The physician using OPAL is never asked to enter information regarding,
say, surgery because knowledge about options for surgery is not currently within
OPAL's "world view". Even by modifying OPAL to specify new parameters, no
protocol that called for repeated surgical procedures could be satisfactorily encoded
unless we had an ability to make even higher-level modifications to OPAL.
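
To make the point about implicit assumptions concrete, the sketch below declares a "world view" as data and checks a protocol against it. The format is an assumption made for illustration (the text does not say how Meta-OPAL would declare domain assumptions), and the names ONCOLOGY_WORLD_VIEW and unsupported_steps, as well as the recurrence flags, are hypothetical.

```python
from typing import Dict, Iterable, List

# Hypothetical declaration of a KAS "world view"; no such format is given in the text.
# Maps each known treatment modality to whether it may recur within one protocol
# (the True values are illustrative assumptions).
ONCOLOGY_WORLD_VIEW: Dict[str, bool] = {
    "chemotherapy": True,
    "radiotherapy": True,
}


def unsupported_steps(world_view: Dict[str, bool], steps: Iterable[str]) -> List[str]:
    """Return the protocol steps that the declared world view cannot express."""
    seen = set()
    problems = []
    for step in steps:
        if step not in world_view:
            problems.append(step)  # modality unknown to the KAS
        elif step in seen and not world_view[step]:
            problems.append(step)  # repetition of this modality not permitted
        seen.add(step)
    return problems


# A protocol calling for repeated surgical procedures.
protocol_steps = ["chemotherapy", "surgery", "chemotherapy", "surgery"]
before = unsupported_steps(ONCOLOGY_WORLD_VIEW, protocol_steps)

# Extending the declaration, rather than the program, makes the protocol expressible.
extended_view = dict(ONCOLOGY_WORLD_VIEW, surgery=True)
after = unsupported_steps(extended_view, protocol_steps)
```

When the assumption is held as declared data, admitting surgery is a one-line change to the declaration; when it is buried in program text, the same change requires the higher-level modification discussed here.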

At present, we can make this sort of higher-level modification to OPAL only by

[1] The names E-OPAL and E-ONCOCIN are inspired by the similar domain-independent tool developed by
our group in the 1970s. This program, EMYCIN or "Essential MYCIN", is the inference engine separated
from the knowledge base of MYCIN.
