We believe that a promising approach is to enrich the representational structure of our network language, so that the program not only knows that "X causes Y" but also has enough detailed knowledge to explain why the connection is plausible. Such a program could aid the knowledge acquisition process by automatically critiquing the evolving network. Moreover, the program would ask questions to help it fill the gaps and resolve the incoherencies it detects. Using the above example, after being told an implication (an ordinary heuristic rule) relating brain-mass-lesion and brain-tumor, the program would attempt to classify these terms as processes or substances, note their locations, and isolate the particular causal interaction (a mass causes a lesion). The key to such a capability is a representation language that defines concepts in terms of a relatively small number of relations (such as the conceptual dependency notation of Schank), plus generic knowledge of physical processes (e.g., the idea of a mass growing in size and severing an enclosing substance). A great deal of research in qualitative reasoning about physical processes [3], particularly the research of Wendy Lehnert, lays the foundation for this kind of investigation.

The learning program we will construct could be termed "the advice requester." We believe that the ability to ask good questions is the mark of a good student or researcher, and that it can greatly focus the learning process. Asking good questions requires relevant background knowledge, so that the learner can learn something new by relating it to facts or to a general framework he already understands. This process can be complex, because there are levels of and perspectives on understanding. What may at first appear consistent can become puzzling later, as new gaps appear in an evolving network. Concepts in fact change their meaning as exceptions and complex special cases come to light. Learning by asking is a form of knowledge-intensive learning, to be contrasted with research in automatic learning (becoming more efficient).

For knowledge engineering, such an approach is a dramatic switch from giving the program surface causal rules that it in no sense understands to giving it knowledge of the underlying causal models that enable it to justify its causal network. Most importantly, these models provide a set of expectations about states and faults that might be included in a causal network. To take an example from another domain in which we are working, iron casting, one fault is a shrinkage cavity. Generic knowledge would indicate that a cavity is an absence of material, and that in casting the source of material is what is poured plus a reservoir (part of the mold) that allows for shrinking. A built-in generic model would indicate three reasons why material from a source does not arrive at a sink: insufficient supply (the reservoir is too small), supply lost by leaking, and blocked flow from source to sink. These three generic causes set up expectations for the specific causal processes that will appear in the state network. A given knowledge base might refer to a model only once, but a library of such models would form the basis of a powerful knowledge acquisition program that could learn about new domains fairly quickly. We believe that this generic library of processes is part of what we call common sense knowledge.
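To make the flavor of such a generic model concrete, here is a minimal sketch (in Python, with entirely hypothetical names; not code from any implemented system) of how the three generic causes of material failing to arrive at a sink could be stored once in a process library and instantiated to generate expectations for the shrinkage-cavity fault:

    # A sketch of a generic process library; all names are hypothetical.
    GENERIC_MODELS = {
        # Generic model: material from a source fails to arrive at a sink.
        "material-does-not-arrive": [
            "insufficient supply at {source}",
            "supply lost by leaking between {source} and {sink}",
            "blocked flow from {source} to {sink}",
        ],
    }

    def expectations(model, **bindings):
        """Instantiate a generic model, yielding the specific causal
        processes expected to appear in the state network."""
        return [cause.format(**bindings) for cause in GENERIC_MODELS[model]]

    # Instantiated for the iron-casting fault "shrinkage cavity":
    for cause in expectations("material-does-not-arrive",
                              source="reservoir", sink="casting"):
        print(cause)

The same model, instantiated with different bindings, would generate expectations in other domains, which is what makes a library of such processes reusable.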
An advice requester as proficient as our best knowledge engineers is obviously not going to be constructed in a year or two. Our approach will be first to study the causal networks we have constructed in medicine and casting, and to re-represent the knowledge in structures that include the generic, underlying abnormal processes. Next, using a method we have found advantageous in the past for refining a knowledge representation, we will construct a simple teaching program that can explain such a causal network and help the student critique an incomplete network. Ultimately, we believe that teaching students to think like knowledge engineers, that is, to learn the process of asking good questions, may be even more valuable than directly trying to convey our products, the constructed knowledge bases.

4. Qualitative Simulation

GOALS

In the context of the Molgen-II project, we are exploring the process of scientific theory formation and modification by computer. Qualitative simulation of biological processes is an important part of this goal, because theory formation requires asking about the results of hypothetical experiments, and running a detailed simulation is often too expensive.

MOTIVATION

We are carrying out this research by studying a specific biological system: the regulatory genetics of the E. coli tryptophan operon (the trp system). In the mid-1960s Dr. Charles Yanofsky (a collaborator with us on this project) began to probe the existing theory of gene regulation in this operon. Yanofsky's initial experiments revealed a number of anomalies. Since that time, Yanofsky's research (which continues today) has resulted in the discovery of a totally new mechanism of prokaryotic gene regulation, and it continues to refine our knowledge of exactly how this mechanism functions. Our goal is to build a machine learning system that will accept an initial theory of gene regulation equivalent to the one Yanofsky began to probe in the 1960s. We will then present our system with a series of experimental results based on Yanofsky's early observations. The learning system will then propose, implement, and attempt to confirm possible modifications to its theory of gene regulation.

We view theories, such as that of the trp operon's function, as problem solvers. The inputs to these problem solvers are descriptions of hypothetical experiments; their outputs are descriptions of the predicted results of those experiments. Thus our learning program will be attempting to improve the predictive performance of a problem solver in bacterial regulatory genetics (this input/output view is sketched below).

This research in machine learning presumes the existence of a simulator of the trp system. Building such a problem solver in itself raises interesting AI research issues in qualitative simulation, and building it in a form that can be reasoned about by another program (the learning element) complicates the problem even further. Below we discuss our past work on the construction of two versions of such a problem solver ("the simulator"). We then outline a number of interesting research issues which this work has raised, and the approaches we plan to pursue in the construction of the simulator.
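The "theory as problem solver" view can be made concrete with a toy sketch (Python; the predicate names and the regulatory logic shown are deliberately simplified illustrations, not the Molgen-II representation). A theory maps an experiment description to predicted results, and the learning element judges the theory by its prediction errors:

    def trp_theory(experiment):
        """A toy initial theory of trp-operon regulation: predict the
        level of transcription under given experimental conditions."""
        if experiment.get("trpR_mutated"):
            return {"transcription": "high"}   # repressor inactivated
        if experiment.get("tryptophan") == "high":
            return {"transcription": "low"}    # repression by trp-R
        return {"transcription": "high"}

    def disconfirmed(theory, experiment, observed):
        """The learning element compares a prediction with an observed
        result; a mismatch triggers proposed theory modifications."""
        return theory(experiment) != observed

    # An anomaly of the kind Yanofsky's experiments produced would show
    # up here as disconfirmed(...) == True for some experiment.
    print(trp_theory({"tryptophan": "high"}))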
BACKGROUND

Version I

An exploratory version of the system was built in the spring of 1984. The system was constructed using the UNITS system, one of the first general-purpose expert system building tools.

This first system was more of a success as a static knowledge base than as a dynamic simulator. Building it forced us to come up with a concrete conceptualization of the problem domain: we determined the full range of objects the system would have to simulate, and considered what types of properties and internal states these objects have and how they should be represented within the UNITS system. This knowledge base was examined several times by our biologist collaborators (Yanofsky and Dr. Robert Landick, a post-doctoral fellow in Yanofsky's lab) to help us detect errors and omissions.

The first system never contained much simulation capability. We did provide a mechanism whereby the state of the transcription mechanism could be determined after the user specified experimental conditions, such as the approximate tryptophan concentration and whether various objects such as the trp-R repressor and the trp promoter contained deleterious mutations. The simulation capability was essentially provided by backward chaining on the slot values of relevant units, with the actual inferences carried out by Lisp code attached to some slots.

We learned a number of things from this prototype system. The knowledge base we created became a concrete record of the objects relevant to problem solving in this domain, and of design decisions regarding their representations. We also discovered a number of things about the UNITS system:

1. Its knowledge base editor ran fairly slowly.
2. We encountered and fixed several significant bugs.
3. Its rule language is fairly awkward.
4. Its inheritance hierarchy lacked some important features, such as the ability of a given object to inherit slots from more than one parent class.

(Note that points 1 and 2 result from UNITS having been developed and maintained within a university research environment.)

We also confirmed an observation made long ago by other AI researchers: the simpler a language is, the more amenable it is to being both executed by one entity and interpreted by another (such as an explanation facility). This is one reason expert systems are now often encoded in production rules rather than Lisp. It became quite obvious that if our learning element is forced to reason about a simulator containing Lisp procedures, it will be significantly more complex than if the simulator were written in another language. Simple as the syntax of Lisp is, even a reasonable subset of full Interlisp would contain quite a large number of fairly complex constructs, and would complicate the learning element tremendously.

We also made an interesting observation about how building an expert system can help experts think about their own domain. We will consider two examples of this idea. Both involve subclass units that were defined in the knowledge base by Karp and then discussed with Yanofsky and Landick. One subclass was called "DNA Segments" and was intended to include contiguous segments of DNA with discrete functions, such as promoters, terminators, genes, and operators. Among the properties associated with this class were: sequence, position within some larger functional piece of DNA, and "generalized sequence," an attempt to capture those sequence elements common to a given subclass of DNA Segments, such as promoters.
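The following sketch suggests the shape of such a class definition (rendered in Python for concreteness; UNITS itself represented these as frames with slots, and the field values shown are invented placeholders):

    from dataclasses import dataclass

    @dataclass
    class DNASegment:
        """A contiguous segment of DNA with a discrete function
        (e.g., promoter, terminator, gene, operator)."""
        name: str
        sequence: str                   # the nucleotide sequence itself
        position: tuple                 # location within a larger functional piece of DNA
        generalized_sequence: str = ""  # sequence elements common to the subclass

    # A hypothetical promoter instance; a promoter subclass would fill in
    # the generalized sequence shared by all promoters.
    trp_promoter = DNASegment(name="trp promoter",
                              sequence="...",   # elided
                              position=(0, 40),
                              generalized_sequence="promoter consensus elements")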
The other defined class of interest was termed "Molecular Switches." This was an attempt to represent the general notion of a molecule with two functional states, where transitions between states are caused by the binding and dissociation of the molecule from some other molecule. Examples of Molecular Switches are operators, promoters, and repressors.

In both cases Yanofsky and Landick expressed interest in these concepts, and noted that biologists had coined no terms for them. This suggests that these concepts are in some sense new to biologists. We hypothesize that the process of constructing an expert system will naturally lead to the identification of such general concepts, or, equivalently, to the creation of analogies between known concepts. The reason is that in attempting to represent the behaviors of N different entities, it is often much more efficient (with respect to development time and code volume) to develop one general-purpose procedure that yields the N different behaviors given different parameter bindings than it is to develop a different procedure for each of the N cases. It is the knowledge engineer's job to search for such general procedures.

Version II

Recently we have begun building the next version of the simulation system. We are implementing it using the KEE knowledge engineering tool developed by IntelliCorp, which will free us from all the limitations of the UNITS system mentioned above. We have accomplished the initial, obvious goal of porting the knowledge base defined using UNITS to KEE.

Related Work

Recently a significant amount of work has been done in AI on qualitative simulation (de Kleer and Brown, Forbus, Patil, Kuipers). While this work is somewhat relevant to the research we propose, there are several reasons why it is not sufficient. First, most of this work attempts to simulate systems that physics describes using differential equations; much of it is an attempt to generalize numerical differential equations into qualitative differential equations. However, biology is a much younger science than physics, and as such does not describe its mechanisms to nearly such a quantitative degree. Differential equations are rarely if ever used by molecular biologists, and hence qualitative differential equations do not provide an appropriate foundation for this domain.

RESEARCH PLAN

The next step is to define the behavior of these objects so that actual simulations can be executed. This raises the question: in what language should this behavior be defined? We rule out Lisp for reasons discussed earlier. We also believe production rules are not a good language for defining this behavior, for reasons that will be outlined below. We now discuss the features we believe the simulator should provide, describe the research questions these features raise, and consider what constraints such a simulator imposes on an underlying implementation language.

Reasoning At Varying Levels Of Detail

We believe it is important that the simulator be able to reason at varying levels of detail, depending upon the demands of a particular problem. That is, it should be possible for the simulator to solve many problems without simulating every process it knows about in the most detailed manner possible. Rather, given a problem statement, the simulator should perform meta-level reasoning to determine which processes to simulate, and at which of several possible abstraction levels to simulate each process.
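One simple way to picture this meta-level choice is sketched below (Python; the process names, levels, and selection rule are all hypothetical illustrations rather than a proposed design):

    # Each process can be simulated at several abstraction levels,
    # listed here from coarsest to most detailed.
    PROCESS_LEVELS = {
        "transcription":     ["operon-level", "gene-level", "nucleotide-level"],
        "repressor-binding": ["state-level", "molecular-level"],
    }

    def choose_level(process, problem):
        """A toy meta-level rule: simulate a process in detail only if
        the problem statement implicates it; otherwise stay coarse."""
        levels = PROCESS_LEVELS[process]
        return levels[-1] if process in problem["implicates"] else levels[0]

    # A problem involving a trp-R mutation implicates repressor binding,
    # but not the nucleotide-level details of transcription.
    problem = {"implicates": {"repressor-binding"}}
    for process in PROCESS_LEVELS:
        print(process, "->", choose_level(process, problem))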
For example, in an experiment involving an otherwise normal E. coli cell with a deleterious mutation in its trp-R protein, it should not be necessary to simulate the RNA-synthesis actions of RNA-polymerase at the nucleotide level.

Each header displays fields such as from, to, and subject. Reading the letter associated with a header then transfers the actual text of the letter from the server to the client with a read-mail transaction, unless the letter has already been transferred to the client and is cached there. This transaction causes the read-date stamp and "seen" bit to be updated in the spindle file entry.

Mail Reading and Composition: Mail commands such as read, answer, set alarm, delete, and copy key off of header selection. When one reads a letter, it is transferred from the server to the client by a read-letter transaction. The text is displayed in a window and can be scrolled as well as edited. All text editing and composition is done on the local workstation. When one answers a letter, immediate recognition of the destination host address is mandatory. This can be accomplished by requesting host address validation after the addresses have been typed. One can use the domain name server and LAN name servers for this purpose. It also makes sense to cache known host names locally; if for some reason the name servers do not reply, this list can be used for a second guess. If all else fails, then one should simply attempt to deliver the letter. If in fact the address is not valid, this will be noted when the letter is returned to the sender as undeliverable.

Mail Delivery: Once a letter is composed and the sender requests it to be delivered, it will be spooled on one of the file/mail servers. These servers already have all of the knowledge necessary to deliver any letter to a known host. Mail delivery is done in the background on these servers by a low-priority process. An attempt should be made to spool the mail on the server with the smallest mail queue; to this end, a mail-queue-size query message will be sent to those servers that respond to a request-to-send-mail broadcast. Each host can override the latter broadcast by simply remembering which servers responded to earlier broadcasts, thus maintaining a mail-delivery-path for directing mail-queue-size queries. The system resource manager will maintain current mail delivery information. Often a host in a mail-delivery-path is down for some reason, and mailers will continuously attempt to shrink their growing mail queues by uselessly badgering this host. It makes sense to be able to request server downtime and alternative mail routes from a resource manager. If there is no alternative route, the mail client/server can periodically check until the host comes up, rather than trying to send mail to a down host, which amounts to useless network traffic.
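This server-selection step can be sketched as follows (Python; the function names and stubbed transactions are hypothetical, standing in for the request-to-send-mail broadcast and mail-queue-size query described above):

    def choose_spool_server(broadcast, queue_size, delivery_path=None):
        """Pick the responding file/mail server with the smallest queue.

        broadcast()   -- returns servers answering request-to-send-mail
        queue_size(s) -- result of a mail-queue-size query to server s
        delivery_path -- servers remembered from earlier broadcasts,
                         letting a host skip the broadcast entirely
        """
        servers = delivery_path or broadcast()
        if not servers:
            return None   # no server responded; retry or consult resource manager
        return min(servers, key=queue_size)

    # Toy usage with stubbed transactions:
    queues = {"server-a": 12, "server-b": 3}
    print(choose_spool_server(lambda: list(queues), queues.get))  # -> server-b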
Ultimately, a mail-server process ought to be able to run in the background on personal workstations, and mail could then be delivered directly to that host for those users who desire such a service. This would take the file/mail servers out of the mail storage and retrieval loop for such hosts. Mail is simply sent directly to the workstation that has a registered address in the domain name server tables; it is then retrieved and read "as if" it had already been copied from a remote file/mail server. This latter mechanism is part of the initial design. As mail accumulates on such a host, the user will be able to take advantage of the already existent file/mail-server processes to maintain mail archive directories remotely, so that old mail can still be examined in the client/server role.

Virtual Graphics Terminal Service

Virtual graphics terminal service (VGTS) allows the display of structured graphical objects on a workstation running the V system [37]. We have already indicated the power of this set of tools. While running V on a small, inexpensive workstation located at home, on the LAN, or anywhere with TELNET access to the LAN on which a personal Lisp machine has a TELNET server running, one can access that Lisp machine and drive the graphics display of the smaller workstation from it. Geographic proximity to such a Lisp machine is then moot. As the ratio of researchers to Lisp machines increases, it is no longer possible to guarantee Lisp machine cycles to everyone during prime computing time, and a means of remotely accessing these machines in graphics mode becomes mandatory. VGTS satisfies this need perfectly. Installing the software tools necessary for remote VGTS access imposes two requirements: first, the ability to TELNET into a Lisp machine; second, the interfacing of VGTS primitives with the current graphics/window calls on the Lisp machine. We address each of these below.

Not all of the current Lisp machines have servers that allow the establishment of an incoming TELNET connection; currently, only the Symbolics machines have this property. What is necessary here is to modify the outgoing TELNET code, where applicable, so that it can also run as a server process. This is really a straightforward task. What is interesting is how to globally establish that the incoming data stream is to be interpreted by the Lisp machine command executive, and that all output characters are to be sent via the TELNET stream rather than to the local graphics display stream. This redirection of I/O streams is well within the scope of all of our Lisp machine operating systems.

The central concept of VGTS is that application/client programs should only have to deal with creating and maintaining abstract graphical objects [37]. The actual viewing of these objects is done on the workstation running V. For example, to create a view or window on a workstation/server running V from a Lisp machine/client, two things are required. The client calls a routine to remotely create a file, the structured display file (SDF), which will then contain descriptions of graphical objects. Each such object has a client-assigned item number associated with it in the SDF. This SDF is then associated with what is commonly referred to as a window by first calling a routine to create a virtual graphics terminal (VGT) associated with this SDF, and then calling a routine to create a view on this VGT. A view is seen as a white area on the screen with a border. Thus a VGT/SDF pair can have multiple views associated with it, and one can have multiple VGT/SDF pairs at any one time, as well as more than one VGT associated with the same SDF. The mapping of VGTs to SDFs can be, but need not be, one-to-one. Each of these calls involves little more than the passing of a few data bytes between the client and server.
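The call sequence just described can be sketched schematically (Python; the routine names and message formats are hypothetical stand-ins for the corresponding VGTS primitives):

    class VGTSClient:
        """A toy client that sends small requests to the V server and
        records the identifiers the server returns."""
        def __init__(self, send):
            self.send = send                 # transmits a few bytes to the server

        def create_sdf(self):                # remote structured display file
            return self.send(("create-sdf",))

        def create_vgt(self, sdf):           # virtual graphics terminal on an SDF
            return self.send(("create-vgt", sdf))

        def create_view(self, vgt):          # bordered white viewing area
            return self.send(("create-view", vgt))

    # Stub transport that just hands back consecutive identifiers:
    ids = iter(range(1, 100))
    client = VGTSClient(lambda message: next(ids))
    sdf = client.create_sdf()
    vgt = client.create_vgt(sdf)     # several VGTs may share one SDF
    view = client.create_view(vgt)   # and one VGT may have several views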
Once the SDF/VGT relationship is established and a view is created on the server, graphical objects can be created by adding them as items to the SDF: one opens a symbol for editing and adds an item to that symbol in the SDF. An SDF then contains symbols, which are in turn lists of items; an item itself can also be a symbol. These objects can then be displayed in the view(s) associated with the VGT. Thus, objects can appear on several VGTs at the same time.

A client can also create menus on the server and then interrogate the actions implied by those menus via mouse buttoning. In fact, one can query a mouse event within a view and receive back not only the buttons that were pressed but also the VGT number and view coordinates of the cursor position itself, or a list of objects that are near the cursor position. This allows the client to interrogate, as well as edit, viewed objects remotely. One need not maintain a great deal of information about objects on the client. In fact, one needs only the VGT and SDF numbers, which are returned by the server when they are created, and the item numbers, which are sent when items are added to SDFs. A client can then inquire about an item and receive its definition as a reply. Thus, VGTS is designed to maximize what is done on the server, by maintaining the SDF database and allowing detailed queries about its contents, which can for the most part be driven by user/mouse interaction with the graphical representation. The VGTS has a resident view manager for moving, zooming, opening, closing, and creating new instances of views associated with VGTs. Consequently, the view overlaying, manipulation, and trimming algorithms do not impact the client.

A list of the current VGTS object primitives is as follows:

Filled Rectangle: These can be filled either with gray-scale shades or stipple patterns on black-and-white monitors, and with colors on color monitors.