a) In all cases, we plan close cooperation with the users in all aspects of the problem. Although the basic isolation procedures are the problem of each investigator, his knowledge of the available facilities and their limitations can be an important aid to sample preparation and analysis of the results. This is particularly true for collaborators who are unfamiliar with the techniques of HRMS (e.g., sample size and resolving power necessary to separate the mass doublets that can be realistically expected in different contexts).

b) The needs of the user community will be varied. Drs. Duffield and Smith will, in collaboration with the users, determine the kinds of MS experiments which will be most useful, considering sample complexity, stability, quantity, and so forth. We wish to utilize fully the existing resource and our proposed extensions, bringing to bear on a problem any technique which is appropriate and can be provided. This will include the full scope of available experimental techniques in MS (LRMS, HRMS, GC/LRMS, GC/HRMS, metastable defocussing, ultra-high resolution mass measurements) and available computer programs (see below).

c) Many problems will be amenable to treatment by computer programs which exist or which will be developed, for example, structural isomer problems or HRMS interpretation on compounds in a well-understood class. We will take the responsibility for utilizing these programs where appropriate to assist in structure elucidation problems. We will instruct members of the community in use of the programs when programs are used routinely by collaborators.

III. METHODS

Molecular structure elucidation entails the intelligent and patient application of a large body of knowledge to each specific problem. The importance and relative difficulty of the problem impel us to seek the powerful assistance of computer programs to help chemists in their analyses.
It is unlikely that such programs will ever replace chemists, especially because computer programs are readily written only to focus on rather narrow aspects of problems. Nevertheless, our past research is reasonably forwarded as a demonstration of the computer's ability to assist in practical biomolecular characterization, although this was a spinoff from theoretically oriented research. In order to meet the major objectives of this proposal we will focus our attention primarily on structure elucidation of biomedically important compounds through MS and AI. However, many of the computer programs can already use information from other analytical techniques. So we want to be able to think of structure elucidation in the context of an ensemble of analytical capabilities.

A. Enhancing the Power of the Mass Spectrometry Resource

We have developed a significant resource consisting of instrumentation (the Varian MAT-711 and ancillary equipment) and computer programs for instrument evaluation, data acquisition and reduction. Routine reduction of high resolution mass spectra to elemental compositions and ion abundances without human intervention provides the capability for efficient handling of large volumes of high resolution mass spectra (such as will result from GC/HRMS runs). The development of the GC and of the GC/MS combination is in the excellent hands of Ms. Annemarie Wegmann, who is responsible for operation of the complete system. We now have more than two years of operational experience with the MS, the GC and related equipment under a wide variety of experimental conditions.

None of the resource-related research discussed in this proposal can be carried out without significant quantities of mass spectral data.
The existence and extensions of the MS resource, the development of computer techniques and the applications to biomedical problems demand an efficient mechanism for acquisition and reduction of MS data, and eventual transmission of the data to the SUMEX resource. Thus, operation of the MS requires substantial computer support to deal with the large volumes of data produced by the system at high data rates. We feel that a properly configured system of hardware and software should provide, at a minimum, the following capabilities:

1) Detailed evaluation of the condition and performance of the MS prior to recording data on valuable samples, with feedback to the operator.

2) A coordinated system of hardware and software for signal conditioning, peak detection and peak analysis.

3) Data reduction techniques based on a computed (not theoretical) model of the MS, including peak shapes, mass/time function, and resolving power as a function of mass.

4) Peak profile analysis for multiplet detection and resolution.

5) Computer control of scan rates, clock rates and optimum analog and digital filtering parameters.

6) Some on-line feedback, to the operator, assessing the performance of the system during an experiment.

7) The system must deal with frequent or repetitive HRMS scans, requiring the capability for rapid storage and analysis of large volumes of data.

Previous support of our research by the NIH and NASA has given us a firm foundation of programs and experience. We have, up until the termination of the ACME computer facility (July 31, 1973), demonstrated capabilities 1-5 above. We were precluded from pursuing capabilities 6 and 7 due to the configuration of the ACME facility.

The demise of the ACME computer facility and the subsequent incorporation of the PL/ACME language into a new IBM 370/158 facility under Stanford auspices has forced a reevaluation of the means for providing HRMS laboratory computing support.
We had previously depended exclusively on ACME for data reduction processing. The ACME transition poses both technical and fiscal decisions in that the real-time support capabilities of the new facility will be different from ACME's and the fee-for-service basis of the facility requires an explicit budget allocation for its use. Previously we had received ACME computing support without charge as part of the core research effort.

Since we were thus required to revise our computing plans, we have explored a number of options for near-term as well as longer term solutions. As outlined in the attached annual report, we have chosen an interim approach (through the end of the current grant, 4/30/74) which minimizes near-term costs, including hardware and software conversions as well as operating expenses. This approach entails connecting the MAT-711 spectrometer to the 370/158 computer through an IBM 1800 interface. It allows use of the existing PL/ACME programs but will have real-time response limitations at least as severe as ACME had (which is inadequate for either GC/LRMS or GC/HRMS). Our existing computing budget provides for only a very low level of instrument utilization in this mode.

For a longer term solution these constraints are unacceptable. Current estimates are that continued use of the inadequate 1800-370/158 connection and PL/ACME interactive programs under full instrument productivity would cost up to $4,100 per month. Three alternatives have been investigated for improving technical performance and reducing cost. This review has resulted in our current proposal to augment the existing mini-computer system (PDP-11/20) with local storage and arithmetic capabilities. This stand-alone system would not support real-time, on-line data reduction but would allow routine data acquisition and instrument performance evaluation, followed by off-line data reduction.
Alternatives considered include:

1) Modified 370/158 Connection

We discussed with personnel in the Stanford Center for Information Processing (SCIP) various approaches for improving 370/158 service. Detailed planning is still under way within SCIP in regard to real-time support and future pricing policies. Thus the following conclusions are tentative.

It appears that the long-term cost of continuing real-time data acquisition by the 370/158 would be prohibitive. Instead, a store-and-forward system was proposed. This would entail an augmentation of the existing PDP-11/20 front-end mini-computer with memory, disk, tape, and a new interface to the 370/158, totaling about $28,000. This approach is workable if limited, near real-time instrument performance evaluations can be made to assure satisfactory instrument setup and data acquisition.

It was recommended that the existing software be converted from PL/ACME to a more efficient language (such as FORTRAN) to reduce operating costs. This would require approximately 4-6 man-months of effort. The resulting decrease in operating costs could not be estimated in time for this proposal because the new SCIP pricing policies are not formulated and the 370/158 system analysis tools now operational are inadequate to evaluate our benchmarks in terms of detailed resource consumption. We have therefore budgeted an approach based on the remaining two options with the understanding that we will reconsider the SCIP option before proceeding with an implementation should this proposal be funded.

2) SUMEX

The recently approved AIM-SUMEX PDP-10 facility will provide necessary computing support for the development and use of DENDRAL AI programs. The MS laboratory produces data which these programs analyze and thus has a close relationship to the AI research.
The SUMEX computer could help in the off-line reduction of instrument data, particularly during the early stages of the project when the machine load will be relatively light (20-25%). The present programs would require conversion from PL/ACME as in option (1), which would take 4-6 man-months. Such computing may use from 15-30 minutes of CPU time per day, depending on the amount of GC/MS work. While this approach saves operating computing costs, the front-end PDP-11/20 would require augmentation as in option (1) ($28,000) to allow store-and-forward operation with subsequent off-line data reduction on SUMEX. This is needed because SUMEX is not configured to allow real-time acquisition of the volume of data anticipated.

This approach, while the least costly, would entail a measurable use of the PDP-10 resources which we feel are better reserved for the intended AIM-SUMEX applications. In addition, because of the priorities anticipated for allocation of SUMEX to AI research, particularly as loading increases, scheduling may be required which will constrain the MS laboratory operation. For these reasons, we feel a better, even though slightly more expensive, approach is a stand-alone PDP-11/20 data reduction system.

3) Stand-Alone PDP-11/20 System

The augmentations of the existing front-end PDP-11/20 required for store-and-forward operation in conjunction with the 370/158 or with SUMEX come close to meeting the needs of a stand-alone data system. In addition to the memory, disk and tape, an augmented arithmetic capability is needed to allow rapid floating point calculations. A special device for this purpose costs about $7,500. The SUMEX interface can be less sophisticated in this case, however, accounting for the much lower data volume after reduction, so that the total cost of the stand-alone system would be $34,900. As with the other options, conversion of the present programs would be required.
This approach, while slightly more expensive, has the advantages of off-loading all data logging and reduction functions from SUMEX and affords an adequate capacity for non-real-time, stand-alone data reduction on the PDP-11/20. It furthermore allows more freedom and responsiveness in the operation of the MS laboratory since data collection or reduction can be scheduled without worrying about the impact on AIM-SUMEX users. We therefore propose and have budgeted an augmentation of our existing mini-computer system as a stand-alone data reduction facility.

The biomedical community (see User Community, Sec. I.B.4 above) desiring access to our facilities for structure elucidation has a variety of problems, some of which can be solved by existing instrumentation and computer techniques, as noted above. However, many problems consist of complex mixtures of compounds where analysis by conventional GC/LRMS does not lead to unambiguous solutions, and separation of components on a preparative scale for other spectroscopic analysis is difficult (e.g., see marine sterols, section D, below). These problems are amenable to attack by a system comprised of a GC/HRMS combination, the GC providing separation, coupled with the MS operating at high resolution to provide elemental compositions. Thus, upgrading of our current system so that GC/HRMS data can be provided on a routine basis is a desirable, and we believe necessary, step to solve many of these problems.

We propose to continue the development of the GC/HRMS system while maintaining existing capabilities of routine HRMS analysis and GC/MS where this efficiently responds to local needs. Many members of the user community will require, in addition to GC/HRMS, HRMS analysis of relatively pure compounds or mixtures of small numbers of compounds.
We will provide this capability on an interim basis, using Stanford's IBM 370/158 system while the PDP-11/20 system is being upgraded.

We were able, using the ACME computer facility, to start evaluating the operation of a GC/MS system at high mass resolutions. These experiments were hampered somewhat by the limitations of the computer system used to acquire the data (only occasional, single scans were possible); they were necessarily discontinued (as well as all HRMS operation!) upon the termination of ACME. We do have, however, some benchmark figures for the performance of the proposed system. Mixtures of fatty acid esters (e.g., methyl palmitate and methyl stearate) gave good quality mass measurements (±10 ppm) over a dynamic range of 100:1 for sample sizes of the order of 0.5-1.0 micrograms/component during 190 sec/decade in mass scans (resolving powers 5,000-8,900).

We are haltingly continuing our evaluation of the GC/HRMS system even without a data system, making measurements on individual ions of the mass standard and known materials in the GC effluent. These data can be approximately translated into expectations during dynamic scanning. We have performed an extensive series of measurements on both methyl stearate and cholesterol (not derivatized), the latter compound being more representative of our current research problems. These measurements tend to confirm the preliminary data described above. Firmer data will be available subsequent to the submission of this proposal.

We propose to operate our existing GC/MS system under high resolution conditions, aiming toward optimization of resolving powers, scan rates and GC and molecular separator operating conditions to determine the maximum usable sensitivity of the system.
We recognize that the ultimate sensitivity will not approach that attainable by photographic methods of recording; we feel that the ability for on-line operation and evaluation of the operating conditions of the MS partially offsets the sensitivity disadvantages. We realize that some structure elucidation problems will not be amenable to study because of the sensitivity limitations; we feel, however, that many problems of interest to the User Community can be studied effectively with this performance capability.

Rather than propose a research program to increase the sensitivity of high resolution mass spectrometers (e.g., McLafferty, et al., Anal. Chem., 44, 2282 (1972), dynamic rescanning of peaks; Jet Propulsion Laboratory - chemical multiplier emission/detector arrays, private communication to T. Rindfleisch), we propose to identify our limitations and, with our collaborators, use discretion in selecting and preparing samples. Further accelerations of technical capability to meet the state of the art in sensitivity will require investments in hardware that can be better justified at a later stage of a successful facility program. Meanwhile, other laboratories can be expected to make significant contributions to this important problem. Practical regard for budget limitations is the main reason we do not press this issue ourselves at the present time.

Significant improvements in sensitivity (with only small decreases in mass measurement accuracy) can be achieved by operating the MS at reduced resolving powers coupled with intelligent analysis of the resulting data to detect and resolve the potentially greater number of overlapping peak envelopes. This proposal is not entirely new (e.g., see Smith, et al., Anal. Chem., 43, 1796 (1971); Burlingame, et al., in "Computers in Analytical Chemistry," C.R. Orr and J.A. Norris, Ed., Progress in Analytical Chemistry, Vol. 4, Plenum Press, New York, N.Y., 1970, Chap. III).
We can, however, significantly extend these earlier techniques by utilization of our multiplet resolution algorithm. This algorithm, embodied in a computer program, has been shown to increase the effective resolving power of the MS up to a factor of three. It bases its operation on a dynamic model of peak shape computed directly from the data. For computational efficiency and to avoid spurious information, this algorithm would be best implemented as a post-processor, basing its search for multiplets on the results of prior elemental composition determination.

The ability to detect and analyze for unresolved peaks is mediated by consideration of the mass measurement accuracy of an MS system. These systems are capable of determining peak positions (and thus masses) to a small fraction of the peak width. The high accuracy of such measurements (±2-10 ppm) can, in fact, be utilized to detect and "resolve" multiplets in instances where the unresolved species are known precisely (see Burlingame, et al., ref. above, for CH vs. 13C doublet detection and resolution). For instances where the heteroatom content of a molecule is known or where the possibilities are reduced severely by chemical, spectroscopic and mass measurement heuristics, there may be a range of possible overlapping ions resulting from fragmentation of the molecule. These potential overlaps may be computed and then used (in combination with the known resolving power and mass measurement accuracy of the MS and the measured mass of the peak, assuming it was comprised of only one type of ion) to direct the multiplet resolution program.

As an example, we have computed the possible mass doublets for various ranges of compositions (Lederberg, et al., to be published). A sample table for C, N, O =< 4 is appended (Table 1). Only 28 of the 364 possibilities are shown, namely those whose mass difference (e) < .05 mass units.
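The arithmetic behind such a doublet table can be sketched in a few lines. This is a minimal illustration, not the program used to produce Table 1. It assumes standard monoisotopic masses (C = 12 exactly, H = 1.007825, N = 14.003074, O = 15.994915), and the function names `exact_mass`, `doublet_difference` and `required_resolving_power` are hypothetical, chosen only for this example. It computes the mass difference e between two compositions of the same nominal mass and the resolving power M/|e| needed to separate them at mass M.

```python
# Monoisotopic masses on the 12C = 12 scale (standard values, rounded).
MASS = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915}


def exact_mass(composition):
    """Exact mass of a composition given as {element: count}."""
    return sum(MASS[element] * count for element, count in composition.items())


def doublet_difference(left, right):
    """Mass difference e = mass(left) - mass(right), in mass units."""
    return exact_mass(left) - exact_mass(right)


def required_resolving_power(mass, e):
    """Resolving power M / |e| needed to separate a doublet at mass M."""
    return mass / abs(e)


if __name__ == "__main__":
    # Two nitrogen-free doublets, written as (label, left, right).
    doublets = [
        ("C3 vs. H4O2", {"C": 3}, {"H": 4, "O": 2}),
        ("C4 vs. O3", {"C": 4}, {"O": 3}),
    ]
    for label, left, right in doublets:
        e = doublet_difference(left, right)
        power = required_resolving_power(150, e)
        print(f"{label}: e = {e:+.4f}, resolving power at M=150 = {power:.0f}")
```

With these masses, C3 vs. H4O2 gives e of about -0.021 and C4 vs. O3 gives about +0.015. A doublet with e = .03 at mass 150 calls for a resolving power of 5000.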
Of these 28, 13 show e > .03 and would be fairly easy to resolve, requiring 1/5000 resolution at MW=150. At the other extreme, 5 doublets show e < .01 (CN4 vs. H4O4; C2H2O vs. N3; C2N2 vs. H4O3; C3N vs. H2O3; and C4 vs. H2NO2) which would demand special treatment for resolution. The 10 doublets for which .01 =< e =< .03 pose the interesting challenges for tradeoff of resolution vs. sensitivity in the context of given problems. For example, if N is absent, the only ambiguities are C3 vs. H4O2 (e = -.02) and C4 vs. O3 (e = .015).

Much as we would wish always to have unambiguous empirical formulas for all ions, HRMS remains a valuable tool despite these limitations. As shown by these examples, even moderate resolution reduces the number of candidates to a manageably small number of alternatives. Contextual and internal data (within the spectrum) can be used to trim these further at two levels: (a) pooling of peak statistics to sharpen decision probabilities on the presence of heteroatoms -- the fragments are subsets of the molecule -- and (b) the assemblage of candidate solutions under each of the alternative formulae. Manifestly, computer processing can sort out branches of decision trees that would soon exhaust human patience. These heuristics are built into the DENDRAL programs (solutions based on fragmentation theory), but are also applicable to table look-up approaches.

We (ref. 29, 33), and others (e.g., H.-K. Wipf, et al., J. Amer. Chem. Soc., 95, 3369 (1973)) have illustrated the importance of metastable ion determinations in automated structure elucidation based on MS data. Data on metastable ions must be judiciously selected because of the time and sample normally required to perform the measurements. Our programs are now capable of precise specification of those experiments necessary and sufficient to distinguish among a set of candidate structures. We seek more efficient ways of acquiring these selected instrumental data.
This can be accomplished with minimal cost by developing the hardware and software necessary to perform (defocussed) metastable scans and calculate the data. Much of the hardware, except an accurate sensor for accelerating voltage, already exists. We have had considerable experience in peak detection on the software side; the calculations to determine transitions are simple. It is assumed that the operator would manually adjust the instrument to the desired "daughter" mass prior to initiation of the scan of metastable origins ("parents") of this daughter.

The recent availability of reversed-geometry instruments has provided new methods of metastable defocussing (e.g., Beynon, et al., Anal. Chem., 45 (12), 1023A (1973)). We have illustrated the power of these techniques in mixture analysis (ref. 69). No "normal" geometry instrument is equipped to perform these measurements to determine all the daughters of a given parent, information which is frequently more useful than the converse. This information can be obtained, in principle, by synchronous variation of two of the three fields (magnetic/accelerating/electrostatic deflection) in a very accurate way. We would like to explore this possibility because we feel that this technique, if feasible, would represent a significant upgrading of the many standard geometry, double-focussing instruments in existence.

B. Computer Assisted Structure Elucidation

As mentioned above, some existing programs can be used immediately for structure elucidation problems using MS data. The programs have been described in detail elsewhere and are mentioned in the section on existing capabilities (Sec. I.B.E, above). The Planner's performance, for example, is excellent precisely in the areas where MS, by itself, is capable of definitive structure analysis. The general intellectual flexibility of the human chemist is beyond the reach of plausible programs.
On the other hand, where the history of a sample is known, so as to restrict the potential classes of compounds, and for classes where the rules of MS fragmentation are well understood, the program's performance matches that of trained mass spectroscopists; the program also offers some advantages in its exhaustive and rapid analysis of the data. Many structure elucidation problems of the user community fit into this category and existing resources can fulfill these needs.

Whether man- or computer-implemented, MS cannot solve all structure elucidation problems, however. In such cases, recourse is to other spectroscopic techniques if sample size permits. As described in the introductory section, diverse information is pieced together to achieve a solution. Interactive computer programs can assist in segments of this procedure, with the advantages of exhaustive evaluation of the data and the molecular structures suggested by these data.

In our own and in planned collaborative work, we will call upon the extensive facilities of the chemistry department for acquisition of additional spectroscopic data. These services are financed by fees, paid from existing research grants of the user community. There are sufficient documented examples of structure elucidation problems to obviate the requirement for extensive use of these additional facilities in development of the programs. On the other hand, the intensive pursuit of mechanized "intelligence" in the domain of MS requires more than availability of public MS data. It requires the collaboration of skilled chemists actively engaged in practical MS research and, at the same time, committed to the exploration of innovations in the application of AI to the solution of the problems.

As in the past, we will develop the computer programs through close collaboration among Drs. Duffield and Smith (and other members of their groups) and the program designers and programmers.
For us, this means daily consultation for discussion of strategy, extensions to the program, and solutions to new problems. In particular, we propose to continue software development (on the AIM-SUMEX facility) as follows:

1) The recently completed structure generating algorithm will be the core of our efforts to assist in structure elucidation. The structure generator can guarantee that the correct solution is somewhere in the list of possibilities. Additional programs, such as the Planner, allow us to avoid exhaustive generation in practice. Some parts of the cyclic structure generator program have not been extensively tested yet, and these tests will be the first task to complete.

2) The structure elucidation task is strongly directed toward rejection of whole categories (e.g., compound classes) of solutions as quickly as possible by using as much knowledge about the chemical history or characteristics of a sample as is available. Details of spectroscopic data then define the molecular framework more precisely. Each step in this procedure represents the application of constraints on the set of possible solutions. Computational efficiency demands that these constraints be applied early in the generation process when the structure generator is utilized.

We have made some effort to examine the kinds of constraints used by scientists engaged in structure elucidation. We have begun designing strategies so that these constraints can be brought to bear on the structure generator. Some of these strategies involve minor changes to the existing program; others require significant extensions of existing generating functions. One approach which seems particularly attractive to us is presently under development. This approach will utilize the existing structure generator, with some modifications, to generate a dictionary of cyclic skeletons up to those containing a maximum of twelve tertiary vertices.
The dictionary will be a complete, irredundant list of ring systems which contain no multiple bonds and no cut-edges (acyclic parts). This dictionary will be organized and keyed so that many constraints can be implemented easily. The dictionary will allow exhaustive specification of ring systems with double bonds and/or aromaticity. The rings themselves can be labelled with heteroatoms to generate heterocyclic ring systems, or with acyclic radicals to generate substituted ring systems. The existence of the dictionary will lead to greater computational efficiency as it needs to be generated only once, and specific configurations of rings (numbers, sizes, fusions) can be pulled immediately from the dictionary.

We propose to continue these investigations so that a reasonable variety of constraints can be recognized and utilized effectively by a computer program. This represents the first step toward increasing the chemical knowledge of a program which views molecular structures and their manipulation as mathematical entities and transforms.

3) At present, effective use of the structure generator or its subroutines for special problems requires a detailed knowledge of the program. We propose to develop an interface between users and the program to remove this requirement. The interface would contain elements of structure input and display routines and a simple language for application of constraints. Portions of these elements are available from other workers (e.g., Richard Feldman, NIH) and we would draw on these sources whenever possible.

4) We propose that initial efforts will be directed toward a system where the scientist examines his own data and inputs his findings (in terms of allowed and disallowed structural features) to the program as constraints. The generator would then provide a list of possible solutions to be evaluated, followed by iteration on this procedure.
5) Many structure elucidation problems can be characterized as assembly of sub-structures inferred from spectroscopic data into complete molecular structures. Although there are two instances in the literature describing programs with the capability to solve this problem (see S. Sasaki, "Determination of Organic Structures by Physical Methods, Vol. 5," F.C. Nachod and J.J. Zuckerman, Ed., Academic Press, New York and London, 1973, p. 285; M.E. Munk, C.S. Sodano, R.L. McLean, and T.H. Haskell, J. Amer. Chem. Soc., 89, 4158 (1967)), we do not feel these approaches fulfill the requirements for generating complete lists of structures and avoiding duplicate structures. We have some strategies to solve this problem, thus extending the scope of the generator while tying it more closely to the methods used by chemists engaged in structure elucidation. Our existing structure generator has this capability; as long as the sub-structures are connected only by a single bond, no new rings are formed.

6) We wish to implement general routines for finding molecular ions from spectroscopic data in order to improve the general power of the Planning program. The current Planning program depends on having some metastable ion information with HRMS data, together with knowledge of the structural class with special rules for the class. We will incorporate strategies suggested by Biemann (K. Biemann and W.J. McMurray, Tet. Lett., 647 (1965)) and McLafferty (R. Venkataraghavan, F.W. McLafferty, and G.E. Van Lear, Org. Mass Spectrom., 2, 1 (1969)) for finding molecular ions, but also give the program the flexibility to use class-specific information when available. The procedure will be to use these kinds of information within a general heuristic search paradigm.

7) The section on aims indicated some longer-term directions which might be pursued.
Of these, we feel that the incorporation of three-dimensional information into the program is perhaps most important (e.g., representation of three-dimensional information, molecular modelling including steric factors). Lederberg has previously discussed ways (Ref. 1) in which three-dimensional information can be considered in the generation and representation of molecular structures. More recently, the work of Wipke (J. Amer. Chem. Soc., in press; personal communication) in connection with computer assisted organic synthesis has provided important results which we would attempt to utilize to avoid unnecessary duplication of effort. We plan to collaborate with Dr. G. Loew (Stanford Genetics Dept.) to utilize her available programs on molecular orbital methods to determine local minima for conformations.

Another longer term goal which we feel is both interesting and important is the use of an extended Predictor (which we have previously described in the context of MS) to assist in distinguishing among potential solutions to a structure elucidation problem. We have recently carried out some extensions to the existing Predictor by incorporating the ability to suggest metastable defocussing experiments. Further extensions to include knowledge about other spectroscopic techniques and the information which can be elicited from these techniques are clearly feasible and could be a powerful extension to our computer assistance efforts.

C. Theory Formation

One important aim of this project is to improve the existing theory formation capabilities and thus provide more assistance to scientists investigating regularities within classes of compounds. This is a theory formation task at a very pragmatic level. The MS theory that the program attempts to find is of the same form as the one practicing mass spectroscopists use for structure elucidation. Thus, resulting pieces of theory are extensions to both the scientists' theory and the computer's theory of the discipline.
To improve this program we need to complete the Plan-Generate-Test program that has been started (as described in the appended annual report) and tune it over many test cases. We also wish to make the programs interactive and easy to use so that they are more readily accessible. This can be done when the programs are transferred to the AIM-SUMEX facility. We plan to apply the theory formation program to two different kinds of data: (a) the data collected in the interest of understanding the mass spectrometry of a particular class of compounds, as was done for estrogenic steroids, and (b) collections of diverse data that may provide some insight into more general fragmentation mechanisms. For example, we hope to find general rules analogous to the alpha-cleavage rule or the stability of aromatic rings.

The INTSUM program mentioned in Section (1) is the planning phase of the theory formation program. It currently runs in batch mode on Stanford's 360/67 computer. We wish to add an interactive monitor to INTSUM to give an investigator the ability to set up his own conditions for interpreting the mass spectra and to control the type of summary he wishes to see. For example, if he is interested in the allowable hydrogen transfers associated with one specific process, the program could be instructed to produce a very specific summary. Also, we wish to add an interactive program for answering questions about the results. For example, an investigator should be able to find out easily how many processes involve cleavage of a specific bond and how strong their resulting MS peaks are.

The INTSUM program is now used routinely by mass spectroscopists at Stanford engaged in investigations of the mass spectrometric fragmentation of various classes of organic compounds, primarily steroids. A manuscript is now in preparation (Ref. 54) describing the fragmentation of progesterone and related compounds. The program was used extensively in this work.
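The kind of question such an interactive monitor might answer (how many processes cleave a given bond, and how strong are the resulting peaks) can be sketched in a few lines. The sketch is illustrative only: it is written in Python rather than in the language of the actual programs, and the record layout, the bond labels, the example numbers and the function name `processes_cleaving` are all hypothetical, not taken from INTSUM.

```python
from dataclasses import dataclass
from statistics import mean

@dataclass(frozen=True)
class Process:
    """One fragmentation process in a (hypothetical) summary table."""
    bonds: frozenset     # bonds cleaved, each an unordered pair of atom numbers
    h_transfer: int      # hydrogens transferred to (+) or from (-) the ion
    intensity: float     # relative peak intensity, in percent

def bond(a, b):
    """Order-independent label for the bond between atoms a and b."""
    return frozenset((a, b))

def processes_cleaving(summary, target):
    """Count the processes that cleave `target`; report mean and max intensity."""
    hits = [p for p in summary if target in p.bonds]
    if not hits:
        return 0, 0.0, 0.0
    strengths = [p.intensity for p in hits]
    return len(hits), mean(strengths), max(strengths)

# Made-up example data, for illustration only.
summary = [
    Process(frozenset({bond(9, 10), bond(6, 7)}), 0, 12.0),
    Process(frozenset({bond(9, 10), bond(5, 6)}), 1, 4.0),
    Process(frozenset({bond(13, 17)}), -1, 2.5),
]

count, avg, top = processes_cleaving(summary, bond(9, 10))
print(count, avg, top)
```

Keeping each process as a flat record makes other questions (for example, restricting to a given number of hydrogen transfers) a matter of adding one more filter condition.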
We are now beginning a detailed examination of the fragmentation of steroids related to the androstane skeleton, particularly the biologically important testosterones. We propose to continue to use the INTSUM program in its present form, and as it is improved, in support of these studies.

The generator of rules that we now have, RULEGEN, does a credible job of explaining the regularities summarized by INTSUM. It has found, for example, the well-known alpha-cleavage fragmentation process and beta cleavage followed by rearrangement in the low resolution data for fifteen aliphatic amines. The program will be extended in two important ways to increase its utility: (i) the program needs to be able to work with an increased number of descriptive predicates in the generation of rules, and (ii) it needs to be given a more flexible representation of complex fragmentation mechanisms so that it can more easily find rules involving more than two bonds. We will continue working with low resolution MS data of the 150-200 monofunctional aliphatic compounds studied previously in the context of the performance program. These compounds are well-understood and thus provide a good test of the program's effectiveness. In order to ensure generality in the theory formation programs, we will also test the system against the high resolution mass spectra of the 68 estrogenic steroids. Since they are also well-understood, these compounds will show how well the program can deal with complex ring systems, multifunctional compounds, cleavages involving more than two bonds, and high resolution data.

The existing programs are in good working order - within definite limits - so we expect to apply them to new sets of data from the MS laboratory as interest arises. For example, as the high and low resolution MS from marine sterols are collected we expect to use INTSUM and RULEGEN (at least) to assist in the interpretation and generalization of these data.
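As an illustration of the kind of rule mentioned above for aliphatic amines, alpha-cleavage can be written as a small graph operation: break a carbon-carbon bond adjacent to the carbon bearing nitrogen and keep the nitrogen-containing piece as the ion. The Python sketch below is not the RULEGEN representation. The molecule encoding (heavy atoms with attached hydrogen counts), the use of nominal masses and all function names are assumptions made for this example, and it handles acyclic molecules only.

```python
NOMINAL_MASS = {"C": 12, "N": 14, "O": 16, "H": 1}

def fragment_atoms(n_atoms, bonds, start, broken):
    """Atoms still connected to `start` once the bond `broken` is removed."""
    adj = {i: set() for i in range(n_atoms)}
    for a, b in bonds:
        if {a, b} != set(broken):
            adj[a].add(b)
            adj[b].add(a)
    seen, stack = {start}, [start]
    while stack:
        for nxt in adj[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def alpha_cleavage_ions(atoms, bonds):
    """Nominal m/z values of ions formed by alpha-cleavage next to nitrogen.

    atoms: list of (element, attached_hydrogens); bonds: list of index pairs.
    """
    neighbours = {i: set() for i in range(len(atoms))}
    for a, b in bonds:
        neighbours[a].add(b)
        neighbours[b].add(a)
    ions = set()
    for n, (element, _) in enumerate(atoms):
        if element != "N":
            continue
        for alpha in neighbours[n]:
            if atoms[alpha][0] != "C":
                continue
            for beta in neighbours[alpha] - {n}:
                if atoms[beta][0] != "C":
                    continue
                # Break the alpha-beta bond; the piece holding N is the ion.
                kept = fragment_atoms(len(atoms), bonds, n, (alpha, beta))
                mass = sum(NOMINAL_MASS[atoms[i][0]]
                           + NOMINAL_MASS["H"] * atoms[i][1] for i in kept)
                ions.add(mass)
    return sorted(ions)

# n-Butylamine, CH3-CH2-CH2-CH2-NH2
butylamine = [("C", 3), ("C", 2), ("C", 2), ("C", 2), ("N", 2)]
butyl_bonds = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(alpha_cleavage_ions(butylamine, butyl_bonds))      # [30]

# Diethylamine, CH3-CH2-NH-CH2-CH3
diethylamine = [("C", 3), ("C", 2), ("N", 1), ("C", 2), ("C", 3)]
diethyl_bonds = [(0, 1), (1, 2), (2, 3), (3, 4)]
print(alpha_cleavage_ions(diethylamine, diethyl_bonds))  # [58]
```

A rule involving more than two bonds, or one followed by rearrangement, would need a richer description than this single bond-breaking step.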
Since these problems will advance the state of knowledge of MS, it is not correct to look on them as test problems. However, in the past the programs developed most rapidly when they were applied to unsolved problems of interest to our colleagues in the chemistry department.

For development of the interactive programs, we will rely heavily on the criteria of acceptability by Stanford users. The programs themselves will be written in INTERLISP on the SUMEX computer. Initially, we will provide interactive access to the control parameters of the programs in order to allow users to tailor their runs to their immediate interests. Later we hope to expand these to allow interrogation of the programs with respect to both the contents of the results and the program's reasoning steps.

D. Applications to Biomedical Problems

We can immediately offer to the user community the Planner, for analysis of HRMS in terms of molecular structure. The program is insensitive to the source of the MS data, and we foresee significant use of the program for analysis of spectra of mixtures without prior separation and spectra from the GC/HRMS facility without additional programming effort. Examples of application areas are summarized below.

We wish to exploit our existing capabilities for the analysis of biological mixtures without prior separation (ref. 33). This approach will prove particularly useful in studies of mixtures which are difficult to separate and analyze by GC. Phytoecdysones related to ecdysone, an insect molting hormone, present such a problem. GC of these compounds is very difficult, although high-pressure liquid chromatography has recently been used to carry out separations. This class of compounds represents an interesting and valuable test case for our combined MS and computer techniques, particularly the specification and subsequent acquisition of metastable defocussing data for precise linking of parent and fragment ions in the spectrum of a complex mixture (refs.
28, 33). Model compounds, mixtures and current structure elucidation problems are available (Nakanishi, Columbia; Takemoto, Tohoku University, Sendai, Japan). Although most users cannot be completely specific as to the nature of their future structure elucidation problems, we feel that several of these problems can be handled by such an approach.

As the structure generator and its extensions are developed further, we foresee continuing use of an interactive version applied to specific problems of the user community. As an example, the work in collaboration with the GRC project will involve studies of several classes of compounds extracted from human body fluids (e.g., aromatic and aliphatic acids, various classes of bases, amino acids and carbohydrates) which contain representatives varying by substitutions about a small number of molecular skeletons. The generator can define all isomers which must be considered as possible solutions.

For those problems which are amenable to attack by library search procedures, e.g., screening of GC/LRMS runs of marine sterols to weed out known compounds, we propose to use these procedures and to investigate extensions to them. Using a procedure related to that described by McLafferty (K-S. Kwok, et al., J. Amer. Chem. Soc., 95, 4185 (1973)), we seek to determine from modified library search techniques the known structures which yield similar spectra. Utilizing the DENDRAL structural manipulation routines, we would then seek to determine those related structures (whose spectra are not in the library) which are possible solutions. A library, including Wiswesser Line Notation names, exists (F. W. McLafferty, private communication) and would be of some utility in this work.

The MS facility in conjunction with our programs will be used in studies of the following nature:

1) Prof.
Djerassi - we plan use of the MS facilities and computer programs in ongoing research connected with existing NIH-supported studies on steroids and marine sterols and continued collaboration with Prof. Adlercreutz on estrogen mixtures isolated from body fluids. Further collaboration with Prof. Adlercreutz will be on structural studies of new estrogen metabolites whose presence in mixtures has been inferred through our previous collaborative efforts. The work on marine sterols presently utilizes GC/LRMS and frequently laborious separation procedures to isolate individual fractions for HRMS analysis. GC/HRMS will be a significant assistance in this effort. We plan MS studies of known marine sterols (utilizing INTSUM) to derive fragmentation rules, which then will be used in the Planner to aid structure elucidation of new compounds. We also plan further work on extensions of MS theory in the steroid field, initially focussed on additional biomedically important classes of steroids related to the pregnane (progesterones) and androstane (testosterones) skeletons. This work is currently being carried out by Dr. Smith in collaboration with two visiting senior scientists (Dr. Roy Gritter, Dr. Geoff Dromey) currently on sabbatical leave fellowships.

2) Chemistry Department Collaborators - as indicated by the responses summarized in the letters of interest (Appendix A), there is significant interest in use of the MS facility by other NIH-supported members of the chemistry department. All those listed are familiar with the technique of MS as applied to structure elucidation problems. Most have used MS frequently, particularly Prof. Van Tamelen in his studies of the cyclization of squalene and related studies in the terpenoid and steroid field. The interests of these collaborators are generally in HRMS and GC/HRMS, with occasional use of other capabilities of the system.
The types of compounds studied by this group and an indication of the amount of use expected are summarized in the letters of interest.

3) Genetics Research Center (GRC): (Profs. J. Lederberg, H. Cann; Dr. A. Duffield) The body fluids analyzed by GC/LRMS to date include urine, blood, amniotic fluid and cerebrospinal fluid. Each body fluid is fractionated into the following compound classes: a) organic acids and neutral compounds, b) amino acids, c) carbohydrates, which after appropriate derivatization are analyzed by the GC/LRMS/computer system. A library of known LRMS will serve as the primary means of identifying metabolites from their experimentally recorded LRMS. In those instances where the LRMS is insufficient for metabolite identification, GC/HRMS data will be necessary to determine the composition of all ions in its mass spectrum. These data will greatly enhance the prospects of identifying the metabolite in question.

It is known (on past performance) that if a compound is present in body fluids at the level of 1 microgram per GC peak then good quality HRMS will be recorded (ion amplitude dynamic range of 1:100, mass accuracy of +-5 ppm) using the Varian MAT 711 mass spectrometer. If the GC peak of interest contains insufficient material for a HRMS scan then preparative GC could be used to concentrate that portion of the chromatograph effluent prior to GC/HRMS.

Prior to the demise of the ACME computer system (July 31, 1973) we developed a GC/HRMS system and applied it to the analysis of extracts from body fluids. The following example represents results obtained with this system during its development. The example used was a routine analysis and was run to determine the capability of the overall system during its development and not as an unknown sample of extreme interest. The total ion plot recorded during the lifetime of the GC/HRMS analysis of an amniotic fluid is reproduced as Figure 1.
A complete high resolution scan was recorded on each of the peaks shown in Figure 1. Filing time of the time-shared ACME computer system did not allow the system to operate in a repetitive scan mode. For the sake of brevity only the GC/HRMS scan (# 1594, Figure 2) corresponding to the glutamic acid N-TFA O-n-butyl ester derivative is reproduced. (The corresponding GC/LRMS scan is Figure 3.) The scan time per decade of mass was 10.5 seconds, the resolution 6,500 and the matching tolerance for the assignment of empirical composition set to 4 mmu. The results show that the system was capable of accurate mass measurement with a dynamic range in ion amplitude of about 33:1 in this instance.

The cessation of computer support for the GC/HRMS system did not allow a HRMS analysis to be made which was crucial to the identification of a metabolite present in a body fluid. Since that time, however, several instances have arisen where GC/HRMS data would have been collected in an effort to identify metabolites not previously seen.

The sample throughput in the GRC project with existing personnel is expected to approach 5 to 7 body fluids per week (15-21 GC/LRMS fractions to be run in the Genetics Department per week). On average GC/HRMS would be required on 1 - 2 samples per week.

The research interests of the Medical School collaborators relative to the proposed MS resource are summarized in the letters of interest (Appendix A).
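The 4 mmu matching tolerance quoted for the GC/HRMS example can be related to a relative accuracy in ppm, and the assignment of empirical compositions within such a tolerance can be sketched by brute force. The Python below is only a sketch under stated assumptions: the element set (C, H, N, O), the atom-count limits, the test mass and the function names are chosen for this example and are not those of the laboratory's data reduction programs; the electron mass and valence rules are ignored.

```python
from itertools import product

# Monoisotopic masses of the most abundant isotopes (12C = 12 exactly).
MASS = {"C": 12.0, "H": 1.007825, "N": 14.003074, "O": 15.994915}

def mmu_to_ppm(delta_mmu, mz):
    """Express an absolute mass error in millimass units as ppm of m/z."""
    return delta_mmu * 1e3 / mz

def compositions(measured, tol_mmu=4.0, limits=None):
    """List (formula, error in mmu) for C/H/N/O compositions within tolerance."""
    limits = limits or {"C": 5, "H": 12, "N": 4, "O": 4}  # assumed search bounds
    elements = list(MASS)
    hits = []
    for counts in product(*(range(limits[e] + 1) for e in elements)):
        mass = sum(n * MASS[e] for e, n in zip(elements, counts))
        error_mmu = (mass - measured) * 1e3
        if abs(error_mmu) <= tol_mmu:
            formula = "".join(f"{e}{n}" for e, n in zip(elements, counts) if n)
            hits.append((formula, round(error_mmu, 2)))
    # Best match (smallest absolute error) first.
    return sorted(hits, key=lambda h: abs(h[1]))

print(mmu_to_ppm(4.0, 200.0))   # 4 mmu at m/z 200 corresponds to 20 ppm
for formula, err in compositions(60.0211):
    print(formula, err)
```

A tolerance window alone can admit more than one composition for a given measured mass; any further filtering of the candidates is outside this sketch.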
The MS services required by the Medical School community will include GC/LRMS (Forrest, Sera, Kalman for drug and drug metabolite identification; Rabinowitz and Wilkinson for prostaglandin identification; Robin for identification of oxidized/reduced redox pairs; Hollister for marihuana metabolites; Barchas for neurotransmitters; Fair for polyamines and the prostatic antibacterial factor in urine) and GC/HRMS (Trudell for drug metabolite identification; Kvenvolden for structure of amino acids and related compounds; plus samples as required from interests described under GC/LRMS). In those instances where the biological extract contains insufficient material for a GC/HRMS scan, preparative GC, using existing instrumentation within the chemistry department, can be used to concentrate the material prior to the GC/HRMS analysis. If the material of interest is obtained relatively pure by this technique then HRMS analysis using direct sample insertion into the ion source would be utilized.

As mentioned above, several of the computer programs have immediate utility for assisting with structure elucidation problems. For example, the Structure Generator program can answer structural isomerism questions independently of mass spectrometry (e.g., to provide lists of isomers in conjunction with isomer interconversion problems such as carbonium ion rearrangements). Because the program will be able to generate complete lists of isomers with (or without) some specified structural features, a researcher can have confidence that no possibilities have been overlooked.

Some interest in the structure generator has been expressed by representatives of the pharmaceutical industry. The generator could be used to suggest complete sets of structural alternatives for possible synthesis, once a physiologically active congener has been identified.

In more general terms, the structure generator can be richly suggestive of new, unexplored areas of synthetic organic chemistry.
For example, the generator has been used by a graduate student in chemistry, Mr. Jan Simek, to identify the space of possible Diels-Alder condensation products consisting of six atoms of any combination of carbon, nitrogen, oxygen, and sulfur in a six-membered ring with one double bond. A literature search through the Ring Index revealed that many of the ring systems have never been reported.

IV. SIGNIFICANCE OF PROPOSED RESEARCH

Structure elucidation is an important and difficult problem for biomedical scientists. Many of them lack the detailed chemical background necessary to be efficient in this endeavor. Generally speaking, they also lack the frequently complex and expensive equipment (e.g., high resolution mass spectrometers) to provide spectroscopic data to assist them in solving problems of molecular structure. We plan to provide the chemical and analytical expertise to facilitate the solution of their structural problems.

This research aims at providing more powerful techniques for determining molecular structures than are now routinely available. In particular, we have proposed (a) providing extended MS services as a means of collecting powerful analytic data for scientists; (b) developing (and extending) sophisticated computer programs to assist with the interpretation of the data from mass spectrometry and elsewhere; (c) developing (and extending) novel computer programs to assist with formulation of the rules of interpretation; and (d) applying these state-of-the-art techniques to problems of biomedical relevance. Our research group is thus dedicated to a broad-based attack on the applications of structure elucidation to biological and biomedical problems.

The proposed research not only holds promise for significant long-term advances, it can have immediate benefits as well. Many members of the biomedical community at Stanford have called upon the MS laboratory for assistance in the past and will continue to do so in the future.
The proposed resource will provide the conduit for a substantial increase in the utilization of MS within the Stanford biomedical community. The ability of the proposed resource to interpret the experimental data it generates (enhanced by the close proximity of the resource and biomedical community) should result in a successful program of interdisciplinary research.

HRMS is an important source of data for these problems, and GC/HRMS is still more important. Previous investment by the NIH in the Varian MAT-711 HRMS system at Stanford can be utilized now and built upon for the future. Continued operation of the GC/MS system will give the Stanford community access to state-of-the-art spectroscopic techniques and to professional mass spectroscopists who can help with ongoing problems.

The computer programs themselves constitute a unique resource for assisting with structure determination. The previous NIH grant supported development of the programs. In part, we are requesting funds to exploit these programs.

One of the most significant aspects of this work is its interdisciplinary view of solving molecular structure problems by intelligently directed search of the space of chemical graph structures. As a result of posing the structure determination problem in this framework, we have been able to further the knowledge about structure elucidation in at least three ways. First, some of the knowledge used by analytical chemists has been made more precise for use in a computer program. Second, codifying such knowledge for the computer has led to the discovery of new research areas to extend our existing knowledge of MS. Several publications listed in the bibliography (Refs. 42 and following) are reports of exactly this kind of research. Finally, the computer's systematic search through the space of possible structures gives the practicing scientist the confidence that no structures were merely overlooked.
The efficiency of the program depends on the exclusion of many whole classes of compounds, but the computer will have rejected those classes using precise, explicitly stated criteria.

Our recent work on finding MS interpretation rules (theory formation) can provide additional unique capabilities for assisting with the problem solving. We wish to continue this research because it offers hope for a solution to the problem of furnishing real-world knowledge to computer programs -- in particular to the computer programs that assist with structure elucidation. This is a pressing problem in current AI research. High performance programs, of which DENDRAL is most often cited, derive their power from large stores of knowledge. Yet there are no routine methods for infusing such systems with knowledge of the task domain. We believe our research in theory formation holds a key to the solution of this problem.

V. FACILITIES & EQUIPMENT

The Stanford Mass Spectrometry Laboratory will provide MS services on the Varian MAT-711 mass spectrometer coupled with a Hewlett-Packard gas chromatograph (Model 7610A). As service instruments for more routine mass spectral analyses, the laboratory has an MS-9 and a CH-4 mass spectrometer.

Data reduction is currently provided on Stanford's IBM 370/158 computer in conjunction with a front-end PDP-11/20 data acquisition computer. (The PDP-11/20 presently has only the capability for buffering peak profile data between the mass spectrometer and the IBM 370/158 computer at the Stanford Computer Center.) An alternative to buying time on the 370/158 is proposed and discussed in the budget justification.

The AI programs will be run on the NIH-sponsored AIM-SUMEX computer facility (a PDP-10 computer with the TENEX operating system, 192K words of memory, and adequate peripherals for our purposes). Running these programs on SUMEX will incur no charge.

VI. BIBLIOGRAPHY

A. DENDRAL PUBLICATIONS

(1) J.
Lederberg, "DENDRAL-64 - A System for Computer Construction, Enumeration and Notation of Organic Molecules as Tree Structures and Cyclic Graphs", (technical reports to NASA, also available from the author and summarized in (12)).
(1a) Part I. Notational algorithm for tree structures (1964) CR.57029
(1b) Part II. Topology of cyclic graphs (1965) CR.68898
(1c) Part III. Complete chemical graphs; embedding rings in trees (1969)
(2) J. Lederberg, "Computation of Molecular Formulas for Mass Spectrometry", Holden-Day, Inc. (1964).
(3) J. Lederberg, "Topological Mapping of Organic Molecules", Proc. Nat. Acad. Sci., 53:1, January 1965, pp. 134-139.
(4) J. Lederberg, "Systematics of organic molecules, graph topology and Hamilton circuits. A general outline of the DENDRAL system." NASA CR-48899 (1965).
(5) J. Lederberg, "Hamilton Circuits of Convex Trivalent Polyhedra (up to 18 vertices)", Am. Math. Monthly, May 1967.
(6) G. L. Sutherland, "DENDRAL - A Computer Program for Generating and Filtering Chemical Structures", Stanford Artificial Intelligence Project Memo No. 49, February 1967.
(7) J. Lederberg and E. A. Feigenbaum, "Mechanization of Inductive Inference in Organic Chemistry", in B. Kleinmuntz (ed) Formal Representations for Human Judgment, (Wiley, 1968) (also Stanford Artificial Intelligence Project Memo No. 54, August 1967).
(8) J. Lederberg, "Online computation of molecular formulas from mass number." NASA CR-94977 (1968).
(9) E. A. Feigenbaum and B. G. Buchanan, "Heuristic DENDRAL: A Program for Generating Explanatory Hypotheses in Organic Chemistry", in Proceedings, Hawaii International Conference on System Sciences, B. K. Kinariwala and F. F. Kuo (eds), University of Hawaii Press, 1968.
(10) B. G. Buchanan, G. L. Sutherland, and E. A. Feigenbaum, "Heuristic DENDRAL: A Program for Generating Explanatory Hypotheses in Organic Chemistry". In Machine Intelligence 4 (B. Meltzer and D.
Michie, eds) Edinburgh University Press (1969), (also Stanford Artificial Intelligence Project Memo No. 62, July).
(11) E. A. Feigenbaum, "Artificial Intelligence: Themes in the Second Decade". In Final Supplement to Proceedings of the IFIP68 International Congress, Edinburgh, August 1968 (also Stanford Artificial Intelligence Project Memo No. 67, August 1968).
(12) J. Lederberg, "Topology of Molecules", in The Mathematical Sciences - A Collection of Essays, (ed.) Committee on Support of Research in the Mathematical Sciences (COSRIMS), National Academy of Sciences - National Research Council, M.I.T. Press, (1969), pp. 37-51.
(13) G. Sutherland, "Heuristic DENDRAL: A Family of LISP Programs", Stanford Artificial Intelligence Project Memo No. 80, March 1969.
(14) J. Lederberg, G. L. Sutherland, B. G. Buchanan, E. A. Feigenbaum, A. V. Robertson, A. M. Duffield, and C. Djerassi, "Applications of Artificial Intelligence for Chemical Inference I. The Number of Possible Organic Compounds: Acyclic Structures Containing C, H, O and N". Journal of the American Chemical Society, 91:11 (May 21, 1969).
(15) A. M. Duffield, A. V. Robertson, C. Djerassi, B. G. Buchanan, G. L. Sutherland, E. A. Feigenbaum, and J. Lederberg, "Application of Artificial Intelligence for Chemical Inference II. Interpretation of Low Resolution Mass Spectra of Ketones". Journal of the American Chemical Society, 91:11 (May 21, 1969).
(16) B. G. Buchanan, G. L. Sutherland, E. A. Feigenbaum, "Toward an Understanding of Information Processes of Scientific Inference in the Context of Organic Chemistry", in Machine Intelligence 5, (B. Meltzer and D. Michie, eds) Edinburgh University Press (1970), (also Stanford Artificial Intelligence Project Memo No. 99, September 1969).
(17) J. Lederberg, G. L. Sutherland, B. G. Buchanan, and E. A. Feigenbaum, "A Heuristic Program for Solving a Scientific Inference Problem: Summary of Motivation and Implementation", in R. Banerji & M.D. Mesarovic (eds.)
Theoretical Approaches to Non-Numerical Problem Solving, New York: Springer-Verlag, 1970. (Also Stanford Artificial Intelligence Project Memo No. 104, November 1969.)
(18) C. W. Churchman and B. G. Buchanan, "On the Design of Inductive Systems: Some Philosophical Problems". British Journal for the Philosophy of Science, 20 (1969), pp. 311-323.
(19) G. Schroll, A. M. Duffield, C. Djerassi, B. G. Buchanan, G. L. Sutherland, E. A. Feigenbaum, and J. Lederberg, "Application of Artificial Intelligence for Chemical Inference III. Aliphatic Ethers Diagnosed by Their Low Resolution Mass Spectra and NMR Data". Journal of the American Chemical Society, 91:26 (December 17, 1969).
(20) A. Buchs, A. M. Duffield, G. Schroll, C. Djerassi, A. B. Delfino, B. G. Buchanan, G. L. Sutherland, E. A. Feigenbaum, and J. Lederberg, "Applications of Artificial Intelligence for Chemical Inference. IV. Saturated Amines Diagnosed by Their Low Resolution Mass Spectra and Nuclear Magnetic Resonance Spectra", Journal of the American Chemical Society, 92, 6831 (1970).
(21) Y.M. Sheikh, A. Buchs, A.B. Delfino, G. Schroll, A.M. Duffield, C. Djerassi, B.G. Buchanan, G.L. Sutherland, E.A. Feigenbaum and J. Lederberg, "Applications of Artificial Intelligence for Chemical Inference V. An Approach to the Computer Generation of Cyclic Structures. Differentiation Between All the Possible Isomeric Ketones of Composition C6H10O", Organic Mass Spectrometry, 4, 493 (1970).
(22) A. Buchs, A.B. Delfino, A.M. Duffield, C. Djerassi, B.G. Buchanan, E.A. Feigenbaum and J. Lederberg, "Applications of Artificial Intelligence for Chemical Inference VI. Approach to a General Method of Interpreting Low Resolution Mass Spectra with a Computer", Helvetica Chimica Acta, 53, 1394 (1970).
(23) E.A. Feigenbaum, B.G. Buchanan, and J. Lederberg, "On Generality and Problem Solving: A Case Study Using the DENDRAL Program". In Machine Intelligence 6 (B. Meltzer and D. Michie, eds.) Edinburgh University Press (1971).
(Also Stanford Artificial Intelligence Project Memo No. 131.)
(24) A. Buchs, A.B. Delfino, C. Djerassi, A.M. Duffield, B.G. Buchanan, E.A. Feigenbaum, J. Lederberg, G. Schroll, and G.L. Sutherland, "The Application of Artificial Intelligence in the Interpretation of Low-Resolution Mass Spectra", Advances in Mass Spectrometry, 5, 314.
(25) B.G. Buchanan and J. Lederberg, "The Heuristic DENDRAL Program for Explaining Empirical Data". In proceedings of the IFIP Congress 71, Ljubljana, Yugoslavia (1971). (Also Stanford Artificial Intelligence Project Memo No. 141.)
(26) B.G. Buchanan, E.A. Feigenbaum, and J. Lederberg, "A Heuristic Programming Study of Theory Formation in Science." In proceedings of the Second International Joint Conference on Artificial Intelligence, Imperial College, London (September, 1971). (Also Stanford Artificial Intelligence Project Memo No. 145.)
(27) B.G. Buchanan, A.M. Duffield, A.V. Robertson, "An Application of Artificial Intelligence to the Interpretation of Mass Spectra", Mass Spectrometry Techniques and Applications, Edited by G. W. A. Milne, John Wiley & Sons, Inc., 1971, p. 121-77.
(28) D.H. Smith, B.G. Buchanan, R.S. Engelmore, A.M. Duffield, A. Yeo, E.A. Feigenbaum, J. Lederberg, and C. Djerassi, "Applications of Artificial Intelligence for Chemical Inference VIII. An Approach to the Computer Interpretation of the High Resolution Mass Spectra of Complex Molecules. Structure Elucidation of Estrogenic Steroids", Journal of the American Chemical Society, 94, 5962-5971 (1972).
(29) B.G. Buchanan, E.A. Feigenbaum, and N.S. Sridharan, "Heuristic Theory Formation: Data Interpretation and Rule Formation". In Machine Intelligence 7, Edinburgh University Press (1972).
(30) J. Lederberg, "Rapid Calculation of Molecular Formulas from Mass Values". Jnl. of Chemical Education, 49, 413 (1972).
(31) H. Brown, L. Masinter, L. Hjelmeland, "Constructive Graph Labeling Using Double Cosets". Discrete Mathematics (in press).
(Also Computer Science Memo 318, 1972.)
(32) B. G. Buchanan, Review of Hubert Dreyfus' "What Computers Can't Do: A Critique of Artificial Reason", Computing Reviews (January, 1973). (Also Stanford Artificial Intelligence Project Memo No. 181.)
(33) D. H. Smith, B. G. Buchanan, R. S. Engelmore, H. Adlercreutz and C. Djerassi, "Applications of Artificial Intelligence for Chemical Inference IX. Analysis of Mixtures Without Prior Separation as Illustrated for Estrogens". Journal of the American Chemical Society, 95, 6078 (1973).
(34) D. H. Smith, B. G. Buchanan, W. C. White, E. A. Feigenbaum, C. Djerassi and J. Lederberg, "Applications of Artificial Intelligence for Chemical Inference X. INTSUM. A Data Interpretation Program as Applied to the Collected Mass Spectra of Estrogenic Steroids". Tetrahedron, 29, 3717 (1973).
(35) B. G. Buchanan and N. S. Sridharan, "Rule Formation on Non-Homogeneous Classes of Objects". In proceedings of the Third International Joint Conference on Artificial Intelligence (Stanford, California, August, 1973). (Also Stanford Artificial Intelligence Project Memo No. 215.)
(36) D. Michie and B.G. Buchanan, "Current Status of the Heuristic DENDRAL Program for Applying Artificial Intelligence to the Interpretation of Mass Spectra". August, 1973.
(37) H. Brown and L. Masinter, "An Algorithm for the Construction of the Graphs of Organic Molecules", Discrete Mathematics (in press). (Also Stanford Computer Science Department Memo STAN-CS-73-361, May, 1973.)
(38) D.H. Smith, L.M. Masinter and N.S. Sridharan, "Heuristic DENDRAL: Analysis of Molecular Structure," Proceedings of the NATO/CNA Advanced Study Institute on Computer Representation and Manipulation of Chemical Information, in press.
(39) R. Carhart and C. Djerassi, "Applications of Artificial Intelligence for Chemical Inference XI: The Analysis of C13 NMR Data for Structure Elucidation of Acyclic Amines", J. Chem. Soc. (Perkin II), 1753 (1973).
(40) L. Masinter, N. Sridharan, and D.H.
Smith, "Applications of Artificial Intelligence for Chemical Inference XII: Exhaustive Generation of Cyclic and Acyclic Isomers.", submitted to Journal of the American Chemical Society.
(41) L. Masinter, N.S. Sridharan, R. Carhart and D.H. Smith, "Applications of Artificial Intelligence for Chemical Inference XIII: An Algorithm for Labelling Chemical Graphs", submitted to Journal of the American Chemical Society.
(42) The Determination of Phenylalanine in Serum by Mass Fragmentography. Clinical Biochem., 6 (1973). By W.E. Pereira, V.A. Bacon, Y. Hoyano, R. Summons and A.M. Duffield.
(43) The Simultaneous Quantitation of Ten Amino Acids in Soil Extracts by Mass Fragmentography. Anal. Biochem., 55, 236 (1973). By W.E. Pereira, Y. Hoyano, W.E. Reynolds, R.E. Summons and A.M. Duffield.
(44) An Analysis of Twelve Amino Acids in Biological Fluids by Mass Fragmentography. Anal. Chem., in press. By R.E. Summons, W.E. Pereira, W.E. Reynolds, T.C. Rindfleisch and A.M. Duffield.
(45) The Quantitation of B-Aminoisobutyric Acid in Urine by Mass Fragmentography. Clin. Chim. Acta, in press. By W.E. Pereira, R.E. Summons, W.E. Reynolds, T.C. Rindfleisch and A.M. Duffield.
(46) The Determination of Ethanol in Blood and Urine by Mass Fragmentography. Clin. Chim. Acta, in press. By W.E. Pereira, R.E. Summons, T.C. Rindfleisch and A.M. Duffield.

B. Publications Describing DENDRAL-Related Research But Not Funded By This Grant

(47) An Automated Gas Chromatographic Analysis of Phenylalanine in Serum. Clinical Biochem., 5, 166 (1972). By E. Steed, W. Pereira, B. Halpern, M. D. Solomon and A.M. Duffield.
(48) Pyrrolizidine Alkaloids. XIX. Structure of the Alkaloid Erucifoline. Coll. Czech. Chem. Commun., 37, 4112 (1972). By P. Sedmera, A. Klasek, A.M. Duffield and F. Santavy.
(49) Chlorination Studies I. The Reaction of Aqueous Hypochlorous Acid with Cytosine. Biochem. Biophys. Res. Commun., 48, 880 (1972). By W. Patton, V. Bacon, A.M. Duffield, B. Halpern, Y. Hoyano, W.
Pereira and J. Lederberg.
(50) A Study of the Electron Impact Fragmentation of Promazine Sulphoxide and Promazine using Specifically Deuterated Analogues. Austral. J. Chem., 26, 325 (1973). By M.D. Solomon, R. Summons, W. Pereira and A.M. Duffield.
(51) Spectrometrie de Masse VIII. Elimination d'eau Induite par Impact Electronique dans le Tetrahydro-1,2,3,4-Naphtalene-diol-1,2. Org. Mass Spectrom., 7, 357 (1973). By P. Perros, J.P. Morizur, J. Kossanyi and A.M. Duffield.
(52) Chlorination Studies II. The Reaction of Aqueous Hypochlorous Acid with a-Amino Acids and Dipeptides. Biochim. et Biophys. Acta, 313, 170 (1973). By W.E. Pereira, Y. Hoyano, R. Summons, V.A. Bacon and A.M. Duffield.
(53) Spectrometrie de Masse. IX. Fragmentations Induites par Impact Electronique de Glycols- En Serie Tetraline. Bull. Chim. Soc. France, 2105 (1973). By P. Perros, J.P. Morizur, J. Kossanyi and A.M. Duffield.
(54) The Use of Mass Spectrometry for the Identification of Metabolites of Phenothiazines. Proceedings of the Third International Symposium on Phenothiazines, Raven Press, New York, 1973. By A.M. Duffield.
(55) Chlorination Studies IV. The Reaction of Aqueous Hypochlorous Acid with Pyrimidine and Purine Bases. Biochem. Biophys. Res. Commun., 53, 1195 (1973). By Y. Hoyano, V. Bacon, R.E. Summons, W.E. Pereira, B. Halpern and A.M. Duffield.
(56) Mass Spectrometry in Structural and Stereochemical Problems. CCXXXVII. Electron Impact Induced Hydrogen Losses and Migrations in Some Aromatic Amides. Org. Mass Spectrom., in press. By A.M. Duffield, G. deMartino and C. Djerassi.
(57) Stable Isotope Mass Fragmentography: Quantitation and Hydrogen-Deuterium Exchange Studies of Eight Murchison Meteorite Amino Acids. Geochem. et Cosmochim. Acta, submitted for publication. By W.E. Pereira, R.E. Summons, T.C. Rindfleisch,