PUBLICATIONS: D. H. SMITH Page 2
11. The Lunar Sample Preliminary Examination Team, "Preliminary Examination of Lunar Samples from Apollo 12," Science, 167, 1325 (1970).
12. D. H. Smith, "Mass Spectrometry," Chapter X in Guide to Modern Methods of Instrumental Analysis, T. M. Gouw, Ed., Wiley-Interscience, New York, 1972.
13. D. H. Smith, R. W. Olsen, F. C. Walls and A. L. Burlingame, "Real-time Mass Spectrometry: LOGOS - A Generalized Mass Spectrometry Computer System for High and Low Resolution, GC/MS and Closed-Loop Applications," Anal. Chem., 43, 1796 (1971).
14. A. L. Burlingame, J. S. Hauser, B. R. Simoneit, D. H. Smith, K. Biemann, N. Mancuso, R. Murphy, D. A. Flory and M. A. Reynolds, "Preliminary Organic Analysis of the Apollo 12 Cores," Proceedings of the Apollo 12 Lunar Science Conference, E. Levinson, Ed., M.I.T. Press, Cambridge, Mass., 1971, p. 1891.
15. D. H. Smith, "A Compound Classifier Based on Computer Analysis of Low Resolution Mass Spectral Data," Anal. Chem., 44, 536 (1972).
16. D. H. Smith and G. Eglinton, "Compound Classification by Computer Treatment of Low Resolution Mass Spectra - Application to Geochemical and Environmental Problems," Nature, 235, 325 (1972).
17. D. H. Smith, N. A. B. Gray, C. T. Pillinger, B. J. Kimble and G. Eglinton, "Complex Mixture Analysis - Geochemical and Environmental Applications of a Compound Classifier Based on Computer Analysis of Low Resolution Mass Spectra," Adv. in Org. Geochem., 1971, p. 249.
18. D. H. Smith, B. G. Buchanan, R. S. Engelmore, A. M. Duffield, A. Yeo, E. A. Feigenbaum, J. Lederberg and C. Djerassi, "Applications of Artificial Intelligence for Chemical Inference, VIII. An Approach to the Computer Interpretation of the High Resolution Mass Spectra of Complex Molecules. Structure Elucidation of Estrogenic Steroids," J. Amer. Chem. Soc., 94, 5962 (1972).
19. D. H. Smith, A. M. Duffield and C. Djerassi, "Mass Spectrometry in Structural and Stereochemical Problems, CCXXII.
Delineation of Competing Fragmentation Pathways of Complex Molecules from a Study of Metastable Ion Transitions of Deuterated Derivatives," Org. Mass Spectrom., 7, 367 (1973).
20. P. Longevialle, D. H. Smith, H. M. Fales, R. J. Highet and A. L. Burlingame, "High Resolution Mass Spectrometry in Molecular Structure Studies, V. The Fragmentation of Amaryllis Alkaloids in the Crinine Series," Org. Mass Spectrom., 7, 401 (1973).
21. B. R. Simoneit, D. H. Smith, G. Eglinton and A. L. Burlingame, "Applications of Real-time Mass Spectrometric Techniques to Environmental Organic Geochemistry, II. San Francisco Bay Area Waters," Arch. Env. Contam. and Tox., 1, 193 (1973).
22. D. H. Smith, B. G. Buchanan, R. S. Engelmore, H. Adlercreutz and C. Djerassi, "Applications of Artificial Intelligence for Chemical Inference, IX. Analysis of Mixtures Without Prior Separation as Illustrated for Estrogens," J. Amer. Chem. Soc., 95, 6078 (1973).
23. D. H. Smith, B. G. Buchanan, W. C. White, E. A. Feigenbaum, J. Lederberg and C. Djerassi, "Applications of Artificial Intelligence for Chemical Inference, X. INTSUM - A Data Interpretation and Summary Program Applied to the Collected Mass Spectra of Estrogenic Steroids," Tetrahedron, 29, 3117 (1973).
24. G. Loew, M. Chadwick and D. H. Smith, "Applications of Molecular Orbital Theory to the Interpretation of Mass Spectra. Prediction of Primary Fragmentation Sites in Organic Molecules," Org. Mass Spectrom., 7, 1241 (1973).

SECTION II - PRIVILEGED COMMUNICATION
BIOGRAPHICAL SKETCH
(Give the following information for all professional personnel listed on page 3, beginning with the Principal Investigator. Use continuation pages and follow the same general format for each person.)
NAME; TITLE; BIRTHDATE (Mo., Day, Yr.): Sridharan, Natesa S.;
Research Associate; 10/2/46
PLACE OF BIRTH (City, State, Country): Madras, India
PRESENT NATIONALITY (If non-U.S. citizen, indicate kind of visa and expiration date): India; 5/73 - U.S. permanent residence
SEX: ( ) Male ( ) Female
EDUCATION (Begin with baccalaureate training and include postdoctoral)
INSTITUTION AND LOCATION; DEGREE; YEAR CONFERRED; SCIENTIFIC FIELD
Indian Institute of Technology, Madras, India; Bachelor of Technology; 1967; Electrical Engineering
State University of New York, Stony Brook; M.S.; 1969; Computer Science
State University of New York, Stony Brook; Ph.D.; 1971; Computer Science
HONORS: University Fellow - 1968-1971, SUNY Stony Brook; Graduate Assistant - 1967-1968, SUNY Stony Brook; Siemens Award (awarded for top rank in Electrical Engineering) - 1967, IIT Madras; National Merit Scholarship - 1963-1967, IIT Madras
MAJOR RESEARCH INTEREST: Computer Application in Chemistry and Medicine
ROLE IN PROPOSED PROJECT: Research Associate
RESEARCH SUPPORT (See instructions)
RESEARCH AND/OR PROFESSIONAL EXPERIENCE (Starting with present position, list training and experience relevant to area of project. List all or most representative publications. Do not exceed 3 pages for each individual.)
1971-present Research Associate, Heuristic Programming Project, Stanford University
1970-1971 Consultant, IAC Computer Corp., Long Island, N.Y.
Sridharan, N.S., "An Application of Artificial Intelligence to Organic Chemical Synthesis", Doctoral Thesis, State University of New York at Stony Brook, 1971.
Sridharan, N.S., "Search Strategies of Organic Chemical Synthesis", Third International Joint Conference on Artificial Intelligence (3IJCAI), Stanford, 1973.
Sridharan, N.S. (co-author), "Heuristic DENDRAL: Analysis of Molecular Structure", Proc. NATO Advanced Study Institute, Amsterdam, 1973.
Sridharan, N.S. (co-author), "Heuristic Theory Formation", Machine Intelligence, Volume 7, Edinburgh, 1972.
NIH 398 (FORMERLY PHS 398) Rev.
1/73

SECTION II - PRIVILEGED COMMUNICATION
BIOGRAPHICAL SKETCH
(Give the following information for all professional personnel listed on page 3, beginning with the Principal Investigator. Use continuation pages and follow the same general format for each person.)
NAME: Brown, Harold D.
TITLE: Associate Professor
BIRTHDATE (Mo., Day, Yr.): July 12, 1934
PLACE OF BIRTH (City, State, Country): South Bend, Indiana
PRESENT NATIONALITY (If non-U.S. citizen, indicate kind of visa and expiration date): U.S.
SEX: ( ) Male ( ) Female
EDUCATION (Begin with baccalaureate training and include postdoctoral)
INSTITUTION AND LOCATION; DEGREE; YEAR CONFERRED; SCIENTIFIC FIELD
University of Notre Dame, Notre Dame, Indiana; M.Sc.; 1963; Mathematics
Ohio State University, Columbus, Ohio; Ph.D.; 1966; Mathematics
(No Baccalaureate Degree)
HONORS: Summa Cum Laude - Notre Dame
MAJOR RESEARCH INTEREST: Applied Discrete Mathematics - Computer Science
ROLE IN PROPOSED PROJECT: Research Associate
RESEARCH SUPPORT (See instructions)
Principal Investigator, NSF-GP-16793 (Expires March, 1974)
Pending Proposal NSF (Proposed starting date September, 1974)
RESEARCH AND/OR PROFESSIONAL EXPERIENCE (Starting with present position, list training and experience relevant to area of project. List all or most representative publications. Do not exceed 3 pages for each individual.)
Visiting Associate Professor, Computer Science, Stanford University, 1971-72, 1973-present
Associate Professor, Mathematics, Ohio State University, 1966-
Visiting Professor, Mathematics, Rhein.-Westf. Tech.
Hoch., Aachen, 1972 and 1973
Visiting Member, Courant Institute, New York University, 1967-68
Instructor/Assistant Professor, Assistant Chairman, Mathematics, Ohio State U., 1963-65
Assistant to the Chairman, Mathematics, University of Notre Dame, 1960-63
Director or Associate Director, NSF-SSTP, 1964-70

Publications
Near Algebras, Ill. J. Math. 12 (1968), pg. 215.
Distributor Theory in Near Algebras, Comm. Pure App. Math. XXI (1968), pg. 535.
An Algorithm for the Determination of Space Groups, Math. Comp. 23 (1969), pg. 499.
Some Empirical Observations on Primitive Roots, with H. Zassenhaus, J. Number Theory 3 (1971), pg. 306.
A Generalization of Farey Sequences, with K. Mahler, J. Number Theory 3 (1971), pg. 364.
Basic Computations for Orders, Stanford CS Memo STAN-CS-72-208.
An Application of Zassenhaus' Unit Theorem, Acta Arith. XX (1972), pg. 154.
Integral Groups I: The Reducible Case, with J. Neubüser and H. Zassenhaus, Numer. Math. 19 (1972), pg. 386.
Integral Groups II: The Irreducible Case, with J. Neubüser and H. Zassenhaus, Numer. Math. 20 (1972), pg. 22.
Integral Groups III: Normalizers, with J. Neubüser and H. Zassenhaus, Math. Comp. 27 (1973), pg. 167.
Constructive Graph Labeling Via Double Cosets, with L. Hjelmeland and L. Masinter, Discrete Math., in press, and Stanford CS Memo STAN-CS-72-318.
An Algorithm for the Construction of the Graphs of Organic Molecules, with L. Masinter, Discrete Math., in press, and Stanford CS Memo STAN-CS-73-261.
The Crystallographic Groups of 4-dimensional Space, with J. Neubüser, H. Wondratschek and H. Zassenhaus, Wiley-Interscience, in press.

SECTION II - PRIVILEGED COMMUNICATION
BIOGRAPHICAL SKETCH
(Give the following information for all professional personnel listed on page 3, beginning with the Principal Investigator.
Use continuation pages and follow the same general format for each person.)
NAME: DROMEY, Robert Geoffrey
TITLE: Research Associate
BIRTHDATE (Mo., Day, Yr.): 11/21/46
PLACE OF BIRTH (City, State, Country): Castlemaine, Victoria, Australia
PRESENT NATIONALITY (If non-U.S. citizen, indicate kind of visa and expiration date): Australian, J-1 Visa, Exp. 10/8/74
SEX: ( ) Male ( ) Female
EDUCATION (Begin with baccalaureate training and include postdoctoral)
INSTITUTION AND LOCATION; DEGREE; YEAR CONFERRED; SCIENTIFIC FIELD
Swinburne College of Technology, Melbourne, Australia; Diploma of Appl. Chem.; 1968; Chemistry
La Trobe University, Melbourne, Australia; Ph.D.; 1973; Molecular Science
HONORS: CSIRO Postdoctoral Studentship; Commonwealth Postgraduate Research Scholarship; Walter Lindrum Memorial Scholarship; Equivalent of First Class Honors, Master of Science Preliminary (1969)
MAJOR RESEARCH INTEREST: Application of Artificial Intelligence Techniques to Bio-Medical and Chemical Problems
ROLE IN PROPOSED PROJECT: Research Associate
RESEARCH SUPPORT (See instructions)
RESEARCH AND/OR PROFESSIONAL EXPERIENCE (Starting with present position, list training and experience relevant to area of project. List all or most representative publications. Do not exceed 3 pages for each individual.)
1973 DENDRAL Project, Stanford University, Computer Science Department
1973 Software Development for Graphics Systems, La Trobe University, Computer Centre
1969-73 Construction, development and applications of an on-line photoelectron spectrometer, La Trobe University, Chemistry Department
1969-73 Application of Deconvolution Techniques to the Processing of Experimental Data

Publications:
"Deconvolution and Its Application to the Processing of Experimental Data", Intl. Journal of Mass Spectrometry and Ion Physics, 1970, 4. (co-author).
"Inverse Convolution in Mass Spectrometry", Intl. Jnl. Mass Spec. Ion Phys., 1971, 6. (co-author).
"A Combined Time Averaging-Deconvolution Technique Applied to Electron Impact Ionization Efficiency Curves", International Journal of Mass Spectrometry & Ion Physics, 1971, 6. (co-author).
"The Perfect Direction and Velocity Focus at 254°34' in a Cylindrical Electrostatic Field", Reviews of Scientific Instruments, 1973, 44. (co-author).
"Detection of Spin-Orbit Splitting in the Photoelectron Spectrum of Oot by Deconvolution", Chem. Physics Letters (in press), 1973. (co-author).
"The Effect of Finite Line Widths on the Interpretation of Photoelectron Spectra", Journal of Electron Spectroscopy (accepted for publication). (co-author).
"An On-line Ultraviolet Photoelectron Spectrometer for High-Resolution Studies of Molecular Structure", Australian Journal of Chemistry (accepted for publication). (co-author).
"Photoelectron Spectroscopic Correlation of the Molecular Orbitals of the Alkanes and Alkyliodides", Journal of Molecular Structure (submitted for publication). (co-author).
"Comparison of the Photoelectron Spectra and the Photoionization Efficiency Curves for the Alkyliodides", Transactions of the Faraday Society (submitted for publication). (co-author).
"A Convolution-Deconvolution Algorithm Using Fast Fourier Transforms", Decuscope, 1973 (in press).

RESEARCH PLAN
BIOMOLECULAR CHARACTERIZATION: ARTIFICIAL INTELLIGENCE
A Program of Resource-Related Research

I. INTRODUCTION
   A. Objectives
   B. Background and Rationale
   C. Relationship to AIM-SUMEX and the Genetics Research Center
II. SPECIFIC AIMS
III. METHODS
IV. SIGNIFICANCE OF PROPOSED RESEARCH
V. FACILITIES & EQUIPMENT
VI. BIBLIOGRAPHY
Table 1
Figures 1-3
Appendix A: Letters of Interest
Appendix B: 1973 Annual Report to the NIH

I. INTRODUCTION
This renewal application is intended to
sustain and augment the capabilities of the mass spectrometry (MS) program which has served as a major institutional resource at Stanford for some years. With previous support from NASA and NSF it has made possible a highly interdisciplinary set of research projects ranging over: artificial intelligence (AI) in biomolecular characterization, natural product chemistry, clinical biochemical studies on steroids, and the mechanisms of molecular fragment formation in mass spectrometry. While the facility equipment for mass spectrometry has been funded mostly by other agencies, connected research programs embrace several NIH research projects as well. In addition, this activity was closely coupled with the ACME Medical School computer resource (1966-1973) and will have similar associations with the new AIM-SUMEX computer resource recently funded by the BRB (see Section I.C).

Previous support reflects the diversified facets of this interdisciplinary research. NASA has supported projects in new instrumentation, including the initial mass spectrometer-computer link, NSF has supported chemical research, and ARPA has supported our artificial intelligence research and initial application to mass spectrometry. Overall cutbacks have forced NASA to reduce funding for this area of research despite their interest. Under ARPA support to Drs. Feigenbaum and Lederberg for AI research, the DENDRAL program became recognized as one of the most successful AI applications programs. However, ARPA is chartered to fund frontier computer science research and no longer provides funds for the DENDRAL applications programs. ARPA has indicated a reluctance to continue funding to this group for the theory formation work in chemistry, although we expect to continue to receive ARPA support for more theoretical aspects of our research program (e.g., automatic programming).
We previously submitted a comprehensive proposal to the NIH (RR-00785, 3/28/73) which included an application for the AIM-SUMEX computing resource and a renewal of the existing DENDRAL grant (RR-00612). This proposal was approved for 5 years by the National Advisory Research Resources Council. Certain reservations were, however, communicated to us: they concerned especially what we must agree was an ambitious effort to close the control loop for "intelligent automation" whose costs overreached the immediate utility of the expected result.

During subsequent discussions with the Biotechnology Resources Branch, taking into account the council review and a number of diverse policy issues, we agreed administratively to segment the two components of the original proposal. The AIM-SUMEX portion of the original proposal (excluding DENDRAL) was recently funded for 5 years as a national resource for artificial intelligence in medicine. The present proposal for resource-related research in biomolecular characterization and artificial intelligence is an elaboration of the DENDRAL portion incorporating intensive reexamination and revision of the previous proposal.

With the differentiation of priorities represented by AIM-SUMEX, the Genetics Research Center (GRC), and continuing work on artificial intelligence under Dr. Feigenbaum's leadership, the present renewal application places more emphasis than heretofore on real-world oriented applications. Correspondingly, we have agreed that it is now more appropriate that Dr. Djerassi should be designated as Principal Investigator in this phase of our work.

As outlined in section B.2, the interests and responsibilities of Professors Djerassi (Chemistry), Feigenbaum (Computer Science) and Lederberg (Genetics) have been closely interdigitated. With their further connections with many colleagues, these programs enjoy a high degree of university-wide participation.
For example, the Genetics Department is also closely affiliated with Biology, Biochemistry, Pediatrics, Psychiatry and Medicine through joint appointments or joint research projects or both. This breadth would be difficult to obtain except at a few institutions where the medical school is both academically and geographically integrated with the university to the degree that characterizes the Stanford University environment.

GLOSSARY OF ABBREVIATIONS
ACME - Advanced Computer for Medical Research (NIH-funded computer resource, 1968-1973)
AI - artificial intelligence
AIM-SUMEX - A comprehensive computer resource intended to serve the national requirement for artificial intelligence in medicine. This will be implemented at the Stanford University facility called AIM-SUMEX
ARPA - Advanced Research Projects Agency of the Department of Defense
BRB - Biotechnology Resources Branch
13CMR - carbon-13 magnetic resonance
GC - gas chromatography or gas chromatograph
GRC - Genetics Research Center (Stanford, J. Lederberg, Principal Investigator; NIGMS-approved and awaiting funding. Grant #P01-GM 20832-01)
HRMS - high resolution mass spectrometry
IR - infra-red
IRL - Instrumentation Research Laboratory (Stanford Genetics Department)
LRMS - low resolution mass spectrometry
MCD - magnetic circular dichroism
MS - mass spectrometry or mass spectrometer
NASA - National Aeronautics & Space Administration
NMR - nuclear magnetic resonance
NSF - National Science Foundation
ORD - optical rotatory dispersion
PL/ACME - a modified version of the PL-1 computer language (for the Stanford ACME computer facility)
SUMEX - Stanford University Medical Experimental Computer Resource (NIH-funded computer resource, 1973-1978)
UV - ultra-violet

A. OBJECTIVES
Core Research. The funds now applied for would permit 1) the continued funding of the MS laboratory as a biomolecular characterization resource; 2) advancement of laboratory instrumentation capability in specific areas of GC-HRMS and the exploitation of metastable peak analysis;
3) the further development of AI computer techniques to match the instrumentation. This work will emphasize practical utilization for applications in biomolecular characterization connected with other on-going biomedical research programs. It will include, for example, a) the analysis of mixtures by GC/MS; b) metastable peak analysis for difficult problems of pure compounds and of mixtures not readily separable by GC; c) optimized data analysis for characterization of MS peaks; and d) heuristic analysis of spectra for the molecular ion composition.

Our project is the only systematic effort, to our knowledge, currently underway in this country for computer assisted structure elucidation. Subsequent to our early publications, an intensive program has been mounted in Japan in similar areas. This situation may be contrasted with computer assisted organic synthesis, an area receiving considerable attention from several research groups. These capabilities can be beneficially provided to a wider community via the AIM-SUMEX resource. Research on the emulation of human intellect by computer programs will undoubtedly influence the efficiency with which chemical research can be applied to ever more complex problems of health, e.g., intermediary metabolism and its pathologies; environmental influences on health; the development and critical validation of new therapeutic agents.

The achievement of these objectives depends on the continued maintenance and development of the DENDRAL AI programming system (see below). The advent of the AIM-SUMEX facility will remove some of the serious computational limits on the exercise of this system that have delayed recent progress.

Education. In our university setting, pre-doctoral and post-doctoral education of course constitutes a part of our mission.
As far as is practically possible, research participation in the DENDRAL program has been coupled with dissertation work by graduate students and post-doctoral research experience respectively. Examples of people (and their research area) whose education has been enhanced in this way are the following:

Graduate Students: J. Simek, pedagogical aspects of the structure generator; Wai Lee Tan, synthesis of new estrogen compounds; H. Eggert, 13CMR of amines and steroidal ketones; C. Van Antwerp, 13CMR of steroidal alcohols; C. Farrell, theory formation from mass spectral data; L. Masinter, development of the structure generator; M. Stefik, AI applications to chemistry.

Postdoctoral Fellows: G. Dromey, theory formation from analytical data; R. Gritter, mass spectral fragmentation of biologically active steroids; R. Carhart, analysis of 13CMR spectra by DENDRAL-like programs; S. Hammerum, development of better fragmentation rules for progesterones.

Formal organization. This project has been a long-term commitment of Djerassi, Lederberg and Feigenbaum functioning in effect as co-investigators. We coordinate our activities with day-to-day contacts in the pursuit of convergent research objectives. In the light of the extension of our collaborative activity during the last two years, we are now organizing a formal advisory group to include, in addition to ourselves, H. Cann, J. Barchas, and E. Van Tamelen. This group will advise the principal investigators on the direction of the program with respect to allocating available facilities and seeking out and helping other collaborators. This designation simply recognizes the fact that many of our colleagues have already been engaged in relevant collaborative research with MS.

A MS resource has recently been funded at the University of California/Berkeley, under the direction of Dr. A.L. Burlingame. Drs.
Djerassi and Burlingame have recently engaged in some collaborative research which was made more successful by the sharing of facilities and expertise available at one institution but not at the other. We would hope to maintain and strengthen these contacts to avoid unnecessary duplication of effort. We plan to discuss with Dr. A.L. Burlingame the most appropriate procedures for coordinating the related activities of our respective programs at the University of California/Berkeley and here. This may take the form of reciprocal membership in advisory committees.

The "hardware resource" to which this application is pegged has been identified as the MS facility. While these instruments alone represent an investment of over $300,000, funded previously by several agencies, they do not represent the most important resource. We would use this designation instead for the working team led by the principal and co-investigators. The skills embraced by this group include, as mentioned, computer science, structural organic chemistry, molecular biology, instrumentation engineering and a wide range of other disciplines. They are represented not only in the principal professors but in a diversified and accomplished professional research staff (see Budget Justification). The program for which funds are now requested is the vital means by which the interests of this group can be sustained in a coordinated effort that would be very costly both in funds and in time if it had to be reconstructed from scratch. Without the financial support now requested, this line of collaborative research will have to be abandoned, and with it a unique style of interdisciplinary collaboration, and the MS facility will be terminated.

B. BACKGROUND AND RATIONALE
1. The Structure Elucidation Problem
a) The General Problem. Analysis of molecular structure is a major activity in our program of resource related research.
For the specific task of elucidating molecular structures, i.e., the topology of atom-to-atom connectivities, analysts utilize a mixture of information derived from chemical procedures and spectroscopic techniques. Each item of information, if not redundant or uninterpretable, contributes to the solution of the problem. Chemists draw upon a tremendous body of specific knowledge about the task area (e.g., clinical chemistry, biochemistry), molecular structure, spectroscopic techniques, etc., in order to piece together this information and infer the structure of molecules. These features, and the relative simplicity of the final concept of a structure, make the problem particularly well-suited for applications of the techniques of AI to assist research workers performing the task.

b) Djerassi's Laboratory. Professor Djerassi has been concerned with structure elucidation problems since the beginning of his chemical research. His activities at Stanford have been concerned heavily with the application of particular spectroscopic techniques to structural studies of biomedically important compounds. These techniques include optical rotatory dispersion (ORD) and, more recently, magnetic circular dichroism (MCD) (both of them supported initially by the NIH). Since 1961 he and his group have also been concerned with MS because of the power of the technique, in terms of specificity and sensitivity, as an analytical tool for structure elucidation. Four books and approximately 250 articles on MS have been published by him and his colleagues. The technique of MS does not suffice for all structure determination problems, but it is a very powerful tool in areas where there exists a body of knowledge about the MS behavior of related molecules. When sample size is limited MS may well be the only technique that can be utilized.
The recent availability of high resolution mass spectrometers has made HRMS the technique of choice for many applications because under ideal conditions the exact mass number uniquely specifies the empirical formula of a molecule or fragment. On a parallel course, the technique of GC/MS, routinely available with low resolution mass spectrometers (GC/LRMS), has revolutionized investigations wherever complex mixtures are encountered. All of the above considerations argue that an extension of MS at Stanford to provide routine GC/LRMS and GC/HRMS analyses would be the next logical step to assist researchers depending on this facility for solutions of their structure elucidation problems.

2. Historical Background
a) Mass Spectrometry Laboratory. Prior to the existing DENDRAL grant, the groundwork was laid for computerization of the existing mass spectrometers, an Associated Electrical Industries MS-9 high resolution mass spectrometer and an Atlas CH-4 low resolution mass spectrometer. This work, supported primarily by NASA via the Instrumentation Research Laboratory (IRL) in the Department of Genetics, resulted in link-up to the then existing ACME computer facility via a PDP-11 mini-computer which acted as a buffer between the spectrometers and ACME. Initial data acquisition and reduction programs were written for the system and utilized on a limited basis. The funding of the DENDRAL proposal, NIH grant RR-612 (May 1, 1971-present), in conjunction with additional resources provided by the IRL resulted in a major improvement to these capabilities. The fruits of these efforts are described under section I.B.3 (below).

b) Summary of Early DENDRAL Development. In 1964, Lederberg devised a notational algorithm for chemical structures (termed DENDRAL) that allowed questions of molecular structure to be framed in precise graph-theoretic terms. (Refs. 1,3-5,12).
He also showed how to use the DENDRAL algorithm to generate complete and irredundant lists of structural isomers. (Refs. 1,5). In 1965-66 Lederberg and Feigenbaum began exploring the idea of using the isomer generator in an artificial intelligence program - searching the space of possible structures for plausible solutions to a problem much as a chess-playing program searches the space of legal moves for the best moves. (Refs. 7,12). This approach guarantees that every possible solution to a problem is considered - either implicitly, as when whole classes of unstable structures are rejected, or explicitly, as when complete molecules are tested for plausibility. In either case, an investigator easily determines the criteria for rejection and acceptance and knows that no possibilities have been forgotten. This approach also guarantees that structures appear in the list only once - that automorphic representations of the same complex molecule have not been included. In both these respects the computer program has an advantage over manual approaches to structure elucidation.

c) Initial collaboration with Djerassi. (Refs. 14,15,19,20,21,22,24). Lederberg and Feigenbaum realized that (a) only through application to real problems could the AI approach be materially advanced and critically evaluated, and (b) MS appeared to be a fruitful applications area. MS appeared to be an excellent problem area because of the close relationship between spectral fragmentation patterns and molecular structure for many classes of molecules. Djerassi's interest and expertise - and daily interaction between members of his group and the AI group - led to a series of joint publications describing the approach and initial results of the programs. The success of these collaborative efforts led to the proposal to the NIH for initial funding to extend these efforts.

d) Efforts Under NIH Funding for DENDRAL. (Refs. 25-41).
The initial funding by NIH provided the opportunity to upgrade the instrumentation and computer programs. In particular we were able to mount a concerted project on both the analysis of mass spectra of biomedically important compounds and the mathematical aspects of molecular structure. Progress reports to the NIH describe this research in detail. The most recent annual report appears in Appendix B. A series of publications directed to audiences both in computer science and chemistry are listed in the bibliography. The following section (Section 3) summarizes the capabilities for structure elucidation which, in themselves, constitute an important result of past work.

e) Related Research. An important side effect of the DENDRAL project is the extent to which additional research was inspired and carried out to fill gaps in existing knowledge. This research, not supported by the DENDRAL grant, has been beneficial to on-going DENDRAL work, and vice-versa. Publications which have arisen from this research are listed in the bibliography (Refs. 58-70). A brief review of these publications should indicate the need for precise specification of the knowledge elicited from chemists and used in computer programs. As an example, consider the description and application of an early algorithm for generation of cyclic structural isomers (21). This paper considered the problem of spectroscopic differentiation of isomers of C6H10O. Unsaturated ethers fall in one of the classes of isomeric compounds which must be considered, but the MS of unsaturated ethers had not been investigated systematically. This work was subsequently carried out in Professor Djerassi's laboratory independently of DENDRAL support, but of benefit to DENDRAL (62). Other examples will be found in the Bibliography (Refs. 58-79).

3.
Existing Capabilities
We have worked to develop distinctive capabilities for molecular structure elucidation, bringing together a high quality HRMS system and AI programs applied to biomolecular characterization. The feasibility of our analytical approach has been demonstrated in several problem areas, based upon the development both of a MS system and a general set of computer programs for use in new areas. The principal capabilities are summarized below. These are now in being and were developed primarily under NIH funding to this project, with additional support supplied by ARPA and NASA in specific areas. (These agencies have reduced funding levels for this work because overall cutbacks have forced NASA to cut out this area of research despite their interest and ARPA is chartered to provide funds for frontier computer science research but not for applications. Thus the NIH is the principal source of support for future development of applications programs in the interdisciplinary area of artificial intelligence/health related chemical problems.)

a. HRMS System and Coupled GC/LRMS System.
We have coupled the NIH-supported Varian-MAT 711 High Resolution Mass Spectrometer with a Hewlett Packard Gas Chromatograph and demonstrated its utility for GC/LRMS analysis of such difficult analytic problems as the free sterols (i.e., not derivatized) isolated from marine and other sources. Advanced data reduction techniques for this instrument were written for use with the ACME computer system (360/50) and now exist in Stanford's new 370/158 which continues to support the PL/ACME language. GC/HRMS scans on extracts from urine and amniotic fluid demonstrated this system's capability to provide high quality mass measurements on complex mixtures obtained from biological sources. An example of one GC/HRMS run on the amino acid fraction of amniotic fluid is presented below (Sec. III.D).

b. DENDRAL Structure Generator (Refs.
1-6, 14, 31, 37, 38, 40, 41)
The DENDRAL Structure Generator program accomplishes exhaustive and irredundant generation of isomers, with and without rings. This program guarantees consideration of every candidate structure - either implicitly, as when whole classes of structures are forbidden, or explicitly, as when individual compounds in a class are specified. It corresponds to the "legal move generator" of computerized chess playing and other heuristic programs.

c. DENDRAL Planner (Refs. 25, 28, 33)
We have written a very general set of computer programs for determining structural features from analytical data in well-defined areas. Such general planning programs have been written for low and high resolution MS, interpreted proton NMR spectroscopy and 13CMR data.

d. INTSUM (Refs. 26, 29, 34, 35)
INTSUM is a computer program that aids in finding interpretive rules for MS. The program interprets a large collection of MS data according to criteria specified by the investigator. Then it summarizes the data to show which of the possible interpretations seem most plausible.

e. RULEGEN (Refs. 26, 35)
RULEGEN is the current rule generation program that suggests various rules of interpretation for the MS data summarized by INTSUM. Although not finished, the program can provide useful assistance in practical theory formation.

f. Ancillary Techniques

1. The MS facility provides other types of experiments in MS, including ultra-high resolution measurements (masses determined via peak matching), defocussed metastable ion determinations (Barber-Elliott technique) and low ionizing voltage experiments. These data are utilized by both scientists and programs where appropriate.

2. Additional computer programs provide added problem-solving assistance.

a. Predictor program for predicting major features of mass spectra.

b. Programs for drawing and displaying chemical structures.

c.
Subroutines developed in conjunction with or existing as parts of the Structure Generator for problems of partitioning, construction of vertex-graphs, and constructive graph labelling. These can be applied to answer certain questions of isomerism which do not require the complete generator. For example, the labelling algorithm can list all structures resulting from substituting sites of a carbocyclic skeleton with stated numbers of different functional groups.

g. Other Spectroscopic Techniques
Available to us are the facilities of Professor Djerassi's laboratory for work requiring additional spectroscopic data. Also available on a fee for service basis are extensive spectroscopic facilities (NMR, I.R., and U.V.) of the chemistry department. These would be utilized for collecting additional data on particular structure problems and gathering data on known compounds (particularly in the area of 13CMR) as the AI programs become knowledgeable about other spectroscopic information.

h. Chemical Facilities
The staff and facilities of the chemistry department represent substantial synthesis capabilities and general chemical know-how. This resource can be called upon to provide assistance in synthesis of model or labelled compounds, derivatization of mixtures, and so forth. For example, a graduate student in chemistry is presently engaged in thesis research dealing with the laboratory synthesis of a new estrogen metabolite strongly suspected to be a component of certain pregnancy urines. The previously proposed structure of this compound was one of the candidate structures inferred by the planner in a study of estrogen mixtures (11-dehydroestradiol-17-alpha, ref. 33).

4. User Community

Economic utilization of existing and proposed facilities can be realized by sharing them with a community of users.
Lacking supplementary funds that would be needed for a comprehensive, major service facility, this community will include the following groups, but the facilities will be informally available to others.

A. Stanford Community

i) Stanford Chemistry Department (except for Hodgson, all are heavily supported by the NIH in their research efforts). Letters of interest are attached to the proposal in Appendix A.

Prof. C. Djerassi - steroids, marine sterols
Prof. W. Johnson - steroids
Prof. E. Van Tamelen - steroids, triterpenoids, other natural products
Prof. H. Mosher - natural products (e.g., marine toxins)
Prof. K. Hodgson - biological ligands, ligand-metal complexes
Prof. J. Collman - cytochrome P450 models

ii) Stanford Medical School Collaborators

The following research projects in the Stanford Biomedical Community will furnish samples for mass spectrometric analysis under the present proposal. Attached to this proposal (Appendix A) are copies of the letters of interest in the proposed facility received from the principal investigators of these grants.

Dr. James R. Trudell, Department of Anesthesia, Stanford University School of Medicine. Drug metabolite identification in humans.

Dr. Irene S. Forrest, Biomedical Research Laboratory, Veterans Administration Hospital, Palo Alto. Drug metabolite identification in humans.

Dr. I. Rabinowitz and D. I. Wilkinson, Department of Dermatology, Stanford University School of Medicine. Prostaglandins.

Prof. Eugene D. Robin, Department of Respiratory Medicine, Stanford University School of Medicine. Ratio of NAD+/NADH in cells by measuring ratio of oxidized to reduced redox pairs.

Dr. Leo E. Hollister, Veterans Administration Hospital/Department of Medicine, Stanford University School of Medicine. Metabolism of marihuana.

Dr. Hiram 4. Sera, Pharmacy Department, Stanford University Hospital. Drug identification.

Dr. Sumner M. Kalman, Department of Pharmacology, Stanford University School of Medicine.
Drug and drug metabolite identification.

Dr. Jack Barchas, Department of Psychiatry, Stanford University School of Medicine. Neurotransmitters and related compounds in man.

Dr. Keith A. Kvenvolden, Chemical Evolution Branch, NASA Ames Research Center, Mountain View, Calif. Amino acids, acids in geochemical samples, structure of products formed from electrical discharges in gas mixtures.

Dr. William R. Fair, Department of Urology, Stanford University School of Medicine. Identification of the prostatic antibacterial factor; polyamines (putrescine, spermine, spermidine) in body fluids of patients with prostatic carcinoma.

Besides the user projects just summarized, other major prospects are in sight. At the time of writing, the chair of pharmacology is vacant. Conversations with the leading candidate have indicated a deep-seated interest in GC/HRMS as the principal analytical tool for broad ranging studies of drug metabolism in man.

B. Extramural Users

The development of the techniques of ORD, MS and MCD at Stanford has been paralleled by extensive sharing of these resources nation- and world-wide in collaborative research efforts, without any additional funding. Experience has shown that discretionary selection of problems, rather than provision of routine service, results in better utilization of our people and instrumentation resources. We would extend this provision of services, including available computer programs, to a limited number of extramural users. Note, for example, our successful collaboration with Professor Adlercreutz, Meilahti Hospital, University of Helsinki, on the identification of estrogens from body fluids utilizing the AI planning program (ref. 33).

C. Relationship to AIM-SUMEX and the Genetics Research Center

The present application is strengthened by two research projects related to, but not overlapping, the proposed research of this grant.

1) AIM-SUMEX (NIH RR-00785, Oct. 1, 1973, thru July 31, 1978, Principal Investigator, J. Lederberg).
This is a resource grant to establish a national facility for applications of artificial intelligence in medicine (AIM). Our own use of this facility will include SUMEX PDP-10 computer time and file storage necessary to run the DENDRAL artificial intelligence programs. This support will be furnished without charge to the present proposal. It represents an annual investment of about $190,000 in computer time equivalent value.

The AIM-SUMEX computing facility is shared equally between a national user community (AIM) and a Stanford Medical School community. The DENDRAL research will be supported out of the Stanford portion. The AIM service will be administered under the policy control of a national advisory committee and will be implemented over a national computer network. AIM-SUMEX provides the means for members of the national user community interested in structure elucidation to access the DENDRAL programs.

2) Genetics Research Center (NIH P01-GM 20832-01 - approved by the NIGMS Council, awaiting funding, Principal Investigator, J. Lederberg). This research proposal is a comprehensive grant which would support interdepartmental research at the Stanford Medical School in Medical Genetics, Pediatrics and other clinical applications. A section of that proposal concerns the use of GC/LRMS for screening body fluids for evidence of inborn errors of metabolism. (This project grew out of the initial DENDRAL grant, one of the research goals of which was the analysis of body fluids using GC/MS.) This research on inborn metabolic errors will be conducted jointly in the Stanford Departments of Genetics and Pediatrics using existing equipment (Finnigan 1015 Quadrupole mass spectrometer, Varian Aerograph GC and a PDP-11/20 based data system).

We appreciated the value of GC/HRMS analyses of selected extracts of body fluids (i.e., those containing metabolites not identified by routine GC/LRMS data) when formulating the Genetics Research Center proposal.
Accordingly, a small amount of funding was there requested for recording selected GC/HRMS data on the GC/Varian MAT 711 mass spectrometer in the Department of Chemistry. If these funds are awarded, we will negotiate with NIH a suitable elimination of this minor overlap with the present budget.

SPECIFIC AIMS

The specific aims enumerated in this section will be pursued in the highly inter-disciplinary manner that has characterized the DENDRAL project from the start of its NIH support. The aims are not disjoint, but interactive and inter-dependent. For example, the power of MS and, potentially, other spectroscopic techniques, can be enhanced by the use of computer programs to perform various aspects of structure elucidation and theory formation. From the standpoint of computer science, one measure of the utility of techniques of artificial intelligence is how well they perform in real-world applications. It is necessary in the development of these programs to have a source of data and informed, involved team-mates able to criticize methods and results. The aims are elaborated in the methods section. We have attempted to keep the proposal to a readable length. Therefore, some detail has been omitted. However, many details can be found in the bibliography and we are prepared to provide additional information during the site visit.

1. Enhance the power of the MS resource.

The existing MS resource, together with computer programs which exist or which are proposed (see Aim 2, below), is capable of solving some of the structure elucidation problems of the user community given computer support for data collection and reduction. We refer specifically to the areas of GC/LRMS and routine, batch HRMS samples. We believe that many of the problems of the user community require more powerful techniques (see Section III).
These techniques, specifically GC/HRMS and semi-automatic metastable defocussing, can be provided with a minimum of cost and effort, thus enhancing considerably the capabilities of the resource.

Our first aim is to provide the resource with adequate computer support (replacing the previous ACME system) to enable collection and reduction of mass spectral data including low and high resolution scans and data on defocussed metastable ions. We propose to develop this computer support in the ways described below. (These aims are written to include the work necessary to implement the extended PDP-11/20 computer system. A description of the rationale for this choice is provided in Section III.A and the specific augmentations in the Budget Justification.)

A) Convert existing, proven data acquisition and reduction programs from the PL/ACME language into Fortran, consistent with time-critical assembly language programs for data acquisition and instrument control. These programs will be written in Fortran to enhance compatibility with the computer systems of other users of such packages.

B) Modify these programs, as required, to handle acquisition and reduction of frequent or repetitive HRMS scans with selected instrument performance feedback to the operator, and to take advantage of the expanded capabilities of the extended 11/20 system.

Prototype GC/HRMS systems have been developed at Stanford and elsewhere, but this type of facility (in contrast to GC/LRMS) is not now available to the Stanford community. When this system is developed, service will be available to the Stanford community of research collaborators and, if our resources permit, to any scientist requesting assistance. In many instances this type of collaboration will require far more involvement of convergent interests, efforts and skills than merely running samples on request. We have in mind the chemical and eventually biological interpretation of the analytical data as a matter of joint concern, as appropriate.
We have previously illustrated the advantages of high resolution mass spectral data in the computer analysis of mass spectra (e.g., ref. 28). Also, we have previously shown that the same program can deal with analyses of mixtures without prior separation, especially when additional data (e.g., from selected metastable defocussing experiments) were provided (Ref. 33). We wish to use the MS resource and the computer program in further studies of mixtures of compounds which are difficult or impossible to separate by GC. The advent of routine systems for high pressure liquid chromatography has made many of these separations possible, but the liquid chromatograph is not presently interfaced to the MS.

Many of the problems of the user community require analysis of complex mixtures which are amenable to treatment by GC/MS techniques. We feel that where sample quantities permit, acquisition of GC/HRMS data is highly desirable. These data can be provided by the resource supplemented with computer support (above).

We propose to continue tests of the GC/MS combination, operating under moderately high mass resolutions (5000-10000), to define in detail the optimum operating conditions of the GC/HRMS combination. This will provide the necessary information on maximum practical sensitivity to be expected. This information can then be used in collaboration with the user community for sample preparation.

The GC/HRMS system would normally be operated at reduced mass spectrometer resolutions to maximize sensitivity. We have existing multiplet resolution programs to increase the resolving power of the MS. We propose to provide the multiplet resolution program with heuristic guidance based on compositional variations inferred from molecular ions or other singlet peaks. For example, a resolving power of 10,000 is barely sufficient to resolve ions which differ by CH2 vs. N (delta m = 0.012) for ions of about mass 193.
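As a rough check on the doublet arithmetic in this example, the short Python sketch below computes the mass difference of a doublet from its two elemental compositions and the highest mass at which a given resolving power would still separate it. It is illustrative only and is not part of the DENDRAL programs: the function names are hypothetical, the isotope masses are standard modern values rather than figures from this proposal, and the simple relation R = m / delta m is an assumed definition of resolving power.

```python
# Illustrative sketch only; names and the R = m / delta_m convention are assumptions.
# Monoisotopic masses (u) of the most abundant isotopes (standard modern values).
MASS = {"H": 1.007825, "C": 12.000000, "N": 14.003074, "O": 15.994915}

# Doublets discussed in the text, written as element -> atom count.
DOUBLETS = {
    "CH2 vs N": ({"C": 1, "H": 2}, {"N": 1}),
    "CH4 vs O": ({"C": 1, "H": 4}, {"O": 1}),
    "C3N vs H2O3": ({"C": 3, "N": 1}, {"H": 2, "O": 3}),
}


def formula_mass(counts):
    """Sum the monoisotopic masses of the atoms in a composition."""
    return sum(MASS[element] * n for element, n in counts.items())


def doublet_delta(a, b):
    """Absolute mass difference (u) between two compositions."""
    return abs(formula_mass(a) - formula_mass(b))


def max_resolvable_mass(delta_m, resolving_power):
    """Highest mass at which delta_m is separated, from R = m / delta_m."""
    return resolving_power * delta_m


def doublet_table(resolving_power=10_000):
    """Map each doublet name to (delta_m, limiting mass) for one resolving power."""
    table = {}
    for name, (a, b) in DOUBLETS.items():
        delta = doublet_delta(a, b)
        table[name] = (delta, max_resolvable_mass(delta, resolving_power))
    return table


if __name__ == "__main__":
    for name, (delta, limit) in doublet_table().items():
        print(f"{name:12s} delta m = {delta:.4f} u, separated up to about mass {limit:.0f}")
```

Changing `resolving_power` shows how quickly the limiting mass falls for the closest doublets, which is where heuristic guidance for the multiplet resolution program would be expected to matter most.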
Although it will resolve CH4 vs. O doublets (delta m = 0.037) at this mass, it will not resolve closer doublets such as C3N vs. H2O3 (delta m = 0.003). We can provide exhaustive tabulations of multiplets by mass separations (based on ref. 30) which can be used by the multiplet resolution program.

We have previously indicated the power of metastable ion information in the operation of our programs for structure elucidation (refs. 28, 33). We have extended one of our programs (the MS predictor program) to propose metastable defocussing experiments in order to avoid collection of unnecessary data (see Aim 2, below). Although we can collect these data (Barber-Elliott technique) manually on our existing Varian MAT-711 MS, this is an exceedingly wasteful operation, both in terms of sample consumption and time. We propose to implement some automation of collection of these data on metastable ions. We also propose to begin preliminary investigation of alternative modes of metastable ion determination (see Methods (Sec. III), below).

2. Develop performance and theory formation programs to assist in the solution of structure elucidation problems in biomedicine.

Computer programs have already been written for analysis of low and high resolution mass spectra, for generation of acyclic and cyclic molecular structures, for labelling structural skeletons with atoms, for analyzing 13CMR spectra of amines and for interpretation and summary of large volumes of data gathered on model compounds (see Existing Capabilities above, for references). We wish to increase the utility of these programs by providing interactive facilities that allow easier access to them, by increasing their generality and power, and by supplementing them with new reasoning programs.

Performance Programs:

The current structure generator program will be subjected to further detailed tests before using it for structure determination problems.
A new algorithm for generating cyclic skeletons (with no multiple bonds) will be programmed and checked. The algorithm is written and informally proved. A formal proof will be devised as well. This algorithm represents one very powerful approach to the problem of implementation of constraints, as discussed in the following paragraph.

The generating programs will be modified to allow isomer generation within constraints. Different kinds of constraints can be inferred from different kinds of spectroscopic data. We intend to give the program knowledge of a variety of these.

The Planner programs that infer constraints from mass spectrometry data will be broadened to include additional knowledge about the spectral behavior of classes of compounds of relevance to the NIH-sponsored research of the user community. In addition, we will add the capability for utilization of information about chemical isolation procedures (e.g., one expects acidic and neutral compounds in solvent extraction of acidified body fluids) and relative GC retention times (e.g., to admit the possibility of homologous series).

We propose to implement a more general method for inferring the identity of the molecular ion whether or not this appears explicitly in the spectrum. This information is important for the successful operation of the structure generator and the planner. We want the program to use whatever information is available and not depend, as it currently does, on having knowledge of the structural class together with inference rules for that class.

Interface routines will be written to make it easier for other scientists to use these programs. We have to wait for an interactive system before starting this: AIM-SUMEX will be ideal. Input/output routines will be crucial to easy use of the system. However, we also want to give users the facility to understand the system's reasoning steps so they can take advantage of it.
In addition to making the computer programs available through AIM-SUMEX, we would like to translate parts of the LISP code into another language - for reasons of both efficiency and exportability. We have talked with computer professionals at IBM Research Center about using the APL language. FORTRAN, ALGOL and PL/1 are other languages whose merits for our purposes we will explore.

We wish to continue a low level of effort on computer programs that interpret other kinds of spectroscopic data. Planning programs similar to the MS Planner could be written for automatic analysis of data from other spectroscopic techniques (e.g., IR, UV), as we have illustrated for 13CMR (ref. 39).

The structure generator's view of chemical structure is topological and is presently unconstrained by bond lengths and angles. Because stereochemical considerations are frequently important in structure elucidation, we propose to begin consideration of stereochemistry in the structure generation and evaluation processes.

A program with detailed knowledge about information obtainable from various spectroscopic techniques could be written to examine a list of candidate solutions and propose experiments necessary and sufficient to distinguish among them. The program would represent an extended Predictor (e.g., ref. 27). We have a first version of a program that suggests "crucial" metastable peaks to be sought in order to distinguish among candidate structures. Work on this program will continue at a low level of activity, possibly expanding into areas other than MS. One topic we will continue to pursue is our collaborative effort with Dr. Gilda Loew, Genetics Department, on the potential application of molecular orbital theory to prediction of mass spectra (ref. 71).

Theory Formation Programs:

The rule formation program (RULEGEN) will be extended so that it can search a larger space of rules.
Present a priori constraints on the rule generation give us a search reduction from tens of millions to a thousand possible rules. Even though search heuristics now allow efficient search of these possibilities, we want to be able to deal with much larger spaces efficiently, as when the number of primitive predicates is drastically increased.

The RULEGEN program will be modified so that complex fragmentation and rearrangement processes are manipulated nearly as easily as simple fragmentations. The program currently finds fragmentation rules involving one or two bonds, possibly followed by hydrogen migration. In the case of cyclic systems such as estrogens, however, the program must be able to work with sets of three or more bonds in some cleavages.

Interactive programs will be provided on AIM-SUMEX for the investigator to query the rule generation program. For example, many questions now arise about the program steps by which the program infers the rules it suggests as explanations of the regularities. Why, for example, was some particular rule not considered plausible?

New data will have to be selected in order to test the rules and to differentiate among competing rules. We will write a program that suggests new experiments (i.e., new data to obtain), depending on the nature of the existing rules.

The test phase of the theory formation program will be written as an evaluation function of each rule against new data. Insofar as any new experiments are "crucial" experiments, the evaluation function may merely reject a proposed rule. Mostly, however, rules will have to be evaluated against new data along many dimensions: frequency, strength of evidence, uniqueness, simplicity, and the like.

We wish to experiment with the whole theory formation program to determine the critical aspects of our design. For example, (1) how sensitive is the program to discrepancies, inconsistencies and errors in the data?
(2) how well can the program find rules within a slightly different model of chemistry? (3) how well can the program perform with one pass through the data, or several passes? and (4) how critical are the principles of theory formation?

3. Apply the structure elucidation techniques - both instrumentation and computer programs - to biomedically relevant compounds.

Our own interests are in elucidating the structures of, and understanding the MS of, marine sterols, hormonal steroids, and compounds isolated from human body fluids that can be associated with genetic disorders (from research in the GRC). In addition, we will be working closely with members of the Stanford Medical School and Chemistry Department - in particular those mentioned above (Section I.B.4) - on their structure elucidation problems in which MS will be used. Although most users expect to require HRMS and GC/HRMS data, some of their problems will be attacked utilizing GC/LRMS techniques and library search through (usually) restricted libraries of mass spectral data. We propose to investigate some extensions to the technique of library search (see Methods) to complement our existing and planned DENDRAL programs. We plan to continue our exchange of mass spectral data and library search information as we have previously done with Dr. S. Markey (University of Colorado Medical School) and Dr. F. W. McLafferty (Cornell University).

As in the past, attention to new biomedical research problems will lead to increased capabilities in the computer programs. We require close communication with the people engaged in the research so that the programs actually assist the researcher while increasing in power. Collaborative proposals have come out of such past DENDRAL sponsored work, for example, large portions of the GRC proposal and a proposal for 13CMR research.

We envision the interaction and collaboration with the user community to involve the following: