Form Approved, Budget Bureau No. 68-R0249

SECTION I
DEPARTMENT OF HEALTH, EDUCATION, AND WELFARE
PUBLIC HEALTH SERVICE
GRANT APPLICATION

TO BE COMPLETED BY PRINCIPAL INVESTIGATOR (Items 1 through 7 and 15A)

1. TITLE OF PROPOSAL
   Resource-Related Research: Biomolecular Synthesis
2. PRINCIPAL INVESTIGATOR
   2A. NAME (Last, First, Initial): Wipke, W. Todd
   2B. TITLE OF POSITION: Associate Professor of Chemistry
   2C. MAILING ADDRESS: Natural Sciences II, University of California, Santa Cruz, California 95064
   2G. DEPARTMENT, SERVICE, LABORATORY OR EQUIVALENT: Division of Natural Sciences
3. DATES OF ENTIRE PROPOSED PROJECT PERIOD (This application): Oct. 1, 1976 through Sept. 30, 1979
4. TOTAL DIRECT COSTS REQUESTED FOR PERIOD IN ITEM 3: $391,532
5. DIRECT COSTS REQUESTED FOR FIRST 12-MONTH PERIOD: $172,084
6. PERFORMANCE SITE(S): Natural Sciences II, University of California, Santa Cruz, California 95064 (16th Congressional District)
7. RESEARCH INVOLVING HUMAN SUBJECTS: A. No; B. Yes, approved; C. Yes, pending review [checked box illegible]
8. INVENTIONS (Renewal applicants only): N/A

TO BE COMPLETED BY RESPONSIBLE ADMINISTRATIVE AUTHORITY (Items 8 through 13 and 15B)

APPLICANT ORGANIZATION: The Regents of the University of California, University of California, Santa Cruz, Santa Cruz, California 95064
TYPE OF ORGANIZATION: State
IRS No. 1-04.7152
OFFICIAL TO BE NOTIFIED IF AN AWARD IS MADE: H. J. Senner, Contracts and Grants, University of California, Santa Cruz, California
OFFICIAL SIGNING FOR APPLICANT ORGANIZATION: Leo F. Laporte, Dean, Division of Natural Sciences
TELEPHONE NUMBER: [illegible]

CERTIFICATION AND ACCEPTANCE. We, the undersigned, certify that the statements herein are true and complete to the best of our knowledge and accept, as to any grant awarded, the obligation to comply with Public Health Service terms and conditions in effect at the time of the award.

SECTION I
DEPARTMENT OF HEALTH, EDUCATION, AND WELFARE
PUBLIC HEALTH SERVICE
RESEARCH OBJECTIVES

NAME AND ADDRESS OF APPLICANT ORGANIZATION
The Regents of the University of California, University of California, Santa Cruz, CA 95064

NAME, SOCIAL SECURITY NUMBER, OFFICIAL TITLE, AND DEPARTMENT OF ALL PROFESSIONAL PERSONNEL ENGAGED ON PROJECT, BEGINNING WITH PRINCIPAL INVESTIGATOR
W. Todd Wipke                  Associate Professor   Chemistry Board of Studies
Graham M. Smith                Research Associate    Chemistry Board of Studies
Hartmut Braun (SSN: none)      Research Associate    Chemistry Board of Studies
S. Krishnan (SSN: none)        Research Associate    Chemistry Board of Studies
Glenn I. Ouchi                 Research Assistant    Chemistry Board of Studies

TITLE OF PROJECT
RESOURCE-RELATED RESEARCH: BIOMOLECULAR SYNTHESIS

USE THIS SPACE TO ABSTRACT YOUR PROPOSED RESEARCH. OUTLINE OBJECTIVES AND METHODS. UNDERSCORE THE KEY WORDS (NOT TO EXCEED 10) IN YOUR ABSTRACT.

The objectives of this research are to develop the logical and heuristic principles of biomolecular synthesis and incorporate this information into a practical computer program to assist investigators in designing organic syntheses of complex biomolecules.
Special emphasis is placed on stereospecific syntheses using strategies involving steric effects, electronic effects, and strain energy, as well as symmetry, graph theory, and topology. The methods involve computer graphics input/output to the chemist, three-dimensional model building and analysis, strategic plan formation, selection of relevant chemical transforms, and evaluation of generated synthetic schemes. The project is an extension of the SECS (Simulation and Evaluation of Chemical Synthesis) program and proposes to collaborate with the Stanford SUMEX resource.

PHS-398 Page 2 Rev. 3-70

SECTION II - PRIVILEGED COMMUNICATION                                        T. Wipke

DETAILED BUDGET FOR FIRST 12-MONTH PERIOD
FROM October 1, 1976 THROUGH September 30, 1977
AMOUNT REQUESTED (Omit cents)

PERSONNEL
NAME                     TITLE OF POSITION                      TIME OR EFFORT            SALARY   FRINGE BENEFITS    TOTAL
Wipke, W. Todd, Ph.D.    Principal Investigator                 2 summer months, 100%      4,704         31           4,735
Braun, H., Ph.D.         Postdoc. II                            All year, 100%            12,105      1,816          13,921
Smith, G., Ph.D.         Postdoc. I                             All year, 100%            11,598      1,740          13,338
Unnamed, Ph.D.           Postdoc. I                             All year, 100%            11,142      1,671          12,813
Unnamed                  Programming Asst., equivalent to       All year, approx. 57%      6,000         40           6,040
                         one Prog. Asst. II
Ouchi, Glenn, M.S.;      Two (2) Grad. Students                 All year, 50%             10,596         70          10,666
Unnamed                  (Res. Assts.)
Unnamed                  One Secretary I                        All year, 50%              4,011        602           4,613
Totals                                                                                    60,156      5,970          66,126

CONSULTANT COSTS                                                                                0
EQUIPMENT (See attached equipment list)                                                    74,338
SUPPLIES: Magnetic tape, electrostatic paper, movie film, office supplies                   3,000
TRAVEL, DOMESTIC: Trips to conference and SUMEX site                                        1,500
TRAVEL, FOREIGN: One overseas trip to present paper and visit computer
    synthesis projects, round trip to Buergenstock, Switz.                                  1,500
PATIENT COSTS (See instructions)                                                                0
ALTERATIONS AND RENOVATIONS                                                                     0
OTHER EXPENSES (Itemize):                                                                  25,620
    Two leased lines to SUMEX resource (4,500)
    Publication costs (3,000)
    Manuals and documentation (1,000)
    DEC maintenance of equipment (8,000)
    Telephone equipment rental, long distance, postage (2,000)
    Computer time, IBM 360/40 (1,000)
    Disk drive lease (6,120)
TOTAL DIRECT COST (Enter on Page 1, Item 5)                                               172,084

INDIRECT COST (See instructions): 34.2% TDC (modified). DATE OF DHEW AGREEMENT: March 26, 1975.
*IF THIS IS A SPECIAL RATE (e.g. off-site), SO INDICATE.

BUDGET ESTIMATES FOR ALL YEARS OF SUPPORT REQUESTED FROM PUBLIC HEALTH SERVICE
DIRECT COSTS ONLY (Omit cents)

DESCRIPTION                     1ST PERIOD (same as     ADDITIONAL YEARS SUPPORT REQUESTED
                                detailed budget)        2ND YEAR    3RD YEAR
Personnel costs                       66,126              71,574      77,874
Consultant costs                           0                   0           0
Equipment                             74,338               3,000       3,000
Supplies                               3,000               3,000       3,000
Travel, domestic                       1,500               1,500       1,500
Travel, foreign                        1,500               1,500       1,500
Patient costs                              0                   0           0
Alterations and renovations                0                   0           0
Other expenses                        25,620              26,000      26,000
Total direct costs                   172,084             106,574     112,874

TOTAL FOR ENTIRE PROPOSED PROJECT PERIOD (Enter on Page 1, Item 4): $391,532

REMARKS: Justify all costs for the first year for which the need may not be obvious. For future years, justify equipment costs, as well as any significant increases in any other category. If a recurring annual increase in personnel costs is requested, give percentage. (Use continuation page if needed.)

Budget Justification

Personnel

Summer salary is requested for Professor Wipke. Also, funds for three postdoctoral fellows are requested. Drs. Braun, Smith, and Krishnan are currently working in this research group.
These postdoctoral positions provide training in computer synthesis, and the fellows bring the high level of chemical knowledge needed in this work. Support is requested for graduate student Glenn Ouchi, who is already working on this project, and for one other graduate student yet to be named. Graduate students are important to this project because they provide needed continuity. Additional programming assistance will be provided by undergraduates, especially by juniors and seniors continuing their research work during the summer. This gives many premedical students valuable computer experience. Also requested is partial support (50%) of a secretary for typing manuscripts, documentation, and project-related correspondence with users. Salaries were figured to include an anticipated 5% range adjustment in each new fiscal year plus normal promotions or merit increases when appropriate.

EQUIPMENT LIST

GT44 Graphics terminal                                             $34,500
DL11-E Asynchronous link to host computer                              595
Interface to 9 track magnetic tape drive                             3,000
DD11-B Systems Unit Cage                                               275
LV11BA Electrostatic printer/plotter                                12,400
GP-3-3D Graphic Tablet and model 1454 interface                      6,700
16mm movie camera with animation motor                               4,000
2 Teletype terminals (1 Data Media, 1 Thermal Printing)              5,000
2 pair VA 3405 Vadic 1200 baud modems with VA 1601 enclosure
    and power supply                                                 3,660
Subtotal                                                           $70,130
California State Sales Tax (6%)                                      4,208
Total Cost                                                         $74,338

Equipment

This project plans to use the SUMEX Resource at Stanford University (directed by Professor J. Lederberg, Department of Genetics) for computing, file storage and scientific collaboration.
SUMEX is the Stanford University Medical Experimental Computer System. Its objectives are the application of advanced computer science to the field of medicine and the exploration of collaborative interactions, facilitated by computer networking, between active researchers in this area. The Biomolecular Synthesis project described in this proposal is an application of artificial intelligence to chemical synthesis problems in medicine, and consequently it has much in common with other projects at SUMEX. For the past 6 months we have been users of the SUMEX Resource over a leased and a dial-up line.

File storage is a major problem at SUMEX. The total file space currently is 120 K pages (1 page = 512 words of 36 bits each), which is completely allocated. We have had an allocation of 4 K pages, but requested [illegible] K pages. To ease this problem for the entire SUMEX community, we request funds for an RP03 disk drive (20 K pages) to attach to the last available slot in the disk controller at SUMEX, for a 17% increase in on-line storage. This is shown in the budget as a 3-year lease plan under the category "other expenses".

The remainder of the equipment requested is to facilitate efficient use of people and of SUMEX from Santa Cruz. We have found graphical input and output to be the most efficient means of interfacing chemist to computer, because structural diagrams are the natural language of the chemist. Since this project involves computer construction and analysis of three-dimensional molecular models, we need to be able to visualize these models, to rotate them, and to move atoms in three-space by hand (to change conformation). At Princeton Dr. Wipke developed a 3-D acoustic tablet for modifying such models and had available an LDS-1 display system for rotation of the models. When he moved to Santa Cruz that equipment remained behind.
Currently we are using a GT40 display, which has a small screen (12") and is too slow to rotate three-dimensional structures to create a kinetic depth effect. The requested GT44 graphics terminal is 3 times faster than the GT40 and can rotate small molecules (~72 atoms) in real time. Additionally it has a larger screen (17"). Since it is software compatible with the GT40, we will be able to do two-dimensional graphics on either terminal. Currently time is lost waiting to use the GT40, because we are building on an operational graphics program and more time is spent in graphical debugging and testing. The GT44 would provide more available display time. The GT44 is also the more sensible terminal to which the requested peripherals can be interfaced. The two options, 1) expand the existing GT40 to accommodate the peripherals (software and hardware required) and 2) purchase a GT44 and connect the peripherals to it, are very close in cost. Option 2 provides two graphics terminals instead of one, gives us the needed 3-D capability, gives more flexibility in the event of hardware failure, and is basically a more supportable, reliable system.

The GP-3-3D is an acoustic 3-D tablet now marketed by Science Accessories. It is used for 2-dimensional drawing as well as for 3-D moving of atoms and tracing of 3-D models. It is really the only efficient way to change the conformation of a structure. We used one extensively at Princeton (the prototype) and are severely hampered now without it. It is a necessity for this work.

The DL11-E asynchronous link connects the GT44 to the Vadic modem. Two pair of Vadic modems are requested, one pair for each of two leased lines connecting the two graphics terminals to SUMEX. That is, this grant would cover all the cost of communication right up to the SUMEX computer, including the modems at SUMEX. Vadic modems seem to be extremely reliable. (TYMNET is also using Vadic modems.)
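The storage and communication figures quoted in this section can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in this proposal (120 K pages of file space, a 20 K page drive, 512 words of 36 bits per page, 1200 baud modems). It assumes, for illustration only, that 1200 baud corresponds to 1200 bits per second and that each 8-bit character costs 10 bits on an asynchronous line. Neither assumption comes from the proposal, and the function names are hypothetical.

```python
# Figures taken from the proposal text.
CURRENT_PAGES_K = 120      # total SUMEX file space, in K pages
ADDED_PAGES_K = 20         # capacity of the requested RP03 drive, in K pages
WORDS_PER_PAGE = 512
BITS_PER_WORD = 36
LINE_RATE_BAUD = 1200      # Vadic modem rating from the equipment list

# Assumptions made only for this sketch (not stated in the proposal).
BITS_PER_SECOND = LINE_RATE_BAUD   # assume 1 baud = 1 bit per second
LINE_BITS_PER_DATA_BYTE = 10       # assume start bit + 8 data bits + stop bit


def storage_increase_percent(current_k, added_k):
    """Percentage growth in on-line storage when added_k is attached."""
    return 100.0 * added_k / current_k


def page_bits():
    """Number of bits in one SUMEX page."""
    return WORDS_PER_PAGE * BITS_PER_WORD


def page_transfer_seconds():
    """Estimated time to send one page over one leased line."""
    data_bytes = page_bits() / 8
    return data_bytes * LINE_BITS_PER_DATA_BYTE / BITS_PER_SECOND


if __name__ == "__main__":
    increase = storage_increase_percent(CURRENT_PAGES_K, ADDED_PAGES_K)
    print(f"storage increase: {increase:.1f}%")
    print(f"bits per page: {page_bits()}")
    print(f"seconds per page: {page_transfer_seconds():.1f}")
```

Under these assumptions the added drive raises on-line storage by about 16.7%, which rounds to the 17% quoted, and one page of 18,432 bits takes roughly 19 seconds to cross a single line.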
The two teletype terminals are needed for routine text editing of programs. Although the market is changing rapidly, at this time a Data Media terminal seems a good choice, because it is the standard at SUMEX, meaning maintenance is easier, and SUMEX supports TV-edit, which uses the capabilities of the terminal to advantage. The other terminal would be a portable thermal printing terminal that could be checked out for remote use, to encourage use during the graveyard shift.

A 9-track magnetic tape drive from the University of California is available at no cost. Information Sciences here has designed and tested an interface. Funds are requested to duplicate their operational interface. This tape drive will allow us to rapidly transmit files to, and receive files from, SUMEX, eliminating mail delays of several days. We have a number of infrequently used programs which can be loaded only when needed. Also, we will be better able to send tapes of programs requested by users. The DD11-B Systems unit cage is needed for the tape drive interface.

The electrostatic printer/plotter is needed to provide rapid listings locally without again waiting days for the mail from Stanford to arrive. This device will also be used for graphically recording syntheses, plotting 3-D models, and preparing documentation. It is quiet and highly reliable, with little maintenance required. The high resolution model was selected because it is needed for plotting.

The movie camera will be used to film graphical results, for documentation and recording interaction as well as for producing demonstration films and teaching films. Films are even superior to on-line demonstrations for presentations to large audiences and are essential to describing an interactive-graphics program.
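The equipment request described above can be cross-checked against the equipment list. The sketch below re-adds the list prices and applies the 6% California sales tax, using only figures from this proposal. The dictionary keys and the function name are hypothetical labels chosen for this illustration, and rounding the tax to the nearest dollar is an assumption.

```python
# List prices in dollars, copied from the equipment list in this proposal.
EQUIPMENT = {
    "GT44 graphics terminal": 34_500,
    "DL11-E asynchronous link": 595,
    "9-track tape drive interface": 3_000,
    "DD11-B systems unit cage": 275,
    "LV11BA electrostatic printer/plotter": 12_400,
    "GP-3-3D graphic tablet with interface": 6_700,
    "16mm movie camera with animation motor": 4_000,
    "2 teletype terminals": 5_000,
    "2 pair Vadic modems with enclosure": 3_660,
}
SALES_TAX_RATE = 0.06


def equipment_total(items, tax_rate):
    """Return (subtotal, tax, total) in whole dollars."""
    subtotal = sum(items.values())
    tax = round(subtotal * tax_rate)  # assumed: tax rounded to nearest dollar
    return subtotal, tax, subtotal + tax


if __name__ == "__main__":
    subtotal, tax, total = equipment_total(EQUIPMENT, SALES_TAX_RATE)
    print(f"subtotal ${subtotal:,}  tax ${tax:,}  total ${total:,}")
```

The result matches the listed subtotal of $70,130, tax of $4,208, and total of $74,338.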
In years 02 and 03, $3,000 is for additional equipment, for example an additional teletype or an interactive device such as switches, knobs, or a color wheel for making color films of the graphical displays.

Domestic travel is needed for presenting the results of this work at national meetings and for travel to the SUMEX site. The DENDRAL and SECS projects have joint group meetings monthly. Foreign travel is requested to present the results of this work overseas at invited lectures. The Buergenstock Conference is especially important since it is concerned mainly with stereochemistry. Additionally, Dr. Francois Choplin, Strasbourg, France, is modifying SECS for inorganic chemistry, and periodic visits will be helpful in exchanging information relevant to this project.

Other Expenses

The cost for two leased lines from Santa Cruz to Stanford (35 air miles) is an annual cost and includes installation. These lines are needed to obtain reliable high transfer rates for interactive graphics when the graphics terminal is remote from the host. Publication costs include costs of reprints, photography, and page charges. Manuals and documentation covers the purchase of manuals and the cost of reproducing documentation which this project generates. Maintenance is the standard 8 hour maintenance provided by DEC, except for the RP03 disk, which has 12 hour maintenance. All postage and telephone expenses (rental and long distance) related to research are recharged to that research by the University. Computer time on the IBM 360/40 is to cover magnetic tape utilities and miscellaneous computing related to this research. The lease of an RP03 disk drive is based on a three year lease. The disk drive will be physically located at the SUMEX facility at Stanford, but will be supported from this grant.

BIOGRAPHICAL SKETCH
(Give the following information for all professional personnel listed on page 3, beginning with the Principal Investigator. Use continuation pages and follow the same general format for each person.)

NAME: W. TODD WIPKE
TITLE: Associate Professor
BIRTHDATE: 16 December 1940
PLACE OF BIRTH: St. Charles, Missouri, USA
PRESENT NATIONALITY: U.S.
SEX: Male

EDUCATION
INSTITUTION AND LOCATION        DEGREE     YEAR CONFERRED   SCIENTIFIC FIELD
University of Missouri          B.S.       1962             Chemistry
University of California        Ph.D.      1965             Chemistry
Harvard University              Postdoc    1967-69          Chemistry and Computer

HONORS
Sigma Xi, 1964; Phi Beta Kappa, 1962; University of Missouri, 1962, Honors in Chemistry; Distinguished Military Graduate, 1962; Omicron Delta Kappa (leadership and scholarship), 196[illegible]; Pi Mu Epsilon (mathematics), 1960; Sigma Rho Sigma (scholarship), 1959; Commendation Medal, US Army, 1967.

MAJOR RESEARCH INTEREST: Organic Chemistry Structure and Synthesis
ROLE IN PROPOSED PROJECT: Principal Investigator

RESEARCH SUPPORT
NIH GM 22990-01, Biogenetic-Like Cyclizations of Macrocyclic Polyenes           $18,109
IBM Fellowship for Krishnan Subramanian, December '75 - June '76                  8,000
Merck, Sharp, and Dohme Fellowship, September '75 - June '76                      8,000
Initial starting grant from UC Santa Cruz (equipment only)                        4,000
Ortho Pharmaceutical (application for support for chemicals)                      1,000

RESEARCH AND/OR PROFESSIONAL EXPERIENCE (Starting with present position, list training and experience relevant to area of project. List all or most representative publications. Do not exceed 3 pages for each individual.)

1975-      Associate Professor, University of California, Santa Cruz, CA
1969-75    Assistant Professor, Princeton University, Princeton, New Jersey
1967-69    Postdoctoral Fellow in Chemistry, Harvard University, Cambridge, Massachusetts, The Synthesis of Sativene and Related Natural Products, E.J. Corey
1965-67    Automatic Data Processing and Analysis Officer, US Army Combat Developments Command, Air Defense Agency, Fort Bliss, Texas
1964-65    NIH research assistant fellowship, University of California, Berkeley
1962-65    Graduate Work on Photochemistry of Transoid Dienes, Prof. W.G. Dauben, University of California, Berkeley
1962       Research Chemist, ESSO Research Engineering Company, Baton Rouge, Louisiana (summer)

Co-Chairman   ACS Symposium on Computer-Assisted Design of Organic Syntheses, April 1976
Member        National Academy of Sciences Committee to Establish a National Resource for Computation in Chemistry, 1974-present
Member        Chemical Abstracts Advisory Board (1969-1972)
Member        Editorial Advisory Board of Chemical Substructure Index; Editorial Board, Computers and Chemistry, Pergamon Press; Editorial Board, Journal of Chemical Information and Computer Science, American Chemical Society
Consultant    Merck, Sharp, and Dohme; Squibb; BASF
Director      NATO Advanced Study Institute on Computer Representation and Manipulation of Chemical Information

Computer Experience

W.T. Wipke began using computers in 1962 at Berkeley, programming the IBM 7094 DCS System in FORTRAN and FAP/MAP to solve NMR analysis problems and photochemical kinetics problems. Used the SDS-910 system for exhaustive enumeration of all tricyclic undecanes.
Spent two years as an Army Systems Analyst in the Air Defense Agency, Fort Bliss, Texas, working with the IBM 7094 at White Sands, New Mexico, and with IBM 360 models 50 and 65; attended IBM Systems programmer school at Los Angeles; and learned techniques of simulation, of managing the development of complex programs, and of coordinating programming teams.

Spent two years at Harvard, working closely with Professors Thomas Cheatham and Ivan E. Sutherland in building a program to predict organic syntheses using graphics. Audited courses in graphics and linguistics. Developed system software on the PDP-1 for graphics, color display, dynamic storage management, program overlaying, list processing, 3-dimensional display, and virtual memory systems.

At Princeton he and his group completed the first stage of SECS, a program to help design stereospecific syntheses; developed GIGL, a general interactive graphics language; ALCHEM, a language for describing chemical reactions; SYNCOM, a compiler for ALCHEM; and SYMIN, a 3-D molecular model builder and a 3-D tablet.

Synthetic Experience

Synthesis of decalins and strained ring systems (with Dauben); sesquiterpenes, sativene (with Corey); and at Princeton design and execution of novel approaches to sirenin, cantharidin, and palasonin. Current syntheses underway include an approach to macrocycles and a new steroid synthesis. Other work includes a study of palladium π-complexes in synthesis. Considerable synthetic experience has also been gained in six years of designing computer programs which design syntheses, and in organizing the body of chemical reactions.

Conferences (recent)

Invited Speaker: "The Rudolph Anderson Symposium on Innovations in the Methods and Tools of Synthetic Organic Chemistry", Yale University, Jan. 14-15, 1971.
Invited Speaker: "Applications of Computers in Synthesis", Stanford Symposium on "Synthesis: A Science for All Seasons", Nov. 12-14, 1973.
Invited Speaker: Symposium on Strategies in Organic Synthesis, sponsored by Societe Chimique de Belgique at the University of Louvain-la-Neuve, 1974.
Invited Speaker: Artificial Intelligence in Medicine Symposium, Rutgers, 1975.

PUBLICATIONS (recent relevant)

1. E.J. Corey and W.T. Wipke, "Computer-Assisted Design of Complex Molecular Syntheses", Science, 166, 178 (1969).
2. E.J. Corey, W.T. Wipke, R.D. Cramer, and W.J. Howe, "Computer-Assisted Synthetic Analysis: Facile Man-Machine Communication of Chemical Structure by Interactive Computer Graphics", J. Amer. Chem. Soc., 94, 421 (1972).
3. E.J. Corey, W.T. Wipke, R.D. Cramer, and W.J. Howe, "Techniques for Perception by a Computer of Synthetically Significant Structural Features in Complex Molecules", J. Amer. Chem. Soc., 94, 431 (1972).
4. W.T. Wipke and A. Whetstone, "Graphic Digitizing in 3-D", Computer Graphics, 5, 10 (1971).
5. P. Gund, W.T. Wipke, and R. Langridge, "Computer Searching of a Molecular Structure File for Pharmacophoric Patterns", Computers in Chemical Research and Education, Elsevier, Amsterdam, vol. II (1973), pp 5/33-38.
6. W.T. Wipke and T.M. Dyott, "Simulation and Evaluation of Chemical Synthesis. Computer Representation and Manipulation of Stereochemistry", J. Amer. Chem. Soc., 96, 4825 (1974).
7. W.T. Wipke and T.M. Dyott, "Stereochemically Unique Naming Algorithm", J. Amer. Chem. Soc., 96, 4834 (1974).
8. W.T. Wipke and P. Gund, "Congestion: A Conformation-Dependent Measure of Steric Environment. Derivation and Application in Stereoselective Addition to Unsaturated Carbon", J. Amer. Chem. Soc., 96, 299 (1974).
9. W.T. Wipke, "Computer-Assisted Three-Dimensional Synthetic Analysis", in Computer Representation and Manipulation of Chemical Information, ed. W.T. Wipke, S.R. Heller, R.J. Feldmann, E. Hyde, John Wiley (1974), pp 147-174.
10. W.T. Wipke and T.M. Dyott, "Use of Ring Assemblies in a Ring Perception Algorithm", J. Chem. Info. and Computer Sci., 15, 140 (1975).
11. T.M. Gund, P.v.R. Schleyer, P.H. Gund and W.T. Wipke, "Computer Assisted Graph Theoretical Analysis of Complex Mechanistic Problems in Polycyclic Hydrocarbons. The Mechanism of Diamantane Formation from Various Pentacyclotetradecanes", J. Amer. Chem. Soc., 97, 743 (1975).

BIOGRAPHICAL SKETCH

NAME: GRAHAM M. SMITH
TITLE: Research Associate
BIRTHDATE: 11 November 1947
PLACE OF BIRTH: Bay Shore, New York
PRESENT NATIONALITY: U.S.
SEX: Male

EDUCATION
INSTITUTION AND LOCATION                    DEGREE    YEAR CONFERRED   SCIENTIFIC FIELD
Adelphi Suffolk College (Dowling)           B.A.      1969             Chemistry
State University of New York at Buffalo     Ph.D.     1974             Chemistry

HONORS: Samuel B. Silbert Fellowship, 1971-1972

MAJOR RESEARCH INTEREST: Organic Synthesis
ROLE IN PROPOSED PROJECT: Development of Computer Assisted Synthesis

RESEARCH SUPPORT
Merck, Sharp and Dohme Fellowship, December 1975 - June '76
IBM Fellowship, February 1973 - December 1975

RESEARCH AND/OR PROFESSIONAL EXPERIENCE
9/73-1/74    SUNYAB Teaching Assistantship, Qual.
Organic
6/71-9/73    SUNYAB Research Assistantship
6/70-6/71    SUNYAB Teaching Assistantship, Organic Chemistry
1/70-6/70    SUNYAB Teaching Assistantship, Analytical Chemistry
9/69-1/70    SUNYAB Teaching Assistantship, Freshman Chemistry
1/67-6/69    Scanner for the Brookhaven National Laboratory 80 inch Bubble Chamber Group

PUBLICATIONS
Wudl, F., Smith, G.M., Hufnagel, E.J., "Bis-1,3-dithiolium Chloride: an Unusually Stable Organic Radical Cation", Chem. Comm., 1456 (1970).
Wudl, F., Smith, G.M., "Coordination Complexes of Alkaline and Alkaline-Earth Ions II: Synthesis and Properties of Macrocyclic and Open-Chain Amino-Ethers and Their Derivatives", presented at the 164th National Meeting of the American Chemical Society in New York City, August 1972.
Green, E.A., Duax, W.L., Smith, G.M., Wudl, F., "Coordination Complexes of Group I and II; Potassium-O,O'-catecholdiacetate", JACS 97, 6689 (1975).

Member, American Chemical Society
Member (Assoc.), Assoc. for Comp. Mach.

BIOGRAPHICAL SKETCH

NAME: Hartmut W. Braun
TITLE: Research Postdoctoral Fellow
BIRTHDATE: Aug. 31, 1947
PLACE OF BIRTH: Freudenstadt, W. Germany
PRESENT NATIONALITY: German, J-1, Sept. 1976
SEX: Male

EDUCATION
INSTITUTION AND LOCATION              DEGREE             YEAR CONFERRED   SCIENTIFIC FIELD
University of Goettingen, Germany     Diplom-Chemiker    1971             Internal rotation molecules
                                      Dr.                1974             Theoretical and spectroscopic methods [illegible] internal rotation molecules

MAJOR RESEARCH INTEREST: Application of computers and programming in chemistry
ROLE IN PROPOSED PROJECT: Research Associate

RESEARCH SUPPORT
Jan 75 - Dec 75: Ausbildungsstipendium (training and research fellowship) from Deutsche Forschungsgemeinschaft, Bonn, Germany

RESEARCH AND/OR PROFESSIONAL EXPERIENCE
Princeton: 6 months of work with the synthesis program SECS.
Goettingen: 3 years of experience with infrared spectroscopy, molecular mechanics calculations and quantum mechanics calculations on problems of conformation equilibria, and programming related to these fields.

1. "Die innere Rotation der Butadien-Diepoxide", Diplomarbeit, Goettingen.
2. "Theoretische und spektroskopische Untersuchungen zum Konformationsgleichgewicht des Bicyclopropyls", Dissertation, Goettingen 1974.
3. "Die Rotationsisomerie des Bicyclopropyls. II. Die gauche/trans-Isomerisierungsenthalpie und -entropie von [illegible] aus IR-[illegible]messungen. Ein experimenteller Test", J. Mol. Str. 21, 39[illegible] (1975).
4. "Ueber die Bestimmung von Isomerisierungsenthalpien und -entropien mit der IR-Intensitaetsmethode", J. Mol. Str. 21, 415 (1975).
5. "Die Rotationsisomerie des Bicyclopropyls. III. Untersuchung der inneren Rotation von Bicyclopropyl, Vinylcyclopropan, und Butadien und einiger verwandter Verbindungen mit der Kraftfeld-Methode", J. Mol. Str., accepted for publication.

BIOGRAPHICAL SKETCH

NAME: Krishnan Subramanian
TITLE: Postdoctoral fellow
BIRTHDATE: 8-8-1949
PLACE OF BIRTH: Wadakancheri (Kerala), INDIA
PRESENT NATIONALITY: Indian, J-1, Nov. 1976
SEX: Male

EDUCATION
INSTITUTION AND LOCATION                            DEGREE    YEAR CONFERRED   SCIENTIFIC FIELD
Bombay University (INDIA)                           B.Sc.     1969             Physics
Bombay University (INDIA)                           M.Sc.     1971
Indian Institute of Science (Bangalore), INDIA      Ph.D.     1975             Chemical Information

HONORS: National Science Talent Scholar (INDIA)

MAJOR RESEARCH INTEREST: Computer application to chemistry
ROLE IN PROPOSED PROJECT: Research Associate

RESEARCH SUPPORT
IBM Fellowship, December 1975 - June 1976

RESEARCH AND/OR PROFESSIONAL EXPERIENCE
1. ALWIN - Algorithmic Wiswesser Notation System for Organic Compounds, J. Chem. Doc. 14, 130 (1974).
2. A Simplified Grammar for Algorithmic Wiswesser Notation Using Morgan Name (to appear in International Classification).

BIOGRAPHICAL SKETCH

NAME: GLENN I. OUCHI
TITLE: Research Associate
BIRTHDATE: 23 August 1949
PLACE OF BIRTH: Compton, California
PRESENT NATIONALITY: U.S.
SEX: Male

EDUCATION
INSTITUTION AND LOCATION                      DEGREE    YEAR CONFERRED   SCIENTIFIC FIELD
University of California, Los Angeles         B.S.      1971             Chemistry
University of Minnesota                       M.S.      1975             Organic Chemistry

HONORS: Kodak Summer Fellowship, 1972

MAJOR RESEARCH INTEREST: Organic Synthesis
ROLE IN PROPOSED PROJECT: Development of Computer Assisted Synthesis

RESEARCH SUPPORT
Teaching Assistantship, UCSC

RESEARCH AND/OR PROFESSIONAL EXPERIENCE
9/75-present   UCSC Teaching Assistantship, Organic Chemistry
1/75-9/75      Foothill College, Instructor of Chemistry, General/Organic
4/73-9/75      Research Chemist, Stanford Research Institute
9/72-3/73      University of Minnesota, Teaching Assistantship, Organic Chemistry
9/71-9/72      University of Minnesota, Teaching Assistantship, General Chemistry

Publication: Ouchi, G.I., Spanggord, R.J., Francis, A.J., "Degradation of Lindane by E. coli", Appl. Microbiol. 29(4), 567 (1975).

RESOURCE-RELATED RESEARCH: BIOMOLECULAR SYNTHESIS

RESEARCH PLAN

A. Introduction

1. Objective: The development of new drugs and the study of how drug structure is related to biological activity depend upon the chemist's ability to synthesize new molecular structures as well as his ability to modify existing structures or to incorporate isotopic labels into biomolecular substrates.
The long-term objective of this research is to develop the logical principles of molecular construction and employ these in practical computer programs to assist investigators in designing stereospecific syntheses of complex bio-organic molecules of the type encountered in natural products and pharmaceuticals. While some progress has been made toward this objective, there is much to do. In this proposal we plan to build on the current SECS synthesis program, increasing our coverage of chemistry, increasing speed and efficiency of processing, and capitalizing on our steric and electronic perception in strategy and plan development. We plan to evaluate this program by making it available over a nation-wide network to interested health-related non-profit users. We will also explore other possible applications of the SECS program in chemistry and other applications of the program modules, for example to explore the forward-working approach to synthesis.

2. Background: Although instrumentation has dramatically improved the speed of structural analysis over the past 20 years, there has not been a significant increase in the number of reactions a chemist runs per month. Execution of even rather simple synthetic schemes may require a commitment of several man-years of laboratory work. Sarett noted that speed in the laboratory will not change much and that the greatest gains will come in the area of synthetic design. He envisioned the use of computers in a backward-working analysis to generate a "synthesis tree". Of course the reason computer analysis is desirable is that the computer can remember the many known chemical reactions and methodically apply them to generate a large number of unbiased synthetic routes, from which the chemist can select the best to actually execute in the laboratory. Thus he is assured of having considered all reasonable alternatives.
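The backward-working analysis described above can be pictured with a small sketch. This is only an illustration under stated assumptions, not the method of SECS or of any program named in this proposal: molecules are plain text labels rather than structures, the two transforms and the list of available starting materials are hypothetical toy data, and the names `Node`, `expand`, `solved`, and `lookup` are invented for the example.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set, Tuple

# A transform works in the antithetic (backward) direction: given a target,
# it returns zero or more precursor sets, each of which would yield the target.
Transform = Callable[[str], List[Tuple[str, ...]]]


@dataclass
class Node:
    """One molecule in the synthesis tree."""
    molecule: str
    # Each entry is (transform name, precursors that are all needed together).
    steps: List[Tuple[str, List["Node"]]] = field(default_factory=list)


def expand(molecule: str, transforms: Dict[str, Transform],
           available: Set[str], depth: int) -> Node:
    """Grow the tree backward from `molecule` for at most `depth` steps."""
    node = Node(molecule)
    if depth == 0 or molecule in available:
        return node
    for name, transform in transforms.items():
        for precursors in transform(molecule):
            children = [expand(p, transforms, available, depth - 1)
                        for p in precursors]
            node.steps.append((name, children))
    return node


def solved(node: Node, available: Set[str]) -> bool:
    """True if some route below `node` ends only in available materials."""
    if node.molecule in available:
        return True
    return any(all(solved(child, available) for child in children)
               for _, children in node.steps)


def lookup(table: Dict[str, List[Tuple[str, ...]]]) -> Transform:
    """Build a toy transform from a target -> precursor-sets table."""
    return lambda target: table.get(target, [])


# Hypothetical toy data: molecules are just labels, not real structures.
TRANSFORMS = {
    "esterification": lookup({"ester": [("acid", "alcohol")]}),
    "oxidation": lookup({"acid": [("aldehyde",)], "aldehyde": [("alcohol",)]}),
}
AVAILABLE = {"alcohol"}

tree = expand("ester", TRANSFORMS, AVAILABLE, depth=3)
print(solved(tree, AVAILABLE))
```

Each step groups the precursors that must be used together, so a target counts as solved only when at least one step has every precursor traced back to an available material. The depth limit is a crude stand-in for the strategic control that a real program would need to keep the tree from growing without bound.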
Research in this area began with the representation of molecular structure and generation of isomers. Later, with Corey, the first computer program (OCSS, later called LHASA) was developed which generated synthetic schemes using the logic-oriented approach. At this time the programs were written in assembly language and considered only the connectivity of a structure, completely ignoring stereochemistry, steric hindrance, strain effects, and proximity, but still could produce some interesting carbocyclic syntheses.

In 1969 Dr. Wipke at Princeton began developing on a PDP-10/LDS-1 system in FORTRAN a new synthesis program called SECS (Simulation and Evaluation of Chemical Synthesis) to concentrate on stereospecific syntheses, taking into consideration the three-dimensional structure of the target molecule. Algorithms were developed for representing and manipulating stereochemistry, building a three-dimensional model, and analysing proximity and steric congestion from the model. SECS included new areas of chemistry: heterocyclic and protecting group chemistry. (Further details on SECS are presented in section A4.)

Meanwhile at Harvard the emphasis in LHASA was toward the building of sophisticated transforms (e.g., Diels-Alder) which could cause generation of up to 15 step sequences of reactions to achieve a certain type of synthesis. The rigidities of functional group interconversion in this early work appear to be at least partially overcome by a more recent algorithm (FGI) for up to 4 FGI's. Other recent work includes further definition of ring and appendage¹⁴ strategic bonds and functional group protection.
Elsewhere Gelernter at Stony Brook has written a PL-1 non-interactive batch program which works backwards from the target, but selects by itself nodes in the synthesis tree to be developed. This program, called SYNCHEM, uses the Aldrich WLN file of available compounds as acceptable termination points. The representation of chemistry in SYNCHEM is less detailed than in SECS or LHASA, but more attention is given to traditional heuristic tree search methods.

Yet another approach is that of Ugi, which is based on a matrix formalism. The eventual goal of his approach is to make and break bonds in all possible "legal" ways without empirical rules. The biggest application of this type of approach will probably be to discovery of new reactions rather than new syntheses. An implementation of this approach (CICLOPS) has been described, but unfortunately, no examples of results have appeared.¹⁹ A new version called MATSYN is in progress.

Hendrickson has made some significant contributions, not in the computer area, but in classifying reactions by changes in oxidation state and more recently by half reactions.²² The latter is particularly useful for setting up a molecule to break strategic bonds. He also developed an approach to electrophilic aromatic substitution and for manually analysing all the rings in a molecule.²² Sinanoglu at Yale has studied from a graph theoretical standpoint networks of reactions and numbers of pathways within such a network. Jeff Powers at Carnegie Mellon has done some reaction path analysis in chemical engineering, Howard Whitlock at Wisconsin has explored linguistics in the functional group switching problem, and R.V.
Stevens explored ene-reactions with a small special purpose program.²⁴

Significant advances in other areas of chemical inference which have a bearing on synthesis include the DARC system of DuBois, which utilizes an unusual description of structure,²⁵ the DENDRAL mass spec analysis and the CONGEN structure generator, the implication of structure from mass spectra by pattern recognition techniques,²⁸ the interactive graphical substructure search system of Feldmann and Heller, and the work in reaction documentation.³⁰ A NATO Advanced Study Institute recently reviewed the state of the art in these areas.

A short history of SECS and recent advances by Dr. Wipke's group are presented in Section 4, Progress Report.

3. Rationale: The central goals in synthetic design are generation of chemically valid synthetic routes to a target molecule and then selection of the "best" routes. One clearly can not attain either of these goals if he ignores the important principles relating chemical reactivity to molecular shape and configuration or if he ignores important areas of chemistry such as heterocyclic chemistry; yet these areas have been ignored. Therefore this research project is oriented to develop the ability to utilize stereochemistry in synthetic analysis, to develop strategies and new heuristics relating to stereochemistry, and to include heterocyclic and aromatic as well as carbocyclic chemistry. Stereochemistry includes both the conformation independent configurational relations (cis-trans) as well as the conformation dependent relations (steric hindrance, proximity, orientation). Since the chemist uses molecular models, the computer should know how to build and analyze 3-dimensional molecular models, recognize enantiomers, evaluate steric environment, etc.
Only in this way can the computer have any reasonable chance of predicting reactivity accurately enough to permit selection of "best" routes. An analysis of "Structures of Current Interest to the Chemotherapy Program, Sept 1971" showed 50% of the structures contained aromatic or heterocyclic ring systems, indicating the importance of this area of chemistry. Therefore, this research is also aimed at representation of the chemistry needed in the synthesis of these systems, analysis of electronic properties of such systems, and special strategies for their synthesis.

When we teach students organic chemistry, we first teach structure and nomenclature, next mechanisms and reactions, and finally strategies for synthesis. The same logical order pertains to teaching the computer. Initial priorities of this research focused on developing a general representation of stereochemistry, a capability to build 3-D models, and a representation of carbocyclic reactions including stereochemical consequences as well as the relationship between stereochemistry and reactions (e.g., how to estimate steric hindrance). Although work in these areas continues, new priorities are 1) development of higher level heuristics and strategies which connect stereochemistry, symmetry, proximity, etc., with high level planning (e.g., how does one capitalize on the fact that one functional group is highly hindered?), and which interconnect carbocyclic and heterocyclic chemistry; 2) exploring efficient ways of constraining the generation of valid, but uninteresting synthetic pathways; and 3) evaluation of the SECS program on problems by us and users, using feedback for further improvement.

This research focuses on exactly those important aspects of chemistry neglected in other computer synthesis programs, but at the same time, this research will not neglect synthetic strategies based on connectivity alone.
The more complete treatment of chemistry in our approach is expected to provide greater power in synthetic design, especially in complex biomolecular syntheses.

4. Comprehensive Progress Report

This proposal is a new proposal, not technically a renewal, but because considerable work has been done before under another NIH grant, we present a short report of this progress beginning in 1970 to assist the reader in viewing what has been done.

a) Original objectives: To design a program for computer-assisted design of complex organic syntheses which considered the three-dimensional nature of molecules and the important principles derived therefrom: stereochemistry, proximity, steric effects, strain energy, and stereo-electronic control. These principles are important to the logic used in the planning of a synthesis of most drugs and molecules.

Requirements
1) Program should easily communicate with any organic chemist
2) Program must "understand" the stereochemistry of a complex target molecule
3) Program must be able to build a three-dimensional model of molecule
4) Program must be able to analyze 3-D model for steric effects, etc.
5) Must have knowledge of reactions and be able to apply them when appropriate
6) Must generate precursors with proper stereochemistry according to mechanism of the reaction applied and stereochemistry of target
7) Chemist should be able to easily add new reactions to reaction library
8) Program should be applicable to real problems of interest

4.b. Summary of Results

A general FORTRAN program (SECS) for designing stereospecific syntheses was written for the PDP-10/LDS-1 system at Princeton.
This was the first program to accept standard structural diagram input with stereochemistry, correctly manipulate structures according to stereospecific chemical transforms, and reconstruct valid structural diagrams for output. The internal representation facilitates symbolic recognition of enantiomeric, diastereomeric, and isomorphic structures, and cis-trans relationships. The program constructs a 3-D model⁸ of the structure which can be viewed in 3-D and modified in conformation using a 3-D acoustic tablet developed in this work.³³ Using this model SECS evaluates steric congestion at a reaction center, which has been correlated with experimental product distribution. A language for representing reactions (ALCHEM)³⁴ has been developed which accommodates ab initio electron-pushing as well as empirical name reactions.

SECS-II (1975) incorporated functional group protection, an initial approach at heterocyclic chemistry, and an electronic energy calculation module. SECS-II was placed on the First Data Corporation timesharing system for access by any interested parties. It also was brought up on a UNIVAC with a GT40 graphics terminal in Strasbourg, France.

This research was moved to the University of California, Santa Cruz, and was granted an allocation of the SUMEX resource. SECS was converted to TENEX and the GT40 display system. Since then a new strategy module and a symmetry module have been under development, as well as optimization of the program for remote graphics. SECS has also been used to find the rearrangement pathway to diamantane, and for building many models of drugs for antileukemia pattern searching.³⁶

4.c. Detailed Progress Report

First the organization of SECS-II will be described, then each of the previous requirements will be discussed as to specific progress in that area.
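As an aside on the symbolic recognition of enantiomeric and diastereomeric structures mentioned in the summary of results, the underlying comparison can be sketched in a few lines. This is not the SECS naming algorithm, which is not reproduced in this proposal. The sketch assumes, purely for illustration, that two structures already share the same connectivity, that their stereocenters are listed in the same canonical order, that every listed center is a true stereocenter (no symmetry-related or meso cases), and that each center carries a parity of +1 or -1. The names `mirror`, `relation`, and `racemic_key` are hypothetical.

```python
from typing import Tuple

# One parity (+1 or -1) per stereocenter, listed in a fixed canonical order.
Parities = Tuple[int, ...]


def mirror(parities: Parities) -> Parities:
    """Descriptor of the mirror image: every center is inverted."""
    return tuple(-p for p in parities)


def relation(a: Parities, b: Parities) -> str:
    """Classify two stereoisomers that share the same connectivity."""
    if len(a) != len(b):
        raise ValueError("descriptors must cover the same stereocenters")
    if a == b:
        return "identical"
    if a == mirror(b):
        return "enantiomers"
    # Some, but not all, centers differ.
    return "diastereomers"


def racemic_key(parities: Parities) -> Parities:
    """One key shared by a structure and its mirror image."""
    return min(parities, mirror(parities))


print(relation((1, -1, 1), (-1, 1, -1)))
print(racemic_key((1, -1)) == racemic_key((-1, 1)))
```

The `racemic_key` helper reflects the point made under Requirement 2, that enantiomers can be treated as the same structure when racemic material is being synthesized: both mirror images map to one key, so duplicates can be dropped with an ordinary set lookup.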
SECS-II Program Organization

SECS used to occupy 11 segments operating in 48 K words of memory, but it is no longer overlaid on the SUMEX system because, with paging virtual memory, there is no observable advantage in overlays. SECS still uses disk for storage of chemistry files and structures, the disk being utilized as backing store for variable length dynamically allocated virtual memory (software implemented). Disk space is still a problem because of the need to store source files on-line and the need to have several versions available, but memory no longer is a problem.

Following the modules shown in Fig. 1, the investigator picks up the light pen or acoustic pen and draws in a molecule, using normal structural diagram notation including hashed and wedged lines (Figure 2). Alternatively, if he is using only a teletype rather than a GT40 graphics terminal, he uses the teletype input routine to enter the connectivity, stereochemistry, atom types and coordinates (optional). At the completion of input the structure is analyzed by the perception module and a canonical representation is generated.⁷ All graph theoretical perception is done in this module. The 3-dimensional model builder, then using energy minimization techniques, creates a minimum energy conformation with the correct stereochemistry (Fig 3).⁸ Then the electronic model builder calculates the pi-electron delocalization energy for any conjugated systems, especially aromatic, using Hückel MO methods which are fast. From this information certain reactivity information is inferred. If the chemist has so selected, he then interacts with the strategy command handler to enter his strategic commands, bonds to break or not break, etc., features he will or won't accept in precursors, and cutoffs in quality of chemistry that is applied. When
the investigator has specified his desires in strategy, the chemistry module attempts to find chemical transforms relevant to the structural features in the target molecule and consistent with the specified strategies. Environmental and mechanistic requirements of the transforms are examined to determine transform applicability. Applicable transforms generate all stereoisomeric precursors consistent with the stereochemistry of the transform, and each precursor is evaluated by the evaluation module for valence violations, topologically unlikely bonding, duplications, and other undesirable features. Precursors are displayed as evaluated and are represented as nodes in the "synthesis tree". Horizontal lines in the tree join ensembles of molecules needed for a transform. The chemist then evaluates the precursors, chooses one as the next to be analyzed, and the process recurs. A switch causes the display of the synthetic sequence from a selected precursor up through the tree to the target at the top (Figure 4). The sequence display helps the chemist to evaluate the whole sequence since he sees the global view.

[Figure 1: SECS module organization. Devices: 3-D tablet, light pen, teletype, plotter, GT40. Modules: Graphics, Graph Perception, 3-D Model Builder & Perception, Executive, Evaluation, Symbolic Chemical Manipulation, Electronic Model Builder & Perception, Strategy.]

Requirement 1) Easy communication of program with organic chemist. Extensive graphical communication capabilities are built into the program. Drawing of the input molecule is natural both with a tablet (best) and with the light pen (ok, but not as natural). Stereochemistry follows all natural conventions. Even the input from a teletype has been human engineered to allow the minimum of typing, e.g., by describing connectivity as a path connected through the molecule, leaving only the branch points and ring closures to describe separately. The TTYINPUT module
assumes errors will occur and disallows any illegal descriptions, and provides editing capabilities for correcting errors. Output is also graphical on a teletype, but of course the plotting is less attractive. However, all chemically important information is there, including stereochemistry. Changes are still being made to the program to optimize the use of remote graphics terminals over medium speed lines. Perhaps the best evidence of progress in this area is that chemists at Squibb were able to use SECS on the FDC timesharing system with a GT40 terminal without any instructions other than the one page description on the system.

Requirement 2) Program must "understand" stereochemistry. We have developed an algorithm by which the computer can interpret a standard stereochemical structural diagram (Fig 2) or a 3D model and generate an identifier which is unique for each stereoisomer and allows easy recognition of enantiomeric structures, which in the synthesis of racemic materials are treated as being isomorphic. The algorithm ignores centers which because of symmetry are not true stereo centers. Relative stereochemical relationships (cis-trans) are derived by another algorithm operating on the individual stereo center descriptions. The chemical manipulation module, in addition to making and breaking bonds, also generates the correct stereochemical descriptor for the precursor based on the mechanism of the operating transform. Thus 1 implies by a trans addition mechanism that the precursor could have been 2, which rewritten is 3, whereas 4 implies by the same mechanism both 5 and 6. Similarly SECS produces all valid precursors in electrocyclic transforms (1). Note that this is not simply permutation of all possible double bond isomers.

[Scheme: trans addition examples relating structures 1-6, and the thermal electrocyclic transform (1).]

SECS also has an algorithm to generate proper hashed and wedged bonds to correctly represent the actual stereochemistry of the precursors; even if the chemist moves atoms or rotates the molecule, the hashing/wedging is changed to maintain an accurate representation.

Requirement 3) Must be able to build a three-dimensional model. SECS contains a model-building module which creates a reasonable model given only the standard two-dimensional structural diagram. It accomplishes this by special minimization techniques in an implementation of the Westheimer method using four levels of parameters. Starting from any geometry, two-dimensional (Z=0), or 3-D, or even random coordinates, the program reshapes the structure into a reasonable conformation, displaying the model and strain energy as it proceeds. Initially, the program emphasizes non-bonding effects, later bond lengths, then bond angles, and finally fine resolution non-bonding factors. Models for structures having up to 30 non-hydrogen atoms, double and triple bonds, and hetero-atoms may be built. During the building process SECS monitors the stereochemistry of the model and modifies the model to make it correspond to the stereochemistry initially specified.

Figure 5 shows a comparison of the model built for morphine (dotted line) and the X-ray structure of morphine methiodide salt (solid line). This model builder has been provided to Feldmann at NIH, is incorporated into the PROPHET pharmacology information system, and into the CONGEN-DENDRAL system at Stanford. Peter Jurs (Penn State) and Bruce Kowalski (U.
of Washington) also have been provided a copy of the model builder for use in pattern recognition studies. Squibb and Pfizer have used the model builder on the FDC system.

Requirement 4) Must be able to analyze 3-D model for steric effects. Certain ALCHEM statements cause the synthesis program to access its internal 3-D model much as a chemist would do, making distance and angle measurements for proximity and syn or anti relationships. Evaluation of steric hindrance is considerably more difficult since one must first define steric hindrance. Using collision theory, we have developed a definition of ground state steric congestion at a reaction center assuming the reacting partner is an infinitesimal particle. This function works well for rigid ketones where there is rather large congestion. However, for rather uncongested ketones, we had to include an electronic eclipsing effect based on the dihedral angle the incoming group makes with substituents attached to the alpha carbons of the ketone. When these two functions are combined they provide a quantitative treatment of steric hindrance not only for reduction of ketones, but also epoxidation of olefins. Since our functions give an absolute value for attack to each side of a planar group, we can also compare the least hindered side of the two identical groups in a target to determine the feasibility of carrying out a selective reaction on one of the groups in the presence of the other. Currently the only chemistry using this steric information is the reduction of ketones and Grignard reactions on ketones, but the correlation of congestion with product ratios in these cases is very good.

Requirement 5) Must have knowledge of reactions and be able to apply them when appropriate. Chemical transforms (a transform can be a simple ab initio electron pushing step or an empirical name reaction written in the antithetic direction) are written in an English-like chemical language, ALCHEM.
See appendix for the BNF grammar of ALCHEM. Basically ALCHEM has facilities for describing relationships between functional groups, structural features, atoms, and bonds, and even arbitrary substructural fragments. It also contains general arithmetic capability and means for symbolically manipulating structures to generate precursors. Since ALCHEM files are ASCII text and the compiler (SYNCOM) and interpreter (SECS) are in FORTRAN, we have a machine-independent language for describing reactions and the factors affecting them. Any reaction may be described in any amount of detail, with the only limit to the number of transforms being the amount of disk space available.

[Diagram: CHEMIST -> Text Editor -> ALCHEM file -> SYNCOM compiler -> binary chemical file -> SECS -> syntheses.]

Requirement 6 has already been discussed under item 2.

Requirement 7) Chemist should be able to easily add new reactions to reaction library. All that is needed to add a new reaction is to type it into the text editor in ALCHEM, adding it to one of the existing files or creating a new one, then run SYNCOM to take the ALC file and create the binary CHM file; then just run SECS and it will automatically use the new updated CHM file. Total time 5 min. Dr. Guenter Grethe entered many complex reactions without knowing any programming languages, only ALCHEM and how to operate the text editor, so it is possible for a chemist to easily enter reactions. It took him about a day of study of ALCHEM.

Requirement 8) Program should be applicable to real problems. SECS was designed to handle molecules of up to 72 non-hydrogen atoms, and to be able to generate any number of structures in the synthesis tree.
It is able to do this because the structures are not in memory, but are in our own software implemented virtual memory, which can be saved and restored in another session to allow interruptions. A version of SECS on the FDC timesharing system in Boston has received considerable testing by Squibb, Pfizer, and Merck, Sharp and Dohme pharmaceutical companies in the past year. The feedback from these users is that they were giving it real problems and were finding interesting output which they had not thought of. This is not to say that everything produced was good or that every good route was produced, but it does say some progress has been made on real problems.

Figure 2 shows the antileukemic cephalotaxine as a target with the edited tree. Figure 3 shows the model created and Figure 4 shows one of the routes formatted by the synthetic sequence layout. Below each structure is its sequence number. Above each arrow is the code name for the transform implied and below that is the final priority ranking of that transform. This route is different from an earlier attempted synthesis and recent successful syntheses.

Preliminary Results

Aromatic and heterocyclic chemistry. Our initial work in this area uses the powerful pattern transform capability of ALCHEM. We have about 100 heterocyclic transforms, which represent many times more reactions and fairly well cover the simple ring systems to a first approximation. Electrophilic substitution directing effects are included, also steric effects from groups already on the ring, but only for rings not containing heteroatoms. Recently we generalized this using the new electronic energy model with fairly good success. This research showed the difficulties with tautomerism (keto-enol) and with strategic control of when to perform the synthesis of the aromatic or heterocyclic ring system.

Functional Group Protection. Dr.
Willi Sieber created a module for checking the condition statements in the transform and automatically invoking protection, selecting the proper protecting group which would be stable to the conditions yet not react with other groups in the molecule. This protecting group is a text descriptor which is attached to the functional group for that step. Corey just published a similar approach.

Strategy Research. A general goal list structure with an interactive creation package now allows us to manually specify trial strategies involving not only the breaking and making of bonds, but also selection of transforms by character, e.g., rearrangement, ring closure, modify stereochemistry. This has increased selectivity and suggested ideas for more advanced specifications to control sequences of reactions. The strategy of striving for "simplicity" has been explored by Peter Friedland by letting the executive choose the next structure to be processed on the basis of a simplicity function. This function depends on the size of the molecule, number and size of rings, appendages, functional groups, and ring junctures. Stereo and group sensitivity were also incorporated. The user was allowed to set the depth of the search and the program processed the "simplest" structure in each set of precursors produced. For some syntheses this works well, but we find for many others that the computer does not find very interesting syntheses by itself because the interesting syntheses must go through a more complex intermediate structure.

Symmetry. Work is underway to detect the symmetry of a target molecule and use this to prevent redundant reactions due either to molecular symmetry or symmetry of the reaction. This is related to the use of symmetry in the CONGEN program, but is different because we are mapping reactions rather than just a set of labels, and also because we are including stereochemistry in
our molecules, and our symmetries in most cases correspond to point group symmetry elements. For example, for dodecahedrane we find 120 symmetry elements, the same as the point group for the molecule. We expect this symmetry information to be very useful not only in reducing the number of precursors created to a small number of unique precursors, but also in building strategic goals. We expect that we may also be able to use this symmetry to prevent the generation of enantiomers, which are not normally desired unless one is dealing with resolved materials. In some cases, symmetry information will reduce the execution time of SECS by a factor of five or more.

Conclusion

Many problems still remain. We are just beginning to learn how to use the powerful perceptual information we have collected, but we feel we have established a firm foundation on which to build further research in this area. See letters in Appendix 1.

4.d. Publications

W.T. Wipke and A. Whetstone, "Graphic Digitising in 3-D," Computer Graphics, 5 (4), 10 (1971).

W.T. Wipke, "A New Approach to Computer Assisted Design of Organic Syntheses," Proceedings of Northern Illinois University Conference on Computers in Chemical Education and Research, DeKalb, July 19-25, 1971, p. 10-60.

E.M. Engler, L. Chang and P.v.R. Schleyer, "The Flexibility and Conformations of Polycycloalkanes with Two-Carbon Bridges," Tetrahedron Letters, 2525 (1972).

T.M. Gorrie, E.M. Engler, R.C. Bingham, and P.v.R. Schleyer, "An Abnormally Weak Bond in a Diol with a 0° Dihedral Angle. Breakdown of the OH ... OH Spectral Shift-Dihedral Angle Relationship," Tetrahedron Letters, 3039 (1972).

W.T. Wipke and P. Gund, "Congestion: A Conformation-Dependent Measure of Steric Environment. Derivation and Application in Stereoselective Addition to Unsaturated Carbon," J. Amer. Chem. Soc., 96, 299 (1974).

W.T.
Wipke and T.M. Dyott, "Simulation and Evaluation of Chemical Synthesis. Computer Representation and Manipulation of Stereochemistry," J. Amer. Chem. Soc., 96, 4825 (1974).

W.T. Wipke and T.M. Dyott, "Stereochemically Unique Naming Algorithm," J. Amer. Chem. Soc., 96, 4834 (1974).

W.T. Wipke, "Computer-Assisted Three-Dimensional Synthetic Analysis," in Computer Representation and Manipulation of Chemical Information, ed. W.T. Wipke, S.R. Heller, R.J. Feldmann and E. Hyde, J. Wiley, (1974) pp 147-174.

P. Gund, W.T. Wipke, and R. Langridge, "Computer Searching of a Molecular Structure File for Pharmacophoric Patterns," Computers in Chemical Research and Education, Elsevier, Amsterdam, vol. II (1973) pp 5/33-38.

W.T. Wipke and T.M. Dyott, "Use of Ring Assemblies in a Ring Perception Algorithm," J. Chem. Info. and Computer Sci., 15, 140 (1975).

T.M. Gund, P.v.R. Schleyer, P.H. Gund and W.T. Wipke, "Computer Assisted Graph Theoretical Analysis of Complex Mechanistic Problems in Polycyclic Hydrocarbons. The Mechanism of Diamantane Formation from Various Pentacyclotetradecanes," J. Amer. Chem. Soc., 97, 743 (1975).

4.e. Staffing

W.T. Wipke, Ph.D.       Associate Professor   4/70-
P. Gund, Ph.D.          Postdoctoral          9/70-10/73
C. Still, Ph.D.         Postdoctoral          3/72-6/73
T. Brownscombe, Ph.D.   Postdoctoral          9/72-10/73
G. Smith, Ph.D.         Postdoctoral          2/74-
S. Krishnan, Ph.D.      Postdoctoral          12/75-
F. Choplin, Ph.D.       Visiting Fellow       9/73-10/74
H. Bruns, Ph.D.         Visiting Fellow       1/72-4/72
G. Grethe, Ph.D.        Visiting Fellow       9/72-8/73
M. Spann                Visiting Fellow       6/73-9/73
W. Sieber, Ph.D.        Visiting Fellow       1/74-12/74
H. Braun, Ph.D.         Visiting Fellow       1/75-
T.M. Dyott, B.S.        Graduate Student      4/70-8/73
D. Stevens, B.S.        Graduate Student      V/72-8/72
S. Stevens, B.S.        Graduate Student      3/72-8/72
G. Goeke, B.S.          Graduate Student      3/72-8/72
J. Mitlitzky, B.S.      Graduate Student      9/72-1/73
G. Ouchi, M.S.          Graduate Student      1/76-
C. Marikakis, B.S.      Graduate Student      V/72-8/72
T. Su, B.S.             Graduate Student      7/73-1/74
P. Friedland            Undergraduate         9/70-9/74
J. Verbalis             Undergraduate         4/70-6/71
J. Jackson              Undergraduate         9/71-6/73
A. Zelicoff             Undergraduate         6/73-6/75
T. Newman               Undergraduate         9/75-
D. Shapiro              Undergraduate         9/75-
T. Davis                Undergraduate         9/75-

B. SPECIFIC AIMS

Our objective is to increase the speed, efficiency and reasoning power of the Simulation and Evaluation of Chemical Synthesis program, capitalizing on the steric and stereochemical information from our previous work. Specific aims for this project period are listed below according to the module in which they fall.

1. Symmetry
   a) Develop an efficient molecular symmetry recognizer using stereochemistry
   b) Incorporate symmetry into transform applicator to eliminate generation of redundant precursors and enantiomers
   c) Investigate algorithms for detection of potential symmetry

2. Model Builder
   a) Generalize to more diverse types of bonding
   b) Increase speed and reduce calculation using heuristics for appendages, symmetry, etc.

3. Strategy and Planning
   a) Continue to explore and evaluate principle of separation of strategies from chemical transforms
   b) Develop strategy modules for steric, proximity, symmetry, and electronic factors
   c) Evaluate special strategies for heterocyclic and aromatic chemistry
   d) Build a graphical interface to the strategy executive module