Vol. 53, 1965          BIOCHEMISTRY: NIRENBERG ET AL.          1161

RNA CODEWORDS AND PROTEIN SYNTHESIS, VII. ON THE GENERAL NATURE OF THE RNA CODE

By M. Nirenberg, P. Leder, M. Bernfield, R. Brimacombe, J. Trupin,* F. Rottman,† and C. O'Neal

NATIONAL HEART INSTITUTE, NATIONAL INSTITUTES OF HEALTH, BETHESDA, MARYLAND

Communicated by Robert J. Huebner, March 26, 1965

Nucleotide sequences of RNA codons have been investigated recently by directing the binding of C¹⁴-AA-sRNA to ribosomes with trinucleotides of defined base sequence. The template activities of 19 trinucleotides have been described, and nucleotide sequences have been suggested for RNA codons corresponding to 10 amino acids.¹⁻⁵ In this report, the template activities of 26 additional trinucleotides are described and are related to the general nature of the RNA code.

Materials and Methods.--Components of reactions: E. coli W3100 ribosomes and sRNA were prepared by modifications of methods described previously. Each C¹⁴-aminoacyl-sRNA was prepared in the presence of 19 C¹²-amino acids. The assay for ribosomal-bound C¹⁴-AA-sRNA and the components of reaction mixtures have been described. The characteristics and amounts of labeled AA-sRNA preparations not described previously are shown in Table 1.

Synthesis and characterization of oligonucleotides: ApG and UpG were obtained from a T-1 ribonuclease digest of RNA, and ApA was prepared by chemical synthesis. ApC, ApU, CpA, CpG, GpA, and GpC were obtained from Gallard-Schlesinger Corp., but required extensive purification prior to use. GpGpU was obtained by digestion of poly UG with pancreatic RNase A, treatment with alkaline phosphatase to remove terminal phosphate groups from the degradation products, and isolation by procedures similar to those described for GpUpU.
The remaining trinucleotides were synthesized from the appropriate dinucleoside monophosphate, using either primer-requiring polynucleotide phosphorylase and a nucleoside 5'-pyrophosphate, or a derivative of bovine pancreatic ribonuclease with a nucleoside 2',3'-cyclic phosphate (Table 2). The products were isolated by paper chromatography and electrophoresis, as previously described, and the purity of each preparation was assessed by two-dimensional chromatography of an aliquot (2.0 A²⁶⁰ units) on Whatman no. 40 paper. The first dimension (solvent A) was n-propanol/ammonia/water, 55/10/35; the second dimension (solvent B) was 0.10 M sodium phosphate, pH 7.0, containing ammonium sulfate (0.4 gm/ml). Chain length and base composition (Table 2) were determined by digestion of 2.5 A²⁶⁰ units of each trinucleotide with T-2 ribonuclease and 2.5 A²⁶⁰ units with venom phosphodiesterase, as previously described.

TABLE 1
RADIOACTIVE AMINOACYL-sRNA PREPARATIONS*

(A²⁶⁰ units and µµmoles accepted refer to the C¹⁴- or H³-AA-sRNA added to each reaction.)

Radioactive       Specific radioactivity   A²⁶⁰     µµmoles of C¹⁴- or H³-   Origin of sRNA
amino acid†       (µc/µmole)               units    amino acid accepted      (E. coli strain‡)
Lys               240 4                    0.25     8.5                      W3100
  Expt. )         2                        0.28     10.2                     B
Ala               88                       33       8.5                      W3100
  Expt. 5         88                       0.75     32.6                     W3100
Glu               205                      0.56     14.9                     B
  Expt. 6         {88                      0.85     6.1                      W3100
Gly-H³            1130                     0.15     11.72                    A-23§
   -C¹⁴ Expt. 6   66                       0.47     9.2                      W3100
Pro               200                      0.70     6.4                      B
  Expt. 6         158                      0.57     13.2                     W3100
Ser               120                      1.038    6.3                      W3100
  Expt. 6         {20                      0.42     18.6                     B
Trypt-H³          3000                     1.00     6.5                      B

* Other AA-sRNA data have been described.⁵
† Amino acids stated were labeled with C¹⁴, with the exception of H³-tryptophan and H³-glycine.
‡ E. coli W3100 is a K12 strain. Aminoacyl-sRNA synthetase preparations were from E. coli W3100.
§ We thank Dr. Charles Yanofsky for this E. coli strain and Dr. Ray Byrne for the H³-Gly-sRNA. A-23 sRNA and 100,000 × g supernatant fractions were used for the preparation of H³-Gly-sRNA.

1162          BIOCHEMISTRY: NIRENBERG ET AL.          Proc. N. A. S.

TABLE 2
CHARACTERIZATION OF TRINUCLEOTIDES

                        Digestion (T-2 ribonuclease)     Digestion (venom phosphodiesterase)
Compound   Synthetic    Products     Base ratio          Products     Base ratio
           methodᵃ
ApCpA      1            Ap,Cp,A      1.00/0.95/1.05      A,pC,pA      1.05/0.96/1.00
ApCpC      1            Ap,Cp,C      1.05/1.00/0.95      A,pC         1.15/2.00
ApCpG?     1            Ap,Cp,G      1.00/0.95/1.05      A,pC,pG      1.00/0.95/1.00
ApCpU      1            Ap,Cp,U      1.00/1.00/0.85      A,pC,pUe     1.00/0.90/1.10
ApGpA      1            Ap,Gp,A      1.00/1.00/0.95      A,pG,pA      1.00/1.06/1.00
ApGpC      1            Ap,Gp,C      1.05/1.00/0.95      A,pG,pCe     1.00/1.00/1.15
ApGpU?     1            Ap,Gp,Us     1.10/1.09/0.95      A,pG,pU<     1.00/1.00/1.00
ApUpG      1            Ap,Up,G      1.00/0.85/1.05      A,pU,pG?     1.00/0.96/1.10
CpApA      2            Cp,Ap,A      0.95/1.05/1.00      C,pA         1.05/2.00
CpApC      2            Cp,Ap,C      1.00/1.10/0.95      C,pA,pC      1.00/1.06/1.00
CpApG!     2            Cp,Ap,G      1.10/1.00/1.00      C,pA,pG      1.00/1.06/1.15
CpApU      2            Cp,Ap,U      1.00/1.10/0.90      C,pA,pU      1.00/1.05/0.95
CpCpA      2e           Cp,A         2.00/0.95           ...          ...
CpGpAt     2            Cp,Gp,A      1.10/1.00/0.95      C,pG,pA      1.05/0.95/1.00
CpGpC      2            Cp,Gp,C      1.05/1.00/0.80      C,pG,pC?     0.90/1.00/1.00
CpUpG      2            Cp,Up,G      0.95/1.00/1.10      C,pU,pG      0.95/1.00/1.10
GpApA?     1            Gp,Ap,A¢     0.90/1.00/1.10      G,pAé        1.10/2.00
GpApCre    1            Gp,Ap,C      1.05/1.06/0.90      G,pA,pC      1.05/1.00/0.90
GpApU      1            Gp,Ap,U      1.00/1.00/0.95      G,pA,pU¢     1.20/1.00/0.90
GpCpU      1            Gp,Cp,U      1.05/1.00/0.90      G,pC,pU      0.85/1.00/1.00
GpGpUe     3            Gp,U’        2.00/0.95           G,pG,pUS     0.85/1.00/1.05
UpApA      2            Up,Ap,A¢     1.00/1.00/1.16      U,pA         0.95/2.00
UpApG      2            Up,Ap,G      1.00/0.95/1.00      U,pA,pGe     1.00/1.15/0.90
UpCpG      2            Up,Cp,G      1.00/0.90/1.00      U,pC,pG      0.95/1.00/1.05
UpGpA      2            Up,Gp,A      0.95/1.00/1.10      U,pG,pA      0.90/1.00/1.15
UpGpC*     2            Up,Gp,C      1.00/0.95/1.05      U,pG,pCe     1.10/1.00/0.90

ᵃ Method 1: primer-requiring polynucleotide phosphorylase. Method 2: derivative of bovine pancreatic ribonuclease. Method 3: isolation from a ribonuclease digest of poly UG.
ᵇ Trinucleotide contained a small amount (ca. 2%) of an unidentified contaminant. In every case, the chromatographic characteristics of the impurity in solvent systems A and B precluded the possibility of its being an oligonucleotide of chain length greater than two base residues.
ᶜ The digestion mixture contained another component, 3-5% of the total nucleotide content; the electrophoretic mobility and UV spectrum of this component corresponded in each case to unreacted or partially digested oligonucleotide.
ᵈ As "c" above, except that here the contaminant was only 1-3% of the total nucleotidic material.
ᵉ Trinucleotide was synthesized directly from adenosine, by addition of two cytidylic acid moieties.
ᶠ Trinucleotide contained 5-10% of a contaminant which did not migrate in solvent A, but which ran with GpGpU in solvent B. The T-2 digestion mixture contained 8%, and the venom digestion mixture 14%, of undigested material; this was possibly due to aggregation of the trinucleotide.
ᵍ Trinucleotide contained 10% of pC; the base ratio of the venom digestion was adjusted accordingly.

Results and Discussion.--In Table 3 are shown the effects of 26 trinucleotides upon the binding to E. coli ribosomes of 19 C¹⁴-AA-sRNA preparations, each acylated with a different C¹⁴-amino acid (C¹⁴-Cys-sRNA not used). In addition, near the bottom of the table are shown the effects of 18 trinucleotides previously described¹⁻⁵ upon the corresponding C¹⁴-AA-sRNA (C¹⁴-Cys-sRNA and UpGpU omitted). Many of these trinucleotides have not been isolated or synthesized previously.

Several factors should be mentioned which may be useful in assessing the data. (a) It is often difficult to compare directly the response of one C¹⁴-AA-sRNA preparation to a template with that of another, for Kaji and Kaji have shown that both deacylated and acylated sRNA bind to ribosomes in response to polynucleotide templates. The extent of acylation of each C¹⁴-AA-sRNA preparation must be considered (see Materials and Methods), as well as the relative response of each C¹⁴-AA-sRNA to other trinucleotides. (b) A trinucleotide which stimulates the binding to ribosomes of one C¹⁴-AA-sRNA generally decreases binding of other C¹⁴-AA-sRNA preparations.
(c) Background binding of C¹⁴-AA-sRNA to ribosomes appears to be a function of the sRNA species, the amount of sRNA added to a reaction, the proportion of sRNA acylated with a C¹⁴-amino acid, and possibly the amount of template RNA on the ribosomes or in the sRNA preparations. (d) Reactions contained limiting concentrations of ribosomes (as determined with ApApA, UpUpU, UpUpC, or GpUpU) and therefore were saturated with respect to these trinucleotides and C¹⁴-AA-sRNA.

Most trinucleotides markedly stimulated the binding to ribosomes of only one C¹⁴-AA-sRNA preparation; however, a number of trinucleotides displayed lower template specificity for C¹⁴-AA-sRNA. For example, ApCpU, ApCpC, and ApCpG stimulated C¹⁴-Thr-sRNA binding to ribosomes, but did not significantly stimulate the binding of 18 other C¹⁴-AA-sRNA preparations. ApCpA also stimulated C¹⁴-Thr-sRNA binding, and in addition stimulated C¹⁴-Lys-sRNA binding. However, the template activity of ApCpA for C¹⁴-Lys-sRNA was only 10 per cent that of ApApA. The disparity between the template activities of ApCpA and ApApA was even more apparent in reactions containing limiting concentrations of trinucleotides (data not shown). Such considerations suggest that the sequences ApCpG, ApCpU, ApCpC, and ApCpA correspond to threonine codons. It is possible that the template specificity of one synonym codon may differ from that of another; however, other alternatives, such as the possibility that C¹⁴-Lys-sRNA may respond to an impurity in the ApCpA preparation which we have been unable to detect, also must be considered.

The data of Table 3 indicate that the sequence GpCpU corresponds to an RNA codon for alanine; CpCpA, CpCpU, and CpCpC correspond to proline (the template activity of CpCpA for C¹⁴-Pro-sRNA was higher than that of pCpCpC; cf. ref. 4); UpCpG, UpCpU, and UpCpC correspond to serine (cf. ref.
4); GpApU and GpApC, to aspartic acid; GpApA, to glutamic acid; CpApU and CpApC, to histidine; CpApA and CpApG, to glutamine; CpGpC and CpGpA, to arginine; ApUpG, to methionine; and CpUpG and UpUpG, to leucine (CpUpU and CpUpC possibly serve as internal but not terminal Leu-codons).

It seems clear that GpCpU serves as a codon for alanine, for this trinucleotide stimulated only the binding of C¹⁴-alanine-sRNA to ribosomes. This sequence is also in accord with predictions based upon amino acid replacement data. However, the weaker response of C¹⁴-Ala-sRNA to ApGpC, CpGpC, and UpGpC suggests that recognition of 2 out of 3 bases, the GpC portion only of the latter trinucleotides, may permit C¹⁴-Ala-sRNA binding. Similarly, C¹⁴-Glu-sRNA responds best to GpApA, but also responds to a weaker extent to trinucleotides containing GpA, such as ApGpA, CpGpA, and UpGpA. C¹⁴-Lys-sRNA responds best to ApApA but also recognizes ApApG, GpApA, ApCpA, CpApA, UpApA, and CpCpA. Additional examples in Table 3 are readily apparent. These data indicate that one trinucleotide sometimes can direct the attachment of a limited group of C¹⁴-AA-sRNA species to ribosomes. It is possible that correct recognition of 2 out of 3 bases in a trinucleotide, in or out of phase, or 2 bases in 1 trinucleotide and 1 base in an adjacent trinucleotide, often may suffice during protein synthesis. This striking phenomenon is often observed with trinucleotides containing 2 or 3 purines. Since the stability of codon-ribosome-AA-sRNA complexes may partially depend upon interactions between bases in codons and sRNA, the affinity of sRNA for a ribosome may be greater when each base in a codon is recognized correctly and in proper phase than when codon recognition is only partially correct.

The activity of both ApGpU and ApGpC in stimulating binding of C¹⁴-Ser-sRNA may indicate that these sequences correspond to serine codons (in addition to UpCpU, UpCpC, and UpCpG). However, these assignments should be considered tentative. Although ApGpU and ApGpC have been proposed as asparagine codon sequences, the data of Table 3 show that they do not significantly affect the binding of C¹⁴-Asp-NH₂-sRNA under the conditions employed. We have previously reported that ApApU and ApApC stimulate binding of Asp-NH₂-sRNA with high specificity.

ApGpA slightly stimulated the binding of both C¹⁴-Arg- and C¹⁴-Glu-sRNA. The sequence ApGpA was predicted for arginine on the basis of codon sequence data obtained earlier and amino acid replacement data reported by Yanofsky and co-workers.¹⁴ The recognition of one triplet by sRNA corresponding to several amino acids again suggests partial codon recognition. Such observations should be considered in terms of in vivo studies, particularly those related to extragenic suppression.

TABLE 3
TEMPLATE SPECIFICITY OF TRINUCLEOTIDES FOR C¹⁴- OR H³-AMINOACYL-sRNA

Δ µµMoles of C¹⁴- or H³-Aminoacyl-sRNA Bound to Ribosomes Due to Addition of Trinucleotides*

Trinucleotide  C¹⁴-Ala  C¹⁴-Arg  C¹⁴-Asp  C¹⁴-Asp-NH₂  C¹⁴-Glu  C¹⁴-Glu-NH₂  C¹⁴- or H³-Gly  C¹⁴-His  C¹⁴-Ileu
ApCpU -0 16 -0.04 -0.06 -0.03 -602 -003 -0 27 -0.02 0
ApCpC -0.15 -0.15 0.08 0.05 -0.02 -0.29 -0 34 0.03 0.02
ApCpA -0.04 -0.01 0.05 0.05 0.03 -003 -0.78 0 0
ApCpG -0.15 -0.37 0.02 -0.03 -0 04 003) -0. 18% 0.03 0.02
GpCpU 0.73 -0.18 -0.08 -0.08 0.01 -0.13 -o0.13% -9 02 0
CpCpA -0.15 0.06 -0.03 6 0 0.04 0.04 0.01 0.01
UpCpG -0.20 0 0 0.07 0.03 -0.23 -0.44 0.01 0.02
GpApU -0.01% -O.14 1.29 0.33+ 005 -0.02 -0.23) -9 93 0.0)
GpApC -0.05 -9 29 1.32 a.19t gt -0 10 -6 276 0.02 -0.03
GpApA -0 07) -on) 0.01 6 062 -0.32 -0.68 -0.03 -0 6)
CpApU 0.01 -0.31 -0.0: -0.03 -®,02 0 -0.13° 0 52 ¢
CpApC -0.02% -025 _6.01 © 04 oe -0.14 -0.08> 0.26 -06.02
CpApA 0.02% 0.01 -0.06 -0.07 -@02 2.05 -0.93> 0.02 -6.04
CpApG -0.13 -6.01 0.02 -0 03 G05 2.60 -0.12 -0.03 0
UpApA -0.022 -041 -0.01 -60.07 0.02 -0.30 -0.84° -0.08 0
UpApG -0.07 0.06 0.01 0.12 0.04 0 -0.01 -6.0} 0.02
ApGpU - 0.07% © 03 0 0.04 0 -0.03 -0.06¢ 0.03 -0.02
ApGpC 0.43 -0.03 6.03 010 -0.02 6 -0.13¢ 0.03 -6.03
ApGpA -0.06 0.10 0.01 0 0.19 0 -0.31 0 - 0.08
GpGpU -0 022 -0.33 0.01 -0.02 0.04 -0.2) 3.04 -0.03 -0.01
CpGpC o.14 1.63 0.02 -0.13 -0.01 -0.07 -0.62 -003 -0 02
CpGpA -C.20 142 -601 -6.0; 0.07 -0.05 0.08 -0.03 0
UpGpC 0.28 -0.18 0.05 0.12 6.04 -005 -0.55 -002 -0 02
UpGpA -012 -0.12 0.07 014 0.10 -0.13 -0.30 -0.03 0.02
ApUpG -0.01% -0.12 -0 06 0.02 -001 -0.14 -0.17% -09.02 0
CpUpG -0.14 -0.11 -0.01 0.04 0.01 0.1] -1.20 6.03 6.03
Minus trinucleotide 0.50 1 16 0.21 0.21 0.12 1.65 2.89 0.25 0.08 (uzmoles)* 0. 20 -.- .. . Q.34° ee 1.10% Le .
Trinucieotides pre- a wee wee 1.19 tee a ee wee 0.72 viously Lee wee . ApApU . a a an ApUpU described Le a . 1.50 . . - oa 0.59 {4 pumolea) Loe sae -. ApApC tee . a Lee ApUpc

C¹⁴-Leu  C¹⁴-Lys  C¹⁴-Met  C¹⁴-Phe  C¹⁴-Pro  C¹⁴-Ser  C¹⁴-Thr  H³-Trypt  C¹⁴-Tyr  C¹⁴-Val
-0.12 0.01 0.03 0.02 0.02 -0.05 0.63 -90.01 0.03 0 -90.09 -0.03 0 0 -0.01 0.03 0.50 -0.03 0.03 0 -0.09 0.17 -0.04 9.01 0.05% -0.07 0.45 0.03 0.01 0.01 -0.22 0 0.02 0.07 -0.01 -0.15 0.78 -0.03 0.03 0 0.25 -0.10 -0.18 -0.04 -0.02 -0.18 0.08 0 0.02 ° 0.02 0.11 0.03 0.03 0.40 -0.08 -0.02 0.03 0.03 0 0 0.04 -0.11 0.03 0.06° 1.09 -0.04 0.02 0.05 0.03 -0.10 -0.12 -0.09 -0.29 -0.03 0.01% 0 04 0.04 a 0 -9.10 -0.10 -0.10 -0.24 -0.01 0.01% 0.01 0 02 0.03 0.03 -0.03 0.73 0 -0.03 -0.07° -0.23 -0.08 -0.04 0 -0 04 -0.03 -0.07 -0.04 -0.15 -0.01 0.02 -9.07 -0.01 0.03 0.02 0.01 -0.13 -0.01 -0.13 0.01 -0.03¢ 0.05 -0.03 0 04 0.03 0.03 0 10 -0 12 -0.19 -0.01 -0.02% -90.04 0 -90.0t -0 01 0.03 -0 03 -0.02 0.04 -0.01 -0.09 -0.09 - 0.03 0.09 0.02 -0.05 0.10 -0.09 -0.38 0.02 0 02 -0.03 0 0 OL 0.03 -0.09 -0.05 -0.02 -0.17 0 0 -0.05 0.03 0.03 0 -0.16 -0.18 -0.05 -0.22 -0.04 0.27 -0.02 -0 04 -0 03 0.08 -O.11 -0.12 -0.05 -0.20 -0.01 017 . -0.05 0.01 0.02 -0.09 -0.01 -0.06 -0.13 0.01 0.03 -0.05 -0.05 0 02 0.02 -0.27 -0.10 -0.08 -0.34 -0.07 -0.07 -O.11 -0.04 0.93 0.03 0 03 0.168 0.01 -0.03 -9.05% 0.12 0 -0.02 0 -0.01 -0.14 -0.11 -0.17 -0.04 -0.12 -0.22 -0.06 0 0.03 -0.02 -0.07 0 -0.18 0.04 0.02% 0.02 a 0 04 0.03 0.02 -0.04 -0.11 0 -0.27 0 0 0.08 0.03 0.03 -0.05 -0.38 -0.06 1.00 -0.04 0 -0.08° 0 Ot -0.06 0.03 -0.10 0.30 0.03 0.10 0.05 0.03 0.03 -0.06 -0.03 0.04 0.05
0.79 0:77 0.40 0.44 0.14 0.58 0.30 0.30 0.12 0.22 _- Lee _- nn 0. 40 0.24° _- see ee Lee
0.23 1.77 Lee 1.05 0.28 1.27 --- Lee 0.31 1.84 UpUpG ApApA Lae UpUpU pCpCpC UpCpu Lee Lee TpApU GpUpU 0.04 1.00 a 171 0.08 034 Lo. Lee 0.56 Lee Cet ApApG wae UpUpc Coeee UpCpc cee Lee UpApc CpUpc i ) ay CpCpt - “ - “

The specificity of trinucleotides in directing the binding of C¹⁴- or H³-aminoacyl-sRNA to ribosomes. Reproducible stimulations of AA-sRNA binding due to the addition of trinucleotides are bold face. For comparison, the template activities of 18 trinucleotides previously described¹⁻⁵ are shown at the bottom of the table. Reactions contained the components described under Materials and Methods, the amount of C¹⁴-AA-sRNA stated previously or in Table 1, and 0.150 A²⁶⁰ units of trinucleotide, as specified, in a final volume of 50 µl. C¹⁴-Asp-NH₂ was assayed in 100-µl reactions; amounts of all components were doubled.
* Background binding of C¹⁴-aminoacyl-sRNA to ribosomes in the absence of trinucleotides is expressed in µµmoles (shown near the bottom of the table). All other values (Δ µµmoles) were obtained by subtracting background binding of C¹⁴-aminoacyl-sRNA from binding obtained upon addition of a trinucleotide preparation.
† sRNA may contain some C¹⁴-Asp-sRNA.

Possible Base Sequences of Nonsense Codons.--Since nonsense codons may perform special functions in protein synthesis, we have been particularly interested in a small group of trinucleotides, UpApA, UpApG, UpGpA, CpUpU, CpUpC, and ApGpA, which either have little template activity or have slight activity for two or more C¹⁴-AA-sRNA. The possibility that CpUpU or CpUpC may serve as internal, but not as terminal, codons for leucine has been discussed previously.

The striking results of Sarabhai, Stretton, Brenner, and Bolle indicate that certain codons in "amber" mutants of T4 phage may correspond to an amino acid in certain strains of E. coli but may specify the terminus of a protein in other E. coli strains. Further analysis of mutant phages and amino acid replacement data have led Brenner and co-workers to suggest that UpApG or UpApA may specify the end of a polypeptide chain in some E. coli strains and serine in an additional strain which contains a suppressor gene. Weigert and Garen also have found that mutations which lead to the formation of nonsense codons in the alkaline phosphatase gene can be related to amino acid substitutions at sites corresponding to nonsense codons only if the base composition of the nonsense codon is (UAG). This codon also corresponds to serine in a strain containing an appropriate suppressor gene. We find that the sequences UpApG and UpApA have almost no template activity for C'