[Myers, Gerald. Tracking human immunodeficiency viruses] [This seminar presentation was originally videotaped for internal information purposes only and is made available now because of expressed interest in the subject material] [National Library of Medicine Biotechnology Seminar Series: An Update] [Tracking Human Immunodeficiency Viruses, Dr. Gerald Myers, 30 May, 1989] Dr. Gerald Myers: I'm very pleased to be here, for the occasion to play with one of these and to present some of my work and also for the opportunity of being a visiting scientist for this year, regretfully soon over. [Dr. Myers stands at podium.] I wish to especially thank Dennis Benson and David Lipman and the staff of the new Center for Biotechnology Information for their hospitality and stimulating conversations. If there's been a downside this year, it's been the tree pollens. On the other hand, the upside, for me, has been the crab cakes and cornbread from downstairs. [Laughter] Let's see now, I want to dim the lights in the house and get the first slide. You don't need to be able to read French to appreciate this cartoon, although it does help to know that "SIDA" is the French moniker for AIDS. To the extent that this cartoon humors us, we laugh at the human capacity for mechanical thought and habituated feeling seen here, and the relative indifference toward mortality resulting from traffic accidents. [Black and white cartoon strip comparing numbers of AIDS deaths to numbers of traffic deaths.] There is, of course, a bit of irony about the situation. The one hundred deaths due to traffic accidents don't have identical meaning to the one hundred deaths due to AIDS. For instance, we are now told that, in the year 1991, deaths due to AIDS in the United States will outnumber those on the highway. Thus, the comic irony of this cartoon is evanescent, depending, as it does, on the extraordinary abruptness with which AIDS came into the world. 
AIDS, as a historical phenomenon, came forward in the summer of 1981. By 1983, several researchers had identified and isolated the etiological agent, the retrovirus now known as HIV. And by 1985, the nucleotide sequences for these early isolates of HIV became known. That molecular information came just in time to address a flurry of speculations and allegations concerning the sudden origin of the AIDS virus. For instance, that it was the product of recombinant DNA technology. One interesting proposal for the origin of HIV was that it arose from recombination of already existent genetic material. In 1986, two nonvirulent strains of herpes simplex virus were shown to generate lethal recombinants in vivo. This touched off some speculation that the AIDS virus may have been similarly generated in nature or in a test tube. Given the rampant worldwide speculation about possible accidental or deliberate circumstances behind the sudden appearance of the virus, we at GenBank, and the National Library of Genetic Sequences, felt obliged to try to reconstruct the ancestry and recent evolution of HIV. Of the many ways to inquire into the evolutionary origins of an organism or pathogen, a comparative study of the genetic sequences is the most informative, since the genetic sequences constituting the genome are unique for each species, and, in some cases, for each individual in a species. The HIV genome, which consists of a single molecule, is approximately ninety-seven-hundred nucleotides in length, a length more than adequate for precise identification of the virus. Thus, in 1985, we set out to classify the newly determined nucleotide sequences of HIV by performing a computer search of all known genetic sequences that were stored in the GenBank and EMBL databases. 
The sequences of HTLV-III and LAV, as they were called then, were compared to approximately six million bases of mammalian, viral, plant, invertebrate, and so forth, sequences, with the result that no similarity could be found. The search for close relatives was repeated in 1986, when there were nine million bases in the international libraries, and again in 1988 when there were over 15 million bases. Our research has shown that if HIV arose in recent time as a recombinant of two or more existing viruses, naturally or through human agency, it came from viruses that have not been characterized, that is, from genetic sequences that are not in the international databases. And a growing body of circumstantial evidence, much of which I'll present today, argues that HIV has not recently arisen as a recombinant of any viruses, known or unknown. A second and more viable notion about the cause of AIDS was that it arose from a monkey virus that recently crossed species lines. Unfortunately, this possibility was brought forward in terms that were inexcusably offensive to the Africans, on the one hand, and not that helpful to AIDS researchers, on the other hand. Much of this talk will be concerned with what I believe to be a more informed hypothesis for the simian origin of AIDS. That will entail tracking HIV into the past. The second part of this talk will describe efforts to predict the path of the virus into the future. As I have previously stated, of the many ways to inquire into the epidemiologic progress and pathogenic effects of a disease, the comparative study of genetic sequences should not be underplayed, especially when the pathogen appears to have made a recent ecological and evolutionary breakthrough. [Image of black-outlined map of the United States and the Caribbean, with black dots on Haiti and several US cities.] But molecular data and computer analyses have their limitations.
Sometimes it helps, at the risk of seeming old-fashioned, to start out with an ordinary map, such as this plain old Macintosh map of the United States. AIDS as a phenomenon came forward in the United States in the summer of 1981. From retrospective examination of sera, collected in the late seventies, in association with hepatitis-B studies in San Francisco, Los Angeles, and New York City, it appears that HIV-1 entered the country quietly sometime in the mid-1970s. For example, stored sera from nearly 1,000 patients in Los Angeles with acute and chronic hepatitis have been examined for antibodies, and the earliest detection of HIV is from a 1979 sample. There is the intriguing case of a teenage boy in St. Louis, let's see if I can reach St. Louis, who presented AIDS-like symptoms as early as 1968, and whose tissues, stored in 1969, have tested positive. But virus has not been isolated from those. For reasons that will become clear later on in this talk, I wish to emphasize that the preponderance of evidence from the three major cities points to the late seventies as the time of substantial introduction of the virus into the country. In the Caribbean, sera surveys pertaining to dengue fever have also been reviewed. The earliest evidence of HIV in Haiti comes from 1979 samples. To our astonishment, the same story is encountered in the other large center of the pandemic, Africa. [Image of black-outlined map of Africa, with black dots on several cities and a black east-west line representing a travel route.] The sera epidemiological data is limited, for sure, but let me present fragments of it for those of you who have not already wrestled with the puzzle. Again, there is strong evidence for an exceptionally early infection from a single 1959 sample taken here in Kinshasa. However, sera sampled in 1970, from 805 healthy mothers in the same city, revealed just two positives, while the 1980 sampling of a smaller but similar group revealed fifteen positives. 
Sera from 659 persons collected during a 1976 investigation of Ebola fever in rural Zaire showed five seropositives. That degree of prevalence, five of 659, may seem high for a rural population; however, the one confirmed infected individual of the group was known to have recently lived in Kinshasa, where she probably worked as a prostitute. The virus isolated from that stored 1976 sample is our oldest molecular specimen of an immunodeficiency virus. From nearby Congo--I'm going to let you find your way around the map, now. I think you can do it as well as I can. From nearby Congo, sera from 340 Pygmies were collected between 1975 and 1978. Being hunters and nomads, these people are in direct contact with monkeys. They hunt and eat monkeys. None of the sera of these Pygmies could be considered positive. In South Africa, between 1970 and 1974, thirty-five-hundred samples were taken from the black miner population as part of a pneumococcal vaccine study. At most, two of these showed HIV positive, and neither of those was confirmed. By 1986, however, nearly four percent of the sub-population of mine workers from Malawi were seropositive. To the north, in urban Somalia, there is no evidence of HIV prior to 1983, and a rather extensive study of prostitutes and patients at STD clinics in Nairobi argues low prevalence in 1980, rising to ten to 60 percent, depending upon the group, by 1985. The number of documented AIDS cases in Rwanda went from ten, in 1983, to 705 by the end of 1986. The seroprevalence in urban Rwanda had reached 18 percent by the end of December 1986. These studies point to a recent spread of HIV by travelers and prostitutes in Central Africa that was undoubtedly facilitated by this road running from Kinshasa, in Zaire, across to Mombasa, in Kenya. How did the virus get onto this road? There are many ways to address that question, there are many possible answers.
However, until the puzzling complexity about the AIDS epidemic in Africa is resolved, no secure conclusion can be reached. There are, as you know, two major foci of AIDS in Africa, HIV-1 disease in Central and South Africa, and HIV-2 disease in West Africa. I am unaware of extensive retrospective analysis pertaining to the second type of HIV, and it would be quite interesting to know how closely the onset of HIV-2 disease, in West Africa, coincided in time with what I have summarized for HIV-1. Through review of serological samples from a study of lassa fever, Patricia Fultz and coworkers concluded that the prevalence of HIV-2 in Guinea-Bissau in 1980 was no higher than one percent. By 1986, however, sera prevalence in Guinea-Bissau had reached six percent. There was no evidence for HIV-1 infection there in 1980. Thus the two epidemics caused by two viruses may have been separated by years, but hardly by decades. If HIV-2 had been endemic in West Africa for a very long time, why has that type virus only recently moved into Paris and New Jersey? Travel from West Africa to European cities is certainly as frequent as travel from other parts of Africa. Without evidence to the contrary, I'm going to assume, for the sake of this talk, that HIV-1 disease and HIV-2 disease came forward at about the same time. This curious situation can be summarized with the help of this map published by Essex in [?] in 1986. The earliest HIV-2 infection appears to have been confined to the countries of the West Coast, while the HIV-1 outbreak occurred in Central and South Africa, shown here in blue. [Map of Africa with some countries highlighted in dark blue, light blue, and red.] The map is already out of date. HIV-1 infection has spread to West Africa. In fact, one case of dual infection by the two types of immunodeficiency viruses has been documented there. 
A leading hypothesis for the origin of the second type of HIV is that it is a descendant of viruses expected to reside and be found in the mangabey population that inhabits West Africa. And because 30 to 50 percent of African green monkeys, found throughout most of sub-Saharan Africa, have antibodies reactive to simian immunodeficiency virus, but are asymptomatic, that monkey species is suspected to recently have given rise to HIV-1-caused AIDS. How plausible are these hypotheses taken together? They require that two events of transmission across species lines have occurred at approximately the same time in history. Peter Duesberg's current challenge of HIV as the cause of AIDS brings up precisely this issue, and I'm going to quote from his paper, "It is highly improbable that, within the past few years, two viruses, HIV-1 and HIV-2, that are only 40 percent sequence-related would have evolved that could both cause the newly found syndrome we call AIDS." "Since viruses, like cells, are the products of gradual evolution, the proposition that, within a very short evolutionary time, two different viruses capable of causing AIDS would have evolved or crossed over from another species is highly improbable." That's the end of quote. Is that in focus, you people in the back? We're not going to be going into detail, but... To explore this improbability, I want to turn to another kind of map, a phylogenetic tree constructed, with the help of a computer, out of homologous nucleotide sequences. [Image of phylogenetic tree representing HIV data.] This particular tree has been built from the preliminary gene sequences of the viruses in question. These trees read like family trees, with time flowing from left to right. So this would be in the past and now we'd be moving up into the present, at the time when these viruses, seen at the tips of the branches, were isolated. Here are the HIV-1s, with which we first became familiar. Here are the HIV-2s, only one of many samples.
Here, their close relatives, simian viruses isolated from a macaque and from a mangabey. Here are some sequences from African green monkeys taken from a colony in Kenya. And finally, for comparison to distant sequences, here's the newly reported preliminary sequence for the feline immunodeficiency virus, here. And for Visna virus, an unrelated [?] virus that's been known for quite a while. As you can see from this tree, the immunodeficiency viruses are at some distance from their nearest relatives, the ungulate and feline lentiviruses. And, in fact, Russell Doolittle and his colleagues at the University of California at San Diego have estimated this distance to be equivalent to the distance between a fungus and a human. That distance, however, may have evolved over a reasonably short time, that's a matter of a thousand years or so, simply because the mutation rate of these viruses is so high. Many of the simian virus sequences in this tree, and also the feline virus sequence, were determined by Vanessa Hirsch, Phil Johnson, and Bob Olmsted at the National Institute of Allergy and Infectious Diseases. The computer algorithm that constructs this tree provides the branching order of the tree based on sequences, which I'm sure you know, and also the branch lengths, and here the branch lengths are reported as fractions of the total number of sites that are under consideration. So the HIV-2 sequence shown here, if we added up these branch lengths, and took the distance to the HIV-1s, would turn out to be about 50 percent, simply by adding those numbers. Very briefly, since I'm going to be talking about quite a few of these trees and I want you to know how they are constructed, the tree analysis begins with an alignment of homologous nucleotide sequences, such as shown in this slide.
By restricting our attention to just the single base changes that you see quite a few of in this region here, we felt that we could be confident about unambiguous alignments, so we've ignored any kind of insertion or deletion. The computer program then takes this data as input data and infers a minimal evolutionary path for the set of sequences. By performing such analyses for the several major genes of the viruses, we can corroborate our results. That is, in the absence of genetic recombination, we expect a phylogenetic tree for one HIV gene to be congruent with the tree for another gene. The branch lengths should differ, they should contract and dilate, because different selective forces are at work, but the overall topology, or branching order, should be invariant. The robustness, or stability, of the trees that are generated can be checked in several ways: by adding or subtracting sequences, by comparing third base changes to second base changes, by running different computer algorithms and so forth. This analysis does not provide explicit information about the time over which the evolutionary process has occurred, rather it provides only the branching order and the mutational distances. One way to estimate the time is to assume a representative mutation rate. Because we didn't know what to expect with the immunodeficiency viruses, we chose, instead, to calibrate the tree, taking advantage of the ten-year timespan between a 1976 isolate -- this was the virus that I mentioned was taken from the Ebola fever study -- and compare that with a 1986 isolate down here. Most of these viruses were isolated between 1983 and 1985. So we had a ten-year timespan over which we thought we could calibrate the tree. This calibration assumes a constant and uniform mutation rate for all of the genetic sequences involved, or the genomes involved, and that's probably not the case. Nevertheless, our purpose was to establish a minimum time for the divergence of the HIV-1s and HIV-2s. 
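[Editorial illustration, not part of the seminar: the two steps just described, counting only single base changes in an alignment while ignoring insertions and deletions, and calibrating a rate from two isolates sampled ten years apart, can be sketched roughly as follows. The function names, the toy sequences, and every number in the example are hypothetical placeholders rather than data from the talk, and a constant mutation rate is assumed, which the speaker notes is probably not the case. The actual tree program infers a minimal evolutionary path and is not reproduced here.]

```python
def substitution_distance(seq_a, seq_b, gap="-"):
    """Fraction of gap-free aligned sites at which two sequences differ.

    Columns containing a gap (an insertion or deletion) are skipped,
    mirroring the restriction to single base changes.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    differences = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == gap or b == gap:
            continue  # ignore insertions and deletions
        compared += 1
        if a != b:
            differences += 1
    if compared == 0:
        raise ValueError("no gap-free sites to compare")
    return differences / compared


def calibrate_rate(extra_distance, years_between_isolates):
    """Substitutions per site per year, assuming a constant rate.

    extra_distance is the additional branch length accumulated by the
    later isolate relative to the earlier one (hypothetical input).
    """
    if years_between_isolates <= 0:
        raise ValueError("years_between_isolates must be positive")
    return extra_distance / years_between_isolates


def lookback_years(distance_to_ancestor, rate):
    """Years back to a common ancestor for a given branch length and rate."""
    return distance_to_ancestor / rate


if __name__ == "__main__":
    # Toy alignment: one gap column (skipped) and one substitution.
    d = substitution_distance("ACGT-ACGTA", "ACGTTACGAA")
    print(f"substitution distance: {d:.3f}")

    # Placeholder values only; these are not figures from the talk.
    rate = calibrate_rate(extra_distance=0.03, years_between_isolates=10)
    print(f"lookback: {lookback_years(0.12, rate):.0f} years")
```

[Because insertions and deletions are skipped, a distance computed this way understates the total change, which is why any time estimate derived from it is best read as a minimum.]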
That is, we wanted to get a minimum estimate back into the past, or what's called the lookback time, for these viruses, and we estimated that to be 45 or 50 years. Undoubtedly, that's an underestimate. Other people have assumed mutation rates or done other kinds of calculations and come up with periods of 150 or so years. So we're talking about a time period of divergence of all of these viruses of 50 to 150 years back into the past. This is a very short time for an evolutionary process, of course, however, even a time period so short as 50 years tells us very little about the recent origin of AIDS. This reconstruction is, indeed, consistent with the transfer origin of the human immunodeficiency viruses from simians. Not the particular monkey viruses whose sequences terminate the ends of the branches here, but rather their ancestors. All of the simian sequences represented in this tree were taken from captive animals, African green monkeys from a colony in Kenya and Mangabey and macaque monkeys from U.S. primate centers. Thus, we have no information about feral animals in the epidemic areas. Nevertheless, the viruses taken from the captive animals possess the same genomic organization as the human viruses, and the Mangabey virus will cause an AIDS-like disease in macaques and sometimes the Mangabey itself. Duesberg and others who have challenged the HIV explanation of AIDS have simply failed to address these facts, choosing, instead, to emphasize the asymptomatic condition of SIV-infected animals. We should ask what sort of viral population is anticipated for wild monkeys in Africa. Is it conceivable that the twofold epidemic can be accounted for by monkey viruses transferring into a human host? When a slide is this blue, you know it was made downstairs in the last few days and that very fresh data is being presented. [Slide with blue background, showing white dotted lines forming a phylogenetic tree displaying HIV data.] 
Hirsch and Johnson and their coworkers have recently sequenced three additional African green monkey viral isolates. And you can see that these five viruses in this very large center cluster here are quite diverse. We can add, to this information, some further sequence data for the reverse transcriptase genes of two other African green monkey isolates from the New England Regional Primate Center. Taken together, this data argues for an astonishingly large reservoir, a quantitatively large and qualitatively diverse reservoir of immunodeficiency viruses in the African green monkey populations of Africa. It is my present expectation that further sampling of monkey and human populations in Africa will confirm an origin of both types of HIVs from African green monkey viruses, based on this large set of data we have here, now. This virus here, the first of the simian viruses to be isolated, was taken from a captive macaque. A macaque is an Asian primate that is not found with the virus in the wild, thus this is not, strictly speaking, a macaque virus, but rather an immunodeficiency virus found on a macaque. Perhaps the macaque acquired the virus from a mangabey. Mangabeys in the wild are known to be sera-positive, though we don't have a sequence from one of those, yet. Can we say that this sequence here represents the mangabey virus from Africa? Not really. All that we know for sure is that it was obtained from a virus taken from a captive mangabey. It could be the mangabey virus. Perhaps it is not. One form of the mangabey virus will cause an AIDS-like disease in macaques and, in some cases, mangabeys. Continuing this line of thought, it doesn't seem so far-fetched to wonder if the viruses that are being called HIV-1 and HIV-2 aren't also simian viruses being isolated from human hosts. When a transmission across species boundaries has recently occurred, when do we stop calling the virus "A" and begin to call it "B?" 
However, even if we follow that line of thought, that all of the viruses shown in this tree are essentially simian viruses -- which I happen to think is the most fruitful approach, at this time -- we still confront the problematic question of how viral transfer happened twice in a very short interval with significantly different virus. I suggest, in this regard, that we consider some possibilities of human accidents, that is historical accidents, that have been mentioned but not really thoroughly explored. And this, then, would respond to Duesberg's argument. First, that there was a sharp increase in the exportation of monkeys from Africa in the sixties and seventies, mostly for the purposes of research. This undoubtedly entailed quite a bit of handling of captured monkeys. Second, there was a valiant push with live viral vaccines in the sixties. Many countries began using live polio vaccines in the spring of 1960, that was the first time. These vaccines are prepared, as you know, from monkey tissues, typically African green monkey kidney cells. And the possibility of contaminated lots is not far-fetched. We know of two such accidents in the sixties, one involving the virus known as SV-40 that contaminated polio vaccines and the other involving the Marburg virus, which didn't actually contaminate vaccine, but contaminated the laboratory preparing vaccine, and both of these viruses were derived from the African green monkeys being used in the preparation of those polio vaccines. If a rare contamination had occurred through handling or through culture, it would never have been detected. Infected African green monkeys are typically asymptomatic. It is still thinkable that HIV has been endemic to either one of the African areas or perhaps to some still-to-be-identified area of the world. I regard that to be less likely today than it was a year ago, and I regard it to be less likely than the scenarios that I've just put forward. 
But I do wish to make clear that I know of no data or circumstantial evidence to support the hypothesis of vaccine contaminants beyond what I've outlined for you. If you leave the auditorium today thinking AIDS is a greater mystery than you'd previously realized, then I have succeeded to my satisfaction. There is still much data to come in involving HIV and SIV in Africa. For the remainder of this talk, I'd like to describe our current efforts at tracking HIV-1 into the future. The Institute of Allergy and Infectious Diseases is [?] an ambitious program of molecular epidemiology, and I'll be reporting on some of the preliminary findings of that program. The focus now is upon just the type 1 viruses because they account for the vast majority of AIDS cases worldwide. For the sake of simplicity, in this tree, only the single HIV-1 sequence here has been shown, but there are many more than that. This tree has been generated from envelope gene sequences from the earliest isolates in 1985, the isolates known as HTLV-IIIB, or now called IIIB viruses, and French isolates. [Slide showing dotted lines extending horizontally, forming a phylogenetic tree displaying HIV data.] This sort of configuration gave a very brief impression of genetic stability of HIV-1. Because the culture from which the so-called IIIB viruses were isolated had been seeded with blood pooled from many infected individuals in New York City, these sequences were thought to represent different patients, and therefore represent a moderately stable retrovirus, stable in comparison to other retroviruses. That is, about equally stable. Further sequence work quickly gave a more discouraging story. This one you'll have to back up for. This tree was also generated by a computer from all of the envelope nucleotide sequences of just North American HIV-1 viruses that have been sampled and molecularly characterized to date. The original IIIB and LAV viruses that I showed you on the previous slide are just these two up here.
There are many other viruses from San Francisco, from Atlanta, New York, St. Louis, that are shown on here, though I wouldn't have you conclude that there's a major viral form for each major city. [Slide showing vertically-oriented phylogenetic tree.] You wouldn't believe that, anyhow, given the state of jet travel. This virus was taken from a hemophiliac patient in Japan who was undoubtedly the victim of transfusion-associated AIDS. These sequences represent a fairly broad spectrum of pathogens, some from macrophages, some from T-4 lymphocytes, some from brain. This sequence, BRVA, for example, was taken from the frontal lobe tissue of a 58-year-old Atlanta man who was incorrectly diagnosed as having Alzheimer's disease. And these sequences, all three of these here, were taken from the same individual, a child who happened to be of Asian descent. In fact, there is a geographical intelligibility to this slide. The Haitian sequences that have been isolated thus far, cluster together. The U.S. viruses, which would be up here, cluster together. And the Zairian viruses, there was only one member at the bottom of this tree, and I'm going to show you many more, they cluster together. Assuming this is a fair sample of viruses in the population up to 1985 or so, when most of these were isolated, what can be inferred, from this analysis, about the rate and extent of HIV variation in the United States? Based upon what we know about the probable time of entry of the virus into the U.S. and what we can reasonably infer from the time calibration mentioned earlier, we think the diversification represented in this tree took place between 1978 and 1986, thus it is quite possible that the viral variation is being driven by the epidemic. That is to say, the diversification of HIV-1 may be a function of the number of infected individuals, or, more precisely, the growth rate of the prevalence of the virus. 
There is some precedent for this with polio virus, which, during a one year outbreak in 1978, changed by 1 to 2 percent. That rate of change is not typical for polio and it's actually less than the rate implied here for HIV-1 within a single individual. I remind you that this tree analysis takes into account only single base substitutions, not insertions and deletions. Thus, any rate deduced from the tree would be a minimum. It is easier to gauge the extent, as distinguished from the rate, of variation by simply noting the pairwise distances for the various viruses and the total sites in the envelope gene that are under consideration. For example, these bowel sequences here, one and two, are minimally 34 nucleotide changes apart. 34 divided by the total number of sites, which was 2,399 for this tree, yields a minimal distance of 1.5 percent apart. That value, again, comes into focus when we realize that these viruses taken from the same patient are as far apart as the polio viruses were given in a population, in an outbreak in 1978. A summary of distances for the U.S. sample today is shown on the next slide. I won't be able to go through this whole slide, there just isn't enough time, but in general, there are three categories of viral distances or relationships. First, viruses that are one to three percent different from each other, that is 97 to 100 percent identical, we're calling these, for the sake of convenience, sibling viruses. And, with one exception, these are all known to have originated from the same patient. The next category, over here and continuing down into this corner of the figure, we refer to as cousins. These are viruses taken from different patients and they involve sequences that differ from five to twelve percent from one another. And finally, viral relationships that involve distances of up to a maximum of 26 percent, up to a slide I'm going to show you in a few minutes, but 26 percent different in the envelope gene sequence. 
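[Editorial illustration, not part of the seminar: the distance arithmetic above can be checked with a short sketch. Dividing 34 changes by 2,399 sites gives about 1.4 percent, which the talk rounds to 1.5 percent. The category cutoffs are read from the ranges given in the talk (up to three percent, five to twelve percent, and up to 26 percent, the last group being the one the speaker goes on to call distant cousins). The function names and the handling of values that fall between or beyond those ranges are assumptions made for this example.]

```python
def percent_distance(changes, total_sites):
    """Minimal distance between two sequences, as a percentage of sites."""
    if total_sites <= 0:
        raise ValueError("total_sites must be positive")
    return 100.0 * changes / total_sites


def relationship(pct):
    """Map a percent distance onto the three categories named in the talk.

    Values in the gap between categories, or above 26 percent, are
    reported as 'unclassified' (an assumption of this sketch).
    """
    if pct < 0:
        raise ValueError("distance cannot be negative")
    if pct <= 3:
        return "sibling"
    if 5 <= pct <= 12:
        return "cousin"
    if 12 < pct <= 26:
        return "distant cousin"
    return "unclassified"


if __name__ == "__main__":
    d = percent_distance(34, 2399)  # the same-patient pair from the talk
    print(f"{d:.1f} percent -> {relationship(d)}")
```

[The "unclassified" result for values between three and five percent corresponds to the unsampled region of the distance summary that the speaker attributes to ad hoc sampling.]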
These we call distant cousins and they represent viruses typically from different geographical areas of the world. This gap in here appears to be the consequence of the ad-hoc sampling that's been involved, up to now, of viruses from the U.S. pool. That is, the discreteness of the distribution obtains from the straightforward consequence that there was little chance for capturing viruses that represented close contacts; that is, sexual partners, mother-baby, donor-recipients in transfusion-associated AIDS. We expect quite a bit of data pertaining to this region to come in in the next year. In fact, in one [?] study of which I am aware, sequences are being determined from viral isolates from several infected members of a single household, a single family. Returning, just for a moment, to the tree that was the basis for the distance summary, we should keep in mind that it is merely a snapshot of viral forms sampled in 1983, 1984, and 1985, for the most part. We cannot state what this viral assemblage looks like in 1989, nor can we predict what it will be like in 1995. However, the corresponding envelope gene tree for viruses sampled from Kinshasa in Zaire does provide some picture of the likely course of HIV variation in the United States. Doctors [?] and Gallo have undertaken an extensive sampling and sequencing project in Kinshasa as a part of the vaccine program and have generously made available some of their earliest findings for tree analysis. It is quite evident that there are at least two major forms, this group up here and this group down here, of HIV-1 in this single city, Kinshasa, and that the distances separating the viruses are now very great in 1988. That is, these sequences differ, if you follow this branch here all the way through the tree up to the highest branch here, you have a distance of nearly 50 percent.
These startling changes are undoubtedly due to an untrammeled rate of variation, but also the result of travel along the road mentioned earlier in this talk. Given this extent of HIV variation, we should not be surprised by claims of new types of HIV, but most of these claims will hold up, I predict, only for specific genes of the virus, not for the entire genome. Nevertheless, a true type 3 virus may emerge for reasons I have suggested earlier. I'll close by noting that tracking human immunodeficiency viruses is not a totally futile task. Precisely because these viruses vary so rapidly, we learn about their genes and proteins all the more quickly. Vaccines and antiviral drugs are not prohibited by the situation. We must simply be very attentive to the fine details of the molecular structure. To give you some idea of the more optimistic results of this inquiry, I'd like to show you two slides, and they are not trees, that are, in my judgment, highly informative and encouraging. This has a lot of detail on it, but if you would simply focus on the asterisks, for the moment. The tat protein is an, essentially a virally-encoded protein that determines viral latency. In this compilation of many HIV-1, probably 30 HIV-1, sequences here, HIV-2 sequences, and both HIV-1 and HIV-2 and also the African green monkey sequences, we can quickly identify what appear to be the invariant residues of the tat protein, and those are shown by the asterisks here, for example these cysteines lining up here, down here these lysines. We're very interested in these three arginines right here for reasons I'll show you soon, in fact, this whole region here seems to be the active region of tat. This sort of knowledge obtained from consensus sequences or consensus-like sequences is simply not available for the more familiar and the more stable viruses such as rabies or yellow fever, and it does represent a definite potential for defeating the epidemic.
I began by noting that no genetic homologies have been found for the immunodeficiency viruses. That doesn't rule out the chance for discovering protein similarities. We can't expect that HIV will have genes corresponding to cellular genes, cellular protein structures, as are found for most oncogenic retroviruses. But as the sequence information concerning HIV and the genetic knowledge of HIV has grown, we have been watching for interesting similarities over short regions. These are short fragments of HIV proteins that look, for all the world, like small human peptide hormones. Specifically, they possess the typical processing signals. Shown here is a cleavage site, which is not unusual, but a cleavage site with glycine. That makes it a potential amidation site, which is very common for the C terminus of peptide hormones. And at this end is a signal, a processing signal, again for cleavage but also for amidation and formation of a pyroglutamyl residue based on that glutamine there. These are relatively rare signals in the protein databases, but they are surprisingly frequent in lentiviral sequences. For example, the feline immunodeficiency virus, which is only about 30 percent similar to HIV, nevertheless has this motif here we're quite interested in which corresponds very closely to that motif in the tat protein of HIV. And both of them, again, show potential signals for protein modification characteristic of short peptide hormones. These possibilities are just being explored, I wish I had something more encouraging and definite to report. Perhaps when I come back in a year, there'll be something better. I think, though, that I would like to close by stating that I am more optimistic today than I have been for two years. I want to acknowledge my coworkers in Los Alamos, Christy McGinnis, Randy Linder, Jeff Lawrence. A collaborator at Harvard School of Public Health, Temple Smith, who was involved in the tree analyses. And a collaborator at Brookhaven, Dr.
John Glass, who taught me a lot about peptide hormones. [Applause] [National Library of Medicine Bethesda, MD, 1989]