Steven McKay's Review
A summary of “Chapman, B.A., J.E. Bowers, F.A. Feltus, and A.H. Paterson. 2006. Buffering of crucial functions by paleologous duplicated genes may contribute cyclicality to angiosperm genome duplication. Proceedings of the National Academy of Sciences 103(8): 2730-2735.”
While it has been recognized for some time that gene duplication, including entire genome duplication, has been a relatively common event throughout flowering plant evolution, many questions remain about when and why it occurs. One explanatory model is that upon duplication, one copy of a gene is free to evolve and acquire new functionality, whereas the other copy is conserved with no corresponding loss of functionality. If this model, referred to as “functional divergence,” is accurate, then, over time, enhanced levels of variability due to mutation would accumulate among homoeologous genes, that is, genes derived from a single ancestral gene. An alternative model, known as “functional buffering,” proposes that multiple copies of genes can yield increased levels of gene expression. Both of these models are at odds with previous studies of induced gene duplication, in which the resulting organisms are apparently maladapted and genes are often rapidly lost. Both models require that duplicated genes be retained, either long enough for one set of genes to evolve in the first case, or indefinitely in the second case. The authors of this study investigated whether there is evidence for a relationship between gene evolution and gene copy number. To do so, they used the recently sequenced genomes of Arabidopsis (a mustard) and Oryza (rice).
As a first step, the authors applied computational matching programs to identify pairs of duplicated genes. In other words, they identified genes in multiple locations throughout the genome that are clearly derived from a common gene, yet are likely to contain mutations. They also identified non-duplicated genes, which they label “singletons.” These were defined as genes found within a single genomic location and clearly derived from a single ancestral gene, yet which may contain mutations. Using subspecies and landraces, as well as related species, for comparative purposes, the authors were able to detect three genome duplication events in the development of the Arabidopsis genome and one in that of the rice genome. For example, the Arabidopsis genome was compared to those of Gossypium (cotton) and Brassica (canola, cabbage and relatives), and gene duplication events were identified as having occurred between the points at which these two taxa split from the Arabidopsis lineage. Similarly, the rice duplication event was identified as having occurred after the Musa (banana) lineage split off from that of Oryza and before the Sorghum lineage split off from that of Oryza. Individual pairs of duplicated genes, or paleologs, were attributed to particular events.
Next, small mutations known as single nucleotide polymorphisms (SNPs) were identified within homoeologous pairs of genes, and their locations within the genes were determined. Throughout the study, only genes located within known coding regions were analyzed, thus reflecting those portions of the genome upon which evolutionary pressures are likely to act. The SNPs associated with each gene were then examined to determine whether they would change the encoded amino acids. SNPs associated with singleton genes were more likely to result in amino acid substitutions than SNPs associated with duplicated genes. In other words, despite mutations, duplicated genes were more likely to be functionally conserved than singleton genes. Additionally, the more recent the duplication event, the more likely that duplicated genes were functionally conserved, yet no such pattern was apparent for singleton genes. This suggests that duplicated genes tend to be preferentially conserved relative to singletons, but that over time, copies do continue to evolve toward new functional types. In fact, further analysis of amino acid substitutions indicated that the changes were more benign within duplicated genes than within singletons.
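The distinction drawn above, between SNPs that change the encoded amino acid and those that do not, can be illustrated with a minimal sketch. This is not the authors' pipeline: the function names `classify_snp` and `nonsynonymous_fraction` and the (codon, position, alternate base) input format are assumptions made for illustration, and only the standard genetic code is used.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def classify_snp(codon, position, alt_base):
    """Classify a single-base change within a codon (hypothetical helper).

    codon: reference codon, e.g. "GAA"
    position: 0-based index of the changed base within the codon
    alt_base: the alternate base observed at that position
    """
    mutated = codon[:position] + alt_base + codon[position + 1:]
    if CODON_TABLE[codon] == CODON_TABLE[mutated]:
        return "synonymous"
    return "nonsynonymous"


def nonsynonymous_fraction(snps):
    """Fraction of SNPs in a gene class that change the amino acid."""
    if not snps:
        return 0.0
    changed = sum(1 for snp in snps if classify_snp(*snp) == "nonsynonymous")
    return changed / len(snps)


# Illustrative input only, not data from the paper.
example_snps = [("GAA", 2, "G"), ("GAA", 0, "A")]
print(nonsynonymous_fraction(example_snps))
```

Computing this fraction separately for singleton genes and for duplicated genes would give the kind of comparison the review describes, under the assumption that each SNP has already been placed in its reading frame.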
The size and complexity of duplicated genes also appear to differ substantially from those of singleton genes. Duplicated genes are, on average, longer than singleton genes. Furthermore, larger portions of the duplicated genes are sensitive to mutations that would change the encoded amino acids. These two facts suggest that large, complex genes are more likely to be retained than smaller, simpler genes.
In a particularly clever analysis, the authors compared genes from Arabidopsis and Oryza with homoeologous genes from Gossypium and Musa, respectively, which diverged before the most recent gene duplication event. The Arabidopsis and Oryza genes were also compared to those from Brassica and Sorghum, respectively, which diverged after the most recent gene duplication event. These comparisons allowed for an assessment of whether rates of gene change are related to time since divergence. Genetic change was measured as the maximum detected length of identical codons, representing the longest identical stretch of expressed amino acids. By looking at singletons and duplicates separately, the authors were able to evaluate the rates of change in the two classes. Additionally, the authors separately examined coding regions, which one would expect to be acted upon by selection, and non-coding regions, which would be evolving randomly. Non-coding regions demonstrated a steady pattern of increasing differentiation over time (pre-duplication event versus duplication event versus post-duplication event) in both lineages. Within coding regions, however, genetic similarity was retained after gene duplication, both within Arabidopsis and Oryza, and between these two species and their close relatives Brassica and Sorghum. In other words, even after speciation events, duplicated genes were more likely to be conserved than singleton genes. The authors speculate that these long stretches of conserved coding sequence may be retained by gene repair, and that this may be more successful when there are multiple copies of these genes to serve as templates for the repair mechanisms.
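As a rough illustration of the similarity measure described above, the longest run of identical codons between two sequences can be computed as follows. This is a sketch under stated assumptions, not the authors' method: it assumes the two coding sequences are already aligned, in the same reading frame, and free of gaps, and it compares codons at the nucleotide level. The function name `longest_identical_codon_run` is hypothetical.

```python
def longest_identical_codon_run(seq_a, seq_b):
    """Length, in codons, of the longest stretch of identical codons.

    Assumes seq_a and seq_b are aligned, in-frame, gap-free coding
    sequences; any trailing partial codon is ignored.
    """
    usable = min(len(seq_a), len(seq_b)) // 3 * 3
    best = 0
    current = 0
    for i in range(0, usable, 3):
        if seq_a[i:i + 3] == seq_b[i:i + 3]:
            current += 1
            best = max(best, current)
        else:
            current = 0  # a mismatched codon ends the current run
    return best


# Illustrative sequences only, not data from the paper.
gene_copy_1 = "ATGGCTGAAAAATTT"
gene_copy_2 = "ATGGCTGAGAAATTT"
print(longest_identical_codon_run(gene_copy_1, gene_copy_2))
```

Applying such a measure to duplicate pairs and to singleton comparisons separately, and to coding and non-coding regions separately, mirrors the structure of the comparison summarized in the review.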
The authors then speculate that the two models of functional divergence and functional buffering may actually represent two ends of a spectrum of responses to gene duplication. They point out that even full-genome scans are unlikely to detect ancient duplicated genes that long ago diverged beyond the point of recognition. They also acknowledge that they have focused on protein-encoding genes rather than regulatory genes, which could exhibit different patterns of evolution and retention.
Finally, as a counterpoint to empirical evidence that gene duplication often yields maladaptive traits, the authors speculate that as time since gene duplication increases, continued accumulation of mutations is likely to decrease the benefits of functional buffering. Consequently, under such conditions, organisms may benefit to a greater degree from gene duplication. These additional benefits may actually result in something of a cyclic or periodic nature of gene duplication events, ultimately leading to increased genetic evolution.