DNA methylation, epigenetics, and evolution in vertebrates. The generally accepted definition of what constitutes a CpG island was. To date, there has been no genome-wide analysis of CGIs in the fish genome. Comparative analysis of CpG islands in four fish genomes. Article in Comparative and Functional Genomics 2008(3). CpGPAP is a web-based application that provides a user-friendly interface for predicting CpG islands in genome sequences or in user input sequences. CpG content in the inferred human-macaque ancestral genome and the extant species genomes was compared for regions classified as hypodeaminated CpG islands (green) and BGC CpG islands (red). Gardiner-Garden M, Frommer M (1987) CpG islands in vertebrate genomes. Functional relevance of CpG island length for regulation of. DNA methylation of intragenic CpG islands depends on their. The globally methylated, CpG-poor genomic landscape is punctuated, however, by CpG islands (CGIs), which are, on average, 1000 base pairs (bp) long. This article is from Biochemical Society Transactions, volume 41. Methylation of CpG islands is associated with delayed replication, condensed chromatin. Orphan CpG islands identify numerous conserved promoters in.
CpG islands (CGIs) are generally considered epigenetic and functional elements [1,2]. While the regulatory importance of CpG islands is widely accepted, it is little appreciated that CpG islands vary greatly in length. CpG island density and its correlations with genomic. Methylation-driven model for analysis of dinucleotide. Some of these regulatory regions may be associated with CpG islands located far from the transcription start sites of any protein-coding gene. In zebrafish, promoter regions, defined as 2000 bp upstream of annotated genes, are methylation-poor, similar to. Shown are the ratios between extant and ancestral CpG content for the human lineage (x axis) versus the rhesus lineage (y axis), reflecting more cases. CpG islands located within a region of human chromosome 19. Number of CpG islands and genes in human and mouse. Genomic islands play an important role in medical, methylation and biological studies. Evolutionary consequences of DNA methylation on the GC content in vertebrate genomes. Experiments of molecular cloning and sequencing were performed in our previous study (Yang et al. Most, perhaps all, CGIs are sites of transcription initiation, including thousands.
CpG islands mark CpG-enriched regions in otherwise CpG-depleted vertebrate genomes. Previous studies have shown that CpG dinucleotides are enriched in a subset of promoters and that the CpG content of promoters is positively correlated with gene expression levels. Evolution of replication origins in vertebrate genomes. Comprehensive analysis of CpG islands in human chromosomes 21 and 22. Features of methylation and gene expression in the promoter. DNA methylation is a key epigenetic modification in vertebrate genomes, known to be involved in biological processes such as regulation of gene expression, DNA structure, and control of transposable elements. It is not clear why CpG islands are such poor substrates for DNA methyltransferase.
The mutational process is obviously ongoing in the human germline. The evolution of CpG density and lifespan in conserved. We first evaluated the performance of three popular CGI identification algorithms in four fish genomes (tetraodon, stickleback, medaka, and. The upper panel illustrates a 65 kb portion of human chromosome 19 (17,195,000-17,260,000) which contains five annotated genes (blue bars) and four CpG islands. Author summary: In the decade since the sequence of the human genome was announced, efforts have been made to annotate all genes with their regulatory sequences. We report here a study focused on CpG sites in the coding regions of Hox and other transcription factor genes, comparing methylated genomes of Homo sapiens, Mus musculus, and Danio rerio with.
In vertebrate genomes the dinucleotide CpG is heavily methylated, except in CpG islands, which are normally unmethylated. Apr 01, 2011: CpG islands mark CpG-enriched regions in otherwise CpG-depleted vertebrate genomes. In the mid-1980s, we showed that CpG is linearly and positively correlated with GC levels of vertebrate genes [7] (see also Bernardi and Bernardi [8]), a point confirmed by further detailed investigations [9]. After removing CpG islands, NpCpG and CpGpM trinucleotides in each of the 10 vertebrate genomes were counted using an in-house Java program (for results, see Supplementary Table 7, Additional file 1), and the eight parameters were then obtained with eqs. An example is the DNA repair gene ERCC1, where the CpG island-containing element is located about 5,400 nucleotides upstream of the transcription start site of the ERCC1 gene. A C (cytosine) base followed immediately by a G (guanine) base (a CpG) is rare in vertebrate DNA because the cytosines in such an arrangement tend to be methylated. Human genes with CpG island promoters have a distinct. May 01, 2014: In the terminal tissues, CpG islands in promoters, although far less methylated than CpG islands overall, are still slightly methylation-rich. The fact that CpG contents of LCGs are similar to that of the rest of the genome whereas HCGs preserve CpG contents in several distantly related vertebrate genomes (fig. Methylated and nonmethylated DNA sequences coexist in many animal genomes. For example, the average percentage of nucleotide substitutions between human and chimpanzee is 0. CpG islands are useful markers for genes in organisms containing 5-methylcytosine in their genomes. More than half of the genes in vertebrate genomes contain short (approximately 1 kb) CpG-rich regions known as CpG islands (CGIs), and the rest of the genome is depleted for CpGs. To explain its origin and evolution, three main mechanisms have been proposed.
Dec 15, 1993: Combining the number of CpG islands with the proportion of island-associated genes, we estimate that the total number of genes per haploid genome is approximately 80,000 in both organisms. Cytosine methylation and the fate of CpG dinucleotides in. Contrasting chromatin organization of CpG islands and. Distribution of DNA methylation, CpGs, and CpG islands in. The observed-to-expected CpG ratio can be derived where the observed is calculated as. All 5mC is present in the dinucleotide CpG, although only 70 to 80% of the potentially methylatable sites are actually in a methylated form. CpG islands are typically common near transcription start sites (TSS), are. Distribution of CpG islands in patients with different phases of infection. CpG islands are already being used to identify potential genes in isolated DNA, and they may function analogously in vivo, as gene markers for ubiquitous. CpG island predictor analysis platform, BMC Genetics. Vertebrate genomes are globally heavily methylated at the sequence CpG, with the exception of short patches of GC-rich DNA of between 1 and 2 kb in size that are free of methylation, and these are known as CpG islands (see refs. However, CpGs located in CGIs are highly conserved across vertebrate genomes [6]. Intragenic nucleosomes and their modifications have been recently associated with RNA splicing.
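For reference, the observed-to-expected CpG ratio mentioned above is conventionally written in the form below. This is the standard formulation from the CpG island literature and is offered only as a sketch; it is not necessarily the exact expression that the truncated sentence went on to give. The symbols are assumptions introduced here: N_CpG is the number of CpG dinucleotides in the window, N_C and N_G are the counts of cytosines and guanines, and L is the window length in base pairs.

```latex
\[
\frac{\mathrm{Obs}_{\mathrm{CpG}}}{\mathrm{Exp}_{\mathrm{CpG}}}
  = \frac{N_{\mathrm{CpG}}}{\left( N_{\mathrm{C}} \times N_{\mathrm{G}} \right) / L}
  = \frac{N_{\mathrm{CpG}} \times L}{N_{\mathrm{C}} \times N_{\mathrm{G}}}
\]
```

The expected count assumes that C and G occur independently at adjacent positions, so a ratio well below 1 indicates CpG depletion and a ratio near or above 1 indicates no depletion.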
Purification of CpG islands using a methylated DNA binding. Intergenic, gene terminal, and intragenic CpG islands in the. Organizational heterogeneity of vertebrate genomes. There has been much interest in CpG islands (CGIs), clusters of CpG dinucleotides in GC-rich regions, because they are considered gene markers and are involved in gene regulation. Many occur at gene promoters, and their DNA nearly always remains unmethylated. The chromosome region containing the highly polymorphic HLA class I genes displays limited large-scale variability in the human population. Comparative analysis using k-mer and k-flank patterns. May 29, 2012: More than half of the genes in vertebrate genomes contain short (approximately 1 kb) CpG-rich regions known as CpG islands (CGIs), and the rest of the genome is depleted for CpGs. Sometimes referred to simply as DNA methylation, in eukaryotes 5mC is most prevalent at CpG dinucleotides and is frequently associated with transcriptional repression. Scheme for the formation and evolution of CpG islands in the genome of vertebrates. In humans, about 70% of promoters located near the transcription start site of a gene (proximal promoters) contain a CpG island; distal promoter elements also frequently contain CpG islands.
CpG island clusters and proepigenetic selection for. CpG dinucleotides are notably depleted in mammalian genomes where the observed frequency of CpG dinucleotides is only 0. Frequent hypermethylation of orphan CpG islands with enhancer. CpG islands (CGIs) have long been implicated in the regulation of vertebrate gene expression. Detecting anomalous gene clusters and pathogenicity islands in diverse bacterial genomes.
At present, the mechanism of GC-biased gene conversion, i. Feb 26, 20: The tissue-specific islands were shorter and contained fewer CpG dinucleotides than those found in both types of tissue, a finding that is reminiscent of work at Stanford that identified two classes of gene promoters: one with high levels of CpG dinucleotides and one with lower levels (Saxonov et al.). Two theories for the maintenance of a high frequency of CpG dinucleotides in CpG islands were tested. Aug 12, 2016: CpG dinucleotides are frequently methylated in vertebrate genomes. CpG islands, genes and isochores in the genomes of vertebrates.
In addition, CpG islands located in the promoter regions of genes can play important roles in gene silencing during processes such as X-chromosome inactivation, imprinting, and silencing of intragenomic parasites. CpG islands are short regions containing the sequence CG at high density that map to the regions, known as promoters, that control the expression of most human genes. While epigenome analysis has been applied to genomes from single-cell eukaryotes to human, comparative analyses are still relatively few and computational algorithms to quantify epigenome evolution remain. Implications of CpG islands on chromosomal architectures and modes of global gene regulation. Primate CpG islands are maintained by heterogeneous. Because the function of intragenic DNA methylation remains unclear, I explored the. At the time, these islands of CpG dinucleotides were.
Background: The replication programme of vertebrate genomes is driven by the chromosomal distribution and timing of activation of tens of thousands of replication origins. Comparative analysis using k-mer and k-flank patterns provides evidence for CpG island sequence evolution in mammalian genomes. Heejoon Chae1, Jinwoo Park2,3, Seongwhan Lee4, Kenneth P. It is now almost 26 years since the CpG island (a stretch of DNA with a larger than expected proportion of cytosine followed by guanine bases) was first defined, based on an analysis of the relative proportions of the four bases in the then limited amount of human sequence information available (Gardiner-Garden and Frommer, 1987). Large-scale human promoter mapping using CpG islands. Relating gene expression evolution with CpG content. CpG islands as gene markers in the vertebrate nucleus. In contrast, CpG islands are CpG-enriched regions where the frequency of CpGs is.
In addition to distinctive DNA characteristics, CpG islands also have an open chromatin structure in that they are hyperacetylated, lack. Regulatory regions related to transcription of these noncoding RNAs are poorly studied. On the other hand, DNA methylation is absent in promoters but is enriched in gene bodies. CpG dinucleotides contribute to epigenetic mechanisms by being the only site for DNA methylation in mammalian somatic cells. Approximately 4% of total cytosines are methylated, representing about 5. Here we report findings suggesting that the lengths of CpG islands have functional consequences. Vertebrate CpG islands (CGIs) are short interspersed DNA sequences that deviate significantly from the average genomic pattern by being GC-rich, CpG-rich, and predominantly nonmethylated.
CpG islands are associated with genes, particularly housekeeping genes, in vertebrates. Isolation of CpG islands using a methyl-CpG binding column. If DNA repair mechanisms fail to remove the mutated T with a G on the opposite strand before DNA replication [4,5,6], C-to-T substitutions (referred to by the pyrimidine of the mutated Watson-Crick base pair. Although CpG sites are underrepresented in genomes overall, clusters of CpGs known as CpG islands are observed, and these are normally protected from methylation [8]. Vertebrate genomes are methylated predominantly at the dinucleotide CpG, and consequently are CpG-deficient owing to the mutagenic properties of methylcytosine (Coulondre et al. However, the involvement of CGIs in chromosomal architectures and the associated regulation of gene expression has not yet been thoroughly explored. Compositional transitions in the nuclear genomes of cold-blooded vertebrates. CpG islands typically occur at or near the transcription start site of genes, particularly housekeeping genes, in vertebrates. Nephew5 and Sun Kim2,3, 1Department of Computer Science, School of Informatics and Computing, Indiana University, Bloomington, IN, USA, 2Department of Computer Science and Engineering. The genomes of many vertebrates show a characteristic variation in GC content.
CpG islands are regions where CpGs are present at significantly higher levels than is typical for the genome as a whole [16]. Aug 31, 2010: CpG dinucleotides contribute to epigenetic mechanisms by being the only site for DNA methylation in mammalian somatic cells. Mice, which were thought to possess far fewer CpG islands than humans, turn out to have a very similar number. Nevertheless, the recent study by Hackenberg et al. CpG islands frequently contain gene promoters or exons [1] and are usually unmethylated in normal cells [1,2,3]. Jawon Song, Bumkyu Lee, Lucy LeBlanc, Laurie Cannon, Jonghwan Kim, Implications of CpG islands on chromosomal architectures and modes of global gene regulation, Nucleic Acids Research, Volume 46, Issue 9, 18. In zebrafish, promoter regions, defined as 2000 bp upstream of annotated genes, are methylation-poor, similar to humans and other species (Feng et al.
Jan 19, 2010: Recently, it has been discovered that the human genome contains many transcription start sites for noncoding RNA. They also found evidence for CpG dinucleotide suppression in other genomes, including those of yeast and fruit flies. Mammalian genomic DNA generally shows a great deficit of CpG dinucleotides; for example, the ratio of observed to expected CpGs (Obs CpG/Exp CpG) is approximately 0. The 5-methylcytosines are susceptible to spontaneous deamination to thymine. Plant genomes display methylation, but otherwise the genomes of plants and animals represent two very divergent evolutionary lines. DNA methylation and structural and functional bimodality of.
Empirical models of sequence evolution have spurred progress in the field of evolutionary genetics for decades. Evolution of epigenetic regulation in vertebrate genomes. Though objective definitions for CpG islands are limited, the usual formal definition is a region of at least 200 bp with a GC percentage greater than 50% and an observed-to-expected CpG ratio greater than 60%. Despite increasing knowledge about DNA methylation, we still lack a complete understanding of its specific functions and its correlation with environment and gene expression in diverse. But the relationship between divergence of CpG content and gene expression evolution has not been investigated.
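The formal definition quoted above (at least 200 bp, GC percentage above 50%, observed-to-expected CpG ratio above 60%) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the method of any particular tool mentioned in the text: the function names and default thresholds are hypothetical, the expected CpG count is taken as (number of C × number of G) / length, and the check is applied to a single candidate sequence rather than scanned across a genome with a sliding window.

```python
def gc_fraction(seq: str) -> float:
    """Fraction of bases in seq that are G or C (0.0 for an empty sequence)."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return (seq.count("G") + seq.count("C")) / len(seq)


def cpg_obs_exp(seq: str) -> float:
    """Observed-to-expected CpG ratio.

    Assumption: expected CpG count = (#C * #G) / length, i.e. C and G
    are treated as occurring independently at adjacent positions.
    """
    seq = seq.upper()
    n_c = seq.count("C")
    n_g = seq.count("G")
    if n_c == 0 or n_g == 0:
        return 0.0
    observed = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    expected = n_c * n_g / len(seq)
    return observed / expected


def meets_cgi_criteria(seq: str, min_len: int = 200,
                       min_gc: float = 0.5, min_obs_exp: float = 0.6) -> bool:
    """True if seq satisfies the three thresholds of the usual formal definition."""
    return (len(seq) >= min_len
            and gc_fraction(seq) > min_gc
            and cpg_obs_exp(seq) > min_obs_exp)


if __name__ == "__main__":
    candidate = "CG" * 100   # toy 200 bp sequence made only of CpG dinucleotides
    background = "AT" * 100  # toy 200 bp sequence with no G or C at all
    print(meets_cgi_criteria(candidate))
    print(meets_cgi_criteria(background))
```

A whole-genome predictor would additionally need a windowing and merging strategy, which differs between the identification algorithms the text refers to and is deliberately left out here.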
A portion of microRNA (miRNA) genes in five vertebrate species are found to associate with CpG islands. Genome-wide studies have shown the frequent association of origins with promoters and CpG islands, and their enrichment in G-quadruplex sequence motifs (G4). Vertebrate genomic DNA is generally CpG-depleted [1,2], possibly because methylation of cytosines at 80% of CpG dinucleotides results in their frequent mutation to. We are now realizing the importance and complexity of the eukaryotic epigenome. For example, the density of CpG islands is highly correlated with the number or the size of the chromosomes in mammalian genomes, and the number of CpG islands varies greatly among fish genomes. Abstract: Vertebrate DNA can be chemically modified by methylation of the 5 position of the. Chimini G, Pontarotti P, Nguyen C, Toubert A, Boretto J, Jordan BR. CpG islands are short stretches of DNA containing a high density of nonmethylated CpG dinucleotides, predominantly associated with coding regions.
Aberrant CpG-island methylation has nonrandom and tumour. Aberrant methylation of promoter-associated CGIs might influence gene expression and cause carcinogenesis. DNA methylation is a common feature of vertebrate genomes; it occurs predominantly at cytosines in CpG dinucleotides and converts cytosine into 5-methylcytosine (Bird and Taggart 1980). CpG islands and nucleosome-free regions are both found in promoters. CpG islands (CGIs) are clusters of CpG dinucleotides in GC-rich regions and represent an important feature of mammalian genomes. Most, perhaps all, CGIs are sites of transcription initiation, including thousands that are remote from currently annotated promoters.
CpG islands and HTF islands in the HLA class I region. Chemical modification of nucleotide bases in DNA provides one mechanism for conveying information in addition to the genetic code. In vertebrates most of the genome is methylated, and nonmethylated sequences are reduced to short CpG islands, many of which include the 5. To explore the region, we propose a CpG islands prediction analysis platform for genome sequence exploration (CpGPAP). Cytosines in CpG dinucleotide sequence contexts are frequently methylated in vertebrate genomes [1, 2]. Conserved and divergent patterns of DNA methylation in higher vertebrates.
Mar 19, 2002: This description eliminates Alu sequences and reduces the predicted number of CpG islands on chromosomes 21 and 22 from over 14,000 down to 1,101, which approximately resembles the number of genes found (around 750). For vertebrate genomes, however, existing DNA methylation data reveal that this modification is not randomly distributed (see below). CpG islands (or CG islands) are regions with a high frequency of CpG sites. Although a significant portion of the genome is methylated at CpG sites, CGIs are usually unmethylated and remain transcriptionally active, with active histone marks such as H3K4me3, as a result of the action of CXXC finger protein 1 (CFP1) [14]. DNA methylation is a conspicuous feature of vertebrate genomes. Substitution in CpG dinucleotide contexts is the most frequent substitution type in genome evolution.
Using a biochemical method, we have identified and mapped all CpG islands. The mechanisms underlying these contrasting patterns of CGI. Conversely, intragenic CGIs are often, but not always, methylated, and thus inactive as internal promoters. Because vertebrate genomes are mostly methylated at the dinucleotide CpG, these sites are frequently mutated, and the genomes are consequently CpG-deficient. Here we calculate the normalized CpG (nCpG) content in DNA regions around.