How do I identify CpG islands?

Published by Charlie Davidson on

How do I identify CpG islands?

CpG islands are defined as sequence ranges where the Obs/Exp value is greater than 0.6 and the GC content is greater than 50%. The expected number of CpG dimers in a window is calculated as the number of ‘C’s in the window multiplied by the number of ‘G’s in the window, divided by the window length.

What are CGI promoters?

CGI promoters appear to define a class of transcription start site (TSS) which can initiate from multiple positions. The more tissue restricted class of non-CGI promoters is generally associated with a single well defined initiation site (reviewed in [12]).

What is a CpG shore?

Shores are the regions immediately flanking CpG islands (CGI) the consensus definition of a CpG shore is up to 2kbp away from the CGI.

What causes CpG island methylation?

The reason for methylation to be almost exclusive to CpG dinucleotides is the symmetry of the dinucleotide. This allows for preservation during cell division and is a hallmark for epigenetic modifications.

What makes a CpG Island distinct?

More than 50% of human genes initiate transcription from CpG dinucleotide-rich regions referred to as CpG islands. These genes show differences in their patterns of transcription initiation, and have been reported to have higher levels of some activation-associated chromatin modifications.

How is CpG methylation detected?

Currently, there are three primary methods to identify and quantify DNA methylation. These are: sodium bisulfite conversion and sequencing, differential enzymatic cleavage of DNA, and affinity capture of methylated DNA (1). Restriction enzyme based differential cleavage of methylated DNA is locus-specific.

What are CGI genes?

Results: We develop a new method for prioritizing genes associated with a phenotype by Combining Gene expression and protein Interaction data (CGI). The method is applied to yeast gene expression data sets in combination with protein interaction data sets of varying reliability.

Are CpG islands Hypomethylated?

Structure. In a normal cell, the CpG island is hypomethylated, and the rest of the genome is methylated. It is evident that the hypomethylation of the CpG island in normal cells provides no additional steric hindrance to future binding.

How are the annotations determined in the CpG?

Schematic of CpG annotations. The genic annotations are determined by functions from GenomicFeatures and data from the TxDb.* and org.*.eg.db packages.

What are the criteria for a CpG island?

The first traditional algorithm was postulated by Gardiner-Garden and Frommer [37], and its criteria include: length over 200 base pairs, over 50% GC pairs, and a ratio of observed to expected number of CpG dinucleotides over 0.60. However, many repetitive elements that commonly occur in genomes (such as Alu repeats) also meet these criteria.

How many CpG islands are there in the genome?

There are about ~28K CpG islands scattered around the genome, representing a somewhat dense set of non-overlapping regions. To get the data we’ll use the excellent AnnotationHubR package. In this case we’ll use it to get the full set of CpG islands from UCSC in a GRanges object.

Are there annotations for hg19 and hg38?

CpG annotations are available for hg19, hg38, mm9, mm10, rn4, rn5, rn6. Schematic of CpG annotations.

Categories: Popular lifehacks