Antisense RNA was a rather uncommon term in a physiology environment until short interfering RNAs emerged as the tool of choice to knock down the expression of specific genes. As a consequence, the concept of RNA having regulatory potential became widely accepted. Yet, there is more to come. Computational studies suggest that between 15 and 25% of mammalian genes overlap, giving rise to pairs of sense and antisense RNAs. The resulting transcripts potentially interfere with each other’s processing, thus representing examples of RNA-mediated gene regulation by endogenous, naturally occurring antisense transcripts. Concerns that the large-scale antisense transcription may represent transcriptional noise rather than a gene regulatory mechanism are strongly opposed by recent reports. A relatively small, well-defined group of antisense or noncoding transcripts is linked to monoallelic gene expression as observed in genomic imprinting, X chromosome inactivation, and clonal expression of B and T leukocytes. For the remaining, much larger group of bidirectionally transcribed genes, however, the physiological consequences of antisense transcription as well as the cellular mechanism(s) involved remain largely speculative.
- overlapping genes
- gene regulation
Exciting technical achievements in the last 15 years, including PCR, microarrays, and large-scale sequencing programs, have brought chromosomes and entire genomes within reach of detailed functional examination. A rather unexpected finding was the extensive bidirectional transcription of eukaryotic genes (16, 49, 72). The resulting pairs of sense and antisense transcripts potentially lead to overlapping, perfectly matching RNA-RNA hybrids (Fig. 1) (31). Many aspects of natural antisense transcripts (NAT) have recently been addressed, such as the total number of NAT in various genomes and their degree of evolutionary conservation, their chromosomal distribution, and the nature of the potential overlap between the sense and the antisense pair. Additional information derives from comparisons of gene cluster arrangements in different species as well as from promoter studies and large-scale mapping of transcription factor binding sites (TFBS) (14, 18, 29, 66). These efforts provide strong support for a pivotal role of antisense transcripts in eukaryotic gene expression; however, when it comes to characterizing this role, experimental evidence is scarce. The aim of this review is to summarize established and hypothesized biological roles of natural antisense transcripts. A section that highlights the technical difficulties encountered in research on antisense RNA is included.
ANTISENSE TRANSCRIPTION IN EUKARYOTIC GENOMES
Antisense transcription on a genome-wide level has been directly addressed in various animal and plant species including human (16, 72), mouse (49), Drosophila (44), rice, and Arabidopsis thaliana (50, 68). These studies are based on annotation of cDNA clone sets to the genome sequence of the cognate organisms. The strategy was pioneered by the Functional Annotation of Mouse (FANTOM) consortium and the Riken group, who annotated the FANTOM2 mouse cDNA set (60,770 clones) and found 2,481 pairs of overlapping sense/antisense transcripts (49). The minimal overlap required was 20 base pairs, a value that was also adopted in other studies. In addition to the 2,481 overlapping sense/antisense pairs, 899 bidirectionally transcribed RNA pairs without overlap were identified (Fig. 1). A total of 20.1% bidirectionally transcribed loci in the mouse genome were found, 14.8% overlapping and 5.3% nonoverlapping, respectively. The nonoverlapping transcript pairs are usually used as controls if the biological role of sense/antisense pairs that form RNA hybrids is investigated. Comparable numbers of overlapping transcripts were reported for humans by Yelin et al. (2,667; Ref. 72) and Chen et al. (2,940; Ref. 16). The two groups applied different criteria to scrutinize the orientation of cDNAs used for their studies. Yelin et al. required the transcripts to be spliced and to carry a polyadenylation signal, whereas Chen et al. accepted mRNAs with an open reading frame or, alternatively, a poly(A) tail and/or a polyadenylation signal. The in silico results were verified by Northern blotting using riboprobes or strand-specific PCR, respectively.
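The overlap criterion described above can be illustrated with a short sketch. It is not the pipeline used by any of the cited studies: the transcript names and coordinates are hypothetical, and each transcript is treated as a single genomic interval, so exon structure and the orientation checks applied by the individual groups are ignored. Only the 20 base pair threshold is taken from the text.

```python
from dataclasses import dataclass
from itertools import combinations

MIN_OVERLAP_BP = 20  # minimal overlap quoted in the text


@dataclass(frozen=True)
class Transcript:
    name: str    # hypothetical identifier
    chrom: str
    start: int   # 0-based, inclusive
    end: int     # 0-based, exclusive
    strand: str  # "+" or "-"


def overlap_length(a: Transcript, b: Transcript) -> int:
    """Number of shared genomic positions (0 if on different chromosomes)."""
    if a.chrom != b.chrom:
        return 0
    return max(0, min(a.end, b.end) - max(a.start, b.start))


def find_antisense_pairs(transcripts, min_overlap=MIN_OVERLAP_BP):
    """Return (name, name, overlap) for opposite-strand pairs meeting the threshold."""
    pairs = []
    for a, b in combinations(transcripts, 2):
        if a.strand == b.strand:
            continue  # same orientation: not a sense/antisense pair
        shared = overlap_length(a, b)
        if shared >= min_overlap:
            pairs.append((a.name, b.name, shared))
    return pairs


# Toy annotation (all values invented for illustration).
toy = [
    Transcript("geneA", "chr1", 1000, 5000, "+"),
    Transcript("geneA_as", "chr1", 4900, 8000, "-"),  # 100 bp shared with geneA
    Transcript("geneB", "chr1", 7990, 9000, "+"),     # only 10 bp shared with geneA_as
    Transcript("geneC", "chr2", 1000, 5000, "-"),
]
pairs = find_antisense_pairs(toy)
print(pairs)
```

In this toy set only geneA and geneA_as qualify; geneB and geneA_as are transcribed from opposite strands but share fewer than 20 positions, so they would fall outside the overlapping class under this simplified rule.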
Recent investigations in Drosophila melanogaster revealed large-scale antisense transcription also in this model organism. Annotation of the Drosophila euchromatic genome identified 1,027 sense/antisense pairs representing ∼15% of the 13,379 genes (44). This finding suggests that overlapping genes and the related regulatory mechanism(s) represent a well-conserved, thus beneficial, concept in the animal kingdom.
In plants, a comparable trend seems to prevail, although antisense transcription is less frequent than in animals. Annotation of full-length cDNAs revealed 601 overlapping clusters in rice (Oryza sativa) (50). The A. thaliana genome was found to contain 1,340 bidirectionally transcribed clusters (68). Recently developed large-scale sequencing strategies to assess the transcriptome, such as serial analysis of gene expression (SAGE) and massively parallel signature sequencing (MPSS), corroborate the scale of overlapping RNAs (12, 52). The widespread antisense transcription suggests that gene regulation by natural antisense transcripts is more prevalent than previously believed. The data on antisense clusters in the various genomes are summarized in Table 1.
REGULATORY TOOL OR TRANSCRIPTIONAL NOISE
The fact that so many eukaryotic genes overlap and potentially interfere with each other’s expression implies a physiological benefit of such a genomic arrangement. However, sheer numbers to justify a cause or “raison d’être” do not concur well with independent thinking, and the rather “nihilistic suggestion” was brought forward that most of the natural antisense transcripts represented mere transcriptional noise (14, 18). Such noise would not jeopardize the survival of the organism as long as tools were available to deal with potential RNA hybrids and/or other consequences of bidirectional transcription. Recent studies contradict this theory.
The relatively small number of eukaryotic genes encoded by the comparably large genomes would allow arrangements with every gene being regulated independently of its neighbors. Such individualism would help to reduce transcriptional noise. However, clustered or overlapping genes represent a common gene organization (1, 13, 64), and there is no evidence to suggest that such arrangements trigger negative selection. Whether overlapping genes are preferentially conserved during evolution is controversial, mostly because of a lack of established criteria indicating which features of overlapping genes are likely to be conserved. A study that compared overlapping protein-coding genes in human and mouse did not find increased sequence conservation of these genes (66). In addition, a significant fraction of the 95 overlapping genes identified in both human and mouse showed divergent overlap patterns. The conclusion that overlapping genes do not show preferred conservation is contradicted by a comparative study by Dahary, Sorek, and colleagues (18). The authors compared the genome organizations of puffer fish (fugu), mouse, and human. Their working hypothesis assumed that genes with overlapping transcripts are close neighbors and that, if antisense transcripts were a beneficial feature, selection would work against a separation of such gene clusters. Orthologs with potentially overlapping tail-to-tail arrangements were compared with consecutive, equally spaced genes in human and puffer fish. Of the 236 tail-to-tail pairs identified in human, 55 (23.3%) conserved their gene order and orientation in fugu. In contrast, only 13.5% (170 of 1,257) of consecutive genes on the same strand were conserved.
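To make the size of this difference concrete, the two conservation rates can be recomputed from the counts given above and compared with a pooled two-proportion z statistic. This is only a back-of-the-envelope sketch: the text does not state which statistical test the cited study applied, so the choice of test here is an assumption.

```python
import math


def two_proportion_z(x1: int, n1: int, x2: int, n2: int):
    """Return both proportions and the pooled two-proportion z statistic."""
    p1, p2 = x1 / n1, x2 / n2
    pooled = (x1 + x2) / (n1 + n2)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n1 + 1 / n2))
    return p1, p2, (p1 - p2) / se


# Counts quoted in the text: conserved pairs / all pairs examined.
p_tail, p_same, z = two_proportion_z(55, 236, 170, 1257)
print(f"tail-to-tail: {p_tail:.1%}, same strand: {p_same:.1%}, z = {z:.2f}")
```

A z value well above 2 indicates that a gap of this size between the two groups is unlikely to arise by chance alone under this simple model, which is consistent with the argument that tail-to-tail arrangements are preferentially retained.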
Another line of evidence that strongly supports a regulatory role of natural antisense RNAs comes from large-scale mapping of TFBS. A first approach combined chromatin immunoprecipitation using antibodies against cMyc, Sp1, and p53 with DNA array analysis. The high-density arrays represented the nonrepetitive sequences of human chromosomes 21 and 22 (14). Annotation of the signals localized 22% of all binding sites within 5 kb of the most 5′-exon of a gene. Remarkably, 36% of the sites were found either within the gene or in the 5 kb distal to the 3′-terminal exon. Activation of the latter TFBS results in potentially overlapping transcripts. The significance of the identified promoters was tested by profiling the expression of the cognate genes on DNA arrays. RNA from a pluripotent human cell line, NCCIT, was used before and after retinoic acid-induced differentiation into somatic cells. Fifteen percent of the coding RNAs (6% up, 9% down) showed a twofold or greater response to retinoic acid compared with 23% of the noncoding RNAs (6% up, 17% down). Overlapping RNA pairs showed mostly coordinate regulation, arguing against a mechanism where the antisense transcripts silence the cognate sense RNA (14). A comparable study investigating cAMP response element binding protein TFBS in the entire human genome corroborated the regulated expression of antisense transcripts (24).
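The twofold response criterion and the notion of coordinate regulation can be written down as a small sketch. The expression values below are invented, the function names are hypothetical, and the cited array study will have used its own normalization and statistics; the only element taken from the text is the twofold threshold.

```python
def classify_response(before: float, after: float, fold: float = 2.0) -> str:
    """Label a transcript by its change after treatment, using a fold threshold."""
    if before <= 0 or after <= 0:
        raise ValueError("expression values must be positive")
    ratio = after / before
    if ratio >= fold:
        return "up"
    if ratio <= 1 / fold:
        return "down"
    return "unchanged"


def pair_regulation(sense: tuple, antisense: tuple) -> str:
    """Compare the responses of a sense/antisense pair, each given as (before, after)."""
    s = classify_response(*sense)
    a = classify_response(*antisense)
    if "unchanged" in (s, a):
        return "no joint response"
    return "coordinate" if s == a else "inverse"


# Hypothetical signal intensities before and after retinoic acid treatment.
print(pair_regulation(sense=(10.0, 40.0), antisense=(5.0, 15.0)))
print(pair_regulation(sense=(10.0, 40.0), antisense=(20.0, 5.0)))
```

Under a simple silencing model, in which the antisense transcript suppresses its sense partner, one would expect mostly inverse pairs; the predominance of coordinate pairs reported above is what argues against that model.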
Yet, the strongest argument to corroborate a biological role of antisense transcripts comes from gene knockout experiments (33, 34, 62). A limited number of reports describe the deletion of specific antisense transcripts without affecting the structure of the cognate sense RNAs. In these cases, the antisense knockout caused aberrant expression of the sense transcript with a related phenotype. Unfortunately, such experiments have only been performed with a very small number of overlapping genes, including the antisense transcripts Air and Xist in mice and frq (frequency) in Neurospora crassa (33, 34, 62).
Various well-studied epigenetic phenomena that involve noncoding antisense RNA are related to monoallelic expression and include X chromosome inactivation, genomic imprinting, and allelic exclusion in lymphocytes. Genes related to these regulatory mechanisms are clustered and integrated into a regulatory network that includes differential DNA methylation and chromatin modifications; asynchronous DNA replication is also observed, likely as a consequence of the latter (19).
X chromosome inactivation describes a mechanism to balance the expression of X chromosome-encoded genes in females. The silencing of one of the X chromosomes is mediated by a large noncoding RNA (Xist) that recruits a DNA- and histone-modifying protein complex. An antisense transcript (Tsix) antagonizes the action of Xist (48, 59, 60); consequently, the X chromosome expressing Tsix remains active.
Imprinted genes are expressed either from the maternal or from the paternal allele. About 100 imprinted genes have been reported in humans and mice so far (http://www.mgu.har.mrc.ac.uk/research/imprinting/). Imprinted genes are grouped in clusters and often display both DNA methylation and noncoding antisense transcripts (25). A mouse gene knockout study focusing on the paternally expressed Air antisense transcript has recently corroborated the importance of noncoding RNA expression in imprinting. Air overlaps the Igf2r gene in antisense direction and suppresses the paternal expression of Igf2r, Slc22a2, and Slc22a3. Premature termination of the Air RNA resulted in both maternal and paternal expression of Slc22a2, Slc22a3, and Igf2r (57, 62). Further studies, however, indicated that parent-specific gene silencing did not require an RNA duplex formed between sense and antisense transcript (61). This finding was recently backed up by studies involving other imprinted gene clusters (63). The role of the noncoding transcript in imprinting may be closely related to Xist function in X chromosome inactivation. Whether such a mechanism represents a common theme of antisense action in imprinting, however, is not clear.
The diversity of immunoglobulins and T cell receptors results from recombination events at the relevant chromosome loci. With respect to the clonal selection theory, only one of the alleles undergoes recombination, whereas the other allele is silenced (9). As recently demonstrated for the V(D)J gene cluster, extensive antisense transcription occurs in genic and intergenic regions before and during recombination. Once the genes are joined, antisense transcription is rapidly turned off. The fact that the noncoding RNA was not restricted to one allele implies that antisense transcription does not represent the epigenetic mark for monoallelic expression but rather induces an open chromatin structure accessible to recombination (11, 17).
To summarize, noncoding antisense transcription related to monoallelic expression has an impact on a gene cluster rather than only on its overlapping sense transcript. Thus the term “noncoding transcript” may be more appropriate than “antisense RNA” in this context. The noncoding transcript may act as a chromatin opener or scaffold for DNA- and/or histone-modifying enzymes. These hallmarks, however, do not apply to a large proportion of identified antisense transcripts. Consequently, one or more additional groups of antisense transcripts exist that may exhibit different mechanistic and physiological properties than noncoding RNAs related to monoallelic gene expression.
NATURAL ANTISENSE TRANSCRIPTS: MECHANISMS
The synthesis of overlapping transcripts potentially interferes with the RNA processing at different levels. DNA methylation, transcriptional interference, impaired splicing, or RNA export as well as mechanisms triggered by double-stranded RNA such as RNA editing and RNA interference/micro-RNA synthesis may represent consequences of antisense transcription.
DNA methylation is closely linked to the transcriptional activity of specific areas of the genome. Repetitive DNA sequences, parasitic DNA, and promoter regions of silenced genes carry methylated C residues in the sequence CG. Recently, it has been shown that double-stranded RNA induces the methylation/silencing of the corresponding genes (27, 41). An interesting example of antisense RNA-induced DNA methylation has recently been reported by Tufarelli, Higgs, and colleagues (65). They found a deletion in the globin gene locus that placed the constitutively expressed LUC7 gene in close proximity to the Homo sapiens hemoglobin-α2 (HBA2) gene in antisense direction. The LUC7-derived antisense transcript induced methylation of the CpG island of the HBA2 promoter. Transcription of the globin gene was abolished, leading to the α-thalassaemia phenotype. Demethylation of a CpG island and non-CpG methylation have recently been reported for the Sphk1/Khps1 overlapping genes. The two RNAs overlap in the 5′-region in a head-to-head arrangement including the CpG island of the Sphk1 promoter. Overexpression of the antisense transcript in the rat kidney-derived cell line NRK resulted in the demethylation of the CpG island and the de novo methylation of three non-CpG sites. In addition, sense and antisense transcription were found to be mutually exclusive at the single-cell level (23).
The latter finding indicates that transcriptional interference may occur. The competition of two RNA polymerase II complexes has been quantitatively assessed by Prescott and Proudfoot (51); they found that the elongation of overlapping transcripts was prominently affected. Whether such steric interference has indeed regulatory potential needs to be established.
The best-characterized example of overlapping transcripts interfering at the level of splicing is that of the thyroid hormone receptor-α (TR-α) and the antisense-oriented Rev-ErbAα. The mammalian TR-α gene expresses two splice forms, TR-α1 and TR-α2. TR-α2 lacks hormone-binding capacity and acts as a dominant repressor of TR-α1. The antisense transcript Rev-ErbAα shifts the processing of the TR-α hnRNA toward formation of the α1-isoform in a process that requires the overlap of Rev-ErbAα and TR-α2. This special genomic arrangement, however, is only found in mammals and derives from Rev-ErbAα gene duplication (22, 46).
Antisense RNA-mediated nuclear retention was reported in the context of hyperedited RNA duplexes (73). RNA editing describes the conversion of adenosine to inosine in RNA duplexes by an enzyme family called adenosine deaminase acting on RNA (ADAR) (5, 38). RNA editing is performed before or during splicing and causes the retention of the RNA in the nucleus (35, 45). The process eventually induces chromatin changes and concomitant downregulation of the cognate gene (67). RNA editing is one of the nuclear strategies to deal with double-stranded RNA. However, the timing of editing and the fact that natural targets of RNA editing are predominantly intronic sequences argue against a major role of ADAR in natural antisense RNA processing (35).
RNA editing is thought to constitute part of the nuclear defense strategy against RNA duplexes, RNA interference being another component. RNA interference (RNAi) involves a protein complex that recognizes double-stranded RNA (Dicer) and cleaves the targeted RNA duplex into small oligonucleotides of 21–23 base pairs (siRNAs). The strands are separated and become part of the RNA-induced silencing complex (RISC). RISC eventually degrades the cognate target mRNA with exquisite specificity (42, 43). In addition, RNAi was found to feed into chromatin-based gene silencing and DNA methylation (20, 39, 40, 58).
Micro-RNAs are short endogenous RNA molecules that undergo a maturation process similar to siRNAs. They derive from imperfect endogenous hairpin structures that are first processed by a Dicer-related RNase called “Drosha.” Micro-RNAs are important regulators of gene expression during development and are thought to interfere with the translation of the target gene (4). All these processes have been extensively reviewed; only a few aspects that are relevant for the processing of natural sense/antisense hybrids will be highlighted here.
Originally, a relatively strict separation was made between the initial steps of micro-RNA processing and RNA interference. Drosha recognizes the characteristic loop of micro-RNA precursors (47) and is confined to the nucleus, whereas Dicer was attributed to the cytoplasm. This view conflicted with the requirement for Dicer in RNA-induced heterochromatin formation. Drosha was unlikely to be involved in the processing of perfect RNA duplexes as formed by sense/antisense RNA pairs, and Dicer was thought to be cytoplasmic. A recent report now demonstrates the specific degradation of a nuclear RNA (7SK) in siRNA-transfected HeLa cells (54). The results were reproduced in nuclear extracts showing siRNA-mediated cleavage of GFP RNA and transcription elongation factor Cdk9 mRNA as well as the correct processing of let7 micro-RNA precursor. As a consequence, overlapping regions of endogenous sense/antisense pairs may feed into the RNAi or the micro-RNA machinery even if hybridization occurred in the nucleus. The fact that, in Arabidopsis, 11 siRNAs map to complementary regions of overlapping genes supports this hypothesis (68). This prediction was enabled by the fact that, in plants, both si- and micro-RNAs show perfect complementarity with the target sequence (6, 7). In other eukaryotes, however, mismatches occur between micro-RNAs and their target sequence, making a putative link difficult to establish (4).
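The idea of mapping small RNAs to the complementary region of an overlapping gene pair can be sketched as an exact string search on both strands. The sequence below is a made-up toy, not data from the Arabidopsis study, and the function names are hypothetical. Exact matching mirrors the perfect complementarity noted for plants; it would miss animal micro-RNA targets, where mismatches occur.

```python
COMPLEMENT = str.maketrans("ACGU", "UGCA")


def reverse_complement(rna: str) -> str:
    """Reverse complement of an RNA sequence written 5' to 3'."""
    return rna.translate(COMPLEMENT)[::-1]


def map_small_rna(small_rna: str, overlap_rna: str):
    """Return (strand, position) for every perfect match in the overlap region.

    'sense' positions index the given sequence; 'antisense' positions index
    its reverse complement, i.e. the antisense transcript read 5' to 3'.
    """
    hits = []
    targets = (("sense", overlap_rna), ("antisense", reverse_complement(overlap_rna)))
    for strand, target in targets:
        start = target.find(small_rna)
        while start != -1:
            hits.append((strand, start))
            start = target.find(small_rna, start + 1)
    return hits


# Hypothetical sense-strand sequence of the region shared by both transcripts.
overlap = "AUGGCUAGGCUUACGGAUCCGAUUCAGGCAUCGAAGCUUGACC"

# A 21-nt small RNA derived from the antisense strand of that region.
small_rna = reverse_complement(overlap[5:26])
hits = map_small_rna(small_rna, overlap)
print(len(small_rna), hits)
```

A small RNA that matches either strand of the overlap could in principle have been cut from the sense/antisense duplex, which is why such hits are read as support for the hypothesis; a match alone does not prove that origin.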
General conclusions concerning the mechanism of natural antisense transcripts can be drawn from the comparison of overlapping and nonoverlapping antisense transcription on autosomes and the X chromosome. Interestingly, antisense transcripts that can form an RNA hybrid with their cognate sense transcript are significantly underrepresented on the X chromosome, whereas nonoverlapping antisense RNAs are not (16, 31). This implies that a hypothetical antisense-based regulatory mechanism would require transcription from both alleles and an RNA overlap.
To conclude, the different mechanisms discussed above are often limited to small groups of overlapping genes or to special experimental setups. Which aspects of mechanism or biological function apply to a wider group of antisense RNAs remains speculative. A general mechanism is likely to involve biallelic expression and an RNA duplex, in clear contrast to the noncoding transcripts mediating monoallelic expression.
NATURAL ANTISENSE TRANSCRIPTS: PHYSIOLOGY
Attempts to categorize overlapping genes resulted in controversial suggestions: DNA repair genes, chaperones, and DNA helicases were found to be overrepresented in bidirectionally transcribed loci (64); on the other hand, natural antisense RNAs are frequently related to transcription factor-encoding genes (69). A recent study by Chen et al. (15) argued that overlapping genes have a short response time. As a consequence, the size of introns would be reduced in these genes. A small but significant decrease in intron size was indeed found in overlapping genes (15). In addition to these general considerations, there are numerous reports characterizing natural antisense transcripts of protein-coding genes from vertebrates, without, however, a recognizable preference toward a particular protein family. The following short list of examples of overlapping genes is by no means complete and is rather meant to demonstrate the intricacy of antisense RNA-mediated gene regulation.
The fibroblast growth factor-2 (Fgf-2) gene and the antisense Gfg-2 transcript were among the first overlapping genes to be identified (28). The genomic arrangement, tail-to-tail with the last exon of both transcripts overlapping, is well conserved between Xenopus and human. The sense and the antisense transcripts show an inverse expression level in a variety of species, tissues, and cell lines (32). Interestingly, the transcript ratio varies in lymphoid cell lines, indicating a role of antisense regulation in hematopoietic malignancies (3). The open reading frame found on the antisense transcript complicates the picture. Translation gives rise to a nucleotide phosphohydrolase of the MutT/NUDIX family as demonstrated in rat and human. Overexpression of the antisense transcript in pituitary-derived GH4 cells had an anti-proliferative effect that was independent of Fgf-2 expression (2). A comparable situation, in which a biological effect could be related to either the sense/antisense RNA overlap or the antisense-encoded protein, is found for the epithelial Na-phosphate cotransporter. The overlapping arrangement of the two genes is well conserved between fish and mammals, but the intron/exon structure of the antisense transcript is not. The antisense transcript is noncoding in fish, whereas in humans it contains fewer and shorter introns and encodes a putative protein of the profilin family (70, 71).
Antisense regulation governs the expression of the two isoforms of the myosin heavy chain (α- and β-MHC) and thus the contractility of the mammalian heart. The relative expression level of α- and β-MHC is highly regulated during development and changes under pathophysiological conditions. The two genes are arranged head to tail with the α-isoform downstream of β-MHC. The promoter driving the α-isoform is bidirectional, leading to a parallel expression of sense α-MHC mRNA and antisense β-MHC RNA (21, 37). Transgenic mice with a β-MHC promoter reporter gene construct provide indirect evidence for a pivotal role of the antisense transcript in MHC isoform switching (53).
The final example to be mentioned here is the Msx1 gene encoding a homeobox-containing transcription factor. The cognate untranslated antisense transcript overlaps most of the second exon of Msx1 including the homeobox. The expression of both the sense and the antisense transcripts is tightly controlled during mouse development (10). Patterns of complementary expression were observed during early and late craniofacial development in mice (day 10.5 and day 16.5), whereas the transcripts colocalized during early tooth development (days 11.5–16.5). These very carefully performed in situ hybridization experiments were combined with monitoring of protein expression to account for antisense-induced degradation of the sense mRNA (10). However, protein levels largely followed sense mRNA levels, arguing against a direct inhibitory effect of the antisense transcript. The Msx1 study is in some ways exemplary of the technical difficulties of research into natural antisense transcripts (8, 10, 36). Because the mechanism triggered by natural antisense RNA is unknown, changes in sense, antisense, and protein levels may all be relevant. A comprehensive compilation of antisense and other noncoding RNA can be found at http://www.bioinfo.org.cn/NONCODE/index.htm.
ANTISENSE TECHNICAL CORNER: THERE IS NO EASY WAY OUT
Natural antisense transcripts are usually detected serendipitously in the course of experiments aimed at characterizing the sense transcript. Antisense RNAs appear as additional bands on Northern blots (hybridized with a double-stranded probe), “unspecific” fragments in RT-PCRs, or “false-positive” signals during in situ hybridization experiments (http://www.narna.ncl.ac.uk).
After identification and (partial) sequencing of an antisense transcript, bioinformatics tools provide an estimate as to whether the signal is real or rather an experimental artifact (for mouse genes, the RIKEN group antisense viewer is another helpful tool; http://genome.gsc.riken.go.jp/m/antisense/). Antisense-oriented expressed sequence tags (ESTs) or additional genes in the proximity of the sequence of interest may point to an additional splice isoform and help to clone the cDNA ends. The physiological characterization of antisense transcripts is tedious, as indicated for the investigations into Msx1. The major obstacles are the low expression levels of the antisense RNA and the often limited choice of primers and probes due to the sense/antisense overlaps. The consequences of antisense interaction are still unpredictable, rendering the interpretation of expression patterns difficult. Theoretically, a lack of expression could produce Northern blots, PCRs, or in situ hybridizations similar to those obtained when sense and antisense RNA are coexpressed and the hybrid is subsequently degraded.
A key obstacle in investigating the physiological impact of antisense transcription is the lack of tools for routine assessment of large-scale sense/antisense expression patterns. Representations of the entire transcriptome by SAGE or MPSS include antisense transcripts, and these data could be mined with a focus on sense/antisense transcription; however, these techniques are not suited for routine analysis. Similar restrictions apply to DNA arrays: antisense transcripts can be detected with purpose-made antisense-specific chips or with tiling arrays that cover entire chromosomes (26, 30), but neither type of array is commercially available. Rosok and Sioud (55, 56) have recently reported a method to clone antisense RNAs on a large scale, exploiting the hybridization of overlapping transcripts. This method is barely applicable to routine assessments of sense/antisense expression and selects for sense/antisense pairs that are coexpressed. A solution to this technical problem would boost the understanding of antisense RNA-mediated gene regulation and of processes where altered sense/antisense transcription is documented, such as embryonic development or cancer biology.
In conclusion, attempts to make sense of antisense resemble solving a jigsaw puzzle of a blurred picture with an overwhelming number of parts. To complicate things further, some of the parts may have identical shapes but different looks. In this review, we suggest distinguishing between noncoding transcripts involved in monoallelic gene expression and natural antisense transcripts. In addition, we accumulate indirect evidence for a mechanism that involves double-stranded RNA during the processing of antisense transcripts. We speculate that the pathway may result in RNA-mediated signaling, much like micro-RNAs or siRNAs. Only experimental scrutiny will tell whether the part of the puzzle is placed correctly or needs relocating.
We thank Gavin McHaffie and Mark Carlile for stimulating discussions and help with the manuscript.
Article published online before print. See web site for date of publication (http://physiolgenomics.physiology.org).
Address for reprint requests and other correspondence: A. Werner, Institute for Cellular and Molecular Biosciences, The Medical School, Framlington Place, Univ. of Newcastle, Newcastle upon Tyne NE2 4HH, UK.
- Copyright © 2005 the American Physiological Society