In mammals, cilia are critical for development, sensation, cell signaling, sperm motility, and fluid movement. Defects in cilia are causes of several congenital syndromes, providing additional reasons to identify cilia-related genes. We hypothesized that mRNAs selectively abundant in tissues rich in highly ciliated cells encode cilia proteins. Selective abundance in olfactory epithelium, testes, vomeronasal organ, trachea, and lung proved to be an expression pattern uniquely effective in identifying documented cilia-related genes. Known and suspected cilia-related genes were statistically overrepresented among the 99 genes identified, but the majority encoded proteins of unknown function, thereby predicting new cilia-related proteins. Evidence of expression in a highly ciliated cell, the olfactory sensory neuron, exists for 73 of the genes. In situ hybridization for 17 mRNAs confirmed expression of all 17 in olfactory sensory neurons. Most were also detected in vomeronasal sensory neurons and in neighboring tissues rich in ciliated cells such as respiratory epithelium. Immunoreactivity for one of the proteins identified, Spa17, colocalized with acetylated tubulin in the cilia layer of the olfactory epithelium. In contrast, the ciliary rootlet protein, Crocc, was located in discrete structures whose position was consistent with the dendritic knobs of the olfactory sensory neurons. A compilation of >2,000 mouse genes predicted to encode cilia-related proteins revealed a strong correlation (R = 0.99) between the number of studies predicting a gene's involvement in cilia and documented evidence of such involvement, a fact that simplifies the selection of genes for further study of the physiology of cilia.
- coiled-coil domain
- gene expression
Nearly all vertebrate cells have a single primary cilium. While its function in many cell types is a mystery, available evidence suggests the broad conclusion that the primary cilium is important for signaling and cell polarity (27, 30). A few cell types have elaborated and expanded on this single cilium, generating specialized structures that perform critical functions of several broad types: sensation, development, fluid movement, sperm motility, and cell signaling. The functional significance of cilia in tissues is reflected in the severity and diversity of pathologies caused by defects in cilia. These include anosmia, retinitis pigmentosa and retinal degeneration, polycystic kidney disease, diabetes, neural tube defects and neural patterning defects, chronic sinusitis and bronchiectasis, obesity, heterotaxias, polydactyly, and infertility (3). This list includes defects in highly specialized cilia, exemplified by retinitis pigmentosa and anosmia, as well as defects in primary cilia, exemplified by polycystic kidney disease (23, 24). Defects in cilia are therefore underlying causes of several congenital disease syndromes with pleiotropic symptoms. These include Alstrom syndrome, Bardet-Biedl syndrome, Kartagener syndrome, Meckel-Gruber syndrome, Senior-Loken syndrome, orofacial digital syndrome 1, Joubert syndrome, and primary ciliary dyskinesia (3). Investigation of mammalian cilia genes will enhance not only our knowledge of ciliary function but also of dysfunctions that contribute to significant human health problems.
The function of cilia in cell signaling is evident in the specialization of cilia for sensation. In the mammalian olfactory epithelium, for example, olfactory cilia are elaborated during the maturation of olfactory sensory neurons (OSNs). This event is a defining feature that distinguishes a mature OSN from its immature antecedent (8, 28). These nonmotile cilia are the sites where odorants interact with odorant receptors, triggering G protein-dependent generation of cAMP and subsequent electrical activity in the OSN. Not surprisingly, the bioinformatics of expression profiling of OSN mRNAs consistently identifies cilia/flagella and ciliogenesis/spermatogenesis as overrepresented biological processes in OSNs (26, 29). This link between OSNs and sperm, and our observation that mRNAs linked to cilia were detected in both OSNs and in cells of the neighboring respiratory epithelium, another tissue enriched in highly ciliated cells, led to the hypothesis that tissue expression pattern could be used as an ab initio approach to identify genes whose products are important to cilia.
The ability of bioinformatics and expression profiling to identify cilia-related genes has been documented several times. Comparison of the genomes of ciliated and nonciliated organisms predicts hundreds of cilia-related genes in Chlamydomonas reinhardtii and Drosophila melanogaster (2, 18). The presence of X box promoter elements, likely targets of the cilia-related transcription factors of the Rfx family, is also a predictor of cilia-related genes (10, 34). Proteomic analyses of cilia-enriched samples from C. reinhardtii, Tetrahymena aurelia, and cultured human respiratory epithelial cells predict cilia proteins (21, 22, 31). Expression profiling that compared ciliated vs. nonciliated cells in C. elegans, or that identified mRNAs that increased during replacement of flagella in C. reinhardtii, predicts additional cilia-related genes (5, 32). Although previous efforts have been substantial, we reasoned that additional cilia-related genes remain to be identified. This is especially true in mammals, which were the focus of only one of these earlier studies. We therefore compared our data on gene expression in mouse OSNs with gene expression patterns across >60 mouse tissues to predict mouse cilia-related genes (26, 33). We identified 99 putative cilia-related genes, many of which had not previously been linked to cilia and most of which encode proteins of unknown function.
MATERIALS AND METHODS
Mouse tissue expression patterns in which mRNAs were most abundant in olfactory epithelium, testes, vomeronasal organ, trachea, and lung were identified via SymAtlas, a repository of microarray data on mRNA abundance in many mouse tissues (33). Seven such genes were identified. The correlation function resident in SymAtlas was applied to these seven mRNAs to identify additional mRNAs sharing their expression patterns. Each of the seven lists of correlated mRNAs was ordered by the strength of correlation and then truncated before the first mRNA whose abundance pattern failed to have two of the five target tissues as the tissues with the largest amounts of that mRNA. This threshold corresponded to correlation coefficients that ranged from 0.90 to 0.99 for the seven mRNAs tested. We then combined the seven correlations, tracking how many correlation sets included each mRNA and excluding genes whose mRNAs were found in only one correlation set. As a filter for evidence of expression in highly ciliated cells, we then used expression profiling data generated from separating the olfactory epithelium into purified OSNs, termed the OSN+ sample, and a sample of all other cells (the OSN− sample), expecting to delete any genes whose mRNAs lacked evidence of expression in OSNs (26). The filter criterion for exclusion was an OSN+ sample-to-OSN− sample (OSN+/OSN−) ratio <0.5. Ratios higher than this predict expression in OSNs with accuracies ranging from 83% (ratios of 1.2–0.5) to 99% (ratios ≥1.3). However, none of the mRNAs failed this criterion. All either had evidence of expression in OSNs or were not represented in the filter criterion data. Previous evidence linking candidate genes to cilia was detected by searching publications on each gene via PubMed and by searching in http://www.ciliome.com in April, 2007 (15). Documented cilia-related genes were identified by searching PubMed using the keyword strategy cili* OR flagel* for the period January, 1996, to April, 2007. 
Thousands of titles and abstracts were inspected manually to identify those that documented genes directly linked to cilia. Protein domains were identified by use of the simple modular architecture research tool (SMART; http://smart.embl-heidelberg.de/) (17). Kaleidagraph version 3.6 was used to plot data and for curve fitting.
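The list-building steps described above (truncating each correlation list, combining the seven lists, and applying the OSN+/OSN− filter) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the procedure as actually run in SymAtlas: the function names and data layout are hypothetical, and the pattern test assumes that "two of the five target tissues as the tissues with the largest amounts" means that the two most abundant tissues both belong to the target set.

```python
from collections import Counter

# The five target tissues named in the text.
TARGET_TISSUES = frozenset(
    {"olfactory epithelium", "testes", "vomeronasal organ", "trachea", "lung"}
)


def has_target_pattern(abundance, n_top=2):
    """True if the n_top most abundant tissues are all target tissues.

    Assumed reading of the truncation rule; abundance maps tissue -> level.
    """
    top = sorted(abundance, key=abundance.get, reverse=True)[:n_top]
    return all(tissue in TARGET_TISSUES for tissue in top)


def truncate_correlated(ranked_genes, profiles):
    """Keep genes, ordered by decreasing correlation, up to the first failure."""
    kept = []
    for gene in ranked_genes:
        if not has_target_pattern(profiles[gene]):
            break  # truncate before the first mRNA that fails the pattern
        kept.append(gene)
    return kept


def combine_sets(correlation_sets, min_sets=2):
    """Count the sets containing each gene; drop genes found in only one set."""
    counts = Counter(gene for genes in correlation_sets for gene in set(genes))
    return {gene: n for gene, n in counts.items() if n >= min_sets}


def osn_filter(genes, osn_ratio, cutoff=0.5):
    """Exclude genes with an OSN+/OSN- ratio below cutoff; keep genes without data."""
    return [g for g in genes if osn_ratio.get(g) is None or osn_ratio[g] >= cutoff]
```

Genes absent from the filter data are retained rather than excluded, mirroring the statement that all mRNAs either had evidence of expression in OSNs or were not represented in the filter criterion data.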
In situ hybridization.
Wild-type C57Bl/6J mice of both sexes, aged postnatal day 21 through postnatal day 28 (P21–P28), were used. All procedures involving animals were approved by the institutional animal care and use committee of the university. In situ hybridizations were performed as described previously (26, 29, 39). Briefly, mice were transcardially perfused with ice-cold 4% paraformaldehyde. The dorsal regions of the snout and anterior cranium were dissected, cryoprotected, stored at −80°C, and cut in 10- to 14-μm coronal sections on a cryostat. One to two digoxigenin-labeled riboprobes, typically ∼500 bp in length, were prepared for each mRNA and hybridized in 50% formamide in 10 mM Tris·HCl (pH 8.0), 10% dextran sulfate, 1× Denhardt's solution, 600 mM NaCl, 0.25% SDS, 1 mM EDTA, and 200 μg/ml yeast tRNA at 65°C (1 ng/μl per riboprobe) on cryosections mounted on slides. Detection was done using an alkaline phosphatase-conjugated antibody to digoxigenin and hydrolysis of nitro blue tetrazolium chloride/5-bromo-4-chloro-3′-indolylphosphate p-toluidine. Sense and antisense probes were always tested simultaneously, and the sense controls were invariably negative. Digital wide-field images were obtained at room temperature using a Spot 2e camera and Spot software version 4.0.6 on a Nikon Diaphot 300 inverted microscope with a 4×/0.13 numerical aperture (NA) Plan objective and a 40×/0.75 NA Plan Fluor objective. Images were processed in Adobe Photoshop version 7.0 by adjusting size, brightness, and contrast and using the dodging tool to eliminate shadows (only in open areas that lacked tissue) to improve the consistency of illumination. Images were then combined and labeled using Deneba Canvas version 8.0.
Cryostat sections at 10-μm thickness on slides (Fisher Superfrost Plus) were prepared from wild-type C57Bl/6J mice of both sexes, aged P21–P28, and fixed in 4% paraformaldehyde as described for in situ hybridization above. Slides were dried with a hair dryer for 20 s and washed in PBS for 5 min, PBS containing 1% Triton X-100 for 30 min, and PBS for 5 min. Blocking was then done with 2% BSA and 0.4% Triton X-100 in PBS for 1 h in a hydrated chamber at room temperature. Primary antibodies were a mouse monoclonal against acetylated tubulin (Sigma no. T6793) at 1/1,000 dilution and rabbit polyclonal antisera against Spa17 (16) at 1/200 and Crocc (Root6) (38) at 1/5,000. The specificity of these antibodies and their utility for immunohistochemistry are well documented (16, 37, 38). Slides were incubated in primary antibodies overnight in the blocking solution at 4°C in a hydrated chamber. Washing was in PBS for 10 min, PBS containing 0.5% Tween 20 for 10 min, and PBS for 10 min. Slides were incubated in cyanine-3 (Cy3)-conjugated anti-rabbit secondary antibody in PBS for 1 h in a hydrated chamber at room temperature in the dark. Washing was in PBS for 10 min, PBS containing 0.5% Tween 20 for 10 min, and PBS for 10 min. Slides were then incubated in FITC-conjugated anti-mouse secondary antibody in PBS for 1 h in a hydrated chamber at room temperature in the dark. Washing was in PBS for 10 min twice in the dark. Hoechst-33258 at a 1/1,000 dilution was applied in PBS for 5 min in the dark, followed by 5 min in PBS and a rinse with distilled water. Slides were dried at room temperature in the dark for 1 h and then mounted with Vectashield (Vector Laboratories) and a coverslip.
A summary of 2,128 mouse genes linked to cilia by previous high-throughput approaches (15), by published studies of individual gene products, and by the work described herein was compiled as an Excel workbook file (Supplemental File 1; supplemental data are available at the online version of this article). The present Entrez gene IDs and gene symbols were verified, and gene ontology annotation was added to the file. Each gene was also searched against PubMed [gene symbol AND (cili* OR flagel*)] for published evidence linking the mouse gene or putative orthologs to cilia or flagella. This searchable file therefore lists identifying information about each gene, functional information about the encoded protein, the number of high-throughput studies linking the gene to cilia, and PubMed ID numbers for at least one study documenting relationships to cilia.
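The per-gene PubMed search pattern given above, [gene symbol AND (cili* OR flagel*)], can be generated programmatically when many genes must be checked. The sketch below only builds the query string and a corresponding search URL; it does not perform any request. Use of the NCBI E-utilities esearch endpoint is an assumption made for illustration, because the text does not state how the searches were submitted, and the function names are hypothetical.

```python
from urllib.parse import urlencode

# NCBI E-utilities search endpoint (assumed here; not named in the text).
ESEARCH_URL = "https://eutils.ncbi.nlm.nih.gov/entrez/eutils/esearch.fcgi"


def build_query(gene_symbol):
    """Return the PubMed query pattern described in the text for one gene."""
    return f"{gene_symbol} AND (cili* OR flagel*)"


def build_search_url(gene_symbol):
    """Return an esearch URL for the query; no network access is performed."""
    params = {"db": "pubmed", "term": build_query(gene_symbol)}
    return f"{ESEARCH_URL}?{urlencode(params)}"
```

For example, `build_query("Foxj1")` yields the string `Foxj1 AND (cili* OR flagel*)`. Hits returned by any such search would still require manual inspection of titles and abstracts, as described above.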
RESULTS AND DISCUSSION
Identification of mouse cilia-related genes by tissue expression pattern.
We reasoned that the core elements of cilia and cilia support structures (such as the axoneme, basal body, and intraflagellar transport particles) are enriched in highly ciliated cell types, loosely defined as cells that elaborate large or multiple cilia. We identified seven mRNAs that shared expression in mature OSNs and a subset of cells in neighboring respiratory epithelium and found that they shared a tissue expression pattern of high abundance in tissues with numerous highly ciliated cells: olfactory epithelium, testes, vomeronasal organ, trachea, and lung (Fig. 1). We used these mRNAs to identify 99 mouse genes whose mRNAs had highly correlated tissue expression patterns (Table 1). Of these, 39 are the subject of papers archived in PubMed, and of these 39, 18 are directly linked to cilia or flagella by these publications. This proportion is significant even if we make the cautious prediction that 25% of all genes are linked to cilia (binomial test; H0:cilia gene fraction = 0.25; B* = 3.05, P < 0.001). The frequency with which mRNAs were detected in the seven tissue expression pattern correlations was not correlated with the proportion with proven links to cilia as determined by articles archived in PubMed. This finding argues that even those mRNAs whose abundance patterns were highly similar to only two of the seven test mRNAs are not less likely to be important to cilia.
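The binomial statistic reported above can be reproduced with a short calculation. The sketch assumes that B* denotes the standardized (large-sample, normal approximation) binomial statistic, an interpretation that is not stated explicitly in the text but is consistent with the reported value; the function name is hypothetical.

```python
import math


def binomial_z(successes, n, p0):
    """Standardized binomial statistic under H0: true proportion = p0."""
    expected = n * p0                    # expected count under H0
    sd = math.sqrt(n * p0 * (1.0 - p0))  # binomial standard deviation under H0
    return (successes - expected) / sd


# 18 of the 39 genes with published work are directly linked to cilia; H0 fraction = 0.25.
b_star = binomial_z(18, 39, 0.25)
print(round(b_star, 2))  # 3.05
```

Under this reading, the expected count is 39 × 0.25 = 9.75 and the standard deviation is about 2.70, so the observed 18 genes lie roughly three standard deviations above expectation.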
Confirmation of expression in OSN, a highly ciliated cell.
If the mRNAs we identified do indeed encode proteins important to cilia, then these mRNAs must be expressed in the ciliated cells of these tissues. We tested this by in situ hybridization in the olfactory epithelium where the only highly ciliated cells, mature OSNs, can be unambiguously identified in tissue sections because of the position of their cell bodies (Fig. 2A). Published evidence already indicates that 73 of the mRNAs are expressed in OSNs (Table 1; OSN+/OSN− ratios >0.8) (26). We tested 17 mRNAs, including three that have proven links to cilia (4933434I06Rik, Ift74, and 1110017D15Rik). All 17 mRNAs were detected in the mature OSN layer of the epithelium. Examples are shown in Fig. 2, B–O, and the data are summarized in Table 1 [column titled ISH (in situ hybridization)]. Thirteen of the mRNAs were also detected in the neurons of the vomeronasal organ (not shown, but summarized in Table 1). These tissue sections also contain neighboring regions with other highly ciliated cells, such as respiratory epithelium and nasal gland linings. Of the 17 mRNAs, only A330021E33Rik failed to unequivocally label subsets of cells in these epithelia. Examples of labeling of respiratory epithelium and nasal glands can be seen in the low-magnification images of Fig. 2, B′–O′. Even though the ciliated cells in respiratory epithelium and nasal glands cannot be unambiguously identified by in situ hybridization because their cell bodies are intermingled with other cell types, the presence of the mRNAs in these epithelia is additional evidence supporting the conclusion drawn directly from the localization of the mRNAs to OSNs and vomeronasal sensory neurons. We conclude that our tissue expression pattern approach identified mRNAs expressed primarily by highly ciliated cells.
The cell and tissue expression patterns of the 99 mRNAs we identified were consistent with the expectation that many of the encoded proteins reside in cilia or cilia support structures. However, some may instead have functions in other cellular compartments, such as nuclei or vesicular trafficking compartments, yet still be important to cilia. None of the novel proteins we linked to cilia is as yet a target of useful antibodies. However, our analyses predicted that Spa17 is present in olfactory cilia. Spa17 has previously been localized to sperm flagella and cilia of the respiratory tract and reproductive organs and is known to be expressed by OSNs, but it has not been localized to the cilia of OSNs (6, 12, 16). Not surprisingly, we observed that Spa17 immunoreactivity in the olfactory epithelium overlaps completely with immunoreactivity for acetylated tubulin, a marker for cilia (Fig. 2P). In contrast, immunoreactivity for the ciliary rootlet protein, Crocc, did not (Fig. 2Q). Crocc is the major component of ciliary rootlets that extend basally from the basal bodies of cilia (37, 38). We observed Crocc immunoreactivity in round structures just below immunoreactivity for acetylated tubulin, consistent with its expected location in the dendritic knobs from which olfactory cilia arise. Immunoreactivity for Spa17 and Crocc in neighboring respiratory epithelium gave similar patterns. Spa17 overlapped completely with acetylated tubulin labeling of cilia, and Crocc immunoreactivity was found directly below the cilia layer (data not shown). These data provide additional reasons to expect that many of the 99 genes we identified will prove to be important to cilia.
Tissue expression pattern predictions overlap with other predictive approaches.
As described above, several other methods have been used to predict cilia-related genes with some degree of success. We hypothesized that our method would overlap significantly with these previous efforts. Indeed, 39 of our 99 genes are probable orthologs of genes previously predicted to be related to cilia (15). This proportion is greater than a random sampling of genes should yield (binomial test; H0:cilia gene fraction = 0.25; B* = 6.37, P < 0.0001). Of these 39, 27 as yet have no direct evidence of importance to cilia. The mouse genes we identified that are probable orthologs of genes previously linked to cilia were not limited to orthology with mammalian genes. Just six were probable orthologs of genes previously linked to cilia only in humans. Instead, most were orthologs of nonmammalian genes, or of both human and nonmammalian genes.
The basic logic of our approach, an expression pattern largely restricted to tissues with large numbers of highly ciliated cells, appeared to be uniquely consistent with cilia genes. We were unable to find other patterns involving other tissues where cilia have important functions, such as pituitary, retina, kidney, and cartilage/bone (11, 14, 36), that had even a few mRNAs whose abundance was so highly correlated (coefficients >0.9). This tight correlation corresponded with rough estimates of the proportion of highly ciliated cells in the tissues, with testes, olfactory epithelium, vomeronasal organ, trachea, and lung having substantial proportions, whereas these other tissues have relatively few highly ciliated cells, or in the case of the retina and its photoreceptors, have cells whose cilia have diverged into a very specialized structure. The frequency with which mRNAs were detected in the seven tissue expression pattern correlations was not correlated with the proportion of them listed in http://www.ciliome.com (15). This finding is consistent with the expectation that our tissue expression pattern approach does not share biases inherent in previous approaches.
The reliability of our search strategy was better than that of previous methods, judging by the frequency of genes shared with other predictive studies. Genes identified by multiple high-throughput approaches would seem more likely to be true cilia-related genes than those detected only once. The frequency of identification of predicted cilia genes is listed at http://www.ciliome.com, which summarizes previous work in this area (Table 1) (15). Only 22% of the genes listed were identified by multiple studies. For the genes we identified, the proportion was 48%. Overall, the available evidence suggests that the proportion of true cilia-related genes in our list will prove to be substantial.
Many candidates are genes of unknown function.
Like previous bioinformatics approaches to identify cilia-related genes, our tissue expression pattern approach was not biased by gene type. We detected genes involved in signaling, transcriptional regulation, protein chaperone function, cytoskeletal structure, and cytoskeletal dynamics, all of which are functions needed by cilia. However, what was remarkable about the genes as a group was how little we know about them. For 60 of them, we found no evidence of published work on them or their products. Because our approach was unbiased, these genes could encode products involved in virtually any aspect of cilia function or support of cilia, such as the axoneme, basal body, protein trafficking, and transcription of other cilia-related genes. Expression patterns limited to tissues enriched in motile cilia (testes, trachea, lung) were unsuccessful in identifying a different group of highly correlated mRNAs that could be associated specifically with motility. Our search pattern included tissues with both motile and nonmotile cilia, so we conclude that the genes we identified are likely to encode components of processes shared by both motile and nonmotile cilia. These processes are not necessarily restricted to cilia or the basal body but could instead be processes necessary to support cilia that happen at other locations such as the nucleus or the cell body.
Several of the most interesting proteins encoded by the 99 genes we identified have never been studied. Proteins with coiled-coil domains appear to be important to cilia (22). This domain mediates protein interactions, especially with other coiled-coil domain proteins, predicting that additional coiled-coil domain proteins might be important to cilia (7). Ccdc39, Ccdc65, Ccdc113, and 1700003M02Rik are all predicted to possess coiled-coil domains, and all have putative orthologs in other species that have been linked to cilia by at least one other high-throughput approach (15). Similarly, tetratricopeptide repeat domain (Ttc) proteins and WD (tryptophan aspartic acid) repeat domain proteins have been linked to cilia (1, 4, 9, 13, 20, 25). We identified five such genes, three of which have putative orthologs that have been tentatively linked to cilia in other species: Ttc18, Ttc21a, and Wdr66. Last, 0610012D17Rik encodes a novel protein that is predicted to contain a Tctex-1 domain. Tctex-1 is a dynein light chain that interacts with rhodopsin, implicating it in the trafficking of rhodopsin to the photoreceptor outer segment (35). 0610012D17Rik may therefore be a component of a more general mechanism of trafficking proteins to cilia.
A mouse ciliome resource.
Our prediction that the identification of mammalian cilia genes is incomplete was supported not only by our prediction of additional mouse cilia genes but also by published work on genes that were not detected by high-throughput approaches. We detected several proteins with published links to cilia, such as Foxj1 (19), whose genes were absent from http://www.ciliome.com (Table 1) (15). A literature search for mouse and human cilia-related genes revealed 87 genes that have documented importance to cilia but were not detected by high-throughput approaches. Surprisingly, 62% of the mouse genes with documented importance to cilia were not predicted by previous high-throughput methods, nor were probable orthologs of these genes. Combining the genes identified herein (Table 1) with those identified in our literature search and those listed in http://www.ciliome.com identifies 2,128 mouse genes that have been linked to cilia in some fashion (Supplemental File 1).
We expect this mouse ciliome resource to be neither complete, as discussed above, nor wholly accurate. For example, a few genes in our results may be selectively abundant in olfactory epithelium, vomeronasal organ, testes, trachea, and lung for reasons unrelated to cilia. More broadly based bioinformatics and expression profiling approaches are even more likely to make this type of false inclusion error. Similarly, the proteomics approach suffers from limited sensitivity, may misidentify some proteins, and is prone to include proteins from contaminating subcellular fractions. In fact, >75% of the genes predicted as important to cilia by high-throughput approaches were identified by only one of the eight previous studies (Fig. 3), a statistic that argues that many cilia gene predictions are erroneous. Users should therefore keep in mind that the mouse ciliome resource suffers from omissions of cilia-related genes and from inclusion of genes not truly important to cilia and may be complicated by the identification of genes important to cilia in some organisms and tissues but not others.
Even with these imperfections, the compilation of documented and predicted cilia-related genes provides a valuable resource. As the pace of prediction of cilia-related genes has far outstripped the ability to test these predictions, significant opportunities to investigate proteins critical for cilia function and dysfunction have arisen. Supplemental File 1 provides a rational basis for identifying which proteins are the best candidates for as yet undocumented roles in cilia or the support of cilia. We hypothesized that the number of studies linking a gene to cilia would correlate with the frequency of genes independently validated as encoding cilia proteins. Indeed, the relationship between these factors was well described by an exponential function, with a strong correlation (R = 0.99; Fig. 3). Genes identified by multiple approaches are therefore the most promising candidates for future studies to test whether the encoded proteins are sufficient and necessary for cilia function.
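An exponential relationship of this kind can be fit with a few lines of code. The sketch below fits y = a·exp(b·x) by ordinary least squares on ln(y) and reports the correlation between x and ln(y). It is an illustration only: the curve fitting for this work was done in Kaleidagraph, whose algorithm and definition of R may differ, and the function name is hypothetical. No values from Fig. 3 are reproduced here.

```python
import math


def fit_exponential(xs, ys):
    """Fit y = a * exp(b * x) by least squares on ln(y); return (a, b, r).

    r is the Pearson correlation between x and ln(y). All y must be > 0,
    and the y values must not all be equal.
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("xs and ys must have equal length of at least 2")
    logs = [math.log(y) for y in ys]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_l = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sll = sum((l - mean_l) ** 2 for l in logs)
    sxl = sum((x - mean_x) * (l - mean_l) for x, l in zip(xs, logs))
    b = sxl / sxx                        # slope on the log scale
    a = math.exp(mean_l - b * mean_x)    # intercept, back-transformed
    r = sxl / math.sqrt(sxx * sll)
    return a, b, r
```

Here x would be the number of studies predicting a gene's involvement in cilia and y the fraction of such genes with documented involvement; a log-scale fit weights small y values more heavily than a direct nonlinear fit would.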
The tissue expression pattern we discovered proved to be significantly better than chance at identifying known and suspected cilia-related genes. It identified 99 mRNAs, of which 48 have some link to cilia, ranging from documented involvement in cilia to prediction by previous high-throughput studies. Most of the 99 genes encode proteins that have never been studied, suggesting that much remains to be learned about cilia. The correlation of documented importance to cilia with the frequency of prediction of links to cilia indicates that our approach was more reliable than previous approaches, but this came at the expense of limiting the number of genes identified. The mouse ciliome resource we compiled from nine different sources contains 2,128 mouse genes with documented or predicted importance to cilia. Users of this resource should understand that additional mammalian cilia-related genes remain to be discovered, and that some proportion of the predictions will prove to be erroneous. Genes identified in multiple studies are the most promising for future study of cilia function and dysfunction.
This work was supported by National Institute on Deafness and Other Communication Disorders Grants R01-DC-002736 and R01-DC-007194 and Boyarsky Professorship funds to T. S. McClintock.
We thank Drs. Michael O'Rand and Tiansen Li for the gift of antisera against Spa17 and Crocc, respectively.
Present address of D. A. Bergman: Dept. of Biomedical Sciences, Grand Valley State Univ., 130 Padnos Hall, Allendale, MI 49401.
Address for reprint requests and other correspondence: T. S. McClintock, Dept. of Physiology, Univ. of Kentucky, Lexington, KY 40536-0298.
Article published online before print. See web site for date of publication (http://physiolgenomics.physiology.org).
- Copyright © 2008 the American Physiological Society