We have performed a systematic temporal and spatial expression profiling of the developing mouse kidney using Compugen long-oligonucleotide microarrays. The activity of 18,000 genes was monitored at 24-h intervals from 10.5-day-postcoitum (dpc) metanephric mesenchyme (MM) through to neonatal kidney, and a cohort of 3,600 dynamically expressed genes was identified. Early metanephric development was further surveyed by directly comparing RNA from 10.5 vs. 11.5 vs. 13.5dpc kidneys. These data showed high concordance with the previously published dynamic profile of rat kidney development (Stuart RO, Bush KT, and Nigam SK. Proc Natl Acad Sci USA 98: 5649–5654, 2001) and our own temporal data. Cluster analyses were used to identify gene ontological terms, functional annotations, and pathways associated with temporal expression profiles. Genetic network analysis was also used to identify biological networks that have maximal transcriptional activity during early metanephric development, highlighting the involvement of proliferation and differentiation. Differential gene expression was validated using whole mount and section in situ hybridization of staged embryonic kidneys. Two spatial profiling experiments were also undertaken. MM (10.5dpc) was compared with adjacent intermediate mesenchyme to further define metanephric commitment. To define the genes involved in branching and in the induction of nephrogenesis, expression profiling was performed on ureteric bud (GFP+) FACS sorted from HoxB7-GFP transgenic mice at 15.5dpc vs. the GFP− mesenchymal derivatives. Comparisons between temporal and spatial data enhanced the ability to predict function for genes and networks. This study provides the most comprehensive temporal and spatial survey of kidney development to date, and the compilation of these transcriptional surveys provides important insights into metanephric development that can now be functionally tested.
the development of the permanent kidney, the metanephros, in the mouse commences at ∼10.5 days postcoitum (dpc) and involves reciprocal inductive events between the ureteric bud (UB) and metanephric mesenchyme (MM). The UB rapidly gives rise to the collecting duct system via branching morphogenesis, while nephrogenesis is initiated adjacent to the tips of the branching bud where mesenchymal condensation, epithelialization, and morphogenesis give rise to the filtration units of the kidney, the nephrons. Finally, vascularization, enervation, and maturation of the renal corpuscle and uriniferous tubules lead to the formation of the functional excretory organ.
Although kidney organogenesis has long been studied as a model organ system, the underlying mechanisms that control metanephric development have only been partially elucidated. Traditional developmental genetics (transgenics, gene knockouts, organ culture studies) have been used to define many key regulators of the process, including the transcription factors WT1 and Sall1 and the Pax2/Eya/Six network (4, 15, 20). Kidney development is also heavily orchestrated by secreted protein factors with several secreted factor families involved, including the Wnt, FGF, IGF, Tgfβ, and hedgehog signaling pathways (reviewed in Ref. 9). Key molecules in several early morphogenetic events have been described. Glial-derived neurotrophic factor (GDNF) produced by the MM attracts the UB toward the MM by virtue of c-ret and GDNFRα expression in the tips of the UB (5, 19, 21, 22). Various secreted factors within the mesenchyme either maintain mesenchyme survival (BMP7, FGF2) (2, 10) or promote nephron formation (LIF, FGF) (2, 13). Much remains to be clarified. The classical model of metanephric induction suggests that an inducing signal from the UB initiates tubulogenesis within the MM, and yet the identity of this inducer remains unknown, leading to the alternative suggestion that the UB serves simply to keep the mesenchyme alive and to initiate Wnt4 expression within the MM, which in turn results in tubulogenesis (6, 14). Lineage relationships between these early compartments and the >25 distinct cell types in the adult remain only superficially characterized. The elongation and subsequent maturation of distinct segments of the uriniferous tubules, the role of the stroma and its relationship to the renal interstitium, the full complement of signals coordinating vasculogenesis and angiogenesis, and the maturation of the specialized cells of the glomerulus have yet to be well defined at the molecular level. Indeed, the question of whether the MM itself is homogeneous at the time of commitment (10–10.5dpc) remains to be clarified.
The complex nature of any developing organ means that gene expression needs to be assessed in both a temporal and spatial context to be truly informative. Microarray-based approaches have more recently been used to survey global gene expression in the developing mammalian kidney (23, 25) to provide new insights into the process. Such studies provide the opportunity to monitor the temporal or spatial expression patterns of key classes of genes such as transcription factors, ligands, and receptors and signaling pathways. Furthermore, bioinformatic approaches can be used to define distinct temporal phases of gene expression, and biological pathways inferred on the basis of coordinate expression of functionally similar genes have been described (25). Spatial expression profiling has been performed to a limited extent in the developing kidney via several experimental models, including profiling of dissected 11.5dpc ureteric tree and mesenchyme (23), immortalized embryonic kidney mesenchymal cell lines (28), and green fluorescence protein (GFP)-tagged mesenchyme using Sal1 transgenic mice (27). However, the number of developmental stages and the number of genes surveyed in these studies have been limited.
In this study, we present the most thorough gene expression profile of kidney development to date. The activity of some 18,000 genes has been assessed throughout all stages of kidney development and using several experimental strategies. We have sought to define the cohort of genes that are dynamically expressed in the developing kidney and have used data mining techniques to define functional clustering therein. Particular attention has been paid to the period covering metanephric induction and early kidney development (10.5–13.5dpc). We also report on combined temporal and spatial profiling to define expression markers involved in MM commitment, UB branching, and MM differentiation. To identify genes regulating kidney metanephric induction, we profiled intermediate mesoderm (IM) vs. MM at 10.5dpc. To analyze UB branching and MM differentiation, we generated spatial data using fluorescence-activated cell sorting (FACS) of GFP+ve ureteric tree from the remaining mesenchyme (GFP−ve) of 15.5dpc Hoxb7-GFP transgenic mice. These data have been integrated with the temporal profiling data to provide predictions as to which cellular compartments express given dynamically expressed transcripts and pathways. In situ hybridization has been used to demonstrate the validity of such temporal/spatial predictions of function.
Tissue collection and RNA isolation.
Experimental protocols were approved by the University of Queensland Animal Ethics Committee (approval no. IMB/479/02/NIH).
Pools of kidney tissue were collected from outbred CD1 embryos at 10.5, 11.5, 12.5, 13.5, 14.5, 15.5, 16.5, and 17.5dpc and 1-day-postpartum (neonate) mice. All dissections were performed in cold phosphate-buffered saline (PBS) and were stored at −80°C before RNA extraction. Total RNA was prepared from pooled tissues using an acid phenol extraction method (TRIzol, Invitrogen) followed by GITC and column purification (Qiagen RNeasy RNA minipurification). The quantity and integrity of each RNA sample were determined via running each sample on a bioanalyzer RNA microfluidic chip (Agilent). Mouse universal total RNA (Stratagene) was used as the common reference.
Isolation of GFP+ and GFP− cells from metanephroi.
Time-mated Hoxb7/GFP transgenic mice (obtained from Dr. Frank Costantini, Columbia Univ.) were killed at 15.5dpc by cervical dislocation. Metanephroi were collected in DMEM-Ham’s F-12 (TRACE) from several litters and dissociated into a single cell suspension using a mixture of collagenase and dispase (0.5 mg/ml; Sigma) in Hank’s buffered salt solution (Invitrogen) for 10 min at 37°C followed by trituration. Cells were washed in DMEM-Ham’s F-12–10% FCS and passed through a wire mesh to remove any clumps. GFP+ and GFP− cells were isolated on a Mo-Flo Cytometer (Cytomation, Fort Collins, CO). Dissociated wild-type metanephroi were used as a negative control. Cells were collected into DMEM-Ham’s F-12 and washed twice in RNase-free PBS, and cell pellets were stored at −70°C.
Metanephric culture and tissue preparation.
For RNA in situ hybridization, embryos were collected from outbred CD1 mice, as previously stated, at days 10.5 and 12.5 of gestation. Embryonic day 10.5 (E10.5) embryos were cut transversely below the forelimbs and longitudinally down the midline to expose the nephric duct, UB, and MM. At E12.5, complete urogenital (UG) tracts were collected. For explant culture, metanephroi were isolated from E12.5 embryos and grown as explants for two days at 5% CO2, 37°C, on 3.0-μm polycarbonate transwell filters (Costar) in MEM supplemented with 10% FCS and 20 mM glutamine. Tissue for whole mount in situ hybridization was fixed overnight in 4% paraformaldehyde-PBS at 4°C.
RNA amplification and target labeling.
Total RNA was linearly amplified from 1 μg of total RNA per each sample using the messageAMP amplified RNA (aRNA) kit (Ambion), yielding a minimum of 20 μg of amino-allyl-labeled antisense aRNA. The quantity and integrity of these aRNAs were compared via running each sample on a bioanalyzer RNA microfluidic chip (Agilent) before labeling. Five micrograms of each aRNA sample were then labeled by covalent linking of Cy5- or Cy3-labeled UTP (Amersham). Finally, the labeled material was hydrolyzed and used for hybridization.
Array fabrication and generation.
The arrays used in this study were all obtained from the SRC Microarray Facility, University of Queensland [Australian Research Council (ARC) Centre for Functional and Applied Genomics], and were composed of 21,994 mouse gene-specific oligonucleotides (Compugen) spotted onto epoxy-silane-coated slides (Eppendorf). Labeled aRNA target was generated from 1 μg of total RNA and hybridized to each array for 16 h at 45°C using previously described conditions (7).
Image analysis, normalization, and analysis.
Hybridized slides were washed, dried, and scanned in a 600B array scanner (Agilent). The images were analyzed with Imagene 5.5 (BioDiscovery) to determine mean foreground and background for both channels. All primary data, including images, were then imported into an in-house installation of the comprehensive microarray relational database, BioArray Software Environment (BASE; http://kidney.scgap.org/base). The raw data from each hybridization were compiled into an experiment and subjected to print tip intensity-independent Lowess normalization using the R statistical software from the Linear Models for Microarray Data (LIMMA) package (http://bioinf.wehi.edu.au/limma/). This normalization was implemented within BASE, using scripts developed by Ola Spjuth of the Linnaeus Centre for Bioinformatics (http://www.lcb.uu.se/baseplugins.php).
In the case of all direct comparisons, differential expression was defined using a robust statistical method rather than simple fold change. All genes were ranked using the B statistic (17) method, where both fold change and variance of signals in replicates are used to determine the likelihood that genes are truly differentially expressed. A threshold in the B statistic of 0.0 was adopted, as genes with a B score >0 have a >50% probability of being truly expressed (24). This analysis was executed using the Bio-conductor package that has been implemented as a plug-in tool in BASE. The ranked B scores for all genes in each experiment can be obtained from BASE (http://kidney.scgap.org/base).
In the case of the temporal profiling from 10.5dpc to adult, normalized data were exported from BASE into Genespring 6.1.2 (SiliconGenetics) for cluster analysis and visualization. The data were subjected to a further transformation, with the ratio of each being divided by the median of its measurements in all samples. The result of this transformation was clearer visualization of relative changes of expression for each gene across the entire time course. The remaining genes were then subjected to the GeneSpring implementation of a Welch ANOVA test (P < 0.005) plus Benjamini and Hochberg correction to define the cohort with dynamic expression. In addition to primary data sets available from BASE and Gene Expression Omnibus (GEO) (GSE2281, GSE2282, GSE2284, GSE2285, GSE2300, GSE2301), normalized annotated data for each experiment can be reviewed via Signet: http://spring.imb.uq.edu.au (login, reviewer; password, scgap).
Hierarchical agglomerative clustering was performed within the GeneSpring package using the Pearson correlation for the distance metric. K-means clustering was also performed using the GeneSpring package. The dynamic cohort of genes was partitioned into five classes and the default settings on the software package.
These data were generated through the use of Ingenuity Pathways Analysis, a web-delivered application that enables biologists to discover, visualize, and explore therapeutically relevant networks significant to their experimental results, such as gene expression array data sets. For a detailed description of Ingenuity Pathways Analysis, visit http://www.Ingenuity.com.
Accession numbers for all 3,600 dynamic genes were imported into the Ingenuity Pathway Analysis software. Eight hundred thirteen genes that displayed upregulation relative to the median level across time course were selected as focus genes and used as the starting point for generating biological networks. To start building networks, the application queries the Ingenuity Pathways Knowledge Base for interactions between focus genes and all other gene objects stored in the Knowledge Base and generates a set of networks with a network size of 30 genes/proteins. Ingenuity Pathway Analysis then computes a score for each network according to the fit of the user’s set of significant genes. The score is derived from a P value and indicates the likelihood of the focus genes in a network being found together due to random chance. A score of 2 indicates that there is a 1/100 chance that the focus genes are together in a network due to random chance. Therefore, scores of 2 or higher have at least a 99% confidence of not being generated by random chance alone. Biological functions are then calculated and assigned to each network. These networks were then ranked on the probability that a collection of genes equal to or greater than the number in a network could be achieved by chance alone. A score of 3 indicates that there is a 1/1,000 chance that the focus genes are in a network due to random chance. The top six scoring networks were analyzed as part of this analysis.
RNA in situ hybridization.
Expression patterns were analyzed by RNA in situ hybridization using digoxigenin-labeled sense and antisense riboprobes. Probes were synthesized as described previously (12), using cDNA constructs containing the expressed sequence tag (EST) of interest (SRC Microarray Facility). Clonal cultures were sequence verified before being used as the template for riboprobe synthesis. Sense and antisense probes were prepared from linearized cDNAs using T7, T3, and Sp6 RNA polymerase. Probes were purified with Sephadex columns (Roche) after digestion of the vector with DNase I for 15 min at 37°C. Whole mount in situ hybridizations were performed as described by Christiansen et al. (8), with minor modifications. All probes were hybridized at 65°C and postantibody washes reduced to 30 min. Tissue was mounted in Mount-Quick aqueous (Daido Sangyo) and photographed using an Olympus AX70 compound microscope with Kodak Elite Ektachrome 160T color reversal film.
Temporal profiling from 10.5dpc to neonate kidney and comparison with previous studies.
Temporal profiling from 10.5dpc to neonate kidney vs. a common reference was performed to obtain a broad global survey of gene expression throughout development. Quadruplicate hybridizations were performed for each time point, and the combined data set was filtered for genes displaying dynamic temporal gene expression using the Welch ANOVA test (P < 0.005) and Benjamini and Hochberg false discovery rate correction; 3,600 genes passed this threshold and were subjected to further study (Supplemental Table S1; available at the Physiological Genomics web site).1
A series of boxed microarray experiments was also performed between 10.5dpc MM, 11.5dpc metanephroi, and 13.5dpc metanephroi to survey for temporal markers expressed during the inductive and early growth phase of metanephric development. B statistics were used to define differential expression for each time point. The complete rankings are available as Supplemental Material (Supplemental Table S2). Comparison of the boxed experiment to the 10.5dpc-neonate temporal profile showed strong agreement. Of the 102 genes that were found to be significantly differentially expressed by B statistics in the direct comparisons and by ANOVA for the temporal data, 99 displayed a similar pattern of expression. This provided confidence in both the accuracy and robustness of these profiling studies.
To establish the accuracy of this temporal analysis, the data set was compared with the previously published temporal analysis of kidney development in the rat (25), which spanned E13.5 to adult. Homology mapping between the mouse and rat data was performed using a combination of Locuslink, UniGene, and Homologene; 2,399 clear orthologs were defined between the two data sets, of which 508 were dynamic throughout kidney development (Supplemental Table S3). Hierarchical clustering of this core set was performed for both this mouse data as well as the data of Stuart et al. (25), with strong similarities in clustering observed. Figure 1 shows the similarity in profiles by displaying the mouse expression profiles (Fig. 1B) and rat expression profiles of Stuart et al. (25) (Fig. 1C) for the same gene tree. The major biological functions associated with the core set were reviewed by compiling the mouse Compugen functional annotations and are summarized in Table 1. “Transcriptional regulation” was the most common theme, with 28 genes identified, and was closely followed by signal transduction (24 genes), proteolysis/peptidolysis (20 genes), and developmental processes (20 genes).
The strong similarities observed between our 10.5dpc-neonate time course, the profiles from the direct boxed comparisons of early metanephric development, and the rat kidney temporal profile (25) suggested that this is a robust and accurate temporal profile of murine kidney development.
Functional annotation of developmentally dynamic kidney genes.
Functional annotation was undertaken using several methods. First, hierarchical clustering was performed, and overrepresentations of gene ontological terms were sorted for each branch to define functional clusters. This approach yielded little insight other than that the major active early branch was enriched for “cell cycle” and “transcription factor” (P < 10−4), and the active late branch was enriched for metabolism, transporter, and “cell communication”(P < 10−4).
Second, K-means partitioning was used. This clustering tool partitions all dynamic genes into a predefined number of classes. Five robust classes were identified (see Fig. 2) and annotated. The frequency of gene ontology (GO) terms associated with each part of the dendrogram was calculated using the programs Database for Annotation, Visualization and Integrated Discovery (DAVID) and EASE (http://david.niaid.nih.gov/david/upload.asp) (see Supplemental Table S4).
Class 1 contained 1,017 genes and displayed maximal activity from 10.5dpc to 13.5dpc. This class contained a large number of genes associated with DNA metabolism (P < 9.9 × 10−25), mitotic cell cycle (P < 2.0 × 10−14), and cell proliferation (P < 7.2 × 10−14). Class 2 contained 718 genes that displayed lowest expression at 10.5dpc and steadily increased to maximal levels in the neonatal kidney. Eight of the top ten ranked functional annotations for this class were associated with metabolism (P < 2.0 × 10−4 to 8.0 × 10−3). Other significant annotations included cell communication and cell adhesion (P < 2 × 10−2). Class 3 displayed maximal activities at 10.5dpc and decreased rapidly with lowest expression in the neonatal sample. The annotation of this class suggested it was enriched with developmental genes regulating cell motility (P < 6 × 10−4), morphogenesis (P < 3 × 10−4), organogenesis (P < 6 × 10−4), muscle development (P < 2.5 × 10−3), and development (P < 3.8 × 10−3). Class 4 contained 233 genes and displayed relatively little expression at 10.5dpc but rapidly increased to maximal levels by 12.5dpc and then decreased again during late renal development. Functional annotation for this class was limited, with DNA-dependent transcription and protein targeting being the most significant (P < 2 × 10−3). Class 5 contained 873 genes and displayed little dynamic expression between 10.5 and 15.5dpc, followed by a steady increase at the neonatal stage. This class contained a large number of genes associated with metabolic biosynthetic pathways (P < 1.0 × 10−10) and ion transport (2 × 10−8).
In addition to clustering, the expression levels at each time point for the developmentally dynamic cohort were mapped to the Ingenuity Knowledge Base using Ingenuity software (http://www.ingenuity.com). Upregulated genes at each time point were reviewed against a well-defined panel of high-level functions and metabolic or signaling pathways to determine involvement at each time point. The significance of association for each term or pathway within the list of upregulated genes from each time point was plotted, as seen in Supplemental Figs. S1 and S2.
A subset of high-level functions displayed robust temporal expression patterns. In early kidney development, the highly proliferative nature of the metanephros was reflected by the terms cell cycle (P < 10−4 to 10−6), DNA replication (P < 10−10), and RNA posttranscription (P < 10−8) being significantly overrepresented between 10.5 and 13.5dpc. Genes regulating organ morphology (P < 10−2) and cellular assembly are significant only for a short time after metanephric induction (11.5–12.5dpc). Later periods of development are dominated by functions related to metabolic and biosynthetic properties that develop in the older kidney (maximum significance P < 10−5 at 18.5dpc). Immune response genes become significant (10−3) from 16.5dpc and maintain their importance right through to the neonatal stage. Several high-level processes connected with maturation of the kidney, such as cell-to-cell signaling (P < 5 × 10−5) and development of connective tissue (P < 10−2), were also identified and became significant from 16 to 17.5dpc.
Pathway analysis identified several signaling pathways not identified by other methods. Wnt, Tgfβ, IGF, VEGF, and insulin signaling were all displayed with maximal transcriptional activity during early kidney development. Cell cycle control was once again highlighted. ERK MAP signaling was found to be significant in early nephrogenesis. Older kidney development was associated with a series of several signaling pathways including peroxisome proliferator-activated receptor (PPAR) from 15.5dpc and metabolic activity associated with oxidative phosphorylation and fatty acid metabolism (P < 10−5). Two of the pathways identified as being significantly involved in early kidney development were subjected to further investigation. IGF and Wnt signaling displayed maximal expression at 11.5 and 13.5dpc, respectively, and the transcriptional activity of each pathway was assessed at their respective pinnacles of activity. Figure 3 displays schematics for the IGF and Wnt signaling pathways, with members of the pathway displaying maximal activity at 11.5dpc colored red. It is apparent for both pathways that members from receptor through to effector molecules are upregulated. Many of these pathways can engage homolog members at specific points in the pathway. In the case of Wnt signaling, pathway activation can be initiated by Wnt ligands binding to the frizzled receptor. Our studies have shown that Wnt4, Wnt6, and 10a were expressed at their peak levels at 13.5dpc. Importantly, of the two frizzled receptors displaying dynamic expression (frz2 and frz7), only Frz2 is active at 13.5dpc.
The final analysis performed on the developmental dynamic gene set was the detection of genetic networks active during early kidney development. Network analysis defines relationships between a set of genes, in this case upregulated genes at each time point from 10.5dpc to neonate. Relatedness was predefined using the Ingenuity Knowledge Base, a proprietary database containing millions of defined molecular interactions and functional interactions that have been mined from published literature and other sources. Networks are displayed graphically as a collection of nodes (genes/gene products) and edges (the biological relationships between the nodes). The intensity of the node color indicates the degree of up- (red) or downregulation (green). Nodes are displayed using various shapes that represent the functional class of the gene product. Edges are displayed with various labels that describe the nature of the relationship between the nodes. The length of an edge reflects the evidence supporting that node-to-node relationship, in that edges supported by more articles from the literature are shorter.
An example of output from this analysis is shown in Fig. 4. The high proportion of red-colored nodes reflects coordinated upregulation of the genes in this network at 11.5dpc (Fig. 4A). The coordinate nature of gene expression for this network is limited only to 11.5–13.5dpc, after which the gene expression is steadily downregulated, as depicted at 15.5dpc (Fig. 4B) and 18.5dpc (Fig. 4C). Graphic representations of the top six genetic networks are available (Supplemental Fig. S3).
Functional annotations associated with network 1 are diverse, including proliferation, DNA repair, and chromatin modeling. Subcellular prediction of the components of this network revealed that is represents a cytoplasmic signal transduction cascade headed by Arhga1, which in turn interacts with a large set of nuclear proteins. Correlation of functional properties to the network, based on the Ingenuity Knowledge Base, suggests that this network is involved in regulation of cell proliferation, mitosis (P < 2.2 × 10−2), and transcription (P < 5 × 10−7). In addition, there is a series of genes from the network associated with cellular differentiation (P < 2.9 × 10−2: Akt2, Cbfa2t1, Dnmt1, Etv6, Ezh2, Hdac2, Khdrbs1, Rasa1) and cell invasion (P < 1.6 × 10−2 : Akt2, Esr1, Etv6, Mta1).
Network 2 contains 35 genes the products of which are distributed in the membrane, cytoplasm, and nucleus. This network includes Notch4, a well-characterized receptor that mediates proliferative signals in urogenital development. Notch signaling is known to be an important developmental regulator in metanephric development (18). The majority of the genes associated with this network have previously been defined as being involved in cell cycle progression (P < 1 × 10−20) and transition from the G1/S (P < 1.7 × 10−10) and G2/M transitions (3.6 × 10−7). The networks displayed coordinated expression, with maximal activity between 11.5 and 13.5dpc. Network 3 is associated with FGF2 signaling, a key secreted factor that modulates UB and MM survival. Functional annotations show organogenesis (P < 1 × 10−04) and cell death (P < 2.7 × 10−06) as highly significant for this network. Network 4 has somewhat limited functional annotations, other than a moderate association to DNA replication (P < 3.3 × 10−04) and metabolism. Network 5 is associated with cell cycle control associated with ATM, while network 6 is associated with organogenesis, DNA-dependent transcription, and cellular differentiation.
Because genetic network analysis defines a series of functionally related genes that are expressed at the same time, we sought to confirm the validity and utility of such network predictions by examining the coexpression of genes within a network at the spatial level. The results of in situ hybridization on similar aged metanephroi for almost all genes from network 3 are displayed in Fig. 5. The expression patterns of these genes were quite complex; however, almost all of the genes displayed spatially coordinated expression in the interstitial compartment.
Spatial profiling of metanephric induction.
While temporal experiments provide insights into gene activity, the use of in situ hybridization as a validation tool highlighted that they provide no insight into any spatial restriction of gene expression. Two spatial analyses were therefore undertaken. The first related to spatial profiling of metanephric induction. A large number of genes displayed maximal activity at 10.5dpc, and we determined which genes were expressed in the MM vs. the adjacent IM (10.5dpc) using expression profiling. Quadruplicate hybridizations directly comparing dissected 10.5dpc IM vs. MM were performed, and differential expression was defined using B statistics. Genes with a B score >0 are summarized in Supplemental Table S5. Challen et al. (7) reported on the expression profiling of MM vs. IM using NIA15K cDNA arrays. Probes were mapped between this previous work and our current study and identified >80 common expression markers of MM or IM at 10.5dpc (matches are also indicated in Supplemental Table S5).
Whole mount in situ hybridization was performed for a subset of novel markers enriched in 10.5dpc MM. Figure 6 shows either preferential or restricted expression of the MM by Smoc2, CD83, crystallin mu, Itm2c, somatostatin, and α-fetoprotein. The temporal patterns of both 10.5dpc MM and IM expression markers were reviewed in the 10.5dpc-neonatal time course, and a majority of the MM markers at 10.5dpc are downregulated during early-to-mid kidney development.
Spatial expression profiling of ureteric development and mesenchymal differentiation.
To identify the genes that are differentially expressed in the ureteric epithelium as opposed to the surrounding mesenchyme and its derivatives, we isolated these two compartments of the kidney. We utilized 15.5dpc embryonic kidneys from the Hoxb7/GFP transgenic mice, which express GFP throughout the ureteric epithelium, and isolated GFP+ (ureteric epithelium) and GFP− (cells from the rest of the kidney) cells using fluorescence-activated cell sorting. Approximately 18% of the metanephroi were composed of ureteric epithelial cells (GFP+), with 82% being mesenchyme derived, including the developing uriniferous tubules, and the interstitial elements. Gating was used to collect the top 10% of cells with the highest GFP expression (GFP+ fraction) and the first 30% of cells lacking GFP expression (GFP− fraction) for RNA extraction (Fig. 7). In total, 5.43 × 105 GFP+ and 1.88 × 106 GFP− cells were collected from a total of 100 E15.5 metanephroi. To verify the purity of the cells collected, the GFP− fraction was reanalyzed through the flow cytometer to determine whether any GFP+ cells were present. It was determined that the sorted fractions were 99.95% pure (data not shown). In addition, semiquantitative PCR amplification of GFP demonstrated that a very low level of GFP product was detected after 25 cycles in the GFP− fraction compared with the expected high levels in the GFP+ fraction (Fig. 7). These results are comparable with the sort results demonstrating that the GFP+ and GFP− fractions are greatly enriched.
Triplicate microarray profiling of sorted GFP+ve ureteric tree vs. the remaining GFP−ve cell population was performed, and differential expression was defined using B statistics. All genes with a B score >0 were annotated using DAVID, and are listed in Supplemental Table S6; 196 of these genes were upregulated in the ureteric tree.
The quality of this screen for genes expressed in the ureteric tree was assessed by two means: 1) analysis of known tree expression markers in the developing tree and 2) validation of a representative set of new genes by in situ hybridization. A large number of genes associated with the ureteric tree during development were present in the GFP+ set. Examples include Sprouty (11), Calbindin 28D (16), and Gata3 (7,). Whole mount and section in situ hybridization were carried out for Cadherin16 and WAP domain-containing protein, revealing tree-specific expression (Fig. 8). Cadherin1 was only slightly enriched in the tree by microarray profiling, and this was reflected by in situ hybridization.
The GFP−ve set is, predictably, a complex population of cell types. Several well-characterized markers of MM were present in this set such as Wt1 and Mox1. In addition, expression markers associated with immune cells were readily identifiable (e.g., Toll2, macrophage scavenger receptor, and many cytokines). There is also a large number of genes known to associate specifically with erythrocyte cell population (erythroid-associated factor and hemoglobins X and Y).
Kidney development, from the point of commitment of the MM to the cessation of nephrogenesis in the immediate postnatal period, is a complex process involving the derivation of >25 distinct cell types in strict spatial relationships. These all apparently arise from two progenitor populations, the MM and the UB. A global understanding of the molecules involved in kidney development will require precise temporal and spatial information. This report summarizes the most comprehensive survey to date of gene expression during murine kidney development, capturing relative gene expression levels from the time of metanephric induction through to the neonatal kidney. Our temporal analysis defined 3,600 genes displaying dynamic expression across the time course. These data show excellent concordance when compared with previous reports (25), despite the data being generated on a different platform technology (Affymetrix vs. spotted arrays) and being performed on rat instead of mouse. The distant natures and experimental designs of these two experiments provide strong support for the validity of the core set of dynamically expressed genes present in both studies.
While temporal profiling alone provides a useful record of gene behavior during development, it lacks the discriminating power displayed by direct comparison of samples. In this study, both indirect and boxed experimental designs were employed to show the robustness of temporal expression. Direct boxed comparisons have focused particularly on gene expression during the period of 10.5–13.5dpc when commitment of the MM and critical inductive events occur. In addition, this study overlays key spatial data relating to the UB vs. the mesenchymal derivatives at midgestation.
In these experiments, we have shown that clustering approaches, annotation with GO terms and ranking on the basis of statistical overrepresentation of terms, was a relatively poor instrument for defining underlying biological pathways. In contrast, the ability to identify the majority of critical signaling pathways active during early kidney development by the employment of new data mining strategies is a significant advancement. The latter approach correctly verified the involvement of the Wnt, Tgfβ, IGF, and VEGF pathways as significantly active before 14.5dpc without the primary ligand and receptor pairs being identified as outliers. Network analysis was also able to implicate both FGF2 and Notch signaling as active during early development. Network analysis of kidney development between 11.5 and 13.5dpc in these studies also highlighted the involvement of several pathways not previously noted for their specific involvement in kidney development.
Although defining the timing of maximal activity of specific gene members for each pathway is a valuable method for identifying biological process, it is important to remember that spatial coordination of gene expression is also required for a process to be significant. While temporal profiling of the whole developing kidney is largely complete, extensive spatial profiling of specific compartments from the developing kidney is still required. This report has demonstrated that both microarray profiling of specific compartments of the kidney and screening kidney tissues by in situ hybridization are suitable tools to develop a more accurate picture of gene expression in the kidney.
The robustness of gene expression surveyed in this report has been tested by comparison to a previously published analysis of a similar system. Such comparisons are only really possible when primary data and information relating to data transformation are available. These comparisons serve as an independent validation of our temporal and spatial profiling when conserved. These studies will act as an excellent hypothesis engine for functional studies into the specific roles of these pathways in metanephric development and differentiation.
G. Challen holds an Australian Postgraduate Award. S. M. Grimmond is the recipient of a National Health and Medical Research Council (NHMRC) Career Development Award and a Senior Research Affiliate of the ARC Special Research Centre for Functional and Applied Genomics. M. Little is an NHMRC Senior Research Fellow. This work was supported by National Institute of Diabetes and Digestive and Kidney Diseases Grant DK-63400 as part of the Stem Cell Genome Anatomy Project (http://www.scgap.org/).
We thank Dr. Tina Maguire, the ARC Special Research Centre for Functional and Applied Genomics, and the Australian Cancer Research Foundation DNA resource for the production of microarrays used in this project.
This work was performed as part of the Renal Regeneration Consortium.
↵* G. Challen and B. Gardiner contributed equally to this work.
↵1 The Supplemental Material (Supplemental Tables S1–S6 and Supplemental Figs. S1–S3) for this article is available online at http://physiolgenomics.physiology.org/cgi/content/full/00043.2005/DC1.
Article published online before print. See web site for date of publication (http://physiolgenomics.physiology.org).
Address for reprint requests and other correspondence: S. M. Grimmond, Institute of Molecular Bioscience, Univ. of Queensland, St. Lucia, QLD, Australia (e-mail:).
- Copyright © 2005 the American Physiological Society