1 Department of Medicine, Flinders University of South Australia, Adelaide, South Australia
2 Preventative Health National Research Flagship, CSIRO Mathematical and Information Sciences, Sydney
3 Preventative Health National Research Flagship, CSIRO Molecular and Health Technologies, Sydney, New South Wales
4 Department of Surgery, Flinders University of South Australia, Adelaide, South Australia, Australia
| ABSTRACT |
|---|
|
|
|---|
colorectal gene expression
| INTRODUCTION |
|---|
|
|
|---|
The large intestine is often divided for clinical convenience into six anatomical regions starting from the terminal region of the ileum: the cecum, the ascending colon, the transverse colon, the descending colon, the sigmoid colon, and the rectum. Alternatively, these segments may be grouped to divide the large intestine into a two-region model comprising the proximal and distal large intestine. The proximal ("right") region is generally taken to include the cecum, ascending colon, and the transverse colon, while the distal ("left") region includes the splenic flexure, the descending colon, the sigmoid colon, and the rectum. This division is supported by the distinct embryonic ontogenesis of these regions, whose junction is two-thirds along the transverse colon, and also by the distinct arterial supply to each region. While the proximal large intestine develops from the embryonic midgut and is supplied by the superior mesenteric artery, the distal large intestine forms from the embryonic hindgut and is supplied by the inferior mesenteric artery (3). A comprehensive review of proximal/distal differences is provided in Ref. 29.
The longitudinal nature of the large intestine along the proximal-distal axis provides a rare opportunity for constructing a whole organ map of gene expression. Previous research suggests that there is a clear distinction between the gene expression patterns of proximal colonic tissues and distal colorectal tissues (7, 25, 33). While these findings support a broad model of gene expression difference, no studies have explored the detailed nature of the expression gradients of such genes. Given that the midgut and hindgut join near the splenic flexure during embryogenesis, the question is raised: Do differentially expressed genes exhibit an abrupt expression schism between the midgut- and hindgut-derived tissues, or does expression follow a gentle gradient along the proximal-distal axis?
To explore this question, this work investigates the gene expression patterns observed along the proximal-distal axis of the large intestine. By exploring these patterns in nonneoplastic tissues we aim to improve understanding of gene expression variation in healthy normal adults without the added complexity of neoplasia-related gene expression changes. We have built expression profile "maps" that identify individual genes whose expression appears to be location dependent, and we have described the nature of multigene expression variance longitudinally along the colon. We apply linear models to these maps to compare the embryology-consistent proximal vs. distal two-region model with a more gradual model based on continuously variable expression between the cecum proximally and rectum distally. Such gene expression maps of the normal adult colon will provide a foundation for improved understanding of gene expression variation in both the normal and diseased state.
| MATERIALS AND METHODS |
|---|
|
|
|---|
"Discovery" data set.
Gene expression and clinical descriptions for 184 colorectal tissue specimens were purchased from GeneLogic (Gaithersburg, MD). Individual tissue microarray data were selected with the following characteristics: nonneoplastic colorectal mucosa free of nonmucosa contaminating tissue (confirmed by histology) from otherwise healthy tissue specimens (i.e., no evidence of inflammation or other disease at the specimen site) with an anatomically identifiable site of resection designated as one of: cecum, ascending colon, descending colon, sigmoid colon, or rectum.
For each tissue selected from the GeneLogic database, we received electronic files of raw data containing a total of 44,928 probe sets (Affymetrix HGU133A and HGU133B, combined), experimental and clinical descriptors for each tissue, and digitally archived microscopy images of the histology preparations. Each data record was manually assessed for clinical consistency, and a sample of records was randomly chosen for histopathology audit using digitally archived histology images. A quality control analysis was performed to identify and remove array results not meeting essential quality control measures as defined by the manufacturer (1, 50).
Gene expression levels were calculated by both Microarray Suite (MAS) 5.0 (Affymetrix) and the robust multichip average (RMA) normalization techniques (1, 28, 30). MAS normalized data were used for performing standard quality control routines, and the final data set was normalized with RMA for all subsequent analyses. A list of GeneLogic sample IDs for the commercial microarray data used in this study is included as supplemental material.1
"Validation" data set.
The colorectal specimens in the validation set were collected from a tertiary referral hospital tissue bank in metropolitan Adelaide, Australia (Repatriation General Hospital and Flinders Medical Centre). The tissue bank and this project were approved by the Research and Ethics Committee of the Repatriation General Hospital, and patient consent was received for each tissue studied. Following surgical resection, specimens were placed in a sterile receptacle and collected from theatre. The time from operative resection to collection from theatre was variable but not more than 30 min. Samples, 125 mm³ (5 x 5 x 5 mm) in size, were taken from the macroscopically normal tissue as far from pathology as possible, defined both by colonic region as well as by distance either proximal or distal to the pathology. Tissues were placed in cryovials, then immediately immersed in liquid nitrogen and stored at –150°C until processing.
Frozen samples were processed by the authors using standard protocols and commercially available kits. Briefly, frozen tissues were homogenized using a carbide bead mill (Mixer Mill MM 300; Qiagen, Melbourne, Australia) in the presence of chilled Promega SV RNA Lysis Buffer (Promega, Sydney, Australia) to neutralize RNase activity. Homogenized tissue lysates for each tissue were aliquoted to convenient volumes and stored at –80°C. Total RNA was extracted from tissue lysates using the Promega SV Total RNA system according to the manufacturer's instructions, and integrity was assessed visually by gel electrophoresis.
To measure relative expression of mRNA transcripts, tissue RNA samples were analyzed using Affymetrix HG U133 Plus 2.0 GeneChips (Affymetrix, Santa Clara, CA) according to the manufacturer's protocols (2). Biotin-labeled cRNA was prepared using 5 µg (1.0 µg/µl) total RNA (1 µg mRNA) with the "One-Cycle cDNA" kit [incorporating a T7-oligo(dT) primer] and the GeneChip IVT labeling kit. In vitro transcribed cRNA was fragmented (20 µg) and analyzed for quality control purposes by spectrophotometry and gel electrophoresis prior to hybridization. Finally, a hybridization cocktail was prepared with 15 µg of cRNA (0.5 µg/µl) and hybridized to HG U133 Plus 2.0 microarrays for 16 h at 45°C in an Affymetrix Hybridization Chamber 640. Each cRNA sample was spiked with standard prokaryotic hybridization controls for quality monitoring.
Hybridized microarrays were stained with streptavidin phycoerythrin and washed with a solution containing biotinylated anti-streptavidin antibodies using the Affymetrix Fluidics Station 450. Finally, the stained and washed microarrays were scanned with the Affymetrix Scanner 3000.
The Affymetrix software package was used to transform raw microarray image files to digitized format. As for the Discovery set above, gene expression levels for the validation data set were calculated using MAS 5.0 (Affymetrix) for quality control purposes and with the RMA normalization algorithm for expression data. Finally, the data for the 19 microarrays used for validation in this publication have been deposited in the National Center for Biotechnology Information's Gene Expression Omnibus (GEO, http://www.ncbi.nlm.nih.gov/geo/) and are accessible through GEO Series accession number GSE9254.
Statistical Analysis
For all statistical analysis, we used open source software available from BioConductor for the R statistics environment (BioConductor, www.bioconductor.org) (23, 24).
Gene expression gradients were analyzed using three analytical techniques. First, we compared the gene expression variation of individual genes along the colon using univariate tests. Next, we further explored those particular genes exhibiting statistically significant expression differences with linear models to compare dichotomous (proximal vs. distal) expression change with a gradual (multisegment) model of change. Finally, we applied multivariate techniques to understand subtle genome-wide expression variance along the proximal-distal axis.
Individual Gene Expression Maps
Univariate differential expression.
Differentially expressed gene transcripts between the proximal and distal colon were identified using a moderated t-test implemented in the "limma" Bioconductor library (47). Significance estimates (P values) were corrected to adjust for multiple hypothesis testing using the conservative Bonferroni correction. The subset of tissues limited to the cecum vs. the rectum was similarly tested.
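As an illustration of the multiple-testing step only, the Bonferroni correction multiplies each raw P value by the number of tests and caps the result at 1. The following is a minimal sketch in Python rather than the R/Bioconductor code used for the analysis; the function name `bonferroni` and the example P values are hypothetical. If all 44,928 probe sets were tested, a corrected threshold of 0.05 would correspond to a raw P value of roughly 1.1 × 10⁻⁶.

```python
def bonferroni(p_values):
    """Return Bonferroni-adjusted P values: min(1, p * m) for m tests."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Hypothetical raw P values for four probe sets (illustrative only).
raw = [1e-7, 0.004, 0.03, 0.5]
adjusted = bonferroni(raw)

# A probe set is called differential if its adjusted P value is below 0.05.
significant = [p < 0.05 for p in adjusted]
```

The correction is conservative because it controls the probability of even one false positive across all tests, which suits a setting where a short, high-confidence gene list is preferred.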
Gene transcripts identified to be differentially expressed were also evaluated in the Validation specimens on a probe set-by-probe set basis using modified t-tests. To assess the significance of the total number of differential probe sets that were likewise differential in the validation data, the number of validated probe sets was compared with a null distribution estimated using a Monte Carlo simulation.
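The details of the Monte Carlo simulation are not given here, so the following Python sketch shows only one plausible construction of such a null distribution and should be read as an assumption: random probe set lists of the same size as the discovery list are drawn, and the overlap of each with the validation-differential probe sets is counted. The function `monte_carlo_p`, its parameters, and the toy numbers are all hypothetical.

```python
import random


def monte_carlo_p(n_total, n_selected, validated_flags, observed,
                  n_sim=10000, seed=0):
    """Estimate P(overlap >= observed) for randomly drawn probe set lists.

    validated_flags: list of bools of length n_total, True where a probe
    set is differential in the validation data (assumed input format).
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        # Draw a random list of the same size as the discovery list.
        sample = rng.sample(range(n_total), n_selected)
        overlap = sum(validated_flags[i] for i in sample)
        if overlap >= observed:
            hits += 1
    # Add-one estimate so the P value is never reported as exactly zero.
    return (hits + 1) / (n_sim + 1)


# Toy example: 2,000 probe sets, 40 differential in validation,
# a discovery list of 50, of which 10 were observed to validate.
flags = [i < 40 for i in range(2000)]
p_est = monte_carlo_p(2000, 50, flags, observed=10, n_sim=2000)
```

With this estimator the smallest reportable P value is 1/(n_sim + 1), so resolving a value below 10⁻⁵ would require on the order of 10⁵ or more simulated lists.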
Multisegment colon vs. two-segment colon model comparison.
To evaluate the nature of intersegment gene expression variation we analyzed differentially expressed probe sets for relative fit to linear models in a multisegment vs. a two-segment framework. The goal of this analysis is to explore whether the intersegment expression of probe sets that are known to be differentially expressed between the terminal ends of the large intestine is better modeled by a five-segment linear model that approximates a continual gradation or by a simpler, dichotomous "proximal" vs. "distal" model. As our data are only identified by colorectal segment designation and not by a continuous measurement along the length of the colon, we approximate the continuous model using the tissue segment location. We chose probe sets that are differentially expressed between the most terminal segments (cecum and rectum) to maximize the likelihood of identifying transcripts that vary along the proximal-distal axis of the colon.
We first modeled the expression of these probe sets along the proximal-distal axis of the colon using a five-factor robust linear model according to an indicator matrix defined by the colorectal segment for each tissue. For this model each tissue was assigned by biopsy location to one of: cecum, ascending, descending, sigmoid, or rectum. (For reasons described below, transverse tissues were not included in this analysis.) This five-segment model was then compared with a two-factor robust linear model with a design matrix corresponding to the theoretical proximal and distal regions of the colon. The same data were used for both model comparisons; however, for the two-segment model, the first factor (corresponding to the proximal tissues) included all of the tissues from the cecum and ascending colon, while the second factor (corresponding to the distal colon) included all tissues from the descending, sigmoid, and rectum segments.
When comparing these distinct models for each probe set, we used an F-test to evaluate the alternative hypothesis that the improved fit (reduced regression residual) provided by the more complex five-segment model was significantly better than the simpler two-segment model. A nonsignificant residual reduction indicates a failure to reject the null hypothesis: that there is no inherent value to adopting a more complex five-segment model over the simpler alternative.
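The nested-model comparison can be sketched as follows. This is a simplified Python illustration, not the analysis code: it fits each model by ordinary least squares group means instead of the robust linear models described above, and it returns only the F statistic and its degrees of freedom (the P value would come from the corresponding F distribution). The function names and toy expression values are hypothetical.

```python
PROXIMAL = {"cecum", "ascending"}


def rss_by_group(values, labels):
    """Residual sum of squares when each group is fitted by its own mean."""
    groups = {}
    for v, g in zip(values, labels):
        groups.setdefault(g, []).append(v)
    rss = 0.0
    for vals in groups.values():
        mean = sum(vals) / len(vals)
        rss += sum((v - mean) ** 2 for v in vals)
    return rss, len(groups)


def nested_f_statistic(values, segments):
    """F statistic for the five-segment vs. the two-segment model."""
    regions = ["proximal" if s in PROXIMAL else "distal" for s in segments]
    rss_two, k_two = rss_by_group(values, regions)
    rss_five, k_five = rss_by_group(values, segments)
    df_num = k_five - k_two          # extra parameters in the larger model
    df_den = len(values) - k_five    # residual degrees of freedom
    f_stat = ((rss_two - rss_five) / df_num) / (rss_five / df_den)
    return f_stat, df_num, df_den


# Toy data: two tissues per segment (illustrative values only).
segs = ["cecum", "cecum", "ascending", "ascending", "descending",
        "descending", "sigmoid", "sigmoid", "rectum", "rectum"]
step = [1.0, 1.2, 1.1, 0.9, 3.0, 3.2, 3.1, 2.9, 3.0, 3.1]      # dichotomous
gradient = [1.0, 1.2, 2.0, 2.2, 3.0, 3.2, 4.0, 4.2, 5.0, 5.2]  # gradual

f_step, df1, df2 = nested_f_statistic(step, segs)
f_grad, _, _ = nested_f_statistic(gradient, segs)
```

In this toy example the dichotomous probe set yields a small F statistic (the extra segment terms add little), while the gradual one yields a large F statistic. Under this simplification, 184 tissues in five segments would give 3 and 179 degrees of freedom.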
Multivariate Gene Expression Pattern Mapping
Supervised principal components analysis.
To visualize and explore the structure of expression variability at an organ level, we applied principal component analysis (PCA) and supervised PCA. Supervised PCA is similar to traditional PCA but uses only a subset of the features/genes (usually selected by some univariate means) to derive the principal components (4). We use the set of genes differentially expressed between the cecum and rectum as described above. All software for implementing supervised PCA was developed by us and is available on request. The algorithms for supervised PCA are coded in R.
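To make the distinction between PCA and supervised PCA concrete, the sketch below computes first-component scores from a chosen subset of columns. It is a hypothetical Python illustration and not the authors' R implementation: the function `first_pc_scores`, the power-iteration approach, and the toy matrix are assumptions introduced here.

```python
import math
import random


def first_pc_scores(matrix, selected, n_iter=200, seed=0):
    """Scores of each sample on the first principal component.

    matrix: list of rows (samples), each a list of feature values.
    selected: column indices to keep; passing a univariately selected
    subset is the "supervised" step, passing all columns is ordinary PCA.
    """
    sub = [[row[j] for j in selected] for row in matrix]
    n, p = len(sub), len(selected)
    means = [sum(row[j] for row in sub) / n for j in range(p)]
    centred = [[row[j] - means[j] for j in range(p)] for row in sub]

    # Power iteration on X^T X, started from a seeded random unit vector.
    rng = random.Random(seed)
    v = [rng.gauss(0.0, 1.0) for _ in range(p)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
    for _ in range(n_iter):
        scores = [sum(r[j] * v[j] for j in range(p)) for r in centred]
        w = [sum(centred[i][j] * scores[i] for i in range(n))
             for j in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:
            break
        v = [x / norm for x in w]
    return [sum(r[j] * v[j] for j in range(p)) for r in centred]


# Toy matrix: rows 0-2 proximal, rows 3-5 distal. Columns 0 and 1 depend
# on location; columns 2 and 3 vary strongly but independently of it.
toy = [
    [1.0, 5.0, 9.0, 2.0],
    [1.2, 4.9, 1.0, 8.0],
    [0.9, 5.1, 5.0, 5.0],
    [4.0, 2.0, 8.5, 2.5],
    [4.1, 1.8, 1.5, 7.5],
    [3.9, 2.1, 5.0, 5.0],
]
supervised = first_pc_scores(toy, [0, 1])
unsupervised = first_pc_scores(toy, [0, 1, 2, 3])
```

In the toy data the supervised scores separate proximal from distal rows by sign, whereas the first component computed from all columns is dominated by the two high-variance, location-independent columns. The sign of a principal component is arbitrary, so only the grouping is meaningful.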
| RESULTS |
|---|
|
|
|---|
The larger data set was analyzed to identify gene expression patterns, and the independently derived second expression set was used to validate these patterns. Thus, the first data set was mined for hypothesis generation, while the second set was used for hypothesis testing.
Discovery and validation data sets.
To construct the discovery set, 184 GeneChips hybridized to cRNA from nondiseased tissues meeting inclusion and quality assurance criteria were used for hypothesis generation. The tissues comprised segment subsets as follows: 29 cecum, 45 ascending, 13 descending, 54 sigmoid, and 43 rectum. For each tissue, 44,928 probe sets were background corrected and normalized using RMA preprocessing. The theoretical juncture between the proximal and distal colon is approximately two-thirds the length of the transverse colon measured from the hepatic flexure (3). As sample data were not specific for distance along the transverse colon, these tissues were excluded from the discovery analysis.
To construct the validation data set, 19 HG U133 Plus2.0 GeneChips were hybridized to labeled cRNA prepared from 8 proximal tissue specimens and 11 distal specimens from the hospital tissue bank. Due to stringent quality control parameters for tissue and GeneChip acceptability, this validation data set did not include sufficient tissues to explore multiple segment models. Each microarray measured transcript expression for 54,675 probe sets.
Gene Variation Along the Colon
Individual gene expression changes.
UNIVARIATE DIFFERENTIAL EXPRESSION.
To explore the "natural" dividing point between the anatomical segments of the colon, we measured the absolute number of significant probe set expression differences by modified t-test when the hypothetical "divide" was moved stepwise from cecum to rectum. Figure 1 shows the number of probe sets that were differentially expressed for each intersegment divide. The maximum number of probe set differences, 206, occurs when the proximal and distal regions are divided between the ascending and descending segments. As this dividing point is consistent with both our understanding of embryonic development and the usual separation of the proximal and distal segments, the following comparisons of proximal and distal tissues were based on this division.
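The stepwise scan of candidate dividing points can be sketched as below. This is a simplified Python illustration under stated assumptions: it uses a plain Welch t statistic with a fixed cutoff in place of the moderated t-test with Bonferroni correction, and the function names, cutoff, and toy expression values are hypothetical.

```python
import math

SEGMENTS = ["cecum", "ascending", "descending", "sigmoid", "rectum"]


def welch_t(a, b):
    """Welch t statistic for two samples (each needs at least 2 values)."""
    mean_a, mean_b = sum(a) / len(a), sum(b) / len(b)
    var_a = sum((x - mean_a) ** 2 for x in a) / (len(a) - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (len(b) - 1)
    return (mean_a - mean_b) / math.sqrt(var_a / len(a) + var_b / len(b))


def scan_divides(expression, segments, cutoff):
    """Count probe sets with |t| > cutoff for each candidate divide.

    expression: dict of probe set name -> values, one per tissue.
    segments: segment label for each tissue, in the same order.
    """
    counts = {}
    for i in range(1, len(SEGMENTS)):
        proximal = set(SEGMENTS[:i])
        label = SEGMENTS[i - 1] + "|" + SEGMENTS[i]
        n_diff = 0
        for values in expression.values():
            a = [v for v, s in zip(values, segments) if s in proximal]
            b = [v for v, s in zip(values, segments) if s not in proximal]
            if abs(welch_t(a, b)) > cutoff:
                n_diff += 1
        counts[label] = n_diff
    return counts


# Toy data: two tissues per segment, three hypothetical probe sets.
segs = ["cecum", "cecum", "ascending", "ascending", "descending",
        "descending", "sigmoid", "sigmoid", "rectum", "rectum"]
expr = {
    "up_distal": [1.0, 1.2, 1.1, 0.9, 3.0, 3.2, 3.1, 2.9, 3.0, 3.1],
    "up_proximal": [5.0, 5.1, 4.9, 5.2, 2.0, 2.1, 1.9, 2.2, 2.0, 1.8],
    "flat": [2.0, 2.3, 1.9, 2.1, 2.2, 1.8, 2.0, 2.1, 1.9, 2.2],
}
counts = scan_divides(expr, segs, cutoff=10.0)
best = max(counts, key=counts.get)
```

For these toy values the count peaks at the ascending|descending divide, mirroring the way the candidate divide with the most differential probe sets is taken as the "natural" boundary.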
A total of 206 probe sets, corresponding to 154 known gene targets, were expressed at higher levels in either the proximal or distal colorectal samples compared with the complementary region (Bonferroni corrected P < 0.05). Of these 206 probe sets, 31 (15.0%) were also differentially expressed in the validation data (31/206, P < 10⁻⁵ by Monte Carlo estimation). To further explore differential expression in the discovery set, we identified those transcripts that were different between the most terminal ends of the large bowel. A total of 115 probe sets were differentially expressed between tissues selected only from the cecum (n = 29) and the rectum (n = 43); 102 (89%) of these probe sets were included in the 206 probe sets differing between proximal and distal colon described above. In this subset, 28 probe sets (24.3%) were likewise differentially expressed in the rectum vs. the cecum in the validation data (28/115, P < 10⁻⁵ by Monte Carlo estimation). All 28 of these consistent probe sets were included in the 31 consistent probe sets between the distal and proximal regions.
Differentially expressed probe sets and difference statistics for probe sets with elevated expression in proximal (94) and distal (126) tissues are shown in Tables 1 and 2, respectively.
To explore the nature of these gene transcript expression changes, we built and compared robust linear models fitted to the expression data based on location for each tissue sample. Two robust linear models of univariate probe set expression were compared for each of the 115 probe sets differentially expressed between the two terminal segments of the large intestine, the cecum and rectum. In particular, we queried whether the expression of those transcripts that were differentially expressed between these terminal segments was better explained (in terms of residual fit) by a simple two-segment model or by the more descriptive five-segment model.
For 65 (57%) of the 115 differentially expressed probe sets, the analysis failed to reject the null hypothesis that the more complex model does not significantly improve model fit to the observed gene expression data (F-test, P > 0.05). Thus, more than half of these differentially expressed transcripts along the colon are satisfactorily modeled by the two-segment expression model, whereby expression is dichotomous and defined by proximal vs. distal location. The most differentially expressed probe set between the cecum and rectum is the transcript for PRAC. A comparison of the two-segment and multisegment models for this transcript is shown in Fig. 2, which is typical of other genes in this category (data not shown).
PCA AND SUPERVISED PCA. We analyzed the full 44,928 probe sets of the discovery data set using PCA. The first two dimensions of this analysis are shown in Fig. 4A. Inspection of this low dimension perspective yields no obvious structure within the data that is consistent with tissue segment. This analysis suggests that the major sources of gene expression variation (i.e., the first two principal components) measured across all genes are not dependent on tissue location.
| DISCUSSION |
|---|
|
|
|---|
We have identified 206 probe sets, corresponding to 154 unique gene targets, that are differentially expressed between the normal proximal and normal distal large intestine regions in human adults. A subset of 115 probe sets (89% common to the proximal vs. distal list) is likewise differentially expressed between the terminal colorectal segments of the cecum and rectum. Interestingly, we found no transcripts that were expressed significantly differently between any two adjacent segments. To estimate the validity of these findings, we have also measured the expression change of these gene transcripts in an independent set of microarray data. Thirty-one of the 206 differentially expressed probe sets in our initial discovery data set of 184 colorectal tissue samples were also differentially expressed in the validation data of 19 specimens.
Nearly all (28/31, 90%) of these "validated" transcripts were likewise differentially expressed between the two terminal segments of the cecum and rectum.
Some of the gene transcripts that we describe herein were previously identified to be differentially expressed by microarray analysis using a variety of cDNA and oligonucleotide microarrays (7, 25, 33). Five of the gene targets of differential probe sets we found were previously identified in two or more of these earlier studies: HOXB13, NR1H4, S100P, SCNN1B, and SIAT4C. Each of these probe sets was also shown to be statistically different (i.e., HOXB13, SIAT4C: P < 0.065) in our validation data set. An additional 33 probe set target genes of the 206 probe sets we present here were previously identified to be differentially expressed along the colon in at least one of these earlier studies.
We identified an additional 28 probe sets that were differential in both our discovery data and our independent validation data but were not reported in the previous reports. In total, 57 of 154 (37%) gene targets corresponding to the 206 probe sets were confirmed to be differentially expressed between the proximal and distal regions in the validation set. The agreement of our work with earlier studies and with the independent validation set adds credibility to the results, especially given the potential for concern about microarray reproducibility between and within data collection platforms (39). Our analysis has also identified 28 new probe sets of relevance to mapping.
Differential Transcript Expression for Individual Genes
The most significantly differential probe set we observed in our discovery data was against the gene transcript for PRAC, previously described as specifically expressed in prostate, the distal colon and rectum (37). Our data agree with the earlier findings that the probe set for PRAC is highly expressed in the distal colon relative to the proximal tissues. This observation was confirmed by RT-PCR (Supplementary Figure), where essentially no expression was seen in proximal tissues. Furthermore, PRAC appears to be expressed in a low-high pattern along the colon with a sharp expression change occurring between the ascending and descending colorectal specimens.
We found eight probe sets corresponding to seven HOX genes to be differentially expressed between the proximal and distal colon. The 39 members of the mammalian homeobox gene family consist of highly conserved transcription factors that specify the identity of body segments along the anterior-posterior axis of the developing embryo (27, 35). The four groups of HOX gene paralogs are expressed in an anterior-to-posterior sequence, e.g., from HOXA1 to HOXB13 (40). The expression patterns in our data for these eight probe sets are consistent with the expected pattern: lower numbered HOX genes are expressed higher in the proximal tissues (HOXD3, HOXD4, HOXB6, HOXC6, and HOXA9), while the higher numbered genes are more highly expressed in the distal colon (HOXB13 and HOXD13). Elevated expression of HOXB13 in the distal colon was confirmed by RT-PCR (Supplementary Figure). These results are also consistent with examples of specific HOX expression in the literature, such as studies that demonstrate HOXD13 involvement in the development of the anal sphincter in mice (34).
We also report, however, the conspicuous absence in our findings of some gene transcripts that have been previously shown to be differentially expressed along the proximal-distal axis. Our data do not demonstrate a significant expression gradient for the caudal homeobox genes CDX1 or CDX2, transcription factors that have been shown to be involved in intestine pattern development across a range of vertebrates (13, 31, 45). In particular, CDX2 is considered to play a role in maintaining the colonic phenotype in the adult colon and was shown to be present at relatively high concentrations in the proximal colon but absent in the distal colon (31, 45). Neither statistical analysis nor visual inspection of probe set expression for this gene suggests differential expression along the colon in our data (data not shown). Analysis by RT-PCR of a subset of RNA samples from the validation set supported the array data in that expression of CDX2 in the distal colon was equivalent to or greater than in proximal samples (Supplementary Figure).
We observed significant differential transcript expression for a number of the solute-carrier transport genes that can be rationalized based on our current understanding of colorectal physiology. While probe set expression for SLC2A10, SLC13A2, and SLC28A2 is higher in the distal colon, the solute carrier family members SLC9A3, SLC14A2, SLC16A1, SLC20A1, SLC23A3, and SLC37A2 are higher in the proximal tissues. These data support the findings of Glebov et al. (25), including for the Na-dependent dicarboxylic acid transporter member 2 (SLC13A2), which is expressed higher distally, and for the monocarboxylic acid transporter family member 1 (SLC16A1, alias MCT1), which is higher in the proximal tissues. This expression of SLC16A1/MCT1 is consistent with evidence that the short chain fatty acid butyrate, which is most abundant in the proximal gut (38), may regulate SLC16A1/MCT1 expression by both transcriptional control and transcript stabilization (16).
Our results show that probe sets against all three of the five members of the chromosome 7q22 cluster of membrane-bound mucins previously believed to be expressed in colon, MUC11, MUC12, and MUC17, are differentially expressed at higher levels in the distal gut (10, 26, 49). We also confirmed this differential expression pattern for MUC12 and MUC17 in the independent validation data. Previous reports have raised the question of whether the genomic sequences for MUC11 and MUC12 are from closely related genes or perhaps even the same gene (10). Correlation analysis of MUC11 and MUC12 probe sets shows a strong, positive correlation at the lower end of the probe set expression range with a weaker correlation as expression increases (data not shown). This correlation profile could be due to increased variability at higher expression levels or, possibly, because the expression levels in the distal colon (where they are higher) reflect a distinct transcriptional control. Differences in mucin glycoprotein characteristics between the proximal and distal gut, including the degree of sulfation, were demonstrated 30 years ago (5, 20).
In addition, while previous research has suggested that the secreted, gel-forming mucin MUC5B is only weakly expressed in the colon (10), our results show that probe sets reactive to this transcript are expressed higher in the distal colon, as for the membrane-bound mucins. Our data also support earlier reports that transcripts for the estrogen responsive element known as trefoil factor 1 (TFF1, alias pS2) are differentially expressed higher in the distal colon (46).
Many of the expression patterns we report here for humans have been shown to be similarly patterned in the gastrointestinal tracts of rodent models. However, a number of specific genes previously shown to be differentially expressed along the large intestines of mice and rats were not found to be so expressed by us. Such gene transcript targets include solute carrier family 4 member 1 (alias AE1) (44) and Toll-like receptor 4 (TLR4) (41). For TLR4 no significant difference in expression between proximal and distal human samples was seen by RT-PCR in agreement with the microarray data (Supplementary Figure). Using a commercially available RT-PCR assay we were unable to detect SLC4A1 mRNA in any of our validation set including, carbonic anhydrase IV (21). On the other hand, our data are in agreement with earlier studies of expression of aquaporin-8 (AQP8), a gene whose expression product is suspected to be involved in water absorption in the normal rat colon (11). We observe that AQP8 is significantly expressed to a higher level in the proximal human colon compared with the distal tissues (P < 0.006, data not shown).
The family of claudin tight junction proteins may also play a role in maintaining the water barrier integrity in the colon (32). We found that claudin-8 (CLDN8) is more highly expressed in the distal colorectal tissues, and this observation was supported by RT-PCR analysis (Supplementary Figure). Conversely, claudin-15 (CLDN15), which is also believed to be localized in the tight junction fibrils, was expressed at a higher level in the proximal colorectal tissues (15).
Nature of Gene Expression Change Along the Colon
While one goal of this work was to understand which gene transcripts are differentially expressed along the colon, a second aim was to explore the nature of these expression changes along the proximal-distal axis in region or segment-specific detail.
We observed two broad patterns of statistically significant transcript expression change along the colorectum. The major pattern is described by those 65 probe sets that were well fitted by a two-segment expression model. We suggest that the expression of these transcripts is dichotomous in nature: elevated in the proximal segments and decreased in distal segments, or vice-versa.
Such data are consistent with the conventional anatomical view that the "natural" divide between the proximal and distal colon occurs between the ascending and descending colon. This finding is contrary to a recent report by Komuro et al. (33) that a breakpoint between the descending and sigmoid colon yields the largest differential expression. However, we note that in addition to analyzing this pattern in colorectal cancer specimens (we used nondiseased tissues only), Komuro et al. also chose to include the transverse colon in their analysis. We intentionally excluded tissues from that segment to avoid the possible confounding effect related to the predicted midgut-hindgut junction point approximately two-thirds the length of the transverse colon.
A second set of 50 probe sets does not display a dichotomous change but rather shows a significant improvement in fit when the expression data were applied to a five-segment model supporting a more gradual expression gradient moving along the colon from the cecum to the rectum.
These two characteristic expression patterns hint that gene expression along the proximal-distal axis is perhaps coordinated by two underlying systems of organization.
The majority of differentially expressed transcripts in the adult normal tissues measured here are expressed in a pattern that is consistent with a midgut vs. hindgut pattern of embryonic development. Furthermore, multivariate methods including supervised PCA and canonical variate analysis (data not shown) also suggest that the primary source of variation among these data is explained by the proximal vs. distal divide. In a recent study Glebov et al. (25) found that the number of genes differentially expressed between the ascending and descending colon in the adult is substantially larger than the number of genes likewise identified in 17- to 24-wk-old fetal colons. Glebov et al. hypothesize that the gene expression pattern of the adult colon is possibly set concurrently with expression of the adult colonic phenotype at 30 wk gestation or perhaps even in response to postnatal luminal contents of the gastrointestinal tract. While we did not explore gene expression in the fetal colon, we observe patterns of expression in the adult that support a proximal-distal expression model consistent with the midgut-hindgut embryonic origins.
Most (41 of 50) of those transcripts that exhibit a gradual expression change between the cecum and rectum exhibit a prototypical pattern of expression increasing from the cecum to the rectum. This pattern is not observed in the midgut-hindgut differential transcripts, where the number of transcripts elevated proximally is approximately equal to the number elevated in the distal region. We propose that the characteristic distally increasing pattern in those transcripts could be a function of extrinsic factors, in contrast with the intrinsically defined midgut-hindgut pattern. Such factors could include the effect of luminal contents that move in a unidirectional manner from the cecum to the rectum and/or the regional changes in microflora along the large intestine. Further work will be required to investigate whether such extrinsic controls act by inducing transcriptional activity or by reducing transcriptional silencing.
Gene Expression Changes in Concert Along the Colon
To explore the expression of genes in concert along the colon, we also apply PCA to these expression data. There is strong evidence for a proximal vs. distal gene expression pattern with these multivariate visualization techniques. Though multivariate results do not exclude a subtle proximal-distal gradient, the apparent bimodal nature of the multivariate plots suggests that the major source of expression variation in these tissues is consistent with a midgut- vs. hindgut-derived pattern.
Conclusions
Our work indicates that transcript abundance, and perhaps transcriptional regulation, follows two broad patterns along the proximal-distal axis of the large intestine. The dominant pattern is a dichotomous expression pattern consistent with the midgut-hindgut embryonic origins of the proximal and distal gut. Transcripts that follow this pattern are roughly equally split into those that are elevated distally and those elevated proximally. The second pattern we observe is characterized by a gradual change in transcript levels from the cecum to the rectum, with most of these transcripts exhibiting increasing expression toward the distal tissues. We propose that transcripts that exhibit the dichotomous midgut-hindgut pattern are likely to reflect the intrinsic embryonic origins of the large intestine, while those that exhibit a gradual change reflect extrinsic factors such as luminal flow and microflora changes. Taken together, these patterns constitute a gene expression map of the large intestine. This is the first such map of an entire human organ.
| GRANTS |
|---|
|
|
|---|
| ACKNOWLEDGMENTS |
|---|
| FOOTNOTES |
|---|
Article published online before print. See web site for date of publication (http://physiolgenomics.physiology.org).
1 The online version of this article contains supplemental material.
| REFERENCES |
|---|
|
|
|---|