Physiol. Genomics 25: 294-302, 2006. First published January 31, 2006; doi:10.1152/physiolgenomics.00168.2005
1094-8341/06 $8.00
Received 12 July 2005; accepted in final form 13 January 2006.
Physiological Genomics 25:294-302 (2006)
1094-8341/06 $8.00 © 2006 American Physiological Society

Mapping cis-acting regulatory variation in recombinant congenic strains

Peter D. Lee1,3,4, Bing Ge1,2, Celia M. T. Greenwood5,6, Donna Sinnett1, Yannick Fortin1, Sebastien Brunet1, Anny Fortin7, Marina Takane4, Emil Skamene2,3,7, Tomi Pastinen1, Michael Hallett4, Thomas J. Hudson1,2,3 and Robert Sladek1,2,3

1 McGill University and Genome Quebec Innovation Centre, Montreal, Quebec
2 Research Institute of McGill University Health Centre, Montreal, Quebec
3 Department of Human Genetics, Faculty of Medicine, McGill University, Montreal, Quebec
4 McGill Centre for Bioinformatics, Montreal, Quebec
5 Program in Genetics and Genomic Biology, Hospital for Sick Children, Toronto, Ontario
6 Department of Public Health Sciences, University of Toronto, Toronto, Ontario
7 Emerillon Therapeutics, Montreal, Quebec, Canada


    ABSTRACT
 
We present an integrated approach for the enriched detection of genes subject to cis-acting variation in the mouse genome. Gene expression profiling was performed with lung tissue from a panel of recombinant congenic strains (RCS) derived from A/J and C57BL/6J inbred mouse strains. A multiple-regression model measuring the association between gene expression level, donor strain of origin (DSO), and predominant strain background identified over 1,500 genes (P < 0.05) whose expression profiles differed according to the DSO. This model also identified over 1,200 genes whose expression showed dependence on background (P < 0.05), indicating the influence of background genetic context on transcription levels. Sequences obtained from 1-kb segments of 3'-untranslated regions identified single nucleotide polymorphisms in 64% of genes whose expression levels correlated with DSO status, compared with 29% of genes that displayed no association (P < 0.01, Fisher exact test). Allelic imbalance was identified in 50% of genes positive for expression-DSO association, compared with 22% of negative genes (P < 0.05, Fisher exact test). Together, these results demonstrate the utility of RCS mice for identifying the roles of proximal genetic determinants and background genetic context in determining gene expression levels. We propose the use of this integrated experimental approach in multiple tissues from this and other RCS panels as a means for genome-wide cataloging of genetic regulatory mechanisms in laboratory strains of mice.

allelic imbalance; gene networks; cis-acting regulatory variants


    INTRODUCTION
 
THE APPLICATION OF GENE EXPRESSION profiling to model systems with controlled genetic variation provides new avenues for exploring the genetic basis of complex phenotypes and gene regulatory mechanisms (5, 11). Because regulatory variants are known to affect phenotypic end points and are believed to account for a large proportion of changes contributing to the evolution of complex traits (6, 26), this integrated approach brings the resolution of genetic mapping studies closer to the level of biochemical mechanisms and provides a system for linking gene expression changes to specific genetic variants. In higher mammals, gene expression profiles have been shown to be highly heritable (9, 34), mirroring the evolutionary relationships between species (16) as well as displaying variation between subpopulations of species (40).

Among the major goals shared by genetic mapping strategies and gene expression profiling is the determination of the gene regulation mechanisms believed to underlie complex traits. Recent studies have demonstrated the utility of mapping gene expression traits to identify loci with relevance to physiological phenotypes (21, 49). Studies demonstrating the heritability of gene expression and mapping expression traits (4), as well as studies demonstrating the impact of genetic regulatory variants in human disease (34, 35, 49), further underscore the importance of characterizing gene regulation on a genome-wide scale.

The integration of gene expression profiling and genetic approaches offers the possibility of identifying regulatory variants affecting transcription on a genome-wide scale. Recent studies mapping expression traits aim to classify genes subject to cis- or trans-acting regulation (7, 8). Trans-acting regulatory variation involving variants located at a distance from the affected locus is believed to account for the majority of significant differences in gene expression and may include the spectrum of protein-DNA and protein-protein interactions known to influence transcription. In contrast, cis-acting regulatory variation, located on the same chromosome as the regulated gene, is estimated to affect at least 30% of genes differentially expressed between individuals (6) and can be detected by measuring allele-specific transcription ratios in heterozygous individuals with intragenic single nucleotide polymorphisms (SNPs) (59). When levels of the two transcripts are compared under identical cellular contexts in vivo, deviation from the expected 50-to-50 ratio of alleles is known as allelic imbalance (AI) and can indicate the presence of cis-acting regulatory variants (42).

Here we present an approach that applies expression profiling and AI assays to a panel of recombinant congenic strains (RCS) to detect genes subject to cis-acting variation (Fig. 1). RCS are derived by backcrossing and inbreeding two mouse strains, generating a panel of strains that are homozygous at every locus and that contain variable congenic segments from the genome of the donor strain (averaging 12%) on that of the background strain (14). A RCS panel of sufficient size (>20 strains) ensures that the majority (>90%) of genes are contained on donor congenic segments. RCS panels were initially designed as an experimental system for separating loci involved in multigenic traits, permitting each locus to be analyzed separately (52). The increase in mapping efficiency offered by RCS has led to identification of candidate genes (17) as well as multilocus interaction effects (55). We sought to determine whether RCS could be used to characterize the influence of genetic variation on gene regulation by studying a RCS panel derived from the reciprocal backcrossing of A/J and C57BL/6J parental inbred strains (18). These two strains have been widely studied as models for complex phenotypes including airway hyperresponsiveness (13), acute lung injury (43, 44), and lung cancer (2). RCS provide a particular advantage in detecting regulatory relationships as they enable one to assess the simultaneous influence of background genetic context on the association between expression phenotypes and cis-acting genetic variants (36). We obtained gene expression profiles for lung tissue obtained from the RCS panel to measure the association between transcript levels and donor strain of origin (DSO), taking into account the contribution of the predominant background strain. To validate these observations, we sequenced genes identified by expression analysis to characterize their SNP content and measured allelic imbalance in F1 offspring of an A/J x C57BL/6J cross.


 
Fig. 1. Overview of analysis method. Simple sequence-length polymorphisms (SSLPs) and oligonucleotide probe sets were aligned to the University of California-Santa Cruz Feb 2003 mouse genome assembly. Probe sets flanked by SSLPs originating from the same parental strain were assigned the same donor strain of origin (DSO) according to previously acquired genotype data (18). Probe sets falling between adjacent markers of differing DSO were assigned an unknown DSO status because of the lack of precise location of the recombination site. Analysis for association between DSO and gene expression was accomplished by an ANOVA with validation by resequencing and allelic imbalance.

 

    MATERIALS AND METHODS
 
RCS panel.
Tissues used in these studies were obtained from a RCS panel derived from A/J and C57BL/6J parents (18). The mice used in this study were part of a 600-animal experiment that took place over a 4-mo period. The mice used in the F1 validation study were reared in the same animal facility (with the same diet and housing conditions). All animals were killed at 16 wk of age, and tissues were harvested according to a standard dissection protocol. All mice were handled according to guidelines and regulations of the Canadian Council on Animal Care under protocols approved by the Animal Care Committee of the McGill University Health Centre Research Institute. Genotyping was previously performed with 621 simple sequence-length polymorphisms (SSLPs), providing an average distance between markers of 2.6 cM. Of the 26,746 genotypes used to characterize the RCS panel, 25,571 results agreed between replicates, 520 results disagreed between replicates (where one result was definite and the other undefined), 19 results were contradictory between replicates (showing results from opposing parental origins for the same locus), and 33 results were undetermined (18). When the replicate genotypes disagreed or were contradictory, the corresponding loci were classified as "undetermined." Because undetermined loci were dispersed throughout the data set (across all markers and strains), their overall effect on the analysis was minimal. The RCS strain assignment of all mice used in this study was verified with genotypes obtained for a minimal set of the original markers.

Microarray studies.
The mice were housed with free access to food and water under conditions of 12 h of darkness and 12 h of light. After an overnight fast, the animals were killed and the lung tissues were rapidly dissected and frozen in liquid nitrogen. Tissue samples were obtained from two adult male mice for each strain in the RCS panel. The samples were homogenized in TRIzol reagent, and RNA was prepared according to the manufacturer's instructions. cRNA probes prepared from 10 µg of total RNA were hybridized to Affymetrix MGU74Av2 oligonucleotide arrays as described previously (39). Expression data were summarized with the robust multiarray average (RMA) method and quantile normalized with Bioconductor (http://www.bioconductor.org).

Data analysis.
Probe target sequences from the Affymetrix MGU74Av2 oligonucleotide array were aligned against the February 2003 build of the NCBI mouse genome, using BLAT with default parameters (25). Of the 12,422 probe sets on the chip, 11,293 were localized uniquely. Physical positions of SSLPs from the RCS genotype data were retrieved by aligning marker sequences to the same assembly, using BLAST to identify all primer matches with intervening sequences of <500 bp (wordsize = 12, p = 0.90, e = 0.1) and iteratively relaxing matching parameters in the event of no match. SSLPs that were assigned to more than one genomic location were not used in subsequent analyses. This procedure matched 566 of 621 (90%) markers used to genotype the strains. We inferred DSO for each gene based on the DSO of surrounding SSLPs; genes flanked by markers of identical DSO were assigned the same DSO. If flanking markers differed, which would indicate a recombination event, the intervening genes were assigned an unknown DSO status because of the uncertainty as to the position of the recombination site. Genes at the ends of chromosomes were assigned the DSO of the closest SSLP. As expected, the rate of unknown DSO assignment correlates with the number of recombined segments, indicating that higher frequencies of recombination decreased the amount of data included in our analysis. This still permitted assignment of DSO to >90% of the probe sets on the array over all strains.
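The flanking-marker rule described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code used in the study: the function name `infer_dso`, the marker tuples, the positions, and the labels "A" (A/J), "B" (C57BL/6J), and "U" (undetermined genotype) are all hypothetical. Treating a gene next to an undetermined marker as unknown is also an assumption.

```python
from bisect import bisect_left

def infer_dso(gene_pos, markers):
    """Infer donor strain of origin (DSO) for a gene on one chromosome.

    markers: list of (position, label) tuples sorted by position, where
    label is "A", "B", or "U" (hypothetical coding, see lead-in).
    """
    positions = [pos for pos, _ in markers]
    i = bisect_left(positions, gene_pos)
    if i == 0:
        # Gene lies before the first marker: use the closest SSLP.
        flank = [markers[0][1]]
    elif i == len(markers):
        # Gene lies beyond the last marker: use the closest SSLP.
        flank = [markers[-1][1]]
    else:
        # Gene lies between two markers: both must agree.
        flank = [markers[i - 1][1], markers[i][1]]
    if len(set(flank)) == 1 and flank[0] in ("A", "B"):
        return flank[0]
    # Flanking markers differ (recombination) or a genotype is undetermined.
    return "unknown"

# Hypothetical markers on one chromosome (positions in arbitrary units).
example_markers = [(10.0, "A"), (20.0, "A"), (30.0, "B"), (40.0, "B")]
example_calls = [infer_dso(p, example_markers) for p in (5.0, 15.0, 25.0, 35.0)]
```

In this toy example the gene at position 25.0 falls between an "A" and a "B" marker, so it is left unassigned, mirroring the treatment of intervals that contain a recombination site.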

ANOVA was conducted per gene assuming independence of loci, using the linear model expression ~DSO + BG + DSO*BG, where background (BG) was calculated for each strain as the ratio of total donor segment lengths over the entire genome with DSO from one parent over the other. Positive association between expression and DSO was assigned if P < 0.05 for DSO and P > 0.05 for BG. All analyses were conducted with R (http://www.r-project.org).
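To make the structure of the model expression ~DSO + BG + DSO*BG concrete, the following sketch builds its design matrix for a single gene and fits the coefficients by ordinary least squares. It is illustrative only: the analysis in this study was run in R, and the 0/1 coding of DSO, the numeric BG values, the simulated expression values, and the helper names `design_row` and `fit_ols` are assumptions. The sketch stops at the coefficients and does not compute the per-term P-values used for classification.

```python
def design_row(dso, bg):
    """Predictors for one sample: intercept, DSO, BG, and DSO*BG interaction."""
    return [1.0, float(dso), float(bg), float(dso) * float(bg)]

def fit_ols(rows, y):
    """Solve the normal equations (X'X) b = X'y by Gauss-Jordan elimination."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    aug = [xtx[i] + [xty[i]] for i in range(k)]
    for col in range(k):
        # Partial pivoting: move the largest remaining entry onto the diagonal.
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [a - factor * b for a, b in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

# Hypothetical data for one gene: (DSO, BG) per sample, DSO coded 0 or 1.
samples = [(d, b) for b in (0.10, 0.15, 0.85, 0.90) for d in (0, 1)]
rows = [design_row(d, b) for d, b in samples]
# Simulated expression with a DSO effect, a small BG effect, no interaction.
expression = [8.0 + 1.5 * d + 0.5 * b for d, b in samples]
coef = fit_ols(rows, expression)  # [intercept, DSO, BG, DSO*BG]
```

A gene would be called positive for DSO association when the test on the DSO coefficient gives P < 0.05 while the BG term does not; obtaining those P-values requires the F or t distribution, which a statistics package such as R provides.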

Validation studies.
For SNP discovery, we randomly selected 50 genes that were positive for DSO association and 80 genes that showed no DSO association. The random selection was performed with a subset of transcripts whose mean expression over all strains exceeded 500 MAS 5.0 units. This restriction was imposed to ensure that the selected loci could be properly studied in our validation experiments. In addition, we excluded genes with documented alternative splicing in the Alternative Splicing Database (Ref. 53; http://www.ebi.ac.uk/asd/), as well as complex loci that had overlapping transcripts or reversed overlapping transcripts listed in the University of California-Santa Cruz Genome Browser database (23). For each gene, we sequenced 1 kb of 3' untranslated region (UTR) in A/J and C57BL/6J genomic DNA. Primers for resequencing were designed with Primer3.0 (48) set at default parameters.

AI was measured with tissue obtained from five adult male mice obtained from a F1 A/J x C57BL/6J cross. Lung tissue was harvested at 13 wk, and RNA and genomic DNA (gDNA) were extracted with TRIzol reagent. Sequence reactions were performed with 20 ng of cDNA prepared with reverse transcriptase or 20 ng of gDNA for all SNP-containing genes with methods previously described (42). SNP peak heights in the cDNA and gDNA were compared with PeakPicker (http://www.genome.mcgill.ca/BingGe/PeakPicker/), which normalizes SNP peak heights against those of the surrounding sequence. This method can detect differences in allelic expression >1.2-fold (20). In addition, we independently tested the sensitivity of the sequence-based method to detect AI equally in genes from positive and negative sets, both with and without AI, by analyzing serial dilutions of F1 gDNA with gDNA from either parental strain. Dilutions were prepared corresponding to ratios of 55:45, 65:35 and 85:15 for each allele. Genes selected at random for this analysis included those from all possible combinations of DSO-expression association and AI results. Although this analysis assumes similarity between the behavior of the sequencing assay in cDNA and gDNA, serial dilutions confirmed sensitivity of the technique to allelic ratios of at least 1.2-fold for the majority of genes tested. This threshold represents a lower limit because dilutions of <55:45 (representing an allele ratio of 1.2) were not tested. This analysis therefore remains unable to assess the extent of false-negative results. AI was determined with a paired Student's t-test comparing peak height ratios of the two alleles in gDNA vs. cDNA for five replicate F1 mice. Genes were said to display AI if a two-tailed Student's t-test comparing peak ratios exceeded significance of P < 0.05 for at least one SNP found within 1 kb of 3' UTR. 
Association of DSO expression analysis results with the SNP frequency or the AI frequency was done in R with a one-tailed Fisher exact test and logistic regression models: AI ~ PVAL, and SNP ~ PVAL.
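The paired comparison of gDNA and cDNA peak ratios across five F1 mice can be sketched as follows. This is a simplified illustration, not PeakPicker or the authors' analysis code: the function names, the use of an allele fraction as the peak ratio, and the example values are hypothetical. Instead of computing a P-value, the sketch compares the t statistic with 2.776, the two-tailed 5% critical value of Student's t with 4 degrees of freedom, which applies only to five paired replicates.

```python
from math import sqrt
from statistics import mean, stdev

# Two-tailed 5% critical value of Student's t with 4 degrees of freedom
# (valid for exactly five paired replicates).
T_CRIT_DF4 = 2.776

def paired_t(gdna, cdna):
    """Paired t statistic for per-mouse allele fractions in cDNA vs. gDNA."""
    diffs = [c - g for g, c in zip(gdna, cdna)]
    return mean(diffs) / (stdev(diffs) / sqrt(len(diffs)))

def shows_ai(gdna, cdna, t_crit=T_CRIT_DF4):
    """True when the cDNA allele fraction differs from gDNA at P < 0.05."""
    return abs(paired_t(gdna, cdna)) > t_crit

# Hypothetical allele fractions (one allele's share of the two peak heights)
# for one SNP measured in five F1 mice.
gdna_fraction = [0.50, 0.51, 0.49, 0.50, 0.52]
cdna_fraction = [0.60, 0.62, 0.58, 0.61, 0.63]
has_ai = shows_ai(gdna_fraction, cdna_fraction)
```

Using gDNA from the same animal as the reference absorbs assay-specific bias in peak heights, which is why the comparison is paired rather than made against a fixed 50:50 expectation.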


    RESULTS
 
We studied 30 strains from a RCS panel derived from the reciprocal backcrossing of A/J and C57BL/6J inbred strains (18). We used genotype data previously obtained for 621 SSLPs to infer DSO, defined as the parental strain from which a congenic segment derives, for the probe sets on the MGU74Av2 oligonucleotide array. We assigned "undetermined" status to 6.8% of the probe sets, with the total number of probe sets with undetermined DSO status ranging from 220 (BcA82) to 1,503 (BcA83) and with a maximum of 366 probe sets on a single chromosome with undetermined DSO (BcA74, on chromosome 8). The largest stretch of genes affected was on chromosome 10 for BcA70, where 46% of genes were assigned an undetermined DSO status. Over the entire RCS panel, between 2% (BcA82) and 13% (BcA83) of the probe sets had an undetermined DSO. Although the RCS panel strains contained between 26 and 56 congenic segments (average 43.7), there was no significant difference between strains with a predominant A/J (4–38 segments, average 22) or C57BL/6J (7–35 segments, average 22) background. The maximum number of segments per chromosome was 5 (for BcA83 on chromosome 6 and AcB56 on chromosome 11).

Expression microarray studies were performed with lung tissue obtained from pairs of 4-mo-old male mice from each of 12 AcB and 18 BcA RCS as well as the parental strains. We previously observed (Lee PD, Sladek R, Greenwood CM, Ge B, Skamene E, and Hudson TJ, unpublished observation) baseline gene expression differences between adult males of the A/J and C57BL/6J parental strains in multiple tissues and demonstrated widespread tissue-specific expression variability between these strains in four tissues. In light of these results, our present study used an ANOVA model that measured association between gene expression, DSO, and the predominant background strain. Variability between individual replicate mice was factored into the error term for the ANOVA model. By explicitly considering the effect of background genetic context on gene expression levels, this analysis strategy provides between 38 and 62 effective replicates per gene, corresponding to the number of expression measurements obtained for genes with either A/J or C57BL/6J DSO status regardless of predominant background strain. In the course of analysis, we noted genomic regions where most RCS share chromosomal segments derived from the same parental strain. This correlates with our observation of uneven partitioning of the data set by the two terms in the ANOVA model (DSO and BG). As a result, the effects of DSO and BG could not be separated for some of the genes (30%). However, the panel provided a sufficient degree of heterogeneity to test association with DSO, while adjusting for background, for 8,860 genes.

Of the genes tested, we identified 1,591 probe sets (Supplemental Table 1, available at the Physiological Genomics web site) with significant association between expression level and DSO (P < 0.05), of which 1,185 probe sets displayed no association with background (P > 0.05 for BG and P > 0.05 for DSO*BG) (Fig. 2). To estimate the false-discovery rate due to this stage of the analysis, we determined the effect of correcting for multiple hypothesis testing. Correction for multiple testing with the Benjamini-Yekutieli procedure to control the false-discovery rate reduces the number of probe sets demonstrating significant association with DSO to 483 (45). This includes 40 probe sets representing transcription factors and coregulatory proteins (Table 1). However, because our approach combined independent experimental validation techniques, we chose relaxed thresholds (P < 0.05) without correction for multiple testing to determine the overall sensitivity of the subsequent validation steps. A comparison of the contribution of each variable to the overall variability revealed that DSO contributes the most, followed by BG, and then by their interaction (Fig. 3). We note that 1,213 genes display significant association (P < 0.05) between expression level and BG, suggesting the presence of elements located distant from the affected transcript. We further note 651 genes displaying significant association (P < 0.05) between expression level and the DSO-BG interaction term. This suggests that expression of the transcript is regulated by a combined effect of genetic variants located both distant to and within proximity to the affected transcript.
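For readers unfamiliar with the Benjamini-Yekutieli correction mentioned above, the step-up rule can be sketched as follows. This is a generic illustration of the procedure, not the authors' R code; the function name and the example P-values are hypothetical.

```python
def benjamini_yekutieli(pvals, alpha=0.05):
    """Return a list of booleans marking hypotheses rejected at FDR alpha.

    Step-up rule: with m tests and c(m) = 1 + 1/2 + ... + 1/m, find the
    largest rank k whose sorted P-value satisfies p_(k) <= k * alpha / (m * c(m))
    and reject the k smallest P-values.
    """
    m = len(pvals)
    c_m = sum(1.0 / i for i in range(1, m + 1))
    order = sorted(range(m), key=lambda i: pvals[i])
    k_max = 0
    for rank, idx in enumerate(order, start=1):
        if pvals[idx] <= rank * alpha / (m * c_m):
            k_max = rank
    rejected = [False] * m
    for idx in order[:k_max]:
        rejected[idx] = True
    return rejected

# Hypothetical per-gene P-values for the DSO term.
example_pvals = [0.001, 0.008, 0.039, 0.041, 0.6]
example_calls = benjamini_yekutieli(example_pvals)
```

The harmonic factor c(m) makes this rule stricter than the Benjamini-Hochberg procedure and keeps it valid under arbitrary dependence between tests, which is relevant here because expression levels of neighboring genes are not independent.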


 
Fig. 2. Venn diagram listing the number of genes that have significant (P < 0.05) association between expression and each of the terms in the multiple-regression model (expression ~DSO + BG + DSO*BG, where BG is background). In total, 8,860 genes were tested, of which 1,591 probe sets showed association with DSO, 1,213 with BG, and 651 with the interaction term (DSO*BG).

 

 
Table 1. Transcription factors and coregulatory proteins detected with association between expression and DSO

 

 
Fig. 3. Quantile-quantile plot of P-values for each term from the multiple-regression model expression~DSO + BG + DSO*BG.

 
To evaluate whether the results of our expression analysis corresponded to genetic differences, we determined the SNP content at selected gene loci by sequencing 1 kb of 3' UTR for genes selected randomly from those negative and positive for DSO association. We resequenced 50 probe sets that showed association with DSO as well as 80 probe sets that showed no association with DSO. These were selected randomly from lists of probe sets that had been filtered to exclude transcripts that were technically unsuitable for evaluation in the sequence-based AI assay (see MATERIALS AND METHODS). In total, 234 randomly selected DSO-negative probe sets and 101 randomly selected DSO-positive probe sets were evaluated to select the validation set. We do not feel that this reflects a significant difference in the structure or expression level of the genomic loci underlying the DSO-positive and DSO-negative probe sets, as none of the selection criteria discriminated between equivalent numbers of selected DSO-positive and DSO-negative probe sets and none displayed a trend with the P-value for DSO-expression association (results not shown). In addition, in contrast to a previous report showing that transcript levels measured with oligonucleotide-based expression microarrays may be reduced for RNA targets that contain SNPs within the oligonucleotide probe sequences (15), analysis of the expression data obtained in this study suggests that hybridized probe intensities displayed more variability across the data set for SNP-containing probes but that the summary expression measures showed no systematic bias in favor of one allele (see Supplemental Analysis).

On the basis of the known ancestry of the parental lines used in this study (19, 56), we hypothesized that the SNP content should be higher at loci showing positive association if their expression levels were regulated by cis-acting genetic variants. To assess this, we obtained sequences from the 3' UTR, as we expected that this region would show a higher rate of polymorphism than coding sequences. Our sequencing results showed a significantly increased occurrence of SNPs in genes positive for association between DSO and expression (Tables 2 and 3), with 64% of genes with positive DSO association (32 of 50) vs. 29% of negative genes (23 of 80) containing SNPs (P < 0.01, Fisher exact test). By logistic regression we observed a trend (P < 0.01) between increasing likelihood of SNP occurrence and decreasing P-value for the association between gene expression and DSO in the ANOVA model; the odds ratio associated with a 100 times smaller DSO P-value is 1.89, with a confidence interval of 1.17–3.03. This observed correlation between significant DSO-expression association and SNP occurrence concurs with previous results demonstrating cis-acting expression quantitative trait loci (QTL) within regions that are not identical by descent (15).


 
Table 2. Genes positive for expression-DSO association tested for AI

 

 
Table 3. Genes negative for expression-DSO association tested for AI

 
To determine whether cis-acting genetic variation was associated with differences in gene expression levels, we used a sequence-based assay to measure AI for genes where we had previously identified 3' UTR SNPs. These assays were performed with RNA obtained from the lungs of five replicate F1 mice generated from an A/J x C57BL/6J cross, which places loci with A/J or C57BL/6J DSO status in a common genetic background (Fig. 4). Of the 32 positive and 23 negative genes containing SNPs, 17 (50%) positive genes displayed AI vs. 5 (22%) negative genes (P < 0.05, Fisher exact test). The presence of AI also displayed dependence on the P-value for DSO-expression association by logistic regression (P < 0.05), suggesting that the likelihood of AI increases with higher significance of association between expression and DSO in the ANOVA model—the odds ratio associated with a 100 times smaller DSO P-value is 1.92, with a confidence interval of 1.04–3.53. Results for SNP discovery and AI are summarized in Table 4.
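The one-tailed Fisher exact tests reported for SNP occurrence and AI can be reproduced in outline from the counts given in the text (50 positive and 80 negative genes resequenced; 32 and 23 of them containing SNPs; 17 and 5 of those displaying AI). The sketch below is a generic hypergeometric calculation, not the authors' R code, and the function name is hypothetical.

```python
from math import comb

def fisher_one_tailed(a, b, c, d):
    """One-tailed Fisher exact P-value for the 2x2 table [[a, b], [c, d]].

    Returns the hypergeometric probability of observing a count of at
    least `a` in the top-left cell when all margins are held fixed.
    """
    row1, col1, n = a + b, a + c, a + b + c + d
    upper = min(row1, col1)
    total = comb(n, row1)
    tail = sum(comb(col1, x) * comb(n - col1, row1 - x)
               for x in range(a, upper + 1))
    return tail / total

# Rows: DSO-positive, DSO-negative genes. Columns: with SNP, without SNP.
p_snp = fisher_one_tailed(32, 50 - 32, 23, 80 - 23)
# Rows: DSO-positive, DSO-negative SNP-containing genes. Columns: AI, no AI.
p_ai = fisher_one_tailed(17, 32 - 17, 5, 23 - 5)
```

The test is one-tailed because the hypothesis is directional: genes positive for DSO association are expected to show more SNPs and more AI than negative genes, not merely a different frequency.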


 
Fig. 4. Demonstration of allelic imbalance, using a sequencing-based assay. Sequence traces obtained from 1 kb of 3' untranslated region with genomic DNA (gDNA) from A/J (top left) and C57BL/6J (bottom left) parental lines show a G/A single nucleotide polymorphism (black arrowheads). Differences in the relative peak heights of the two alleles in sequence traces from AxB F1 gDNA (top right) vs. cDNA (bottom right) were used to calculate the allele-specific differences in transcript levels.

 

 
Table 4. SNP discovery and AI results in 130 genes

 

    DISCUSSION
 
The genome-wide identification of cis-regulated genes will provide a key resource in the search for genetic determinants of complex traits as well as a significant prioritization tool to identify candidate genes. Recent studies of gene regulation in yeast have identified networks of gene regulatory interactions (4, 50). This progress parallels studies of gene regulatory network dynamics, where the topology of the regulatory interactions is described in terms of motifs and modularity (1, 24, 33, 60), as well as chromatin immunoprecipitation studies (29, 46, 51, 54). Although these results suggest that the effects of a fixed cis-regulatory polymorphism may be significantly modified by the control hierarchy in which it is embedded, they also suggest that the link between cis-regulatory polymorphisms and disease states may only be apparent in specific genetic or physiological contexts. By employing an experimental system (RCS) and applying a relatively simple analytical strategy, this study defines a method for identifying cis-acting variation affecting gene expression with respect to background genetic context. In combination with an independent experimental validation AI method, this approach represents an opportunity to efficiently survey cis-acting regulatory variation across the genome.

The percentage of genes found to have AI among those showing an association between expression and DSO (50%) represents a 10- to 20-fold enrichment for detection compared with random screening of genes (10, 42). We note that the detection of association between expression and DSO was greatly enhanced by our ability to adjust for the effect due to the predominant (background) genome derived from the A/J and C57BL/6J parental lines. In agreement with a previous study (7), the transcription factor Runx1, known to be subject to cis-regulation, displayed the highest degree of association by our expression analysis and was confirmed by AI. We note that although our analysis did not identify all of the same candidates as the previous studies in recombinant inbred strains (21), our results confirmed 25 of 75 cis-regulated transcripts described previously (8). Because both tissue-specific effects as well as the specific inbred strains used in these studies may account for these differences, it would be interesting to validate our findings with other RCS or recombinant inbred panels. Although genes displaying AI are more likely to contain cis-acting regulatory variants (41), identification of the causative polymorphisms and mechanisms leading to AI remains to be confirmed empirically.

The RCS expression data set, combined with the F1 studies of AI, allows us to estimate that 8–11% of genes are affected by cis-acting regulatory variation in lung tissue among these strains. This percentage exceeds previous estimates across three tissues and four mouse inbred strains, where AI was seen in 3–6% of randomly selected genes (10), but agrees with a recent study comparing a cross of C57BL/6J and DBA/2J strains (15). We feel that the estimated 8–11% of genes subject to cis-acting regulation possibly represents a lower limit for a number of reasons. First, this study focused on a single tissue (lung) at one developmental stage (adult), whereas cis-regulation is known to act in a tissue-dependent fashion (10) and transcriptional changes are abundant throughout development (3, 12, 47). Second, a significant proportion of genes showed association between expression and DSO but did not demonstrate detectable AI. Although most of these may be true negatives in regard to cis-acting differences at the gene locus, weaker cis-acting effects may have remained undetected by the sequence-based AI assay; some cis-acting regulatory variants may have been masked because of complex control mechanisms happening in the F1 animals used to detect AI. Third, our design excluded genes falling within segments containing a recombination because the positions of recombination sites were not characterized. Although this affected a small percentage of results dispersed throughout the data set (6%), this exclusion together with varied levels of heterogeneity in the data set could have led to insufficient power to evaluate a number of genes on the microarray. Fourth, our analysis methods assumed independence for each locus tested; this may not be appropriate because dependencies exist among genes that are colocated in the same chromosome region (27). 
An analytical strategy involving more traditional techniques such as QTL mapping (30) would take into account the dependence on location as well as recombination rate throughout the genome. Finally, our results suggest that a proportion of genes with cis-acting variants do not display expression differences that are detectable across the RCS panel. We observe that five genes negative for association (P > 0.05 for DSO) display AI. This could result if the expression measurements were not sensitive enough to detect subtle changes in gene expression, which would be anticipated for tightly regulated transcripts. Alternatively, the expression results may have been confounded by one or more collinear variables, such as the contribution of ancestral haplotype variability as suggested by the differential SNP discovery rate (57, 58), or epistatic trans-interactions. Given the extent of gene interactions estimated to exist, collinear variables are more likely the case than the exception (4).

We note the substantial contribution of the background genetic composition to the overall expression variability observed across the strains; over 1,200 genes displayed significant associations with BG (P < 0.05). These observations indicate the presence of determinants affecting expression variability that are distant from the affected gene, suggestive of trans-acting regulatory mechanisms. Furthermore, a large number of genes (347) displayed significant associations of transcript abundance with both the DSO and background terms, suggesting that they are affected simultaneously by independent cis-acting and trans-acting regulatory variation. In addition, 651 genes displayed P < 0.05 for the DSO*BG interaction, indicating the existence of dependencies between cis and trans effects. Examples of genes simultaneously affected by cis-acting regulation and trans-acting modifier genes are numerous (22, 31, 32, 37, 38). These results confirm the prevalence of such effects and highlight the importance of assessing the contribution of genetic context when measuring gene expression phenotypes.
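The distinction between additive DSO and BG effects and a DSO*BG interaction can be illustrated with a minimal sketch. It assumes that each factor is coded as a binary indicator (0 or 1) and that expression values are on a log scale; the coding, the function names, and the example values are hypothetical and are not taken from the analysis in this study. With two binary factors, a model that includes the interaction term is saturated, so its least-squares coefficients reduce to differences among the four cell means. The sketch reports effect sizes only and does not attempt significance testing.

```python
from statistics import mean

# All four combinations of the two binary factors (dso, bg).
CELLS = ((0, 0), (1, 0), (0, 1), (1, 1))


def cell_means(samples):
    """Average expression within each (dso, bg) cell.

    `samples` is an iterable of (dso, bg, expression) tuples, where dso and
    bg are hypothetical 0/1 codes for the two factors.
    """
    cells = {}
    for dso, bg, expr in samples:
        cells.setdefault((dso, bg), []).append(expr)
    missing = [cell for cell in CELLS if cell not in cells]
    if missing:
        raise ValueError(f"no samples for cells: {missing}")
    return {cell: mean(values) for cell, values in cells.items()}


def two_factor_effects(samples):
    """Coefficients of expression ~ DSO + BG + DSO*BG under 0/1 coding.

    Because the model is saturated, each coefficient is a simple contrast
    of cell means rather than the result of a general regression fit.
    """
    m = cell_means(samples)
    baseline = m[(0, 0)]
    return {
        "baseline": baseline,
        "dso": m[(1, 0)] - baseline,  # effect of DSO when bg == 0
        "bg": m[(0, 1)] - baseline,   # effect of BG when dso == 0
        # How much the DSO effect changes between the two backgrounds.
        "interaction": m[(1, 1)] - m[(1, 0)] - m[(0, 1)] + baseline,
    }


# Hypothetical gene with purely additive DSO and BG effects.
ADDITIVE_GENE = [
    (0, 0, 8.0), (0, 0, 8.5),
    (1, 0, 9.0), (1, 0, 9.5),
    (0, 1, 8.5), (0, 1, 9.0),
    (1, 1, 9.5), (1, 1, 10.0),
]

# Hypothetical gene whose DSO effect appears on only one background.
INTERACTING_GENE = [
    (0, 0, 8.0), (0, 0, 8.5),
    (1, 0, 8.0), (1, 0, 8.5),
    (0, 1, 8.0), (0, 1, 8.5),
    (1, 1, 10.0), (1, 1, 10.5),
]

print(two_factor_effects(ADDITIVE_GENE))
print(two_factor_effects(INTERACTING_GENE))
```

In the first hypothetical gene the DSO effect is the same on both backgrounds, so the interaction term is zero; in the second, the DSO effect is visible only on one background, and the whole effect is carried by the interaction term.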


    GRANTS
 
This research was supported by grants from the Canadian Genetic Diseases Network, the Canadian Institutes of Health Research, the Mathematics of Information Technology and Complex Systems (Networks of Centres of Excellence Program), Genome Canada, and Genome Quebec. T. J. Hudson is supported by a Clinician-Scientist Award in Translational Research from the Burroughs Wellcome Fund and an Investigator Award from the Canadian Institutes of Health Research. The recombinant congenic strains were supported by Emerillon Therapeutics Inc., Genome Canada, and Genome Quebec.


    ACKNOWLEDGMENTS
 
We thank Scott Gurd and Andre Ponton in the Microarray Facility at the McGill University and Genome Quebec Innovation Centre for contributions to the project, as well as Jean-Marie Chavannes and the animal care technicians at the McGill University Health Centre.


    FOOTNOTES
 
Article published online before print. See web site for date of publication (http://physiolgenomics.physiology.org).

Address for reprint requests and other correspondence: R. Sladek, 740 Dr. Penfield Ave, Rm. 6214, Montreal, QC, Canada H3A 1A4 (e-mail: rob.sladek{at}mail.mcgill.ca).

1 The Supplemental Material for this article (Supplemental Table 1 and Supplemental Analysis) is available online at http://physiolgenomics.physiology.org/cgi/content/full/00168.2005/DC1.


    REFERENCES
 

  1. Alon U. Biological networks: the tinkerer as an engineer. Science 301: 1866–1867, 2003.
  2. Bauer AK, Malkinson AM, and Kleeberger SR. Susceptibility to neoplastic and non-neoplastic pulmonary diseases in mice: genetic similarities. Am J Physiol Lung Cell Mol Physiol 287: L685–L703, 2004.
  3. Bolouri H and Davidson EH. Transcriptional regulatory cascades in development: initial rates, not steady state, determine network kinetics. Proc Natl Acad Sci USA 100: 9371–9376, 2003.
  4. Brem RB and Kruglyak L. The landscape of genetic complexity across 5,700 gene expression traits in yeast. Proc Natl Acad Sci USA 102: 1572–1577, 2005.
  5. Broman KW. Mapping expression in randomized rodent genomes. Nat Genet 37: 209–210, 2005.
  6. Buckland PR. Allele-specific gene expression differences in humans. Hum Mol Genet 13: R255–R260, 2004.
  7. Bystrykh L, Weersing E, Dontje B, Sutton S, Pletcher MT, Wiltshire T, Su AI, Vellenga E, Wang J, Manly KF, Lu L, Chesler EJ, Alberts R, Jansen RC, Williams RW, Cooke MP, and de Haan G. Uncovering regulatory pathways that affect hematopoietic stem cell function using "genetical genomics." Nat Genet 37: 225–232, 2005.
  8. Chesler EJ, Lu L, Shou S, Qu Y, Gu J, Wang J, Hsu HC, Mountz JD, Baldwin NE, Langston MA, Threadgill DW, Manly KF, and Williams RW. Complex trait analysis of gene expression uncovers polygenic and pleiotropic networks that modulate nervous system function. Nat Genet 37: 233–242, 2005.
  9. Cheung VG, Conlin LK, Weber TM, Arcaro M, Jen KY, Morley M, and Spielman RS. Natural variation in human gene expression assessed in lymphoblastoid cells. Nat Genet 33: 422–425, 2003.
  10. Cowles CR, Hirschhorn JN, Altshuler D, and Lander ES. Detection of regulatory variation in mouse genes. Nat Genet 32: 432–437, 2002.
  11. Darvasi A. Genomics: gene expression meets genetics. Nature 422: 269–270, 2003.
  12. Davidson EH, McClay DR, and Hood L. Regulatory gene networks and the properties of the developmental process. Proc Natl Acad Sci USA 100: 1475–1480, 2003.
  13. De Sanctis GT, Merchant M, Beier DR, Dredge RD, Grobholz JK, Martin TR, Lander ES, and Drazen JM. Quantitative locus analysis of airway hyperresponsiveness in A/J and C57BL/6J mice. Nat Genet 11: 150–154, 1995.
  14. Demant P and Hart AA. Recombinant congenic strains--a new tool for analyzing genetic traits determined by more than one gene. Immunogenetics 24: 416–422, 1986.
  15. Doss S, Schadt EE, Drake TA, and Lusis AJ. Cis-acting expression quantitative trait loci in mice. Genome Res 15: 681–691, 2005.
  16. Enard W, Khaitovich P, Klose J, Zollner S, Heissig F, Giavalisco P, Nieselt-Struwe K, Muchmore E, Varki A, Ravid R, Doxiadis GM, Bontrop RE, and Paabo S. Intra- and interspecific variation in primate gene expression patterns. Science 296: 340–343, 2002.
  17. Fortin A, Cardon LR, Tam M, Skamene E, Stevenson MM, and Gros P. Identification of a new malaria susceptibility locus (Char4) in recombinant congenic strains of mice. Proc Natl Acad Sci USA 98: 10793–10798, 2001.
  18. Fortin A, Diez E, Rochefort D, Laroche L, Malo D, Rouleau GA, Gros P, and Skamene E. Recombinant congenic strains derived from A/J and C57BL/6J: a tool for genetic dissection of complex traits. Genomics 74: 21–35, 2001.
  19. Frazer KA, Wade CM, Hinds DA, Patil N, Cox DR, and Daly MJ. Segmental phylogenetic relationships of inbred mouse strains revealed by fine-scale analysis of sequence variation across 4.6 mb of mouse genome. Genome Res 14: 1493–1500, 2004.
  20. Ge B, Gurd S, Gaudin T, Dore C, Lepage P, Harmsen E, Hudson TJ, and Pastinen T. Survey of allelic expression using EST mining. Genome Res 15: 1584–1591, 2005.
  21. Hubner N, Wallace CA, Zimdahl H, Petretto E, Schulz H, Maciver F, Mueller M, Hummel O, Monti J, Zidek V, Musilova A, Kren V, Causton H, Game L, Born G, Schmidt S, Muller A, Cook SA, Kurtz TW, Whittaker J, Pravenec M, and Aitman TJ. Integrated transcriptional profiling and linkage analysis for identification of genes underlying disease. Nat Genet 37: 243–253, 2005.
  22. Ikeda A, Zheng QY, Rosenstiel P, Maddatu T, Zuberi AR, Roopenian DC, North MA, Naggert JK, Johnson KR, and Nishina PM. Genetic modification of hearing in tubby mice: evidence for the existence of a major gene (moth1) which protects tubby mice from hearing loss. Hum Mol Genet 8: 1761–1767, 1999.
  23. Karolchik D, Hinrichs AS, Furey TS, Roskin KM, Sugnet CW, Haussler D, and Kent WJ. The UCSC Table Browser data retrieval tool. Nucleic Acids Res 32: D493–D496, 2004.
  24. Kauffman S, Peterson C, Samuelsson B, and Troein C. Genetic networks with canalyzing Boolean rules are always stable. Proc Natl Acad Sci USA 101: 17102–17107, 2004.
  25. Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, and Haussler D. The human genome browser at UCSC. Genome Res 12: 996–1006, 2002.
  26. Knight JC. Regulatory polymorphisms underlying complex disease traits. J Mol Med 83: 97–109, 2005.
  27. Lander ES and Botstein D. Mapping mendelian factors underlying quantitative traits using RFLP linkage maps. Genetics 121: 185–199, 1989.
  28. Lee TI, Rinaldi NJ, Robert F, Odom DT, Bar-Joseph Z, Gerber GK, Hannett NM, Harbison CT, Thompson CM, Simon I, Zeitlinger J, Jennings EG, Murray HL, Gordon DB, Ren B, Wyrick JJ, Tagne JB, Volkert TL, Fraenkel E, Gifford DK, and Young RA. Transcriptional regulatory networks in Saccharomyces cerevisiae. Science 298: 799–804, 2002.
  29. Lynch M and Walsh B. Genetics and Analysis of Quantitative Traits. Sunderland, MA: Sinauer, 1998.
  30. Manenti G, Acevedo A, Galbiati F, Gianni Barrera R, Noci S, Salido E, and Dragani TA. Cancer modifier alleles inhibiting lung tumorigenesis are common in inbred mouse strains. Int J Cancer 99: 555–559, 2002.
  31. Merlo CA and Boyle MP. Modifier genes in cystic fibrosis lung disease. J Lab Clin Med 141: 237–241, 2003.
  32. Milo R, Shen-Orr S, Itzkovitz S, Kashtan N, Chklovskii D, and Alon U. Network motifs: simple building blocks of complex networks. Science 298: 824–827, 2002.
  33. Monks SA, Leonardson A, Zhu H, Cundiff P, Pietrusiak P, Edwards S, Phillips JW, Sachs A, and Schadt EE. Genetic inheritance of gene expression in human cell lines. Am J Hum Genet 75: 1094–1105, 2004.
  34. Morley M, Molony CM, Weber TM, Devlin JL, Ewens KG, Spielman RS, and Cheung VG. Genetic analysis of genome-wide variation in human gene expression. Nature 430: 743–747, 2004.
  35. Nadeau JH. Listening to genetic background noise. N Engl J Med 352: 1598–1599, 2005.
  36. Nadeau JH. Modifier genes and protective alleles in humans and mice. Curr Opin Genet Dev 13: 290–295, 2003.
  37. Nadeau JH. Modifier genes in mice and humans. Nat Rev Genet 2: 165–174, 2001.
  38. Novak JP, Sladek R, and Hudson TJ. Characterization of variability in large-scale gene expression data: implications for study design. Genomics 79: 104–113, 2002.
  39. Oleksiak MF, Churchill GA, and Crawford DL. Variation in gene expression within and among natural populations. Nat Genet 32: 261–266, 2002.
  40. Pastinen T and Hudson TJ. Cis-acting regulatory variation in the human genome. Science 306: 647–650, 2004.
  41. Pastinen T, Sladek R, Gurd S, Sammak A, Ge B, Lepage P, Lavergne K, Villeneuve A, Gaudin T, Brandstrom H, Beck A, Verner A, Kingsley J, Harmsen E, Labuda D, Morgan K, Vohl MC, Naumova AK, Sinnett D, and Hudson TJ. A survey of genetic and epigenetic variation affecting human gene expression. Physiol Genomics 16: 184–193, 2004.
  42. Prows DR, Daly MJ, Shertzer HG, and Leikauf GD. Ozone-induced acute lung injury: genetic analysis of F2 mice generated from A/J and C57BL/6J strains. Am J Physiol Lung Cell Mol Physiol 277: L372–L380, 1999.
  43. Prows DR and Leikauf GD. Quantitative trait analysis of nickel-induced acute lung injury in mice. Am J Respir Cell Mol Biol 24: 740–746, 2001.
  44. Reiner A, Yekutieli D, and Benjamini Y. Identifying differentially expressed genes using false discovery rate controlling procedures. Bioinformatics 19: 368–375, 2003.
  45. Ren B, Robert F, Wyrick JJ, Aparicio O, Jennings EG, Simon I, Zeitlinger J, Schreiber J, Hannett N, Kanin E, Volkert TL, Wilson CJ, Bell SP, and Young RA. Genome-wide location and function of DNA binding proteins. Science 290: 2306–2309, 2000.
  46. Rifkin SA, Kim J, and White KP. Evolution of gene expression in the Drosophila melanogaster subgroup. Nat Genet 33: 138–144, 2003.
  47. Rosen S and Skaletsky H. Primer3 on the WWW for general users and for biologist programmers. In: Bioinformatics Methods and Protocols: Methods in Molecular Biology, edited by Krawetz SA and Misener S. Totowa, NJ: Humana, 2000, p. 365–386.
  48. Schadt EE, Monks SA, Drake TA, Lusis AJ, Che N, Colinayo V, Ruff TG, Milligan SB, Lamb JR, Cavet G, Linsley PS, Mao M, Stoughton RB, and Friend SH. Genetics of gene expression surveyed in maize, mouse and man. Nature 422: 297–302, 2003.
  49. Segre D, Deluna A, Church GM, and Kishony R. Modular epistasis in yeast metabolism. Nat Genet 37: 77–83, 2005.
  50. Simon I, Barnett J, Hannett N, Harbison CT, Rinaldi NJ, Volkert TL, Wyrick JJ, Zeitlinger J, Gifford DK, Jaakkola TS, and Young RA. Serial regulation of transcriptional regulators in the yeast cell cycle. Cell 106: 697–708, 2001.
  51. Stassen AP, Groot PC, Eppig JT, and Demant P. Genetic composition of the recombinant congenic strains. Mamm Genome 7: 55–58, 1996.
  52. Thanaraj TA, Stamm S, Clark F, Riethoven JJ, Le Texier V, and Muilu J. ASD: the Alternative Splicing Database. Nucleic Acids Res 32: D64–D69, 2004.
  53. Tong AH, Lesage G, Bader GD, Ding H, Xu H, Xin X, Young J, Berriz GF, Brost RL, Chang M, Chen Y, Cheng X, Chua G, Friesen H, Goldberg DS, Haynes J, Humphries C, He G, Hussein S, Ke L, Krogan N, Li Z, Levinson JN, Lu H, Menard P, Munyana C, Parsons AB, Ryan O, Tonikian R, Roberts T, Sdicu AM, Shapiro J, Sheikh B, Suter B, Wong SL, Zhang LV, Zhu H, Burd CG, Munro S, Sander C, Rine J, Greenblatt J, Peter M, Bretscher A, Bell G, Roth FP, Brown GW, Andrews B, Bussey H, and Boone C. Global mapping of the yeast genetic interaction network. Science 303: 808–813, 2004.
  54. Tripodis N, Hart AA, Fijneman RJ, and Demant P. Complexity of lung cancer modifiers: mapping of thirty genes and twenty-five interactions in half of the mouse genome. J Natl Cancer Inst 93: 1484–1491, 2001.
  55. Wade CM, Kulbokas EJ 3rd, Kirby AW, Zody MC, Mullikin JC, Lander ES, Lindblad-Toh K, and Daly MJ. The mosaic structure of variation in the laboratory mouse genome. Nature 420: 574–578, 2002.
  56. Wiltshire T, Pletcher MT, Batalov S, Barnes SW, Tarantino LM, Cooke MP, Wu H, Smylie K, Santrosyan A, Copeland NG, Jenkins NA, Kalush F, Mural RJ, Glynne RJ, Kay SA, Adams MD, and Fletcher CF. Genome-wide single-nucleotide polymorphism analysis defines haplotype patterns in mouse. Proc Natl Acad Sci USA 100: 3380–3385, 2003.
  57. Yalcin B, Fullerton J, Miller S, Keays DA, Brady S, Bhomra A, Jefferson A, Volpi E, Copley RR, Flint J, and Mott R. Unexpected complexity in the haplotypes of commonly used inbred strains of laboratory mice. Proc Natl Acad Sci USA 101: 9734–9739, 2004.
  58. Yan H, Yuan W, Velculescu VE, Vogelstein B, and Kinzler KW. Allelic variation in human gene expression. Science 297: 1143, 2002.
  59. Yeger-Lotem E, Sattath S, Kashtan N, Itzkovitz S, Milo R, Pinter RY, Alon U, and Margalit H. Network motifs in integrated cellular networks of transcription-regulation and protein-protein interaction. Proc Natl Acad Sci USA 101: 5934–5939, 2004.



This article has been cited by other articles:

S. Thifault, S. Ondrej, Y. Sun, A. Fortin, E. Skamene, R. Lalonde, J. Tremblay, and P. Hamet
Genetic determinants of emotionality and stress response in AcB/BcA recombinant congenic mice and in silico evidence of convergence with cardiovascular candidate genes
Hum. Mol. Genet., February 1, 2008; 17(3): 331 - 344.

K. G. Kumar, L. O. Byerley, J. Volaufova, D. J. Drucker, G. A. Churchill, R. Li, B. York, A. Zuberi, and B. K. S. Richards
Genetic variation in Glp1r expression influences the rate of gastric emptying in mice
Am J Physiol Regulatory Integrative Comp Physiol, February 1, 2008; 294(2): R362 - R371.

G. Burgio, M. Szatanik, J.-L. Guenet, M.-R. Arnau, J.-J. Panthier, and X. Montagutelli
Interspecific Recombinant Congenic Strains Between C57BL/6 and Mice of the Mus spretus Species: A Powerful Tool to Dissect Genetic Control of Complex Traits
Genetics, December 1, 2007; 177(4): 2321 - 2333.
Copyright © 2006 by the American Physiological Society.