|
|
||||||||
1 Lilly Research Laboratories, Greenfield, Indiana
2 Lilly Research Laboratories, Indianapolis, Indiana
3 Lilly Research Laboratories, Mont-Saint-Guibert, Belgium, A Division of Eli Lilly and Company
| ABSTRACT |
|---|
|
|
|---|
80%), a substantial number were greater than twofold (
20%). Transcript changes unique to the individual analysis were confirmed by quantitative RT-PCR, while all the changes unique to the pooled analysis did not confirm. The individual analysis identified more hits per biological pathway than the pooled approach. Many of the transcripts identified by the individual analysis were novel findings and may contribute to a better understanding of molecular mechanisms of these compounds. Furthermore, having individual animal data provided the opportunity to correlate changes in transcript expression to phenotypes (i.e., histology) observed in toxicology studies. The two approaches were similar when clustering methods were used despite the large difference in the absolute number of transcripts changed. In summary, pooling reduced resource requirements substantially, but the individual approach enabled statistical analysis that identified more gene expression changes to evaluate mechanisms of toxicity. An individual animal approach becomes more valuable when the overall expression response is subtle and/or when associating expression data to variable phenotypic responses. Affymetrix microarray; statistical; hepatotoxicants
| INTRODUCTION |
|---|
|
|
|---|
Despite demonstrated utility, there are many challenges in using microarrays. Among them is the task of extracting useful biological information from the enormous amount of data generated. This challenge has been aided by the development of many new tools over the last few years that facilitate data reduction, visualization, and analysis (13, 49). Another prominent, yet largely unaddressed, challenge is that appropriately powered experimental designs using microarrays can be costly and time consuming (52). This cost has led many scientists to the practice of pooling replicate biological samples, which not only decreases the number of microarrays needed but also reduces sample preparation for a given study. However, with the practice of pooling comes a loss of opportunity to apply statistical analysis methods that enable the measurement of individual animal variation within the experiment. Furthermore, the ability to correlate individual transcript expression changes back to individual animal phenotypes, which often vary within treatment groups, is lost. This correlation is critical for identification of biomarkers and modeling response. Ultimately, the effect of pooling samples as opposed to a statistical analysis of individual samples relates to the balancing of costs vs. a tolerance for lack of statistical confidence in the data.
Literature on the topic of pooling samples for microarray experiments tends to be technical reports of formulas for calculating the number of biological samples (pooled or individual) needed for a properly powered analysis (23, 24, 28, 32, 44). This study examines the impact of pooling samples from both a qualitative or "biological" perspective and a quantitative viewpoint. We used the same single-dose time-course toxicology samples and pooled them or processed them individually. Three well-characterized prototypical toxicants, clofibrate (CLO), diethylhexylphthalate (DEHP), and valproic acid (VPA) were selected based on 1) knowledge of their mechanism(s) in the scientific literature, 2) the robust expression response in liver following treatment, and 3) their relative effect on peroxisome proliferator-activated receptor (PPAR) (3, 16, 33, 38, 48). The analysis methods employed for analyzing data from pooled or individual sample designs were those typical for microarray users when utilizing the respective designs. This analysis allows researchers to make informed decisions as to appropriate experimental design and sample processing based on their needs and resource constraints.
| MATERIALS AND METHODS |
|---|
|
|
|---|
|
|
Sample quality.
All samples were assessed for RNA quality such as microarray scaling factors, background levels, percent present calls, ß-actin and GAPDH 3'/5' ratios, etc. Samples with high 3'/5' ratios would usually be excluded from further analysis; however, the CLO samples were pooled before the results for the individual animal microarrays were completed. For the remaining two compounds, samples were pooled after the quality of the individual samples was assessed, thus ensuring that the pools for VPA and DEHP contained only high-quality samples. This design allowed us to compare the two analysis methods as a worst-case scenario (CLO, with 7 of 26 samples being of suspect quality RNA) and a best-case scenario (VPA and DEHP, where all of the samples are high quality). The data discussed in this publication have been deposited in the National Center for Biotechnology Information's Gene Expression Omnibus (GEO; http://www.ncbi.nlm.nih.gov/geo/) and are accessible through GEO series accession no. GSE-2303.
Data analysis of individual samples.
Signal intensities were generated from Microarray Suite version 5.0 (MAS5) using the default settings, and a global scaling set to 1500. To apply statistical analysis to the experiment, the fold change was calculated as the ratio of the two group means based on the observed signal values from MAS5 for the two treatment groups.
To test whether a gene is differentially expressed, an ANOVA model was fitted on each of the 8,799 probe sets on the chip. The intensity value, Affymetrix MAS5 signal, of a particular gene was modeled as
![]() |
ki is the measurement error (combination of sample variation and chip-to-chip variation) following distribution N(0,
2). Because thousands of hypotheses were tested simultaneously, the issue of multiplicity is a concern. To estimate the false positives, Benjamini and Hochberg's false discovery rate (FDR) was used to adjust the P values derived from the above ANOVA model. Because FDR estimates the proportion of false positives among the tests called "significant," the cutoff value of FDR in a particular study is flexible depending on the goal of the experiment. For instance, an FDR of 0.1 represents a tolerance of 10% of false positives among the selected significant genes.
Data analysis for pooled samples.
Signal intensities were generated using MAS5 as described above, and fold-change values were obtained by running a comparison analysis in MAS5 using the default settings. The MAS5 fold-change output is the signal log ratio (SLR), which is the fold change, presented in log2. To simplify comparison with the individual analysis, the SLR was converted to a standard, nonlogarithmic scale.
Filtering criteria.
All changes for the individual analysis had an FDR
0.1, a minimum signal of 500 for either the vehicle or treated samples, and an absolute fold change of
1.4. For the pooled analysis, changes had an MAS5 call of increase or decrease, an absolute fold change of at least 2, and a minimum signal of 500 for either the vehicle or treated samples and were not called absent in both vehicle and treated samples. Analysis of these data showed that signal values below
500 had a high coefficient of variation (data not shown). The 1.4-fold minimum change for the statistical analysis was arbitrarily chosen to remove small changes not consistently validated using quantitative RT-PCR (QRT-PCR; data not shown). A lower fold-change cutoff for the statistically analyzed data was implemented as a result of the additional confidence gained from a 10% FDR. The twofold minimum for MAS5 changes was based on guidance from Affymetrix. Implementing a twofold cutoff for both analyses resulted in a greater overlap between the two methods, but there were still numerous statistically significant changes missed by pooling, and the false positives remained (see Supplemental Materials).
Principle component analysis.
Principle component analysis (PCA) was performed as described by Yeung and Ruzzo (53) using the R statistical package (http://www.R-project.org), and data were standardized using a z-score calculation before PCA. The PCA scores corresponding to the first three principal components were imported into Spotfire Decision Suite 7.2 for visualization. To include all observable variation in the experiment and to avoid sample grouping caused by transcript selection, all probe sets on the chip were used (2, 36).
Hierarchical clustering.
Heirarchical clustering (HCS) was performed using the Pearson correlation as the distance metric and average linkage clustering in Spotfire Decision Suite. The data were clustered based on the MAS5 signal for all 8,799 probe sets.
Pathway mapping of expression data.
Mapping of expression data was performed using GenMapp 2.0 and MappFinder 2.0 (11, 12). MappFinder identified the number of genes changing for a number of pathways as well as provided a statistical evaluation of the overrepresentation of modulated genes within the various pathways. Because not all of the genes in a given pathway have been annotated to the RG_U34A chip, the data were formatted so that the pathway content of the RG_U34A chip was used for the background in the statistical calculations. MappFinder positive z-scores indicate that the number of modulated genes in the pathway is greater than expected by chance. Larger z-scores represent greater significance. Once modulated pathways were identified using MappFinder, the data were overlaid onto pathway visualization maps and colored based on whether a transcript was detected by pooled, individual, or both approaches.
QRT-PCR.
We used QRT-PCR to validate the array data. Total RNA was treated with DNase (DNA-free, Ambion), and subsequently 1.5 µg were used for first-strand cDNA synthesis (Superscript II, Invitrogen) followed by QRT-PCR using SYBR Green (Applied Biosystems). Transcripts for confirmation were picked arbitrarily, representing a range of fold change and signal change from the different subsets in the data. Because the goal of these experiments was to validate the technical results from the two approaches, the same preparation of RNA as hybridized to the arrays was used. The QRT-PCR results represent the mean values from each dosing group. Aldehyde reductase was used as an internal control. As shown in Table 1, aldehyde reductase expression levels remained consistent across all treatments. Fold changes were calculated by dividing the gene of interest/internal control for the treatment by GOI/ctrl for the vehicle samples (GOI/ctrl = signal for the gene of interest divided by the signal for the control gene). Statistical analysis (t-test) was performed on the QRT-PCR results from individual animals for each transcript using the JMP statistical package (version 4.0.4). Additional details on this method, including primer sequences, can be found in the Supplemental Materials.
|
| RESULTS |
|---|
|
|
|---|
|
30% for all compounds (Fig. 2, AC). Most of the pooled changes were also identified within the individual set of significant changes as demonstrated by the small number of transcripts unique to the pooled analysis (5 and 3% for DEHP and VPA, respectively, and 20% of the total for CLO). In contrast, a large percentage of changes from the individual analysis (4967%) were not found in the pooled data set. The majority of these probe sets had expression changes of less than twofold, but there were numerous transcripts (1420%) with expression changes greater than twofold that were also not identified by the pooled analysis (Fig. 2, AC). Similar results were obtained with data at the 4- and 24-h time points, showing that a significant fraction of data was lost even where there were fewer total transcript changes (see Supplemental Materials).
QRT-PCR analysis of transcript changes falling in the various regions of the Venn diagram, generated from the data set, highlights additional differences between the approaches. Five or six transcripts from each of the three regions of the Venn diagram representing unique and overlapping areas of the VPA data set (Fig. 2C) were arbitrarily chosen for technical confirmation by QRT-PCR of the VPA samples. The results from the QRT-PCR analysis are shown in Table 1. It is striking to note that none of the transcripts unique to the VPA pooled analysis was confirmed as being statistically significant (P value
0.05) when the individual samples were evaluated by QRT-PCR, suggesting a high false-positive rate for this subset of transcripts upon pooling. The changes for both the overlap and the transcripts detected uniquely in the individual samples were confirmed by QRT-PCR as being statistically significant changes. Therefore, all changes identified by the statistical microarray analysis and tested by QRT-PCR were confirmed. Among these transcripts, the individual analysis identified three transcripts (sodium channel-ß1, selenoprotein W, and lanosterol-14-demethylase) called "not changed" by the MAS5 algorithm and therefore missed by the pooled approach. Each of these transcripts was strongly regulated (5.7-, 2.3-, and 5.3-fold, respectively) when detected by the individual analysis and significantly regulated when checked by QRT-PCR.
To determine the impact of a pooled vs. individual analysis on sample classification, all 48-h array data were analyzed using PCA and HCS, as shown in Fig. 3. Treatment with the three toxicants separated from the vehicle along the first and second principal components, while differences between compounds separated along the third principal component. A clear compound-dependent effect relative to vehicle control was observed using PCA. As expected, the pooled samples for each compound consistently clustered tightly with the individual samples from their respective groups in the PCA.
|
Minimal changes in pathology were observed in these studies, and all changes were reversed by 168 h after treatment. Some vacuolation, hypertrophy, and increases in mitotic figures were noted; however, the most dramatic response to the compounds was an increase in absolute liver weight noted at 24 and 48 h. An advantage of the individual approach is the opportunity to correlate the transcript profiles to the phenotype, sometimes called "phenotypic anchoring." For example, a subtle increase in liver weight was observed in the CLO-treated animals at 48 h. The Spearman correlation was used to identify specific transcripts that correlate with increased liver weight (Fig. 4, AC). Hydratase dehydrogenase, a known PPAR
-regulated transcript, was clearly upregulated by CLO treatment (FDR <0.001) but was not highly correlated to increased liver weight. Retinol dehydrogenase, although not associated with PPAR
activation (FDR >0.1), was highly correlated to liver weight increase. Lastly, acyl-CoA-synthetase-4 was significantly regulated by treatment (FDR = 0.03) and correlated to increased liver weight. Thus the individual approach enables both statistical analysis and correlation to phenotype, allowing for deeper investigation of the data.
|
|
| DISCUSSION |
|---|
|
|
|---|
The rationale for pooling samples tends to be nontechnical (limited sample RNA) and generally involves savings in time and reagent costs. In our experience, time and cost can be reduced up to fivefold by pooling samples. The benefits of an individual animal approach are many and include 1) estimation of biological variation, 2) robust statistical testing as opposed to heuristic filtering, 3) association with phenotypic data when available, and 4) more information to apply to mechanistic analysis. Statistics provide a quantitative measure of the probability of false positives in the data based on signal relative to noise and observed sample variation. Therefore, statistical measures of confidence, such as FDR, provide a mechanism for rational data filtering that incorporates knowledge of false-positive rates. In our comparison of pooled and individual designs, changes unique to the pooled analysis did not validate when evaluated with QRT-PCR (n = 5) using individual sample RNA, indicating that they are likely to be false positives (Table 1). This loss in confirmation of transcripts by QRT-PCR was a result of large variation in signal between samples within the treatment group as opposed to a technical problem related to the probe sets representing these transcripts on the microarray. This is supported by the observation that we could confirm the unique-to-pooled changes by QRT-PCR using pooled sample RNA (data not shown). The changes identified from the pooled data set that failed to validate can impact the interpretation of the data, as demonstrated by the fact that, at 168 h postdosing, CLO still regulated a number of changes (see Supplemental Materials) in the pooled samples, but none of these changes was statistically significant.
The expression changes uniquely identified by the individual analysis enhanced biological interpretation of the experiment. Three regulated transcripts identified in the individual analysis, sodium channel-ß1, selenoprotein W, and lanosterol-14-demethylase, were called not changed by MAS5 and therefore were not called changed in the pooled approach. These transcripts were strongly regulated (>5-fold) and were confirmed by QRT-PCR (with primers designed independently of the probe sequences on the arrays). Each of the three transcripts had a high level of hybridization to the Affymetrix mismatch probes, providing an explanation for why MAS5 did not pick up the change. Interestingly, VPA has been shown to modulate sodium channel-ß1, and this is likely associated with its pharmacological mechanism of action (51). The increase in lanosterol-14-demethylase transcript after VPA treatment has not been reported previously. This enzyme is involved in cholesterol biosynthesis and trafficking of sterols. Downregulation of this transcript and the cholesterol esterase (Table 1; a subtle 1.5-fold change also only detected in the individual analysis) by VPA may contribute to the steatotic effect in liver observed after VPA treatment (21, 39). Other transcripts involved in lipid metabolism that were strongly regulated by VPA and determined only in the individual analysis include acyl-CoA-synthetase-4 (11-fold increase) and the low-density lipoprotein receptor transcript (13-fold increase). Additionally, glycerol-3-phosphate dehydrogenase, a mitochondrial protein integral in intermediate shuttling and bioenergetics, was regulated by VPA (8-fold increase). These are just a few examples of biologically relevant changes that analysis of pooled samples did not identify.
Statistical analysis increases assay sensitivity, enabling confident identification of subtle changes and detection of changes that may be characterized by greater variability. Approximately one-half of the transcriptional changes detected by individual design were not detected in the pooled design (Figs. 1 and 2), a result similar to that reported by others (14). While most of the transcript changes detected by the individual design had fold changes of less than two, 20% of these changes were greater than two (Fig. 2). The identification of additional differentially expressed transcripts, even if subtle, increases the sensitivity in identifying modulated pathways that can impact biological interpretation of data. Here, gene expression changes were mapped to the fatty acid ß-oxidation pathway, a pathway that all of these compounds are known to affect via their interactions with the PPAR
receptor (54). It was clear from this exercise that the pooled vs. individual analysis gave the same result from a high-level view; i.e., expected effects on fatty acid oxidation were evident for both approaches. Still, some important differences were evident using this comparative analysis. For example, the individual analysis of VPA detected a change in the long-chain acyl-CoA-dehydrogenase mRNA where the pooled analysis did not. This enzyme is inhibited by both VPA and its active metabolite (25); thus an important piece of mechanistic information about VPA was missed by pooling. Further discrepancies between the methods were apparent upon analysis of additional pathways. Individual analysis identified more pathways and had greater representation within pathways than the pooled analysis. Ribosomal (DEHP), apoptosis (DEHP), proteosomal degradation (DEHP), oxidative stress (VPA), and cell cycle (VPA) transcripts were identified only in the individual analysis. Ribosomal protein transcripts are modulated in stress response and tissue regeneration (27, 31, 50). Although only DEHP modulated more ribosomal protein transcripts than expected by chance, individual analysis of VPA and CLO also detected changes in 12 and 3 of these transcripts, respectively. Activation of the ribosomal transcripts modulated by these three compounds have been associated with hepatocarcinogenesis in the rat (8, 9), an interesting finding given that PPAR
agonists are nongenotoxic carcinogens in rodents. The implications of elevated proteasomal transcripts are not clear; however, regulation by all compounds of a family of transcripts indicates a treatment-related effect. DEHP toxicity has been shown to involve apoptosis(20), while VPA has been shown to induce oxidative stress in rats (41). Clearly, additional insight into these toxicants' mechanisms of action was gained utilizing the individual approach.
Subtle changes in transcript abundance identified by statistical analysis may also be biologically interesting, since a number of laboratories have demonstrated discordance between mRNA and protein levels (5, 10, 17, 19, 45). Subtle changes in mRNA abundance may relate to a dramatic impact on protein activity that may be missed without the sensitivity enabled by statistical methods. Transcription factors such as nuclear hormone receptors represent such a class of genes. Another key benefit of individual analysis is that it enables the correlation of each animal's microarray results to phenotypes measured within an experiment. Variability in phenotype severity across multiple animals treated identically often complicates correlation analysis, and pooling samples does not allow discrimination between responders and nonresponders. In this study, several transcript changes correlated to the PPAR
-induced increase in liver weight (Fig. 4). The high correlation observed for retinol dehydrogenase was more a result of change in expression in vehicle animals rather than in treated animals, indicating that this was not a compound effect. This was not the case for acyl-CoA-synthetase, where the high correlation supports a treatment-related effect. Regardless of the interpretation, this analysis illustrated the utility of correlating changes in a given transcript with a given phenotype. Correlation between transcript abundance and phenotype can be useful to identify cause-and-effect relationships or in the identification of novel biomarkers. For example, within standard toxicology live phase experiments, serum clinical chemistry parameters are measured and histopathology observations are routinely captured. Understanding the correlation between clinical chemistry and gene expression could lead to additional markers that provide mechanistic information regarding the type of injury.
Clustering algorithms are often used to classify expression patterns as similar or different. In both PCA and HCS, the pooled samples fell into clusters with their corresponding individual animal samples. Waring et al. (44) also demonstrated that pooling samples did not alter their ability to use unsupervised clustering methods to compare overall sample similarity. Given that significant numbers of transcripts were missed with the pooled approach, it would appear that clustering methods are insensitive to subtle changes in a large population of transcript changes. However, there were apparently enough significant changes in the pooled data to drive the algorithm to a similar clustering of treatments. Interestingly, VPA and DEHP clustered more tightly together than with CLO regardless of the analysis. There is no evidence that DEHP causes microvesicular steatosis in the literature, but there is data indicating that VPA causes a slight PPAR-type response (16). These data show that clustering does not separate an inhibition of ß-oxidation from upregulation of this pathway. The quality of samples can impact array data, as demonstrated by the larger spread on the PCA chart for some of the CLO samples (treatment and vehicles). Some of these samples had an unacceptable amount of degradation, as assessed from control probes on the microarrays, and could have been excluded from analysis. This difference in sample integrity between CLO and the other compounds may be evident in Figs. 1 and 2, where CLO had fewer transcripts unique to individual analysis relative to the pooled analysis, since a statistical analysis is more sensitive to variability between samples. These observations underscore a potential risk of pooling samples. Just one degraded or contaminated sample can negatively impact the quality of data for a particular pool and potentially confound results.
In summary, while the two approaches to running microarray chips were comparable, pooling saves considerable time and resources at the expense of statistical confidence in the data. The ability to correlate expression data to other data sets and phenotypes can be critical for effective modeling, the building of training sets, or biomarker discovery (7, 30). As has been done with the classification of cancers, the analysis of individual samples is required to provide appropriate power for predictive modeling (4, 40, 46, 47). This is an important issue in toxicology, since microarrays have been heralded for their potential use in building a predictive database of prototypical toxicants (26, 43). As we have seen, the individual analysis revealed subtle changes that affect interpretation of the experiment that were lost in the pooled analysis and important for mechanistic understanding. Although there are dissenting opinions as to the value of pooled or individual experimental designs, an individual approach becomes more valuable where the variation in phenotypic response is large and/or the overall response is subtle. A pooled approach is better employed where the expected response or phenotype is robust and its variation in that response is minimal.
| ACKNOWLEDGMENTS |
|---|
| FOOTNOTES |
|---|
Article published online before print. See web site for date of publication (http://physiolgenomics.physiology.org).
10.1152/physiolgenomics.00260.2004
1 The Supplemental Material for this article is available online at http://physiolgenomics.physiology.org/cgi/content/full/00260.2004/DC1. The Supplemental Material contains 5 files that are available for downloading. An experimental design summary table is provided, DesignTable.doc. The raw microarray data for this manuscript is contained in the file, "Fold-change data.zip." This file contains the complete statistical analysis of the individual data for all probe sets with a P value < 0.05 as well as the MAS5 output for the pooled samples. A more detailed protocol for the QRT-PCR analysis is included in the file, "Detailed QRT-PCR protocol.doc;" the primers that were used for these reactions can be found in the file, "QRT-PCR primer sequences.xls." Additional figures for Fig. 2 containing Venn diagrams at all 4 time points can be found in the file, "Figure 2 Venn diagrams.PDF." There are also additional figures for Fig. 5 overlaying the effect of all 3 compounds on mitochondrial fatty acid ß-oxidation, contained in the file, "Figure 5.pdf." ![]()
| REFERENCES |
|---|
|
|
|---|
ligands: role in perturbation of hepatocyte proliferation and apoptosis. Toxicol Sci 68: 304313, 2002.
-secretase inhibitor. J Biol Chem 278: 4610746116, 2003.
This article has been cited by other articles:
![]() |
J. N. McGuire, J. Overgaard, and F. Pociot Mass spectrometry is only one piece of the puzzle in clinical proteomics Brief Funct Genomic Proteomic, February 28, 2008; (2008) eln005v1. [Abstract] [Full Text] [PDF] |
||||
![]() |
Y. Guo, R. A. Jolly, B. W. Halstead, T. K. Baker, J. P. Stutz, M. Huffman, J. N. Calley, A. West, H. Gao, G. H. Searfoss, et al. Underlying Mechanisms of Pharmacology and Toxicity of a Novel PPAR Agonist Revealed Using Rodent and Canine Hepatocytes Toxicol. Sci., April 1, 2007; 96(2): 294 - 309. [Abstract] [Full Text] [PDF] |
||||
![]() |
H. Wintz, L. J. Yoo, A. Loguinov, Y.-Y. Wu, J. A. Steevens, R. D. Holland, R. D. Beger, E. J. Perkins, O. Hughes, and C. D. Vulpe Gene Expression Profiles in Fathead Minnow Exposed to 2,4-DNT: Correlation with Toxicity in Mammals Toxicol. Sci., November 1, 2006; 94(1): 71 - 82. [Abstract] [Full Text] [PDF] |
||||
![]() |
E. van Lunteren, M. Moyer, and P. Leahy Gene expression profiling of diaphragm muscle in {alpha}2-laminin (merosin)-deficient dy/dy dystrophic mice Physiol Genomics, March 13, 2006; 25(1): 85 - 95. [Abstract] [Full Text] [PDF] |
||||
![]() |
S.-D. Zhang and T. W. Gant Effect of pooling samples on the efficiency of comparative studies using microarrays Bioinformatics, December 15, 2005; 21(24): 4378 - 4383. [Abstract] [Full Text] [PDF] |
||||
![]() |
M. Liang and B. Ventura Physiological genomics in PG and beyond: July to September 2005 Physiol Genomics, October 17, 2005; 23(2): 119 - 124. [Full Text] [PDF] |
||||
| ||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||||
| HOME | HELP | FEEDBACK | SUBSCRIPTIONS | ARCHIVE | SEARCH | TABLE OF CONTENTS |
| Visit Other APS Journals Online |