Physiol. Genomics 25: 166-178, 2006.
First published January 10, 2006; doi:10.1152/physiolgenomics.00243.2005
1094-8341/06 $8.00
Received 4 October 2005;
accepted in final form 30 December 2005.
© 2006 American Physiological Society
Toolbox
Three-color cDNA microarrays with prehybridization quality control yield gene expression data comparable to that of commercial platforms
Martin J. Hessner1,2,*,
Bixia Xiang1,*,
Shuang Jia1,
Rhonda Geoffrey1,2,
Shannon Holmes1,
Lisa Meyer2,
Sanaa Muheisen2 and
Xujing Wang1,2
1 The Max McGee National Research Center for Juvenile Diabetes, Department of Pediatrics, The Medical College of Wisconsin and The Children's Hospital Research Institute of Children's Hospital of Wisconsin, Milwaukee
2 The Human and Molecular Genetics Center, The Medical College of Wisconsin, Milwaukee, Wisconsin
ABSTRACT
Despite their lower cost and high content flexibility, a limitation of in-house-prepared arrays has been their susceptibility to quality control (QC) issues and lack of QC standards across laboratories. Therefore, we developed a novel three-color array system that allows prehybridization QC as well as the Matarray software to facilitate acquisition of accurate gene expression data. In this study, we compared performance of our rat cDNA array to the Affymetrix RG-U34A and Agilent G4130A arrays using 2,824 UniGenes represented on all three arrays. Before data filtering, poor interplatform agreement was observed; however, after data filtering, differentially expressed UniGenes exhibited correlation coefficients of 0.91, 0.88, and 0.92 between the Affymetrix vs. Agilent, Affymetrix vs. cDNA, and Agilent vs. cDNA arrays, respectively. The Affymetrix, Agilent, and cDNA arrays agreed well with quantitative RT-PCR conducted on 42 UniGenes, yielding correlation coefficients of 0.90, 0.90, and 0.96, respectively. Each platform underestimated ratios relative to quantitative RT-PCR, possessing respective slopes of 0.86 (R² = 0.81), 0.65 (R² = 0.81), and 0.70 (R² = 0.92). Overall, these data show that the combination of our novel technical and analytic approaches yields an accurate platform for functional genomics that is concordant with commercial discovery arrays in terms of identifying regulated genes and pathways.
gene expression profiling; cross-platform comparison; microarray; GeneChip; functional profiling
INTRODUCTION
THE NOW EXTENSIVE USE of DNA microarrays to simultaneously monitor expression levels of tens of thousands of genes began nearly a decade ago (46). During this time, the technology has rapidly evolved into a number of widely used high-density platforms that are either commercially fabricated or prepared within the research laboratory. Microarray platforms are based on one of two general fabrication formats: 1) in situ-synthesized oligonucleotide arrays (12, 13) or 2) spotted arrays (46). Current DNA microarray formats vary considerably in terms of immobilized probe type; probe types include 300- to 1,500-bp spotted PCR products amplified from cDNA clones, in situ-synthesized 25-mer oligonucleotides (Affymetrix GeneChip), and spotted or in situ-synthesized 60-mer (Agilent) to 70-mer oligonucleotides. Furthermore, the varying platforms differ in terms of experimental design as well as target labeling and hybridization protocols; for example, the Affymetrix platform analyzes a single in vitro-transcribed, biotinylated cRNA per array, which is visualized after hybridization with a streptavidin-phycoerythrin conjugate, whereas other platforms typically utilize cohybridization of two cDNA targets directly labeled during reverse transcription with different fluorescent dyes.
While differing microarray platforms allow researchers to compare gene expression profiles, each format offers unique advantages and disadvantages in terms of cost, study design considerations, array content, array content flexibility, required hardware, and analytic protocols. The cDNA arrays can be cost-effectively fabricated in-house, possess high hybridization stringency, and are not highly susceptible to single nucleotide polymorphisms. However, they require laborious and error-prone clone library management, PCR amplification, and purification, as well as cDNA clone sequence validation (3, 51), since clone misidentification rates within libraries have been estimated as high as 30% (57). Oligonucleotide arrays exhibit a number of advantages, including the fact that they can be designed to exclude homologous sequences between genes, thereby enhancing specificity. In addition, a given gene can be represented by a set of different oligonucleotides targeting different regions or exons, allowing for the detection of splice variants or discrimination of closely related genes. Oligonucleotide probe designs are typically based on deposited sequence information and therefore are dependent on the quality of the submitted information and annotation. Highlighting this reality is a recent report which found that, among Affymetrix mammalian arrays, >19% of the probes on any given array type did not correspond to their appropriate mRNA reference sequence defined by the highly curated, publicly available RefSeq database (37).
Microarray technology has drawn criticism due to its lack of reproducibility, which can stem from technical problems in both target preparation and array fabrication. A substantial limitation in the utilization of spotted arrays fabricated in the research laboratory is their susceptibility to quality control issues, resulting largely from variable DNA probe deposition and retention on the solid support surface, which is difficult to control for each and every array, since the array is typically "invisible" before hybridization (18, 21, 60). This has been a motivation for many investigators to use commercial array systems, despite potentially higher costs, where highly controlled fabrication methodologies have evolved, resulting in systems that offer high intraplatform reproducibility.
Because the array is a critical source of data variability, we have developed a novel three-color array approach where it is possible to directly visualize either cDNA or oligonucleotide arrays before hybridization (18-21). For cDNA arrays, probes are tagged during amplification by using fluorescein-labeled oligonucleotide primers. After purification of PCR products, which removes unincorporated oligonucleotide primer, the detected fluorescein fluorescence represents deposited cDNA probe on the array, allowing for assessment of slide fabrication variables independent of hybridization. By labeling the array itself with this third fluorophore, we have observed that arrays fabricated together are not equivalent in terms of a number of measurable physical parameters, including the amount of probe immobilized on the array and the amount of background probe generated through the redistribution of probe during array-blocking procedures. These prehybridization array-based variables bear a direct and significant relationship to replicate consistency and the accuracy of gene expression ratio measurements (18, 20, 21). We have found that microarray data quality can be significantly improved through prehybridization slide selection based on these quality parameters (20, 21). Quality control (QC) of the array is the major aspect that separates academic arrays from commercial arrays, since commercial providers have invested heavily in fabrication development and the establishment of QC standards, whereas academic laboratories have not reached a universal QC schema. An objective in the development of our three-color platform has been to fill this need.
To date there have been a number of reports comparing the results of different microarray platforms. Some studies have concluded that reasonable concordance exists across platforms, whereas others have not (31, 43, 50, 59). Several studies have found that data obtained from "homemade" spotted cDNA arrays are not as correlative as those derived from commercial oligonucleotide arrays (36, 58, 59), perhaps due to fabrication variability, cDNA clone annotation errors, or the nonspecific hybridization of labeled targets to the longer cDNA probes (27, 31). In this report, we compared the gene expression profiles of liver from fasted Wistar-Furth (WF) and BioBreeding (BB) diabetes-resistant (DR)+/+ rats (4) generated on our in-house rat cDNA array, the Affymetrix RG-U34A 25-mer array, and the Agilent G4130A 60-mer array. The three platforms possess probes for 2,824 common UniGenes (UniGene build no. 139) which were used to evaluate the overall interplatform concordance, accuracy of fold change relative to quantitative real-time RT-PCR (qRT-PCR), and effectiveness of controlling the whole array as well as individual spots with our novel three-color approach for improving gene expression data accuracy.
MATERIALS AND METHODS
Animals, tissues, and sample preparation.
BBDR+/+ rats were maintained at the Medical College of Wisconsin (MCW), and genotyping to identify DR+/+, DR+/lyp, and DRlyp/lyp animals utilized polymorphic markers flanking Iddm1 as described previously (25, 32). WF rats were obtained from Harlan Teklad (Indianapolis, IN). Animals raised off-site were shipped after weaning (25-30 days) and allowed to acclimate on-site for 3-7 days. All animals were kept under specific pathogen-free conditions with standard light-dark cycles and were fed a regular diet and water ad libitum. Only female animals were selected for analysis to eliminate confounding gender-specific gene expression differences. Before death at day 65 (d65), animals were fasted for 12 h, weighed, and had blood glucose levels measured. Animals were anesthetized using isoflurane. Livers were immediately harvested and snap frozen in liquid nitrogen for RNA extraction. All study protocols were reviewed and approved by the MCW Institutional Animal Care and Use Committee, and all institutional guidelines for the use and care of laboratory animals were followed. Total RNA was extracted from frozen liver samples using TRIzol reagent (Gibco BRL, Carlsbad, CA). A single RNA sample was prepared from each animal (n = 4 DR+/+ and n = 4 WF). RNA integrity was assessed with denaturing agarose gel electrophoresis (45) and 260 nm-to-280 nm absorbance ratio. Single pools of DR+/+ and WF RNA were created from equal amounts of each of the four individual RNA samples for each strain.
Affymetrix RG-U34A GeneChip analysis (8,779 probe sets).
DR+/+ and WF pooled liver samples were analyzed in duplicate by Affymetrix GeneChip analysis; more extensive technical replication was not performed on this platform, since our laboratory and others have found it highly reproducible (1, 52). First- and second-strand cDNAs were synthesized from 15 µg of total RNA, and cRNA was synthesized, labeled, fragmented, and hybridized to the RG-U34A array in accordance with standard Affymetrix protocols (Affymetrix, Santa Clara, CA). The RG-U34A array allows detection of 5,539 known genes and 3,240 expressed sequence tags (ESTs). After hybridization, arrays were washed, stained with phycoerythrin-conjugated streptavidin (Molecular Probes, Eugene, OR), and scanned. Images were analyzed using Microarray Suite version 5.0 (MAS 5.0; Affymetrix). The MAS 5.0 statistical algorithms were used to calculate signal intensities, probe set detection, probe set (gene expression) change, and signal log ratio. Hybridization data were analyzed using the commonly used two-profile comparison method, and the statistical significance of differential gene expression was derived through a Student's t-test (P < 0.05) (7).
Agilent G4130A 60-mer array analysis (20,500 probes).
DR+/+ and WF pooled liver samples were analyzed in triplicate (3 forward-labeled replicates and 3 reverse-labeled replicates) on the Agilent oligonucleotide array (Agilent Technologies, Palo Alto, CA), allowing the profiling of 9,822 known genes and 10,678 ESTs. Briefly, 60 µg per slide of each pooled RNA sample were directly labeled through a single round of reverse transcription primed with poly-dT and cyanine-3 (Cy3)- or Cy5-labeled dUTP (Amersham, Piscataway, NJ). Cy3- and Cy5-labeled templates were purified using a Qiagen MinElute kit and concentrated using Amicon YM 30 columns. The labeled cDNA targets were resuspended in 250 µl of nuclease-free water, heat denatured for 3 min at 98°C, and allowed to cool to room temperature; 250 µl of 2x Agilent hybridization buffer (Agilent Technologies) were then added. A total of 490 µl of labeled cDNA solution were loaded into hybridization chambers (Agilent Technologies) as instructed by the manufacturer's protocol. Arrays were hybridized at 60°C with rotation for 17 h. Posthybridization, arrays were washed with 0.2-µm sterile-filtered wash solution 1 (6x SSC, 0.005% Triton X-102) for 10 min at room temperature, then washed in 0.2-µm sterile-filtered wash solution 2 (0.1x SSC, 0.005% Triton X-102) for 5 min at 4°C, and then dried by centrifugation. Arrays were scanned using excitation and emission spectra specific for Cy3 and Cy5, using a ScanArray 5000 (GSI Lumonics, Billerica, MA).
MCW rat cDNA array analysis (36,864 probes).
DR+/+ and WF pooled liver samples were analyzed in triplicate (3 forward-labeled replicates and 3 reverse-labeled replicates) on the MCW 36,864-probe rat cDNA array, as previously described (18, 20, 21). Briefly, 30 µg per slide of each pooled RNA sample were directly labeled through a single round of reverse transcription primed with poly-dT and Cy3- or Cy5-labeled dUTP (Amersham). Cy3- and Cy5-labeled templates were purified using a Qiagen MinElute kit and concentrated using Amicon YM 30 columns. The concentrated, mixed probes were added to a solution containing 1.5 mg/ml poly-dA, 0.9 mg/ml yeast tRNA, 2.2x Denhardt's solution, 3.4x SSC, and 0.3% SDS. Hybridization of arrays was performed in sealed humid hybridization cassettes for 16-20 h at 65°C under a glass coverslip. After hybridization, slides were washed at room temperature for 4 min in 2x SSC, 1.5 min in 1x SSC, 1.5 min in 0.2x SSC, and 10 s with 0.05x SSC and finally dried by centrifugation. Arrays were then scanned using excitation and emission spectra specific for Cy3 and Cy5, using a ScanArray 5000 (GSI Lumonics).
The MCW 36,864-probe rat cDNA array is based on a sequence-verified rat library (Research Genetics, Huntsville, AL) possessing 14,809 known genes and 20,231 ESTs. Cultures were grown in 150 µl of Terrific Broth (Sigma, St. Louis, MO) supplemented with 0.1 mg/ml ampicillin in 384 deep-well plates (Matrix Technologies, Hudson, NH) sealed with air pore tape sheets (Qiagen, Valencia, CA) and incubated with shaking for 14-16 h. Clone inserts were amplified in duplicate in a 384-well format from 0.5 µl of bacterial culture using 0.26 µM each vector primer (forward 5'-fluorescein-TTC CGG CTC GTA TGT TGT GTG-3' and reverse 5'-fluorescein-AAG CTA AAA TTA ACC CTC ACT AAA G-3') (Integrated DNA Technologies, Coralville, IA) as previously described (18, 20, 21). After purification and quantification, all plates were dried down and reconstituted at 150 ng/µl in 3% DMSO-1.5 M betaine. The cDNA array was printed over two poly-L-lysine-coated slides [prepared in-house as previously described (11)] using a GeneMachines Omni Grid printer (San Carlos, CA) with 32 Telechem International SMP3 pins (Sunnyvale, CA) at 40% humidity and 22°C. Arrays were postprocessed using the previously described nonaqueous protocol (9). Prehybridization array fluorescein image files were generated with a ScanArray 5000 (GSI Lumonics) and analyzed with the Matarray software (53) modified for prehybridization quality assessment, and only those passing QC thresholds were used for analysis (18, 20, 21).
Microarray data acquisition, data filtering, and evaluation.
The Affymetrix array data were processed with MAS 5.0 software, using the default settings (τ = 0.015, α1 = 0.04, and α2 = 0.06). This resulted in the flagging and filtering of 50% of the original 4,147 probe sets corresponding to the shared UniGenes. The acquired Agilent array data were subjected to quality-dependent filtering and localized LOWESS Z-normalization using the Matarray software, as previously described (53, 54, 56). Matarray employs algorithms to define a composite image quality score (qcom) for each spot on the array according to five posthybridization criteria: size, signal-to-noise ratio, background level and uniformity, and saturation status. Previously, we have demonstrated that variability in intensity ratio measurements correlates closely with qcom, in that high-quality spots generate less variability, and removal of spots with low qcom dramatically improves the reliability of hybridization data as reflected by higher correlation coefficients between hybridized replicate arrays/spots (53, 55). Filtering of the Agilent data was adjusted to stringency equal to that of the Affymetrix default settings, so that 50% of the lowest-quality elements representing the shared UniGene content were also dropped. The cDNA array image data were also analyzed with Matarray and were subjected to quality-dependent filtering and localized LOWESS Z-normalization, as previously described (53, 54, 56), but with modified algorithms to capture QC measures made possible with the prehybridization fluorescein image. Previously, we have shown that this image captures array fabrication variables that influence data quality and that may not be reflected in the hybridized image. Therefore, we have defined a fluorescein image quality score (qTD), based on our previously identified QC thresholds for array fabrication (18, 20, 21), defined as
[equation not recovered]
where
[equation not recovered]
and qcom(TD) is the composite score of the fluorescein spot image quality defined from size, signal-to-noise ratio, background level and uniformity, analogous to that previously described for the hybridized image. The most important factor affecting data reliability is the prehybridization fluorescein spot intensity, which reflects the amount of immobilized probe (18, 20, 21). We have previously found that array elements under the threshold fluorescein intensity of 5,000 relative fluorescence units (RFU)/pixel are increasingly variable and compressed (18, 21, 56). Although all cDNA arrays were subjected to prehybridization QC (18, 20, 21, 56), even high-quality arrays possess some compromised spots due to mechanical failures during printing and PCR failures. Therefore, the cDNA array data filtering was accomplished using a quality score (qfinal = qcom × qTD) that scored spots based on both the prehybridization fluorescein image and the cyanine dye-hybridized image, again filtering 50% of the lowest-quality spots.
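The combined filter described above can be sketched in a few lines. This is a minimal illustration, not the Matarray implementation: the dictionary keys (`id`, `q_com`, `q_td`), the function name, and the example scores are all hypothetical, and the per-spot scores are assumed to have been computed already.

```python
def filter_by_quality(spots, drop_fraction=0.5):
    """Score each spot as q_final = q_com * q_td and drop the lowest-scoring fraction.

    spots: list of dicts with hypothetical keys 'id', 'q_com', 'q_td'.
    Returns the retained spots, best first, each with an added 'q_final' key.
    """
    scored = [dict(s, q_final=s["q_com"] * s["q_td"]) for s in spots]
    scored.sort(key=lambda s: s["q_final"], reverse=True)
    n_drop = int(len(scored) * drop_fraction)  # number of lowest-quality spots removed
    return scored[:len(scored) - n_drop]


# Made-up scores for illustration only.
example_spots = [
    {"id": "a", "q_com": 0.9, "q_td": 1.0},
    {"id": "b", "q_com": 0.9, "q_td": 0.2},  # good hybridized image, poor fluorescein image
    {"id": "c", "q_com": 0.5, "q_td": 0.9},
    {"id": "d", "q_com": 0.3, "q_td": 0.3},
]
kept = filter_by_quality(example_spots)
print([s["id"] for s in kept])
```

Because the two scores are multiplied, a spot that looks acceptable after hybridization (spot "b") is still removed when its prehybridization image indicates little immobilized probe.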
Real-time quantitative RT-PCR.
Specific oligonucleotide primers for selected UniGenes (n = 42) were designed with Oligo 6.66 (Molecular Biology Insights, Cascade, CO). The amplicon generated by real-time quantitative RT-PCR (qRT-PCR) was selected by identifying, when possible, the overlapping sequence between the GenBank transcript sequences (accession nos.) targeted by each platform. Monoplex real-time qRT-PCR was performed using Rotor-Gene 3000 (Corbett Research, Mortlake, Australia), QuantumRNA 18S Internal Standards (Ambion, Austin, TX), UniGene-specific primers (Sigma Genosys, The Woodlands, TX), and QuantiTect SYBR Green PCR Master Mix (Qiagen) according to the manufacturers' instructions. Synthesis of first-strand cDNA from 1 µg of RNA per animal was accomplished with random hexamers (Invitrogen, Carlsbad, CA) and Superscript II (Invitrogen) according to the manufacturers' instructions. Triplicate locus-specific and 18S PCRs were performed for each gene analyzed in 20-µl reactions that included 2 µl of cDNA, 10 µl of 2x QuantiTect SYBR Green PCR Master Mix (Qiagen), 1.2 µl of locus-specific primers (10 µM) or 18S-specific primers-competimers (used as a 3:7 ratio of primer-competimer set; each stock is at 5 µM), and 6.8 µl of deionized water. Reactions were typically cycled as follows: stage 1, 95°C for 90 s; stage 2, 55 cycles at 95°C for 30 s, 50-66°C for 30 s (locus specific), 72°C for 30 s, and fluorescence acquisition at 72-82°C for 15 s (locus specific); stage 3, melt curve at 60-95°C. 18S reactions were cycled as follows: stage 1, 95°C for 90 s; stage 2, 55 cycles at 95°C for 30 s, 55°C for 30 s, 72°C for 30 s, and 82°C for 15 s; stage 3, melt curve at 60-95°C. A pooled and concentrated sample of DR+/+ or WF cDNA was used for both the locus-specific and 18S standard curves at undiluted, 1:5, 1:25, 1:125, and 1:625 concentrations, and at least two points from the standard curve were used as positive controls in each assay.
Specificity for all qRT-PCR reactions was verified by both melting curve analysis and 1.5% agarose gel detection of a single product. Data were analyzed with the Rotor-Gene 3000 software using the cycle threshold for quantification. Relative gene expression (fold change) between samples was calculated using the mathematical model described by Pfaffl (39).
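For readers unfamiliar with the Pfaffl model, the following sketch shows the efficiency-corrected ratio calculation it is generally understood to describe: amplification efficiency E is taken from the slope of a standard curve (Ct vs. log10 of input), and the target ratio is normalized to a reference gene (here 18S). The function names, the choice of WF as the "control" sample, and all Ct values are illustrative assumptions, not values from this study.

```python
def efficiency_from_slope(slope):
    """Amplification efficiency E from the standard-curve slope (Ct vs. log10 input).

    A slope near -3.32 corresponds to E near 2 (perfect doubling per cycle).
    """
    return 10 ** (-1.0 / slope)


def pfaffl_ratio(e_target, ct_target_control, ct_target_sample,
                 e_ref, ct_ref_control, ct_ref_sample):
    """Efficiency-corrected relative expression of sample vs. control.

    ratio = E_target ** (Ct_control - Ct_sample) for the target gene,
            divided by the same quantity for the reference gene.
    """
    target_term = e_target ** (ct_target_control - ct_target_sample)
    ref_term = e_ref ** (ct_ref_control - ct_ref_sample)
    return target_term / ref_term


# Hypothetical Ct values: WF treated as control, DR+/+ as sample.
fold_change = pfaffl_ratio(
    e_target=2.0, ct_target_control=25.0, ct_target_sample=22.0,
    e_ref=2.0, ct_ref_control=12.0, ct_ref_sample=11.0,
)
print(fold_change)
```

With both efficiencies equal to 2, a 3-cycle shift in the target and a 1-cycle shift in the reference give 2^3 / 2^1, i.e., a fourfold change.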
Strategy for interplatform comparison.
The first step of the comparison was retrieving an updated annotation for each array from the National Center for Biotechnology Information (NCBI) Rattus norvegicus UniGene Build No. 139 (http://www.ncbi.nlm.nih.gov/UniGene) and Dragon Annotation Tool (http://pevsnerlab.kennedykrieger.org/annotate.htm) (5, 6). Probes from each platform, represented by GenBank accession numbers, that grouped to the same UniGene cluster were considered to be interrogating the same gene, since each UniGene cluster should represent a single gene. With the use of this common annotation source, the shared UniGene content among the Affymetrix, Agilent, and MCW cDNA arrays was identified, as well as the common content between each of the two-way comparisons (please see Supplemental Tables S1-S4; available at the Physiological Genomics web site).
The Affymetrix, Agilent, and cDNA arrays possessed 6,749 (4,864 unique), 17,604 (13,803 unique), and 24,559 (17,946 unique) UniGenes, respectively. Among those, 2,824 UniGenes were represented on all three platforms and served as the basis of the three-way comparison. Between each of the possible two-way comparisons, Affymetrix vs. Agilent, Affymetrix vs. cDNA, and Agilent vs. cDNA array, there were 3,822, 3,452, and 9,513 unique UniGenes represented, respectively.
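The matching strategy above amounts to mapping each platform's probe accessions to UniGene clusters and intersecting the resulting sets. The sketch below illustrates that idea only; the accession numbers and cluster IDs are invented, and the function names are hypothetical rather than part of any annotation tool.

```python
def unigenes_on_platform(accession_to_unigene, probe_accessions):
    """Set of UniGene clusters hit by a platform's probes.

    Probes whose accession has no UniGene assignment are skipped; several
    probes may collapse onto one cluster, so the set can be smaller than
    the probe list.
    """
    return {accession_to_unigene[a] for a in probe_accessions
            if a in accession_to_unigene}


def shared_unigenes(*platform_sets):
    """UniGene clusters represented on every platform supplied."""
    return set.intersection(*platform_sets)


# Invented annotation table and probe lists, for illustration only.
annotation = {"X0001": "Rn.1", "X0002": "Rn.1", "X0003": "Rn.2",
              "X0004": "Rn.3", "X0005": "Rn.4"}
affy = unigenes_on_platform(annotation, ["X0001", "X0003", "X0004"])
agilent = unigenes_on_platform(annotation, ["X0002", "X0003", "X0005"])
cdna = unigenes_on_platform(annotation, ["X0001", "X0002", "X0003", "X9999"])

print(sorted(shared_unigenes(affy, agilent, cdna)))  # three-way content
print(sorted(shared_unigenes(affy, agilent)))        # one two-way comparison
```

Note that two different accessions (X0001 and X0002) count as the same gene here because they fall in the same cluster, which is the assumption the comparison relies on.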
RESULTS
Rationale.
Two pooled liver RNA samples derived from four fasted d65 WF and four fasted d65 DR+/+ rats were used as a source of target for comparison of the Affymetrix, Agilent, and cDNA arrays. Livers from these two rat strains were selected because we previously observed differential expression of many lipid metabolism genes when profiling the pancreatic lymph nodes (22). Therefore, we expected a large number of genes to exhibit expression differences in the liver, allowing a comprehensive comparison of the three platforms over a wide dynamic range, and ample data points for benchmarking the effectiveness of our technical and analytic approaches utilizing the third dye (18-21). The comparison of two pooled RNA samples vs. a series of individual RNA samples was opted for, since it would 1) allow for generation of a sufficiently large source of RNA for testing all three platforms; 2) ensure sufficient material for extensive qRT-PCR follow-up while reducing the technical and analytic complexity of the experiment; and 3) exclude the measurement of biological variability between animal replicates (40), which would complicate comparison of the single-channel format of the Affymetrix platform with the dual-hybridization formats of the Agilent and cDNA platforms. Labeling, hybridization, and image acquisition for each platform, as well as the qRT-PCR assays, were optimized before running the study samples to ensure technical familiarity and the generation of reproducible data. For this study, data were managed as a relative ratio (log2 ratio of DR signal intensity to WF signal intensity), which allowed for a consistent measurement across platforms while representing a more biologically meaningful value than an absolute intensity measurement (33, 38).
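As a small illustration of the relative-ratio convention, the sketch below computes a DR-to-WF log2 ratio from a two-channel spot. The handling of reverse-labeled arrays (negating the Cy5/Cy3 log ratio when DR is in the Cy3 channel) is a common convention assumed here for illustration; the function name, the flag, and the intensities are hypothetical.

```python
import math


def log2_dr_to_wf(cy5, cy3, dr_in_cy5=True):
    """log2(DR / WF) for one spot of a two-color array.

    cy5, cy3: background-corrected channel intensities (assumed positive).
    dr_in_cy5: True for a forward-labeled array (DR in Cy5), False for a
    reverse-labeled array, where the sign of the log ratio is flipped.
    """
    ratio = math.log2(cy5 / cy3)
    return ratio if dr_in_cy5 else -ratio


# A gene fourfold higher in DR gives the same value under both labeling directions.
forward = log2_dr_to_wf(cy5=4000.0, cy3=1000.0, dr_in_cy5=True)
reverse = log2_dr_to_wf(cy5=1000.0, cy3=4000.0, dr_in_cy5=False)
print(forward, reverse)
```

On the log2 scale, a twofold increase and a twofold decrease are symmetric about zero (+1 and -1), which is one reason ratios are easier to compare across platforms than raw intensities.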
Intraplatform replicate consistency.
On the basis of the UniGene content shared by the Affymetrix, Agilent, and cDNA arrays, intraplatform replicate consistency was investigated (Table 1). When evaluating the log2 ratios for all replicate arrays within a platform for the entire set of probes targeting the 2,824 shared UniGenes (no filtering), intraplatform Pearson correlation coefficients of 0.60, 0.56, and 0.24 were observed for the Affymetrix, Agilent, and cDNA arrays, respectively. When restricting the analysis to only those genes within the 2.5% tails of the log2 ratio distribution, respective Pearson correlation coefficients of 0.79, 0.78, and 0.40 were observed.
Intraplatform Pearson correlation coefficients of 0.45, 0.41, and 0.15 were also determined for all genes on Affymetrix (n = 8,799), Agilent (n = 20,158), and cDNA (n = 34,980) arrays, respectively. When the analysis only considered all genes within the 2.5% tails of the log2 ratio distribution, intraplatform Pearson correlation coefficients of 0.56, 0.63, and 0.25 were determined for the Affymetrix (n = 440), Agilent (n = 1,008), and cDNA (n = 1,250) arrays, respectively.
Next, the intraplatform replicate consistency after data filtering was evaluated for the 2,824 shared UniGenes. The filtered log2 ratios for all replicate arrays within a platform revealed improved Pearson correlation coefficients of 0.85, 0.83, and 0.69 for the Affymetrix, Agilent, and cDNA arrays, respectively. When restricting the analysis to only those genes within the 2.5% tails of the log2 ratio distribution, respective intraplatform Pearson correlation coefficients of 0.96, 0.96, and 0.90 were observed (Table 1). The intraplatform replicate concordances (sharing among the 2.5% tails of replicates) were found to be 71, 66, and 55%, respectively.
After data filtering, intraplatform Pearson correlation coefficients of 0.85, 0.79, and 0.66 were also determined for all genes on Affymetrix (n = 3,643), Agilent (n = 8,346), and cDNA (n = 12,726) arrays, respectively. When the analysis only considered all genes within the 2.5% tails of the log2 ratio distribution, intraplatform Pearson correlation coefficients of 0.96, 0.94, and 0.90 were determined for the Affymetrix (n = 182), Agilent (n = 418), and cDNA (n = 636) arrays, respectively. Overall, these observations illustrate how the pre- and posthybridization quality filters facilitate removal of data points compromised during array fabrication or hybridization and enable identification of the most reproducible, highest-quality data.
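The tail-restricted correlations reported above can be illustrated with a short sketch. It assumes that "2.5% tails" means the lowest 2.5% plus the highest 2.5% of log2 ratios (most of the reported tail sizes are consistent with this, e.g., 3,643 genes giving n = 182), and that the tails are chosen on the mean of the two replicates; the latter choice, the rounding rule, and the function names are assumptions for illustration.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def tail_indices(values, fraction=0.025):
    """Indices of the lowest and highest `fraction` of values (both tails)."""
    k = max(1, int(len(values) * fraction + 0.5))  # per-tail count, rounded half up
    order = sorted(range(len(values)), key=lambda i: values[i])
    return sorted(order[:k] + order[-k:])


def tail_correlation(rep_a, rep_b, fraction=0.025):
    """Pearson r between two replicates, restricted to genes in the tails
    of the mean log2 ratio distribution (assumed selection rule)."""
    mean_ratio = [(a + b) / 2 for a, b in zip(rep_a, rep_b)]
    idx = tail_indices(mean_ratio, fraction)
    return pearson([rep_a[i] for i in idx], [rep_b[i] for i in idx])


# With 3,643 genes this rule selects 182 tail genes in total.
print(len(tail_indices(list(range(3643)))))
```

Restricting to the tails removes the large mass of near-zero ratios, where noise dominates, which is why the tail correlations are consistently higher than the all-gene correlations.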
Intraplatform differential gene expression.
Among the three platforms, the Affymetrix array exhibited the highest rate of differential expression detection (5.1%; 445/8,799 probe sets), followed by the MCW cDNA array (3.1%; 1,087/35,040 probes, excluding control clones) and the Agilent array (2.5%; 521/20,500 probes, excluding control clones).
Interplatform comparison of Affymetrix, Agilent, and cDNA arrays.
To begin our comparison of gene expression measurements between the Affymetrix, Agilent, and cDNA arrays, we again focused on the 2,824 common UniGenes represented by 4,147 25-mer probe sets, 3,606 60-mer oligonucleotide probes, and 4,846 cDNA probes on the Affymetrix, Agilent, and MCW platforms, respectively. Before data filtering, no pairwise comparison showed a Pearson correlation coefficient exceeding 0.50 (Table 2). Using the strategy described above, we first filtered the lowest-quality spots from the shared content of each data set. The remaining probes within each platform were matched by UniGene identification and averaged. This resulted in overall log2 ratio profiles consisting of the 865 UniGenes represented on all three platforms, for which we observed Pearson correlation coefficients of 0.73, 0.68, and 0.75 between the Affymetrix vs. Agilent, Affymetrix vs. cDNA, and Agilent vs. cDNA platforms, respectively. Next, we evaluated the 2.5% tails of the DR+/+-to-WF log2 ratio distribution, represented by 44 UniGenes. Within the 2.5% tails, we observed log2 ratio ranges of -1.72 to -0.97 and 0.94 to 2.44 for the Affymetrix, -1.33 to -0.60 and 0.75 to 2.14 for the Agilent, and -1.74 to -0.58 and 0.57 to 2.14 for the cDNA arrays. Improved Pearson correlation coefficients of 0.91, 0.88, and 0.92 were observed between the 2.5% tails of the ratio profiles of the Affymetrix vs. Agilent, Affymetrix vs. cDNA, and Agilent vs. cDNA array comparisons, respectively.
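The "match by UniGene and average" step followed by a pairwise correlation can be sketched as below. This is an illustrative outline under simple assumptions (an unweighted mean of the probes surviving the quality filter, and a plain Pearson coefficient); the function names, cluster IDs, and log2 ratios are made up.

```python
import math
from collections import defaultdict


def average_by_unigene(probes):
    """Collapse (unigene_id, log2_ratio) pairs to one mean log2 ratio per UniGene."""
    groups = defaultdict(list)
    for unigene_id, log2_ratio in probes:
        groups[unigene_id].append(log2_ratio)
    return {ug: sum(vals) / len(vals) for ug, vals in groups.items()}


def cross_platform_correlation(platform_a, platform_b):
    """Pearson r between two platforms over the UniGenes they share.

    Returns (shared UniGene IDs in sorted order, r).
    """
    a, b = average_by_unigene(platform_a), average_by_unigene(platform_b)
    shared = sorted(set(a) & set(b))
    x = [a[ug] for ug in shared]
    y = [b[ug] for ug in shared]
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((p - mean_x) * (q - mean_y) for p, q in zip(x, y))
    sxx = sum((p - mean_x) ** 2 for p in x)
    syy = sum((q - mean_y) ** 2 for q in y)
    return shared, sxy / math.sqrt(sxx * syy)


# Invented filtered probe-level data for two platforms.
cdna_probes = [("Rn.1", 1.0), ("Rn.1", 3.0), ("Rn.2", -1.0), ("Rn.3", 0.5)]
oligo_probes = [("Rn.1", 4.0), ("Rn.2", -2.0), ("Rn.3", 1.0), ("Rn.9", 7.0)]
shared, r = cross_platform_correlation(cdna_probes, oligo_probes)
print(shared, r)
```

In this toy data one platform reports exactly half the log2 ratio of the other, yet r is still 1: the correlation coefficient is insensitive to ratio compression, which is why accuracy is assessed separately against qRT-PCR slopes.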
This analysis was then expanded to include all of the shared UniGenes between each of the three possible two-way comparisons (Table 2), using the percentage of spots dropped within that subset of probes by the Affymetrix default settings to establish the filtering stringency for the other two platforms. Between the Affymetrix and the Agilent arrays, there are probes representing 3,822 common UniGenes; after filtering 59% of the lowest-quality spots, there were 1,315 common UniGenes available for comparison, yielding Pearson correlation coefficients of 0.65 and 0.92 for the overall log2 ratio profiles and the 2.5% tails, respectively. In this comparison, the 2.5% tails were represented by 66 UniGenes that possessed log2 ratios on the Affymetrix array ranging from -2.36 to -0.98 and 1.06 to 2.56, while those on the Agilent array ranged from -2.67 to -0.65 and 0.84 to 3.20. Between the Affymetrix and the cDNA arrays, there are probes representing 3,452 common UniGenes; after filtering 59% of the lowest-quality spots, there were 1,300 common UniGenes available for comparison, yielding Pearson correlation coefficients of 0.66 and 0.86 for the overall log2 ratio profiles and the 2.5% tails, respectively. In this comparison, the 2.5% tails were represented by 66 UniGenes that possessed log2 ratios on the Affymetrix array ranging from -1.83 to -0.90 and 0.93 to 2.73, while those on the cDNA array ranged from -1.74 to -0.62 and 0.59 to 2.44. Lastly, between the Agilent and cDNA arrays, there are probes representing 9,513 common UniGenes; after filtering 59% of the lowest-quality spots, there were 3,444 common UniGenes available for comparison, yielding Pearson correlation coefficients of 0.70 and 0.86 for the overall log2 ratio profiles and the 2.5% tails, respectively.
In this comparison, the 2.5% tails were represented by 172 UniGenes that possessed log2 ratios on the Agilent G4130A array ranging from -3.14 to -0.53 and 0.57 to 3.29, while those on the cDNA array ranged from -2.24 to -0.53 and 0.51 to 2.26. The concordance in predicted differential gene expression between platforms was 76, 41, and 52% for the Affymetrix vs. Agilent, Affymetrix vs. cDNA, and Agilent vs. cDNA comparisons, respectively.
ANOVA was used to quantify the variance arising from microarray platform differences compared with other sources of data variation (Table 3) (62). Using the 2,824 UniGenes represented on all three platforms, probes (or probe sets) were identified that passed QC on all three platforms and exhibited differential expression (2-fold) on at least one hybridization on any platform. This yielded 157 probes. ANOVA was then performed to assess the effect of the four factors that potentially contribute to the variation in ratio measurements for these genes (the gene itself, the array replicate, the differing platforms, and reverse labeling). The analysis found that neither the intraplatform replicate arrays (P = 2.4 × 10⁻¹) nor the differing platforms (P = 9.3 × 10⁻¹) contributed significantly to the variance, supporting the findings in Tables 1 and 2, where the intraplatform variation was found to be comparable to the interplatform variation, as evidenced by the similar inter- and intraplatform correlations. Interestingly, we have found that the dye-labeling direction caused significantly more variation (P < 1.0 × 10⁻⁶) than either intraplatform replicates or differing microarray platforms, underscoring the need to control the dye-labeling bias in microarray experiments.
The strategy of filtering each data set followed by combining intraplatform probes targeting the same UniGene offers a logical and direct means of comparing the overall performance of the three platforms. However, we also examined the data by identifying on all three platforms the probes that, on replicate analysis, yielded significant t-test results (P < 0.05) and passed their respective QC criteria described above (again using the 50% cutoff established by the Affymetrix MAS 5.0 thresholds). The primary objective for doing this was to identify a large number of common candidate genes for real-time qRT-PCR follow-up to assess the accuracy of each platform. For genes represented on all three platforms, probes targeting 79 UniGenes met these criteria. At this point, Pearson correlation coefficients of 0.95, 0.92, and 0.91 were determined (Table 4) between the overall log2 ratios for Affymetrix vs. Agilent, Affymetrix vs. cDNA, and Agilent vs. cDNA comparisons, respectively. When analysis was restricted to only UniGenes exhibiting a |log2 ratio| (absolute value) >0.5 (i.e., >1.4-fold in either direction), probes representing 24 UniGenes remained that showed interplatform Pearson correlation coefficients of 0.98, 0.96, and 0.95 for the Affymetrix vs. Agilent, Affymetrix vs. MCW cDNA array, and Agilent vs. cDNA array comparisons, respectively. These data show that differentially expressed genes among the highest-quality data derived from each platform are highly correlative. Also shown in Table 4 are the results of this same analysis applied to the more-extensive content shared between each of the possible two-way comparisons, where again high interplatform correlations are observed for the quality-filtered data exhibiting fold changes >1.4-fold. Because the data in Table 4 represent the commonly represented, highest-quality data from each platform, it is not unexpected that higher correlations are observed than for the less-restricted data in Table 2.
Table 4. Comparison of highest quality shared UniGene content (multiple intraplatform representation not averaged)
Validation of differential expression detected by microarray platforms using qRT-PCR.
qRT-PCR was used to validate relative transcript levels for UniGenes that were significantly (P < 0.05) detected by all three platforms and met each platform's respective QC criteria. As shown in Table 4, 79 unique UniGenes met these criteria. Of these, 42 were randomly selected for qRT-PCR follow-up. In instances where a given UniGene was represented by more than a single significantly detected, QC-passing probe (or probe set), the log2 ratio values were averaged for comparison with qRT-PCR. Among the loci selected for qRT-PCR, the Affymetrix, Agilent, and cDNA platforms possessed 9, 4, and 15 UniGenes exhibiting multiple representations, respectively. A total of 42 robust qRT-PCR assays were developed for this study, and the oligonucleotide primer designs as well as key assay characteristics are provided in Table 5.
Shown in Fig. 1 are the observed log2 ratios for each of the three microarray platforms relative to qRT-PCR. Among the 42 genes analyzed, directional concordance was observed between qRT-PCR and the array results for all but two genes, Rn.24783 and Rn.10249 (40/42, 95.2%). By qRT-PCR, Rn.24783 was detected as being underexpressed in the DR vs. WF comparison, possessing a log2 ratio of −0.22 (1.17-fold), whereas the Affymetrix, Agilent, and cDNA platforms each detected Rn.24783 as being overexpressed at log2 ratios of 0.52 (1.44-fold), 0.80 (1.74-fold), and 0.09 (1.06-fold), respectively. Likewise, Rn.10249 was also underexpressed by qRT-PCR, possessing a log2 ratio of −0.43 (1.35-fold), whereas the Affymetrix, Agilent, and cDNA platforms each detected Rn.10249 as being overexpressed, possessing log2 ratios of 1.52 (2.87-fold), 1.23 (2.35-fold), and 0.48 (1.40-fold), respectively. It has been estimated that between 1 and 5% of clones in well-maintained libraries are misassigned, and even adherence to rigorous protocols to maintain library integrity and clone tracking may not guarantee clone identity, since error rates in available cDNA clone collections are documented (15, 30, 44, 51). As part of this study, we have conducted clone identity confirmation by standard dye-terminator sequencing protocols and referencing the obtained sequence result against sequence databases. We have identified a 3.1% (5/160) error rate after analyzing 160 randomly selected clones obtained from 30 different source plates. Because all three array platforms were directionally discordant with qRT-PCR for these two loci, it is unlikely that the discordance between the cDNA array result and qRT-PCR is due to clone misidentification. It is possible that the discordance between the array measurements and qRT-PCR may be due to unknown splice variants; however, the qRT-PCR assays were designed to target a sequence common to the sequences used by each array platform to minimize this possibility.
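The relationship between log2 ratios, fold changes, and directional concordance can be made concrete with a small sketch. The helper names are hypothetical, and the qRT-PCR value for Rn.24783 is entered as negative on the assumption that "underexpressed" corresponds to a negative log2 ratio.

```python
def fold_change(log2_ratio):
    """Convert a log2 ratio to an unsigned fold change."""
    return 2 ** abs(log2_ratio)

def directionally_concordant(array_ratio, pcr_ratio):
    """True when both log2 ratios are positive or both are not positive."""
    return (array_ratio > 0) == (pcr_ratio > 0)

# Values quoted in the text for Rn.24783; the negative qRT-PCR sign is an
# assumption based on the locus being described as underexpressed
pcr = -0.22
arrays = {"Affymetrix": 0.52, "Agilent": 0.80, "cDNA": 0.09}

discordant = [
    name for name, ratio in arrays.items()
    if not directionally_concordant(ratio, pcr)
]
```

Here all three platforms land in `discordant`, matching the description of Rn.24783 as one of the two loci where the arrays and qRT-PCR disagreed in direction.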

Fig. 1. Comparison of log2 ratios generated by Affymetrix, Agilent, and Medical College of Wisconsin (MCW) cDNA arrays compared with quantitative RT-PCR (qRT-PCR). Pearson correlation coefficients of 0.90, 0.90, and 0.96 were determined between the log2 ratios determined by each platform and qRT-PCR, respectively. A: bar graph representation of log2 ratios generated by the 3 array platforms and qRT-PCR for 42 loci. Inset: hierarchical clustering (49) of log2 ratios generated by the 3 array platforms and qRT-PCR for 42 loci. B: ratios generated by the Affymetrix, Agilent, and MCW cDNA arrays were independently plotted against those generated by qRT-PCR for 42 loci, and linear regressions were determined. All 3 microarray platforms underestimated the log2 ratio relative to qRT-PCR, possessing slopes of 0.86 (y-intercept = 0.17, R2 = 0.81), 0.65 (y-intercept = 0.23, R2 = 0.81), and 0.70 (y-intercept = 0.20, R2 = 0.92), respectively. The lines from each independent plot are combined for comparison at bottom right (black line is RT-PCR plotted against itself, slope of 1).
The remaining 40 UniGenes showed directional concordance between the array measurements and qRT-PCR. Pearson correlation coefficients were calculated to evaluate the overall agreement between the 42 gene expression measurements by qRT-PCR and those made for the corresponding genes by the Affymetrix, Agilent, and cDNA platforms. All three microarray platforms correlated well with qRT-PCR, showing correlation coefficients of 0.90, 0.90, and 0.96, respectively. The log2 ratios generated from the three microarray platforms and qRT-PCR from the 42 loci were subjected to hierarchical clustering, using Genesis (49) (Fig. 1A, inset). In this analysis, the three microarray platforms were found to be more closely related to one another than to qRT-PCR. The Agilent and cDNA data sets clustered first with each other, then with the Affymetrix data set, and finally with qRT-PCR.
Because Pearson correlation coefficients can be influenced by a single or few outlying values, the log2 ratios determined for the 42 genes analyzed by qRT-PCR were plotted against the corresponding log2 ratios determined by each of the microarray platforms, and the slope and linear regression were calculated (Fig. 1B). This allowed for the examination of possible systematic biases in the log2 ratio distributions by each of the three platforms. Overall, this analysis revealed that the Affymetrix, Agilent, and cDNA microarray platforms all underestimated the log2 ratio relative to qRT-PCR, possessing slopes of 0.86 (y-intercept = 0.17, R2 = 0.81), 0.65 (y-intercept = 0.23, R2 = 0.81), and 0.70 (y-intercept = 0.20, R2 = 0.92), respectively. This analysis revealed that, for these 42 loci, the Affymetrix platform on average showed the least overall compression. However, these data also indicate that, for log2 ratios between −1.0 and 1.0, both the Affymetrix and Agilent systems exhibited higher variability and tended to overestimate the log2 ratio, while the cDNA platform showed the least overall deviation over the entire log2 measurement range compared with qRT-PCR.
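The regression used to assess compression can be sketched as an ordinary least-squares fit of array log2 ratios (y) on qRT-PCR log2 ratios (x), where a slope below 1 indicates that the array underestimates the ratio. The function name and the synthetic data are assumptions for illustration; the 42 measured loci are not reproduced here.

```python
def linear_fit(x, y):
    """Ordinary least-squares slope, intercept, and R^2 for y regressed on x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared

# Synthetic data: array log2 ratios constructed as 0.7 * qRT-PCR + 0.2
pcr = [-2.0, -1.0, 0.0, 1.0, 2.0]
array = [0.7 * v + 0.2 for v in pcr]

slope, intercept, r_squared = linear_fit(pcr, array)
```

For a simple linear regression, R2 equals the square of the Pearson correlation coefficient, which is consistent with the values reported here (0.90 squared is 0.81, and 0.96 squared is approximately 0.92).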
Functional analysis of differentially expressed genes.
All three arrays compared in this study were designed for discovery-driven experimentation and represent a broad spectrum of genes involved in many aspects of cellular structure and function. Therefore, we speculated, given the high correlation in differentially expressed genes identified by each platform, that, in general, the same biological themes should be identified by the Affymetrix, Agilent, and cDNA arrays. Onto-Express software performs an overrepresentation analysis of the functional gene categories detected by the array relative to all genes assayed on the array using the Gene Ontology (GO) databases as references. This package was used to evaluate the biological functions identified by each platform (10, 28, 29).
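One common way to test whether a functional category is overrepresented among regulated genes is the hypergeometric test. Whether Onto-Express applied exactly this model in the analyses reported here is not stated in the text, so the sketch below should be read as an assumption about the general technique, with a hypothetical function name and hypothetical category counts.

```python
from math import comb

def hypergeom_pvalue(total, in_category, selected, hits):
    """P(X >= hits) when `selected` genes are drawn without replacement from
    `total` reference genes, of which `in_category` carry the annotation."""
    upper = min(in_category, selected)
    favorable = sum(
        comb(in_category, k) * comb(total - in_category, selected - k)
        for k in range(hits, upper + 1)
    )
    return favorable / comb(total, selected)

# Reference size from the text (2,824 shared UniGenes); the category size,
# number of regulated genes, and number of hits are hypothetical
p = hypergeom_pvalue(total=2824, in_category=20, selected=150, hits=6)
```

The reference set matters: using the shared 2,824 UniGenes versus each platform's entire content changes `total` and `in_category`, and therefore the P value assigned to the same category.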
First, we limited the functional comparison only to the shared content, i.e., those probes targeting the 2,824 UniGenes represented on all three platforms. For this, the subset of significantly differentially expressed genes (|log2 ratio| > 0.5; Student's t-test, P < 0.05) passing QC of the shared content derived from each platform was analyzed with Onto-Express using the 2,824 UniGenes represented on all three platforms as a reference. The analysis detected 179 (58/179, P < 0.01), 104 (51/104, P < 0.01), and 142 (58/142, P < 0.01) regulated GO biological processes from the Affymetrix, Agilent, and cDNA array gene expression profiles, respectively; 42 (12/42, P < 0.01) of these processes were detected by all three platforms. Table 6, top, summarizes the most significant commonly identified GO biological processes, shown in order of their significance in the Affymetrix analysis. Cholesterol biosynthesis was the most significantly upregulated process on each platform when comparing the expression profiles of DR+/+ with WF. Lipid biosynthesis and isoprenoid biosynthesis were also common significantly identified GO biological processes from the 2,824 common UniGenes, ranking within the top 10 upregulated GO biological processes in all three individual platform analyses.
Next, we evaluated the identified biological processes of each platform using all significantly detected differentially expressed genes (|log2 ratio| > 0.5; Student's t-test, P < 0.05) passing QC relative to respective entire content (Table 6, bottom). The analysis detected 219 (70/219, P < 0.01), 149 (66/149, P < 0.01), and 175 (61/175, P < 0.01) regulated GO biological processes from the Affymetrix, Agilent, and cDNA array gene expression profiles, respectively; 56 (13/56, P < 0.01) of these processes were detected by all three platforms. Table 6, bottom, summarizes the most significant commonly identified GO biological processes, shown in order of their significance in the Affymetrix analysis along with the P value and rank order determined by Onto-Express for analysis of each data set. Again, cholesterol biosynthesis, lipid biosynthesis, and regulation of cholesterol biosynthesis were independently identified as highly significant upregulated processes by all three platforms. Fatty acid metabolism, cellular extravasation, and response to virus were again identified as highly significant downregulated processes on all three platforms. The data indicate that, despite the differing content on the three different array platforms, similar global biological interpretations were derived from the complex analysis of the complete expression profiles.
 |
DISCUSSION
|
|---|
During the past 10 years, global gene expression profiling with microarrays has become commonplace in both the academic and private research settings, resulting in the publishing of thousands of studies and the public deposition of vast amounts of gene expression data. This repository will continue to grow exponentially, giving laboratory scientists and bioinformaticians the option of mining these resources to address their questions of interest. Therefore, it is vitally important to generate an understanding of the compatibility and reliability of gene expression data generated on different gene expression platforms so that this information can be accurately and efficiently utilized, as well as to establish criteria by which data from different sources can be appropriately integrated.
To date, a number of cross-platform studies have been reported, collectively evaluating different array formats as well as various methodological and statistical approaches. From these reports, a number of pertinent issues have become recognized when attempting to compare data across different microarray platforms. A fundamental factor is annotation, which is dynamic and therefore must be frequently updated for both commercial and in-house-fabricated arrays. The use of the UniGene or RefSeq databases to identify the shared content between array platforms has enabled investigators to identify increased cross-platform content and consistency (36, 37, 50). The reality of the annotation issue was underscored again in this study, as only 45 shared genes were identified among the Affymetrix, Agilent, and cDNA arrays when cross-platform probe matching was attempted with GenBank accession numbers vs. the 2,824 common genes identified through use of the UniGene identifiers. It must also be recognized that there exist a plethora of technical and analytic factors that can impair interplatform comparisons, including biological variability (40), analysis methods, and filtering strategies as well as intra- and interlaboratory variation (2, 24). Studies where these factors may not have been sufficiently weighed have indeed found low correlation between platforms and have likely underestimated the overall reliability of microarrays as a gene expression technology (31, 36, 47). As described above, we have attempted to minimize factors that contribute to interplatform discordance to facilitate a direct comparison of these three array systems with one another and assess their accuracy relative to qRT-PCR.
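The effect of the matching identifier on shared content can be illustrated with a toy example. All probe names, accession numbers, and cluster labels below are hypothetical placeholders; the sketch only shows why probes carrying different accession numbers can still collapse onto the same UniGene cluster.

```python
def shared_identifiers(*platforms):
    """Identifiers present on every platform; each argument maps probe -> identifier."""
    id_sets = [set(p.values()) for p in platforms]
    return set.intersection(*id_sets)

def by_field(platform, index):
    """Select one annotation field (0 = accession, 1 = UniGene) per probe."""
    return {probe: annotation[index] for probe, annotation in platform.items()}

# Hypothetical annotations: probe -> (GenBank accession, UniGene cluster)
affymetrix = {"probeset_1": ("ACC001", "Rn.A"), "probeset_2": ("ACC002", "Rn.B")}
agilent = {"feature_1": ("ACC003", "Rn.A"), "feature_2": ("ACC002", "Rn.B")}
cdna = {"clone_1": ("ACC004", "Rn.A"), "clone_2": ("ACC005", "Rn.B")}

all_platforms = (affymetrix, agilent, cdna)
shared_by_accession = shared_identifiers(*(by_field(p, 0) for p in all_platforms))
shared_by_unigene = shared_identifiers(*(by_field(p, 1) for p in all_platforms))
```

In this toy case no accession number is common to all three platforms, yet both UniGene clusters are shared, the same qualitative pattern as the 45 vs. 2,824 shared genes reported above.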
Among previous reports evaluating interplatform performance, Pearson correlation coefficients have been commonly calculated to measure both intraplatform reproducibility and interplatform agreement. For differentially expressed genes, many commercial array platforms can be optimized to achieve good reproducibility, with Pearson correlation coefficients of >0.80 and often >0.90 being observed within the same laboratory (2, 21, 33, 59). Here we also observe high replicate consistency for differentially expressed genes (Table 1) on all three microarray platforms. Recently, 7 laboratories compared 2 standardized RNA samples on 12 microarray platforms and found acceptable intraplatform reproducibility (Pearson correlation coefficient >0.70); however, interlaboratory reproducibility was found more difficult to achieve (2), especially when differences in laboratory procedures, data acquisition, and data normalization were not eliminated. Their observations emphasize the caution that must be taken when conducting cross-platform comparisons given the inherent protocol and analysis differences that exist between laboratories as well as what is recommended by different commercial vendors.
In general, previous studies comparing either commercially prepared or academically spotted oligonucleotide arrays have found reasonable, although not perfect, correlation among shared content. Barczak et al. (3) compared two spotted 70-mer oligonucleotide sets with the Affymetrix U95Av2 GeneChip, using RNA derived from dissimilar human cell lines, and found high interplatform Pearson correlation coefficients of 0.8–0.9 between the common content. Likewise, other groups have observed reasonable agreement (in most cases, Pearson correlation coefficients >0.70) between Affymetrix U133A/B GeneChips and Amersham CodeLink UniSet Human 20K microarrays (48) and Affymetrix MOE430A/B GeneChips and spotted 65-mer oligonucleotides (58), as well as between Affymetrix U74v2 GeneChips, Amersham Codelink Uniset I arrays (30-mer oligonucleotides), and Agilent mouse 22K development arrays (59). In this study, a Pearson correlation coefficient of 0.65 was observed between the Affymetrix RG-U34A GeneChip and the Agilent G4130A array after data filtering. This level of agreement is consistent with those reported by Bammler et al. (2) in their study of Affymetrix and Agilent murine arrays as well as with those reported by Yauk et al. (59) in evaluating the concordance between Affymetrix and Agilent human arrays. However, we observed much better correlation coefficients between these two platforms (exceeding 0.90) when restricting the analysis to significantly detected differentially expressed genes (Tables 2 and 4), with relatively high concordances in the number of genes being identified by both platforms.
Comparisons of cDNA arrays, particularly those fabricated in an academic setting, with either commercially prepared or spotted oligonucleotide arrays have generally been less favorable (2, 31, 36, 58, 59). Yauk et al. (59) reported correlation coefficients of 0.44 and 0.52 when comparing the shared content between an academically fabricated murine cDNA array with the Agilent 22K mouse development array (60-mer oligonucleotides) or the Affymetrix U74Av2 array. Järvinen et al. (26) recently compared a high-density human cDNA array with the Affymetrix U95Av2 array using a number of analytic approaches with and without data filtering; Pearson correlation coefficients for unfiltered data ranged from 0.54 to 0.63, whereas, after filtering low-quality data, Pearson correlation coefficients ranged from 0.66 to 0.75. When starting with unfiltered data, here we observed Pearson correlation coefficients ranging from 0.34 to 0.50 between the cDNA array and the Affymetrix or Agilent array, respectively (Table 2). When analyzing the 865 UniGenes shared by all platforms after data filtering, improved respective correlation coefficients of 0.68 and 0.75 were observed. Finally, if we only considered the high-quality differentially expressed genes, high Pearson correlation coefficients (0.86–0.96; Tables 2 and 4) were observed. These observations highlight the benefit of our three-color-based QC approach, in that it allows identification of compromised data, thereby enabling efficient data filtering.
Because of its reproducibility and large measurement range (14, 23, 61), qRT-PCR has become the method of choice for quantifying transcript levels as part of microarray follow-up studies (16, 35). It is known that array results are often "compressed" relative to qRT-PCR (16, 17, 41, 42). In this study, all three array platforms, on average, underestimated fold changes relative to qRT-PCR. Here, directional discordances (positive vs. negative log2 ratios) were not observed among the highest-quality array data. However, when conducting qRT-PCR, two UniGenes were found directionally discordant to the log2 ratios generated by the three arrays. Despite these two points, all three array platforms were found highly correlative with qRT-PCR (Fig. 1), all possessing Pearson correlation coefficients of ≥0.90 for the 42 loci tested. Certainly, oligonucleotides can be designed to target sequences more specifically than cDNA arrays, allowing the differentiation of splice variants or closely related genes. However, Chou et al. (8) recently reported that, in the absence of experimental validation, 150-mer oligonucleotides offered the best compromise in terms of accurate gene expression measurement and hybridization specificity. Consistent with this, we find that when using either the regression analysis or the Pearson correlation coefficient as the metric, the longer cDNA probes were more correlative with qRT-PCR than the oligonucleotide probes.
Quality array fabrication can be a difficult proposition in the academic setting, especially when it is not the primary focus of the laboratory. Until recently, it has been impossible to directly and quantitatively evaluate microarray data problems arising from slide fabrication. In this study, we employed cDNA arrays that were fabricated using our novel three-color approach, which allows prehybridization QC and selection of only quality arrays for hybridization. Previously, we have determined that features possessing <5,000 RFU/pixel of support-bound probe yield increasingly compressed and variable gene expression measurements (18, 21). Low-quality spots may not only be due to PCR or mechanical failures but also to the locally high background from spotted probe that has been redistributed during the blocking step of array fabrication (20). Therefore, data filtering using the prehybridization fluorescein image can remove these spots from data sets and improve data reliability (56). Furthermore, the cDNA and Agilent arrays were processed with the Matarray image processing software, which specializes in quantitative QC of data acquisition (53). We have shown that several major sources of data variability, independent of those potentially introduced during array fabrication, are readily identifiable from the posthybridization image, including high or nonuniform noise profiles and low or saturated signal intensities. Through the definition of the quality score, qcom, we have characterized their effect on data reliability (53) and have found the ratio-qcom plot a useful tool for data filtering and normalization (55, 56). Here we have shown that our combined technical and analytic strategies provide a means to identify the highest-quality and most reliable microarray data. 
When such data are used as the basis of an interplatform microarray comparison, unlike many other studies, differing array platforms can be found highly correlative to one another in terms of the genes and pathways they identify as being regulated, as well as to qRT-PCR. More importantly, these results validate our previously reported approaches and offer laboratories a means by which in-house-fabricated arrays can be used with confidence to generate accurate gene expression data.
 |
GRANTS
|
|---|
This work was supported by National Institute of Biomedical Imaging and Bioengineering Grant R01-EB-001421 and National Institute of Allergy and Infectious Diseases Grant P01-AI-42380.
 |
ACKNOWLEDGMENTS
|
|---|
We are grateful to Dr. Josef Lazar for review of the manuscript.
 |
FOOTNOTES
|
|---|
Article published online before print. See web site for date of publication (http://physiolgenomics.physiology.org).
Address for reprint requests and other correspondence: M. J. Hessner, Dept. of Pediatrics, The Medical College of Wisconsin, 8701 Watertown Plank Rd., Milwaukee, WI 53226 (e-mail: mhessner{at}mcw.edu).
* M. J. Hessner and B. Xiang contributed equally to this work. 
1 The Supplemental Material for this article is available online at http://physiolgenomics.physiology.org/cgi/content/full/00243.2005/DC1. 
 |
REFERENCES
|
|---|
- Bakay M, Chen YW, Borup R, Zhao P, Nagaraju K, and Hoffman EP. Sources of variability and effect of experimental approach on expression profiling data interpretation. BMC Bioinformatics 3: 4, 2002.
- Bammler T, Beyer RP, Bhattacharya S, Boorman GA, Boyles A, Bradford BU, Bumgarner RE, Bushel PR, Chaturvedi K, Choi D, Cunningham ML, Deng S, Dressman HK, Fannin RD, Farin FM, Freedman JH, Fry RC, Harper A, Humble MC, Hurban P, Kavanagh TJ, Kaufmann WK, Kerr KF, Jing L, Lapidus JA, Lasarev MR, Li J, Li YJ, Lobenhofer EK, Lu X, Malek RL, Milton S, Nagalla SR, O'Malley JP, Palmer VS, Pattee P, Paules RS, Perou CM, Phillips K, Qin LX, Qiu Y, Quigley SD, Rodland M, Rusyn I, Samson LD, Schwartz DA, Shi Y, Shin JL, Sieber SO, Slifer S, Speer MC, Spencer PS, Sproles DI, Swenberg JA, Suk WA, Sullivan RC, Tian R, Tennant RW, Todd SA, Tucker CJ, Van Houten B, Weis BK, Xuan S, and Zarbl H; Members of the Toxicogenomics Research Consortium. Standardizing global gene expression analysis between laboratories and across platforms. Nat Methods 2: 351–356, 2005.
- Barczak A, Rodriguez MW, Hanspers K, Koth LL, Tai YC, Bolstad BM, Speed TP, and Erle DJ. Spotted long oligonucleotide arrays for human gene expression analysis. Genome Res 13: 1775–1785, 2003.
- Bieg S, Koike G, Jiang J, Klaff L, Pettersson A, MacMurray AJ, Jacob HJ, Lander ES, and Lernmark A. Genetic isolation of iddm 1 on chromosome 4 in the biobreeding (BB) rat. Mamm Genome 9: 324–326, 1998.
- Bouton CM and Pevsner J. DRAGON View: information visualization for annotated microarray data. Bioinformatics 18: 323–324, 2002.
- Bouton CM and Pevsner J. DRAGON: Database Referencing of Array Genes Online. Bioinformatics 16: 1038–1039, 2000.
- Chen YW, Zhao P, Borup R, and Hoffman EP. Expression profiling in the muscular dystrophies: identification of novel aspects of molecular pathophysiology. J Cell Biol 151: 1321–1336, 2000.
- Chou CC, Chen CH, Lee TT, and Peck K. Optimization of probe length and the number of probes per gene for optimal microarray analysis of gene expression. Nucleic Acids Res 32: e99, 2004.
- Diehl F, Grahlmann S, Beier M, and Hoheisel J. Manufacturing DNA microarrays of high spot homogeneity and reduced background signal. Nucleic Acids Res 29: e38, 2001.
- Draghici S, Khatri P, Bhavsar P, Shah A, Krawetz SA, and Tainsky MA. Onto-Tools, the toolkit of the modern biologist: Onto-Express, Onto-Compare, Onto-Design and Onto-Translate. Nucleic Acids Res 31: 3775–3781, 2003.
- Eisen M and Brown P. DNA arrays for analysis of gene expression. Methods Enzymol 303: 179–205, 1999.
- Fodor SP, Rava RP, Huang XC, Pease AC, Holmes CP, and Adams CL. Multiplexed biochemical assays with biological chips. Nature 364: 555–556, 1993.
- Fodor SP, Read JL, Pirrung MC, Stryer L, Lu AT, and Solas D. Light-directed, spatially addressable parallel chemical synthesis. Science 251: 767–773, 1991.
- Freeman WM, Walker SJ, and Vrana KE. Quantitative RT-PCR: pitfalls and potential. Biotechniques 26: 112–122, 124–125, 1999.
- Halgren RG, Fielden MR, Fong CJ, and Zacharewski TR. Assessment of clone identity and sequence fidelity for 1189 IMAGE cDNA clones. Nucleic Acids Res 29: 582–588, 2001.
- Heid CA, Stevens J, Livak KJ, and Williams PM. Real time quantitative PCR. Genome Res 6: 986–994, 1996.
- Heller R, Schena M, Chai A, Shalon D, Bedilion T, Gilmore J, Woolley D, and Davis R. Discovery and analysis of inflammatory disease-related genes using cDNA microarrays. Proc Natl Acad Sci USA 94: 2150–2155, 1997.
- Hessner MJ, Meyer L, Tackes J, Muheisen S, and Wang X. Immobilized probe and glass surface chemistry as variables in microarray fabrication. BMC Genomics 5: 53, 2004.
- Hessner MJ, Singh VK, Wang X, Khan S, Tschannen MR, and Zahrt TC. Utilization of a labeled tracking oligonucleotide for visualization and quality control of spotted 70-mer arrays. BMC Genomics 5: 12, 2004.
- Hessner MJ, Wang X, Hulse K, Meyer L, Wu Y, Nye S, Guo SW, and Ghosh S. Three color cDNA microarrays: quantitative assessment through the use of fluorescein-labeled probes. Nucleic Acids Res 31: e14, 2003.
- Hessner MJ, Wang X, Khan S, Meyer L, Schlicht M, Tackes J, Datta MW, Jacob HJ, and Ghosh S. Use of a three-color cDNA microarray platform to measure and control support-bound probe for improved data quality and reproducibility. Nucleic Acids Res 31: e60, 2003.
- Hessner MJ, Wang X, Meyer L, Geoffrey R, Jia S, Fuller J, Lernmark A, and Ghosh S. Involvement of eotaxin, eosinophils, and pancreatic predisposition in development of type 1 diabetes mellitus in the BioBreeding rat. J Immunol 173: 6993–7002, 2004.
- Hughes TR, Mao M, Jones AR, Burchard J, Marton MJ, Shannon KW, Lefkowitz SM, Ziman M, Schelter JM, Meyer MR, Kobayashi S, Davis C, Dai H, He YD, Stephaniants SB, Cavet G, Walker WL, West A, Coffey E, Shoemaker DD, Stoughton R, Blanchard AP, Friend SH, and Linsley PS. Expression profiling using microarrays fabricated by an ink-jet oligonucleotide synthesizer. Nat Biotechnol 19: 342–347, 2001.
- Irizarry RA, Warren D, Spencer F, Kim IF, Biswal S, Frank BC, Gabrielson E, Garcia JG, Geoghegan J, Germino G, Griffin C, Hilmer SC, Hoffman E, Jedlicka AE, Kawasaki E, Martinez-Murillo F, Morsberger L, Lee H, Petersen D, Quackenbush J, Scott A, Wilson M, Yang Y, Ye SQ, and Yu W. Multiple-laboratory comparison of microarray platforms. Nat Methods 2: 345–350, 2005.
- Jacob HJ, Pettersson A, Wilson D, Mao Y, Lernmark A, and Lander ES. Genetic dissection of autoimmune type I diabetes in the BB rat. Nat Genet 2: 56–60, 1992.
- Järvinen AK, Hautaniemi S, Edgren H, Auvinen P, Saarela J, Kallioniemi OP, and Monni O. Are data from different gene expression microarray platforms comparable? Genomics 83: 1164–1168, 2004.
- Kane MD, Jatkoe TA, Stumpf CR, Lu J, Thomas JD, and Madore SJ. Assessment of the sensitivity and specificity of oligonucleotide (50mer) microarrays. Nucleic Acids Res 28: 4552–4557, 2000.
- Khatri P, Bhavsar P, Bawa G, and Draghici S. Onto-Tools: an ensemble of web-accessible, ontology-based tools for the functional design and interpretation of high-throughput gene expression experiments. Nucleic Acids Res 32: W449–W456, 2004.
- Khatri P, Draghici S, Ostermeier GC, and Krawetz SA. Profiling gene expression using onto-express. Genomics 79: 266–270, 2002.
- Kothapalli R, Yoder SJ, Mane S, and Loughran TP Jr. Microarray results: how accurate are they? BMC Bioinformatics 3: 22, 2002.
- Kuo WP, Jenssen TK, Butte AJ, Ohno-Machado L, and Kohane IS. Analysis of matched mRNA measurements from two different microarray technologies. Bioinformatics 18: 405–412, 2002.
- Kwitek AE, Tonellato PJ, Chen D, Gullings-Handley J, Cheng YS, Twigger S, Scheetz TE, Casavant TL, Stoll M, Nobrega MA, Shiozawa M, Soares MB, Sheffield VC, and Jacob HJ. Automated construction of high-density comparative maps between rat, human, and mouse. Genome Res 11: 1935–1943, 2001.
- Larkin JE, Frank BC, Gavras H, Sultana R, and Quackenbush J. Independence and reproducibility across microarray platforms. Nat Methods 2: 337–344, 2005.
- Lipshutz RJ, Fodor SP, Gingeras TR, and Lockhart DJ. High density synthetic oligonucleotide arrays. Nat Genet 21: 20–24, 1999.
- Livak KJ, Flood SJ, Marmaro J, Giusti W, and Deetz K. Oligonucleotides with fluorescent dyes at opposite ends provide a quenched probe system useful for detecting PCR product and nucleic acid hybridization. PCR Methods Appl 4: 357–362, 1995.
- Mah N, Thelin A, Lu T, Nikolaus S, Kuhbacher T, Gurbuz Y, Eickhoff H, Kloppel G, Lehrach H, Mellgard B, Costello CM, and Schreiber S. A comparison of oligonucleotide and cDNA-based microarray systems. Physiol Genomics 16: 361–370, 2004.
- Mecham BH, Wetmore DZ, Szallasi Z, Sadovsky Y, Kohane I, and Mariani TJ. Increased measurement accuracy for sequence-verified microarray probes. Physiol Genomics 18: 308–315, 2004.
- Park PJ, Cao YA, Lee SY, Kim JW, Chang MS, Hart R, and Choi S. Current issues for DNA microarrays: platform comparison, double linear amplification, and universal RNA reference. J Biotechnol 112: 225–245, 2004.
- Pfaffl MW. A new mathematical model for relative quantification in real-time RT-PCR. Nucleic Acids Res 29: e45, 2001.