Screening anti-inflammatory compounds in injured spinal cord with microarrays: a comparison of bioinformatics analysis approaches

Jonathan Z. Pan, Rebecka Jörnsten, Ronald P. Hart


Inflammatory responses contribute to secondary tissue damage following spinal cord injury (SCI). A potent anti-inflammatory glucocorticoid, methylprednisolone (MP), is the only currently accepted therapy for acute SCI but its efficacy has been questioned. To search for additional anti-inflammatory compounds, we combined microarray analysis with an explanted spinal cord slice culture injury model. We compared gene expression profiles after treatment with MP, acetaminophen, indomethacin, NS398, and combined cytokine inhibitors (IL-1ra and soluble TNFR). Multiple gene filtering methods and statistical clustering analyses were applied to the multi-dimensional data set and results were compared. Our analysis showed a consistent and unique gene expression profile associated with NS398, the selective cyclooxygenase-2 (COX-2) inhibitor, in which the overall effect of these upregulated genes could be interpreted as neuroprotective. In vivo testing demonstrated that NS398 reduced lesion volumes, unlike MP or acetaminophen, consistent with a predicted physiological effect in spinal cord. Combining explanted spinal cultures, microarrays, and flexible clustering algorithms allows us to accelerate selection of compounds for in vivo testing.

  • DNA microarrays
  • clustering analysis
  • cyclooxygenase-2
  • gene expression

inflammatory responses are normally thought to exacerbate secondary tissue damage following spinal cord injury (SCI) (46). Anti-inflammatory treatments have been shown to be effective in reducing inflammation and in sparing spinal tissue for functional improvement (1, 3, 21, 50). These effects are thought to be due to the inhibition of lipid peroxidation (23, 54), prostanoid production (26, 39), and inflammatory cytokine activity (27, 71). We and others have detected a localized inflammatory cytokine response after acute spinal cord contusion (41, 44, 57, 69). Using explanted cultures of adult spinal cord slices (44), we found that the pattern of inflammatory cytokine mRNA induction does not require infiltrating peripheral cells. This injury model allows us to concentrate on response mechanisms within spinal tissue itself early after injury. Therefore, the pattern of gene responses in our explanted cultures would help to predict subsequent physiological events in secondary injury.

The traditional target of anti-inflammatory therapy is the conversion of arachidonate to prostanoids by cyclooxygenase (COX), also known as prostaglandin H2 synthetase (PGHS). There are two gene isoforms of COX in mammals, COX-1 and COX-2. Typically, COX-1 is constitutively expressed in many tissues, including gastrointestinal tract and kidney. An inducible gene, COX-2 is expressed primarily in brain and macrophages (14, 35, 43). COX-3 mRNA, an alternative product of the COX-1 gene, is most abundantly expressed in the mature brain and spinal cord tissue, compared with other tissues (7, 33, 58). COX-1 and COX-2 mRNA (20, 26) and protein (12, 50) are found in spinal cord. COX is the target of the family of nonsteroidal anti-inflammatory (NSAID) compounds, including the broadly antagonistic indomethacin. COX-3 is thought to be the main therapeutic target of acetaminophen (7, 58). It would appear that an NSAID might be effective in reducing inflammatory mechanisms following SCI.

The glucocorticoid methylprednisolone (MP) has been an accepted anti-inflammatory therapy for acute spinal contusion for some time (2); however, there have been doubts expressed about its efficacy (28, 29, 47). Other NSAIDs have been tried with little success (W. Young, personal communication). We wished to systematically compare MP, NSAIDs, and direct inhibition of inflammatory cytokines to determine whether patterns of gene expression could be used to distinguish candidate compounds for in vivo testing.

In the present study, we utilized DNA microarrays to compare gene expression following five anti-inflammatory treatments. We expected to see a major similarity between these treatments due to their anti-inflammatory mechanisms. However, it was hypothesized that there may also be some novel molecular mechanisms to distinguish one compound over others. Four hours following explantation, expression of 5,000 genes was measured by our custom rat microarrays. A number of statistical methods for gene filtering [SAM (67), ANOVA, the Welch F-test, and permuted versions of these] and clustering methods [principal components analysis (PCA), gene shaving (24); K-means, PAM/K-medoids, and data depth-based methods (ReD/DDclust; 30)] were applied to the data. A unique pattern of gene expression was revealed, in which a prominent cluster of genes was selectively enhanced by the COX-2 inhibitor NS398, and this pattern was identified by each method tested. Presumed functions of the selected genes could be associated with protective mechanisms. Finally, the COX-2 inhibitor SC58125 or NS398, which has been recently shown to reduce hyperalgesia and allodynia and to enhance locomotor function (21, 50), was also found to reduce tissue lesion volumes. Selection of candidate anti-inflammatory compounds using explanted spinal cord cultures and microarray assays proves to be an efficient and effective method for accelerating therapeutic screening.


Spinal Cord Slice Culture

Cultured spinal cord slices were used as a model of SCI (44). Adult rat spinal cords were quickly isolated, and then 1-mm slices were placed in wells of a 24-well tissue culture plate. Four slices from a single animal were typically placed in a single well with 0.5 ml of Opti-MEM medium with 1% N2 supplement (Invitrogen). Groups contained multiple wells derived from separate animals. The cultures were incubated at 37°C and 5% CO2 for up to 4 h and then frozen for RNA analysis. In the experiments, groups of cultures were treated with each of five anti-inflammatory compounds: 1) acetaminophen (pretreatment 100 mg/kg ip; 2 mM in culture); 2) indomethacin (pretreatment 10 mg/kg ip; 10 μM culture); 3) MP (0.5 μg/ml culture); 4) NS398 (pretreatment 10 mg/kg ip; 100 nM culture); and 5) a mixture of 500 ng/ml rat recombinant IL-1 receptor antagonist and 3.2 mg/ml soluble TNF receptor:Fc chimeric protein. As indicated, some groups were pretreated with drug 1 h prior to explantation. Each group consisted of three separate cultures analyzed independently. Control groups included both freshly dissected spinal tissue (uninjured) and vehicle-treated cultures (injured without drug treatment).

Spinal Cord Injury

Long-Evans rats were treated with drugs or vehicle 1 h prior to (NS398, 5 mg/kg ip) or immediately after (MP, 30 mg/kg iv; acetaminophen, 128 mg/kg po) SCI. After deep anesthesia, a thoracic (T9-T10) laminectomy was performed. A 10-g, 2.5-mm diameter rod was released from a 25-mm height onto exposed spinal cord, using the MASCIS injury model (19, 36), as we previously described (5). Immediately after contusion, rats pretreated with NS398 were given a second dose (5 mg/kg ip). Six or 12 h following injury, the rats were euthanized, and spinal tissue was removed for atomic absorption spectroscopy.

Lesion Volumes

After contusion, spinal cord segments were removed and weighed, and a small aliquot of the TRIzol homogenate (below) was diluted in water, sonicated, and assayed for total tissue potassium by atomic absorption spectroscopy as previously described (8, 36).

RNA Preparation

RNA was prepared from drug- or vehicle-treated spinal cord tissues. Tissues were homogenized with a tissue grinder in TRIzol (Invitrogen). Chloroform was added to the TRIzol homogenate to separate phases, then the aqueous phase was removed, mixed with an equal volume of 70% ethanol, and loaded onto an RNeasy column (Qiagen, Valencia, CA). The column was washed and RNA eluted following the manufacturer’s recommendations. RNA was subjected to spectroscopic analysis of quantity and purity, with A260/A280 ratios at pH 8.0 between 1.9 and 2.1 for all samples. All RNA samples were subjected to capillary electrophoresis on an Agilent 2100 Bioanalyzer (Palo Alto, CA), with all samples demonstrating sharp 18S and 28S ribosomal RNA bands.



The 4,967 probes on our custom microarrays contain a collection of 4,854 oligonucleotides specific for 4,803 rat cDNA clusters purchased from Compugen (Jamesburg, NJ) and a custom set of 113 oligos designed and synthesized by MWG-Biotech (Ebersberg, Germany). The probes, 65–70 nt in length, are standardized for melting temperature and homology is minimized. All bioinformatics for the oligonucleotides are provided on our searchable web site (

Printing and processing.

Microarrays were printed on poly-l-lysine-coated glass slides using an OmniGrid microarrayer (GeneMachines, San Carlos, CA) and quill-type printing pins. Poly-l-lysine slides were prepared and scanned at 532 nm and 635 nm in an Axon GenePix 4000B Scanner (Axon Instruments, Union City, CA) to evaluate surface quality. Slides were stored in a bench-top desiccator for at least 3 wk prior to use. Oligonucleotides were resuspended to 40 μM in 3× SSC and rehydrated with shaking at room temperature. Printing was performed at 24°C with a relative humidity of ∼50%. After printing, arrays were stored overnight in a Parafilm-sealed plastic slide box in a desiccator at room temperature and postprocessed by standard procedures. Slides were stored at room temperature in a sealed plastic slide box in a desiccator and used for up to 3 mo after printing.


Fluorescent probes were prepared using the Genisphere 3DNA dendrimer system (Genisphere, Hatfield, PA). Two micrograms of total cellular RNA was reverse-transcribed from a “capture-sequence”-containing oligo-d(T)18 primer using SuperScript II (Invitrogen). Following alkaline hydrolysis and neutralization, the cDNA target was hybridized with probes on the array at 58°C for 12 h using a Ventana Discovery workstation (Ventana Medical Systems, Tuscon, AZ) and washed to high stringency (6). Dye- and capture sequence-specific fluorescent dendrimers (Genisphere) were then hybridized at 58°C for 2 h. The arrays were washed after removal from the Discovery workstation and scanned on an Axon GenePix 4000B. Raw data are available from the National Center for Biotechnology Information Gene Expression Omnibus (GEO) repository (

Data transformation.

Image files were processed using Axon GenePix 4.0 software, resulting in text files containing median fluorescence feature intensities and median local backgrounds. Features overlaid with contaminants or near background were flagged. The data files were loaded into an open-source, MIAME-compliant, web-accessible database (BASE) (52). Data were then imported into GeneSpring (Silicon Genetics, Redwood City, CA), and normalized by the “per chip, per spot Lowess” method in GeneSpring (72).

Statistical Analysis

Normalized data were filtered by several methods (see results). For comparing clustering techniques, we used the GeneSpring (Silicon Genetics; version 6.0) filter based on the Welch F-test. We adjusted for multiple comparisons using the Benjamini and Hochberg false discovery rate with cutoff P = 0.05. Thus 382 genes were retained for further analysis. To prevent that our analysis was much affected by spurious variations, we performed clustering analyses on a reduced-dimension data set consisting of the median log ratios across replicates.

Principal components analysis.

For PCA, we started with a “centered” matrix of log expression data, X. Each row corresponds to a different gene, and each column corresponds to one of the seven different conditions to which the spinal cord slices were treated. The xab element of the matrix contains the a-th gene’s expression ratio of condition b to a vehicle control. To compute the principal components, the eigenvalues and their corresponding eigenvectors were calculated from the covariance matrix of conditions. Each eigenvector defines a principal component. Each component represents a weighted sum of the conditions, with the coefficients of the eigenvectors as the weights. Each eigenvalue accounts for some degree of total variance over all genes. Therefore, the eigenvectors with large eigenvalues contain most of the information; eigenvectors with small eigenvalues are uninformative. The way to determine the real dimension of the data, disregarding noisy components, is often heuristic. Eliminating low-variance components, while reducing noise, also discards information. We chose to use one criterion that eliminates all components that accounted for similar, but insignificant (<10%) overall variability (49).

Gene shaving (GS).

This algorithm was designed to identify the most variable genes that coherently cluster together across samples (24). In the gene expression matrix, the rows correspond to genes, and the columns correspond to samples. The algorithm uses an iterative computation based on the principal components from PCA. The calculation generates nested clusters in a top-down manner, starting with all the genes, and decreasing down to only one gene. For each cluster of genes, the largest principal component is computed. The correlation of each gene to this principal component is then computed, and genes having the lowest correlation are removed (shaved off). The shaving process requires repeated computation on the reduced cluster of genes. The gap statistic is used to determine appropriate cluster size (24). We tried different shaving factors, i.e., the fraction of genes removed at each iteration. We present the results with shaving factor 5% and 100 permutation data sets used in the gap algorithm to estimate cluster size. This choice of parameters gave stable results, such that the leading clusters and components did not change when the analysis was repeated.


K-means is a commonly used clustering method. It works on the principle of partitioning the data around cluster representatives, here means. The K-means algorithms iterates between two steps: 1) the nearest-mean allocation, where distances from observations (genes) to the means are calculated and genes grouped to the closest mean; and 2) the updating step, where the group means are recomputed given the new partition. The algorithm iterates until the average distance from observations (genes) to the means is minimized. Because K-means is based on averages, it is a notoriously nonrobust clustering method.

Partition around medoids/K-medoids.

Partition around medoids (PAM) robustifies K-means by using “medoids” as representatives instead of means (32). The medoids are restricted to belong to the set of observations. The PAM algorithm is also iterative. Step 2 is now replaced by a medoid-swap, where candidate observations (genes) replace the current medoids. We applied PAM to our data set for various numbers of clusters. We present results obtained with the number of clusters selected by validation methods such as the gap statistic (65).


PAM is an approximation of the exact K-median. If we relax the condition that the cluster representatives have to be medoids (belong to the set of observations), and instead use the multivariate medians, then we can robustify the clustering further (31). K-medians, like PAM, is an iterative algorithm. It iterates between: 1) nearest-multivariate-median allocation and 2) updating the multivariate medians, which can be computed by a fast iterative algorithm (68).


Jornsten (30) recently proposed a new clustering method based on data depth. DDclust is a scale-independent clustering method and allows for the presence of clusters of different scale (within-cluster variance). The methods described above are all highly scale dependent, with clustering criteria similar to the average within-cluster variance. These methods are therefore easily dominated by a single high-variance cluster. The building block of DDclust is the L1 data depth of Vardi and Zhang (68) defined as follows. For a point and an observation, take the unit vector, pointing in the direction from the point to the observation. Compute the average of all the unit vectors from the point to observations in a data cluster. Define the data depth D = 1 − ||Average of unit vectors||. D is near 0 if the point is close to edge of the data cluster, and D near 1 if the point is close to the center of mass of the data cluster. The interpretation is that 1 − D is the additional probability mass needed at the point to make it the multivariate median of the data cluster. Jornsten et al. (31) extended the L1 data depth to multiple data clusters for the purpose of cluster validation. To validate the outcome of a clustering method, let each observation in a cluster take on the role of the point in turn. With each observation we can associate a depth DW (W, “within”) with respect to the cluster it belongs to and a depth DB (B, “between”) with respect to the nearest competing cluster. The relative data depth (ReD) of an observation is the difference ReD = DWDB. Now, a well-clustered observation has ReD near 1, and an observation on the boundary between clusters has negative or near-zero ReD. An appropriate number of clusters for the data set is selected by maximizing the average ReD.

DDclust uses ReD to do clustering. Instead of separating the two tasks of clustering and validation, we combine the two into one step. DDclust finds the partition of the data that maximizes ReD directly. The algorithm is iterative. At each iteration, candidate observations are relocated and a new partition is accepted if ReD increases. To avoid convergence to local maxima, a stochastic search strategy is employed.

Cluster validation criteria.

As described above, the relative data depth, ReD, can be used to select the number of clusters and also to identify outliers (small ReD) and central observations (ReD near 1). A similar tool, the silhouette width, sil, was introduced by Kaufman and Rousseeuw (32). Consider a point in the data set. Let a be the average distance from the point to all the other observations in the cluster, and let b be the average distance from the point to all the observations in the nearest competing cluster. The silhouette width of the point is sil = (ba)/max(a,b). Now let each observation take on the role of the point in turn. A negative of near-zero sil value indicates that the observation does not cluster well with the data; sil near 1 indicates the opposite. The number of clusters in a data set is selected by maximizing the average sil value. Jornsten et al. (31) found that sil is very sensitive to the scales of the individual clusters and tends to underestimate the number of clusters if high-variance clusters are present.

The gap statistic is another method for selecting the number of clusters (65). The method is based on permutations of the data. For each number of clusters we cluster the original data and many perturbed versions of the data for which the natural number of clusters should be 1. We compute the clustering criterion, i.e., average within-cluster variance. This quantity decreases with increasing number of clusters, on both the original and perturbed data. What we are looking for is a significant gap between the criterion computed on the original data and the criterion computed on the perturbed data. We choose the number of clusters as the smallest number k for which a significant increase in gap is observed by going from k − 1 to k clusters.

Quantitative Real-Time PCR

We confirmed selected microarray results by comparison with mRNA levels obtained by quantitative reverse transcription PCR (Q-RT-PCR) using selected gene-specific primer pairs (Table 2). RNA was reverse transcribed with SuperScript II and random primers as suggested by the manufacturer (Invitrogen). The PCR reactions were carried out using 10 ng of cDNA, 50 nM of each primer, and SYBR Green master mix (Applied Biosystems, Foster City, CA) in 10-μl reactions. Levels of Q-RT-PCR product were measured using SYBR Green fluorescence (51, 70) collected during real-time PCR on an Applied Biosystems 7900HT system. A control cDNA dilution series was created for each gene to establish a standard curve. Each reaction was subjected to melting point analysis to confirm single amplified products.


Adult rat spinal cords were explanted, sliced into 1-mm segments (which served as the “injury”), and placed in culture for 4 h. To identify anti-inflammatory reagents producing novel gene responses in injured spinal cord tissue, groups of explanted spinal cord cultures were treated with each of five anti-inflammatory compounds: acetaminophen, indomethacin, MP, NS398, and a mixture of rat recombinant IL-1 receptor antagonist and soluble TNF receptor:Fc (CKIN). The group was selected to include generalized NSAIDs (indomethacin, possibly acetaminophen), COX-2-selective NSAIDs (NS398), COX-3-selective NSAIDs [acetaminophen (7)], and a generalized blockade of primary, inflammatory cytokines (CKIN) (44). In most cases, rats were pretreated with drugs before tissue dissection to achieve maximal effect at the time of injury. Control groups included vehicle-treated control cultures (“injured 4 h”) and freshly dissected, uncultured slices of untreated spinal cords (“uninjured”). A reference pool was created by combining aliquots of the vehicle-treated control culture group: each microarray contained one treatment group sample and one reference pool sample, and results are expressed as a ratio of net feature fluorescences for these RNAs (treatment group/reference pool). Since all compounds were anti-inflammatory, it was expected that most response patterns would be similar. However, we wished to identify novel mechanisms that distinguish a compound or subset of compounds in the context of spinal injury.

Data were normalized using “Lowess” (72) (also known as “Loess”). The data set was filtered by various methods that we briefly describe here. The GeneSpring filtering method proceeds in three steps. First, a subset of genes that satisfy a “minimum data” requirement is identified. With our version of GeneSpring (6.0) this requirement consisted of a gene having at least two measurements for at least two conditions (here, compounds). The second step is to apply the Welch F-test to this subset of genes. The third step is to apply the Benjamini-Hochberg P value correction for multiple comparisons and retain only those genes with adjusted P values below a threshold (here, P = 0.05). GeneSpring identified a subset of 382 genes that are different in at least one of the conditions. GeneSpring is a popular software tool, and we chose to focus on the GeneSpring output in our clustering analyses. However, the minimum data requirement seems somewhat arbitrary, and the Welch F-test is based on normality assumption for the data. We therefore compared other filtering methods to the GeneSpring filter to better ascertain how reproducible our analyses results were.

We applied standard ANOVA and Welch F-test to the data set where the minimum data requirement was more restrictive. GeneSpring allows for the selection of genes where as few as two compounds are tested. It is an easy simulation exercise to show that high significance levels can be obtained for genes with many missing values, where if the full data had been available, the genes would be far from significant. We apply the filters only to those genes that have at least two observations for all compounds. This subset contains 3,500 genes, compared with the 4962 features flagged as “present” by GenePix and GeneSpring. The Welch F-test allows for different variances for the treatment groups, whereas ANOVA assumes equal variance. With few replicates it is difficult to judge whether variances are equal or not, but since violation of this assumption renders the ANOVA test invalid, it is safer to use the Welch F-test, although this can lead to loss of power. We included both tests for illustrative comparison here. We used the model-based P values and adjusted using the Benjamini-Hochberg method. The ANOVA filter identified a subset of 479 genes, and the Welch F-test identified a subset of 478 genes. The overlap was 274. We also apply ANOVA and the Welch F-test and compute permutation-based P values rather than relying on P values obtained under the normality assumption (ANOVA permuted, Welch F-test permuted). By permuting the data set to generate a nonparametric null distribution, it is possible to calculate P values without requiring normality. However, permutation-based tests suffer from granularity, and the permuted Welch F-test has limited power when there are few replicates. ANOVA and the Welch F-test have been found to be relatively robust with nonnormal data. We include the permutation-based tests/filters for completeness. To perform permutation-based tests, we have to impute missing values. We thus only work with genes that have no more than two missing values (2,532 genes satisfy this criterion) and impute the missing values using k-nearest neighbors, where k = 5. permuted ANOVA identified 559 genes, and permuted Welch F-test identified 222 genes. Finally, we applied the SAM (“significance analysis of microarrays”; 67) filtering method to our data. SAM identified a subset of 120 genes. A comparison of the filtering methods found a general overlap and some exceptions (Fig. 1). Among nonpermuted tests (Fig. 1A), the GeneSpring filter set is most similar to the Welch F-test. As expected, the reduced power of the permuted Welch F-test leads to a reduced overlap with the GeneSpring set (Fig. 1B). Since the GeneSpring set largely overlaps with the other filters and since GeneSpring is commonly used by microarray researchers, we chose to evaluate multiple clustering methods with this filter, which produced a subgroup of 382 genes that were significantly altered in any condition. We concluded that this group of genes could serve as a proxy for the effects of each compound on spinal tissue in the presence of injury.

Fig. 1.

Comparison of genes selected by different filtering methods. Venn diagrams depict overlaps between model-based filtering methods (A) and permutation-based methods (B). The application of each filtering method is described in detail in the text. Vector plots for the selected genes are shown adjacent to each filtering label. SAM, significance analysis of microarrays.

This subset was then log2-transformed and centered, and the eigenvectors constituting the principal components of the set were calculated (Fig. 2). The goal of using PCA was to identify major regulatory patterns in the data set and to ascribe the proportion of total variability explained by each eigenvector. This allows us to focus only on those eigenvectors corresponding to 10% or more of the variability. The first eigenvector (red line in Fig. 2), which explained 55.2% of the total variability, included similar values for each of the five anti-inflammatory treatments. This was expected, but uninteresting. The second eigenvector (green line in Fig. 2), explaining ∼20.7% of the variability, described a surprising effect within our experiment. This vector described a pattern of response that was unique to NS398, the COX-2 inhibitor. We interpreted this result as demonstrating a unique pattern of cellular mechanisms affected by COX-2 inhibition during SCI. This unique pattern of responses represented the goal of our analysis. However, since PCA does not partition the data, we were unable to interpret the response pattern biologically.

Fig. 2.

Principal components analysis (PCA) reveals patterns of gene responses to anti-inflammatory compounds. Principal component eigenvectors (top) were sorted by their accounted variances (eigenvalues, bottom). Eigenvector 1 (red) accounts for 55.2% of variance, which is large, but not interesting, since the pattern is similar across all anti-inflammatory treatments. Eigenvector 2 (green), which accounts for 20.7% of the total variance, revealed an unexpected expression profile, in which a pattern was selectively increased by the cyclooxygenase-2 (COX-2) inhibitor NS398. Remaining vectors were judged to be insignificant, since their eigenvalues accounted for 10% or less of the total variance. MP, methylprednisolone; Acet, acetaminophen; Indo, indomethacin.

Genes were clustered by several methods, including “gene shaving” (24) (Fig. 3, A and B). This method finds successive groups of genes with near orthogonal group means. The size of the clusters was selected via the gap algorithm (65) (Fig. 4). The group corresponding to the first principal component (47.5% explained variance, black line) included 98 genes (a subset is shown in Table 1), some of which were upregulated in the NS398 treatment group. Gene shaving was found to be highly sensitive to the algorithmic parameters used and tended to generate very large clusters. We found the most stable solution to be obtained with shaving factor 5% and 100 permutations used in the gap estimation step (Fig. 3A). Application of the gene shaving algorithm to the raw data set identified a much more selective cluster, containing only 20 genes (Fig. 3B; Table 1), corresponding to the second principal component (red). These 20 genes included chemokines (Scya41 and Scya2, formerly known as MIP-1β and MCP-1, respectively), stress responses (Hspa1a, Hmox1), and those with somewhat unknown function (Vim, Dcn). The identities of genes within this cluster were interpreted as potentially protective to injured tissue (see discussion).

Fig. 3.

Several methods identify an NS398-selective cluster of genes. A: gene shaving clustering of log-transformed data. Gene shaving cluster centers are plotted. After repeating the analysis, the stable GS puts the NS398 genes into the first component of 98 genes; see the hump at NS398 in the black curve. B: gene shaving cluster centers of raw data. The gene shaving algorithm was repeated with raw (non-log) data and now revealed a sharply defined cluster containing only 20 genes (red). The selected cluster members are shown in Table 1. C and D: partition around medoids (PAM)/K-medians. Distance from each observation (gene) to the multivariate median was calculated, and genes were grouped to the closest representative. In C, five clusters were identified but lacked a distinguishable pattern. In D, using eight clusters, the eighth cluster (dark yellow) stood out uniquely with significant upregulation by COX-2 inhibitor, matching the second principal component (Fig. 2). E: K-means. Clustering with the commonly used K-means algorithm exhibited results similar to K-medians, with an NS398-selective group in the sixth cluster (magenta). F: the relative data depth (ReD). ReD calculates the relative data depth for each observation (gene) to assigned and competing data clusters. Depth-based validation methods are independent of scale and focus on cluster separation. We used a novel clustering algorithm, DDclust, to partition the data. The NS398-selective genes were clearly isolated in the fifth cluster (cyan).

Fig. 4.

The gap statistic. A plot of the gap statistic shows the slight difference between five and eight clusters that suggested the selection of eight clusters. “Elbows” in the curve indicate potential natural clusterings.

View this table:
Table 1.

Probes clustered by various methods

In a separate approach, the 382-gene GeneSpring ANOVA-selected data set was clustered by K-means (Fig. 3E), PAM, the K-median algorithm (31) (Fig. 3, C and D), and a new clustering algorithm, DDclust (30) (Fig. 3F). Our motivation was that the latter two algorithms have been found to be highly robust. In addition, DDclust allows for the presence of clusters of very different scales or within-cluster variances. The methods all require that the number of clusters be fixed a priori. To select an appropriate number of clusters, we needed to validate the clustering. Many cluster selection and validation criteria exist. We focused on the following three; the gap statistic (Fig. 4; 65), the average silhouette width (32), and the relative data depth, ReD (31). We applied each selection criteria in conjunction with each method. When PAM or the K-median was used for clustering, gap and ReD chose five clusters for the data set, with gap suggesting an additional alternative clustering with eight clusters (see “elbows” in the curve in Fig. 4). The average silhouette was maximized with only two clusters, but a local maximum was obtained with five clusters. On the raw data set all methods selected five clusters. When we clustered log-transformed data using DDclust, all validation methods selected five clusters. Here, the gap method was used both with the within-cluster sum of squares and with ReD (two separate approaches) as the validation criterion for which the gap between the original and permuted data is computed.

On the log-transformed data the K-median five-group clustering finds patterns corresponding to genes that are similarly upregulated in all treatments, similarly downregulated in all treatments to two different levels, and two group patterns showing the effect of injury (Fig. 3C). If we increase the number of clusters to eight, then the up- and downregulated pattern clusters split and an additional cluster showing the distinct NS398 pattern emerges (Fig. 3D; Table 1). We obtain similar results if we use PAM instead of the K-median (not shown). When we use the K-median with a stochastic search strategy on the log-transformed data with five clusters, we identify a local solution where the NS398 pattern is present. This clustering is not the global solution for which the average distance to the cluster representatives is minimized. However, the ReD and sil validation criteria are much higher for this local solution. Finally, the five-group DDclust clustering picks up the NS398 pattern (Fig. 3F; Table 1). Furthermore, a randomized data set does not produce recognizable clusters, and the ReD and sil values fall off by more than 50% (not shown).

Resulting clusters were examined for fit within their assigned clusters using the ReD method and the silhouette width (Fig. 5). The cluster consisting of genes uniquely upregulated by NS398 exhibits excellent ReD and sil values, indicating it is a stable cluster, clearly separated from the rest (Fig. 3F, cyan). This cluster contains only eight genes and is a subset of the gene shaving cluster (Table 1; Fig. 3B). Seven of the eight genes are conserved in the PAM/K-medoid (K = 8) cluster (Table 1; Fig. 3D). Thus there is a consensus between all these exploratory clustering methods regarding this small set of genes. The differences we do observe are easy to explain. Most clustering methods are scale driven, i.e., the data set is partitioned to minimize a global criterion similar to within-cluster variability or scale. The scale-dependent methods (PAM/K-median) split the large-scale downregulated clusters repeatedly before finding the small NS398 group. However, clusterings that identify this small group early on correspond to higher cluster quality and validation values (sil and ReD) than do those that split the downregulated patterns according to the global scale-dependent clustering criterion (e.g., K-median). DDclust is robust and scale independent and allows for the presence of small and large, tight and large-scale clusters, together. This scale independence of DDclust is also demonstrated by applying the algorithm to log-transformed and raw data. The method identifies near equal clusterings on both data sets (not shown). The K-median algorithm and other scale-dependent methods are more sensitive to the transform (as discussed above). The log transform is commonly applied to microarray data to symmetrize and compress the scale of the data. However, it is important to realize that the transform in most cases does affect the clustering outcome. For downregulated genes the effect of the log transform is to aggressively expand the scale. Other transforms such as the generalized log have thus been proposed to remedy the numerical instability of the log in these ranges (11). A clustering algorithm that is less sensitive to the effects of the various suggested transforms is thus preferable.

Fig. 5.

The ReD validates clusters selected by DDclust. Above the x-axis in are shown the within-cluster data depths, below the axis are depths with respect to the nearest competing cluster. The protrusions at the bottom of the graph represent the potential outliers inside each cluster. The five clusters shown in Fig. 3F are presented along the x-axis from left to right. Apparently, the fifth cluster at the far right of the graph (corresponding to the cyan cluster in Fig. 3F) has very few outliers and presented as a well-isolated cluster of genes. Members of this cluster are listed in Table 1.

To ensure that our selection of a single filtering method did not bias our results, we applied the K-median and DDclust clustering methods to all filtered subsets, and the cluster centers are plotted in Fig. 7. The number of clusters was selected in each case by the ReD validation criterion. In each case the most prominent, distinguishing cluster showed a peak with NS398. Furthermore, of the eight genes clustered by DDclust in Table 1, six of the eight were found in all NS398-peak clusters for all filtering methods. Several additional genes could be added to the NS398 cluster if other filtering methods are used, but the combination of GeneSpring filtering and DDclust appears to provide the most selective and reproducible list.

The means of eight individual gene profiles clustered by DDclust are shown in Fig. 6. Although the ReD analysis suggests this is an isolated cluster, it is apparent that two subsets of patterns emerge. In one case (Scyb2, Hmox1, and GRO1), the mRNAs are relatively more abundant upon injury itself, since the ratio of uninjured to injured diminishes. The other genes are relatively unaffected by injury alone. Although a biological interpretation may be proposed, the inclusion of all eight genes in the cluster is easily explained by the consideration of all seven dimensions when comparing cluster membership.

Fig. 6.

Microarray results from the NS398-selective DDclust cluster. The eight genes selected in the fifth DDclust cluster (Fig. 3F and Table 1) are plotted as the means ± SE (n = 3) ratio of treated culture to pooled, control, untreated culture. Visual examination suggests the cluster may be further split into two groups: one unaffected by injury, and one (Gro, Hmox1, and Scyb2) increased following injury (fresh/uninjured < 1).

Selected genes from microarray results were confirmed by Q-RT-PCR in their changes between drug-treated and vehicle controls (Table 3). In most cases, the fold change of NS398-treated cultures over vehicle-treated cultures was in good agreement with microarray results, but with a larger dynamic range. The only major difference was found for Csh1, which was present at very low concentrations and required different PCR conditions to detect (5-fold increased primer and 5-fold increased cDNA). Under these conditions we began to see additional PCR product in some samples by Tm analysis (not shown), so we consider this assay to be unreliable. All other primer pairs were generally consistent with microarray results.

We interpreted potential expression of the clustered genes in the context of spinal contusion, and we hypothesized that the NS398-enhanced genes would promote preservation of spinal tissue and subsequent function. To test our hypothesis, we performed an in vivo contusion injury, and we assessed the lesion volume after NS398, MP, or acetaminophen treatment (Fig. 8). Other groups had previously shown that COX-2 inhibitor (SC58125 or NS398) reduces the exacerbation of pain syndromes by SCI and enhances the recovered locomotor scores (21, 50). We chose to examine the cellular lesion volume, as calculated from tissue potassium levels, assuming that dead cells quickly release their potassium content to extracellular space (8, 36). Rats were treated 1 h prior to or immediately following injury with either 5 mg/kg ip NS398, 30 mg/kg iv MP, 128 mg/kg po acetaminophen, or vehicle and then contused using the MASCIS device at 25-mm weight drop (19, 36). Following 6 or 12 h of recovery, rats were killed, spinal cords were removed, and five 5-mm segments centered at the T9/10 site of contusion were assayed for total tissue potassium. Only the NS398 treatment resulted in a significant reduction in lesion area compared with vehicle-treated contusions (P < 0.05 Student’s t-test). The significant NS398 reduction in lesion volume was observed at either 6 or 12 h after contusion. In separate experiments, neither acetaminophen nor MP produced an effect on lesion volumes at 6 h. This observation agrees with our hypothesis that NS398 preserves tissue following spinal contusion better than the other anti-inflammatory compounds tested.


The currently accepted therapy for acute SCI is a high-dose MP, administered within 8 h of injury (3). However, the quantity of function preserved by MP is small, and there have been doubts expressed about its efficacy (28, 29). MP is known to have at least two anti-inflammatory mechanisms: action at the glucocorticoid receptor and direct inhibition of lipid peroxidation (22). However, inflammatory cells or mechanisms have been proposed to be neuroprotective or even regenerative under specific circumstances (25, 48, 59), even though other researchers have shown that selective inhibition of inflammatory pathways rescue spinal tissues (18, 41, 45, 46, 50). We wondered whether a more selective attenuation or modification of inflammatory mechanisms might better rescue cells from secondary damage in SCI.

The most obvious targets for selective inhibition of inflammation are the COX family of enzymes, which mediate the production of prostaglandins and other arachidonic acid metabolites. COX-1 is present in spinal tissues, and COX-2 has been found following SCI (50). The COX-selective NSAIDs are valuable in treating a broad range of inflammatory models. Furthermore, we have previously studied cytokine cascades in SCI (44), and we wondered whether a direct inhibition of IL-1 and TNFα might be efficacious. The traditional approach would have been to group large numbers of animals, dose with the desired compounds, perform surgical contusions, and then screen animals for rescued tissue or rescued function. This method is expensive and time-consuming. We chose instead to screen injured, cultured spinal tissue and to assess physiological outcomes by microarray analysis. Our goal was to group compounds with similar physiological effects to distinguish those that differ from the effects of MP. This concept has been used by others (75) but requires sufficiently sensitive bioinformatic clustering methods to extract pertinent physiological patterns.

We compared several filtering methods. The GeneSpring filter is based on the Welch F-test. However, the minimum data requirement seemed arbitrary to us, and we wanted to examine whether our conclusions would hold up if other filtering methods and minimum data requirements were used. We compared SAM, ANOVA, and Welch F-test with a more restrictive minimum data requirement and permutation-based versions of the latter two. There was a considerable overlap between the generated gene lists of all filters but also a significant proportion of genes that were identified by only one or two of the filtering methods. However, regarding the small subset of genes that we had identified as being related to the NS398 compound, there was a general consensus between the filtering methods.

We also compared several clustering algorithms. On raw data the algorithms agreed closely. After applying a log transform we still saw consensus regarding a subset of genes relating to NS398. PCA identifies two components that explain most of the variability in the GeneSpring-selected subset of the data (Fig. 2). The first component appears to demonstrate a common set of physiological effects for every tested anti-inflammatory compound. The second component identified a pattern selective for NS398, the COX-2-specific inhibitor. To partition the data using these PCA vectors, we turned to gene shaving (24). As expected, the GS partitioning reflected patterns observed in PCA, but we found that GS was unable to clearly separate the NS398 group in log-transformed data (Fig. 3A) but did distinguish the NS398 group in nontransformed data (Fig. 3B). The log transform had little effect on PCA. We then turned to K-medians (Fig. 3, C and D). Since this method requires the number of clusters be selected initially, we judged the number of clusters in the data using the gap statistic (Fig. 4), which initially suggested k = 5, but on closer inspection suggested an alternative clustering with k = 8. The clusterings reflected this transition, grouping data into indistinct subsets with k = 5 (Fig. 3C), but providing isolation of the NS398-enhanced genes with k = 8 (Fig. 3D). K-means exhibited results similar to K-median. Finally, the ReD-based DDclust technique readily identified the NS398-increased genes among five clusters (Fig. 3F). Using ReD, we found a tight fit for members of the fifth cluster (Fig. 5). Among the algorithms tested, DDclust produced the most selective and stable clusters and was relatively unaffected by data transformations. Using ReD, we were best able to judge the fit of individual genes to the selected cluster.

With the selection of DDclust as the most robust method, we reexamined the various filtering methods (Fig. 7). Although there were minor variations in genes retained by the various filters (Table 2), DDclust identified essentially identical patterns of NS398-specific clusterings. We conclude that the selection of filtering and clustering methods may be optimized, but the presence of a robust pattern in the data should be detectable under any of the methods tested.

Fig. 7.

Reexamination of all filtering methods with DDclust clustering. The filtering methods depicted in Fig. 1 were clustered using DDclust, where the number of clusters was selected via the ReD criterion. A: GeneSpring 6.0 ANOVA; this result is identical to that shown in Fig. 3F and is shown here for comparison. B: standard ANOVA. C: permuted ANOVA. D: SAM. E: Welch F-test. F: Welch permuted. See text for descriptions. Clusters selected a similar NS398-selective cluster in each of the filtering methods tested. Overlapping results are summarized in Table 2.

View this table:
Table 2.

Consistency of clustering with different filters

The genes selected by the various methods are listed in Table 1, and those selected by DDclust are graphed in Fig. 6. Results were confirmed for NS398 treatment of the DDclust cluster 5 by Q-RT-PCR (Table 3). Of the six genes in common to all filtering methods (Csh1, Dcn, Mgp, Scya2, Scyb2, and Vim; Table 2), we interpret the predicted physiological effects in the context of SCI to be beneficial, i.e., promoting tissue resolution and preserving cells from damage or death. Individual genes and their interpretation are summarized below.

View this table:
Table 3.

Comparison of microarray results with QRT-PCR results for the members of DDclust cluster 5

Several chemokines have been shown to increase following spinal contusion (16, 38, 41), and inhibition of specific chemokines has proven beneficial (17, 18). However, we interpret the NS398-increased chemokines Scya4 and Scyb2 as primary responses to removal of inhibition by prostaglandins (37, 63) and potentially valuable in stimulating the proper activation of infiltrating macrophages. Activated macrophages have been found to promote repair of injured spinal tissue and to enhance functional recovery (48, 59). If correctly activated macrophages are beneficial, then enhanced levels of Dcn are important because Dcn may reverse TGFβ inhibition of macrophage activation (13, 60). Of course, reduced numbers of peripheral macrophages have also been found to preserve spinal tissues (45), and macrophages or microglia are found associated with apoptotic oligodendrocytes (61). These conflicting observations suggest to us that macrophages lacking the appropriate activation signals are likely to cause tissue damage through nonspecific toxic pathways, but that activation may promote mechanisms leading to tissue resolution. Our choice among these hypotheses is admittedly arbitrary.

Gro, also known as CINC-1, macrophage inflammatory protein 2α, growth-related protein β, and KC, is a neutrophil chemoattractant, part of C-X-C family. It has been found to be induced to a higher extent in spinal cord than brain by IL-1β, which is consistent with a more rapid accumulation of neutrophils after injury (4) or ischemia (40). Expression has been noted previously in vascular cells 4 h following spinal contusion (66), and it may induce IL-6 and other signaling molecules such as prolactin (55, 56). We predict that NS398-induced expression of Gro may lead to enhanced neutrophil accumulation and localized damage following SCI but also may be involved in regulating immune response to injury.

Stress response products such as Hmox1 and Hspa1a would be expected to be neuroprotective (64, 73). Induction of homologous Hmox2 by phorbol esters protects brain cultures from oxidative stress (10). Hspa1a (Hsp70) protects against apoptosis (53) and is associated with ischemic preconditioning (42, 74). Csh1 (formerly known as placental lactogen) and prolactin inhibit NO-induced apoptosis of lymphoma cells, potentially by enhancing anti-apoptotic bcl-2 family members (15, 34). The protective nature of these genes is much clearer.

Vim increase is likely a product of glial remodeling, most likely in the ependymal cells (9). Although there is no evidence that ependymal remodeling provides protective function, one might imagine that efforts to repair of the central canal could restore cerebrospinal flow. Finally, Mgp induces expression of BMP-2, a powerful regulator of neuronal development and phenotype (62). In the context of spinal contusion, we interpret these potential responses as being beneficial for injury response or protection from cell damage. We hypothesize that COX-2 inhibition following acute spinal contusion, then, would rescue tissues from damage and thereby preserve function.

In vivo testing of NS398 demonstrates a significant reduction in lesion volume compared with vehicle-treated controls or treatment with either MP or acetaminophen (Fig. 8). We used the determination of tissue potassium by atomic absorption to calculate lesion volumes, as described previously (8, 36). Other studies have found enhanced locomotor function, reduced tissue loss, reduced allodynia, and reduced hyperalgesia following treatment of COX-2 inhibition (21, 50). Our comparisons, including broad NSAIDs such as indomethacin and acetaminophen as well as MP, suggest that the COX-2-selective NS398 produces a distinct physiological response following spinal tissue injury. The finding that NS398’s response is associated with reduced tissue loss and enhanced function preservation matches our interpretation of clustered genes. This study demonstrates that the technique of using microarrays to assess physiological responses can be interpreted to speed compound screening in specific tissue paradigms. Furthermore, the identification of selected gene responses provides hypothesized mechanisms for more traditional investigations.

Fig. 8.

Selective COX-2 inhibitor significantly reduced lesion volumes after spinal contusion. Each panel represents experiments performed at different times. Rats were treated with NS398, MP, acetaminophen, or vehicle. Rats were killed at 6 or 12 h and processed for atomic absorption measurement of tissue potassium as an indicator of the fraction of cells surviving injury, as well as the calculated lesion volume (8, 36). Results from either MP (A) or acetaminophen (B) treatment did not significantly affect lesion volumes at 6 h (Student’s t-test, P > 0.05). However, at 6 or 12 h (C and D, respectively), NS398 treatment significantly reduced lesion volumes compared with vehicle injury controls (Student’s t-test, *P < 0.05). In the NS398 assays, calculated lesion volumes in uninjured tissue (representing tissue space unoccupied by cells) are shown for comparison.


This work was supported by grants from the New Jersey Commission on Spinal Cord Research (to R. P. Hart) and the National Science Foundation (to R. Jörnsten).


We thank Jessica Lam, Donna Wilson, Bor Tom Ng, and Hock Ng for outstanding technical assistance. Dr. Jason Carmel performed the MP experiments in Fig. 8, and Dr. Kaan Erkan performed the acetaminophen experiment in Fig. 8.


  • 1 Gene names are abbreviated according to their LocusLink record, shown in Table 1.

  • Article published online before print. See web site for date of publication (

    Address for reprint requests and other correspondence: R. P. Hart, Rutgers Univ., W. M. Keck Center for Collaborative Neuroscience, 604 Allison Rd, Rm D251, Piscataway, NJ 08854 (Email: rhart{at}



View Abstract