Investigating the molecular mechanisms underlying sarcopenia in humans with the use of microarrays has been complicated by low sample size and the variability inherent in human gene expression profiles. We have conducted a study using Affymetrix GeneChips to identify a molecular signature of aged skeletal muscle. The molecular signature was defined as the set of expressed genes that best distinguished the vastus lateralis muscle of young (n = 10) and older (n = 12) male subjects, when a k-nearest neighbor supervised classification method was used in conjunction with a signal-to-noise ratio gene selection method and a holdout cross-validation procedure. The age-specific expression signature comprised 45 genes; 27 were upregulated and 18 were downregulated. This signature also correctly classified 75% of the muscle samples from young and older subjects published by an independent laboratory, based on their expression profiles. The signature revealed increased expression of several genes involved in mediating cellular responses to inflammation and apoptosis, including complement component C1QA, Galectin-1, C/EBP-β, and FOXO3A, among others. Increased expression of genes that regulate pre-mRNA splicing, localization, and modification of RNA was also a marker of the aging signature. Downregulated genes in the signature included the glutamine transporter SLC38A1, a TRAF-6 inhibitory zinc finger protein, and membrane-bound transcription factor protease S2P, among others. The sarcopenia signature developed here will be useful as a molecular model to judge the effectiveness of exercise and other therapeutic treatments aimed at ameliorating the effects of muscle loss associated with aging.
- k-nearest neighbor classification
- skeletal muscle
Sarcopenia is an “age-related” loss of muscle mass leading to muscle weakness, limited mobility, and increased susceptibility to injury (36, 55). Overall changes with age that contribute to sarcopenia include declines in androgenic and growth hormone concentrations (26), declines in spontaneous physical activity, and changes in dietary intake of protein and/or energy (51). Specifically, in skeletal muscle there is a selective loss of type II (fast twitch) muscle fibers (27), declines in total muscle area (31, 40), reduced muscle capillarization (5), shortening velocity (28), and maximal force (28).
To begin to identify the molecular basis for the loss of muscle mass with age, investigators have measured changes in expression on a global scale during aging in skeletal muscle using differential display (16), serial analysis of gene expression (57), cDNA arrays (23, 43), and oligonucleotide-based microarrays (24, 29, 38, 59). These studies have reported changes in gene expression consistent with decreased protein synthesis, impaired oxidative defense, and decreased activity of mitochondrial proteins. They have also reported differential expression of genes involved in energy metabolism, DNA damage repair, stress response, immune/inflammatory response, RNA binding and splicing, and proteasome degradation. Although these studies have provided insight into the age-related changes in gene expression and thereby the aging process, the human studies in particular have limitations with regard to sample size, number of genes surveyed, overall smaller differences in gene expression, and pooling of samples. Importantly, investigating the molecular mechanisms underlying sarcopenia in humans with the use of microarrays is also complicated by the inherent variability in human gene expression profiles. This variability is likely due to differences in genetics, diet, environment, and habitual patterns of activity, making it more difficult to identify true age-specific alterations. In fact, investigators using the human Affymetrix microarrays to study young vs. older males (59) found that the intragroup (n = 8) variability was so high that a special ratio method needed to be developed to reduce the within-group variance so that mean differences between young and old subjects reached statistical significance (60).
In the present study, Affymetrix HG-U133A GeneChips were used to interrogate the expression of 18,400 unique transcripts and variants, including 14,500 well-characterized human genes (∼22,000 probe sets in total) in the vastus lateralis muscle of 10 young (19–25 yr old) and 12 older (70–80 yr old) sedentary but ambulatory men. We have expanded on previous work in human studies by increasing sample size, controlling recent diet, and, most importantly, focusing our efforts on defining a molecular signature of sarcopenia rather than a general survey of gene changes with age. An aging signature is the minimum set of genes whose differential expression, when taken together, is best at defining and classifying aged skeletal muscle. The computation of molecular signatures using microarray data has proven useful in the classification of cancer subtypes and the classification of normal vs. cancerous tissue in humans. Examples include signatures distinguishing two types of acute leukemia (15), normal vs. malignant oral epithelial tissue (20), and primary vs. metastatic adenocarcinomas (41). Identification of specific signatures in various cancers allows for more specific targeting of treatment regimens. An aging signature in muscle can be used to classify the age of unknown samples, and it can be used to determine the effectiveness of training or pharmaceutical intervention.
The molecular signature of sarcopenia was defined as the set of genes expressed in muscle that best distinguished young (n = 10) and older (n = 12) male subjects when a k-nearest neighbor supervised classification method was used in conjunction with a signal-to-noise ratio gene selection method and a holdout cross-validation procedure. We identified an aging-specific expression signature comprised of 45 genes: 27 were upregulated and 18 were downregulated. This signature was also used to classify the expression profiles of muscle samples from a separate dataset published by an independent laboratory (59). Upregulated genes that represented part of the signature of sarcopenia included several genes involved in mediating cellular responses to inflammation and apoptosis in addition to genes that regulate pre-mRNA splicing, localization, and modification of RNA. The signature also revealed the differential expression of genes involved in glutamine transport and metabolism, circadian rhythm control, WNT signaling, proliferation, and steroid hormone receptor activity. This work will serve as a basis for the future analysis of sarcopenia and as a molecular model to judge the effectiveness of exercise and other therapeutic treatments aimed at ameliorating the deleterious effects of reduced muscle mass in aging humans.
MATERIALS AND METHODS
All subjects were screened in a two-part process. First, a brief medical history screening was used to assess general health and overall eligibility. Second, a full medical screening was performed, including a physical examination, complete medical history, 12-lead electrocardiogram, blood analysis, and a graded exercise test to assess maximal aerobic capacity. Eligible subjects were then invited back for the muscle biopsy procedure and underwent assessment of leg strength and power at least 7 days after the biopsy. This study was approved by the Boston University Institutional Review Board.
Fourteen young (19–25 yr old) and 14 older (70–80 yr old) healthy men were recruited for this study. Subjects were in good health, as evidenced by physical examination and normal clinical laboratory tests. Subjects were not taking prescription medication and were not participating in a regular program of resistance or endurance exercise for the previous 6 mo.
Maximal Aerobic Capacity
Maximal aerobic capacity was determined on a stationary cycle ergometer by a symptom-limited graded test to exhaustion. Oxygen consumption and carbon dioxide production were measured with the use of a computer-interfaced metabolic cart (Ametek, Pittsburgh, PA).
Percutaneous needle biopsies of the vastus lateralis muscle were obtained from all subjects. Subjects were instructed not to engage in any physical activity for 3 days before the biopsy. To further standardize metabolic conditions, subjects were provided a standardized meal (3,000 kJ: 68% carbohydrate, 13% fat, and 19% protein) the evening before and a standardized light breakfast (1,865 kJ: 67% carbohydrate, 11% fat, and 22% protein) 3–4 h before each biopsy. All of the biopsies were obtained in the mid-to-late morning hours to control for diurnal variation. Biopsy specimens were taken under local anesthesia (1% lidocaine), using a 5-mm muscle biopsy needle and applied suction. A portion of each sample (∼75 mg) was homogenized in TRIzol reagent for RNA isolation, and the remaining sample was transversely oriented, mounted in mounting media, and quick-frozen in isopentane cooled to the temperature of liquid nitrogen for histochemical and biochemical analysis.
Leg Strength and Power
Muscle strength of the lower extremities was quantitatively assessed by the one-repetition maximum (1RM) measure of bilateral leg press (LP), using pneumatic resistance training equipment (Kaiser Sports Health Equipment, Fresno, CA). The 1RM is defined as the maximum load that can be moved one time only throughout the full range of motion while maintaining proper form, and this method has been described previously (11). Assessment of bilateral LP peak muscle power was performed using the same pneumatic resistance machines used for 1RM testing. Power was assessed at 40 and 70% of the 1RM. Beginning at 40%, subjects performed the LP at each established percentage of 1RM as fast as possible through the full range of motion. Five maximal attempts were made at each resistance level, with 30–45 s of rest given between each repetition, and the highest power output was used in the analyses.
Transverse sections (10 μm) were cut from mounted samples using a cryostat microtome and mounted on glass slides and fixed in acetone at −20°C for 10 min. Sections were incubated in phosphate-buffered saline (PBS), pH 7.4, for 10 min. Sections were then incubated overnight in a humidified chamber with one of the following primary antibodies in 1% BSA at 4°C: anti-type I myosin heavy chain (MHC) (American Type Culture Collection, Manassas, VA), anti-type IIa MHC (DSMZ, Braunsweig, Germany), or anti-type IIx MHC (212F; generously donated by Dr. Peter Merrifield). Sections were rinsed in PBS (5 × 5 min) and Tris-buffered saline (TBS; 4 × 5 min) and then incubated (1 h) in secondary antibody (biotinylated goat anti-mouse IgG) at room temperature in a humidified chamber. After a washing, sections were incubated with streptavidin-alkaline phosphatase (Zymed, San Francisco, CA), rinsed in TBS, and then incubated in Alkaline Phosphatase Substrate Kit IV solution (Vector, Burlingame, CA). Sections were then mounted on coverslips and sealed.
Fiber type distribution and fiber areas were determined with a computer-operated image analysis system (Bioquant Nova, R&M Biometrics). Briefly, this system captures the light microscope image (magnification 10×), thresholds the image, traces the fiber boundaries, counts the light and dark fibers, and measures the cross-sectional areas of all the fibers. The threshold is established based on similar pixel quality to group fiber types. Citrate synthase activity was determined spectrophotometrically using the method described by Srere (50).
Fresh muscle biopsy samples (50–75 mg) were immediately weighed and homogenized in TRIzol reagent (1 ml TRIzol/50 mg tissue). The muscle homogenates (in TRIzol reagent) were stored at −80°C for up to 4 mo. Once all of the biopsy samples were collected, total RNA was isolated from the frozen homogenates by proceeding with the TRIzol protocol. Next, Qiagen columns were used to remove DNA contamination and improve the quality of total RNA. RNA quality was assessed using the ratio of absorbance at 260 and 280 nm; on average, the ratio was 1.8 (no samples had a ratio below 1.7). Furthermore, all RNA samples were size fractionated, using agarose gel electrophoresis, and stained with ethidium bromide to check for integrity of 18S and 28S RNA. The average yield was 0.5 μg of total RNA per 1 mg of muscle tissue. High-quality RNA with sufficient yield was isolated from 10 young and 12 older subjects.
Sample Hybridization and Quantification
We used the Affymetrix human HG-U133A GeneChip consisting of 22,000 probe sets (including 14,500 well-characterized genes). Details about the design of the GeneChip and sequence selection can be found at the manufacturer’s website (http://www.affymetrix.com). RNA processing and hybridization, including reverse transcription, second-strand synthesis, labeled cRNA preparation, and hybridization to the human U133A GeneChip, were all performed at the Partners Healthcare Gene Array Technology Center (Brigham and Women’s Hospital, Boston, MA) using the recommended protocol by Affymetrix. Hybridization conditions have been detailed elsewhere (34). Fluorescent intensity of hybridized labeled cRNA was scanned with the GeneArray scanner, producing intensity values for each individual probe [perfect match (PM) or mismatch (MM)]. Signal intensities for probe sets, or genes, were calculated using Affymetrix’s Microarray Suite 5.0 (MAS 5.0), which is a reflection of the abundance of sample hybridization. MAS 5.0 also provides a detection call of presence (P) or absence (A) for each probe set, which indicates whether a transcript is expressed above the background intensity on the chip. Presence or absence calls are determined by a Wilcoxon signed-rank test comparing the PM and MM intensities, where in this case presence is assigned at a P value < 0.1.
Microarray data from this project have been submitted to the National Center for Biotechnology Information (NCBI)’s Gene Expression Omnibus, according to the Minimum Information About a Microarray Experiment (MIAME) standards. The data for the sample series GSE1428 can be accessed at http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE1428.
Data Analysis and Statistical Filtering
STEP 1: NORMALIZATION.
Signal intensities on each chip were normalized to reduce the effects of chip-to-chip variations in target preparation, hybridization, and scanning. We used a linear scaling method in which signal intensities for each probe set are multiplied by a normalization term (2). In this case, the normalization term for each chip was determined by comparing the intensity value of the 75th percentile of all the present probe sets for a given chip to the median 75th percentile value of all chips.
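The linear scaling step can be illustrated with a minimal sketch. It assumes that each chip's normalization term is the ratio of the median 75th-percentile value across chips to that chip's own 75th-percentile value; the text does not state the direction of the ratio or the percentile interpolation rule, so both are assumptions here, and the function names `percentile` and `scale_chips` are hypothetical.

```python
from statistics import median

def percentile(values, q):
    """Percentile (q in [0, 100]) of a non-empty list, using linear interpolation."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)

def scale_chips(signals, present):
    """Linearly scale every chip to a common 75th-percentile intensity.

    signals[c][g] is the signal of probe set g on chip c.
    present[c][g] is True when probe set g received a presence call on chip c.
    Returns (scaled signals, per-chip normalization terms).
    """
    # 75th percentile of the *present* probe sets on each chip.
    p75 = [percentile([s for s, p in zip(chip, calls) if p], 75)
           for chip, calls in zip(signals, present)]
    # Assumed normalization term: target (median of the chip values) / chip value.
    target = median(p75)
    factors = [target / v for v in p75]
    scaled = [[s * f for s in chip] for chip, f in zip(signals, factors)]
    return scaled, factors
```

Under this assumption, a chip that is uniformly twice as bright as the median chip receives a term of 0.5, so that all chips share the same 75th-percentile intensity after scaling.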
STEP 2: INTERSUBJECT CORRELATION.
Intersubject correlations were estimated using the Pearson correlation coefficient; results are shown in Supplemental Table S1 (available at the Physiological Genomics web site).
STEP 3: NOISE REDUCTION.
Absence calls are associated with low signal intensity values and/or high levels of intraprobe set variability. If a probe set did not receive any presence calls in either group, it was removed to reduce errors resulting from noise in subsequent analyses. This reduced the dataset from ∼22,000 to ∼12,000 probe sets (or mRNAs).
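As an illustration of this filter, the sketch below keeps a probe set when at least one sample in either age group received a presence call. This is one reading of the rule stated above, not a confirmed reproduction of the authors' script; the function name `filter_present` and the `"P"`/`"A"` string encoding are hypothetical.

```python
def filter_present(calls_young, calls_older):
    """Return the indices of probe sets retained after noise reduction.

    calls_young[g] and calls_older[g] are lists of detection calls
    ("P" for presence, "A" for absence) for probe set g in each group.
    A probe set is dropped only when it has no "P" call in either group.
    """
    return [g for g, (young, older) in enumerate(zip(calls_young, calls_older))
            if "P" in young or "P" in older]
```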
Identification of a Molecular Signature of Sarcopenia
The molecular signature set of sarcopenia was defined to be the fewest number of Affymetrix probe sets that could be used to best distinguish the two age groups (young and older), when a supervised classification method was applied to the gene expression data. The k-nearest neighbor (KNN) supervised classification method (54) was used in conjunction with the feature (i.e., probe set or gene) selection method of Golub et al. (15) and a holdout cross-validation (HOCV) procedure similar to that of Hwang et al. (21). The molecular signature set was obtained by using the gene selection parameters that minimize the total KNN misclassification error, as computed by the HOCV procedure. The 45 marker genes identified are those that best classify age. The validation of these genes as a sarcopenia signature is represented in Fig. 1, where the internal misclassification error rate was computed for the number (n) of top performing genes (average error rate = 0.15). This represents a powerful validation because it is based on the HOCV procedure, which was implemented in a fully self-consistent manner by iteratively removing one subject from each group to create a two-element test set (1 young, 1 older) and then rebuilding the KNN classifier, including gene selection, on the basis of only the remaining subjects (9 young, 11 older). Thus there were 120 iterations for each of the top 200 performing genes and thus the possibility for 240 misclassifications for each of the 198 groupings of genes in the analysis (i.e., the top 3 to the top 200 performing genes = 198 gene groupings). No single gene represents an aging signature per se, but rather it is the signature taken as a whole that contains the class prediction capacity, or the sarcopenia signature. The details of the computational process are described in results.
To independently assess the validity of the sarcopenia molecular signature, we used a previously published dataset (59) that also used the Affymetrix HG-U133A GeneChip to interrogate the vastus lateralis muscle of eight young (21–27 yr old) and eight older (67–75 yr old) males. These data were normalized to our dataset, using the procedures outlined above, and then reduced to our molecular signature set of genes found through the gene selection and HOCV procedures. Subjects were then assigned to one of the groups (young or older) using the KNN classifier.
Annotation of the Affymetrix probes was performed by use of the National Institutes of Health (NIH) database DAVID (http://apps1.niaid.nih.gov/david/) and probe set information from Affymetrix’s NetAffx. The DAVID database integrates information from multiple sources, including Locuslink, Genbank, OMIM, RefSeq, UNIGENE, and GeneOntology as well as NetAffx.
Standard Statistical Testing
Although not used to compute the aging signature, a t-test (P < 0.01) was applied to the young vs. old samples to test for overall age-related differences in mRNA expression. Two hundred forty-five probe sets passed this statistic. Nonredundant mRNAs with known function are listed in Supplemental Table S2.
A summary of the descriptive characteristics is presented in Table 1. Older subjects had lower aerobic capacity (47%, P < 0.001), leg strength (30%, P < 0.001), and power (49%, P < 0.001), but body mass index was not different between groups. Whereas the age-associated decrease in mean fiber cross-sectional area of the type I fibers did not reach statistical significance, the cross-sectional areas of all type II “fast” myosin fibers (type IIa + IIx) were significantly reduced in the older subjects. In addition, citrate synthase activity was 33% lower in the older subjects (P < 0.005), indicating a decreased mitochondrial oxidative capacity.
Identification of a Molecular Signature of Sarcopenia: Overview
The identification of a molecular signature of sarcopenia is based on computational analysis of the gene expression profiles in young vs. older subjects. The molecular signature of sarcopenia was defined to be the fewest number of Affymetrix probe sets that could be used to best distinguish the two age groups (young and older). This was determined by a gene (“feature”) selection method, HOCV procedure, and KNN supervised classification method, as detailed below.
Selection of Discriminatory Genes Using Signal-to-Noise Ratio
To identify the genes that best classify or discriminate age, a “feature selection” (i.e., gene selection) method was used that assigns a discriminatory strength to each probe set based on its signal-to-noise (S2N) ratio (15). The S2N ratio is a modified and more stringent t-statistic that most favors probe sets with large differences in group median values and low within-group variability. Therefore, it is more resistant to outliers and eliminates over- or underestimates of significance due to high variability. All 12,000 probe sets were ranked in descending order of S2N ratio.
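A minimal sketch of an S2N ranking follows. It assumes the commonly cited form of the statistic from Golub et al. (15), the difference of group means divided by the sum of group standard deviations, and it assumes that probe sets are ranked by the absolute value of the score so that both upregulated and downregulated genes can reach the top of the list. Neither detail is spelled out in the text, and the function names are hypothetical.

```python
from statistics import mean, stdev

def s2n(young, older):
    """Signal-to-noise ratio for one probe set (positive = higher in older).

    Assumed form: (mean_older - mean_young) / (sd_older + sd_young).
    Raises ZeroDivisionError if both groups have zero variance.
    """
    return (mean(older) - mean(young)) / (stdev(older) + stdev(young))

def rank_by_s2n(expr_young, expr_older):
    """Rank probe sets by descending |S2N|.

    expr_young[g] and expr_older[g] hold the expression values of probe set g
    in the young and older subjects. Returns (ranked indices, raw scores).
    """
    scores = [s2n(y, o) for y, o in zip(expr_young, expr_older)]
    order = sorted(range(len(scores)), key=lambda g: abs(scores[g]), reverse=True)
    return order, scores
```

Because the denominator grows with within-group spread, a probe set with a large mean difference but highly variable expression scores lower than one with a modest but consistent difference.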
In this work, the KNN method was used to classify “test subjects” generated by the HOCV procedure (see below) for internally validating the aging signature (54). An unclassified subject was classified by polling the classes (young or older) of the KNN (where a “neighbor” is a subject) in expression space, with parameter k = 3 subjects, and assigning membership of that subject (young or older) based on the most prevalent class of its three neighbors. Here, distance between expression profiles is defined as 1 minus the Pearson correlation coefficient of the two profiles.
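The classification rule described above (k = 3, distance = 1 minus the Pearson correlation coefficient, majority vote) can be sketched as follows. The function names and the toy data layout are hypothetical, and tie handling is not discussed in the text; with k = 3 and two classes a tie in votes cannot occur.

```python
from collections import Counter
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def knn_classify(profile, train_profiles, train_labels, k=3):
    """Assign the majority class among the k nearest training subjects.

    Distance between two expression profiles is 1 - Pearson correlation.
    """
    distances = sorted(
        (1.0 - pearson(profile, t), label)
        for t, label in zip(train_profiles, train_labels)
    )
    votes = Counter(label for _, label in distances[:k])
    return votes.most_common(1)[0][0]
```

Note that a correlation-based distance compares the shape of a profile across genes rather than its absolute level, so it discriminates best when the selected genes include both upregulated and downregulated members.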
A HOCV-based error rate calculation was incorporated into the S2N selection method described above to estimate the classification error introduced by the occurrence of false positives (i.e., non-sarcopenia-related genes) in the gene signature. The same procedure provides the internal cross validation of the subjects used to compute the signature, yielding their misclassification error. In general, HOCV estimates the error rates that are likely to occur when the gene signature is used to classify completely new samples that were not used in the original selection procedure. It thus provides a measure of the “robustness” and general biological validity of the signature. The HOCV procedure was carried out as follows.
For each round of gene selection, the top n genes with the highest S2N score were selected from the S2N-ranked list (also referred to as the “gene selection” method). The number n (genes selected) is a parameter of the gene signature selection process, which we keep fixed for any one round of the HOCV procedure. The HOCV itself consists of a series of iterations. For each of the HOCV iterations, the following steps were taken.
1. One subject from each group (1 young, 1 older) is withheld (“held out”) as a “test set,” and the remaining subjects (9 young, 11 older) are placed into a “training set.”
2. The S2N gene selection method is then used to score the probe sets (∼12,000) based on the expression data from subjects in the training set. All probe sets are ranked in descending order of S2N ratio.
3. The subjects in the test set (1 young, 1 older) are then classified as young or older based on the expression of the best n discriminatory genes identified in the training set (9 young, 11 older), using KNN classification as described above.
4. The class assigned to the test samples is then compared with the actual class, and the classification error rate is recorded.
Steps 1–4 were repeated using different test and training sets until every subject in the young group (n = 10) had been paired in a test set with every subject in the older group (n = 12), for 120 iterations in total. Finally, the total misclassification error, based on a tally of the misclassification errors of the 240 separate held-out samples, was computed.
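The holdout loop described above can be sketched end to end as follows. This is an illustrative outline under stated assumptions, not the authors' code: S2N is taken as the difference of group means over the sum of group standard deviations, genes are ranked by absolute S2N, and all function names are hypothetical. The key point it shows is that gene selection is repeated inside every iteration, using the training subjects only.

```python
from collections import Counter
from itertools import product
from math import sqrt
from statistics import mean, stdev

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def top_genes(young, older, n):
    """Indices of the n genes with the largest |S2N| among training subjects.

    young and older are lists of subject profiles (one value per gene).
    Assumes every gene has nonzero variance in at least one group.
    """
    def score(g):
        y = [subject[g] for subject in young]
        o = [subject[g] for subject in older]
        return abs(mean(o) - mean(y)) / (stdev(o) + stdev(y))
    return sorted(range(len(young[0])), key=score, reverse=True)[:n]

def knn(profile, train, labels, k=3):
    """Majority class of the k nearest subjects (distance = 1 - Pearson r)."""
    nearest = sorted(zip((1.0 - pearson(profile, t) for t in train), labels))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]

def hocv_error(young, older, n, k=3):
    """Holdout cross-validation error for a signature of n genes.

    Every young subject is paired once with every older subject as a
    two-element test set. Returns (error rate, number of held-out samples).
    """
    errors = tested = 0
    for i, j in product(range(len(young)), range(len(older))):
        # Step 1: hold out one subject per group.
        train_y = young[:i] + young[i + 1:]
        train_o = older[:j] + older[j + 1:]
        # Step 2: redo gene selection on the training set only.
        genes = top_genes(train_y, train_o, n)
        train = [[s[g] for g in genes] for s in train_y + train_o]
        labels = ["young"] * len(train_y) + ["older"] * len(train_o)
        # Steps 3 and 4: classify both held-out subjects and tally errors.
        for held, truth in ((young[i], "young"), (older[j], "older")):
            predicted = knn([held[g] for g in genes], train, labels, k)
            errors += predicted != truth
            tested += 1
    return errors / tested, tested
```

With 10 young and 12 older subjects, the loop runs 10 × 12 = 120 iterations and classifies 240 held-out samples, matching the counts given in the text. Calling `hocv_error` for each n from 3 to 200 would produce an error curve of the kind shown in Fig. 1.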
The HOCV misclassification error rate, as a function of the number n of probe sets selected in the internal gene selection procedure, for the range 3 ≤ n ≤ 200 is shown in Fig. 1. Note that the error rate first quickly decreases as more genes are added to the classifier, because each new gene provides more discriminatory information. The error rate then levels off at n ∼ 50, undergoing small fluctuations for larger n, because the information content provided by additional genes is now either redundant with that of the genes already present in the classifier or consists essentially of noise. For n > 200 (not shown), the error rate continues to fluctuate around the mean value 0.15, with absolute minimum 0.14.
Under a binomial error model for the classification of the n = 240 held-out samples, the standard deviation (SD) for the estimated mean error rate of 0.15 was 0.023. We used the criterion error threshold = (minimum error rate + 1 SD) or (0.14 + 0.023 = 0.163) to set the actual gene selection cutoff at n = 45, as indicated in Fig. 1. This threshold criterion provides the smallest KNN classifier within a statistically insignificant distance of the absolutely best classifier, and the resulting selection of 45 genes is the most “economical” representation of the molecular signature of sarcopenia.
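The arithmetic behind this cutoff can be checked with a short sketch. The binomial SD formula is standard; the selection rule (take the smallest n whose error rate is at or below the minimum error rate plus 1 SD) is one reading of the criterion stated above, and the error values used in the usage note below are hypothetical, not read from Fig. 1.

```python
from math import sqrt

def binomial_sd(p, n):
    """SD of an estimated proportion p based on n independent trials."""
    return sqrt(p * (1.0 - p) / n)

def smallest_signature(error_by_n, sd):
    """Pick the smallest n whose error is within 1 SD of the minimum error.

    error_by_n maps the number of selected genes n to its HOCV error rate.
    Returns (chosen n, error threshold).
    """
    threshold = min(error_by_n.values()) + sd
    chosen = min(n for n, err in error_by_n.items() if err <= threshold)
    return chosen, threshold
```

For example, `binomial_sd(0.15, 240)` evaluates to roughly 0.023, in agreement with the value reported above. With a hypothetical curve such as `{3: 0.30, 20: 0.20, 45: 0.16, 50: 0.14, 100: 0.15}` and an SD of 0.023, the rule selects n = 45.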
Composition of the Molecular Signature of Sarcopenia
The heat map in Fig. 2 represents the 45 probe sets that make up the molecular signature of sarcopenia. Each column of the heat map is a subject, and each row is a probe set (i.e., gene). The color and intensity of each square represent the relative expression of a gene for each individual. The relative expression of a probe set is determined by transforming the raw expression values to z-scores, which is the number of SDs a gene’s expression value is from the mean expression value of the gene across all subjects. Negative values (blue) indicate expression that is lower than the mean, and positive values (red) indicate expression values that are greater than the mean. The first 27 probe sets represent genes that were upregulated in muscle of older subjects, and the next 18 probe sets were downregulated in older subjects. Genes that best defined age by their differential expression included genes involved in energy metabolism, stress response, inflammation, proliferation/apoptosis, cytokine signaling, transporters, and neuronal remodeling and are summarized by functional category in Fig. 2. The details of these functional relationships are described in discussion.
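The z-score transformation used for the heat map can be sketched as below. The text does not say whether the sample or the population SD was used; this sketch assumes the sample SD, and the function name is hypothetical.

```python
from statistics import mean, stdev

def zscores(values):
    """Convert one gene's expression values across all subjects to z-scores.

    Each z-score is the number of (sample) SDs a value lies from the gene's
    mean across subjects; negative means below the mean, positive above.
    """
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]
```

Applying `zscores` row by row puts every probe set on a common scale, so that colors in the heat map reflect relative rather than absolute expression.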
Validation of Signature Using an Independent Dataset
To independently assess the validity of the molecular signature of sarcopenia, we used a previously published dataset (59) that also used the Affymetrix HG-U133A GeneChip to interrogate the vastus lateralis muscle of eight young (21–27 yr old) and eight older (67–75 yr old) males. These data were processed and normalized using the procedures outlined above. A KNN classifier using the top 45 discriminatory genes identified in the signature was then used to classify each of the subjects without the benefit of any information about the samples. The results of this analysis are shown in Fig. 3 as a three-dimensional scatter plot, where each subject is represented by a marker. Position in expression space is determined by principal component analysis (42), which reduces multidimensional data (i.e., a subject’s expression profile across n probe sets) to a few dimensions (d = 3). The position of each marker is determined by the expression values for the 45 probe sets in the molecular signature of sarcopenia; therefore, subjects with similar expression profiles are closer in three-dimensional space. The accuracy of the prediction for the subjects in the present study was 100% (percentage of subjects correctly classified), while the accuracy of class prediction for the independent dataset (59) was 75% (12/16). The concomitant error rate of 25%, although larger than the ∼15% cross-validation error rate (Fig. 1), is significantly smaller (P = 0.047, 1-sided test) than the error rate of 50 ± 15% that would be obtained under a random (“coin flip”) classification of the 16 samples. This measure of statistical significance, coupled to the visually striking segregation of the eight young and eight older subject samples in the principal component analysis shown in Fig. 3, strongly supports the general validity of the sarcopenia signature, derived on the basis of cross validation alone.
Sarcopenia is associated with well-characterized functional limitations and physical disabilities that occur with advancing age (36, 55). Understanding the underlying biology is important to better identify interventions that can ameliorate the deleterious effects of muscle loss with age. The use of high-density oligonucleotide microarrays has become a valuable investigative tool used to monitor changes in gene expression on a genome-wide level. For instance, these microarrays have been used to study age-based changes in global gene expression in rodents (29, 39, 49), monkeys (24), and humans (58, 59). These studies have demonstrated that between 1 and 5% of mRNAs are differentially expressed in skeletal muscle with aging (16, 29, 39, 59). In general, these studies have reported changes in gene expression consistent with decreased protein synthesis, impaired oxidative defense, and decreased activity of mitochondrial proteins, as well as differential expression of genes involved in energy metabolism, DNA damage repair, stress response, immune/inflammatory response, RNA binding and splicing, and proteasome degradation.
Although these studies have provided insight into the age-related changes in gene expression and thereby the aging process, in humans the changes tend to be of low magnitude (59) and the analysis is complicated by the heterogeneity of human muscle samples (60). These observations indicate the need to employ tests with advanced statistical rigor to extract the most reliable information. We have therefore augmented previous work by applying a technique that has allowed the identification of the smallest set of genes whose expression best identifies aged muscle. This set is known as a molecular signature, and it not only effectively deals with the heterogeneity of human samples and the overall small changes in gene expression but also provides an important reference point for investigators to test interventions that improve the function of aged muscle. This signature was derived using a k-nearest neighbor supervised classification method in conjunction with a feature (gene) selection method and holdout cross validation, which eliminates genes that exhibit high within-group variability (15, 21) that would normally be deemed significant using more standard parametric tests (e.g., t-test). This method identifies a set of genes that are consistently good determinants of age-based gene expression in all subjects studied. In the present work, we have identified 45 genes that define a molecular signature of sarcopenia among 10 young and 12 older male subjects. This number was identified as the smallest number of genes whose cross-validation error rate lay within 1 SD of the minimum error rate in classifying young and older subjects. This work represents the first attempt to identify a molecular signature of sarcopenia. To further validate our signature, we tested its ability to classify the gene expression profiles identified in a separate dataset from an independent laboratory (59).
The signature was able to correctly classify the age of 12 of 16 subjects (accuracy = 75%, significance P = 0.047 against a random classifier). The statistical significance of this result, combined with the striking age-related segregation of the new samples, as observed under a principal component analysis (Fig. 3), strongly supports the general biological validity of the signature set.
Describing the Biology of the Signature: Genes Upregulated in Aged Skeletal Muscle
Table 2 shows the functional categories of genes from Fig. 2 that are upregulated in skeletal muscle with age. Aging is associated with increased levels of interleukin-6 and TNF-α, suggesting chronic low-level inflammation during the aging process (6, 44, 56). It is therefore not surprising that we observed increased expression of several cytoprotective genes, indicating a response to significant cellular damage or chronic inflammation. The gene that showed the largest increase with age and was the best predictor of age was the complement component 1q subcomponent-α peptide (C1Q-α). C1Q-α is the initiating peptide in the classical complement pathway involved in the clearance of pathogens or debris from damaged cells. C1Q-α has also been shown to be necessary for the clearance of apoptotic nuclei. Another upregulated gene, LGALS1, is anti-inflammatory in its ability to activate apoptosis in infiltrating T cells, thereby limiting T cell-induced cellular damage and destruction (18). The upregulation of the CCAAT/enhancer binding protein-β (C/EBP-β) transcription factor provides further evidence of cellular stress or inflammation (53). C/EBP-β is important in the regulation of genes involved in immune and inflammatory responses in conjunction with other transcription factors such as NF-κB (52).
It is thought that inflammatory triggers may induce the loss of muscle cells and myonuclei during human aging through an apoptotic mechanism (9, 30). Indeed, several genes known to play a role in the regulation of apoptosis are among the upregulated genes in this signature. The forkhead box O3A (FOXO3A) is one such gene upregulated in the aged signature. FOXO3A activation has been shown to induce apoptosis by activating the expression of genes necessary for cell death (14, 48). Recent studies have shown the influence of FOXO transcription factors in the transcriptional activation of the ubiquitin protein ligase atrogin-1 during fasting- and glucocorticoid-induced atrophy (45). Welle et al. (59) also found increased FOXO1 mRNA in aged muscle using standard microarray analysis. Another recent study has shown that nuclei of aged muscle contain more FOXO1 than those of young muscle (35), and another shows increased atrogin mRNA in aged rats (39). Thus the FOXO proteins may very well play a role in the loss of muscle mass or muscle nuclei with aging.
Staufen (STAU) is a double-stranded RNA-binding protein involved in the transport and localization of mRNAs via microtubules to specific cellular compartments (61). In muscle, for example, STAU has been shown to accumulate in the postsynaptic region of the neuromuscular junction (1). Belanger et al. (1) have also shown that STAU is expressed at a higher level in slow-twitch muscles than in fast-twitch muscles, and it is induced during myogenesis and denervation. For this reason, it is thought that STAU is involved in the maturation and plasticity of the neuromuscular junction. Increased expression in aged muscle may represent a compensatory response to the loss of functional innervation that occurs during aging (8).
RNA-binding motif protein-9 (RBM9) is a coregulator of nuclear steroid receptor signaling. The expression of RBM9 has been shown to inhibit estrogen receptor-α-mediated transcription (37). RBM9 contains an RNA-binding domain that is necessary for its repressor function. However, the role of this domain is not well understood. The upregulation of this gene serves as an indicator that steroid nuclear receptor-mediated signaling may be perturbed in aged skeletal muscle. Along these lines, the retinoid X receptor-β (RXRB) is also upregulated in the aging signature. This gene is a member of the retinoid X receptor family of nuclear receptors involved in mediating the effects of retinoic acid. RXRB can dimerize with retinoic acid, thyroid hormone, and vitamin D receptors as well as peroxisome proliferator-activated receptors to mediate hormone-mediated transcriptional responses (67). The effect of elevated levels of RXRB in skeletal muscle has not been investigated; however, it has been shown in other cell types to mediate growth and cell death in retinoid-sensitive cells (4).
Slit homolog (Drosophila)-2 (SLIT2), known for its role of repulsion in axon guidance and neuronal migration (65), is upregulated in the aged signature. This could represent another compensatory response to the individual fiber denervation known to occur during aging. Membrane-spanning 4-domains, subfamily A, member-4 (MS4A4A), is also a strongly upregulated marker in the aging signature. This gene encodes a member of a protein family thought to be involved in membrane-based signaling (32). The role of these proteins has not been characterized, but because the expression of this gene is so highly induced in aging muscle, further work is warranted.
Another interesting aspect of the aged signature is the upregulation of the circadian period-2 (PER2) gene. This gene is involved in the entrainment of circadian rhythms that help to optimize daily cycles of sleep, metabolism, and activity (33). Circadian rhythms are regulated by clock mechanisms that are located in central and peripheral tissues and that operate as a transcriptional feedback loop. PER2 is part of the negative arm of this loop, which serves to dampen the expression of genes induced by the positive arm of the loop (25). It has become evident that, during aging, there is a dampening of the frequency and amplitude of circadian rhythms in the central and peripheral tissues (66). It is not known, however, whether this occurs in skeletal muscle, which has been shown to have its own (peripheral) clock (68). The fact that PER2 upregulation is a part of the sarcopenia signature suggests that circadian rhythms may indeed be dampened in skeletal muscle as well. It has also been shown that PER2 plays a role in tumor suppression and DNA damage response (12).
Several RNA-binding proteins are also upregulated in the aged muscle signature. The first three genes, SF3B1, SFRS2IP, and LUC7A, are components of the spliceosome or are spliceosome-interacting proteins involved in constitutive and regulated alternative splicing. Welle et al. (59) have also shown several genes that fall into these categories and are upregulated in senescent skeletal muscle. Our results confirm that the upregulation of some of these genes represents a robust marker of aged muscle. It has been suggested that an increase in these types of proteins may indicate either a misregulation of the splicing apparatus or a shift in regulated alternative splicing in aged muscle (59).
Other genes upregulated in aging skeletal muscle fall into a variety of categories, such as transcriptional regulation, chromatin organization, nuclear structure, cell division, and signaling pathways.
Defining the Biology of the Signature: Genes Downregulated in Aged Skeletal Muscle
Table 3 shows the functional categories of genes from the signature in Fig. 2 that are downregulated with age. One of the most highly downregulated transcripts in muscle, the solute carrier family-38, member-1 (SLC38A1), is a system A sodium-coupled amino acid transporter. It should be noted that SLC38A1 was the best predictor of the young vs. older class distinction and is thus likely to be of importance in defining the aging phenotype. Transporters of system A are responsible for glutamine uptake and can be induced after prolonged periods of amino acid starvation (7). It has been shown that insulin can stimulate system A activity in skeletal muscle in a mechanism thought to resemble GLUT4 glucose transport recruitment (22). The activity of SLC38A1 is not well understood in muscle, but the results indicate that aging muscle may be impaired in its ability to take up neutral amino acids, which may limit growth responses. Further work is necessary to investigate this possibility.
Another downregulated mRNA in the aged signature is the tumor necrosis factor receptor-associated factor (TRAF)-6-inhibitory zinc finger protein (TIZ). TIZ is a TRAF-interacting protein that is capable of inhibiting TRAF-induced activation of NF-κB, receptor activator of NF-κB (RANK), and the c-Jun NH2-terminal kinase (46). This is the first report of a gene capable of inhibiting TNF-induced inflammatory or cell death responses being downregulated in aged skeletal muscle. This, in combination with increasing levels of circulating TNF and other inflammatory mediators in aged subjects, may be in part responsible for skeletal muscle and muscle protein loss during sarcopenia.
The myeloid leukemia factor-1 (MLF1) is downregulated in the aging signature. MLF1 is an oncoprotein involved in translocations associated with erythroid-based acute myeloid leukemia (AML) (63). It has been shown that with AML, MLF1 can interfere with erythropoietin-responsive erythroid-terminal differentiation by blocking cell cycle exit and p27(Kip1) accumulation. Although it has been shown that MLF1 is highly expressed in skeletal muscle (19), its function in muscle cells is not known. The core-binding factor, runt domain, α-subunit 2 translocated to 3 (CBFA2T3), is a transcription factor and putative tumor suppressor that is also downregulated in the aging signature. A fusion of this gene with the runt-related transcription factor-1 (RUNX1) is one of the most common translocations in acute myeloid leukemia (3, 13). Although the role of these two antiproliferative genes has not been explored in muscle, their presence in the aging signature warrants further investigation.
Sex-determining region Y (SRY)-box 17 (SOX17) is an HMG-box-containing transcription factor that is decreased in the aging signature. This gene encodes a member of the SOX (SRY-related HMG-box) family of transcription factors involved in the regulation of embryonic development and in the determination of cell fate (62). SOX17 is thought to act as a transcriptional regulator after forming complexes with other proteins. Although its role in muscle has not been demonstrated, it has been shown to cooperate with β-catenin to regulate the transcription of genes that specify developmental cues during vertebrate gastrulation (47). β-Catenin is an adhesion molecule that participates in Wnt/Frizzled (Fz) signaling, which has also been shown to play a role in myogenesis and muscle growth (64). Dishevelled-associated activator of morphogenesis-2 (DAAM2) is another gene implicated in the Wnt/Fz pathway (17) that had decreased expression in the aging signature. It has also been extensively studied with regard to development, but its role in adult muscle and sarcopenia has not been characterized. Decreases in the expression of these genes may be involved in the reduced plasticity of aged muscle with activity.
Coagulation factor II (thrombin) receptor-like-1 (F2RL1), decreased with age, is a member of the large family of seven-transmembrane region receptors that couple to guanine nucleotide-binding proteins. F2RL1 plays a pivotal role in mediating chronic inflammation. Joint swelling in an adjuvant monoarthritis model of chronic inflammation was shown to be decreased by more than fourfold in F2RL1-deficient mice, indicating that F2RL1 may act as a proinflammatory signal (10). Further studies are therefore necessary to understand its role in the chronic inflammation known to occur in aged skeletal muscle.
In conclusion, this work represents the first attempt to identify a molecular signature of sarcopenia. We identified and internally validated this signature and have shown its predictive power by analyzing its ability to classify subjects from a separate study. The 45 marker genes identified here, taken as a whole, correctly classified all of the subjects in this study (Fig. 3). Together they represent an aging-specific signature for skeletal muscle. This signature reveals changes in gene expression consistent with an inflammatory response. Several genes involved in the clearance of damaged cells, protection from inflammation, and apoptosis define part of the signature. It was also shown that upregulation of genes involved in pre-mRNA splicing, localization, and modification of RNA was an important aspect of the aging signature. Several genes involved in Wnt signaling, glutamine transport, proliferation, and steroid hormone receptor activity, as well as a FOXO transcription factor, were contained in this signature. This work will serve as a basis for the future analysis of sarcopenia in aging subjects as well as a model by which to judge the effectiveness of therapies to ameliorate the deleterious effects of reduced muscle mass with age. It will be interesting to determine whether the genes in the signature will exhibit plasticity during these interventions or whether they will remain unchanged, since they showed low heterogeneity with respect to age-specific expression and therefore may be independent of variations in habitual patterns of activity and genetics.
This research was supported by NIH Grant R21-AG-019754.
We thank Katsuhiko Funai for biochemical analysis of muscle samples. We are grateful to R. Bridge Hunter for assistance with figures.
1 The Supplemental Material for this article (Supplemental Tables S1 and S2) is available online at http://physiolgenomics.physiology.org/cgi/content/full/00249.2004/DC1.
Article published online before print. See web site for date of publication (http://physiolgenomics.physiology.org).
Address for reprint requests and other correspondence: S. C. Kandarian, 635 Commonwealth Ave., 4th Floor, Boston, MA 02215.
- Copyright © 2005 the American Physiological Society