The single nucleotide polymorphism (SNP) within the TCF7L2 gene, rs7903146, is, to date, the most significant genetic marker associated with Type 2 diabetes mellitus (T2DM) risk. Nonetheless, its functional role in disease pathology is poorly understood. The aim of the present study was to investigate, in vascular smooth muscle cells from 92 patients undergoing aortocoronary bypass surgery, the contribution of this SNP in T2DM using expression levels and expression correlation comparison approaches, which were visually represented as gene interaction networks. Initially, the expression levels of 41 genes (seven TCF7L2 splice forms and 40 other T2DM relevant genes) were compared between rs7903146 wild-type (CC) and T2DM-risk (CT + TT) genotype groups. Next, we compared the expression correlation patterns of these 41 genes between groups to observe if the relationships between genes were different. Five TCF7L2 splice forms and nine genes showed significant expression differences between groups. RXRα gene was pinpointed as showing the most different expression correlation pattern with other genes. Therefore, T2DM risk alleles appear to be influencing TCF7L2 splice form's expression in vascular smooth muscle cells, and RXRα gene is pointed out as a treatment target candidate for risk reduction in individuals with high risk of developing T2DM, especially individuals harboring TCF7L2 risk genotypes.
- correlation networks
- differential expression
- differential coexpression
- Type 2 diabetes mellitus
type 2 diabetes mellitus (T2DM) is a disorder associated with an increased risk of cardiovascular disease and affects more than 300 million people worldwide according to the World Health Organization (58a). T2DM results from the interaction between genetic and environmental risk factors, being characterized by high blood glucose levels, insulin resistance, and, at a late stage, reduced insulin secretion by pancreatic β-cells.
Despite the multiple loci associated with T2DM in genome-wide association studies (GWAS)(58), the single nucleotide polymorphism (SNP) within the TCF7L2 gene, rs7903146, is, to date, the most significant genetic marker associated with disease risk, being consistently associated with T2DM in different populations, including the Brazilian (8, 21, 34, 52, 54). Nonetheless, little knowledge of its contribution to disease pathology is understood.
One way to understand this contribution is studying how genotype can affect cellular phenotypes. Expression quantitative trait locus studies (eQTLs) are among the most popular strategies for studying cellular transcription phenotypes. In this analysis, genetic effects on transcript levels are mapped, or associated, to genetic polymorphisms (10).
In a rather different approach, some studies assume that genes and proteins interact in networks to modulate cellular processes. Analysis of these networks can provide insights into gene interactions and functions and also an understanding on how gene interactions work in disease states.
Thereby, recent investigations have gone beyond testing for one-dimensional differential gene expression and, comparing different conditions or different biological types, have started searching for high-order differences in expression and coexpression patterns (2, 7, 25). The identification of these differences between groups provides further understanding of the differences between gene networks and also allows one an inference about the potential central causes of deregulation in certain pathway or system (11).
The aim of this work was to develop gene expression and coexpression networks with TCF7L2, PPARγ, CDKAL1, JAZF1, IGF2BP2, WFS1, CDKN2A, CDKN2B, CAMK1D, and HHEX genes, all have, or are close to, genetic variants associated with T2DM, and 31 other genes related to T2DM pathways or to cellular proliferation, to explore if gene expression and coexpression relationships between genotypic types (wild-type and risk genotypes for rs7903146) are different.
MATERIALS AND METHODS
Study sample and ethics statement.
Smooth muscle cells from human mammary artery (H-MA) segments were obtained from 104 patients undergoing aortocoronary bypass surgery at the Heart Institute (InCor), University of Sao Paulo Medical School. All individuals gave informed consent to participate in the study, which was reviewed and approved by the local Ethics Committee (SDC 2454/04/074 - CAPPesq 638/04). The investigation is in accordance with the principles outlined in the Declaration of Helsinki (see Ref. 58b).
Genotyping rs7903146 for 92 subjects and separating them into two groups: wild-type and T2DM risk genotypes.
Genomic DNA was extracted from peripheral blood leukocytes by means of a standard salting-out procedure. In the present analysis we used genotype data on 92 samples (Table 1).
DNA samples were diluted with PCR grade water to a concentration of 10 ng/μl. Samples were preselected for bidirectional sequencing by high-resolution melting (HRM) (47) manually analyzed curves or were directly bidirectionally sequenced. For HRM, the reaction consisted of 10 ng of genomic DNA, 1× assay buffer, 2 mM MgCl2, 200 nM of each primer, 200 μM of dNTPs, 1.5 μM of SYTO9 (Invitrogen, Carlsbad, CA), 0.5 U of GoTaq DNA Polymerase (Promega), and PCR grade water in a volume of 10 μl. PCR cycling and HRM analysis were performed on the Rotor-Gene 6000 equipment (Qiagen, Courtaboeuf, France). Bidirectional sequencing was performed by 3500xL Genetic Analyzer (Applied Biosystems) using BigDye Terminator v3.1 Cycle Sequencing Kit (Applied Biosystems) and manufacturer protocols. PCR primers, which flank the rs7903146 polymorphism region, were designed using the software Primer3 (46). PCR primers used were: 5′ CCG TCA GAT GGT AAT GCA GA 3′ and 5′ GTA GCA GTG AAG TGC CCA AG 3′ for HRM, and 5′ CTG AAC AAT TAG AGA GCT AAG CA 3′ and 5′ ACT AAG GGT GCC TCA TAC GG 3′ for sequencing.
The 92 subjects' samples were separated according to rs7903146 polymorphism into two genotype groups: CC (wild-type group/group without the risk allele associated with T2DM) and CT + TT (risk group/group with the risk allele).
Primary culture of human smooth muscle cells.
H-MA smooth muscle cells (SMCs) were obtained by explant protocol. In summary, the endothelial layer was removed by mechanical friction, and small fragments (∼0.09 cm2) of vessels were placed on six-well culture plates containing a 3% gelatin. The fragments were cultured with Dulbecco's modified Eagle's medium containing 20% fetal bovine serum, 100 U/ml penicillin, and 100 μg/ml streptomycin. After ∼2 wk, the SMCs derived from vessel fragments were isolated and expanded. SMCs were characterized by hill-and-valley growth pattern and by immune-fluorescence staining for α-smooth muscle actin.
Selecting TCF7L2 splice forms and genes for expression and coexpression analysis.
To decide which genes to quantify we selected 10 of the genes that have, or are close to, SNPs with the lowest P values in GWAS for association with T2DM as a starting point (10, 57). For the purpose of analyzing the possible rs7903146 cis-acting regulation on TCF7L2 gene, we quantified seven TCF7L2 splice forms (16) [some, like B and F, that have been studied by other works (41, 56) and other poorly studied splice forms] (Table 2 and Fig. 1). We also have selected genes that, according to the KEGG database, are related to T2DM pathways (for example, insulin signaling pathway or the T2DM pathway itself) and genes whose expression could also be affected by a TCF7L2 gene expression deregulation, as genes related to cell proliferation. TCF7L2 expression is known to influence the expression of genes that have a regulatory role in the cell cycle process and, consequently, in the cellular proliferation process, which is highly affected during T2DM pathology (20, 26, 27, 36). Using data from I2D (6), an online tool that assemble several databases of protein interactions (MINT, BIND, IntAct, HPRD, etc.), we also selected genes with close relationships (direct protein interactions) with the 10 genes that are close or have T2DM-associated SNPs. The list of all quantified genes is in Table 3.
RNA expressions of H-MA SMCs were quantified by quantitative real time RT-PCR.
Total RNA was isolated with TRIzol reagent (Invitrogen) according to the manufacturer's instructions and cDNA synthesis was performed with SuperScript III Reverse Transcriptase (Invitrogen). To avoid genomic DNA's being amplified, we designed primers into different exons, which have a large intron between them that would not be amplified by our protocol. In all designed assays we observed specific bands for all amplified products.
We used 5 ng of cDNA for real-time RT-PCR reaction (SYBRw Green PCR Master Mix-PE Applied Biosystems) in an ABI Prism 7700 Sequence Detection System (Applied Biosystems). For each subject, samples were assayed in duplicate or triplicate and each gene average expression was used as the standard measure. Target transcript expression was normalized to 28S (reference gene) ribosomal mRNA expression by the ΔCt method, and the linearized ΔCt (i.e., 2−ΔCt) was used for comparative purposes (32). CT indicates the fractional cycle number at which the amount of an amplified target gene reaches a fixed threshold, and ΔCT is the difference in threshold cycle for a target and a reference gene.
The two genotype groups were analyzed in terms of the prevalence of T2DM to evaluate an association between disease and groups. Effects in expression mean values by the number of cellular replication cycles in primary culture, age, T2DM, and sex were also evaluated, with a P value of 0.05 or lower considered as significant for these analyses. For the analysis of environmental effects we used the total SMC expression database, which consists of 104 patients.
Assembling protein interaction networks.
Using data from I2D, we assembled a protein interaction network, for each analyzed group, with the 41 chosen genes (one of the TCF7L2 splice forms representing the TCF7L2 gene, and the other 40 genes) that had measurements, other genes that have SNPs associated with T2DM, and genes that have protein interactions connecting the two mentioned gene groups. For each group, the gene set behavior was graphically visualized with color scales representing average gene expressions. For this graphical representation, to keep average gene expressions in the same value range, the values (2−ΔCt) were centered and adjusted by the standard deviation of the gene expression distribution. Cytoscape platform v.2.8.0 was used for visualization (53).
Comparing gene coexpression patterns between groups and representing them in networks.
The gene expression correlations of the 41 genes (one of the TCF7L2 splice forms representing TCF7L2 gene, and the other 40 genes) were analyzed pair-wise, and all the correlation coefficients were compared between the two genotype groups, to ascertain whether they were statistically similar or different. The main objective of these analyses was to observe the genes that have the largest number of significantly different correlations between genotype groups. For each genotype group, we also evaluated coexpression networks between the TCF7L2 splice forms to observe different relationships between them. Correlation networks were built and visualized with the Cytoscape platform v.2.8.0. Networks of the two groups were topologically compared with the Cytoscape plug-in NetworkAnalyzer v.2.7 (5).
Statistical analysis: comparing expression and coexpression patterns between groups.
The authors had full access to the data and take the responsibility for its integrity.
Hardy-Weinberg equilibrium for the genotypes was estimated by the χ2-test. Testing the association between T2DM and the genotype groups was also conducted through the χ2-test.
Since the quantified genes did not present a normal distribution, average gene expression values (2−ΔCt) by real-time RT-PCR are presented as median for each group. Comparisons, between groups, of gene median expression values were analyzed via a Mann-Whitney U-test, and SPSS v.13 program for Windows was used for this analysis. We considered a P value of 0.05 or lower significant and P values of 0.075–0.05 as suggestive of differences for comparisons.
Gene expression correlations were analyzed pair-wise through Spearman correlation. The correlation coefficients of each gene pair were compared, between the groups, through a test to evaluate significance of the difference between two correlation coefficients (37). We considered a P value of 0.05 or lower significant for difference between correlation coefficients.
From the analysis of the 92 studied subjects (donor of the cell lines), rs7903146 genotype frequencies were: CC = 52.2% (n = 48). CT = 38.0% (n = 35). TT = 9.8% (n = 9). No departure from HWE was observed. Thirty-four subjects had T2DM or altered fasting glucose (AFG), 14 of the affected (41.18%) were of the wild-type group, and 20 (58.82%) were of the risk group (Table 1).
Testing for the effect of sex, T2DM, and age in the measurements, we observed that MAP4K4 expression was affected by sex, and D and F TCF7L2 splice form's expressions were affected by age (Table 4). We also observed that many genes' expression levels were affected by the number of cellular replication cycles in primary culture (Table 4). Therefore, for genes whose expressions were influenced by the replication cycle number, we adjusted the expression values (2−ΔCt) for this variable and worked with the residuals of this adjustment.
Of note, after adjustment, separating individuals into diabetic and nondiabetic groups, and further comparing average gene expression levels of the seven TCF7L2 splice forms, and of the 40 genes between these groups, we observed that only JUN and ERBB4 had significant differences in median gene expression levels (Tables 5 and 6, Fig. 2).
Comparing expression medians and protein interaction networks between genotype groups.
We first sought to identify if TCF7L2 risk genotypes affect TCF7L2 splice forms' expression (cis acting). Considering the seven analyzed splice forms' expressions, we observed that five (splice forms A, B, C, D, and E) were significantly affected by genotype (Table 7, Fig. 3). We also compared expression median levels of the other 40 genes between the two genotype groups and observed that there were significant differences in median expression levels for nine of them (CDKAL1, IGF2BP2, JAZF1, CDKN2B, CAMK1D, JUN, CDK4, ATP2A2, and FKBP1A) and a tendency for two (CALM1 and WFS1) (Table 8).
Through a combined analysis using gene expression and protein-protein interaction network data, we compared the two group's networks and noticed that several genes that were directly interconnected in the networks, such as ATP2A2, PLN, CALR, CALM1, JUN, IL1R1, EP300, TNFα, AKT1, HHEX, CAMK1D, and TCF7L2 (represented by splice form B), had higher expression measurements in the risk group compared with the wild-type group (Fig. 4). Observing both protein networks, we found that TCF7L2 and other 28 genes (10 being statistically significant and two with a trend toward statistical significance) had higher median expression levels in the risk group (Fig. 4).
Comparing gene coexpression patterns between genotype groups.
First, we did a between-groups comparison of the gene expression correlation coefficients related to TCF7L2 splice forms with the other 40 genes. We observed that splice form B had the largest number of significantly different expression correlation coefficients (nine) (Table 9).
Comparing between the two genotype groups, the gene expression correlation coefficients referred to the 41 genes (820 pairwise comparisons), using splice form B to represent TCF7L2 expression, we identified RXRα as the gene with the largest amount of significantly different expression correlation coefficients (eleven) (Table 10). When we looked for the second in number of significantly different expression correlation coefficients, we observed IGF2BP2, CALR, and CALM1 with 10 different coefficients between the two groups (Table 10). Together with the B TCF7L2 splice form, these four genes listed account for >50% (43 of 83) of the significant differences in correlation detected. The other correlation comparisons are shown in Table 11. Analyzing the significant correlation differences using a correlation threshold of ± 0.4 and ± 0.5 for at least one of the compared correlations, we observed that RXRα, CALM1, and CALR remain the genes with the largest number of significantly different correlations.
Comparing coexpression networks.
Setting up correlation networks for the TCF7L2 splice forms, we observed that the differences in correlations between them are mainly concentrated in two splice forms (B and G) (Fig. 3). Correlation networks were also built for RXRα gene to illustrate the significant differences in correlation patterns between groups (Fig. 5).
Despite the well-known and significant association between rs7903146 genotype and diabetes incidence risk, little is known regarding its molecular mode of action.
In the present study we used human vascular SMCs, a cell type poorly studied relative to TCF7L2 cis-acting effects and very relevant to cardiovascular disease (4). The main objective of using this cell type was to address the recent findings from our group of an association between the risk genotype and an increased incidence of coronary artery disease and cardiovascular events in nondiabetic humans (54). For this present study we obtained 92 samples, which is a considerable number, considering the difficulty in obtaining such tissue. Moreover, through cell culture we were able to reduce epigenetic effects such as age, sex, and medications in gene expression measurements. We assume that the reduction of epigenetic effects can be better observed when we look at the network set up for diabetics and nondiabetics.
In an attempt to understand the rs7903146 polymorphism effect in T2DM pathology we used two strategies (comparing expression medians and protein interaction networks and comparing coexpression patterns between T2DM-related genes) to phenotypically compare cells of subjects with and without the risk allele (risk and wild-type groups, respectively). The first strategy suggested that there were phenotypic differences between the groups, which were well represented by the five splice forms of TCF7L2 and other nine genes with significantly higher median expression levels in the genotype-risk group network. The second strategy also suggested that there were phenotypic differences between the groups, characterized by several significantly different gene correlation coefficients, pinpointing RXRα, CALM1, CALR, and IGF2BP2 as genes that may be interacting differently between the groups and suggesting that these players may be targets for genotype-specific pharmacological approaches.
The first strategy, to phenotypically compare cells between the groups, aimed to find the rs7903146 polymorphism effect in TCF7L2 gene expression, through an eQTL approach (9, 19), since associated disease SNPs may be regulating the expression of candidate genes in which they are inserted (cis regulation).
Several studies have attempted to observe this polymorphism effect. Lyssenko et al. (33) stated that genetic variants in TCF7L2 may be influencing the TCF7L2 expression in pancreatic islets. This experiment has not been replicated by Cotsapas et al. (10), who did not obtain the same results when analyzing liver, colon, and pancreas tissues. Recent studies (18, 55) reported the studied risk allele as accounting for open chromatin and enhancer activity, which, in turn, would lead to TCF7L2 overexpression in risk allele carriers. Based on these results, Savic et al. (48) demonstrated, through a selective deletion of the genome interval in which the rs7903146 is inserted, that this region is critical for modulating TCF7L2 expression. In the same study it was observed that TCF7L2 overexpression can be associated with T2DM increased risk.
In our study we observed a significantly higher expression of five TCF7L2 splice forms in the risk group. While this has been shown by other researchers in pancreatic islets, adipose, and blood cells (41), to our knowledge, this is the first demonstration of cis-acting TCF7L2 genotype effects in vascular SMCs. Due to the large number of TCF7L2 transcripts described (16) and the great similarity of them, we, as others (56), have found difficulty in conducting transcript-specific real-time PCR assays. One of the TCF7L2 transcripts that possibly has been detected in our assays is HM352847, which has the exon 15 and was described as producing striking features of malignant transformation characterized by high cell proliferation rate, migration, and colony formation (56). Future efforts are necessary to develop isoform-specific assays that are able to determine specific genotype effects.
We next compared, between groups, the expression of the other genes that harbor, or are close to, SNPs associated with T2DM, genes related to T2DM signaling pathways, and genes related to cell proliferation, to investigate if their expression could be affected by that change in TCF7L2 splice form's expression. Nine genes were found to have significantly different median expression levels between groups, a number significantly higher than what would be expected by chance. Among the nine genes were IGF2BP2, JAZF1, CDKN2B, CAMK1D, and CDKAL1, which are genes that are close to or harbor T2DM-associated SNPs (42, 49, 50, 62, 63). This finding may evidence a linked regulation mechanism between these genes.
When we analyzed the expression data in conjunction, 29 of the 41 measured genes have a greater nominal median expression level in the risk group compared with the wild-type group, thus, changes in gene expression could be discrete when observed at a single-gene level but could become more obvious when analyzed together with other genes in a particular system (39).
The second applied strategy aimed to understand if the studied genes interact differently depending on the genotype group. With this purpose, we compared between-groups gene expression correlations. We looked for genes with highly altered pattern between the groups' correlations and networks, behavior that might provide evidence of genes that have a crucial role in disease pathology (11, 30, 45) and, in this specific case, in the TCF7L2 genetic modulation of T2DM risk.
By the coexpression comparison approach we found RXRα, CALM1, CALR, and IGF2BP2 genes as having significantly different correlation structures between wild-type and risk networks.
During recent years RXRα has been extensively studied into two main fields regarding the pathogenesis of cardiovascular disease: cell proliferation and glucose homeostasis (17, 23, 38, 44). The role of retinoids in the development of cardiovascular disease attracted researchers' attention by the ability of all-trans retinoic acid to suppress growth of vascular SMCs (35, 38). Another study revealed the importance of treatment with retinoic acid, which can induce signs of growth inhibition in pulmonary SMCs (43). It has also been postulated that the mechanism of altered phosphorylation/degradation/transactivation of RXRα characterizes abnormal, although benign, proliferation of uterine SMCs (29).
Concerning glucose homeostasis, RXRα/PPARγ heterodimer is a known regulator of lipid and glucose homeostasis (17), which are important deregulated mechanisms in T2DM pathogenesis. PPARγ gene produces a nuclear protein that is involved in transcriptional regulation of a variety of enzymes involved in glucose and lipid metabolism (51) and has been described as also having SNPs associated with T2DM risk (3, 50), besides being an important target for intervention in metabolic disorders (14, 28). PPARγ is a well-known target for insulin-sensitizing drugs, such as thiazolidinediones, which are used to treat T2DM. Coactivator or corepressor proteins may regulate PPARγ activity by modulating PPARγ-mediated gene expression (12, 15, 24). Li et al. (31) demonstrated, through deleting the PPARγ corepressor NCoR, that activation of the PPARγ-mediated transcription program in adipocytes is a principal contributor to insulin sensitivity.
CALM1 and CALR are known genes that participate in calcium signaling pathways. Enhanced vasoconstrictor-mediated intracellular calcium concentration signals in vascular SMCs have been widely reported as a factor contributing to hyperreactivity to vasoconstrictors (13, 22, 61), which has been reported in T2DM (13, 40). The augmented calcium ions influx seems also to be involved in the mechanism of enhanced proliferation of cultured vascular SMCs. Interestingly, the other gene that was observed as having different coexpression and expression patterns between genotype groups, IGF2BP2, also was described as having a role in cell proliferation and cancer (60).
Thus, our analysis opens the possibility that the observed results in humans could be derived not only from systemic glucose homeostasis mechanisms (59), but also from a calcium homeostasis derangement and an altered cellular proliferation on a system where TCF7L2 is central.
We believe this work significantly contributes to the understanding that T2DM risk alleles may be influencing TCF7L2 splice form's expression in vascular SMCs; and TCF7L2 differences in expression can cause a significant change in gene interactions and in high-dimensional systems like gene networks responsible for glucose homeostasis, calcium signaling pathways, and cell proliferation. Together, our strategies can help us understand phenotypic differences between groups and also allow a better knowledge of how genes may be relating to each other in each genotype-defined group. Moreover, RXRα, and consequently PPARγ, genes are pointed out as important candidates to be functionally studied, through pharmacological modulation and treatment target candidates for risk reduction in individuals with high risk of developing T2DM, especially individuals harboring TCF7L2 risk genotypes.
We thank The National Council for Scientific and Technological Development (CNPq) and The São Paulo Research Foundation (FAPESP) for financial support.
No conflicts of interest, financial or otherwise, are declared by the author(s).
Author contributions: A.R.V., J.E.K., and A.C.P. conception and design of research; A.R.V., N.E.F., S.V.O., M.V.R., and S.K.T. performed experiments; A.R.V. and A.C.P. analyzed data; A.R.V. and A.C.P. interpreted results of experiments; A.R.V. prepared figures; A.R.V. and A.C.P. drafted manuscript; A.R.V. and A.C.P. edited and revised manuscript; A.R.V., N.E.F., S.V.O., M.V.R., S.K.T., J.E.K., and A.C.P. approved final version of manuscript.
- Copyright © 2012 the American Physiological Society