The spontaneously hypertensive rat (SHR) is a widely used rodent model of hypertension and metabolic syndrome. Previously we identified thousands of cis-regulated expression quantitative trait loci (eQTLs) across multiple tissues using a panel of rat recombinant inbred (RI) strains derived from Brown Norway and SHR progenitors. These cis-eQTLs represent potential susceptibility loci underlying physiological and pathophysiological traits manifested in SHR. We have prioritized 60 cis-eQTLs and confirmed differential expression between the parental strains by quantitative PCR in 43 (72%) of the eQTL transcripts. Quantitative trait transcript (QTT) analysis in the RI strains showed highly significant correlation between cis-eQTL transcript abundance and clinically relevant traits such as systolic blood pressure and blood glucose, with the physical location of a subset of the cis-eQTLs colocalizing with “physiological” QTLs (pQTLs) for these same traits. These colocalizing correlated cis-eQTLs (c3-eQTLs) are highly attractive as primary susceptibility loci for the colocalizing pQTLs. Furthermore, sequence analysis of the c3-eQTL genes identified single nucleotide polymorphisms (SNPs) that are predicted to affect transcription factor binding affinity, splicing and protein function. These SNPs, which potentially alter transcript abundance and stability, represent strong candidate factors underlying not just eQTL expression phenotypes, but also the correlated metabolic and physiological traits. In conclusion, by integration of genomic sequence, eQTL and QTT datasets we have identified several genes that are strong positional candidates for pathophysiological traits observed in the SHR strain. These findings provide a basis for the functional testing and ultimate elucidation of the molecular basis of these metabolic and cardiovascular phenotypes.
- expression quantitative trait locus
- spontaneously hypertensive rat
- quantitative trait transcript
- sequence variation
The spontaneously hypertensive rat (SHR) strain is a widely used rodent model for the study of hypertension and metabolic syndrome, displaying a range of physiological and metabolic traits that can be related to human disorders including high blood pressure, dyslipidemia and insulin resistance (49, 52). Elucidation of the molecular basis of these traits has proven difficult as they are under the control of multiple genes and genetic loci. The standard approach to gene identification involves mapping by linkage analysis in experimental crosses, and this has led to the localization in the rat genome of hundreds of quantitative trait loci (QTLs) underlying trait variation (68). We refer to these loci as physiological quantitative trait loci (pQTLs). The confidence intervals for such QTLs are often large and contain tens to hundreds of candidate genes, so the identification of the underlying genes and genetic variants remains a challenge.
Recombinant inbred (RI) strains represent a type of experimental cross in which traits can be mapped in a panel of inbred strains derived from two progenitor strains (9). Because each animal within an RI strain is genetically identical, repeated measurements can be made in multiple animals from the same strain. In addition, because the inbred lines can be propagated almost unchanged over many years, phenotypic and genetic data generated in the RI panel is cumulative and can be used to infer correlation and linkage between traits generated many years apart in a given RI panel.
The BXH/HXB RI strains were derived from the SHR/OlaIpcv and BN.Lx progenitor strains in the 1980s and have been used to map many SHR phenotypes to the rat genome (1, 3, 11, 46, 47). In addition to the mapping of pQTLs, the BXH/HXB panel has been used to map genetic determinants of gene expression. By measuring genome-wide gene expression using microarrays and treating each expression profile as a quantitative trait, thousands of genetic control points for gene expression, or expression QTLs (eQTLs), have been mapped to the genome in seven tissues of BXH/HXB RI animals (24, 26, 43, 44). A particular advantage of the eQTL study design is that it distinguishes cis- and trans-regulated control of gene expression. Because cis-regulated control of gene expression is driven by genomic sequence variation within or close to the gene whose expression is being measured, cis-regulated eQTL genes are attractive candidates for pQTLs that map to the same location (26).
Previously, we combined the use of genetic linkage and gene expression profiling to identify the Cd36 gene as a cause of defective insulin action and fatty acid metabolism in the SHR (2, 3). More recently, we have systematically combined genome-wide eQTL data with extensive physiological phenotyping data generated using the BXH/HXB panel in quantitative trait transcript (QTT) studies (42) which identify cis-eQTL transcripts whose expression correlates with a physiological trait, enabling us to identify Ogn as a susceptibility gene for cardiac hypertrophy (44) and renal Cd36 expression as a cause of hypertension (45). Integration of this analysis with other datasets such as pQTL data can successfully identify cis-regulated transcript expression, which correlates with physiological traits and whose gene colocalizes with pQTLs for these same traits. These genes can be considered strong candidates for these pQTLs. We designate such colocalizing, correlated, cis-eQTLs as “c3-eQTLs,” and these are the subject of the present study.
Here, we have used eQTL data from four tissues of BXH/HXB RI strain animals to carry out QTT analysis with 207 (patho)physiological and metabolic traits previously characterized in this RI panel. By combining the results of these studies with data from the recently generated SHR genome sequence (7) we have identified positional candidates for a range of well-characterized SHR phenotypes that provide a basis for prioritized functional testing and ultimate elucidation of the molecular basis of these traits.
MATERIALS AND METHODS
Prioritization of cis-eQTLs.
The generation of eQTL data from adrenal gland, kidney, retroperitoneal fat pad, and soleus muscle from 29 strains (four animals per strain) of the BXH/HXB RI panel has been described previously (24, 26, 43). Microarray expression data were deposited in Array Express (http://www.ebi.ac.uk/arrayexpress) under accession numbers E-AFMX-7 and E-TABM-458. All data for eQTLs with a genome-wide significance (PGW) of P < 0.05, calculated as defined in Refs. 26, 44, were accessed via the eQTL Explorer database and software (39) (http://web.bioinformatics.ic.ac.uk/eqtlexplorer/). Expression data for the adrenal gland, retroperitoneal fat pad, and kidney datasets can also be accessed via the WebQTL database (69) (http://www.webqtl.org/). Transcripts were identified from Affymetrix probe-set IDs using Ensembl (http://www.ensembl.org/Rattus_norvegicus/Info/Index) and the Rat Genome Database (RGD) (http://rgd.mcw.edu/).
Cis-eQTLs were defined as previously, as eQTLs whose genetic control point maps to within 10 Mb of the associated gene (26), and were selected from four tissues that are relevant to the pathogenesis of the metabolic syndrome: adrenal, kidney, retroperitoneal fat, and soleus muscle. These cis-eQTLs were prioritized for further study using a filtering protocol that selected the most robust cis-eQTLs based on criteria relating to expression levels, differential transcript expression between parental strains (44), genome-wide P values and probe-set mapping to unique loci (see Fig. 1). From cis-eQTLs that had not been excluded by the above criteria, we selected the 10 most significant cis-eQTLs by allelic P value for each tissue (40 in total). This “allelic P value” represents the PGW for expression differences between strains across the RI strain panel separated by genotype at the peak marker of linkage of the cis-eQTL. Furthermore, we selected 20 additional highly significant cis-eQTLs (allelic P value P < 0.001) with a transcript expression fold change >2.5 or <0.4 (2.5-fold downregulated in SHR) between the parental strains. Affymetrix microarray probe-set locations identified using Ensembl were checked for the presence of sequence variants using the SHR (7) and reference Brown Norway (BN) genome sequences (22). Sequence variants were sequenced with a minimum read depth of 3 and a minimum consensus quality value of 30 (7).
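The selection rules described above can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline actually used: the record fields (gene and peak-marker coordinates, SHR/BN fold change) are hypothetical names, the earlier filters on expression level and probe-set uniqueness are assumed to have been applied already, and ranking the additional 20 cis-eQTLs by allelic P value is an assumption, since the text does not state how they were ordered.

```python
from dataclasses import dataclass

CIS_WINDOW_BP = 10_000_000  # cis if the linkage peak lies within 10 Mb of the gene


@dataclass(frozen=True)
class EQTL:
    # All field names are hypothetical and chosen for illustration only.
    transcript: str
    tissue: str
    gene_chrom: str
    gene_pos: int       # gene position in bp
    peak_chrom: str
    peak_pos: int       # peak marker position in bp
    allelic_p: float    # allelic P value across the RI strains
    fold_change: float  # parental expression ratio (SHR relative to BN)


def is_cis(e: EQTL) -> bool:
    """True when the peak marker is on the gene's chromosome and within 10 Mb."""
    return e.gene_chrom == e.peak_chrom and abs(e.peak_pos - e.gene_pos) <= CIS_WINDOW_BP


def prioritize(eqtls, per_tissue=10, extra=20, p_cut=0.001, up=2.5, down=0.4):
    """Top `per_tissue` cis-eQTLs per tissue, plus up to `extra` strong fold changes."""
    cis = [e for e in eqtls if is_cis(e)]
    selected = []
    for tissue in sorted({e.tissue for e in cis}):
        ranked = sorted((e for e in cis if e.tissue == tissue), key=lambda e: e.allelic_p)
        selected.extend(ranked[:per_tissue])
    chosen = set(selected)
    strong = [
        e for e in cis
        if e not in chosen
        and e.allelic_p < p_cut
        and (e.fold_change > up or e.fold_change < down)
    ]
    strong.sort(key=lambda e: e.allelic_p)  # assumed ordering, not stated in the text
    return selected + strong[:extra]
```

With the default arguments and four tissues, the function returns at most 40 cis-eQTLs from the per-tissue ranking and at most 20 more from the fold-change rule, matching the 60 prioritized here.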
Animals and tissues.
BN.Lx/Cub and SHR/Ola rats were housed in an air-conditioned animal facility with unrestricted access to standard laboratory chow and water at the Czech Academy of Sciences, Prague, Czech Republic. All procedures were approved by the Ethics Committee of the Institute of Physiology, Czech Academy of Sciences, in accordance with the Animal Protection Law of the Czech Republic (311/1997). All physiological trait measurements were carried out in male rats, unless females were specifically required (e.g., for number of offspring). Tissues were harvested from BN.Lx/Cub and SHR/Ola male rats at 6 wk of age as described previously (26).
RNA was extracted as described previously (29) except that 50 mg of crushed whole kidney tissue was homogenized, while adrenal gland, fat pad, and skeletal muscle tissues were homogenized whole. Additionally, total RNA was isolated from fat and skeletal muscle tissues using the RNeasy Lipid tissue and RNeasy fibrous tissue kits (Qiagen) respectively in accordance with the manufacturers' instructions. Individual tissues from four animals per strain were used without pooling.
Validation of microarray data by quantitative PCR.
Primers were designed using the Primer3 software package (Supplementary Table S1).1 Reverse transcription and subsequent qPCR were carried out as described previously (29), except reactions were run on an ABI 7900 Fast Real Time PCR system (Applied Biosystems). Appropriate housekeeping genes for normalization controls for each tissue were selected based on equal expression between parental strains in that tissue [β-actin, adrenal and left kidney; HPRT, retroperitoneal fat pad; and GAPDH, soleus muscle (skeletal muscle)] as assayed by quantitative PCR (qPCR). Expression data could not be obtained for seven transcripts due to a combination of low transcript expression and the presence of highly repetitive sequence making the design of suitable primers impossible. These transcripts were omitted from further analysis. Statistical significance was defined in the first instance as P < 0.05 using an uncorrected one-tailed Mann-Whitney test, before Bonferroni correction for multiple testing (53 transcripts tested).
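As an illustration of the testing scheme described above, the following Python sketch computes an exact one-tailed Mann-Whitney P value by enumerating all group assignments and then applies a Bonferroni correction. It is a sketch under stated assumptions: the software used for the original tests is not named in the text, the exact-enumeration approach is an assumption chosen for small groups, and any values passed to these functions are hypothetical.

```python
from itertools import combinations


def midranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def mann_whitney_one_tailed(higher, lower):
    """Exact P that the rank sum of `higher` is at least as large as observed.

    Enumerates every way of assigning the pooled ranks to a group of the
    same size as `higher`, so it is only practical for small groups.
    """
    ranks = midranks(list(higher) + list(lower))
    n = len(higher)
    observed = sum(ranks[:n])
    sums = [sum(c) for c in combinations(ranks, n)]
    return sum(s >= observed for s in sums) / len(sums)


def bonferroni(p_values):
    """Multiply each raw P value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

In use, one raw P value would be computed per transcript and the full list passed to `bonferroni`, so that the number of tests equals the number of transcripts assayed (53 in this study).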
Identification of correlating cis-eQTLs by QTT analysis.
This analysis used the microarray expression values of cis-eQTLs from 29 RI strains and phenotype measurements for 207 physiological traits (Supplementary Table S2). Physiological and metabolic trait measurements have been reported previously (14, 17, 20, 28, 32, 44, 46, 48, 50, 51, 53, 72).
We assessed association between gene expression levels and phenotypic variation across the BXH/HXB RI strain panel by correlating transcript abundance with the values of physiological traits as described (42). Pearson correlation coefficients (r) and Westfall-Young corrected P values based on a minimum of 1,000 permutations were calculated using MatLab version 6.
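The correlation step can be illustrated with a short Python sketch. The original analysis was run in MatLab and the exact Westfall-Young variant is not specified in the text, so the single-step maxT procedure below, which permutes trait values across strains and compares each observed |r| with the permutation distribution of the maximum |r| over all transcripts, is an assumption. The function and argument names are hypothetical.

```python
import random
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation coefficient; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def max_t_adjusted_p(expression, trait, n_perm=1000, seed=0):
    """Single-step maxT adjustment (assumed variant of Westfall-Young).

    expression: dict mapping transcript -> list of per-strain expression values
    trait:      list of per-strain trait values, in the same strain order
    Returns a dict mapping transcript -> (r, adjusted P).
    """
    rng = random.Random(seed)
    observed = {t: pearson_r(v, trait) for t, v in expression.items()}
    exceed = {t: 0 for t in expression}
    shuffled = list(trait)
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the strain-to-trait pairing
        max_abs = max(abs(pearson_r(v, shuffled)) for v in expression.values())
        for t, r in observed.items():
            if max_abs >= abs(r):
                exceed[t] += 1
    # The +1 terms count the observed arrangement as one of the permutations.
    return {t: (observed[t], (exceed[t] + 1) / (n_perm + 1)) for t in expression}
```

Because each permutation shuffles the trait vector once and reuses it for every transcript, the correlation structure among transcripts is preserved, which is the property that distinguishes this kind of permutation correction from a Bonferroni adjustment.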
Identification of c3-eQTLs.
Physiological QTLs (pQTL) that map to the same loci as cis-eQTLs correlating with physiological traits were identified using the RGD (68). This resource provides information regarding the physiological trait studied, strain combination used, associated linkage statistics, and the genomic coordinates of the pQTL region. For pQTL regions identified from RGD, the original data (Supplementary Table S3) were examined, and the 99% confidence interval [within the 2 logarithm of the odds (LOD) drop from the peak of linkage] was estimated. Cis-eQTLs were classed as colocalizing if the entire cis-eQTL gene falls within this interval. We limited our search to pQTLs defined using either the BXH/HXB RI strain panel or a BN or SHR-derived strain, with a minimum LOD score of 2.4.
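The colocalization rule can be sketched as follows. This is a simplified illustration, not the procedure applied to the original linkage data: it assumes a LOD profile sampled at markers sorted along one chromosome, reports the interval at marker resolution without interpolating between markers, and uses hypothetical function names.

```python
MIN_LOD = 2.4  # minimum peak LOD score for a pQTL to be considered


def passes_lod_filter(lods, min_lod=MIN_LOD):
    """True when the peak of the LOD profile reaches the minimum score."""
    return max(lods) >= min_lod


def lod_drop_interval(positions, lods, drop=2.0):
    """Contiguous region around the peak where LOD >= peak - drop.

    positions: marker positions sorted along the chromosome
    lods:      LOD score at each marker
    Returns (start, end) at marker resolution.
    """
    peak = max(range(len(lods)), key=lambda i: lods[i])
    threshold = lods[peak] - drop
    left = peak
    while left > 0 and lods[left - 1] >= threshold:
        left -= 1
    right = peak
    while right < len(lods) - 1 and lods[right + 1] >= threshold:
        right += 1
    return positions[left], positions[right]


def colocalizes(gene_start, gene_end, interval):
    """True only when the entire gene lies inside the interval."""
    start, end = interval
    return start <= gene_start and gene_end <= end
```

Requiring the whole gene to fall inside the interval, rather than merely overlapping it, is the stricter of the two possible readings and follows the definition given above.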
Identification of sequence variants in the SHR strain.
Sequence variants arising between the SHR/OlaIpcv and BN/SsNHsd/Mcwi rat strains were identified using the recently available SHR (7) and reference BN genome sequences (22). Classification of genetic variants by location relative to the relevant transcript was assigned using gene annotations in the Ensembl database (Rat RGSC 3.4, release 56, Sept. 2009). All sequence variants reported were sequenced with a minimum read depth of 3 and a minimum consensus quality value of 30 (7). We selected 22 sequence variants affecting the coding region and upstream regions for validation by standard capillary sequencing. Primers were designed using the Primer3 software package (Supplementary Table S8), while sequences were analyzed using the Sequencher DNA analysis software (Gene Codes).
Prediction of differential transcription factor binding.
We employed the sTRAP method (36) to identify putative transcription factor (TF) binding sites whose binding affinity may be altered by potential regulatory single nucleotide polymorphisms (SNPs). The binding affinity was computed for both alleles (including 100 bp flanking sequence) using a biophysical model (54) for each SNP and vertebrate TF with a position specific frequency matrix in the TRANSFAC database (38) release 12.1. To compare different TFs with each other we transformed the raw affinities to P values (37). Finally we quantified the change of affinity by the absolute log ratio of the P values of the two alleles. We used a threshold of 1.2 for the absolute log ratio, which was shown previously to yield a high specificity (36). Additionally we required that at least one allele has a high quality motif (min. P < 0.01).
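The final filtering step can be written out as a small Python sketch. It covers only the thresholding on the two allele-specific P values and does not reproduce the biophysical affinity model or the conversion of affinities to P values. The use of a base-10 logarithm is an assumption, as the text does not state the base, and the function names are hypothetical.

```python
from math import log10


def affinity_change(p_ref, p_alt):
    """Absolute log ratio of the two allele-specific P values (base 10 assumed)."""
    return abs(log10(p_ref / p_alt))


def is_candidate(p_ref, p_alt, ratio_cut=1.2, motif_cut=0.01):
    """Apply both filters: a large change in affinity between alleles and
    a high-quality motif match on at least one allele."""
    return affinity_change(p_ref, p_alt) > ratio_cut and min(p_ref, p_alt) < motif_cut
```

Taking the absolute value makes the measure symmetric, so a SNP that creates a binding site in SHR is scored in the same way as one that abolishes a site present in BN.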
Identification of evolutionarily conserved residues.
Species-specific protein sequences were retrieved from the GenBank sequence repository (http://www.ncbi.nlm.nih.gov/genbank/). Briefly, protein sequences derived from the BN strain for each of four c3-eQTL genes in which nonsynonymous SNPs were found [acyl coenzyme A oxidase 2, branched chain (Acox2); aldo-ketoreductase family 1, member C1 (Akr1c1); chymotrypsin-like elastase family, member 1 (Cela1); and peroxisomal biogenesis factor 11 beta (Pex11b)] were aligned to corresponding human and mouse protein sequences with the bl2seq specialized BLAST sequence alignment tool using default settings (http://blast.ncbi.nlm.nih.gov) (5). GenBank accession numbers for the protein sequences used are as follows: Acox2, NP_665713.1 (rat), AAH21339 (mouse), and NP_003491 (human); Akr1c1, BC088227.1 (rat) and AC091817.6 (human); Cela1, NP_036684 (rat), NP_291090 (mouse), and NP_001962 (human); Pex11b, NP_001020855 (rat), AAC78661 (mouse), and CAG46844 (human). Amino acids (aa) that are retained between human, mouse, and rat were considered conserved.
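The conservation criterion can be illustrated with a short Python sketch. The analysis described above used pairwise bl2seq alignments; the sketch instead assumes, for simplicity, that the three sequences have already been placed in a common set of alignment columns with "-" marking gaps. The function name is hypothetical.

```python
def conserved_positions(rat, mouse, human):
    """Return 1-based rat residue numbers that are identical in all three species.

    The three inputs are assumed to be pre-aligned strings of equal length,
    with '-' marking alignment gaps.
    """
    if not (len(rat) == len(mouse) == len(human)):
        raise ValueError("aligned sequences must have equal length")
    conserved = []
    rat_index = 0
    for r, m, h in zip(rat, mouse, human):
        if r == "-":
            continue  # gap in rat: no rat residue at this column
        rat_index += 1
        if r == m == h:
            conserved.append(rat_index)
    return conserved
```

A nonsynonymous SNP would then be flagged as affecting a conserved residue when its rat amino acid position appears in the returned list.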
RESULTS
Prioritization of cis-eQTLs.
Using the eQTL Explorer database (39), we identified 3217 cis-eQTL transcripts across four rat tissues [adrenal gland, kidney, retroperitoneal fat pad, and soleus (skeletal) muscle] (Table 1). Filtering based on criteria including transcript expression fold change and statistical significance in the parental and RI strains identified 480 transcripts representing the strongest cis-eQTLs (Supplementary Table S4). Of this group, we selected 60 cis-eQTLs for qPCR analysis based on statistical significance and parental fold change (Fig. 1 and Supplementary Table S4).
qPCR validation of differential expression of cis-eQTL transcripts.
Differences in cis-eQTL transcript expression observed in microarray analyses between parental strains were validated by qPCR analysis. Differential transcript expression between the parental strains (BN.Lx/Cub and SHR/Ola) was confirmed with raw P values <0.05 in 43 of the 60 cis-eQTL transcripts tested (72%) (Fig. 2), of which 36 were in the same direction as observed by microarray (Table 2). All of these transcripts remained statistically significant after correction for multiple testing. Validated transcripts exhibited expression dysregulation of between 1.3- (Trak2) and 11.1-fold (Acox2) between parental strains. Sequence analysis found that 13/60 transcripts contained an SNP that could potentially affect microarray probe binding; however, differential expression was subsequently confirmed by qPCR in eight of these cases (data not shown). No common factor (tissue, expression level, fold change) could be found linking cis-eQTLs whose differential expression could not be validated.
Of the 36 qPCR validated cis-eQTLs, we identified 22 transcripts that significantly correlated (P < 0.01, Westfall-Young P value corrected for multiple testing) with at least one of 207 physiological and metabolic traits measured across the BXH/HXB RI strain panel (Table 3 and Supplementary Table S5). Transcript abundance of these cis-eQTLs was found to correlate with a variety of traits including those relating to blood pressure, cardiac mass, stress response, and response to oxidative stress (Supplementary Table S5). Examples of three of the most significant correlations with associated scatter-plots are illustrated in Fig. 3 and Supplementary Fig. S1. Five cis-eQTLs correlated strongly with several functionally related traits, such as haptoglobin (Hp) and malic enzyme 3 (Me3), which correlate with traits relating to both lipid peroxidation in the kidney and blood pressure (Supplementary Table S5), and thiosulfate sulfurtransferase (Tst), which correlates with both isoproterenol-induced lipolysis and body weight. Transcript abundance of both lymphocyte antigen 6 complex, locus B (Ly6b) and peroxisomal biogenesis factor 11 beta (Pex11b) was found to correlate with multiple insulin and blood glucose traits, with Ly6b expression also correlating with lipolysis in adipocytes (Supplementary Table S5).
Identification of c3-eQTLs.
Of the 22 correlating cis-eQTLs, we identified 12 that also colocalized within a pQTL region (99% confidence interval) (Supplementary Table S6). We found seven cis-eQTLs whose expression was found to correlate with blood pressure or related traits [aldo-ketoreductase family 1, member C1 (Akr1c1); chymotrypsin-like elastase family, member 1 (Cela1); cytochrome P450, family 1, subfamily a, polypeptide 1 (Cyp1a1); Hp; Ly6b; Me3; and Tst], which also colocalized with pQTLs for blood pressure or renal function. Furthermore, our analysis demonstrated that Ly6b and Tst as well as an additional five cis-eQTLs [acyl coenzyme A oxidase 2, branched chain (Acox2); cytochrome P450, family 2, subfamily d, polypeptide 2 (Cyp2d2); hemopexin (Hpx); Pex11b; and trafficking protein, kinesin binding 2 (Trak2)] correlated with fat metabolism or blood sugar-related traits, while colocalizing with pQTLs underlying differences in lipid levels, blood sugar levels, and/or body weight. Such c3-eQTL transcripts are strong candidate genetic determinants of these (patho)physiological traits.
Identification of sequence variation in c3-eQTL genes.
We used the SHRbase sequence database (7) to identify sequence variants with the potential to affect the expression and/or function of cis-eQTL transcripts. We detected sequence variation in the upstream and downstream noncoding regions in all 12 c3-eQTL genes (Table 4 and Supplementary Table S7, a and b). We successfully confirmed 90.9% (20/22) of the SNPs selected for validation by capillary sequencing (see Supplementary Table S9). Of the two SNPs that did not validate, one (Cela1 SNP2) was found not to vary between the parental strains, while the second (Trak2 SNP1) appeared to constitute part of a small deletion. On analysis of upstream region SNPs we identified 33 potential SNP/TF interactions in nine c3-eQTL genes, where predicted TF binding strength was changed significantly by an SNP between the SHR and BN strains (|log ratio| > 1.2) (Table 5). This included one c3-eQTL gene (Hp) that was found to carry a polymorphism affecting a putative binding site for a TF family known to play a role in its transcriptional regulation (CEBP) (40, 41). Three c3-eQTLs (Acox2, Trak2, and Pex11b) also show sequence variation affecting splice sites, potentially leading to aberrant splicing or altered transcript stability and protein function. Significantly, nonsynonymous coding region SNPs were identified in four of the 12 c3-eQTLs (Acox2, Akr1c1, Cela1, and Pex11b) (Table 4). Analysis of the corresponding protein sequences by the BLAST sequence alignment program (bl2seq) showed that several of these SNPs (Supplementary Table S7a) affect aa residues that are conserved between humans and rodents [ACOX2-aa 546, AKR1C1-aa 250, CELA1-aa 21, and PEX11b-aa 113 and 231]. Furthermore, small insertions and deletions (up to 7 bp) were found in 10 of the c3-eQTLs; however, none localized to either coding regions or to putative TF binding sites, and their functional significance remains unknown (Supplementary Table S7b).
DISCUSSION
Advancing our understanding of the genetic susceptibility to common health problems such as high blood pressure, obesity, and insulin resistance could lead to significant improvements in their detection and treatment. We have studied traits related to a variety of common disease phenotypes in the rat model. We combined the analysis of transcript expression, physiological phenotyping, and sequence datasets to identify candidate genes for pathophysiological traits in the SHR, and sequence variants with the potential to underlie differential transcript expression and altered gene function of these candidates. This could provide insight into the molecular basis underlying not just cis-regulation of transcript abundance, but that of downstream physiological or metabolic traits.
After filtering microarray expression data in four tissues (Fig. 1), we selected 60 of the most robust cis-eQTLs and validated differential expression in the SHR and BN parental strains for 36 of these (Fig. 2). No common factor could be identified between prioritized transcripts that did not validate. Of the 36 validated cis-eQTLs, we identified 22 transcripts whose expression across the RI strains correlated significantly with measurements of physiological traits relating to disorders manifested in the SHR model (Fig. 3, Table 3, and Supplementary Table S5). Three highly correlating transcripts, Cela1, Hp, and Pex11b, exhibited Pearson correlation coefficients (r) between expression and physiological trait measurement of between 0.56 and 0.67, indicating a substantial association between transcript expression and the individual trait measurements and identifying these correlating cis-eQTLs as attractive candidates for the correlated trait (Supplementary Fig. S1). Instances where QTT analysis identifies several closely localizing correlating cis-eQTLs are often the consequence of a correlated strain distribution pattern. Previous studies have already demonstrated a significant relationship between the absolute correlation coefficient of expression of eQTL pairs residing on the same chromosome and the distance between them (34, 56). Likewise, pair-wise correlation analysis of our cis-eQTL dataset found that >80% of the correlation between expression of individual cis-eQTL transcripts is explained by the shared genotypes of cis-eQTL genes at tightly linked loci (23). We found 12 of the correlating cis-eQTLs to colocalize with pQTLs previously mapped using either the RI strain panel or a BN or SHR strain. These 12 c3-eQTLs represent strong candidate genes for the associated traits, showing high genome-wide statistical significance of linkage, high differential expression between strains, and strong correlations to the traits in question.
Seven of these c3-eQTLs correlate with blood pressure-related traits: Akr1c1, Cyp1a1, Ly6b, and Tst (correlations found in kidney); Cela1 (fat and kidney); and Hp and Me3 (adrenal). Those identified in kidney are of particular interest given the role played by the kidney in the regulation of blood pressure. In addition, transcript levels of Hp correlate with lipid peroxidation traits in the kidney. Although the Hp expression phenotype occurs in a different tissue (adrenal) from the correlating physiological trait (kidney), this finding could potentially be a reflection of a more global role for this secreted plasma protein in the prevention of oxidative stress and tissue lipid peroxidation (Supplementary Table S5).
Likewise, we have shown that Tst (in adrenal) and Trak2 (skeletal muscle) transcript levels correlate with traits associated with fat metabolism and colocalize with pQTLs for lipid level and body weight, while Hpx (adrenal) and Pex11b (fat) represent c3-eQTLs relating to blood sugar regulation and traits recognized as downstream consequences of abnormal blood sugar regulation (4, 10, 55, 64) (Supplementary Table S6). These c3-eQTLs are particularly striking given that they also occur in tissues that are physiologically relevant for the control of these traits, suggesting that they represent strong candidate genes underlying these phenotypes in the SHR. The correlations of certain transcripts with traits associated with fat metabolism and body weight (Acox2 and Cyp2d2, both in kidney) and blood pressure/kidney mass (Me3, adrenal) did not occur in tissues normally associated with the correlating trait. These associations could indicate additional as yet unknown mechanisms of control, possibly as a consequence of the transcript forming cis-eQTLs in a number of tissues (Supplementary Table S4), not all of which were prioritized for expression validation and QTT analysis. A possible example of this is Cela1 whose expression correlates with glucose uptake in adipocytes despite being prioritized as a cis-eQTL in kidney. Further analysis of the entire cis-eQTL dataset found that Cela1 is also a cis-eQTL in fat tissue, albeit below our prioritization threshold.
The availability of the SHR genome sequence has enabled the analysis of sequence variation between the SHR and BN strains on a genome-wide scale. The high sequence concordance between BN substrains (60) meant that we were able to successfully validate 90.9% of selected SNPs. We investigated sequence changes that disrupt TF binding sites (TFBSs) that could represent candidate quantitative trait nucleotides (QTNs) underlying c3-eQTL expression traits. We found several SNPs in the upstream regions of nine c3-eQTLs that are predicted to disrupt TFBSs (Table 5). Eight c3-eQTL genes have polymorphisms affecting the coding region, four of which have nonsynonymous changes potentially affecting protein activity and stability. Five of these nonsynonymous SNPs affecting four c3-eQTLs (Acox2, Akr1c1, Cela1, and Pex11b) lead to substitution of highly conserved aa residues (ACOX2, aa 546 Y>H; AKR1C1, aa 250 W>R; CELA1, aa 21 P>A; and PEX11b, aa 113 R>H and aa 231 P>R) (Table 4 and Supplementary Table S7a), suggesting likely effects on protein activity. Furthermore, none of the 12 c3-eQTLs localized to regions identified as being subject to copy number variation in the SHR (18), strengthening the view that sequence variation rather than copy number variation close to or within dysregulated transcripts is the major cause of differential expression observed in cis-eQTL genes.
Although the c3-eQTLs represent a diverse group of genes with a range of different functions, we noted that several c3-eQTLs [Akr1c1 (16), Cyp1a1 (19, 61), Hp (6, 15), Hpx (66), Me3 (59, 70), Pex11b (58), and Tst (31)] have previously been demonstrated to play a role in oxidative stress pathways, suggesting they may be important factors controlling the extent of tissue and DNA damage in response to oxidative stress. This link is of interest given that oxidative stress has been linked to hypertension, cardiovascular disease, and Type 2 diabetes, among other pathological states (25). Indeed, the SHR parental strain has been shown to exhibit reduced endogenous antioxidant capabilities compared with other rat strains, with studies showing increased DNA and protein oxidation in SHR compared with WKY (12, 30, 62, 71). This suggests that an increased knowledge of the molecular mechanisms underlying reduced antioxidant capacity is of functional relevance.
For example, the IL-6-responsive acute-phase gene Hp (41) has been shown along with another c3-eQTL Hpx to prevent hemoglobin-mediated oxidative damage by promoting clearance of hemoglobin released as a consequence of red blood cell turnover or hemolysis (6, 15) via scavenger receptors (8, 27, 57). The Hp locus is polymorphic in humans, with variants found to show evidence of association with increased risk of cardiovascular disease in diabetic patients (6). Hp is considered a susceptibility factor for atherosclerosis in diabetic patients and has also been shown to reduce hemoglobin-mediated hypertension in other species (13). Human Hp promoter polymorphisms disrupting CEBPβ sites have been shown to significantly reduce transcriptional activation in response to IL-6 in human cell lines (41, 63). Here we show that Hp transcript levels in adrenal correlate with kidney oxidation traits and blood pressure, while the gene itself colocalizes with pQTLs for blood pressure (Table 3, Supplementary Tables S5 and S6). Expression of both Hp and Hpx is increased in SHR, suggesting that dysregulation of this mechanism may serve to act as a protective factor attenuating the oxidative stress response in this strain. We identified several disrupted putative TFBSs upstream of the Hp gene, including an SNP affecting a putative CEBP family binding site in SHR (Table 5). This finding is of note due to an earlier report identifying the CEBP family protein, CEBPβ, as a key TF in the control of Hp transcript levels in humans (40, 41), suggesting that SNPs within the Hp upstream region represent excellent candidate QTNs that regulate Hp expression and potentially the development of downstream physiological traits, though verification of the effects of these SNPs will entail functional analysis.
Analysis of sequence variants in c3-eQTL upstream regions showed an SNP that disrupts a putative PAX binding motif in the Cela1 gene whose expression in fat and kidney correlates with kidney mass traits and colocalizes with QTLs for blood pressure-related traits (Table 5 and Supplementary Tables S5 and S6). Although the correlation of Cela1 expression with kidney mass was detected primarily in the fat cis-eQTL dataset, this transcript was also found to correlate with this trait in the kidney cis-eQTL dataset, albeit at a lower level of statistical significance (P = 0.01), suggesting a role for this gene in the control of kidney mass traits. Members of the PAX TF family (Pax2 and Pax5) have been shown to be involved in the regulation of kidney development (21, 67) and repair (33). This raises the possibility that Cela1 may be a downstream target by which PAX TFs mediate these processes, with disruption of its expression leading to aberrant kidney mass and development. Indeed, alterations in the renal elastin-elastase system, leading to increased expression of elastins, have been linked to Type I diabetic nephropathy in mice and humans (65). Furthermore, Cela1 also contains two missense coding region SNPs, affecting codons 21 and 160 (see Supplementary Table S7a). The P21A change occurs in a residue that is conserved between humans and rodents, close to the cleavage site of the immature Cela1 proprotein product (35), potentially affecting transcript stability or protein function. This provides an additional means by which sequence variation at the Cela1 locus may contribute to the development of the observed expression and physiological traits. These observations, while suggestive, require confirmation in future functional studies.
In this study we have combined the use of transcript expression data, physiological trait measurements and the recently assembled SHR genome sequence along with curated data resources such as TRANSFAC to prioritize novel candidate genes underlying physiological traits manifested in the SHR strain. By integrating these resources, we have identified genes that not only are highly correlated with relevant physiological traits across the BXH/HXB RI strains and located within previously mapped QTL regions in this RI strain panel but also contain sequence variants that represent a plausible genetic basis for the differential expression observed and could affect downstream physiological and metabolic traits. While further studies would be required to confirm the physiological significance of the transcripts and sequence variants identified, the increasing availability of knockout and transgenic animal models makes such studies a plausible next step. Indeed, knockout mice are available for a number of the c3-eQTLs identified; however, these have yet to be tested for a relevant phenotype. These data therefore provide a starting point for systematic analysis of functional variants that potentially underlie these traits in the SHR model.
This work was primarily supported by intramural funding from the MRC Clinical Sciences Centre, by the Imperial College BHF Centre of Research Excellence, by a Wellcome Trust studentship (069962/Z/02/Z) to I. C. Grieve, and by the Grant Agency of the Czech Republic (grant 301/08/0166) and the Ministry of Education of the Czech Republic (grant ME10019) (M. Pravenec).
No conflicts of interest, financial or otherwise, are declared by the author(s).
1 The online version of this article contains supplemental material.
- Copyright © 2011 the American Physiological Society