Wilms’ tumor gene (WT1) is important for nephrogenesis and gonadal growth. WT1 mutations cause Denys-Drash and Frasier syndromes, which are characterized by glomerular scarring. To test whether genetic variations in WT1 and WIT1 (gene immediately 5′ to WT1) associate with focal segmental glomerulosclerosis (FSGS), patients with biopsy-proven idiopathic and HIV-1-associated FSGS were enrolled in a multicenter study. We genotyped SNP rs6508 located in WIT1 exon 1, three SNPs (rs2301250, rs2301252, rs2301254) in the promoter shared by WT1 and WIT1, rs2234590 in exon 6, rs2234591 in intron 6, rs16754 in exon 7, and rs1799937 in intron 9 of WT1. Cases (n = 218) and controls (n = 281) were compared in the African American population. Stratification by HIV-1 infection status showed that SNPs rs6508, rs2301254, and rs1799937 were significantly associated with FSGS [rs6508 odds ratio (OR) 1.82, P = 0.006; rs2301254 OR 1.65, P = 0.049; rs1799937 OR 1.91, P = 0.005] in the non-HIV-1 group and rs2234591 (OR 0.234, P = 0.011) in the HIV-1 group. Haplotype analyses in the population revealed that seven SNPs were associated with FSGS; five SNPs had the highest contingency score [−log10(P value) = 13.57] in the HIV-1 group. This association could not be explained by population substructure. We conclude that SNPs in WT1 and WIT1 genes are significantly associated with FSGS, suggesting that variants in these genes may mediate pathogenesis by altering WT1 function. Furthermore, HIV-1 infection status interacts with genetic variations in both genes to influence this phenotype. We speculate that nephropathy liability alleles in WT1 pathway genes cause podocyte dysfunction and glomerular scarring.
- renal failure
- single nucleotide polymorphism haplotype
- African Americans
Glomerular disease is the third leading cause of end-stage renal disease, and focal segmental glomerulosclerosis (FSGS) constitutes a significant proportion of this subgroup. FSGS is diagnosed by renal biopsy, with histopathological findings characterized by partial scarring of the glomerular capillary tuft in a significant proportion of glomeruli. Patients with idiopathic FSGS present with proteinuria and often develop renal dysfunction that progresses to end-stage renal disease. Immunosuppressive regimens, which are commonly used to treat other glomerular diseases, are generally ineffective in FSGS. The incidence of idiopathic FSGS is increased in African Americans (3), and secondary forms of FSGS can occur in association with vesicoureteral reflux, human immunodeficiency virus (HIV)-1 infection (9), and sickle cell disease (51). Familial forms of FSGS with linkage to chromosomes 19q13 (40), 11q21–q22 (55), and 1q25–31 (53) have been reported in large families. Mutations in two genes, actinin-4 (30) and CD2-associated protein (CD2AP) (31, 50), are associated with susceptibility to FSGS. Thus FSGS is a disease with diverse etiologies and significant genetic heterogeneity.
The Wilms’ tumor gene (WT1) is located at chromosome 11p13 (7, 21) and was originally discovered as a tumor suppressor gene inactivated in a subset of Wilms’ tumors. WT1 is composed of 10 exons and encodes an NH2-terminal glutamine-rich/proline-rich transactivation domain and a COOH-terminal DNA binding domain with four zinc fingers (Fig. 1). Mechanisms that regulate WT1 expression are still unclear, with several different genes playing a significant role (22, 45, 57), one of which may be WIT1, which is located upstream of the WT1 gene (Fig. 1). WIT1 is transcribed in the opposite direction from a shared intergenic region (∼2 kb in size) that contains the promoter region. WT1 and WIT1 are expressed in the same tissue-restrictive pattern (28), suggesting that WIT1 may be an antisense regulator of WT1 expression or function.
Regulated expression of WT1 is absolutely required for normal renal and genital development (44, 46). There are two major alternative splice sites in WT1 that generate four major WT1 isoforms (6, 23). The first alternative splice site either includes or excludes 17 amino acids encoded by exon 5 and is expressed only in mammals. The function of this domain is not clear, since mice engineered to express WT1 isoforms without exon 5 develop normally and are fertile (10). The second alternative splice site occurs in intron 9 and excludes or includes three amino acids, lysine-threonine-serine (KTS), between zinc fingers 3 and 4 (35). In vitro, WT1 isoforms without KTS [WT1(−KTS)] bind DNA and regulate transcription, whereas those with the KTS insertion [WT1(+KTS)] primarily regulate RNA processing but not gene activity (10). Posttranscriptional modifications, RNA editing, and the use of alternative translation start sites increase the number of known WT1 protein isoforms to at least 24 (20, 23, 45, 54). Whereas mutations in WT1 are rare in sporadic Wilms’ tumors, they occur frequently in association with Frasier (FS; Ref. 4) and Denys-Drash syndromes (DDS; Refs. 42, 43), two childhood syndromes characterized by gonadal dysfunction and progressive nephropathy from FSGS and diffuse mesangial sclerosis, respectively. In addition, 4 of 10 patients with isolated diffuse mesangial sclerosis had WT1 mutations (29). Comparing these results with a large database for WT1 mutations, Jeanpierre et al. (29) observed consistent associations between mutations in exons 8 and 9, involving the zinc finger domain, and mesangial sclerosis (with or without gonadal dysfunction). Mice in which the WT1 gene is homozygously deleted fail to develop kidneys and gonads (34), but heterozygote animals that have a targeted deletion of the third zinc finger at codon 396 develop mesangial sclerosis (41). 
Similarly, mice that expressed either the +KTS or −KTS splice variant from one or both WT1 alleles developed abnormalities of the gonad and nephron with phenotypes similar to Frasier syndrome (24).
Evidence from DDS and FS, combined with the data from animal models, demonstrates that glomerulosclerosis phenotypes correlate with mutations in WT1. Therefore, we hypothesized that WT1 variants may regulate idiopathic FSGS. To determine whether variants in WT1 are associated with idiopathic FSGS, we genotyped single nucleotide polymorphisms (SNPs) in WT1 in an African American cohort with biopsy-proven FSGS. In addition, we genotyped a variant in the single exon of the WIT1 gene located upstream of WT1 to determine whether this variant is associated with FSGS.
Description of the Study Population
Patient selection and evaluation were performed at 22 academic medical centers in the United States, using protocols approved at each institution by an Institutional Review Board. All patients gave informed consent. Inclusion criteria were a renal biopsy that showed idiopathic FSGS or HIV-1-associated FSGS. Patients with a clinical history suggestive of hyperfiltration-associated FSGS due to reduced renal mass, reflux nephropathy, sickle cell nephropathy, or morbid obesity were excluded. Most patients had sporadic FSGS, but some patients had a family history. Control subjects included African American (AA) HIV-1 sero-positive patients who did not have FSGS after at least 8 yr of exposure. The absence of FSGS was defined by normal serum creatinine (≤1.4 mg/dl) and lack of proteinuria (urine protein-to-creatinine ratio <0.5). As a second control group, we randomly recruited AA blood donors from the same site. Peripheral blood lymphocytes from each patient were transformed with Epstein-Barr virus to generate permanent cell lines that provide a renewable source of DNA.
Screening for DNA Variations
Identification of SNPs in the WIT1 and WT1 gene.
Previously identified DNA variants of WT1 reported in public databases, the National Center for Biotechnology Information (NCBI) (http://www.ncbi.nlm.nih.gov/LocusLink/), the SNP Consortium Limited (http://snp.cshl.org/), the Institute of Medical Science-Japanese Science and Technology Japanese SNPs database (http://snp.ims.u-tokyo.ac.jp/), the Ensembl Genome Browser (http://www.ensembl.org/), and the GeneSNPs Public Internet (http://www.genome.utah.edu/genesnps/), were selected for genotyping (Fig. 1). Eight SNPs were identified from these databases, and the dbSNP reference map was used to locate the chromosomal positions of each of these SNPs (see Table 2). For genotyping, the SNPs were prioritized as follows: exon with nonsynonymous amino acid change, transcription factor binding sites in the promoter, splice site, and exon with synonymous amino acid change and intronic region. SNPs were chosen on the basis of their heterozygosities (heterozygosity >0.25) and minor allele frequencies of at least 0.1. SNPs that had neither heterozygosity nor allele frequency values available in the databases were then validated by NCBI BLAST analysis to confirm the presence of selected SNPs in WT1 or WIT1 genes and by bidirectional DNA sequencing of 20 FSGS individuals.
Oligonucleotide primers flanking each polymorphism of interest within WIT1 and WT1 genes were designed with PrimerSelect 4.05 and used to amplify the region. The amplicon was then purified at 37°C for 15 min using ExoSAP-IT, followed by denaturation at 80°C for 15 min and sequencing. All sequence data in a FASTA format were subjected to multiple alignments performed using the Clustal V method, described by Higgins and Sharp in 1989 (26), incorporated in the MegAlign 4.05 DNASTAR software. The resulting multiple sequences were aligned against a consensus sequence obtained from GenBank or NCBI for the gene of interest. Mismatches were identified among the sequence alignments. After resequencing twice in opposite directions to rule out sequencing error, we recorded likely polymorphic sites as SNP candidates.
With the use of TFSEARCH (http://www.cbrc.jp/research/db/TFSEARCH.html), the SNP candidates within the promoter region were analyzed to determine whether their location coincided with transcription factor (TF) binding sites. The Splice Site Prediction by Neural Network (http://www.fruitfly.org/seq_tools/splice.html) was used to determine whether the SNP candidates at the intronic regions localized to splice donor or acceptor regions or within the splice regulatory sites. The silent SNP candidates (i.e., at the 3rd wobble codon) in the exonic regions were analyzed to determine whether they were within the regulatory binding sites and splice regulatory sites, using TFSEARCH.
Genotyping of Polymorphisms
SNP analysis in WT1 and WIT1.
Our samples were typed for the SNPs of interest, using TaqMan assay or Luminex xMap technology. With the use of TaqMan assay, genomic DNA (10 ng) was subjected to PCR amplification (hold, 10 min, 95°C; denature, 15 s, 92°C; and anneal/extend, 1 min, 60°C; for 40 cycles) in a volume of 25 μl including 1× TaqMan Universal PCR Master Mix [PE Biosystems; 8% glycerol and 1× TaqMan buffer (100 mM Tris, pH 8.0, 500 mM KCl) with 7.5 mM MgCl2, dNTPs (200 μM dATP, 200 μM dCTP, 200 μM dGTP, 400 μM dUTP), and 0.5 U of AmpliTaq Gold DNA polymerase]. Assays-on-Demand (1×) SNP Genotyping Assay Mix containing the two specific TaqMan MGB probes, forward and reverse primers, was also added. The 96-well or 384-well plate containing the reaction mixture was then run on the ABI 7000 or the ABI 7900 Sequence Detection System Instrument. A number of steps were taken to minimize error in genotyping WT1 and WIT1 SNPs. A set of 71 individuals (cases n = 38, controls n = 33) in duplicate, 10 individuals (cases n = 7, controls n = 3) in triplicate, 3 individuals in quadruplicate, and 1 individual in quintuplicate were genotyped under the same conditions as the whole sample. Concordance values were estimated for these individuals to ascertain replicability in genotyping. Individuals whose genotypes could not be discriminated accurately were regenotyped. In addition, blind replicates, Centre d’Etude du Polymorphisme Humain controls, and blanks (water) were also genotyped as internal and negative controls. SNPs that demonstrated <98% concordance in the duplicates, triplicates, quadruplicates, and quintuplicates were regenotyped. SNPs that gave concordance values of at least 97.5% were considered in our analyses. In addition to all the measures taken to minimize genotyping error, the measure of nonconformity to Hardy-Weinberg proportion in the samples for each SNP was calculated. Deviation from Hardy-Weinberg proportions was indicative of problematic assays.
Unlike the TaqMan platform, which assays one SNP at a time, Luminex xMap technology incorporates fluorescently encoded microspheres that are able to perform multiplex SNP assays in conjunction with advanced digital signal processing. The ability to measure several SNPs within one sample, together with the small sample volume required, was attractive because it increased testing efficiency and reduced sample consumption. With the use of Luminex xMap technology, primers were designed for the regions flanking the SNPs of interest, followed by amplification of 10 ng of genomic DNA from both cases and controls for each SNP in our population. The PCR products were then multiplexed, resulting in the simultaneous analysis of two SNPs (rs2234590 and rs2234591). The multiplexed PCR products were then subjected to three steps: a ligation detection reaction (LDR) step that ligates a tagged allele-specific (complement of the SNP nucleotide) oligonucleotide to a common primer attached to biotin, a step involving hybridization of the LDR product to the anti-tagged microspheres, and lastly a fluorescence detection step. The ligation was done at 95°C for 1 min, 95°C for 15 s, and 58°C for 2 min. The LDR product was hybridized to anti-tagged microspheres and then incubated in streptavidin-phycoerythrin at 95°C for 1.5 min and 37°C for 40 min. The anti-tag sequence on the microsphere hybridizes to the tag sequence generated from the LDR reaction. The fluorescence output from the second step was detected in the Luminex 100 platform, followed by interpretation of the genotypes.
Data on patients with FSGS and controls were compared using a χ2-test. Allele frequencies for all the genotyped polymorphisms were estimated by the gene counting method, and a χ2-test was used to identify departures from Hardy-Weinberg (HW) proportions.
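The gene counting and Hardy-Weinberg steps described above can be illustrated with a minimal sketch. The function names and the example genotype counts are hypothetical and are not taken from the study; the P value assumes a χ2 distribution with 1 degree of freedom, which is the usual choice for a biallelic locus, and the sketch assumes the locus is polymorphic so that no expected count is zero.

```python
import math

def allele_freq_by_counting(n_aa, n_ab, n_bb):
    """Gene counting: a homozygote carries two copies of an allele, a heterozygote one."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    return p, 1 - p

def hardy_weinberg_chi2(n_aa, n_ab, n_bb):
    """Chi-square test of observed genotype counts against HW expectations (1 df)."""
    n = n_aa + n_ab + n_bb
    p, q = allele_freq_by_counting(n_aa, n_ab, n_bb)
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical counts with an excess of homozygotes.
chi2, p_value = hardy_weinberg_chi2(40, 20, 40)
```

An excess of homozygotes, as in the hypothetical counts above, inflates the statistic and would flag the assay as potentially problematic.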
Estimating odds ratios and adjusting for population stratification.
Analyses using 2 × 2 allelic table and 2 × 3 contingency table analyses under the dominant, recessive, and additive models (n = 499; Table 1) were performed. The P value, odds ratio (OR), and 95% confidence interval associated with each test were calculated for each SNP. SNPs with OR ≥1.50 were considered associated with FSGS, whereas those with OR <1.00 were considered protective.
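As an illustration of the 2 × 2 allelic analysis, the sketch below computes an odds ratio, a 95% confidence interval, and a Pearson χ2 statistic from allele counts. The confidence interval uses the log-odds (Woolf) approximation, which is an assumption on our part because the text does not state which interval method was used. The function names and counts are hypothetical.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """a, b: risk and other allele counts in cases; c, d: the same in controls."""
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) under the Woolf approximation (assumed method).
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

def allelic_chi2(a, b, c, d):
    """Pearson chi-square statistic for a 2 x 2 table (1 df, no continuity correction)."""
    n = a + b + c + d
    return n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))

# Hypothetical allele counts, not data from the study.
or_, lo, hi = odds_ratio_ci(20, 10, 10, 20)
stat = allelic_chi2(20, 10, 10, 20)
```

Under the criteria stated above, an allele with OR ≥1.50 would be read as a susceptibility allele and one with OR <1.00 as protective.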
For our case-control sample, population substructure among subjects can lead to overdispersion of the χ2-test statistic for association and, as a result, can cause spurious rejections of the null hypothesis. With the use of genotypes generated from our seven SNPs, the genomic-control method (14), which assumes that the genotypes of individuals are independent in the absence of population substructure, was used as an approach to eliminate spurious associations due to population heterogeneity by adjusting the inflated dispersion. We also genotyped an additional 27 microsatellite markers (Supplemental Table S1; available at the Physiological Genomics web site) dispersed over the genome that were not known to be linked to nephropathy loci, and tested for population stratification using the genomic-control approach (14). This enabled us to obtain independent confirmation of no population stratification.
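The idea behind the genomic-control adjustment can be sketched as follows. This is a simplified illustration, assuming the commonly used median-based estimator of the inflation factor for 1-df χ2 statistics; the text does not specify the estimator actually used, and the function names and input values are hypothetical.

```python
import statistics

# Median of the chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def inflation_factor(null_marker_chi2):
    """Estimate lambda from statistics at markers assumed unlinked to the trait.

    Lambda is floored at 1 so that the adjustment never makes a test less conservative.
    """
    lam = statistics.median(null_marker_chi2) / CHI2_1DF_MEDIAN
    return max(lam, 1.0)

def gc_adjust(chi2, lam):
    """Deflate a candidate-marker statistic by the estimated inflation factor."""
    return chi2 / lam

# Hypothetical null-marker statistics; a lambda near 1 suggests little stratification.
lam = inflation_factor([0.1, 0.5, 0.3, 1.2, 0.4])
adjusted = gc_adjust(7.5, lam)
```

A lambda close to 1 at the unlinked markers is what one would expect if population stratification is not inflating the association statistics.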
Standardized linkage disequilibrium.
The standardized measure of pairwise linkage disequilibrium (LD), termed D′ (36), was calculated using maximum likelihood. Two-locus haplotype frequencies were estimated using the EM algorithm as implemented in the Arlequin computer program (22) (http://lgb.unige.ch/arlequin/) under the assumption of HW equilibrium. This implementation of the EM algorithm is based on a Markov chain approach and calculates the most likely pairwise haplotypes and their frequencies. One hundred thousand iterations of the Markov chain were carried out within Arlequin. Ideally, in a large population, a D′ value of 1 indicates complete LD, and a value of 0 indicates no LD. LD plots were then generated utilizing GOLD software (1). A triangular matrix of D′ values was used to demonstrate LD patterns in cases and controls within AA.
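For reference, D′ for a pair of biallelic loci can be computed from a two-locus haplotype frequency and the two allele frequencies. The sketch below keeps the sign of D so that negative associations remain visible; the function name and inputs are illustrative and do not reproduce the Arlequin or GOLD implementations.

```python
def d_prime(p_ab, p_a, p_b):
    """Signed D' for alleles A and B.

    p_ab: frequency of the haplotype carrying both A and B.
    p_a, p_b: frequencies of allele A at locus 1 and allele B at locus 2.
    """
    d = p_ab - p_a * p_b
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    # A monomorphic locus gives d_max == 0; report no LD in that case.
    return 0.0 if d_max == 0 else d / d_max

complete_ld = d_prime(0.5, 0.5, 0.5)     # A and B always travel together
no_ld = d_prime(0.25, 0.5, 0.5)          # haplotype frequency equals the product
negative_ld = d_prime(0.0, 0.5, 0.5)     # A and B never share a haplotype
```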
Haplotype frequency estimation.
Because the sample consisted of unrelated subjects, family data could not be used to determine phase in multiple heterozygous individuals. We used the EM algorithm (13, 17, 25, 27, 38) to estimate maximum-likelihood estimates of haplotype frequencies based on genotypes at the multiple (n = 7) biallelic loci. The approach assumed HW proportions and utilized known genotypic information at each relevant SNP locus to estimate missing information (i.e., the phase of each individual, given their particular genotypes). Expected heterozygosities for individual sites and for the haplotypes were estimated as $1 - \sum_i p_i^2$, where $p_i$ is the frequency of the $i$th allele or haplotype. After elimination of individuals with missing genotypic data, 499 individuals were subjected to haplotype estimation, of which 218 were cases and 281 controls. The maximum-likelihood estimates for each haplotype were calculated using 15 initial starting values and a maximum number of iterations set to $1 \times 10^5$.
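To make the phase-estimation idea concrete, the following is a minimal sketch of the EM algorithm for the simplest case of two biallelic loci, where only double heterozygotes have ambiguous phase. It is not the implementation used in the study, which handled seven loci with multiple starting values; the function names, the single uniform starting point, and the toy genotypes are assumptions made for illustration.

```python
from collections import Counter

HAPLOTYPES = [(0, 0), (0, 1), (1, 0), (1, 1)]

def em_two_locus(genotypes, n_iter=100):
    """genotypes: list of (g1, g2), each the count (0, 1, or 2) of allele '1' at a locus."""
    n_chrom = 2 * len(genotypes)
    fixed = Counter()          # haplotypes whose phase is unambiguous
    n_double_het = 0
    for g1, g2 in genotypes:
        if g1 == 1 and g2 == 1:
            n_double_het += 1  # phase unknown: 11/00 (cis) or 10/01 (trans)
            continue
        a = (1 if g1 >= 1 else 0, 1 if g1 == 2 else 0)
        b = (1 if g2 >= 1 else 0, 1 if g2 == 2 else 0)
        fixed[(a[0], b[0])] += 1
        fixed[(a[1], b[1])] += 1
    freqs = {h: 0.25 for h in HAPLOTYPES}  # uniform start (assumed)
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote is in cis phase.
        cis = freqs[(1, 1)] * freqs[(0, 0)]
        trans = freqs[(1, 0)] * freqs[(0, 1)]
        w = cis / (cis + trans) if cis + trans > 0 else 0.5
        # M-step: expected haplotype counts divided by the number of chromosomes.
        counts = {h: float(fixed[h]) for h in HAPLOTYPES}
        counts[(1, 1)] += w * n_double_het
        counts[(0, 0)] += w * n_double_het
        counts[(1, 0)] += (1 - w) * n_double_het
        counts[(0, 1)] += (1 - w) * n_double_het
        freqs = {h: c / n_chrom for h, c in counts.items()}
    return freqs

def expected_heterozygosity(frequencies):
    """One minus the sum of squared allele or haplotype frequencies."""
    return 1 - sum(p * p for p in frequencies)

# Toy sample: unambiguous homozygotes plus four double heterozygotes.
toy = [(2, 2)] * 3 + [(0, 0)] * 3 + [(1, 1)] * 4
estimated = em_two_locus(toy)
h_exp = expected_heterozygosity(estimated.values())
```

In the toy sample the unambiguous individuals carry only the 1-1 and 0-0 haplotypes, so the algorithm assigns the double heterozygotes almost entirely to the cis phase.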
Haplotype analysis using the omnibus likelihood ratio test.
A likelihood ratio test (18) was used to determine whether multiple related haplotypes bearing common alleles were associated with disease. The null hypothesis was that there was no association between any of the haplotypes and the disease status. The test statistic follows a χ2 distribution with H − 1 degrees of freedom, $\chi^2_{H-1}$, where H is the total number of haplotypes.
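If haplotype counts in cases and controls were observed directly, the omnibus statistic would reduce to the familiar likelihood ratio (G) statistic on an H-by-2 table, as sketched below. This is a simplification: in the study, haplotype frequencies were estimated from unphased genotypes, so the actual likelihoods differ. The function name and counts are hypothetical.

```python
import math

def omnibus_lrt(case_counts, control_counts):
    """Likelihood ratio statistic for an H-by-2 haplotype table, with H - 1 df."""
    n_case = sum(case_counts)
    n_ctrl = sum(control_counts)
    n = n_case + n_ctrl
    g = 0.0
    for o_case, o_ctrl in zip(case_counts, control_counts):
        row = o_case + o_ctrl
        for observed, col in ((o_case, n_case), (o_ctrl, n_ctrl)):
            if observed > 0:  # empty cells contribute nothing to the sum
                # Expected count under no association is row * col / n.
                g += 2 * observed * math.log(observed * n / (row * col))
    df = len(case_counts) - 1
    return g, df

# Hypothetical counts for two haplotypes.
g_stat, df = omnibus_lrt([30, 10], [10, 30])
```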
Haplotype analysis using a moving window test.
A moving window method (12, 16) scans for all possible widths of haplotypes that contain a specific susceptibility variant associated with FSGS. A maximum scanning window width was set to m = 5 (where m is the number of loci). The null hypothesis was that there was no association with the disease, under the assumption that the haplotypes were sampled independently. For all the windows, genotypic information was used to generate haplotypic frequencies from all possible haplotype widths of the seven linked SNPs, which harbored a potential susceptibility marker or locus (see Fig. 3). The haplotypic frequencies were utilized to obtain the actual number of individuals among the cases and controls having a particular haplotype with one or more susceptibility loci. A contingency table testing these haplotypes in controls vs. cases was then computed to obtain the contingency statistic for each locus pair S(a, b) that contained loci in between a and b and therefore contributed to the widths of the haplotypes (see Fig. 3). The analysis scans for maximum contingency statistics, max S*, for each SNP that is contained in different haplotype widths. This was repeated for each SNP within the window of size m = 5. The SNP or SNPs that yielded max S* within the window are ultimately reported. A standard χ2-test was performed under the assumption of independence of the k-by-2 contingency table with k − 1 degrees of freedom, where k is the number of haplotypes. The number of haplotypes varied depending on the marker positions within a window. For an accurate type I error, an empirical P value that corrects for testing multiple marker locations was obtained by permuting the affection status across the individual genotypes and recomputing the max S* that contains the susceptibility region. The P value was expressed as a −log10 function for ease of visualization. From these 15 contingency statistics, the max S* was reported for each window by −log10 (P value) examination. 
After the first window, the analysis slides to the second window containing the next five SNPs and then to the next five SNPs (see Fig. 3).
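The bookkeeping of the moving window scan can be sketched as follows. With seven SNPs and a window width of five, there are three windows, and each window contains 15 contiguous locus spans (a, b), matching the 15 contingency statistics mentioned above. The permutation helper is a generic illustration: `max_stat_fn` is a hypothetical caller-supplied function that recomputes max S* for a permuted set of case-control labels, and the add-one correction in the empirical P value is our assumption rather than a detail given in the text.

```python
import random

# Seven SNPs retained for analysis, in 5' to 3' order.
SNPS = ["rs6508", "rs2301250", "rs2301252", "rs2301254",
        "rs2234590", "rs2234591", "rs1799937"]

def sliding_windows(loci, width):
    """All windows of the given width, each shifted by one locus."""
    return [loci[i:i + width] for i in range(len(loci) - width + 1)]

def contiguous_spans(window):
    """Every contiguous run of loci from position a to position b within a window."""
    return [window[a:b + 1]
            for a in range(len(window))
            for b in range(a, len(window))]

def empirical_p(observed_max, labels, max_stat_fn, n_perm=1000, seed=0):
    """Permute affection status and compare each permuted max statistic with the observed one."""
    rng = random.Random(seed)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if max_stat_fn(shuffled) >= observed_max:
            hits += 1
    # Add-one correction keeps the estimate away from exactly zero (assumed convention).
    return (hits + 1) / (n_perm + 1)

windows = sliding_windows(SNPS, 5)
spans_in_first_window = contiguous_spans(windows[0])
```

Because the maximum statistic is recomputed for every permutation, the resulting empirical P value accounts for having tested many overlapping haplotype spans.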
Description of the Sample
The sample was predominantly AA (n = 499), with 218 cases and 281 controls (Table 1). Consistent with published data (32), a twofold excess of males was observed in our sample (Table 1). Eight SNPs (Table 2) were genotyped, using TaqMan assay and Luminex xMap technology. Genotypic and allelic frequencies for each polymorphism were determined by gene counting, and the allelic frequencies are shown in Table 2. A χ2-test was performed to test for HW proportions. The genotypic frequencies in rs2234591 among controls and rs16754 among cases and controls revealed nonconformity to HW proportions (Table 3). The absence of HW equilibrium in rs16754 (in exon 7) was due to an excess of homozygotes, and therefore this SNP was discarded from further analysis.
Comparison of D′ and Haplotype Blocks
The haplotype frequencies were estimated from a sample of unrelated cases and controls by use of maximum-likelihood estimates [as implemented by Excoffier and Slatkin in 1995 (16) in Arlequin (http://lgb.unige.ch/arlequin/)]. All 32 possible pairwise D′ values among cases and controls were calculated, and the significance level for each pair was evaluated by an asymptotic χ2-test statistic. In general, marker pairs that were not in complete LD (D′ = 1.00) showed D′ values between 0.40 and 0.95. These D′ values were not strictly correlated with distance between marker pairs, but some trend was evident. The pattern and extent of LD at WIT1 and WT1 among the cases and controls generally differed. The D′ statistic was plotted with the GOLD program to illustrate the intensity of LD along the length of the region spanned by the seven remaining SNPs (Fig. 2). As shown in Fig. 2, A and B, a much stronger LD pattern was observed between SNP pairs rs6508-rs2301252, rs2301254-rs2234590, rs2301254-rs2234591, rs6508-rs1799937, and rs2301252-rs1799937 among cases and SNP pairs rs6508-rs2301252 and rs2301252-rs1799937 among controls. A weaker LD pattern was observed between marker pairs rs6508-rs2301254 and rs2301252-rs2301254 (shown in green in Fig. 2A). Among the cases, negative LD values were observed between other marker pairs, indicating that the two-locus allele pairs were in negative association and did not appear together on a particular haplotype (Fig. 2A, blue). We did not observe negative LD values in the control group.
Each SNP was analyzed separately, and we observed that rs6508 (in the WIT1 gene) and rs2234591 (WT1 intron 6) were significantly associated with FSGS (P values 0.040 and 0.009, respectively; Table 2). The allele with OR >1.5 (susceptibility allele) was A for rs6508, and the allele with OR <1.0 (protective allele) was A for rs2234591.
We stratified the AA population by HIV-1 status. Because sex was not a significant predictor, we did not control for sex in the contingency table test. Among subjects with HIV-1, rs2234591 was significantly associated with FSGS (P = 0.029), and rs2301254 was of borderline significance (P = 0.057). Among subjects without HIV-1, SNPs rs6508, rs2301254, and rs1799937 were significantly associated with FSGS (P = 0.006, 0.023, and 0.003, respectively; Table 3).
Adjusting for Population Stratification
Because the AA sample is known to be admixed and it is feasible that population stratification may cause spurious results, we used the Armitage trend test, as described in Devlin and Roeder (14), to adjust for population stratification. After adjustment for population stratification in the HIV-1-negative group, the SNPs that were significant remained significant with more conservative P values [rs6508 (0.008), rs2301254 (0.049), rs1799937 (0.005); Table 3]. Similarly, in the HIV-1-positive group, rs2234591 remained significantly associated with FSGS (P = 0.011), whereas rs2301254 was of borderline significance (P = 0.055; Table 3).
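A minimal sketch of the Armitage trend statistic for a 2 × 3 genotype table is given below, using additive weights (0, 1, 2) for the number of copies of the test allele. The version shown uses N rather than N − 1 in the variance, which is one of two common conventions; the function names and counts are hypothetical, and any genomic-control deflation would be applied to the resulting statistic separately.

```python
import math

def armitage_trend_chi2(cases, controls, weights=(0, 1, 2)):
    """Trend statistic (1 df) for genotype counts ordered by copies of the test allele."""
    totals = [r + s for r, s in zip(cases, controls)]
    n = sum(totals)
    n_cases = sum(cases)
    n_controls = n - n_cases
    sum_w_cases = sum(w * r for w, r in zip(weights, cases))
    sum_w_total = sum(w * t for w, t in zip(weights, totals))
    sum_w2_total = sum(w * w * t for w, t in zip(weights, totals))
    numerator = n * (n * sum_w_cases - n_cases * sum_w_total) ** 2
    denominator = n_cases * n_controls * (n * sum_w2_total - sum_w_total ** 2)
    return numerator / denominator

def chi2_1df_p_value(chi2):
    """Survival function of the chi-square distribution with 1 df."""
    return math.erfc(math.sqrt(chi2 / 2))

# Hypothetical genotype counts (0, 1, 2 copies of the test allele).
trend = armitage_trend_chi2([10, 20, 30], [30, 20, 10])
p_value = chi2_1df_p_value(trend)
```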
Omnibus Likelihood Ratio Test
Haplotype frequencies were estimated in case samples (n = 218) (Supplementary Table S2), and an H-by-2 contingency table was constructed to determine the most probable haplotype containing a variant allele that is associated with FSGS. To parse the association from the seven SNPs into smaller regions, three windows were constructed, each comprising haplotypes of five SNPs. The five alleles for each haplotype (h) are listed in order from left to right, based on the SNP location in the genes and the total number of haplotypes (H) (Supplementary Table S2). There were 32 haplotypes generated in each window. The omnibus likelihood ratio test was significant (Table 4; χ2 = 77.29, P = 0.001) at window 1 (harboring the first 5 SNPs), which indicated that the overall haplotype frequency profiles within this window were different between the FSGS subjects and controls (Supplementary Table S2).
Moving Window Test for Detecting SNP(s) That Are Significantly Associated with FSGS
Using the method described by Cheng and colleagues (11, 12), we analyzed our data set. The advantage of the moving window test is that it provides a measurement of the contingency score for each SNP, thus estimating the magnitude of association of each SNP with FSGS (Table 5). The most significant association was observed between the first five (in order from 5′ to 3′) selected SNPs in window 1 and FSGS in the AA population (empirical P value 0.003 after 1,000 permutations; Table 5). Therefore, the most significant associations were observed with rs6508 (WIT1 exon region), rs2301250 [WT1 promoter region and within a transcription coactivator binding site, cAMP response element binding protein (CREB) binding protein (CBP)], rs2301252 (WT1 promoter region), rs2301254 (WT1 promoter region), and rs2234590 (WT1 exon 6 region), with maximum −log10 (P value) = 9.03407, all in the first window (Table 5). In summary, the moving window test was consistent with the findings from the two-by-two contingency and stratified analysis, where rs6508 and rs2301254 remained significantly associated with FSGS (Tables 3 and 5) in the AA population. The omnibus likelihood ratio test results were also consistent with the moving window test, where the first five SNPs were significantly associated with FSGS in the first window analysis (Tables 4 and 5) in the AA population. The moving window analysis yielded susceptibility and protective haplotypes that are listed in Fig. 3.
The AA sample was stratified and restudied using the moving window analysis. Among subjects with HIV-1, the most significant association with FSGS was observed between the first five SNPs in window 1 and FSGS. Although results were similar to those shown above, a much higher contingency score (S* = 13.57) was observed. In addition, SNPs rs2234591 and rs1799937 were also significant, with a maximum contingency score of S* = 9.30. In this group, all the windows were significant, and more significant empirical P values were obtained. In the non-HIV-1 group, there was no significant association between the SNPs and FSGS (Table 5). In general, the sample sizes were smaller once we stratified the population, which may have led to nonsignificant associations in the non-HIV-1 group.
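Because the contingency scores are reported as −log10(P value), it can help to convert them back to the P value scale when comparing groups. The short sketch below does this for the scores quoted in the text; the helper names are ours.

```python
import math

def score_from_p(p_value):
    """Contingency score on the -log10 scale."""
    return -math.log10(p_value)

def p_from_score(score):
    """Invert the -log10 transform."""
    return 10 ** (-score)

# Scores quoted in the text for the whole sample and for the HIV-1 group.
p_all = p_from_score(9.03407)   # on the order of 1e-9
p_hiv = p_from_score(13.57)     # on the order of 1e-14
```

On this scale, each additional unit of score corresponds to a tenfold smaller P value.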
FSGS is a complex genetic disease the pathogenesis of which remains incompletely understood. Mutations in exons 3, 6, 7, 8, and 9 and the second alternate splice region of WT1 have been associated with renal phenotypes in DDS and FS (4, 5, 29, 33, 37, 42, 43), rare Mendelian syndromes of progressive nephropathy. Because WT1 plays a role in DDS and FS and targeted mouse mutants involving the zinc finger domain show mesangial sclerosis (4, 42, 43), we hypothesized that WT1 mutations may be associated with sporadic FSGS. Like Ruf et al. (48), who scanned for mutations in exons 7, 8, and 9 for steroid-resistant and steroid-sensitive nephrotic syndrome, we scanned the FSGS cases for mutations using denaturing high-performance liquid chromatography followed by DNA resequencing. We were unable to identify any coding sequence variants within exons 7, 8, and 9 that were consistent with functional mutations. To investigate whether common variants in WT1 were associated with disease, we determined whether six SNPs in the WT1 gene and one in the WIT1 gene were associated with FSGS. This is the first large-scale investigation of the role of the WT1 and WIT1 genes among FSGS cases and controls. Our results demonstrated that SNP variants in these genes are associated with kidney disease.
Several methods were used to determine whether SNPs in WT1 and WIT1 predicted FSGS. In this study, we initially demonstrated that, after adjustment for population substructure, rs6508, rs2301254, and rs1799937 were associated with FSGS. Using haplotype analyses, rs6508 in WIT1, rs2301250 in CBP transcription coactivator binding site, SNPs rs2301252 and rs2301254 in the promoter region, and a conserved rs2234590 that causes synonymous amino acid change in exon 6 were highly significant in the AA population. Terwilliger and Ott (52) stipulated that analysis based on haplotypes increases informativity at closely linked markers by increasing heterozygosity in that region. Hence this unique allelic background (haplotype) encompassing each of the SNPs in the WT1-WIT1 region provides more information compared with single marker analysis to detect association. This may explain why we observed no significance to borderline significance for some SNPs and a significant value for rs2234591 in single SNP analysis after adjusting for substructure. Compared with single SNP analysis, rs2234591 did not produce a significant signal in the moving window scan of the unstratified AA population. It is possible that the deviation from Hardy-Weinberg proportions among the controls, compounded with the smaller sample size of the stratified population, could have contributed to the association observed between rs2234591 and the trait. We also obtained highly significant values in the haplotype analyses of the first five SNPs in the HIV-1 group. Because the first five SNPs were significant in the two haplotype analyses, and SNPs rs6508, rs2301254, and rs1799937 (in the HIV-1-negative group) and SNPs rs2301254 and rs2234591 (in the HIV-1-positive group) were significant after adjustment for population stratification, it is likely that the putative susceptibility variants that act singly or together could be located in this region. 
rs2234590 in exon 6 was significant in the haplotype analyses, which was consistent with other studies that showed mutations (nonsynonymous changes) at multiple amino acids within exon 6 in patients with DDS (2, 5) and patients with steroid-sensitive and steroid-resistant nephrotic syndromes (48). Because the SNP in exon 6 of WT1 results in a synonymous amino acid change, it is likely that this SNP is in LD with point mutations elsewhere in the gene. There are also data showing that a nuclear export signal lies approximately between residues 180 and 340 (40a), a region that encompasses exons 2–7 of WT1.
The moving window test showed that the first five SNPs were associated with FSGS, and all seven SNPs were significant in the HIV-1 FSGS group before adjustment for population stratification. The most significant of these haplotypes encompassed the first five SNPs (Table 5) that had a maximum contingency statistic S* equal to 9.034 and 13.57 in the entire group and the HIV-1-positive group, respectively. While the association between FSGS and the two genes was detected in the entire population, the stratified analysis indicates a greater vulnerability among those exposed to HIV-1 before adjusting for population stratification. The unique allelic background present in the haplotypes may also explain why we observed highly significant association in the HIV-1 group in the moving window test and two SNPs, rs2301254 and rs2234591, with P values of 0.055 and 0.011, respectively, in the single SNP analysis in the HIV-1 group.
Interestingly, comparison between HIV-1 and non-HIV-1 groups suggests that HIV-1 exposure may influence the strength and location of the association. We adjusted for population stratification using the additive genetic model, and SNPs rs6508, rs2301254, and rs1799937 remained significant in the non-HIV-1 group. The haplotype analysis of the HIV-1 group suggests that the first five SNPs were the most significantly associated with FSGS, followed to a lesser extent by two SNPs (rs2234591 and rs1799937); none of the haplotypes was significantly associated with FSGS in the non-HIV-1 group. In the non-HIV-1 group, the association is spread over the maximal length of the SNP coverage (at SNPs rs6508, rs2301254, and rs1799937), and it is feasible that multiple SNPs on rare haplotypes account for this variation. This would explain why the moving window test is not significant in this group. Additional studies are needed to confirm this association.
Rs6508, located in the sole exon of WIT1 (Fig. 1), demonstrates significant association across multiple analyses. Studies have shown that WIT1 is coexpressed with WT1 in kidney and spleen cells (28). Because WIT1 may directly regulate WT1 (8), the association observed between rs6508 and FSGS is intriguing. To fully characterize this association, it will be necessary to design in vitro experiments expressing the SNPs in both genes simultaneously. Rs6508 is a coding SNP, and either G [protective allele (OR <1.0)] or A [susceptibility allele (OR >1.0)] is present in the first nucleotide of the triplet codons GCG and ACG that code for alanine and threonine, respectively. Alanine is a hydrophobic amino acid with a small aliphatic methyl (CH3) group and is normally embedded in the interior of the WIT1 protein molecule. In contrast, threonine is a hydrophilic and larger amino acid that is often phosphorylated, thereby modifying protein structure and function. The G-to-A substitution is therefore nonconservative and may alter the protein binding properties of WIT1, thus influencing WT1 expression or function.
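The codon change described above can be checked with a few lines of code. This is a minimal sketch using a partial codon table that holds only the two codons named in the text; the variable and function names are illustrative.

```python
# Partial codon table: only the two codons discussed in the text.
CODON_TABLE = {"GCG": "Ala", "ACG": "Thr"}

def substitute_first_base(codon, new_base):
    """Return the codon with its first nucleotide replaced."""
    return new_base + codon[1:]

ref_codon = "GCG"                                  # G allele
alt_codon = substitute_first_base(ref_codon, "A")  # A allele
change = (CODON_TABLE[ref_codon], CODON_TABLE[alt_codon])
is_synonymous = change[0] == change[1]

print(f"{ref_codon} ({change[0]}) -> {alt_codon} ({change[1]}); "
      f"synonymous: {is_synonymous}")
```

Because the substitution falls in the first codon position, it changes the encoded residue from alanine to threonine, so it is a nonsynonymous change.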
Rs2301250 is located in the 300-kDa CBP transcription coactivator binding site [A(C/T)GAAGGGAGTCAGA], which has DNA binding properties of a nuclear phosphoprotein that regulates cell cycle phase-specific modifications (15, 39, 47, 56). We discovered that other variations in the promoter (i.e., SNPs rs2301252 and rs2301254) were within close proximity of other known transcription factor binding sites in WT1. Thus these variants in the promoter region may be in LD with other consensus binding sites or may alter DNA binding properties in the promoter region. The variant in the CBP transcription coactivator binding site may influence the level of WT1 expression. Previous studies have shown that mutations located in zinc fingers I, II, and III and the second splice site of WT1 are associated with DDS and FS. In this study, we demonstrate that variants in the promoter region (SNPs rs2301252 and rs2301254) and transcription factor binding site (rs2301250) are associated with FSGS.
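The binding site notation A(C/T)GAAGGGAGTCAGA translates directly into a regular expression in which the variable SNP position becomes a character class. The sketch below shows one way to report which allele a given sequence carries at that position; the function name and any input sequences are hypothetical, and only the strand written in the text is searched.

```python
import re

# The SNP position is written (C/T) in the text; as a regex it becomes [CT].
CBP_SITE = re.compile(r"A([CT])GAAGGGAGTCAGA")

def allele_at_site(sequence):
    """Return the base at the variable position if the site is present, else None."""
    match = CBP_SITE.search(sequence.upper())
    return match.group(1) if match else None
```

A sequence carrying any other base at the variable position does not match the pattern, so the function returns None for it.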
The haplotype analysis yielded both susceptibility and protective haplotypes, as shown in Fig. 3. Both the protective and the susceptibility SNPs may play a role in modifying FSGS pathophysiology. There are two possibilities for susceptibility haplotypes: those from the WIT1 gene that may play a role in controlling WT1 expression and those bearing an unknown mutation in WT1 itself. We also propose either that these SNPs are in LD with nearby functional variants or that the SNPs themselves mediate pathogenesis. One caveat in this analysis is the dependence on the accuracy of haplotype frequency estimation by the EM-based maximum-likelihood method, although this method has been validated in many studies (12, 19, 49). The accuracy of this estimation method depends on several conditions, such as sample size (≥100 chromosomes), the number of loci (7 SNPs), and dispersion of haplotype frequency values (with some very common haplotypes and many rare haplotypes) (12, 19). These data were derived from a large multicenter study, and many of the conditions stated above were met. However, because this is the first report of association, other confirmatory studies are warranted.
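To illustrate the idea behind EM-based haplotype frequency estimation, the sketch below handles the simplest case of two biallelic SNPs, where only double heterozygotes have ambiguous phase. It is a simplified illustration, not the software or the seven-SNP analysis used in this study; the function names, the 0/1/2 genotype coding, and the default settings are assumptions.

```python
def alleles(genotype):
    """Map a genotype coded as the count (0, 1, 2) of allele '1' to two alleles."""
    return {0: (0, 0), 1: (0, 1), 2: (1, 1)}[genotype]

def em_two_snp_haplotypes(genotypes, n_iter=100, tol=1e-10):
    """Estimate two-SNP haplotype frequencies from unphased genotypes.

    genotypes: list of (g1, g2) pairs, each coded 0, 1, or 2.
    Returns a dict with frequencies of haplotypes '00', '01', '10', '11'.
    """
    base = {"00": 0.0, "01": 0.0, "10": 0.0, "11": 0.0}
    n_double_het = 0
    for g1, g2 in genotypes:
        if g1 == 1 and g2 == 1:
            n_double_het += 1  # phase unknown: 00/11 or 01/10
            continue
        # At most one locus is heterozygous, so the phase is unambiguous.
        for a1, a2 in zip(alleles(g1), alleles(g2)):
            base[f"{a1}{a2}"] += 1.0

    total = 2 * len(genotypes)
    freqs = {h: 0.25 for h in base}
    for _ in range(n_iter):
        # E-step: probability that a double heterozygote carries 00/11.
        cis = freqs["00"] * freqs["11"]
        trans = freqs["01"] * freqs["10"]
        w = cis / (cis + trans) if (cis + trans) > 0 else 0.5
        # M-step: recount haplotypes with expected phase assignments.
        counts = dict(base)
        counts["00"] += w * n_double_het
        counts["11"] += w * n_double_het
        counts["01"] += (1.0 - w) * n_double_het
        counts["10"] += (1.0 - w) * n_double_het
        new_freqs = {h: c / total for h, c in counts.items()}
        delta = max(abs(new_freqs[h] - freqs[h]) for h in freqs)
        freqs = new_freqs
        if delta < tol:
            break
    return freqs
```

With more loci the number of compatible haplotype pairs per individual grows quickly, which is one reason the accuracy conditions listed above (sample size, number of loci, and spread of haplotype frequencies) matter.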
This work was supported by National Institute of Diabetes and Digestive and Kidney Diseases Grants DK-54644, DK-54178, DK-38558, DK-51472, DK-02281, and DK-57329; grants from the Northeast Ohio chapter of the American Heart Association, the Central Ohio Diabetes Association, the Kidney Foundation of Ohio, the Leonard Rosenberg Foundation, and the Juvenile Diabetes Foundation; and federal funds from the National Cancer Institute, National Institutes of Health, under contract nos. NO1-CO-12400 and NO1-CO-56000.
We thank other clinical collaborators who provided patients for the study, including Dr. Roslyn Mannon (Duke Univ.), Dr. Patrick Nachman (Univ. of North Carolina), Dr. Thomas Welch (Univ. of Cincinnati), Dr. Florence Hutchison (Medical Univ. of South Carolina), Dr. T. K. S. Rao (State Univ. of New York), Dr. Diego Aviles (Louisiana State Univ.), Dr. Elizabeth Ripley (Medical College of Virginia), and Dr. Dollie Green (Univ. of Miami). We gratefully acknowledge the help of all study subjects. We also thank Dr. Peter Zimmerman at Case Western Reserve University for use of the Luminex genotyping platform.
The Supplemental Material for this article (Supplemental Tables S1 and S2) is available online at http://physiolgenomics.physiology.org/cgi/content/full/00201.2004/DC1.
Article published online before print. See web site for date of publication (http://physiolgenomics.physiology.org).
Address for reprint requests and other correspondence: S. K. Iyengar, Dept. of Epidemiology and Biostatistics, Case Western Reserve Univ., Wolstein Research Bldg. 1315, 10900 Euclid Ave., Cleveland, OH 44106-7281.
Copyright © 2005 the American Physiological Society