Stearoyl-CoA desaturases (SCDs) are key enzymes of fatty acid biosynthesis whose regulation underpins responses to dietary, thermal, and hormonal treatment. Although two isoforms are known to exist in the common carp and human and four in mouse, there is no coherent view on how this gene family evolved to generate functionally diverse members. Here we identify numerous new SCD homologs in teleost fishes, using sequence data from expressed sequence tag (EST) and cDNA collections and genomic model species. Phylogenetic analyses of the deduced coding sequences produced only partially resolved molecular trees. The multiple SCD isoforms were, however, consistent with having arisen by an ancient gene duplication event in teleost fishes together with a more recent duplication in the tetraploid carp and possibly also salmonid lineages. Critical support for this interpretation comes from comparison across all vertebrate groups of the gene order in the genomic environments of the SCD isoforms. Using syntenically aligned chromosomal fragments from large-insert clones of common carp and grass carp together with those from genomically sequenced model species, we show that the ancient and modern SCD duplication events in the carp lineage were each associated with large chromosomal segment duplications, both possibly linked to whole genome duplications. By contrast, the four mouse isoforms likely arose by tandem duplications. Each duplication in the carp lineage gave rise to differentially expressed SCD isoforms, either induced by cold or diet, as previously shown for the recently duplicated carp isoforms, or tissue specific, as demonstrated here for the anciently duplicated zebrafish isoforms.
- molecular evolution
- genome duplication
- large-insert library
Fluctuations in temperature present major problems, particularly for poikilothermic organisms (5). However, species that experience regular seasonal changes in environmental temperature, as in the temperate regions of the planet, have evolved a plastic physiology that matches the performance of their physiological systems, and their resistance to debilitation and death at extremes, to the prevailing conditions be it winter or summer. A well-understood molecular response mediating thermal adaptation is the induction by cold of the stearoyl-CoA desaturase (SCD). This enzyme catalyzes the desaturation of the 9–10 (Δ9) position of a C16 or C18 saturated hydrocarbon chain (16), and its activity controls the balance between saturated and monounsaturated fatty acids, thus influencing the physical properties of complex lipid systems. The SCD is also involved in the synthesis of signaling molecules, such as prostaglandins and leukotrienes (20), and functions in response to changes in lipid diet (16, 21, 30), this being the subject of much recent attention in nutrition research (9, 15, 32, 37).
Cold-induced increases in membrane lipid unsaturation occur in all domains of life and have been linked to compensatory increases in the “fluidity” of cellular membranes (13) and the conservation of important membrane properties. This so-called “homeoviscous adaptation” has been recorded in many different species, but despite progress in understanding the role of SCD in thermal responses (4, 23, 38, 47) the evidence linking SCD induction to altered thermal phenotype is entirely circumstantial and indirect. The common carp, Cyprinus carpio, expresses two hepatic SCD genes, which share 89% and 65% identity at the amino acid and nucleotide levels, respectively (30). Both genes are functional Δ9-desaturases (30, 43), but one is regulated by dietary fatty acid composition (30) whereas the other is regulated by temperature (43), consistent with divergence of the promoter regions controlling expression (11).
Adopting a comparative, evolutionary approach to relating SCD function to thermal adaptation, or exploring responses in genetic models such as zebrafish, requires a clear understanding of the pattern of evolution and homologous relationships of the different SCD isoforms within the lower vertebrates. Thus the two carp SCD isoforms may result from a tandem duplication, as in mice (18, 26, 29, 49), or may have arisen during a recent whole genome duplication event in the common carp lineage 12–16 million years ago (Mya) (8, 22). Alternatively, they may stem from an ancient whole genome duplication event in teleost fishes ∼320 Mya (7, 44). In the latter case, duplicated and functionally divergent SCD isoforms may be much more common among teleost fishes than currently realized.
In exploring molecular SCD diversity in vertebrates, we have uncovered a new clade of previously undescribed SCD genes in teleost fishes, a group comprising half of all living vertebrate species. This clade was initially explored by examining expressed sequence tags (ESTs) within the major sequence databases, but its proper resolution required the comparative syntenic analysis of genomic sequences of the chromosomal segments surrounding the two carp SCDs with that of other published genomic data and exploration of tissue-specific differences in SCD expression. We show that many teleost fish species, though not the common carp, possess SCD isoforms from both clades, consistent with their origin during an ancient teleost whole genome duplication event. We further propose that additional, more recent whole genome duplication events in the Cyprinidae and Salmonidae have given rise to additional isoforms, with up to four in a single species.
Identification of published SCD coding sequences.
Published SCD coding sequences from vertebrate species and from the purple urchin and the sea squirt (SpuSCD and CinSCD, respectively) with similarity to fish SCDs were identified through a GenBank BLASTx search (2) using the SCD coding sequence of grass carp (CidSCD).
Characterization of SCD coding sequence from fugu and zebrafish.
Genomic scaffolds with similarity to grass carp SCD identified by tBLASTx searching of the fugu and zebrafish genome database on ENSEMBL (14) were used to design primer sequences for the amplification of corresponding cDNAs (see Table 1). Total RNA from frozen fugu liver samples kindly donated by Ian Johnston (University of St Andrews, St Andrews, UK) and from whole zebrafish that had been killed by British Home Office-approved Schedule 1 methods was prepared with TRIzol reagent (Invitrogen, Carlsbad, CA) according to the manufacturer's instructions. The protocols adopted were approved by the ethical review process of the University of Liverpool Animal Welfare Committee and also examined and fully licensed by the Home Office Inspectorate (UK). One microgram of total RNA was then treated with RQ1 RNase-free DNase (Promega, Madison, WI) and reverse transcribed with 10 ng of random hexamers, 2 μM dNTPs (Bioline, Taunton, MA), and 200 U of Superscript II reverse transcriptase, following the manufacturer's procedures. cDNA was then diluted 50× and used for PCR amplification (95°C for 2 min; 30 cycles of 95°C for 20 s, X°C for 30 s, and 72°C for 1 min, where X denotes SCD-specific annealing temperatures listed in Table 1). Primers were also designed to amplify the 5′ and 3′ ends by rapid amplification of cDNA ends (RACE)-PCR (Table 1). The first strand was synthesized with primer sequences from the Creator SMART cDNA Library Construction Kit (Clontech, Mountain View, CA). For the second strand synthesis, a modified version of the SMART PCR primer (Clontech) was used with specially designed gene-specific primers to amplify the 3′ and 5′ ends. Amplification was achieved with 25 cycles of 95°C for 5 s, 58 or 62°C for 6 min, and amplicons were cloned into the pGEM-T Easy vector (Promega) and sequenced (Lark Technologies, Takeley, UK). Plasmid preparations, transformations, and other standard molecular biology techniques were carried out as described previously (36).
Identification of putative SCD genes in green pufferfish, medaka, and stickleback.
Genomic scaffolds containing DNA sequences similar to SCD were identified from tBLASTx searches of the green pufferfish, medaka, and stickleback genome databases on ENSEMBL (14). MapDraw (DNAstar, Madison, WI) was used to translate the open reading frames contained within each scaffold, and the characterized SCD genes were used to predict the coding sequences.
Assembly of full and partial SCD coding sequences from TIGR and NCBI.
The remaining sequences were obtained through BLASTn searches of the EST databases provided by the Gene Indices database (33, 34) (Dana-Farber Cancer Institute, Harvard University; http://compbio.dfci.harvard.edu/tgi/) and the EST resources available at the National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov/). Where more than one entry was found for a given species, multiple alignments were produced to identify overlapping sequences and the number of different transcripts. Where the full coding sequence was not available, partial sequences were used for phylogenetic reconstruction.
Alignment of protein and nucleotide sequences.
The ClustalX program (42) was used for multiple alignment of protein and nucleotide sequences. For nucleotide sequences, the codon alignment was further optimized manually with MacClade version 4 (24).
Two data sets, consisting of alignments from which all gapped positions were excluded, were analyzed by maximum parsimony (MP) and Bayesian inference (BI). The larger data set (L in Table 2) consisted of an 871-bp alignment of mostly complete coding nucleotide sequences of 42 vertebrate SCDs and an echinoderm and 2 urochordate sequences as outgroup. For SCD1 of the killifish and SCD1 and 2 of the roach, only partial sequences of 824, 634, and 652 bp, respectively, were available for alignment, and the remaining positions were coded as missing information. The smaller data set (S in Table 2) consisted of a 295-bp alignment of 29 teleostean SCD coding sequences from the predicted COOH-terminal end, with the corresponding sequence of Xenopus tropicalis as outgroup.
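The exclusion of gapped positions described above can be illustrated with a minimal Python sketch. It is not the procedure used in the study (the alignments were prepared with ClustalX and MacClade); the function name `strip_gapped_columns` and the toy sequences are hypothetical, and the sketch assumes that gaps are written as `-` while positions coded as missing information (for example `?`) are retained.

```python
def strip_gapped_columns(alignment, gap_chars="-"):
    """Return a copy of the alignment without any column that contains a gap.

    `alignment` maps sequence names to aligned strings of equal length.
    Characters not listed in `gap_chars` (e.g. '?' for missing data) are kept.
    """
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    names = list(alignment)
    # zip(*rows) walks the alignment column by column.
    columns = zip(*(alignment[name] for name in names))
    kept = [col for col in columns if not any(c in gap_chars for c in col)]
    return {name: "".join(col[i] for col in kept) for i, name in enumerate(names)}


# Toy alignment (hypothetical sequences, not SCD data).
toy = {"seqA": "ATG-CCA", "seqB": "ATGACTA", "seqC": "A-GACCG"}
stripped = strip_gapped_columns(toy)
print(stripped)
```

In the toy input, the second and fourth columns each contain a gap, so both are dropped from every sequence and five columns remain.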
MP trees were inferred with PAUP* Macintosh version 4.0 software (40). After separate analyses of substitution types and rates at each codon position, transitions at first and second codon positions were given twice the weight of transversions and at third codon positions transversions were given twice the weight of transitions. Trees were constructed with the heuristic method and stepwise addition. Taxa were added in random sequence with 10 replicates. To assess the internal support of tree branches, heuristic bootstrap analyses with 1,000 replicates were performed.
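The codon-position weighting described above can be made concrete with a short Python sketch. It only scores substitutions between two aligned coding sequences; it does not perform a parsimony tree search and is not a reimplementation of PAUP*. The function names are hypothetical, and the sketch assumes that the alignment starts at the first codon position and contains only A, C, G, and T.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def is_transition(a, b):
    """True for purine<->purine or pyrimidine<->pyrimidine changes."""
    return a != b and ({a, b} <= PURINES or {a, b} <= PYRIMIDINES)


def step_cost(a, b, codon_position):
    """Cost of changing base a into base b at codon position 1, 2, or 3."""
    for base in (a, b):
        if base not in PURINES | PYRIMIDINES:
            raise ValueError(f"unexpected character: {base!r}")
    if a == b:
        return 0
    transition = is_transition(a, b)
    if codon_position in (1, 2):
        # First and second positions: transitions weigh twice transversions.
        return 2 if transition else 1
    # Third position: transversions weigh twice transitions.
    return 1 if transition else 2


def weighted_distance(seq1, seq2):
    """Sum of weighted step costs over two in-frame aligned sequences."""
    if len(seq1) != len(seq2):
        raise ValueError("sequences must have the same length")
    return sum(step_cost(a, b, i % 3 + 1) for i, (a, b) in enumerate(zip(seq1, seq2)))


# Toy example (hypothetical sequences): four differences with costs 2 + 1 + 1 + 2.
print(weighted_distance("ATGCCA", "GTACGC"))
```

The toy pair differs by a transition at a first position (cost 2), a transition at a third position (cost 1), a transversion at a second position (cost 1), and a transversion at a third position (cost 2).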
Bayesian trees were inferred with MrBayes (35) after the most suitable model of DNA evolution was identified with the Modeltest computer program (31). The Tamura-Nei (41) model of evolution (TrN + I + G) was deemed most suitable, giving base frequencies of A = 0.2288, C = 0.2994, G = 0.2438, and T = 0.2280, a substitution rate matrix consisting of A-C = 1.0000, A-G = 2.5184, A-T = 1.0000, C-G = 1.0000, C-T = 3.3011, and G-T = 1.1805, proportion of invariable sites (I) = 0.1587, and, finally, a gamma distribution shape parameter of 1.0050. These parameters were implemented into the MrBayes program, and two independent analyses were performed with three heated chains and one cold chain. Chains were sampled every 100th generation. The two independent analyses were compared every 1,000th generation until convergence was reached (2,000,000 generations; SD < 0.01), and a final tree with posterior probabilities (given as percentage) and estimated branch lengths was produced from the data.
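As an illustration of how the reported parameters define a substitution model, the following Python sketch assembles an instantaneous rate matrix from the base frequencies and relative rates quoted above, using the common general time-reversible parameterization in which the off-diagonal entry for a change from base i to base j is the relative rate multiplied by the frequency of j. This is an assumption about parameterization, not a description of MrBayes internals; the matrix is left unscaled, and the proportion of invariable sites and the gamma shape parameter are not modeled. All names are hypothetical.

```python
BASES = "ACGT"

# Values as reported in the text.
FREQS = {"A": 0.2288, "C": 0.2994, "G": 0.2438, "T": 0.2280}
RATES = {
    frozenset("AC"): 1.0000,
    frozenset("AG"): 2.5184,
    frozenset("AT"): 1.0000,
    frozenset("CG"): 1.0000,
    frozenset("CT"): 3.3011,
    frozenset("GT"): 1.1805,
}


def build_rate_matrix(freqs, rates):
    """Return Q as nested dicts: Q[i][j] = rate(i, j) * freq(j) for i != j.

    Each diagonal entry is set so that its row sums to zero.
    """
    q = {}
    for i in BASES:
        row = {j: rates[frozenset((i, j))] * freqs[j] for j in BASES if j != i}
        row[i] = -sum(row.values())
        q[i] = row
    return q


Q = build_rate_matrix(FREQS, RATES)
for i in BASES:
    print(i, [round(Q[i][j], 4) for j in BASES])
```

Two quick sanity checks on such a matrix are that the base frequencies sum to one and that the frequency of i times Q[i][j] equals the frequency of j times Q[j][i], which is the time-reversibility condition.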
Bayesian and MP inferred trees of the large and small data sets were rooted with PAUP* and the respective outgroups indicated above.
Characterization of SCD subchromosomal regions.
To determine the order of genes surrounding the SCD genes, subchromosomal regions were analyzed with publicly available online resources of ENSEMBL and NCBI (see above). BLASTn searches of human, mouse, chicken, X. tropicalis, and all available fish genomes were performed to identify genomic contigs containing the previously identified SCD isoforms. The gene order upstream and downstream of the SCD genes was then established with the following genome builds: fugu, FUGU 4.0; green pufferfish, ASSEMBLY 7; medaka, ASSEMBLY HdrR; stickleback, ASSEMBLY BROAD S1; zebrafish, ASSEMBLY Zv7; X. tropicalis, JGI 4.1; chicken, WASHUC2; mouse, ASSEMBLY NCBI m37; rat, ASSEMBLY RGSC 3.4; human, NCBI 36.
Because no genomic resources were available for common carp and grass carp, fosmid libraries containing the complete genome of both species were produced with the EpiFOS library construction kit (Epicentre Biotechnologies), following the manufacturer's instructions (Evans H, De Tomaso T, Berenbrink M, Rogers J, Quail M, Matthews L, Cossins AR, Gracey AY, unpublished observation). Briefly, the libraries were divided into sublibraries, each containing ∼4,000 fosmid clones. Sublibraries containing CcaSCD1a and -1b and CidSCD were identified by PCR amplification of isoform-specific products, followed by identification of positive fosmid clones by plating out and screening on Genescreen Plus nylon membranes (PerkinElmer) with 32P-labeled isoform-specific PCR products. Positive clones were subcloned, sequenced, and assembled by published methods [http://www.sanger.ac.uk, following the links to Pathogen subcloning (Team 53) and Finishing (Team 51)]. After sequence assembly, the position of the SCD genes within each fosmid clone was confirmed.
Tissue-specific expression of zebrafish SCD isoforms.
The brain, heart, liver, and skeletal muscle were dissected from individual zebrafish acclimated to 25°C and snap frozen at −80°C until required. Total RNA was extracted from each tissue sample and reverse transcribed according to the procedures described for fugu and whole zebrafish. The coding sequences for DreSCD1 and DreSCD2 were used to design primers that were specific for each isoform, and PCR amplifications were performed on the individual tissue samples with the reagents and conditions described above (for primer sequences and melting temperature values, see Table 1). Primer specificity was confirmed by DNA sequencing of the PCR products from each tissue (Lark Technologies).
We previously showed (30) that common carp, whose lineage underwent a recent whole genome duplication 12–16 Mya, express two SCD isoforms; by contrast, only one isoform has been described in the grass carp Ctenopharyngodon idella (4), which matches its genomically unduplicated status. To further explore a possible link between whole genome duplication events and SCD diversity, we searched the genome of the zebrafish, another member of the carp family whose genome did not undergo a recent duplication. The previously annotated SCD cDNA sequence from zebrafish, denoted DreSCD1 (AY217090), was used to perform a tBLASTx search of the zebrafish genome database. Apart from the original search query, only one other scaffold (zKp115D2) produced a significant hit, indicating a separate gene, designated DreSCD2 (Danio rerio SCD, paralog 2). The DreSCD2 cDNA sequence was partially confirmed by PCR amplification, although neither 5′- nor 3′-RACE was able to establish the complete coding sequence. However, over the course of the present study, the full coding sequence was independently submitted to GenBank by computational methods as a putative SCD gene (accession no. XM_693620). This showed 73% and 71% identity with the DreSCD1 cDNA and amino acid sequences, respectively. PCR amplification using primers specific for DreSCD1 and DreSCD2 gave products of 200 and 760 bp, respectively. The PCR amplifications and their distribution in brain, heart, liver, and skeletal muscle are shown in Fig. 1. While the brain tissue amplified products for both DreSCD1 and DreSCD2, the heart and liver amplified only DreSCD1 and the skeletal muscle amplified only DreSCD2, thus establishing a tissue-specific expression of the two zebrafish isoforms.
To test whether the two zebrafish SCD isoforms were part of a more ancient gen(om)e duplication event, we searched the genome databases of fugu and the green pufferfish. These two euteleosts [sensu (17)] diverged from the ostariophysan carp and zebrafish lineage more than ∼150 Mya (3). The tBLASTx search of the fugu genome database identified two scaffolds (M000037 and M003466) containing segments showing high similarity to the grass carp SCD coding sequence CidSCD. The putative genes were denoted TruSCD1 and TruSCD2, respectively. The physical characterization of each coding sequence by RT-PCR amplification and 5′- and 3′-RACE generated sequences of 1,002 bp for both TruSCD1 and TruSCD2. The sequences were aligned with the scaffold data, the exon-intron boundaries were delineated, and the coding and predicted protein sequences were established from the genomic sequence. The completed cDNA sequences for TruSCD1 and TruSCD2 and their predicted protein products were submitted to GenBank (AY741383 and AY741384, respectively). Alignment of TruSCD1 and TruSCD2 showed 71.9% and 56.4% identity at the protein and nucleotide levels, respectively.
The two fugu SCD genes were then used to search the green pufferfish genome database for the corresponding homologs. Alignment of two scaffolds to each of the fugu isoforms enabled prediction of the two putative coding sequences, TniSCD1 and TniSCD2. Pairwise alignments of TniSCD1 with TruSCD1 and TniSCD2 with TruSCD2 gave 92% and 87.6% identities at the protein level and 88.9% and 89.1% identities at the cDNA level, respectively.
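Pairwise identities such as those reported above depend on how aligned positions are counted. The Python sketch below shows one common convention, in which columns containing a gap in either sequence are ignored; the text does not state which convention was used, so this is an assumption. The function name and the toy peptide strings are hypothetical and are not SCD sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    compared = 0
    matches = 0
    for a, b in zip(aligned_a, aligned_b):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        raise ValueError("no ungapped columns to compare")
    return 100.0 * matches / compared


# Toy pair: 8 ungapped columns, 6 of them identical.
print(percent_identity("MPDAE-VKT", "MPDSEQVRT"))
```

Counting gapped columns as mismatches instead would lower the reported value, which is why the convention should be stated alongside any identity figure.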
The multiple amino acid alignment of the confirmed SCD isoforms from common carp, grass carp, zebrafish, and fugu and the putative two isoforms from the green pufferfish shows that 150 of 323 amino acid residues are conserved in all 9 sequences (Fig. 2). The most relevant conserved residues are three histidine-containing regions, which are essential for catalytic activity (39). Also conserved are two regions containing large proportions of hydrophobic residues (positions 29–80 and 150–242 in CcaSCD1), which form putative membrane spanning domains (39). This sequence conservation among putative SCD (Δ9-desaturase) homologs contrasts with their pairwise alignment against the zebrafish Δ6-desaturase protein (12), which did not reveal any statistically significant similarity and in which the histidine-containing regions possess a markedly different sequence (HDFGHL, HFQHHA, and HLNFQIEHH) compared with the sequences shown in Fig. 2.
Phylogenetic analysis of SCD coding sequences.
To further explore the molecular divergence of SCD isoforms, additional homologs from teleost fishes and other vertebrates were identified by BLASTn searches of GenBank and the NCBI and Gene Indices EST collections (Table 2). The weighted MP analysis produced a single most parsimonious tree, which is compared with the tree obtained by Bayesian inference (BI) in Fig. 3. Both methods strongly support the pairing of the two common carp SCD isoforms as each other's closest relatives (100% bootstrap and posterior probability support). The clustering of zebrafish isoform DreSCD2 with one each of the two isoforms from roach and fathead minnow was also highly supported by both methods. The latter two isoforms have accordingly been denoted RruSCD2 and PprSCD2, respectively. The groupings of the two fugu SCD isoforms 1 and 2 with their respective counterparts in the green pufferfish were also highly supported by posterior probabilities of 100% and bootstrap values ≥99%. This also applied to the grouping of the two amphibian SCDs from X. laevis and X. tropicalis and, among the outgroup, of the two sea squirt SCDs from Ciona intestinalis and C. savignyi.
Both MP and BI grouped all mammalian SCD sequences in the alignment together with strong support values. The four mouse and two rat SCD isoforms grouped with the hamster sequence and nested together within the mammals, forming a rodent SCD clade (Fig. 3). The intrarelationships of this clade were poorly resolved, but its multiplied isoforms were clearly separated from any of the SCD isoforms of teleost fishes. Further within mammals, the SCD isoforms of bovids (cow, goat, and sheep) formed a clade strongly supported by both MP and BI. The bovid sequences were most closely related to the other cetartiodactyl isoform from pig as expected from phylogeny (3), although support for this arrangement was somewhat lower in the MP tree.
Despite the unanimous support for some of the relationships expected from the species phylogeny, the expected phylogenetic grouping of rodents and human together as sister group to cetartiodactyls plus dog (28) was not recovered with either analysis method. BI even gave high branch support values for grouping human and dog SCDs together as closest relatives to cetartiodactyl SCDs. Similarly, accepted deeper benchmark relationships among vertebrates, such as chicken as sister group to mammals, followed consecutively by Xenopus and then teleosts (3), were only partially recovered (Fig. 3).
Nevertheless, both tree construction methods clustered all fish SCDs into the same four distinctive clades (Fig. 3). All ostariophysan SCDs grouped into Ostariophysi 1 or 2, named after the respective zebrafish SCD isoforms that they contain. Similarly, all euteleostean SCDs group into Euteleostei 1 or 2, containing SCD1 or SCD2 of fugu and green pufferfish, respectively. These four clades were highly supported in the BI tree (99–100% posterior probabilities). In the MP tree the Euteleostei 1 and 2 clades had bootstrap support values of 80% and 100%, respectively, whereas the two ostariophysan clades had only values of 56% or lower (Fig. 3). MP and BI branch support values tended to decrease at deeper nodes, and the interrelationship between the four teleost SCD clades was poorly resolved.
To better cover the diversity of SCD sequences in the Euteleostei 1 clade and break up some long branches in the Euteleostei 1 and 2 clades, and thereby improve resolution, additional salmonid EST sequences with homology to SCD1 and SCD2 of other teleosts were included in the analysis. However, this came at the expense of sequence information because only a 295-bp alignment of the COOH-terminal coding portion of teleost SCDs was available. Bayesian analysis of this teleost data set with X. tropicalis as outgroup only recovered the Euteleostei 1 and Ostariophysi 1 and 2 clades with 75%, 100%, and 100% posterior probability support, respectively (Fig. 4). MP analysis, using the same weightings as in Fig. 3, only recovered the Ostariophysi 2 clade (75% bootstrap support). However, BI and MP both identified two distinct salmonid SCD clades, which were each supported by 100% bootstrap and posterior probability values. All members of the clade containing OmySCD1a of rainbow trout further shared a unique 6-nucleotide insertion at the COOH-terminal coding end, which was not found in any other SCD. This clade was accordingly named the SCD1 group and distinguished from the SCD2 salmonid group. Each of the two salmonid clades contained two rainbow trout SCD nucleotide sequences, OmySCD1a and b and OmySCD2a and b, which were 97% and 91% identical in within-group comparisons, respectively.
Syntenic mapping of genes surrounding SCD isoforms.
A potentially important diagnostic feature that can discriminate between different models of gene evolution is the order of gene placement on chromosomes. We have determined the order of putative genes surrounding the SCD by interrogation of existing genomic sequence databases. For the nonsequenced species, namely the common and grass carp, we have generated fosmid libraries from which 30- to 38-kb insert clones containing the SCD isoforms were isolated and sequenced. Figure 5 shows details of the genomic environment of common carp SCD1a and SCD1b and grass carp SCD, as obtained by homology searches of the respective fosmid clones (Supplemental Table S1). All three SCDs are flanked immediately downstream by several exons homologous to the human SEC31B gene on the reverse strand. Upstream and on the reverse strand CcaSCD1a and CidSCD are flanked by several exons homologous to the human DNAJB12 gene. In addition, there is some evidence for remnants of a past transposition event between DNAJB12 and CidSCD of the grass carp (Fig. 5). Of the three fosmids isolated, only carp f-c114 failed to provide sufficient sequence data to identify a DNAJB12 homolog immediately upstream of CcaSCD1b. Rescreening was therefore performed to isolate a second fosmid clone covering this section of the common carp genome. Once this clone was isolated, the presence of a DNAJB12 homolog was confirmed by PCR using primers specific for a conserved region of the DNAJB12 gene, without needing to sequence the whole fosmid. This indicates that DNAJB12 is also located within a distance of 30–38 kb (the typical fosmid clone length) upstream of CcaSCD1b.
Figure 6 illustrates schematically the order of known and putative genes surrounding the different SCD isoforms in teleost and tetrapod genomic model species. As can be seen, the duplicated SCD isoforms of all teleosts for which genomic information was available are located on different chromosomes, scaffolds, or large-insert clones. However, after allowing for specific gaps in gene order and transversion events, the different SCD-containing chromosomal fragments of teleosts aligned to single syntenic blocks in representative tetrapod genomes. Thus SCD1- and SCD2-containing chromosomes 12 and 13 of zebrafish, chromosomes 2 and 17 of green pufferfish, and chromosomes 19 and 15 of medaka all aligned to a small number of syntenic blocks on human chromosome 10 and chicken chromosome 6. In mouse, corresponding syntenic blocks were found on chromosome 19 (containing the 4 SCD paralogs) and on chromosome 10. In Xenopus, in which genomic scaffolds are not yet linked to chromosomal regions, blocks of genes from scaffolds 204 and 179 aligned to the teleost SCD1- and SCD2-containing chromosomal segments. Importantly, the teleost SCD1- and SCD2-containing segments differed characteristically from each other in the genes that were missing in the syntenic alignment with tetrapods. Thus chromosomal fragments of the teleost SCD1 group were distinct from those in tetrapods and the teleost SCD2 group by all lacking homologs to the human genes CUEDC2, HIF1AN, WNT8B, TYSND1, and EIF4EBP2 in the syntenic alignment. Similarly, all members of the teleost SCD2 group lacked homologs of the human genes FBXL15, NDUFB8, SEC31B, C10orf104, ASCC1, LRRC20, and NODAL at the respective position in tetrapods and the SCD1 group (Fig. 6).
The 30- to 38-kb chromosomal fragments that contained the SCD isoforms of the nongenomic model species, the common carp and the grass carp, allowed us to clearly assign these SCDs to the SCD1 group of teleostean genomic model species. This was based on the presence of the gene SEC31B and the absence of the genes WNT8B and CXorf34 in the syntenic alignment compared with the SCD2 group. Thus the two common carp SCD isoforms were located on different chromosomal segments but showed the same gene order as that found for the SCD1 clade of genes. This contrasts with the four mouse SCD isoforms, which were present as four tandem genes on a single chromosome (Fig. 6). In contrast to HsaSCD, we were unable to find homologs of the recently described second functional human SCD isoform in BLASTn searches of the genomes of fish model species (SCD5 of Refs. 45, 48). This second human SCD is located on chromosome 4 next to SEC31A (48) but otherwise showed no synteny to the other vertebrate SCDs analyzed in Fig. 6 (not shown).
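The logic of assigning an SCD-containing fragment to the SCD1 or SCD2 group from its neighboring genes can be sketched in Python as follows. The two gene sets are those reported as missing from the teleost SCD1 and SCD2 groups in the syntenic alignment (Fig. 6); the function `assign_group` and the example gene sets are hypothetical. The sketch relies only on the presence of diagnostic genes, on the assumption that absence from a 30- to 38-kb clone is weaker evidence than absence from a whole chromosome, so it is a simplification of the comparison described above.

```python
# Human gene homologs reported as missing from each teleost group.
MISSING_IN_SCD1 = {"CUEDC2", "HIF1AN", "WNT8B", "TYSND1", "EIF4EBP2"}
MISSING_IN_SCD2 = {"FBXL15", "NDUFB8", "SEC31B", "C10orf104", "ASCC1", "LRRC20", "NODAL"}


def assign_group(observed_genes):
    """Assign a fragment to 'SCD1' or 'SCD2' from the diagnostic genes it contains.

    A gene that is missing from the SCD2 group can only be retained in the
    SCD1 group, and vice versa. Fragments with no diagnostic gene, or with
    conflicting ones, are reported as 'undetermined'.
    """
    observed = set(observed_genes)
    scd1_markers = observed & MISSING_IN_SCD2  # retained only in the SCD1 group
    scd2_markers = observed & MISSING_IN_SCD1  # retained only in the SCD2 group
    if scd1_markers and not scd2_markers:
        return "SCD1"
    if scd2_markers and not scd1_markers:
        return "SCD2"
    return "undetermined"


# The two loss patterns are complementary: no gene is missing from both groups.
print(MISSING_IN_SCD1.isdisjoint(MISSING_IN_SCD2))

# A fragment carrying DNAJB12, SCD, and SEC31B, as described for the carp fosmids.
print(assign_group({"DNAJB12", "SCD", "SEC31B"}))
```

Because DNAJB12 is not in either diagnostic set, the assignment of the example fragment rests entirely on SEC31B.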
The second, distinctive SCD gene that we identified in the zebrafish genome (DreSCD2) was independently predicted by the GenBank gene prediction algorithm. This led us to seek potential SCD paralogs in the genomes of two closely related members of the highly derived pufferfish order Tetraodontiformes, resulting in the cloning of TruSCD2 and the designation of TniSCD2 to a previously unannotated transcript for fugu and green pufferfish, respectively. These genes were very similar to each other, but substantially different in coding sequence from the other SCD isoforms already identified in the same two pufferfish species (TruSCD1 and TniSCD1). This is consistent with them arising from a duplication event that precedes the divergence of the two species and therefore represents a new and very distinctive class of SCD genes. Phylogenetic analysis firmly clusters these genes with EST sequences and predicted SCD transcripts from stickleback, killifish, and medaka in a Euteleostei 2 clade. In contrast, TruSCD1 and TniSCD1 cluster with a number of different euteleostean SCD cDNAs, ESTs, tentative EST consensus sequences, predicted transcripts, and unannotated transcripts.
In contrast, the previously known zebrafish DreSCD1 clustered with a group of different ostariophysan SCDs in an Ostariophysi 1 clade. However, the inclusion of DreSCD1 in this group was poorly supported in the MP tree, and the high posterior probability value for this clade in the Bayesian tree needs to be viewed with caution, because this method also supported some highly unusual tree topologies among tetrapod SCDs, which were used as a benchmark for comparison with the topology expected from species phylogeny. Thus the relationship of DreSCD1 to the other teleost fish SCD coding sequences and the interrelationships between the two pairs of paralogous SCD clades, Ostariophysi 1 and 2 and Euteleostei 1 and 2, could not be resolved with confidence by phylogenetic reconstruction alone.
Notwithstanding incompletely resolved phylogenetic trees, the analysis of gene order surrounding the vertebrate SCD isoforms strongly supports the inclusion of DreSCD1 as an ortholog in an Ostariophysi 1 + Euteleostei 1 clade, as suggested in the Bayesian reconstruction. This clade is paralogous to an Ostariophysi 2 + Euteleostei 2 clade, which contains DreSCD2 and the novel fugu and green pufferfish SCD2 isoforms, among others. These groupings are based on the shared absence of a number of genes in the alignment of subchromosomal fragments relative to tetrapods and the other clade of teleost SCDs. Because the missing genes are not contiguous in the genomic sequence, the pattern of presence and absence of all these single genes cannot be explained by a common large insertion/deletion event. The mosaic of complementary deletions rather supports a scenario of a large chromosomal duplication event early in the teleost lineage and subsequent loss of some duplicated genes in a seemingly random fashion from one or the other of the duplicated chromosomal regions.
The duplication of SCD-containing subchromosomal fragments in teleosts may well have been part of the proposed additional whole genome duplication event in teleost fishes compared with tetrapods, which is claimed to have enabled the large radiation of teleost fishes, a group comprising ∼50% of all living vertebrates (see, e.g., Refs. 7, 19, 44, but see Ref. 10).
Our interpretation of the origin of the ancient duplicated SCD genes in teleosts is independently corroborated by a reconstruction of the evolution of the human, zebrafish, medaka, and pufferfish genomes (19). This study suggests that large parts of the SCD-containing chromosomes 12 and 13 in zebrafish, 15 and 19 in medaka, and 2 and 17 in pufferfish arose from a single protochromosome in the last common ancestor of humans and teleosts, which also gave rise to the SCD-bearing chromosome 10 in humans (19). The second human SCD gene SCD2/5 also appears to be present in other primates but absent in rodents and teleost fishes (Ref. 45; present study). Poor synteny with the chromosomal fragments containing the other vertebrate SCDs may be due to an origin by an even more ancient chromosomal duplication event in a common ancestor of teleosts and tetrapods with subsequent losses in teleosts or rodents. Alternatively, lack of synteny may be due to a small-scale tandem duplication event with subsequent translocation. Information from the ongoing mammalian and lower vertebrate genome sequencing projects may help to solve this question.
We previously identified a second SCD isoform in the common carp (30, 43). This isoform is very similar to the original carp SCD, more so than either is to the other SCD genes from close cyprinid relatives in the Ostariophysi 1 clade. This is consistent with a recent gene duplication event occurring subsequent to the divergence of the common carp from the other Ostariophysi, which corresponds with the proposed whole genome duplication event in the ancestor of the common carp dated 12–16 Mya (8, 22). The localization of the duplicated common carp SCDs to different, nonoverlapping fosmid clones, albeit with conserved gene order compared with grass carp and other SCD1-containing subchromosomal fragments, supports this interpretation and increases the reliability of an Ostariophysi 1 + Euteleostei 1 clade in the phylogenetic reconstructions.
Thus we distinguish at least two separate SCD duplication events that have resulted in the multiple isoforms now found in the two pufferfishes, the common carp, and the zebrafish. The first was an ancient duplication, which gave rise to the SCD1 and SCD2 paralogs of teleosts, as reflected in the two highly divergent SCD1 and SCD2 genes found in genomic model species such as fugu, green pufferfish, stickleback, medaka, and zebrafish, but also in large EST collections such as those from killifish, salmon, trout, roach, and fathead minnow. The second is a recent duplication event in a small group (Ostariophysi 1) containing the common carp SCD1a and SCD1b genes and possibly other cyprinid SCDs from species possessing double the usual number of chromosomes (crucian carp and goldfish).
We provide evidence for another recent duplication of the ancient set of SCD genes in the trout lineage, which may accordingly contain up to four different SCD genes (OmySCD1a and b and OmySCD2a and b). Interestingly, a whole genome duplication event is thought to have occurred 25–100 Mya in the salmonid lineage, including salmon, trout, and whitefish (1). The clustering of all known salmonid SCDs into two highly divergent and therefore presumably ancient groups is strongly supported in the phylogenetic reconstructions and further strengthened by a unique 6-nucleotide insertion at the COOH-terminal end of all salmonid SCD1 sequences, which is found in no other vertebrate SCD. In contrast, the differences between OmySCD1a and b and between OmySCD2a and b rest on only a few nucleotide substitutions in a short alignment of EST sequences spanning the relatively well-conserved COOH-terminal SCD coding sequence. Thus high-quality, full-length coding sequences are necessary to confirm the existence of four different SCD genes in trout. In addition, genomic localization or synteny information would be necessary before any secondary SCD gene duplication in salmonids could be attributed to a large chromosomal or whole genome duplication event.
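To show why a few substitutions in a short alignment give only weak support, the following minimal sketch counts mismatches between two aligned sequences and reports the proportion of differing sites (the uncorrected p-distance). The sequences are arbitrary placeholders, not trout EST data, and `p_distance` is a hypothetical helper; it is not the method used for the phylogenetic reconstructions in this study.

```python
def p_distance(seq_a, seq_b, gap="-"):
    """Return (differences, compared_sites, proportion) for two aligned sequences.

    Columns where either sequence has a gap are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to equal length")
    compared = differences = 0
    for x, y in zip(seq_a.upper(), seq_b.upper()):
        if x == gap or y == gap:
            continue                  # ignore gapped columns
        compared += 1
        if x != y:
            differences += 1
    if compared == 0:
        raise ValueError("no ungapped sites to compare")
    return differences, compared, differences / compared


# Placeholder sequences for illustration only.
diffs, sites, proportion = p_distance("ATGGCCTTAGGC", "ATGGCTTTAGGC")
```

With so few compared sites, a single sequencing error changes the estimate substantially, which is one reason full-length, high-quality sequences are needed before such small differences are taken as evidence of distinct genes.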
In contrast to teleosts, the multiple SCD genes in the rodent lineage, which show tissue-specific expression patterns (25), likely arose by tandem duplications of a single SCD gene. This is based on the arrangement of the four mouse SCD genes next to each other on a single chromosomal fragment that otherwise shows the same gene order as the respective fragments of human, chicken, and X. tropicalis.
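The distinction drawn here between tandem duplicates and duplicates lying on separate chromosomal fragments can be expressed as a simple rule on gene order. The sketch below is illustrative only: the fragment names, gene labels, and the function `classify_duplicates` are hypothetical, and real analyses would need to handle intervening genes, inversions, and incomplete assemblies.

```python
def classify_duplicates(gene_orders, family):
    """Classify the arrangement of gene-family members.

    gene_orders: dict mapping a fragment name to its ordered list of gene labels.
    family: set of labels that belong to the gene family of interest.
    """
    hits = [(frag, i)
            for frag, genes in gene_orders.items()
            for i, gene in enumerate(genes)
            if gene in family]
    if len(hits) < 2:
        return "single-copy"
    if len({frag for frag, _ in hits}) > 1:
        return "dispersed"            # copies lie on different fragments
    positions = sorted(i for _, i in hits)
    # Copies are tandem if they occupy one uninterrupted run of positions.
    contiguous = positions[-1] - positions[0] == len(positions) - 1
    return "tandem" if contiguous else "same-fragment, interrupted"


# Placeholder gene orders, not real annotations.
family = {"scdA", "scdB", "scdC", "scdD"}
rodent_like = {"frag_m": ["a", "b", "scdA", "scdB", "scdC", "scdD", "c", "d"]}
teleost_like = {"frag_t1": ["a", "scdA", "c"], "frag_t2": ["b", "scdB", "d"]}

rodent_call = classify_duplicates(rodent_like, family)
teleost_call = classify_duplicates(teleost_like, family)
```

Under these assumptions the first arrangement is called tandem and the second dispersed, mirroring the contrast between the mouse SCD cluster and the teleost SCD1 and SCD2 fragments.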
Despite the postulated ancient whole genome duplication in teleosts and the presence of ancient SCD duplicates in all teleost genomic model species, we found no evidence for the expression of an SCD2 homolog in our large common carp EST collection (46). Similarly, only a single SCD has so far been found in grass carp, rare minnow, milkfish, catfish, icefish, and tilapia. Because none of these species has been genomically sequenced, the known SCD diversity is based solely on the available EST sequence data, which for most of these species are not substantial. As additional sequence data become available (6), evidence for ancient duplicates of SCD may come to light, although the generation of 600,000 reads from a GS FLX sequencer failed to yield any evidence of another SCD isoform (Hughes MA, Hall N, and Cossins AR, unpublished observations).
At present, however, it appears that in some cases one or the other of the SCD1/2 duplicates generated by the ancient duplication has subsequently been lost, and in the carp lineage the SCD2 paralog appears to have been lost sometime after its divergence from the zebrafish lineage. Such gene losses could have occurred by a number of different events, ranging from the deletion of large chromosomal segments to the accumulation of mutations to form new functions or produce noncoding pseudogenes. Nevertheless, the more recent duplication in the carp lineage gave rise to two differentially expressed SCD isoforms, induced by cold or diet (30).
In summary, we have identified two different paralogous clades of SCDs in the teleost fish radiation, which are likely to have arisen by an ancient duplication event. This event preceded and is entirely separate from the more recent duplication that gave rise to the two coexpressed hepatic isoforms in the common carp. In each case the conservation of the two duplicates is associated with the separation of their functional roles (11). Thus we demonstrate tissue-specific expression for each of the ancient duplicated zebrafish isoforms, while we previously linked the carp duplication to different promoter sensitivities and inducibilities (30). Interestingly, the salmonid groups appear to have retained a set of four paralogs generated from both ancient and more recent duplications, and again we expect that all four would have distinct properties. However, the syntenic data do not exist to test their relationship with the SCD1/2 dichotomy. Given the lack of sequence data, we confidently predict that more SCD isoforms are still to be discovered in other fish species. A precise understanding of SCD diversity and distribution is required 1) to guide the mechanistic investigation of isoform function and tissue distribution and how these properties evolved in relation to cold-induced SCD induction, 2) to guide ablation or transfection strategies for assessing the effects of manipulated expression on the environmental phenotype and on phenotypic plasticity, and 3) to avoid compensatory responses of undiscovered isoforms after gene knockout or knockdown. This last issue proved important in a recent analysis of the influence of cold-inducible SCD on a cold-tolerance phenotype in Caenorhabditis elegans (27).
This work was funded by a grant from the Natural Environment Research Council (UK).
We thank Hugues Crollius (Ecole Normale Supérieure, Paris, France) for providing sequence data before publication, Ian Johnston (University of St Andrews) for providing tissue from fugu, and Spencer Polley (London School of Hygiene and Tropical Medicine, London, UK) for helpful discussions.
Present address of A. Y. Gracey: University of Southern California, Los Angeles, CA.
1 The online version of this article contains supplemental material.
Address for reprint requests and other correspondence: M. Berenbrink, School of Biological Sciences, Biosciences Bldg., Univ. of Liverpool, Crown St., Liverpool L69 7ZB, UK (e-mail:).
The costs of publication of this article were defrayed in part by the payment of page charges. The article must therefore be hereby marked “advertisement” in accordance with 18 U.S.C. Section 1734 solely to indicate this fact.
Copyright © 2008 the American Physiological Society