The mouse is a premier experimental organism that has contributed significantly to our understanding of vertebrate biology. Manipulation of the mouse genome via embryonic stem (ES) cell technology makes it possible to engineer an almost limitless repertoire of mutations to model human disease and assess gene function. In this review we outline recent advances in mouse experimental genetics and provide a “how-to” guide for those wishing to access this technology. We also discuss new technologies, such as transposon-mediated mutagenesis, and resources of targeting vectors and ES cells, which are likely to dramatically accelerate both the pace at which we can assess gene function in vivo and the progress of forward and reverse genetic screens in mice.
- gene targeting
- embryonic stem cells
Technologies for modifying the mouse genome can be broadly split into two classes: technology for gene-driven analyses, such as that used for knocking out genes or precisely modifying them, and technology used to generate random mutations across the genome or in selected regions of the genome. In this review we will initially discuss gene-driven approaches of modifying the mouse genome and then discuss approaches that may be exploited for random mutagenesis.
The most commonly used gene-driven approach in the mouse is gene targeting. The derivation of embryonic stem (ES) cells (21, 45) and the demonstration that homology-directed recombination could be accomplished in these cells (19, 75) heralded a new era in biology. We will discuss how gene targeting can be used to generate gain- and loss-of-function alleles in mice (either in specific tissues or in the whole animal) and also how to engineer large-scale rearrangements of entire mouse chromosomes. While loss-of-function alleles are usually the first step toward understanding gene function, only a subset of human disease-causing alleles are loss-of-function alleles. In many cases disease-causing mutations generate hypomorphic or neomorphic alleles. Recently oligo-based gene targeting approaches have emerged that potentially represent a significant advance in our ability to rapidly generate precise base pair changes or allelic series in a target gene. We will discuss these approaches and their application to modeling human disease. Finally we will discuss progress in the use of small hairpin RNA (shRNA) as a tool for the rapid assessment of “knockdown” phenotypes in the mouse and advances in the use of lentiviral vectors and bacterial artificial chromosomes (BACs) for transgenesis.
While many investigators have the ability to generate knockout mice by themselves or within core facilities, access to mouse mutants can still be limiting and expensive. For this reason several large gene-trap and gene targeting programs have been established. These include existing gene-trap resources provided by several groups under the auspices of the International Gene Trap Consortium (IGTC) (51) as well as the Knockout Mouse Project (KOMP) funded by the National Institutes of Health (NIH), the European Conditional Mouse Mutagenesis Program (EUCOMM) funded by the European Union, the North American Conditional Mouse Mutagenesis Project (NorCOMM) funded by Genome Canada, and the Sanger Institute's Mouse Genetics Programme (MGP), which is a Wellcome Trust Initiative. We will outline what reagents these resources have to offer and how to access them.
The abovementioned approaches represent reverse genetic (gene-driven) strategies for assessing gene function. However, all mouse geneticists are familiar with the disappointment of spending months or years painstakingly engineering a mouse mutant only to discover that the mouse lacks the desired phenotype or any obvious discernible phenotype at all. For this reason many researchers have taken to using forward genetic (phenotype-driven) strategies where the genome is mutated at random and mutant offspring are screened for phenotypes of interest, usually through a battery of tests. Classically these approaches have involved the use of chemical mutagens such as N-ethyl-N-nitrosourea (ENU), ethylmethanesulfonate (EMS), or chlorambucil, or the use of ionizing radiation. The limitation of these mutagenic strategies, however, is the difficulty of finding the mutations and then proving that they are causal. Powerful gene-trapping and transposon-based approaches have also been developed (17, 25, 80). The advantage of such approaches is that the transposon or gene trap represents a DNA “tag” so the phenotype of the mutant can be rapidly correlated with the presence of an insertion event within a mutated gene. Each of these mutagens has distinct advantages and disadvantages, which we will discuss in this review. We will also direct the reader to collections of mutants and provide links to facilitate access to repositories of phenotyping data.
REVERSE GENETIC (GENE-DRIVEN) APPROACHES OF MODIFYING THE MOUSE GENOME
Overview of ES Cell Technology and Gene Targeting
Genetic background of ES cells.
The development of ES cell technology whereby pluripotent ES cells can be removed from a preimplantation embryo, modified in culture, and then reintroduced into a host preimplantation embryo to colonize the germline has been a profound advance in mouse genetics. An overview of the procedure is provided in Fig. 1.
Historically the ES cell lines used for gene targeting experiments have been derived from substrains of 129 because ES cell lines from these genetic backgrounds have been shown to efficiently colonize the mouse germline (69). Commonly used 129-derived ES cell lines include E14 (129OlaHsd) (34), AB2.2 (129S7) (57), R1 (129X1/SvJ × 129S1) (50), and J1 (129SvJae) (43). However, when the mouse genome was sequenced, C57BL/6J was selected as the reference strain because of its central role in mouse genetics. Although 129 substrains and C57BL/6J are both classed as “common laboratory strains” and are derived from a common pool of founders (Mus musculus, Mus domesticus, and Mus castaneus), there are distinct differences in their genomes both at the nucleotide level and when copy number variants are compared between them (30, 81). These genomic differences manifest as phenotypic differences for a range of parameters (73).
As shown in Fig. 1, during the process of making knockout mice it is usual to inject genetically modified 129-derived ES cells into host C57BL/6J blastocysts and to transmit the modified allele onto a C57BL/6J background. Although this approach is advantageous, because coat color markers can be used to score for germline transmission, the resultant offspring are 129-C57BL/6J hybrids. For this reason, and because the reference genome sequence is from C57BL/6J, many investigators have attempted to derive C57BL/6J ES cell lines. One commonly used C57BL/6J-derived line is Bruce4, although strictly speaking it is a congenic ES cell line because it was made from the mouse strain C57BL/6J-Thy1.1 (39). More recently, a substrain of C57, termed C57BL/6N, was found to yield reliable ES cells (63). The C57BL/6N strain was derived from C57BL/6J stock originating from the NIH and has been maintained in isolated breeding colonies for at least 160 generations. Jackson Labs, Charles River, and Taconic have maintained separate colonies of this strain for several decades. It is likely that each of these colonies has diverged significantly from C57BL/6J and from each other, which should be taken into consideration given that important phenotypic differences between C57BL/6J and C57BL/6N have been reported (18, 71). When C57BL/6N ES cells are used for gene targeting they may be injected into albino C57BL/6J blastocysts and the male chimeras crossed with albino C57BL/6J females; the resultant F1 offspring, which will be black if they are ES cell derived, should be considered hybrids. It is also possible to breed male C57BL/6N chimeras with C57BL/6N females and transmit onto a “pure” C57BL/6N background, although if this strategy is employed all of the offspring will need to be genotyped because it will no longer be possible to score germline transmission by coat color.
Several C57BL/6N ES cell lines have been made including B6/Blu-1 (unpublished; generated by Tim Ley from Washington University, St. Louis, MO). B6/Blu-1 ES cells have been elegantly designed to carry a LacZ marker, which allows chimerism to be scored by LacZ staining of peripheral blood smears. Modified B6/Blu-1 ES cells may be injected into C57BL/6N or C57BL/6J blastocysts and male chimeras mated with C57BL/6N females resulting in pure C57BL/6N offspring. Additional C57BL/6N ES cell lines have been made by investigators participating in the EUCOMM and KOMP programs.
The vast majority of the >5,000 knockouts that have been made over the last 20 yr have been generated by microinjection of modified ES cells into host blastocysts and then breeding the chimeric offspring for germline transmission. Over the last 10 yr an alternative approach has emerged, termed tetraploid complementation (50). As shown in Fig. 1, this approach involves passing an electric current through two-cell preimplantation mouse embryos to fuse them together. This renders these embryos 4N and, as such, able to contribute to trophoblast and primitive endoderm lineages but unable to participate in forming the developing embryo. After electrofusion, the embryos are cultured through to blastocysts, microinjected with ES cells, and then transplanted into pseudopregnant females. Alternatively a technique called “aggregation” can be used whereby ES cells are cocultured with electrofused two-cell embryos that have had their zona pellucida removed and they become part of the blastocyst as it forms. The ES cell lines most widely used for tetraploid complementation are either hybrids between 129 substrains such as R1 (47) or hybrids between a 129 substrain and C57BL/6J, such as G4 (26, 68). Analysis of mice generated by using tetraploid complementation is usually performed on a hybrid background. It is possible to “double target” ES cells to disrupt both alleles of a gene, or to target a transgene, and then generate totally ES cell-derived embryos (64, 68). With this approach it is possible to phenotype mice without the need for germline transmission. The drawback of this approach is that the rate of generation of tetraploid offspring can be low, and therefore it may be problematic to generate large cohorts of mice for phenotyping.
Types of targeting vectors.
There are four classic types of targeting vectors: replacement, knock-in, conditional, and insertional, as shown in Fig. 2. To generate most alleles replacement vectors are used, and insertion vectors are usually employed only as part of chromosome engineering-based approaches. Whereas replacement vectors undergo double reciprocal recombination events when they are targeted into the genome, insertional targeting vectors undergo a single reciprocal recombination event and, as such, target the genome at a high frequency (32). However, the caveats of insertional targeting vectors are that they 1) generate complex rearrangements of the genome by duplicating their homology and 2) rely on gene rearrangement by interfering with splicing to disrupt a gene, as opposed to gene ablation, which can be achieved with replacement vectors (Fig. 2, A and E). There are also elegant targeting vector designs that allow multiple aspects of a gene's function to be assessed with one targeting event, and these strategies are made possible by using the site-specific recombinase systems, Cre and Flp (58, 60) (Fig. 2, C and D).
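The difference between the double and single reciprocal recombination outcomes can be sketched with a toy model in which a locus is a list of named segments. This is only an illustration under simplifying assumptions: the segment names and the functions `replacement_targeting` and `insertion_targeting` are hypothetical and are not taken from the figures or from any published tool.

```python
def replacement_targeting(locus, left_arm, right_arm, cassette):
    """Double reciprocal recombination (toy model): everything between
    the two homology arms is replaced by the cassette."""
    i = locus.index(left_arm)
    j = locus.index(right_arm)
    if i >= j:
        raise ValueError("left arm must lie upstream of right arm")
    return locus[: i + 1] + [cassette] + locus[j:]


def insertion_targeting(locus, homology, vector_payload):
    """Single reciprocal recombination (toy model): the whole vector
    integrates, so the homology is duplicated on either side of it."""
    i = locus.index(homology)
    return locus[: i + 1] + list(vector_payload) + [homology] + locus[i + 1 :]


# Hypothetical locus; the names are placeholders, not real genomic features.
locus = ["upstream", "region1", "exon", "region2", "downstream"]

replaced = replacement_targeting(locus, "region1", "region2", "PGK-neo")
inserted = insertion_targeting(locus, "exon", ["PGK-neo", "backbone"])

print(replaced)  # the exon between the arms is gone
print(inserted)  # the exon now appears twice, flanking the vector
```

In this sketch the replacement event ablates the segment between the arms, whereas the insertion event leaves the targeted segment present in two copies, which is the duplication of homology referred to above.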
The first consideration when designing a targeting vector is the selection marker to be used. Most commonly the neomycin resistance gene (neo) is driven by the strong phosphoglycerate kinase (PGK) promoter. ES cells transfected with vectors carrying a neomycin cassette can be selected in the drug G418. Other commonly used selection markers include puromycin (puro), blasticidin (blasto/Bsd), and hygromycin (hygro), and cells transfected with vectors carrying these markers can be selected in the drugs puromycin, blasticidin, and hygromycin, respectively. Each selection agent has different properties, and ES cells behave differently under different drug selection conditions. For example G418 selection is slow, and it takes up to a week for resistant colonies to emerge. In contrast puromycin and blasticidin selection kills cells very quickly, and significant cell death is evident within 72 h of these drugs being added to the culture medium. In our hands a construct carrying a neomycin marker driven by PGK will give more colonies after selection than the same construct carrying a puromycin or blasticidin marker driven by PGK. Fusions between herpes simplex virus thymidine kinase (tk) and both neomycin and puromycin have been reported (8, 9, 78), and these allow the removal of selection cassettes by negative selection in either 1-(2-deoxy-2-fluoro-1-β-D-arabinofuranosyl)-5-iodouracil (FIAU) or ganciclovir. Truncations of tk have been developed (termed Δtk) and fused to puromycin to circumvent the male sterility associated with expression of full-length tk (9).
One aspect worth considering in the design of a targeting vector is that selectable markers, particularly neomycin, have been reported to act as silencers of genes near the targeted locus (55). While no ill effects have been reported from alleles carrying puromycin, blasticidin, or hygromycin, there has been speculation that in some instances the PGK promoter used to drive most selection markers may function bidirectionally, and in at least one case this has been shown to result in the formation of an antisense transcript (59). Thus it is prudent to remove selection markers from a targeted locus before analyzing the phenotype of the mutant mice.
Source of DNA for the gene-targeting vector.
One of the critical determinants of gene targeting efficiency is the length of homology between the genome and the targeting vector. Generally speaking, the longer the homology the higher the targeting frequency (31), although larger targeting vectors can become unwieldy and may potentially become rearranged in Escherichia coli. In recent years significant resources have become available for generating targeting vectors. For example genome-wide end-sequenced bacterial artificial chromosome (BAC) libraries now exist for C57BL/6J (53) and 129S7 strains (4), and both of these resources are available via the Ensembl genome browser (24) (www.ensembl.org). Other large insert libraries are also available for the 129/OlaHsd strain (52) from which the commonly used E14 ES cell line is derived, although 129S7 DNA has been successfully used to target E14 ES cells, at least at some loci. Similarly, we routinely use DNA from C57BL/6J BAC libraries to target genes in C57BL/6N ES cells. Other factors that are believed to influence targeting frequency include the level of expression of the target gene (with more highly expressed genes thought to be easier to target) and the chromatin environment surrounding the target locus, although neither of these theories has been formally tested.
Construction of targeting vectors.
In the past, targeting vectors were usually constructed by restriction digest of DNA fragments, followed by standard subcloning into a plasmid-based targeting vector. Some investigators prefer to use PCR-based approaches to generate homology arms for gene targeting, although there is always the concern that PCR may introduce errors into the amplified fragment. If PCR is used to generate a targeting vector it is wise to use a proofreading “high-fidelity” polymerase and to at least sequence the exons and splice junctions on the amplified fragment.
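The check suggested above, comparing the sequenced PCR product with the reference over the exons and splice junctions, can be sketched as follows. This is a minimal illustration that assumes the two sequences are already aligned and of equal length (no insertions or deletions); the function name, the `flank` parameter, and the example sequences are all hypothetical.

```python
def mismatches_in_regions(reference, amplified, regions, flank=2):
    """Return sorted 0-based positions where `amplified` differs from
    `reference`, restricted to the (start, end) regions, each widened by
    `flank` bases on both sides to cover the adjacent splice junctions.
    Regions use half-open coordinates: start inclusive, end exclusive."""
    if len(reference) != len(amplified):
        raise ValueError("sequences must be pre-aligned and of equal length")
    hits = set()
    for start, end in regions:
        lo = max(0, start - flank)
        hi = min(len(reference), end + flank)
        for pos in range(lo, hi):
            if reference[pos].upper() != amplified[pos].upper():
                hits.add(pos)
    return sorted(hits)


# Made-up 20-base reference with two simulated PCR errors.
reference = "ACGTACGTACGTACGTACGT"
bases = list(reference)
bases[5] = "T"    # error inside the hypothetical exon
bases[15] = "A"   # error outside it
amplified = "".join(bases)

exon_only = mismatches_in_regions(reference, amplified, [(4, 8)])
both = mismatches_in_regions(reference, amplified, [(4, 8), (14, 18)])
print(exon_only, both)
```

A real workflow would first align the sequencing reads to the reference; that step is deliberately left out here.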
Recently, targeting vector construction has been revolutionized by the development of “recombineering” (11). This approach employs λ phage-derived recombinases to direct homologous recombination of selection markers onto a DNA fragment and gap repair for capturing a genomic segment that represents the targeting vector homology (Fig. 3). In our hands the λRed system developed by Neal Copeland and Don Court is extremely efficient and flexible and can be used to develop targeting vectors within 2 wk (42). Similarly the system developed by Francis Stewart and colleagues (79) works very efficiently and has the advantage that any bacterial cell can be converted to a recombineering-proficient strain simply by introducing a plasmid carrying the Red operon (which includes the three components: Redα, a 5′-to-3′ exonuclease; Redβ, an annealing protein; and Redγ, an inhibitor of the major E. coli exonuclease and recombination complex, RecBCD).
If you are unable to generate a targeting vector yourself, it is also possible to have one made on a contract basis by several companies including GeneBridges and VectorBioLabs, which use recombineering and PCR-based approaches, respectively.
Tissue-specific gene targeting.
In many cases it may be desirable to assess gene function in a particular organ or cell type. Furthermore, if a gene is essential for embryonic development, then germline loss-of-function in the whole animal can preclude such analysis. To circumvent this problem site-specific recombinase systems including Flp/FRT (47), Cre/LoxP (5) and phiC31 integrase (76) have been developed. These elements can be used to flank DNA segments for excision from the genome, and in the case of the Flp/FRT and Cre/LoxP systems inverted FRT and LoxP sites, respectively, can be used to “flip” (invert) segments of DNA, which can be useful in conditional gene targeting approaches (62). An outline of the various strategies that can be used to conditionally inactivate and activate genes is shown in Figs. 2, C and D, and 4. There are numerous tissue- and cell-specific Cre lines that are now available in “Cre-Zoos,” and Andras Nagy curates a database of these lines: http://www.mshri.on.ca/nagy/ (49).
In some cases it may be desirable to study a gene in the adult animal or to somatically alter gene function in vivo. This is possible by using inducible expression systems. There are several inducible expression systems including the tetracycline-inducible system (29), the lac operator-repressor system (13), and the tamoxifen-inducible system (22). The tamoxifen-inducible system has been extensively used and shown to be both tightly regulated and sensitive. This system employs a fusion between a recombinase and the estrogen receptor and has been developed for both the Flp/FRT and Cre/LoxP systems (6, 22). Expression of the chimeric fusion protein can be directed ubiquitously or tissue-specifically, depending on which promoter the fusion is expressed from. When mice are administered tamoxifen, or its derivative 4-hydroxytamoxifen (4-OHT), the chimeric fusion protein translocates to the nucleus, allowing the recombinase to act on FRT or LoxP sites integrated into the genome. Thus by using this approach it is possible to alter gene function in a temporally controlled and tissue-specific manner.
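The logic of the tamoxifen-inducible system reduces to two conditions that must both hold: the promoter driving the recombinase fusion must be active in the tissue, and tamoxifen must have been given. The sketch below encodes only that logic; the class `InducibleRecombinaseLine`, its field, and the tissue names are hypothetical, and real systems show incomplete recombination and some background activity that this model ignores.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InducibleRecombinaseLine:
    """Hypothetical mouse line expressing a recombinase-estrogen receptor
    fusion from a promoter active only in the listed tissues."""
    promoter_active_in: frozenset


def recombination_occurs(line, tissue, tamoxifen_given):
    """The fusion must be expressed in the tissue AND tamoxifen must be
    present for the fusion to enter the nucleus and recombine its sites."""
    return tamoxifen_given and tissue in line.promoter_active_in


# Example with made-up tissues.
liver_line = InducibleRecombinaseLine(frozenset({"liver"}))

for tissue in ("liver", "brain"):
    for tamoxifen in (False, True):
        print(tissue, tamoxifen, recombination_occurs(liver_line, tissue, tamoxifen))
```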
Resources of targeting vectors, mutant ES cells, and mice.
Even investigators expert in gene targeting would rather not have to develop a targeting vector and target ES cells if there were an alternative source for their mutant allele. For this reason, and to facilitate the dissemination of mutant lines to the research community, several large gene-trap and gene targeting resources have been developed. One such collection is the IGTC resource (http://www.genetrap.org/), which as of February 2008 contained 120,043 ES cell lines in 10,982 genes (17.2% coverage of the genome). These cell lines are largely 129-derived ES cell lines. The accumulated resource is composed of alleles generated by vector-based gene traps and also gene traps generated using viruses. Another large resource has been developed by the Texas Institute for Genomic Medicine and is composed of just over 270,000 ES cell lines. This resource contains both 129 and C57BL/6N ES cell lines. For many genes there are multiple gene-trap insertions in different regions of the gene in these resources, meaning that it is possible to use these cell lines to generate allelic series. As shown in Fig. 2F the generic structure of gene traps includes a splice acceptor and βgeo selection marker, which hijacks a gene's transcriptional machinery to express βgeo, a LacZ/neomycin fusion. In some alleles within the resource, FRT and LoxP sites have been included, which makes it possible to revert the allele or to use it as a substrate for recombinase-mediated cassette exchange (RMCE) (23). All of the cell lines generated by the IGTC are made freely available on a cost recovery basis. The Sanger Institute Gene Trap Resource represents ∼12,000 of the cell lines in the IGTC database. These gene-trap ES cell lines are E14Tg2a-derived (34). We have obtained 80% germline transmission of gene traps from this resource, including a gene trap in the gene Geminin (28).
Although gene traps have been shown to be very effective mutagens there have been concerns that some intragenic insertions may not “trap” all of a gene's transcripts and that there could be splicing over the gene-trap cassette. In the case of gene traps in the Tfeb gene this has been shown to be true (46), although the frequency with which splicing over the gene-trap cassette is a problem will vary significantly between gene-trap vector designs and with each genomic locus and insertion. While this may represent a potentially undesirable aspect of gene traps, in some cases splicing around the gene trap may result in the formation of hypomorphic alleles that may allow the analysis of genes in mice that would be lethal if a null allele were generated.
Clearly the most desirable resource of mutant ES cell lines is one containing targeted conditional mutations. For this reason the KOMP, the EUCOMM, the NorCOMM, and the Sanger Institute's MGP were established. Each of these programs aims to produce conditional alleles containing a LacZ expression marker known as “knockout first” alleles (Fig. 2D) (74). Although still in their infancy, these programs have already generated an impressive list of targeting vectors and mutant cell lines that represent many lifetimes of work from a single laboratory. These programs are made possible by the use of an elegant recombineering/gateway based approach, which facilitates the rapid generation of targeting vectors. A summary of resources of targeting vectors, ES cell lines, BAC libraries, and gene traps is provided in Table 1.
RMCE/“Plug and Socket” Targeting
RMCE (7, 67) and plug-and-socket targeting (84) are both methods that allow cassettes, which may contain cDNAs or expression markers, to be introduced into the same genomic locus at a “docking site.” This docking site contains either LoxP or FRT sites such that transient expression of either Cre or Flp recombinase will direct integration of a LoxP- or FRT-flanked cassette into the docking site. RMCE docking sites usually contain two LoxP or FRT sites, while plug-and-socket targeting employs just one LoxP or FRT site. Given that the site of integration of a transgene affects both the level and the spatial and temporal expression of genes expressed from the transgene, RMCE and plug-and-socket targeting allow the reproducible functional characterization of genetic elements in the same genetic and chromatin environment.
Lentiviral vectors represent a potentially elegant tool for studying gene function in mice (54). The advantage of lentiviral vectors is that unlike standard “DNA-based” transgenesis lentiviral insertions are usually single copy, transgenesis is efficient with the majority of offspring carrying a lentiviral insertion, and viral insertions are generally introduced into active transcriptional units. Lentiviral vectors have been used for delivering cDNAs and also for delivering shRNAs. The issues that still need to be resolved with these vectors include transgene silencing, genetic mosaicism, and vector-related toxicity. Recent improvements in vector safety and the ability to produce ecotropic lentiviruses (61) now make lentiviral vectors a safe tool for use in most laboratory settings.
Large genomic fragments such as BACs may be used for transgenesis in mice. Given that the average size of a gene in the mouse genome is 34 kb, most genes and their regulatory elements can fit on a BAC. BAC transgenesis allows genes to be expressed from a BAC under the control of their endogenous promoter elements. Because the cis-acting splicing signals and the elements within the 5′- and 3′-untranslated regions of genes are present, splicing occurs, all transcripts of a gene are expressed, and gene expression is appropriately posttranscriptionally regulated. For these reasons BAC transgenesis is a particularly useful tool for genetic rescue experiments (12) and also for dissecting the functional elements of genes without having to go through the laborious process of targeting the endogenous locus (48). By combining BAC transgenesis with RMCE into the Hprt locus, which is subject to random X-inactivation, it is also possible to generate genetic mosaics (56).
Generating Targeted Point Mutations
A significant proportion of genetic variation in humans is at the nucleotide level (1), and site-specific modification of the mouse genome is a powerful approach for functionally characterizing mutations and distinguishing between silent polymorphisms and pathogenic mutations. Although point mutations can be introduced by replacement vector targeting (or knock-in targeting) this is an extremely laborious approach and precludes the generation of large numbers of mutations. Oligonucleotide-mediated gene targeting is emerging as a powerful tool for the introduction of subtle gene modifications (such as point mutations) in mouse ES cells with numerous reports showing that oligonucleotides differing from the target locus by one or a few nucleotides can be used to introduce specific mutations into both episomally and chromosomally located genes (2, 14, 15). In most cases, chemically modified RNA-DNA chimeric oligonucleotides or single-stranded DNA oligonucleotides were used in which the chemical modifications served to protect the oligonucleotides from nucleolytic degradation. The mechanism of transfer of genetic information from the oligonucleotide to the target remains largely elusive, with numerous different cellular processes being involved [including transcription, DNA replication, homologous recombination, and DNA mismatch repair (MMR)].
Until recently, oligonucleotide-mediated gene targeting frequencies in mouse ES cells appeared to be relatively low. However, Hein te Riele and colleagues (2, 14, 15) developed a technique whereby suppression of the DNA MMR machinery by knockdown of Msh2 or Msh3 made it possible to introduce several specific point mutations into the Rb gene and Fancf, respectively. Transient suppression of Msh2 is likely to result in unrepaired mismatches, and hence base changes, elsewhere in the genome, and therefore knocking down Msh2 is not ideal. Msh3 null cells on the other hand do not show an overt mutator phenotype (14), suggesting that it will be possible to combine suppression of Msh3 and oligo-based gene targeting for the rapid generation of point mutants. To make oligo-based gene targeting approaches feasible for high-throughput applications it will be necessary to increase the oligo targeting frequency or to scale up the identification of point mutant ES cell lines within the pool of transfected ES cell clones. The latter may be possible with new sequencing technologies.
shRNA Knockdown in the Mouse
RNA interference through the expression of shRNA molecules (where one strand is complementary to the coding region of the target gene) potentially holds great promise as a reverse genetics tool as it may allow inexpensive and rapid functional analysis in vivo. If combined with techniques such as tetraploid complementation (Fig. 1), it is possible to generate entirely ES cell-derived mice where a particular gene of interest is knocked down. Single copy shRNA transgenes under the control of either the U6 or the H1 promoter have been shown to mediate efficient and ubiquitous gene knockdown in mice when integrated at the Rosa26 locus (66), and shRNA knockdown has been shown to phenocopy a targeted mutation in the Fgf gene (40). However, shRNA knockdown approaches have gained only limited acceptance due largely to concerns over off-target effects. It has also emerged that it can be problematic to identify shRNAs that effectively knock down a gene and that in some cases many hairpins may need to be screened to find one that is both specific and effective. For these reasons direct manipulation of the genome by gene targeting has remained the predominant technology. One very appealing aspect of shRNA knockdown in vivo is the ability to conditionally and reversibly silence a gene by combining shRNA expression with tet-regulatable promoters (65). This approach has made it possible to study the effect of Plk1 inhibition on tumor growth in vivo (36) and will be useful in applications where conditional inactivation and then reactivation of a gene are desirable. It has recently been determined that shRNA sequences within miRNA hosts are both more efficient at knocking down gene expression and more specific (16) than hairpin designs used previously. With further evolution of this technology gene knockdown may become an increasingly feasible approach.
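The basic architecture of a simple hairpin, a sense stretch, a loop, and the reverse complement of the sense stretch, can be sketched in a few lines. This is an illustration only: the default loop sequence is an assumption (one loop that has been used in published hairpin vectors), the function names are hypothetical, and the sketch does nothing to address target selection, off-target effects, promoter requirements, or the miRNA-embedded designs mentioned above.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.upper().translate(COMPLEMENT)[::-1]


def hairpin_insert(target_sense, loop="TTCAAGAGA"):
    """Build a sense-loop-antisense DNA insert encoding a simple hairpin.

    `target_sense` is a stretch of the target coding sequence; the
    antisense arm is its reverse complement, so the two arms can pair.
    The default loop is an assumed, commonly used sequence."""
    sense = target_sense.upper()
    if set(sense) - set("ACGT"):
        raise ValueError("target must contain only A, C, G, T")
    return sense + loop + reverse_complement(sense)


# Made-up 19-nt target used purely for illustration.
example = hairpin_insert("GCTGACCCTGAAGTTCATC")
print(example)
```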
Chromosome Engineering for Generating Large-scale Chromosomal Rearrangements
Chromosome engineering refers to an approach whereby gene targeting in ES cells and the Cre/loxP site-specific recombination strategy are used to introduce defined chromosomal rearrangements into the genome such as a deletion, duplication, inversion, or translocation (83). In this technique, gene targeting is used to sequentially insert loxP sites into two different loci, and the doubly targeted ES cells are then exposed to Cre to induce recombination between the loxP sites and generate the rearranged chromosome. The outcome depends on the orientation of the targeted loxP sites and on the relative position of the selection cassettes used to recover the ES cells that have undergone recombination (3, 57, 70, 85).
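How the product depends on the placement and orientation of the two loxP sites can be summarized as a small decision function. The rules below are the standard simplified picture of Cre-mediated recombination rather than something taken from the figures of this review; the function and argument names are hypothetical, and orientation is assumed to be defined relative to the centromere.

```python
def predict_rearrangement(same_chromosome, in_cis, same_orientation):
    """Predict the product of Cre recombination between two loxP sites.

    same_chromosome  -- both sites lie on the same chromosome pair
    in_cis           -- both sites were targeted to the same homolog
                        (ignored when same_chromosome is False)
    same_orientation -- sites point the same way relative to the centromere
    """
    if not same_orientation and not (same_chromosome and in_cis):
        # Simplifying assumption: such products are normally not recovered.
        return "dicentric and acentric chromosomes"
    if not same_chromosome:
        return "reciprocal translocation"
    if in_cis:
        return "deletion" if same_orientation else "inversion"
    return "deletion on one homolog, duplication on the other"


# Enumerate the combinations for sites on the same chromosome pair.
for in_cis in (True, False):
    for same_orientation in (True, False):
        outcome = predict_rearrangement(True, in_cis, same_orientation)
        print(f"cis={in_cis}, same orientation={same_orientation}: {outcome}")
```

In practice the selection scheme, for example reconstitution of a functional HPRT mini-gene, determines which of these products can actually be recovered.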
The most widely used approach for chromosome engineering employs two vectors called the 5′HPRT and 3′HPRT vectors: each of these vectors contains a loxP site, a portion of the HPRT mini-gene (5′HPRT or 3′HPRT), a selection marker (neomycin or puromycin), and a coat-color tag (Agouti or Tyrosinase), as shown in Fig. 5A (85). A single length of homologous genomic sequence in these vectors allows them to be used as insertion vectors (Fig. 2). Recently libraries of 5′HPRT and 3′HPRT targeting vectors were developed that contain randomly “shotgun cloned” genomic inserts (85). These libraries have been end-sequenced to generate the Mutagenic Insertion and Chromosome Engineering Resource, which allows a researcher to browse the mouse genome using Ensembl for 5′HPRT and 3′HPRT vectors that they can use to generate the rearrangement they desire (Fig. 5B). 5′HPRT and 3′HPRT vectors are displayed orientated on the genome and are color-coded to facilitate the easy selection of targeting vectors suitable for the generation of the desired chromosomal rearrangement, as shown in Fig. 5B (3).
Very large rearrangements can be generated by chromosome engineering (86); there is probably no limit to the size of inversions that can be generated, but the size of deletions that can be generated is limited by haploinsufficiency of genes within the deleted interval killing the ES cells. If a very large deletion is to be generated it is possible to circumvent potential haploinsufficiency by targeting 5′HPRT and 3′HPRT vectors in trans, so that after the expression of Cre a balanced deletion/duplication is generated on homologous chromosomes (83). In most cases rearrangements of chromosomal regions are generated in vitro and then transmitted. It is, however, possible to generate rearrangements somatically by conditionally expressing Cre recombinase in a cell- or tissue-specific manner (86). This approach may be useful for studying tumor suppressor regions of the genome.
FORWARD GENETICS APPROACHES OF MODIFYING THE MOUSE GENOME
Large-scale ENU Mutagenesis
ENU is an alkylating agent that is a powerful mutagen in mouse spermatogonial stem cells and can be used to produce single-locus mutations at a frequency in the range of one in every 175–655 gametes screened. Analysis of sequenced germline mutations reveals that ENU predominantly modifies A·T base pairs, and when translated into a protein product, these changes predominantly result in missense mutations. Point mutations induced by ENU provide a unique mutant resource because they: 1) reflect the consequences of a single base change independent of position effects, 2) dissect protein function at amino acid resolution, 3) generate many different types of alleles (loss-of-function mutations, viable hypomorphs of lethal complementation groups, antimorphs, and gain-of-function mutations), and 4) discover gene functions in an unbiased manner. Phenotype-driven ENU screens in the mouse are being performed to identify genes involved in a range of human disease areas including cardiology, physiology, neurology, immunity, hematopoiesis, and mammalian development (38). ENU-based approaches are extremely powerful for understanding complex human diseases and traits: the base-pair changes can accurately model base changes found in human diseases, and subtle mutant alleles in a standard genetic background make it possible to analyze the consequences of compound genotypes. Ongoing mouse ENU mutagenesis experiments are generating a rich resource of new mutations, allowing an in-depth study of a single gene, a chromosomal region, or a biological system. A summary of ENU mutagenesis and other mouse resources is available in Table 2.
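As a rough illustration of what a per-locus frequency of one in 175–655 gametes implies for the size of a screen, the sketch below estimates how many gametes would need to be screened to recover at least one mutation at a given locus with a chosen probability. It assumes that each gamete independently carries a new mutation at the locus with a fixed probability f, so that the chance of at least one hit in n gametes is 1 − (1 − f)^n. The function name `gametes_needed` and the 95% default are illustrative choices, not part of any published protocol.

```python
import math


def gametes_needed(per_locus_frequency: float,
                   target_probability: float = 0.95) -> int:
    """Smallest n such that 1 - (1 - f)**n >= target_probability.

    Assumption: gametes are independent and each carries a mutation at the
    locus of interest with the same probability f.
    """
    if not 0.0 < per_locus_frequency < 1.0:
        raise ValueError("per_locus_frequency must lie strictly between 0 and 1")
    if not 0.0 < target_probability < 1.0:
        raise ValueError("target_probability must lie strictly between 0 and 1")
    # Solve (1 - f)**n <= 1 - p for n; log1p keeps precision for small f.
    n = math.log1p(-target_probability) / math.log1p(-per_locus_frequency)
    return math.ceil(n)


if __name__ == "__main__":
    # The two ends of the frequency range quoted in the text.
    for denominator in (175, 655):
        f = 1 / denominator
        print(f"1 in {denominator}: screen about {gametes_needed(f)} gametes")
```

Calling `gametes_needed(1 / 175)` and `gametes_needed(1 / 655)` brackets the screen sizes implied by the frequencies quoted above; real screens would also need to account for the fraction of mutations that produce a detectable phenotype.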
One of the major limitations of approaches that use random mutagenesis (such as ENU) is that finding mutations and proving they are causal can be extremely laborious. Recently, new technology that allows the specific capture of DNA by array hybridization coupled with high-throughput parallel sequencing has been developed (33). These technologies have the potential to facilitate a resurgence in ENU screens because they will dramatically accelerate the recovery of mutations. Although it is also possible to generate random genome-wide mutations using radiation and other mutagens such as EMS or chlorambucil, these mutagens, each of which produces its own signature of mutations, are not frequently utilized.
Emerging Transposon-based Approaches
As mentioned above, chemical mutagen-based approaches using agents such as ENU are very powerful because they can mutate a gene in many different ways, potentially generating a polychromatic view of a gene's function, albeit a view that may be difficult to interpret. Transposons represent an alternative random mutagenic approach. Transposons are DNA-based elements that can function as part of a bipartite system with a transposase to mutate genes both in vitro (in ES cells) and in vivo (in the germline or somatically) (10, 17, 20, 27, 35), as shown in Fig. 6.
The most widely studied transposon/transposase system for mammalian mutagenesis in vitro and in vivo is the Sleeping Beauty transposon of the Tc1-like mariner family (35). The Sleeping Beauty transposase was originally discovered in salmon, where it had been silenced through evolution, and was point mutated to “resurrect” its ability to catalyze transposition in the genome (35). Additional mutations have subsequently been introduced into the Sleeping Beauty transposase to improve its ability to stimulate “jumping” of the Sleeping Beauty transposon (82). Although the Sleeping Beauty transposon system has been shown to function as an efficient germline mutagen, analysis of mutant mice generated using this system has been complicated by gene mutations and genomic rearrangements due to transposon mobilization from chromosomal concatemers (27). This is due largely to the fact that the Sleeping Beauty transposon generates a double-strand break when it is excised from the genome as part of a cut-and-paste procedure. Despite this, the system has been used to generate several interesting mutants, and because the transposon acts as a tag, the mutated genes were easily identified. The Sleeping Beauty transposon system has also been used to perform regional saturating screens of the mouse genome (37). These screens have been facilitated by the fact that the Sleeping Beauty transposon preferentially integrates close to the site in the genome from which it was initially mobilized (37).
More recently, an alternative transposon called Piggybac, derived from the cabbage looper moth (Trichoplusia ni), has been employed for germline mutagenesis (17). This transposon is thought to generate mutations more randomly across the genome and has the advantage that it can carry large payloads between its transposon repeats, unlike Sleeping Beauty, whose transposition efficiency drops dramatically when the size of the payload between its IR/DRs increases above 2 kb (44). A database of Piggybac mutant mouse lines and some phenotyping data for these lines exists on the web (http://www.scbit.org/PBmice/) (72). At present this database is limited but is set to expand over the next few years. Piggybac has also been used together with Cre/loxP technology to generate large-scale rearrangements of the mouse genome including duplications, deletions, and translocations (80). Over the next few years it is likely that there will be a considerable amount of technology development pushing forward the use of Piggybac as a genome-wide insertional mutagen. There are many other transposons that are yet to be tested in mammalian cells, and it is likely that some of these will have features that make them a useful part of the toolkit for modifying the mouse genome.
The mouse has always been the experimental organism of choice for modeling most human diseases and for the discovery of gene function in vivo. The generation of large resources of gene traps, targeting vectors, targeted ES cells, and transposon mice, together with new technology such as the ability to rapidly engineer point mutations into the genome, will further accelerate the pace of discovery. In a previous review in Physiological Genomics (77), we wrote “In the postgenomic era the mouse will be central to the challenge of ascribing a function to the 40,000 or so genes that constitute our genome.” This statement is now almost certainly the case.
D. J. Adams is funded by Cancer Research-UK and the Wellcome Trust. L. van der Weyden is funded by a Fellowship from the Kay Kendell Leukaemia Foundation.
Address for reprint requests and other correspondence: D. J. Adams & L. van der Weyden, Experimental Cancer Genetics, Wellcome Trust Sanger Inst., Wellcome Trust Genome Campus, Hinxton, Cambs, CB10 1SA, UK.
- Copyright © 2008 the American Physiological Society