Physiol. Genomics 11: 133-164, 2002; doi:10.1152/physiolgenomics.00074.2002
1094-8341/02 $5.00
Received 13 June 2002; accepted in final form 30 July 2002.
1094-8341/02 $5.00 © 2002 American Physiological Society

Review

Tools for targeted manipulation of the mouse genome

Louise van der Weyden, David J. Adams and Allan Bradley

The Wellcome Trust Sanger Institute, Hinxton, Cambs CB10 1SA, United Kingdom


    ABSTRACT
 
In the postgenomic era the mouse will be central to the challenge of ascribing a function to the 40,000 or so genes that constitute our genome. In this review, we summarize some of the classic and modern approaches that have fueled the recent dramatic explosion in mouse genetics. Together with the sequencing of the mouse genome, these tools will have a profound effect on our ability to generate new and more accurate mouse models and thus provide a powerful insight into the function of human genes during the processes of both normal development and disease.

gene targeting; homologous recombination; embryonic stem cell; mutagenesis; gene trap; mouse genome project


    1) BACKGROUND
 
1.1) Why the Mouse?
The Human Genome Project has revealed the sequence information of many of the genes that make us what we are. However, although at least 35,000 human genes have been sequenced and mapped, adequate expression or functional information is available for only 15% of them (230). Therefore, a significant challenge for scientists over the next few decades is to annotate the human genome with functional information. This effort will enable us to gain an understanding of the molecular mechanisms and pathways underlying normal development, as well as those responsible for pathogenesis. However, to do this we need an experimental model. The mouse is an excellent experimental model for defining human gene function because of its anatomic, physiologic, and genetic similarity to humans. The mouse is also a popular model because of its relatively short life cycle and because its genome can be readily manipulated by molecular means. For example, mouse geneticists can eliminate or overexpress genes in the whole animal or in a specific tissue, introduce large pieces of self or foreign DNA into the genome, and engineer whole chromosomes. Furthermore, inbred strains of mice provide the opportunity to study a disease trait in a defined genetic background, allowing distinction between the phenotypes conferred by a single mutation vs. the contributions of other genetic modifiers. A large genetic reservoir of potential models of human disease has been generated through the identification of spontaneous, radiation-, or chemical-induced mutant loci in mice [including 100 mouse models of human disease in which the homologous gene has been shown to be mutated in both human and mouse (19)], indicating the validity of using mice to model human disease.

In the last few years, a number of significant technological advances have dramatically increased our ability to create mouse models of human disease. These technological advances have been greatly aided by the sequencing of the mouse genome and the subsequent mouse genomic resources that have been developed.

1.2) The Mouse Genome Sequencing Project
The C57BL/6J strain has been selected as the reference strain for the production of a finished genome sequence (17). The mouse genome sequencing project has two main aims; the first is to generate a finished sequence of the mouse genome that is freely accessible to the public, and the second is to create a richly annotated information resource consisting of a database containing information about the underlying genomic sequence (reviewed in Ref. 129). Mouse genomic resources include mapping resources [such as dense genetic maps (51, 168), sequence tagged site (STS)-based physical maps (168), radiation hybrid (gene-based) maps (11, 88, 250), and simple sequence length polymorphism (SSLP) marker-anchored bacterial artificial chromosome (BAC) framework maps (33)]; DNA resources [such as the generation of expressed sequence tags (ESTs) (144), cDNAs (109, 233) and BAC libraries (175, 283)]; and database resources, which help to provide community access to this information (see below for URLs).

1.2.1) How is the mouse genome being sequenced?
The sequencing of the mouse genome involves two parts: a genome-wide program, and a targeted program (summarized in Ref. 129). The genome-wide sequencing program is being undertaken by the Mouse Genome Sequencing Consortium (MGSC), which is an international collaboration between four centers [the Wellcome Trust Sanger Institute; The Whitehead Institute/MIT Center for Genome Research; The Washington University Genome Sequencing Center; and Ensembl (a joint project between the Sanger Institute and the European Bioinformatics Institute)]. Incorporated into the genome-wide sequence will be the sequence of regions targeted by specific projects. These include the Trans-NIH BAC sequencing program (funding sequencing of specific BACs/regions of biological interest by sequencing facilities at Cold Spring Harbor Laboratory, University of Oklahoma, and the Albert Einstein College of Medicine), the UK-MRC Mouse Sequencing Programme (adopting a regional and functional approach to target four "core regions" on chromosomes 4, 13, 2, and X), and the Joint Genome Institute (performing comparative sequencing of mouse chromosomal regions syntenic with human chromosome 19). The annotation of the mouse genome will be facilitated by cDNA sequencing programs, including that carried out by the RIKEN Institute (who collect data on most, if not all, full-length cDNAs, their primary structures, and expression sites) and the Mammalian Gene Collection [MGC; who provide a complete set of full-length (open reading frame) sequences and cDNA clones of human and mouse genes (233)]. In addition, the MGSC will do clone-by-clone BAC sequencing to a high standard.

There are several components to the genome-wide sequencing effort. First, there is the development of a BAC map, consisting of overlapping BAC clones covering the vast majority of the mouse genome. For this, two BAC libraries have been selected: RPCI-23 (female) and RPCI-24 (male) (175). Although previously characterized by sequencing both ends of the cloned inserts ("BAC-end sequencing") (283), as part of the MGSC they are being characterized by restriction-digest fingerprint of each clone ("fingerprinting"), which can be used to assemble the BACs into overlapping clone "contigs" (continuous stretches of assembled sequences) covering the mouse genome. The BAC map and BAC end-sequences provide a crucial scaffold for assembling the sequence of the mouse genome. The second component uses paired-end whole-genome shotgun reads to generate light (~2.5-fold) and deeper (5- to 6-fold) coverage of the mouse genome; the sequence is obtained by preparing plasmid libraries (with inserts of 2–10 kb) and sequencing both ends of the inserts ("paired-end sequencing"). By combining the BAC map and the shotgun coverage, a "hybrid" genome assembly can be constructed, consisting of sequence contigs, linked into "supercontigs" (neighboring contigs that have been properly organized and orientated), which are further linked into "ultracontigs" (neighboring supercontigs that have been properly organized and orientated). In addition, comparative sequence information from other strains of mice, such as 129S1/SvImJ, BALB/cByJ, and C3H/HeJ will also be obtained, as such analysis is important for identifying sequence variants, such as single nucleotide polymorphisms (SNPs).

As of April 2002, two assemblies of the complete whole genome data set for the mouse genome have been generated: one by the Whitehead group using the ARACHNE program (18), and the other by the Sanger Institute using the PHUSION program. Both assemblies used the February 2002 freeze of data and were built from the same set of files (each assembly included about 33 million reads, corresponding to 7-fold coverage of the genome). Although the two assemblies were evenly matched by many criteria, the ARACHNE assembly was chosen by the MGSC as the one with which to proceed for further analysis. This assembly provides 96% coverage of the euchromatic mouse genome and predicts with high confidence 22,444 genes across the genome, of which ~75% have a firm human genome counterpart. This "MGSC Version 3" assembly can be downloaded (ftp://ftp.ensembl.org/pub/assembly/mouse/mgsc_assembly_3) or searched using SSAHA (http://www.ensembl.org/Mus_musculus/ssahaview) or BLAST (http://www.ncbi.nlm.nih.gov/genome/seq/MmBlast.html). In addition, the BAC map resource has been aligned to the sequence, and an initial functional annotation of these genes has been added, as well as a comparison to the human genome (at the DNA level). The final phase of the MGSC’s work is now to fill in the "gaps" and correct any errors or misassemblies, so that a completed genome sequence is available within the next 3 yr.
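As a rough consistency check on these figures, fold coverage is commonly estimated as (number of reads x mean read length) / genome size. The sketch below rearranges that relation to ask what mean usable read length the quoted 33 million reads and roughly sevenfold coverage would imply. The genome size of 2.5 Gb is an assumed round figure used only for illustration (it is not a value given in this review), and the function names are hypothetical.

```python
def coverage(n_reads: int, mean_read_len_bp: float, genome_size_bp: float) -> float:
    """Fold coverage = total sequenced bases / genome size."""
    return n_reads * mean_read_len_bp / genome_size_bp


def implied_read_length(n_reads: int, fold_coverage: float, genome_size_bp: float) -> float:
    """Rearranged form: mean read length needed to reach a given fold coverage."""
    return fold_coverage * genome_size_bp / n_reads


N_READS = 33_000_000      # "about 33 million reads" (from the text)
FOLD = 7.0                # roughly sevenfold coverage (from the text)
GENOME_SIZE_BP = 2.5e9    # ASSUMPTION: illustrative genome size, not from the text

read_len = implied_read_length(N_READS, FOLD, GENOME_SIZE_BP)
print(f"Implied mean usable read length: {read_len:.0f} bp")
```

Under the assumed genome size, the implied mean read length is on the order of 500 bp; a different genome size estimate would scale the result proportionally.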

All information generated by the MGSC is rapidly released to the scientific community; a constantly updated and comprehensive view of the project’s data is provided at a central MGSC server (http://mouse.ensembl.org/). In addition, MGSC data is incorporated into other key genome servers, including mouse genome resources at the National Center for Biotechnology Information (NCBI; http://www.ncbi.nlm.nih.gov/genome/guide/mouse/index.html), the MRC UK mouse sequencing program (http://mrcseq.har.mrc.ac.uk), and the mouse genomic sequence reads aligned against the human draft sequence at the University of California at Santa Cruz (http://genome.cse.ucsc.edu).

1.3) Basic Genetic Technologies for Mining the Mouse Genome
With the exponential increase in the number of genes identified by the genome sequencing projects, it has become imperative that efficient methods be developed for determining gene function. It has now become possible to make essentially any mutation in the germ line of mice by utilizing homologous recombination (the recombining of an exogenous piece of DNA with its endogenous homologous sequence in vivo) and embryonic stem (ES) cells (the pluripotent derivatives of the inner cell mass of the blastocyst) (34). Gene targeting (using homologous recombination to alter specific endogenous genes) provides the highest level of control over producing mutations. The combination of gene targeting techniques in mouse ES cells and the Cre-loxP recombination system has resulted in the emergence of chromosomal engineering technology in mice. This advance has opened up new opportunities for modeling diseases that are associated with chromosomal rearrangements [in humans, chromosomal abnormalities are a principal cause of fetal loss and developmental disorders (214), and chromosomal translocations are involved in the genesis of many types of human tumor (189)]. Mutagenesis studies in the mouse have emphasized modeling human diseases through phenotype-driven assays. The chromosome engineering strategy for generating mice with precise chromosomal rearrangements, in combination with mutagenesis approaches, provides an invaluable tool for deciphering gene function using phenotype-based methods. In addition, gene trap approaches in ES cells offer a route to gene discovery by providing information on gene sequence, expression, and mutant phenotype.


    2) GENE TARGETING
 
2.1) Gene Targeting Strategies
Genetically modified mice can be generated either by direct pronuclear injection of exogenous DNA into fertilized zygotes (177) or injection of genetically modified mouse ES cells into a blastocyst (115). Direct pronuclear injection results in random integration of the injected DNA into the genome and relies on the overexpression of the transgene to produce a phenotype. In contrast, ES cells have the advantage in that they can be genetically modified by means of homologous recombination (a process by which a fragment of genomic DNA introduced into a mammalian cell can locate and recombine with the endogenous homologous sequence) prior to being injected into the blastocyst. This process is known as "gene targeting" and was first reported in mammalian cells in 1985 in erythroleukemia cells (226) and a fibroblast cell line (127). The first examples of gene targeting by homologous recombination in ES cells targeted the hypoxanthine-guanine phosphoribosyltransferase (HPRT) gene (a selectable locus on the X chromosome) (52, 239). The first reports of germ-line transmission of a targeted allele in ES cells also occurred at the HPRT locus (241), as well as the c-abl locus (212). Since then, gene targeting has been widely used in ES cells to make a variety of genetic mutations in many different loci, allowing the phenotypic consequences of the modifications to be assessed.

The procedure for the generation of mice that have been genetically modified using gene targeting strategies is essentially the same regardless of the specific targeting strategy used and is outlined in Fig. 1. Briefly, the targeting construct (containing sequence homologous to the targeted gene and a selectable marker) is electroporated into ES cells. These cells are then cultured in the presence of a selection agent to remove any cells that have not stably integrated the construct into their genome. The surviving ES cell colonies are then isolated and examined for the presence of the targeted allele (to ensure the desired recombination event has occurred) by PCR amplification or Southern blot analysis. Those containing the targeted allele are then injected into blastocysts (embryonic day 3.5) and transferred to the uterus of a foster mother. The resulting pups are then examined for their degree of chimerism (percentage of the genetic makeup of the mouse that was contributed by the ES cell). Male mice showing a high percentage of chimerism are then mated with wild-type mice to check for germ-line transmission of the targeted allele in the F1 offspring. The F1 heterozygotes may then be intercrossed to breed to homozygosity.
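The final breeding step can be illustrated with a one-locus Punnett square. The sketch below enumerates the expected genotype fractions from intercrossing F1 heterozygotes, assuming simple Mendelian segregation and no loss of any genotype class (for example through embryonic lethality); the allele symbols and the function name are illustrative choices, not notation from this review.

```python
from collections import Counter
from fractions import Fraction
from itertools import product


def cross(parent1: str, parent2: str) -> dict:
    """Expected genotype fractions for a one-locus cross.

    Each parent is a two-character string of alleles:
    '+' = wild-type allele, '-' = targeted allele (illustrative symbols).
    """
    # Combine every gamete from parent1 with every gamete from parent2;
    # sorting makes '+-' and '-+' count as the same genotype.
    counts = Counter("".join(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: Fraction(n, total) for genotype, n in counts.items()}


f2 = cross("+-", "+-")  # intercross of two F1 heterozygotes
for genotype, fraction in sorted(f2.items()):
    print(genotype, fraction)
```

Under these assumptions one quarter of the intercross offspring are expected to be homozygous for the targeted allele; a marked shortfall of that class among live pups is often the first indication that the mutation affects viability.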



Fig. 1. Procedure for the generation of genetically modified mice generated by gene targeting strategies. A: a targeting vector containing a positive (+) selection cassette and sequences of homology with the target locus (gray line) is generated. B: the vector is then linearized prior to electroporation into embryonic stem (ES) cells. C: the ES cells are then cultured in the presence of the selectant (for example, G418 drug if a neomycin resistance cassette was present in the targeting vector). D: ES cell clones that carry the desired chromosomal rearrangements are identified and genetically characterized (such as by PCR or Southern blot analysis). E: the selected ES cells are injected into mouse blastocysts, and the embryos are transferred into the uteri of pseudopregnant foster mothers. F: chimeras that are generated from the blastocyst injections are mated with wild-type mice to establish germ-line transmission of the modified gene. G: the progeny derived from the chimeras are characterized, and a mutant mouse line that carries the desired targeting event is established.

 
Important aspects of any gene targeting experiments include the vector design (which can determine the efficiency of recombination), the type of mutation generated at the target locus, selection cassettes (detailed in Table 1), and screening strategies used to identify clones of ES cells with the desired targeted modification (detailed in Ref. 38). The different types of mutations, and the vectors needed to generate such mutations, are discussed below in more detail.


Table 1. Examples of some commonly used dominant selectable markers

 
2.1.1) Generating "knockout" mice: the null allele.
The most common experimental strategy is to ablate the function of a target gene (generating a null allele) by introducing a selectable marker gene (for a review see Ref. 38). A targeting vector is designed to recombine with and mutate a specific chromosomal locus. The minimal components of such a vector are sequences of DNA homologous with the desired chromosomal integration site, a selection cassette, and a plasmid backbone. Two basic types of vectors can be used for targeting in mammalian cells, namely, replacement and insertion vectors. The fundamental aspects of a replacement vector are sequences of isogenic DNA with homology to the target locus (5–8 kb), a positive selectable marker (see Table 1), bacterial plasmid sequences, and a linearization site outside of the homologous sequences of the vector. The final postrecombination product is a replacement of the chromosomal homology with all components of the vector flanked on both sides by homologous sequences (see Fig. 2, A and B) (reviewed in Refs. 38, 82). The basic elements of an insertion vector are the same as those in a replacement vector. However, the major difference between the two vector types is that the linearization site of an insertion vector is made within the homologous sequences. An insertion vector undergoes single reciprocal recombination (vector insertion) with its homologous chromosomal target, which is stimulated by the double-strand break (or gap) in the vector. Insertion vectors have been reported to target at a 5- to 20-fold higher efficiency than replacement vectors given the same homologous sequences (81). Since the entire insertion vector is integrated into the target site, including the homologous sequences of the vector, the recombinant allele becomes a duplication of the target homology separated by the heterologous sequences in the vector backbone (see Fig. 2C).
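The structural difference between the two recombinant alleles can be sketched with a toy model in which a locus is an ordered list of labeled segments. This is an illustration of the outcomes described above, not a model of the recombination mechanism; the segment labels and function names are hypothetical.

```python
def replacement_product(locus, arm5, arm3, cassette):
    """Replacement vector (double reciprocal recombination).

    Whatever lies between the two homology arms on the chromosome is
    replaced by the selection cassette; vector backbone is not integrated.
    """
    i, j = locus.index(arm5), locus.index(arm3)
    if j <= i:
        raise ValueError("arm5 must lie upstream of arm3")
    return locus[: i + 1] + [cassette] + locus[j:]


def insertion_product(locus, homology, vector_payload):
    """Insertion vector (single reciprocal recombination).

    The whole vector integrates, so the target homology ends up duplicated,
    with the two copies separated by the heterologous vector sequences.
    """
    i = locus.index(homology)
    return locus[: i + 1] + list(vector_payload) + [homology] + locus[i + 1:]


wild_type = ["upstream", "armA", "exon2", "armB", "downstream"]
print(replacement_product(wild_type, "armA", "armB", "neo"))

simple_locus = ["upstream", "homology", "downstream"]
print(insertion_product(simple_locus, "homology", ["neo", "backbone"]))
```

The first call yields an allele in which the exon is replaced by the cassette; the second yields the duplication of homology flanking the vector sequences that characterizes insertion events.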



Fig. 2. Replacement and insertion type vectors. A and B: replacement vectors target the locus by double reciprocal recombination, and they insert only the genetic sequences contained inside the homologous region (green). The positive (+) selection cassette can be used to replace an exon (represented by a box) as shown in A, or to disrupt an exon, as shown in B. C: insertion vectors target the locus by single reciprocal recombination and insert the whole vector sequence. The vector (black line) contains regions of DNA (green line) homologous to the target locus (gray line) and a positive (+) selection cassette. X represents recombination between the vector and the genome.

 
2.1.2) Generating subtle mutations.
Although major disruptions of a locus represent a valuable starting point for genetic analysis, a series of changes at the nucleotide level in both coding and control regions of the gene are important for the full understanding of a gene’s function. Thus there are techniques to allow the inclusion of small mutations in the genome, such as the "hit-and-run" and "double replacement" strategies. Hit-and-run vectors are essentially insertion vectors containing both positive and negative selectable markers outside the region of homology and the desired mutation in the region of homology (Fig. 3). This strategy requires two rounds of homologous recombination: homologous recombination and positive selection are used to generate a duplication at the target locus (via insertion of the targeting vector), followed by resolution of the duplication (excision) via a second homologous recombination event between the duplicated homologous sequences (Fig. 3). This results in removal of the plasmid sequences, selection markers, and one complement of the duplication from the target locus. Hit-and-run targeting procedures have been successfully used to generate mutations in ES cells in the HPRT locus (247), the Hox-2.6 locus (80), the β-amyloid precursor protein locus (73), the Gi2α locus (202), and the cystic fibrosis transmembrane conductance regulator (Cftr) locus (49) (with germ-line transmission of the mutated allele being demonstrated in the latter two examples). Although hit-and-run targeting procedures are typically performed in ES cells in vitro, with the modified ES cells subsequently being injected into blastocysts to generate mice carrying the mutation, one group has modified this technique to generate mice carrying a subtle mutation in the K-ras gene, whereby the targeted insertion allele was created in cultured ES cells, and the excision was performed in mice in vivo (95).
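The two steps of the hit-and-run strategy can be sketched as list operations on a toy allele. This is a simplified illustration: the relative order of the mutant and wild-type copies in the duplication, and which copy survives the pop-out, are treated here as inputs rather than modeled from crossover positions, and all labels and function names are hypothetical.

```python
VECTOR_PARTS = ["plasmid", "pos_marker", "neg_marker"]


def hit(locus, target, mutant):
    """Step 1: vector insertion creates a duplication of the homology,
    with the two copies separated by plasmid and marker sequences."""
    i = locus.index(target)
    return locus[:i] + [mutant] + VECTOR_PARTS + [target] + locus[i + 1:]


def run(allele, target, mutant, keep):
    """Step 2: recombination between the duplicated copies removes the
    vector sequences and one copy of the duplication, leaving `keep`."""
    if keep not in (target, mutant):
        raise ValueError("keep must be one of the duplicated copies")
    start, end = allele.index(mutant), allele.index(target)
    return allele[:start] + [keep] + allele[end + 1:]


def survives_positive_selection(allele):
    return "pos_marker" in allele


def survives_negative_selection(allele):
    return "neg_marker" not in allele


wild_type = ["upstream", "exon", "downstream"]
duplicated = hit(wild_type, "exon", "exon*")
resolved = run(duplicated, "exon", "exon*", keep="exon*")
print(duplicated)
print(resolved)
```

In this model the intermediate allele survives positive selection but not negative selection, and the resolved allele does the reverse, which mirrors how the two selection steps are used to follow the insertion and excision events.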



Fig. 3. "Hit-and-run" targeting procedure. The two-step "hit-and-run" targeting strategy requires an insertion type vector, containing a piece of DNA (green line; exons shown as boxes) homologous to the target locus (gray line; exons shown as boxes), which has been modified with the desired small, nonselectable mutation (red star). It also contains positive (+) and negative (-) selectable markers. In the first step, homologous recombination (X) and positive selection are used to generate a duplication at the target locus via insertion of the targeting vector. The second step is based on the resolution of the duplication by recombination (pop-out) between the duplicated homologous sequences [via single reciprocal intrachromosomal recombination or uneven sister chromatid exchange (25)], resulting in removal of the plasmid sequences, selection markers, and one complement of the duplication from the target locus. Since the negative selection marker is excised, clones undergoing this event can be selected with drugs used against the negative selection cassette.

 
The "double replacement" (or "tag-and-exchange") technique also relies on two rounds of homologous recombination, although the second round of recombination is also a gene targeting event (i.e., vector-chromosome recombination rather than chromosome-chromosome recombination). Double replacement vectors are replacement vectors containing both positive and negative selectable markers inside the region of homology (Fig. 4).1 Using this strategy, the targeted gene is first replaced with a selectable marker to inactivate the allele ("tagging"), followed by retargeting with a second vector to reconstruct the inactivated allele, and introduce the engineered mutation ("exchanging"). This strategy has been used to introduce subtle mutations in mouse ES cells in the Col1a-1 gene (269), α2 Na,K-ATPase gene (7), and the Huntington’s disease homolog (Hdh) gene (36) (with germ-line transmission of the mutated allele demonstrated in the latter example). A modified approach, termed "stable tag exchange," incorporates an additional positive selection cassette to circumvent the problem of high levels of nonhomologous recombinants, which survive selection due to loss of tag gene expression (266). The advantage of this technique is that any number of different vectors can be used in the second step to generate an array of mutations at the same locus, as has been used in ES cells to generate a series of different targeted alterations to the prion protein gene (153). However, as with the hit-and-run method, a limitation of this procedure is the spontaneous loss of sensitivity to the negative selection agent due to events other than homologous recombination, such as mutation of the negative selectable marker (82).



Fig. 4. "Double replacement" targeting. "Double replacement" vectors are replacement vectors, containing a piece of DNA (green line; exons shown as boxes) homologous to the target locus (gray line; exons shown as boxes), as well as positive (+) and negative (-) selectable markers. Replacement vectors are linearized outside the target homology prior to transfection. In the first step, replacement targeted clones are generated (identified by use of the positive selection marker). In the second step, a different targeting vector homologous to the same genomic locus (yellow line; exons shown as boxes), but devoid of any selectable markers, is used. This vector carries the desired mutation (red star), and targeted clones are identified by selection against the negative selectable marker (with resistant clones being screened for the presence of the desired mutation). X represents recombination between the vector and the genome.

 
In contrast to conventional targeting techniques, which generate ES cells with a selectable marker in the mutated (targeted) locus, both the hit-and-run and double replacement techniques generate ES cells with mutations, yet no selectable marker. This is important, as most of these cassettes contain a promoter/enhancer and polyadenylation signal, all of which have the potential to interfere with transcription of neighboring genes and hence generate a phenotype that is difficult to interpret (89, 182). Another way of eliminating the problem associated with the presence of the selectable marker in the targeted locus is to use a site-specific recombination system, such as loxP sites flanking the marker to allow its excision through the actions of Cre recombinase (discussed in more detail in Section 2.2).

2.1.3) Generating conditional mice.
Conditional activation ("gain-of-function") or inactivation ("loss-of-function") of gene expression in vivo can be used to control gene expression in a temporal-, spatial-, or tissue-specific manner. The tight regulation of gene expression offered by conditional gene targeting means avoidance of any potential embryonic lethality associated with the mutation and allows for a more accurate mouse model of human disease and sporadic cancer initiation and progression (reviewed in Ref. 97).

Such control can be achieved using binary transgenic systems, in which gene expression is regulated by the interaction of an effector protein product on a target transgene. These interactions can be controlled by crossing mouse lines or by adding/removing an exogenous inducer. Binary transgenic systems fall into two categories: transcriptional transactivation (used to activate transgenes in gain-of-function experiments) and site-specific DNA recombination (used to activate transgenes or to generate tissue-specific gene knockouts). Transcriptional transactivation is more widely used than DNA recombination to bring about transgene activation in mice, because the latter is irreversible, whereas the former occurs only when the transactivator is present (thus if an inducible system is used, then transgene activation can be reversed at a given time, providing greater control over transgene expression) (reviewed in Ref. 123).

2.1.3.1) transcriptional transactivation. The most widely used binary transcriptional transactivation systems are the tetracycline-dependent regulatory systems ("Tet-on" and "Tet-off"; see http://www.clontech.com/tet/index.shtml or http://www.zmbh.uni-heidelberg.de/bujard/Homepage.html) (70). These systems use a chimeric transactivator to control transcription of the gene of interest from a silent promoter and are based on two regulatory elements derived from the E. coli tetracycline-resistance operon: the Tet repressor protein (TetR) and the Tet operator sequence (tetO). The Tet-off system uses a plasmid that expresses a fusion protein known as the tetracycline-controlled transactivator (tTA), composed of TetR and the VP16 activation domain of herpes simplex virus (Fig. 5A). tTA binds to the tetracycline response element (TRE) and activates transcription of the target gene in the absence of the inducer, doxycycline (Dox).2 The Tet-on system uses a form of TetR containing four amino acid changes that result in altered binding characteristics and create the reverse TetR (rTetR). As shown in Fig. 5B, the fusion of rTetR with the VP16 activation domain (rtTA) binds the TRE and activates transcription of the target gene in the presence of Dox. Recent examples of use of the Tet system in transgenic mice include the regulation of delta-FosB expression to control bone formation and bone mass (221), regulation of the SV40 T-antigen oncogene to control β-cell autoimmunity and hyperplasia (21), and regulation of the endothelin receptor B (ednrb) to determine when ednrb signaling is required during embryogenesis (218). In addition, there are many variations on the basic Tet system, such as allowing two target transgenes to be coregulated (14), thereby enabling monitoring of transactivation in transgenic mice when one promoter drives a reporter gene [such as luciferase or β-galactosidase (β-gal)] (120). Other variations include modification of the DNA-binding specificity of the transactivator (15) and the generation of rtTA variants with increased Dox sensitivity, allowing a greater Dox responsiveness in mouse tissues to which Dox has limited access (such as the brain) (246).
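The opposite Dox dependence of the two systems reduces to a small truth table. The sketch below encodes only the on/off logic described above (tTA activates the target without Dox, rtTA activates it with Dox); it ignores dose, kinetics, and leakiness, and the function and argument names are hypothetical.

```python
def target_gene_on(system: str, dox_present: bool,
                   transactivator_expressed: bool = True) -> bool:
    """Idealized on/off state of a TRE-driven target gene."""
    if system not in ("tet-off", "tet-on"):
        raise ValueError("system must be 'tet-off' or 'tet-on'")
    if not transactivator_expressed:
        # Without tTA/rtTA the target promoter stays silent.
        return False
    if system == "tet-off":
        # tTA binds the TRE only in the absence of Dox.
        return not dox_present
    # rtTA binds the TRE only in the presence of Dox.
    return dox_present


for system in ("tet-off", "tet-on"):
    for dox in (False, True):
        print(system, "Dox" if dox else "no Dox", target_gene_on(system, dox))
```

Because the transactivator must itself be expressed, placing it under a tissue-specific promoter restricts where the Dox switch can operate, which is how the binary design combines spatial and temporal control.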



Fig. 5. The tetracycline-responsive regulatory system for transcriptional transactivation. A: the Tet-off system uses a plasmid expressing a fusion protein known as the tetracycline-controlled transactivator (tTA), composed of the wild-type repressor protein (TetR; box with gray spots) and the VP16 activation domain of herpes simplex virus (black box). tTA binds to the tetracycline response element (TRE) and activates transcription of the target gene in the absence of the inducer, doxycycline (Dox). B: the Tet-on system uses a regulator plasmid expressing a fusion protein (rtTA) composed of an "altered" TetR, the reverse TetR (rTetR; box with gray stripes), and the VP16 activation domain of herpes simplex virus (black box). rtTA binds to the TRE and activates transcription of the target gene in the presence of Dox.

 
Another transcriptional transactivation system is the Gal4-based system, from Saccharomyces cerevisiae. The transcriptional activator Gal4 directs the expression of responsive genes by binding to upstream activating sequences (UAS), such that UAS-linked target genes can be activated (174). The system can be made inducible if Gal4 is linked to the ligand-binding domain of the progesterone receptor and the VP16 activator (GLVP). GLVP can be activated by binding to synthetic steroids, such as RU-486 or ZK-98.734. However, the use of the GLVP system has generally been limited to controlling gene expression in adult mice (183, 260, 261), as the synthetic steroids have been shown to induce abortions.

2.1.3.2) site-specific dna recombination. The most widely used site-specific DNA recombination system uses the Cre recombinase from bacteriophage P1, and more recently the Flp recombinase from S. cerevisiae has begun to show utility (discussed in more detail in Section 2.2). By using gene targeting techniques to produce mice with modified endogenous genes that can be acted on by Cre or Flp recombinases expressed under the control of tissue-specific promoters, site-specific recombination can be used to inactivate endogenous genes in a spatially controlled manner. Cre/Flp activity can also be controlled temporally by 1) delivering cre/FLP-encoding transgenes in viral vectors, 2) administering exogenous steroids to mice that carry a chimeric transgene consisting of the cre gene fused to a mutated ligand-binding domain, 3) using transcriptional transactivation to control cre/FLP expression, and 4) using soluble Cre. There are numerous examples of conditional mice that have been generated using the loxP/Cre recombinase system (19b).
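The combination of spatial and temporal control can be illustrated with a toy simulation. It assumes a floxed target gene, a recombinase expressed only in the tissues where its promoter is active, and recombinase activity that additionally requires an inducer at a given time point; once deletion has occurred it is never reversed, reflecting the irreversibility of recombination. Tissue names, the schedule format, and the function name are all hypothetical.

```python
def simulate_conditional_knockout(tissues, cre_promoter_tissues, inducer_schedule):
    """Return, for each time point, a dict mapping tissue -> gene deleted?

    tissues: iterable of tissue names.
    cre_promoter_tissues: tissues in which the recombinase is expressed.
    inducer_schedule: list of bools, True when the inducer is given.
    """
    deleted = set()
    history = []
    for inducer_on in inducer_schedule:
        if inducer_on:
            # Recombination happens only where the recombinase is expressed.
            deleted |= set(tissues) & set(cre_promoter_tissues)
        # Deletion is irreversible: `deleted` only ever grows.
        history.append({tissue: tissue in deleted for tissue in tissues})
    return history


history = simulate_conditional_knockout(
    tissues=["liver", "brain"],
    cre_promoter_tissues={"liver"},
    inducer_schedule=[False, True, False],
)
for time_point, state in enumerate(history):
    print(time_point, state)
```

In this example the gene stays intact everywhere until the inducer is given, is then lost only in the tissue expressing the recombinase, and remains deleted after the inducer is withdrawn.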

2.1.4) Generating "knock-in" mice.
Knock-in experiments are used to place a transgene, such as a cDNA or a reporter construct contained in a targeting vector, under the transcriptional control of an endogenous gene. For example, the endogenous gene may be replaced with a homolog (to assess whether members of the same gene family have similar functions when expressed in the same spatial/temporal pattern); human cyclin E has been knocked into the murine cyclin D1 gene (which rescues all phenotypic manifestations of cyclin D1 deficiency and restores normal development in cyclin D1-dependent tissues) (67), and murine Otx1 has been knocked into the murine Otx2 locus (to demonstrate the functional equivalency between Otx2 and Otx1 in development of the rostral head) (236). The most widely used knock-in strategy is the replacement of a gene by a reporter gene such as LacZ or GFP, to monitor expression pattern of the gene during development and in the adult mouse, both in a spatial and temporal manner. For example, LacZ has been knocked into the Pax3 locus (to examine the role of Pax3 in the neuroepithelium and somites) (143). A knock-in allele usually results in the loss-of-function of the endogenous gene.

Although the early knock-in studies in mouse ES cells utilized the double replacement technique (Section 2.1.2) (69, 228), more recent knock-in experiments have been performed using conventional gene targeting approaches (Fig. 6A), often in combination with the Cre-loxP system to remove the selection cassette (detailed in Section 2.2; Fig. 6B) (76). As shown in Fig. 6, such knock-in vectors are essentially replacement targeting vectors containing the transgene and a positive selectable marker and are designed such that after homologous recombination, the transgene is transcriptionally regulated by the endogenous promoter of the locus. This method is based on the production of a fusion protein between the endogenous and knocked in products. However, if a fusion product is undesirable (such as in the case of secreted proteins), then 1) the transgene can be knocked into the untranslated region upstream of the endogenous translational start site (although splicing may occur around the inserted transgene) or 2) an internal ribosome entry site (IRES) element (181) can be used to generate a bicistronic mRNA [the encephalomyocarditis virus IRES is most commonly used in mammalian cells (110)] (113).



Fig. 6. Knock-in strategies. A: knock-in of a transgene into the target locus in-frame with the target. The knock-in vector (black line) contains the transgene, polyadenylation signal (pA), a positive selection cassette (+), and DNA (green line; exons shown as boxes) that has homology to the target locus (gray line; exons shown as boxes). Using this vector, the transgene must be inserted in-frame with the upstream exon. B: knock-in of a transgene using a splice acceptor sequence and floxed selection cassette. The knock-in vector (black line) contains the splice acceptor signal (SA), transgene, polyadenylation signal (pA), a positive selection cassette (+) flanked by two loxP sites (black triangles), and DNA (green line; exons shown as boxes) that has homology to the target locus (gray line; exons shown as boxes). The use of a splice acceptor allows the transgene to be in-frame with the target locus irrespective of where it integrates within the locus. The use of loxP sites to flank ("flox") the selection cassette allows for excision of the cassette from the locus upon exposure to Cre recombinase. X represents recombination between the vector and the genome.

 
2.1.5) Generating mosaic mice.
As discussed above, the most widely used analysis of gene function in mice is through the generation of a null allele, which allows the generation of mice homozygous for the desired mutation. However, it is important to note that such a strategy can only be used to assess the earliest function of the gene, as mutational loss of a gene essential for embryogenesis can cause lethality and hence prevent examination of its role in adult mice. To overcome this limitation, conditional mice (see Section 2.1.3) or mosaic mice can be generated. Mosaics are individuals containing cells of more than one genotype. This allows for the generation of mice containing mutant homozygous cells (which would be lethal in a homozygous mouse) on an otherwise heterozygous background (i.e., the genetic composition of the vast majority of cells in the mouse is heterozygous; hence, the animal is viable). Conversely, mosaic mice have been used to study the function of c-src in osteoclasts and other cell types; Src-deficient mice (src-/-) crossed with transgenic mice expressing src under the control of tartrate resistant acid phosphatase promoter (a gene highly expressed in osteoclasts) allowed the generation of src-/- mice that express Src in osteoclasts, thereby allowing the analysis of Src function in other tissues in mice free from the morbidity of osteopetrosis (211).

Genetic mosaics can be generated when mitotic recombination between homologous chromosomes occurs during the G2 phase of the cell cycle and the recombinant chromatids segregate to different daughter cells (X segregation). Recombinant chromatids produced in G2 can also segregate to the same daughter cell, and the nonrecombinant chromatids to the other daughter cell (Z segregation). It is X segregation that is useful for genetic mosaic analysis, because it produces clones of homozygous mutant daughter cells from heterozygous mothers (whereas Z segregation produces daughter cells that are phenotypically indistinguishable from the parent cell or from cells produced by G1 recombination).
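The difference between X and Z segregation can be made concrete by tracking the four chromatids of a heterozygous (+/m) cell after a single G2 crossover. This is a simplified sketch under stated assumptions: one crossover between the centromere and the locus, alleles written as "+" and "m", and a hypothetical function name.

```python
def daughters_after_g2_crossover(segregation):
    """Genotypes of the two daughter cells at a locus distal to the crossover.

    The mother cell is heterozygous (+/m). After S phase, homolog A carries
    two "+" chromatids and homolog B two "m" chromatids; a single G2 crossover
    swaps the distal alleles of one chromatid from each homolog.
    """
    a_nonrecombinant, a_recombinant = "+", "m"  # sisters on centromere A
    b_recombinant, b_nonrecombinant = "+", "m"  # sisters on centromere B

    if segregation == "X":
        # Recombinant chromatids go to different daughter cells.
        pairs = [(a_recombinant, b_nonrecombinant),
                 (a_nonrecombinant, b_recombinant)]
    elif segregation == "Z":
        # Recombinant chromatids go to the same daughter cell.
        pairs = [(a_recombinant, b_recombinant),
                 (a_nonrecombinant, b_nonrecombinant)]
    else:
        raise ValueError("segregation must be 'X' or 'Z'")
    return sorted("/".join(sorted(pair)) for pair in pairs)


if __name__ == "__main__":
    print("X:", daughters_after_g2_crossover("X"))
    print("Z:", daughters_after_g2_crossover("Z"))
```

Under these assumptions, X segregation gives one homozygous mutant and one homozygous wild-type daughter, whereas Z segregation gives two heterozygous daughters.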

There are numerous approaches to generate mosaics in mice (see Fig. 7). The most common is the use of site-specific Cre/loxP technology (described in Section 2.2) in a conditional knockout strategy, in which mosaics are generated in specific tissues by the activity of Cre on a loxP-flanked allele (Fig. 7A) (74). Alternatively, homozygous mutant ES cells can be produced in culture, and these can be used to generate chimeras. For example, direct selection of loss of heterozygosity (LOH) events in neo cassette-containing ES cells can be achieved using high concentrations of G418 (which causes chromosomal nondisjunction followed by reduplication of the remaining chromosome; Fig. 7B) (154). The mechanism that duplicates the targeted neo cassette (thereby providing enhanced G418 resistance) also allows mutations on the same chromosome to become homozygous. Homozygous clones can also be generated by repetitive neo targeting, followed by puro targeting of the second allele. Another strategy (Fig. 7C), based on the principles of the Flp-FRT site-specific recombination system in Drosophila (22), uses the spatial and temporal regulation of Cre expression, which can be used to trigger mitotic recombination (with recombination rates of 10⁻⁴–10⁻² in ES cells transiently transfected with Cre) (134). Alternatively, Bloom’s mice can be used, as they possess a genetic background that displays increased rates of mitotic recombination (Fig. 7D). Bloom’s mice (Blm-/-) are deficient in the RecQ DNA helicase/Blm protein (140), analogous to the human condition, Bloom’s syndrome [a rare cancer-prone disorder in which the cells of affected persons have a high frequency of somatic mutation and genomic instability due to high frequency of sister chromatid exchange (68)]. Bloom’s mice and ES cells exhibit greatly elevated rates of mitotic recombination, allowing homozygous mutant daughter cells to be generated from heterozygous mothers both in vitro and in vivo. The rate of mitotic recombination in Blm-deficient cells is sufficient to enable homozygous mutant clones to be recovered for any gene, regardless of its location in the genome (140).
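To give a feel for the Cre-induced mitotic recombination rates quoted above for transiently transfected ES cells (between one in ten thousand and one in a hundred), the expected number of recombinant cells in a screen is simply the product of cell number and rate. The sketch below is arithmetic only; the figure of one million cells is a hypothetical example, not a value from the cited work.

```python
def expected_recombinants(cells_screened, rate):
    """Expected number of recombinant cells for a per-cell recombination rate."""
    return cells_screened * rate


if __name__ == "__main__":
    cells = 1_000_000  # hypothetical number of transfected ES cells
    for rate in (1e-4, 1e-2):  # lower and upper rates quoted in the text
        print(f"rate {rate:g}: about {expected_recombinants(cells, rate):,.0f} cells")
```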



Fig. 7. Mechanisms for generating mosaics in vitro and in vivo. A: mosaics can be generated by the use of Cre/loxP technology in a conditional knockout strategy, in which mosaics are generated in vivo in specific tissues by the activity of Cre on a loxP-flanked allele. B: mosaics can be generated in vitro by using neomycin resistance cassette (neo)-containing embryonic stem (ES) cells that have been selected under high concentrations of G418. Homozygous mutant ES cells can result by direct selection of loss of heterozygosity (LOH) events (causing chromosomal nondisjunction followed by reduplication of the remaining chromosome). These mutant ES cells can then be used to generate chimeras (which will possess homozygous mutant cells on a heterozygous background). C: the regulated expression of Cre recombinase can be used to induce mitotic recombination, during which time X segregation of the chromosomes can lead to the generation of mosaics in vitro. D: similarly, Bloom’s mice can be used to generate mosaics in vitro, as they possess a genetic background that displays increased rates of mitotic recombination.

 
2.2) Site-Specific Recombination
2.2.1) Recombinases: Cre and Flp.
The simplest site-specific recombination systems are those composed of a recombinase enzyme and its target sequence. These systems allow for the deletion, insertion, inversion, or translocation of specific regions of DNA. Two such recombinase systems (both members of the λ-integrase superfamily) are the Cre-loxP system from the bacteriophage P1 and the Flp-FRT system from the budding yeast S. cerevisiae. Both Cre and Flp cleave DNA at a distinct target sequence (Fig. 8A) and ligate it to the cleaved DNA of a second identical site, to generate a contiguous strand. The orientation of these target sites relative to each other directs the type of modification catalyzed by the recombinase (detailed in Fig. 8B). When compared with one another, Cre appears more effective in recombining a transgene array in vitro and in vivo than Flp (198). However, mutated versions of Flp, such as the temperature-sensitive FlpL (30) and the enhanced Flp (Flpe) (31), show improved activity over wild-type Flp. Indeed, the generation of high-efficiency Flpe deleter mice [the ROSA26 locus was targeted to create a mouse strain with generalized expression of FLPe, termed the FLPeR ("flipper") strain (56)] shows that Flpe is an alternative to Cre-loxP (199). In general, Cre and Flpe are used in situations requiring high-efficiency recombination (30, 31, 56, 245); Flp and FlpL are used when tight regulation is needed (255). Currently, the most widely used site-specific DNA recombinase system in both ES cells and mice is the Cre-loxP system.
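How the relative orientation of two target sites determines the outcome can be illustrated with a toy model in which a DNA molecule is a list of named segments and each loxP site is written with an arrow. This is a sketch under stated assumptions (exactly two sites in cis on a linear molecule; the segment names, the arrow notation, and the function names are all hypothetical), not a model of the enzymology.

```python
LOX_FWD, LOX_REV = "loxP>", "<loxP"


def flip(segment):
    """Mark a segment as inverted, or unmark it if already inverted."""
    return segment[:-1] if segment.endswith("'") else segment + "'"


def cre_recombine(dna):
    """Apply one Cre reaction to a linear molecule holding exactly two loxP sites.

    Returns (reaction, chromosome_product, excised_circle_or_None).
    """
    sites = [i for i, seg in enumerate(dna) if seg in (LOX_FWD, LOX_REV)]
    if len(sites) != 2:
        raise ValueError("expected exactly two loxP sites")
    i, j = sites
    if dna[i] == dna[j]:
        # Direct repeats: the flanked DNA leaves as a circle carrying one loxP.
        return "excision", dna[:i + 1] + dna[j + 1:], dna[i + 1:j + 1]
    # Inverted repeats: the flanked DNA is reversed in place.
    inverted = [flip(seg) for seg in reversed(dna[i + 1:j])]
    return "inversion", dna[:i + 1] + inverted + dna[j:], None


if __name__ == "__main__":
    print(cre_recombine(["A", "loxP>", "B", "C", "loxP>", "D"]))
    print(cre_recombine(["A", "loxP>", "B", "C", "<loxP", "D"]))
```

In this toy model a second round of recombination on the inverted product restores the starting arrangement, which mirrors the reversibility of the inversion reaction.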



Fig. 8. A: recombinase target sites. Target sites contain inverted 13-bp symmetry elements (indicated by the bold arrows) flanking an 8-bp A:T-rich nonpalindromic core (indicated by the open arrow). One recombinase monomer binds each symmetry element, while the core sequence provides the site of strand cleavage, exchange, and ligation. The asymmetry of the core region (open arrow) imparts directionality on the reaction, such that directly orientated sites lead to excision of the intervening DNA, and inverted sites cause inversion of the intervening DNA (see B for more details). A-a: the target site recognized by Cre recombinase (causes recombination) is called loxP ["locus of crossover (x) in P1"]. A-b: the Flp recognition target sequence is designated FRT. B: recombinase reactions catalyzed by Cre recombinase. B-a: a cis recombination event between two loxP sites (triangles) in the same orientation will lead to the excision of the flanked DNA sequence (green, red, yellow, and gray boxes). B-b: if loxP sites are orientated in opposite directions, then the loxP-flanking sequence will be inverted. B-c and B-d: recombination between two loxP sites in trans will lead to the reciprocal exchange of the regions that flank the loxP sites. The arrows between recombinase substrates and products indicate the reversible nature of each reaction.
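The site anatomy described in Fig. 8A (two inverted 13-bp elements around an asymmetric 8-bp core) can be checked directly on a sequence. The sketch below uses the commonly cited wild-type loxP sequence, which is an assumption here because the sequence itself is not printed in this text; the variable and function names are likewise illustrative.

```python
# Commonly cited wild-type loxP sequence (assumed; not given in the text).
LEFT_ARM = "ATAACTTCGTATA"   # 13-bp symmetry element
CORE = "GCATACAT"            # 8-bp spacer
RIGHT_ARM = "TATACGAAGTTAT"  # 13-bp symmetry element
LOXP = LEFT_ARM + CORE + RIGHT_ARM


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA sequence."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def at_fraction(seq):
    """Fraction of A or T bases in the sequence."""
    return sum(base in "AT" for base in seq) / len(seq)


if __name__ == "__main__":
    print("site length:", len(LOXP))
    print("arms are inverted repeats:", reverse_complement(LEFT_ARM) == RIGHT_ARM)
    print("core is palindromic:", reverse_complement(CORE) == CORE)
    print("core A:T fraction:", at_fraction(CORE))
```

Because the arms are inverted repeats of one another, only the nonpalindromic core distinguishes the two orientations of the site.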

 
2.2.2) Cre: advantages and disadvantages.
Advantages of the Cre recombinase system include its 1) simplicity (no cofactors required), 2) fidelity (recombination is carried out such that there is no loss or gain of nucleotides), 3) small 34-bp target site (which does not perturb the surrounding genes when positioned in chromosomal DNA), and 4) broad utility (acting on supercoiled, relaxed, or linear DNA substrates; functioning over large megabase distances; and functional in a wide range of cell types) (reviewed in Refs. 53, 163). Moreover, Cre functions in the mammalian germ line, such that it can be used to generate transmissible modifications of loxP-flanked ("floxed") DNA sequences (e.g., selection marker and/or essential gene segment). However, the disadvantages of Cre include recombination of DNA sequences naturally occurring in yeast and mammalian genomes, via "cryptic" loxP-like sites in vitro. Reports are also emerging that Cre recombinase can cause chromosomal rearrangements/aberrations and increased number of sister chromatid exchanges when expressed at very high levels (137, 210, 220), most likely by way of pseudo-loxP sites. High level of Cre expression has also been shown to reduce cell proliferation in mouse embryonic fibroblasts (MEFs) (48, 137, 220) and is speculated to be involved in causing cell-cycle arrest (1).

2.2.3) Systems for delivery of Cre in vitro and in vivo.
The methods used to introduce Cre into ES cells are numerous. The usual in vitro introduction method for transient expression or stable integration is calcium precipitation or electroporation [which can reach as high as 70% efficiency (163)]. Alternative methods of delivery include infection of ES cells with a recombinant Cre adenovirus (2, 5, 259). Adenoviral transfection of Cre results in transient Cre expression, but mosaic mice can be produced, which transmit the targeted allele to their offspring with high frequency (106). The method of introduction of Cre in vivo depends heavily upon the specific application. The simplest solution to achieve overall excision in the developing embryo is crossing of the floxed mice with transgenic mice carrying the Cre recombinase gene (a general Cre expresser transgenic line) (119, 206). Another variation of this approach is pronuclear injection of a Cre expression vector [with the resultant transient Cre expression inducing recombination during preimplantation development (8)] or injecting Cre RNA or protein.

2.2.4) Temporal- and tissue-specific expression of Cre.
As discussed above (Section 2.1.3), the Cre recombinase system can be used to conditionally activate ("gain-of-function") or inactivate ("loss-of-function") gene expression in a temporal-, spatial-, or tissue-specific manner, to allow for a more accurate mouse model of human disease and cancer initiation and progression. To create spatiotemporally controlled somatic mutations in the mouse, chemically inducible forms of Cre have also been used, such as Cre-ERT and Cre-ERT2, in which the ligand-binding domain of a mutated human estrogen receptor (ERT), which recognizes tamoxifen or its derivative 4-hydroxytamoxifen (4-OHT), has been added to Cre (29, 59, 65, 90). Accordingly, Cre-mediated recombination is 4-OHT dependent in mice bearing a CreT transgene. This approach has been used to selectively ablate expression of the retinoid X receptor-α (RXRα) in adult mouse keratinocytes, by putting expression of these recombinases under control of the bovine keratin-5 promoter (126). A mouse strain expressing Cre-ERT from the ubiquitously expressed ROSA26 locus has also been generated (256).
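The two layers of control provided by a ligand-dependent Cre driven by a tissue-specific promoter can be written as a simple conjunction: recombination is expected only in cells where the promoter is active and only once the ligand has been given. The following is a minimal logical sketch; the function name and the tissue labels other than keratinocytes are hypothetical, and real systems show leakiness and incomplete recombination that this ignores.

```python
def cre_ert_recombines(tissue, promoter_active_in, ligand_given):
    """True if a floxed allele is expected to recombine in the given tissue.

    promoter_active_in: set of tissues where the Cre-ERT transgene is expressed.
    ligand_given: whether tamoxifen or 4-OHT has been administered.
    """
    return tissue in promoter_active_in and ligand_given


if __name__ == "__main__":
    k5_driver = {"keratinocyte"}  # keratin-5 promoter example from the text
    for tissue in ("keratinocyte", "hepatocyte"):  # hepatocyte is a hypothetical control
        for ligand in (False, True):
            result = cre_ert_recombines(tissue, k5_driver, ligand)
            print(f"{tissue:12s} 4-OHT{'+' if ligand else '-'}  recombined: {result}")
```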

The main methodology for tissue-specific Cre-mediated excision is the use of established transgenic lines expressing Cre under the control of a promoter with the required specificity. For example, to create a mouse model for BRCA2-associated breast cancer [as inheritance of one defective BRCA2 allele predisposes humans to breast cancer (reviewed in Ref. 213)], mice conditional for the tumor suppressor genes (TSGs) Brca2 (floxed at exon 11) and/or p53 (floxed at exons 2–10) were mated with mice expressing Cre under control of the epithelial-specific K14 promoter; although no tumors arose in mice carrying conditional Brca2 alleles alone, mammary and skin tumors developed in females carrying conditional Brca2 and p53 alleles, showing that inactivation of Brca2 and p53 together mediates mammary tumorigenesis (96). There are many further examples of the use of Cre-expressing mice (19a). In addition, a voluntary database of Cre-expressing mice has been established (see http://www.mshri.on.ca/nagy).

However, a stumbling block is simply the limited number of existing transgenic mouse lines that express recombinase in the appropriate cell type. To address this issue, recombinant Cre fusion proteins bearing hydrophobic peptides from the Kaposi fibroblast growth factor (FGF-4) (94) or basic peptides derived from HIV-TAT (98, 180) have been produced to promote cellular uptake of recombinant Cre. Recombination has been observed in a variety of cultured cell types and in specific tissues examined in mice following intraperitoneal administration. This new cell-permeable form of Cre will likely open up new opportunities for genetically manipulating cells both in vitro and in vivo (40).

In certain applications, the retroviral or adenoviral delivery systems have been used. For example, after intranasally administered adenoviral Cre, mice expressing a mutated K-ras oncogene placed downstream of a floxed transcriptional termination stop element developed lung tumors by 2–16 wk of age (due to Cre-mediated removal of the floxed stop element, and hence expression of the mutant K-ras) (95). An alternative is the combination of avian retrovirus with the TVA (EnvA receptor) delivery system, which provides the possibility of viral delivery of Cre in a tissue- or cell-specific manner, as there is a requirement for the expression of a special avian receptor (tv-a) in the mouse cells from a cell-type-specific transgene to make these cells susceptible to infection (85). More recently, an adeno-associated virus [used to transfer foreign genes into the adult and neonatal central nervous system in animals (23, 136, 146)] expressing Cre recombinase has been shown to mediate extensive in vivo recombination in neural cells of defined brain regions in the mouse (108).

2.3) Chromosome Engineering
2.3.1) Chromosomal rearrangements.
A large array of mice containing chromosomal rearrangements (deletions, inversion, duplications, and translocations) have been generated by exposure to chemical mutagens [cyclophosphamide, ethylene oxide, chlorambucil, and N-ethyl-N-nitrosourea (ENU)] or radiation (X-rays) (47, 124, 192, 205, 234, 237). However, although these mutagens have generated some valuable mouse models for human diseases, such as the mouse model of trisomy 21 (47, 192), their usefulness for inducing rearrangements is limited by the fact that the endpoints of the induced rearrangements cannot be predetermined (see Section 3, below, for details on mutagenesis strategies). To this end, gene targeting-based strategies have been developed to introduce defined chromosomal rearrangements into the mouse genome by engineering them in ES cells using the Cre/loxP site-specific recombination system. This technology, known as "chromosome engineering," has successfully generated numerous mouse models that accurately recapitulate human chromosomal rearrangements, such as the chromosomal deletion within band 22q11 (del22q11) causing DiGeorge syndrome (130, 131), the paternal deficiency of chromosome 15q11-q13 causing Prader-Willi syndrome (244), the translocation between chromosomes 8 and 21 [t(8;21)] found in acute myeloid leukemia (AML) (32), and the reciprocal translocation between chromosomes 9 and 11 [t(9;11)(p22;q23)] associated with acute leukemia (42).

2.3.2) Chromosome engineering technology.
2.3.2.1) selection of endpoints. Generating chromosomal rearrangements in the mouse involves the sequential insertion of two targeting vectors into two separate loci in the ES cell genome. Thus an important decision is to define the two endpoints of the rearrangement. Endpoints can be chosen at any genomic region. For example, SSLP microsatellite markers make useful endpoints as they have been genetically mapped (see the Whitehead Institute STS Physical Map of the Mouse at http://www-genome.wi.mit.edu/cgi-bin/mouse/index). Numerous SSLP markers have been successfully used as the endpoints for engineering chromosomal deletions on mouse chromosome 11 (286). In addition, the high-resolution mapping information available for the mouse genome (see the MGSC Ensembl Mouse Genome Server at http://www.ensembl.org/Mus_musculus/) means that genes of known chromosomal location may also be used as endpoints. Genes used as endpoints have included the epidermal growth factor receptor (Egfr) gene (242), the amyloid precursor protein gene (125), the myeloperoxidase (MPO) gene (10), Wnt3, p53, and Hoxb9 (190, 286), p63 (149), and the HoxB cluster (148), to name a few.

2.3.2.2) generating the chromosomal rearrangement. Once the endpoints have been selected, the first step involves the targeting of an insertion vector containing a loxP site, a positive selection cassette (e.g., neomycin), and one of two complementary, but nonfunctional, fragments of the Hprt gene into the desired locus (selected endpoint) of the ES cell genome (see Fig. 9A) (285). ES cell clones with a loxP site targeted to a first endpoint can be identified by positive selection and Southern blot analysis. The second step involves the targeting of a second insertion vector containing a loxP site, a different positive selection cassette (e.g., puromycin), and the complementary fragment of the Hprt gene into the second endpoint (see Fig. 9B). The generation of these targeting vectors is detailed below. The doubly targeted ES cell is then transiently transfected (electroporated) with a vector expressing Cre, which facilitates recombination between loxP sites such that the intervening DNA is deleted. Commonly used Cre-expression vectors include pOG231 (171), pTurboCre (GenBank accession no. AF334827), pCrePAC (238), and pBS185 (125). The type of chromosomal rearrangement derived from the double-targeted ES cells is determined by the loxP configuration (reviewed in Fig. 8B). As shown in Fig. 10A, loxP sites in the same orientation generate a chromosomal deletion (or duplication event; not shown). Alternatively, loxP sites in the opposite orientation generate a chromosomal inversion (Fig. 10B). The generation of chromosomal translocations is discussed below (in Section 2.3.4).



Fig. 9. Gene targeting in ES cells. Insertional targeting vectors can be used to insert loxP sites, positive selectable markers, hypoxanthine phosphoribosyltransferase (Hprt) gene fragments, and coat-color markers to predetermined loci in the ES cell genome. The vector (black line) is linearized in the region of homology (green line) to stimulate targeted insertion into the locus (gray line). A: the 5'-Hprt vector contains the neomycin selectable marker (N; yellow box), the 5' end of the Hprt minigene, a loxP site (black triangle), and the Tyrosinase minigene (Ty; red box) coat-color marker. B: the 3'-Hprt vector contains the puromycin selectable marker (P; yellow box), the 3' end of the Hprt minigene, a loxP site (black triangle), and the K14-Agouti transgene (Ag; red box) coat-color marker. The Hprt gene is divided into two complementary, but nonfunctional, fragments [5'-Hprt contains exons 1–2, and 3'-Hprt contains the remaining exons 3–9 (190)] that are each linked to a loxP site. Cre-mediated recombination unites the 5' and 3' cassettes and restores Hprt activity (which is required for purine biosynthesis), thereby allowing the desired recombination events to be selected for in HAT (hypoxanthine, aminopterin, and thymidine) medium.

 


Fig. 10. Generation of defined chromosome rearrangements. The endpoints (red stars, X and Y) selected for the interval of genomic DNA to be rearranged are modified using two consecutive steps of gene targeting in the same ES cell. Both steps involve a positive selection scheme using the Hprt minigene to identify clones containing rearrangements induced by Cre recombinase. The first step involves targeting of a linearized insertion vector containing a loxP site (black triangle), a nonfunctional 5'-Hprt cassette, and a neomycin resistance cassette (N), to endpoint X. The second step involves targeting of a linearized insertion vector containing a loxP site, a nonfunctional 3'-Hprt cassette, and a puromycin resistance cassette (P), to endpoint Y. The final step uses Cre recombinase to catalyze site-specific recombination between the loxP sites. A: if the loxP sites are in the same orientation, then the intervening region is deleted. B: if the loxP sites are in the opposite direction, then a chromosomal inversion occurs. Cre-mediated recombination to unite the 5' and 3' cassettes restores Hprt activity, making the ES cells resistant to HAT-containing medium (HATr).

 
This Cre-mediated recombination event can be selected for in culture because a functional Hprt cassette is reconstituted, which confers resistance to the drug hypoxanthine-aminopterin-thymidine (HAT) (see footnote 3). Thus, using selection in HAT, Southern blot analysis, and fluorescent in situ hybridization analysis (FISH), ES cell clones carrying the desired chromosomal rearrangement can be identified. As with traditional gene targeting strategies shown in Fig. 1, these ES cells are injected into mouse blastocysts to generate chimeras, from which the progeny that carry the engineered chromosomes are derived.
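The selection logic described above can be sketched as a small model: a clone becomes HAT resistant only when both complementary Hprt fragments are present and Cre has joined them. The code is illustrative only; the class and attribute names are hypothetical, and it assumes an Hprt-deficient parental ES cell line and loxP sites arranged so that recombination unites the two halves.

```python
from dataclasses import dataclass


@dataclass
class ESClone:
    """Toy record of the engineering steps an ES cell clone has undergone."""
    has_5_hprt: bool = False      # first targeting step (5'-Hprt, loxP, neo)
    has_3_hprt: bool = False      # second targeting step (3'-Hprt, loxP, puro)
    cre_recombined: bool = False  # transient Cre joined the two loxP sites

    def hat_resistant(self):
        # Assumes the parental line lacks endogenous Hprt activity, so a
        # functional minigene exists only after Cre unites both halves.
        return self.has_5_hprt and self.has_3_hprt and self.cre_recombined


if __name__ == "__main__":
    stages = {
        "single targeted": ESClone(has_5_hprt=True),
        "double targeted": ESClone(has_5_hprt=True, has_3_hprt=True),
        "after Cre": ESClone(has_5_hprt=True, has_3_hprt=True, cre_recombined=True),
    }
    for name, clone in stages.items():
        print(f"{name:16s} survives HAT: {clone.hat_resistant()}")
```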

2.3.2.3) generation of gene targeting vectors for chromosome engineering. Gene targeting vectors, such as those shown in Fig. 9, can be generated in the conventional way by sequentially inserting various genetic components into a plasmid construct (133, 190). However, the "two library" system (285) greatly reduces the number of cloning steps required for generating gene targeting vectors. This system is composed of two complementary libraries of pre-made gene targeting insertion vectors. The 5'-Hprt library was generated by cloning a genomic library into a vector backbone that contained the 5'-Hprt cassette, a loxP site, and neomycin (PGKneobpA) as a positive selectable marker. The complementary 3'-Hprt library was generated by cloning a genomic library into a vector backbone that contained the 3'-Hprt cassette, a loxP site, and puromycin (PGKpurobpA) as a positive selectable marker. In addition, both of these libraries are equipped with genes encoding visible coat color markers. The tyrosinase minigene (Ty) has been used to "tag" transgenes with a visible pigment marker in albino mice (176, 275), and the K14-Agouti transgene (Ag) uses the keratin-14 promoter to constitutively express agouti (114). Mice carrying the Ty transgene have a grayish coat on an otherwise albino background, and the Ag confers a "butterscotch" coat color in black agouti or non-agouti mice (285).

To make a defined rearrangement between any two desired loci, clones for each endpoint are first isolated from the appropriate library. Linearization of the construct within the genomic insert generates a gene targeting insertion vector that integrates at the target locus. Targeting can be assessed by Southern blot analysis using a DNA fragment that was removed from the genomic insert before gene targeting, by an external probe, or by PCR amplification using primers specific for the gap and vector. An additional feature of this gene targeting library system is that clones isolated from these libraries can also be used for analyzing single gene function via the knockout (null allele) approach, as demonstrated by targeted disruption of the p63 locus (149).

2.3.3) Chromosomal deletions.
2.3.3.1) uses for the engineering of chromosomal deletions. The chromosomal engineering of deletions can be used to identify TSGs without prior knowledge of the gene function. For example, mice possessing a deletion encompassing a putative TSG should exhibit increased tumorigenesis, as they only possess a single copy of a TSG, and tumor-specific loss or inactivation of the remaining allele can be used to clone the causative gene. Chromosome engineering can also be used to generate mouse models of human microdeletion syndromes. For example, mice heterozygous for a 1.2-Mb deletion between Es2 and Ufd1l on mouse chromosome 16 show cardiovascular abnormalities resembling those found in DiGeorge syndrome patients (130). When these mice were crossed to a strain harboring a duplication of the same region, the mutant phenotype was functionally rescued, indicating that the defects seen in DiGeorge patients are due to haploinsufficiency (loss of one functional copy of a gene). Thus the potential embryonic lethality of a particular deletion can be functionally determined by assessing the ability of the duplicated allele to rescue the embryonic lethality of the deletion (as mice containing both the deleted and duplicated alleles are genetically balanced).
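The argument that a duplication can rescue a deletion rests on simple copy-number arithmetic across the rearranged interval. The sketch below counts copies of the interval for each genotype; the allele labels "+", "Df" (deletion), and "Dp" (duplication) and the function name are notational assumptions.

```python
# Copies of the rearranged interval carried by each type of chromosome.
COPIES_PER_CHROMOSOME = {"+": 1, "Df": 0, "Dp": 2}


def copies_in_interval(genotype):
    """Total copies of the interval for a diploid genotype such as 'Df/+'."""
    first, second = genotype.split("/")
    return COPIES_PER_CHROMOSOME[first] + COPIES_PER_CHROMOSOME[second]


if __name__ == "__main__":
    for genotype in ("+/+", "Df/+", "Dp/+", "Df/Dp"):
        print(f"{genotype:6s} copies of interval: {copies_in_interval(genotype)}")
```

Under this counting, a mouse carrying both the deletion and the duplication has the same two copies as a wild-type mouse, which is the sense in which it is genetically balanced.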

The major factor limiting the generation of deletions in ES cells is the size of the rearranged interval, with deletions >22 cM leading to ES cell lethality or severe growth disadvantage of the cells in culture (286). Although Cre/loxP recombination has been shown to occur over such large distances, clones can emerge from these experiments that have undergone a compensatory genetic change, such as a chromosomal duplication (286). X-ray- and UV-induced mutagenesis have also been successfully performed to generate deletion complexes in ES cells. Radiation-induced deletions can be localized and made selectable by targeting a vector that carries a negative selection cassette to a predetermined locus; for example, homologous recombination can be used to insert a herpes simplex virus thymidine kinase (HSV-tk) gene into a specific locus, followed by irradiation and then selection for clones that have deleted the tk cassette (240, 276). In addition, deletion alleles in ES cells of several centimorgans have been successfully transmitted through the germ line (190, 276).

2.3.3.2) generation of nested chromosomal deletions. An extension of the chromosome engineering strategy is the generation of nested chromosomal deletions, a series of variably sized, overlapping deletions surrounding a predetermined genomic locus. This strategy has been used to identify that haploinsufficiency of the Tbx1 gene contained within the 1.2-Mb deletion interval of DiGeorge mice was responsible for the aortic arch defects seen in these patients (131). Furthermore, if the genomic locations of the endpoints are known, then nested deletions can be extremely useful for mapping novel recessive mutations (278). However, to get around the task of having to generate targeting vectors for the nested endpoints, retroviral integration of a second loxP site and selection cassette can be used (235). More specifically, deletion complexes can be anchored to a predetermined location in the genome by targeting the 5'-Hprt-loxP cassette. The 3'-Hprt-loxP cassette is then randomly inserted into the ES cell genome by retrovirus-mediated integration, generating a library of ES cell clones with the same targeted endpoint and a collection of random endpoints (Fig. 11). This method has been used to generate nested deletions extending from a few kilobases to several megabases at the Hprt locus, also on chromosome 11 (235). Electroporation has also been used to randomly insert the second loxP site and selection cassette into the ES cell genome (122). However, insertion by electroporation increases the risk of genomic rearrangements occurring at the insertion site, and tandem repeats of a vector may be introduced into the insertion site (although these should be reduced to a single locus by the activity of Cre on a head-to-tail concentrate) (278). Endpoints generated by random insertion can be defined by cloning the genomic DNA that flanks the deletion endpoints and by mapping these junction fragments onto a physical map of the region. 
Nested deletions can also be efficiently generated by irradiation (116, 240, 276), although they require additional extensive characterization to define each deletion interval (278).
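
The mapping logic behind a nested deletion series can be sketched in a few lines of Python. This is an illustrative sketch only, not a method taken from the cited studies: the `Deletion` class, the `candidate_interval` function, and the kilobase coordinates are hypothetical, and the sketch assumes that every deletion shares one anchored endpoint, that all variable endpoints lie on the same side of it, and that a single recessive locus is being mapped. A deletion that fails to complement the mutation must remove the locus, whereas a deletion that complements it must not.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Deletion:
    name: str
    anchor_kb: int    # targeted endpoint shared by every deletion in the series
    endpoint_kb: int  # variable endpoint created by the random insertion


def candidate_interval(deletions, fails_to_complement):
    """Return (lower_kb, upper_kb) bounding a recessive mutation, or None.

    fails_to_complement maps a deletion name to True when mutation/deletion
    animals show the mutant phenotype (the deletion uncovers the mutation).
    """
    anchors = {d.anchor_kb for d in deletions}
    if len(anchors) != 1:
        raise ValueError("nested deletions must share one anchored endpoint")
    anchor = anchors.pop()
    if any(d.endpoint_kb <= anchor for d in deletions):
        raise ValueError("sketch assumes all endpoints lie on one side of the anchor")

    uncovering = [d.endpoint_kb for d in deletions if fails_to_complement[d.name]]
    complementing = [d.endpoint_kb for d in deletions if not fails_to_complement[d.name]]
    if not uncovering:
        return None  # the mutation lies outside every deletion in the series

    lower = max(complementing, default=anchor)  # largest deletion that still complements
    upper = min(uncovering)                     # smallest deletion that uncovers
    if lower >= upper:
        raise ValueError("complementation results are inconsistent with a single locus")
    return lower, upper


# Hypothetical series anchored at 0 kb with four random endpoints.
series = [
    Deletion("del_A", 0, 50),
    Deletion("del_B", 0, 400),
    Deletion("del_C", 0, 1200),
    Deletion("del_D", 0, 3000),
]
results = {"del_A": False, "del_B": False, "del_C": True, "del_D": True}
print(candidate_interval(series, results))
```

With these made-up results the mutation is bounded by the endpoints of del_B and del_C; generating more endpoints inside that window would narrow the interval further.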



Fig. 11. Nested chromosomal deletions induced with a retroviral vector. The first deletion endpoint is fixed by targeting the 5'-Hprt cassette, the neomycin resistance gene (N), and a loxP site (black triangle) to a predetermined locus. The second step involves the integration of the 3'-Hprt cassette, the puromycin resistance cassette (P), and a second loxP site into a random site in the ES cell genome, using a recombinant retroviral vector. Depicted here are the G1 recombination events, for the retroviral orientations that result in nested deletions of the randomly targeted gene (the stars depict genetic markers within this gene). Cre catalyzes recombination between the loxP sites, resulting in expression of the Hprt gene and hence survival in HAT medium of ES cell clones carrying the recombinant chromosomes. The nested deletions can be identified from a pool of HAT-resistant clones on the basis of their sensitivity to G418 and puromycin. LTR, long terminal repeat.

 
2.3.4) Chromosomal translocations.
Chromosomal translocations are involved in the genesis of many types of human tumors, often as a result of the abnormal expression of cellular oncogenes or the creation of novel fusion genes (152, 189). Although mouse models for several human leukemias have been established by tissue-specific expression of fusion protein transgenes (138) and by knock-in constructs expressing a fusion protein under the control of the appropriate endogenous promoter (44), they fail to recapitulate the situation found in human translocation-induced tumors. For example, in these mice the fusion gene is present from the inception of embryogenesis, whereas in humans the chromosomal translocation is believed to occur at later stages (172, 274). In addition, these mice possess only one fusion protein, whereas in balanced chromosomal translocations two fusion proteins are generated (186). Thus, to accurately recapitulate chromosomal translocations found in human tumors, chromosome engineering is most effective (224, 249). Numerous mouse chromosomal translocations with predetermined breakpoints have been created using this strategy, including models of human leukemia-associated translocations such as t(8;21)(q22;q22) (32) and t(9;11)(p22;q23) (42).

Using the Cre/loxP system, chromosomal translocations are generated using the same techniques as for deletions, duplications and inversions (see Section 2.3.2), except that the loxP sites, oriented in the same direction relative to their respective centromeres, are targeted to nonhomologous (different) chromosomes. If the loxP sites are in opposite directions, then recombination will result in acentric chromosomes (without a centromere) and dicentric chromosomes (containing two centromeres), which cannot be stably maintained. Thus, to prevent the generation of acentric and dicentric chromosomes, only pairs of genes with the same transcriptional orientation relative to their centromeres can be engineered to produce fusion proteins (278). To generate a fusion protein from a chromosomal translocation, the targeting vectors need to be designed to allow the two genes to be linked through their introns (with the loxP site embedded in the breakpoint), so that RNA splicing will generate an in-frame fusion mRNA and protein (278). To induce the translocation, mice carrying both targeted endpoints can be crossed with transgenic mice expressing a tissue- and/or temporal-specific Cre. In addition, this strategy circumvents the problem of transmitting the translocation through the male germ line, as the presence of chromosomal translocations in male germ cells can cause infertility (141).
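
The orientation rules described here can be summarized as a small lookup, sketched below in Python. The `LoxPSite` class and `predict_rearrangement` function are hypothetical names introduced for illustration, and the sketch deliberately ignores whether two sites on the same chromosome lie in cis or in trans, which determines whether a deletion alone or a deletion plus a duplication is recovered.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LoxPSite:
    chromosome: str
    points_to_centromere: bool  # orientation relative to the site's own centromere


def predict_rearrangement(site_a, site_b):
    """Predict the product class of Cre recombination between two loxP sites."""
    same_orientation = site_a.points_to_centromere == site_b.points_to_centromere
    if site_a.chromosome == site_b.chromosome:
        # Same chromosome: direct repeats excise or duplicate the interval,
        # inverted repeats flip it.
        return "deletion/duplication" if same_orientation else "inversion"
    # Different chromosomes: only sites with matching orientation exchange
    # arms without creating chromosomes with zero or two centromeres.
    if same_orientation:
        return "balanced translocation"
    return "dicentric and acentric chromosomes"


# Hypothetical endpoints on two nonhomologous chromosomes.
print(predict_rearrangement(LoxPSite("chr_A", True), LoxPSite("chr_B", True)))
print(predict_rearrangement(LoxPSite("chr_A", True), LoxPSite("chr_B", False)))
```

A check of this kind is only a design aid: it makes explicit why the transcriptional orientation of both genes relative to their centromeres has to be established before the targeting vectors are built.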

2.3.5) Balancer chromosomes.
A balancer chromosome is one containing an inverted region(s). Balancer chromosomes can be generated using chromosomal engineering technology, as chromosomal inversions can be generated in mouse ES cells by successive gene targeting of a loxP site to the two endpoints, followed by Cre-mediated recombination between the two loxP sites (Fig. 10B) (284). In addition, the inversion can be marked with a dominant marker (such as the dominant K14-Agouti coat-color gene) so that progeny carrying the balancer can be readily identified.

As inversions suppress crossing over during meiotic recombination (the exchange of genetic information between homologous chromosomes), balancer chromosomes are genetic reagents that can be used for stock maintenance (to maintain the integrity of mutagenized chromosomes). Balancer chromosomes can also be used for large-scale mutagenesis screens (see Section 3.1.2). For example, in intercrosses between siblings that have inherited the balancer chromosome and a mutagenized chromosome, absence of non-balancer-carrying progeny indicates the presence of one or more recessive lethal mutations on the mutagenized chromosome (278, 284). Indeed, the first mouse balancer chromosome was constructed on chromosome 11, to facilitate the isolation of ENU-induced recessive mutations on mouse chromosome 11 (http://www.mouse-genome.bcm.tmc.edu/ENU/ENUHome.asp) (284). This balancer chromosome is based on a 24-cM inversion between the Trp53 gene and the Wnt3 gene (in addition, a coat color marker, K14-Agouti, has been inserted into the mutated Wnt3 locus). Balancers will also be useful to facilitate analysis and maintenance of other types of knockouts.
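
The reasoning behind the sibling intercross can be made concrete with a short Python sketch. The allele labels "Bal" (balancer) and "m" (mutagenized chromosome) and both function names are hypothetical, and the sketch assumes simple Mendelian segregation with no recombination between the two chromosomes, which is the property the inversion is meant to provide. It makes no assumption about the viability of progeny homozygous for the balancer.

```python
from collections import Counter
from itertools import product


def intercross(parent1=("Bal", "m"), parent2=("Bal", "m")):
    """Expected genotype fractions among progeny of two balancer carriers."""
    counts = Counter("/".join(sorted(pair)) for pair in product(parent1, parent2))
    total = sum(counts.values())
    return {genotype: n / total for genotype, n in counts.items()}


def suggests_recessive_lethal(observed_genotypes):
    """True when no progeny homozygous for the mutagenized chromosome are seen.

    In practice enough progeny must be scored for the absence to be meaningful;
    this sketch does not attempt that statistical judgment.
    """
    return "m/m" not in set(observed_genotypes)


print(intercross())
print(suggests_recessive_lethal(["Bal/m", "Bal/m", "Bal/Bal", "Bal/m"]))
```

Because the dominant coat-color marker identifies every animal that carries the balancer, the "m/m" class corresponds to the unmarked progeny, so its absence can be scored by eye.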

2.3.6) Homologous recombination in E. coli.
To simplify the generation of knockout constructs, recombineering technologies have been developed. This form of DNA engineering utilizes methods based on homologous recombination in E. coli that enable large segments of genomic DNA in BACs or P1 artificial chromosomes (PACs) to be modified and subcloned, without the need for restriction enzymes or DNA ligase.

2.3.6.1) RecA homologous recombination. Several recombination pathways have been identified in E. coli, including the RecA pathway. One homologous recombination-based system that allows modification of BACs in recombination-deficient E. coli is the temperature-sensitive shuttle-vector-based system (75, 170). This temperature-sensitive plasmid replicates in cells growing at the permissive temperature (30°C) but is lost in cells growing at the restrictive temperature (42–44°C) because its origin of replication cannot function (79). Introduction of the E. coli RecA gene into the temperature-sensitive shuttle vector allows the RecA-deficient host E. coli strain containing the BAC to become competent to perform homologous recombination of the resident BAC in vivo (150, 272). Using this method, transgenic mice have been generated by pronuclear injection of the modified BAC, and germ-line transmission of the intact BAC obtained (272).

2.3.6.2) ET recombination. Another recombination pathway in E. coli is the RecBCD pathway involved in repairing double-strand breaks (225). RecBCD unwinds and degrades DNA to generate 3' single-stranded DNA (ssDNA) tails, which are used by RecA to initiate recombination. However, E. coli possessing wild-type RecBCD degrade the introduced linear DNA molecules before recombination has proceeded; therefore, to restore recombination activity, a suppressor mutation that activates expression of a nuclease that produces 3' overhangs may be used (117). RecBCD can be inactivated by 1) the sbcA mutation, which removes a repressor for the endogenous lambdoid prophage, Rac, which in turn induces the expression of recE and recT [two Rac genes that encode homologous recombination functions (135)], or 2) the Gam protein of λ phage [in the presence of Gam, the λ phage-encoded recombination functions (the λ red genes) stimulate homologous recombination (156, 188)].

Engineering DNA by homologous recombination mechanisms involving the RecE/RecT and Redα/Redβ proteins in E. coli is termed "ET recombination" (also known as Red recombination and lambda-mediated cloning). ET recombination was pioneered by Stewart and colleagues (280), who showed that a PCR-amplified fragment of linear dsDNA, flanked by short regions of homology (42 bp) to a plasmid or BAC, can be efficiently targeted to that plasmid or BAC by electroporating the dsDNA into recBC sbcA strains (see http://www-db.embl-heidelberg.de/jss/servlet/de.embl.bk.wwwTools.GroupLeftEMBL/ExternalInfo/stewart/ETcloning-textonly.html). As shown in Fig. 12, ET recombination involves two steps: first, the amplification of a fragment of DNA by PCR with flanking regions of homology, plus the introduction of phage recombination functions into a BAC-containing bacterial strain; and second, the transformation of the cassette into the bacterial cells that contain a BAC and recombinase functions (the bacterial cells generate a recombinant in vivo, and detection of the recombinant is done by selection, counter-selection, or direct screening). ET recombination allows alterations such as point mutations, sequence insertions and/or deletions to be carried out at any position on a target DNA molecule and has greatly simplified the generation of transgenic and knockout constructs (reviewed in Refs. 43, 160).
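
The first step, amplifying a cassette with primers that carry the homology regions, can be sketched as follows in Python. This is a minimal illustration rather than a published protocol: the function names, the 20-nt priming length, the insertion coordinate, and the randomly generated toy sequences are all assumptions, and only the 42-bp homology length comes from the text. Each primer consists of a 5' homology arm copied from the target followed by a 3' region that anneals to the end of the cassette.

```python
import random

COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]


def design_primers(target, insert_at, cassette, arm_len=42, priming_len=20):
    """Return (forward, reverse) primers, each written 5' to 3'."""
    if insert_at < arm_len or insert_at + arm_len > len(target):
        raise ValueError("insertion point is too close to the end of the target")
    if len(cassette) < priming_len:
        raise ValueError("cassette is shorter than the priming region")
    left_arm = target[insert_at - arm_len:insert_at]
    right_arm = target[insert_at:insert_at + arm_len]
    forward = left_arm + cassette[:priming_len]
    # The reverse primer anneals to the top strand, so it is the reverse
    # complement of (cassette 3' end + right homology arm).
    reverse = reverse_complement(cassette[-priming_len:] + right_arm)
    return forward, reverse


def pcr_product(forward, reverse, cassette, priming_len=20):
    """Top strand of the amplified fragment: left arm + cassette + right arm."""
    left_arm = forward[:-priming_len]
    right_arm = reverse_complement(reverse)[priming_len:]
    return left_arm + cassette + right_arm


# Toy sequences generated with a fixed seed (not real genomic DNA).
rng = random.Random(0)
target = "".join(rng.choice("ACGT") for _ in range(200))
cassette = "".join(rng.choice("ACGT") for _ in range(60))

fwd, rev = design_primers(target, insert_at=100, cassette=cassette)
fragment = pcr_product(fwd, rev, cassette)
print(len(fwd), len(rev), len(fragment))
```

In this sketch the two arms are adjacent in the target, so recombination would insert the cassette without removing any sequence; choosing arms that are separated in the target would replace the intervening sequence with the cassette.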



Fig. 12. Recombineering steps to generate a bacterial artificial chromosome (BAC) recombinant. The first step in recombineering is to amplify the cassette of interest (such as a selectable marker or reporter gene; gray box) by PCR, using primers containing 40–50 bp of homology to the target site on the BAC (white box). The next step is to introduce phage recombination functions into a BAC-containing bacterial strain or introduce a BAC into a strain that carries recombination functions. The cassette can then be transformed into cells that contain the BAC and recombination functions, to allow generation of a recombinant clone [which are those containing the PCR-amplified DNA cassette inserted into the target gene (black box)]. Re