## Abstract

Genetical genomics approaches provide a powerful tool for studying the genetic mechanisms governing variation in complex traits. By combining information on phenotypic traits, pedigree structure, molecular markers, and gene expression, such studies can be used for estimating heritability of mRNA transcript abundances, for mapping expression quantitative trait loci (eQTL), and for inferring regulatory gene networks. Microarray experiments, however, can be extremely costly and time consuming, which may limit sample sizes and statistical power. Thus it is crucial to optimize experimental designs by carefully choosing the subjects to be assayed, within a selective profiling approach, and by cautiously controlling systematic factors affecting the system. Also, a rigorous strategy should be used for allocating mRNA samples across assay batches, slides, and dye labeling, so that effects of interest are not confounded with nuisance factors. In this presentation, we review some selective profiling strategies for genetical genomics studies, including the selection of individuals for increased genetic dissimilarity and for a higher number of recombination events. Efficient designs for studying epistasis are also discussed, as well as experiments for inferring heritability of transcriptional levels. It is shown that solving an optimal design problem generally requires a numerical implementation and that the optimality criteria should be intimately related to the goals of the experiment, such as the estimation of additive, dominance, and interacting effects, localizing putative eQTL, or inferring genetic and environmental variance components associated with transcriptional abundances.

- optimal design
- selective phenotyping
- transcriptional profiling
- gene expression
- expression quantitative trait loci

Modern techniques being used to unravel the genetic mechanisms governing variation in complex traits combine information on phenotypic traits, family (or pedigree) structure, molecular markers, and gene expression, and are generally referred to as genetical genomics, or quantitative genomics approaches (11, 22, 31, 42). For example, the transcriptional activity of genes, assessed by microarray experiments in genotyped individuals, has been treated as multiple phenotypic traits such that traditional quantitative trait locus (QTL) analysis has been used to search for polymorphisms associated with gene expression variability (the so-called expression QTL, or eQTL). Applications of such methodology can be found, for example, in Brem et al. (4), Hubner et al. (20), Morley et al. (38), Schadt et al. (45), and Yvert et al. (54).

Results of eQTL studies can be represented as in Fig. 1, in which a fictitious chromosome is depicted on both axes, with the molecular marker locations (10 markers) placed on the horizontal axis and the genes probed in the microarray slide (20 genes) placed on the vertical axis. For each gene, a genome scan is performed, and the significant QTL are represented by a dot (location point estimate) and a horizontal segment (location confidence interval). In genetical genomics studies hundreds of markers (in multiple chromosomes) and thousands of genes are generally considered in a single experiment. For an example of presentation of eQTL mapping results see Bing and Hoeschele (3) and Lan et al. (32), among others.^{1}

In the illustrative example of Fig. 1, five genes (denoted g_{1}–g_{5}) present at least one significant QTL affecting their transcriptional levels. Some eQTL locations coincide with the region where the gene whose expression is being studied resides, such as the eQTL found for *gene 1*: both *gene 1* and its eQTL are located around *marker 3*. This indicates that the transcriptional activity of *gene 1* may be partially modulated by polymorphisms in *gene 1* itself, a process referred to as *cis*-acting. Other eQTL, however, are located elsewhere in the genome, denoting polymorphisms at specific loci that contribute to variation in the expression of genes in different regions of the genome. This process, referred to as *trans*-acting, can be studied to understand how genes interact and how they cluster in gene networks.

Throughout this paper (to facilitate the discussion on the advantages and disadvantages of alternative experimental strategies) the terms “epistasis” and “*trans*-acting effects” are used to represent two variants of gene interaction. Epistasis refers to the classical definition of the joint effect of alleles at two or more segregating loci (i.e., how the combined effect of genotypes at multiple loci differs from the sum of each genotype effect alone) and how it contributes to variation in phenotypes (which may also represent transcriptional activity of genes). Epistasis is thus defined similarly to the statistical interaction among two or more factors (8), but with factors and their levels represented by loci and their genotypes, respectively. Epistasis involving two biallelic loci can then be factored into specific components such as additive × additive, additive × dominance, dominance × additive, and dominance × dominance interactions; higher order terms can also be studied if more than two loci are considered. *Trans*-acting effects, on the other hand, represent the effect of a specific polymorphism on the transcriptional abundance of another gene, which may not even be polymorphic. The *trans*-acting effect of a biallelic locus [e.g., single nucleotide polymorphism (SNP)] on the expression of a specific gene can be factored into additive and dominance *trans*-acting components, similarly to any quantitative phenotype.

Genetical genomics studies provide valuable information regarding gene interactions (both epistatic effects and *trans*-acting factors), allelic variants of a gene responsible for its own accentuated or attenuated transcriptional activity (*cis*-acting factors), and eQTL hot-spots, i.e., chromosomal regions affecting the expression of multiple genes within a pleiotropic context (e.g., the region between *markers 3* and *4* in Fig. 1, which is found to be associated with the transcriptional activity of *genes 1, 2, 3,* and *5*). This information can be combined with QTL analysis of phenotypic traits (such as economically important agricultural traits or human disease-related traits). It furthers our understanding of the genetic complexity underlying variation of such traits and is useful for generating candidate genes for target pharmaceutical drug development and for selecting molecular markers to be used in marker-assisted breeding programs in agriculture.

In addition to eQTL mapping, genetical genomics approaches have also been used to study whole genome *trans*-acting effects of specific loci, such as candidate genes or transgenes (40), to estimate heritability of mRNA transcript abundances (13, 18, 37) using information on related subjects and to infer regulatory gene networks (3, 6, 9, 35, 55). For a review of available methods for the statistical analysis of genetical genomics data see, for example, Alberts et al. (1), Carlborg et al. (7), Kadarmideen et al. (25), Kendziorski and Wang (28), and Rosa et al. (44).

As genetical genomics studies involve expensive and labor-intensive high-throughput laboratory techniques (50), such as SNP scoring and gene expression profiling using microarray and/or quantitative reverse transcription polymerase chain reaction (qRT-PCR), careful experimental design of such trials is critical for their success. In the following sections, we discuss some design strategies for genetical genomics studies, using simple language and avoiding excessive mathematical formalism, and provide some general design guidelines related to different experimental goals, such as the comparison of expression levels of different genotypic groups for candidate genes, eQTL mapping, and the estimation of heritabilities of mRNA transcript abundances.

## DESIGN OF GENETICAL GENOMICS STUDIES

The planning of a genetical genomics study entails a variety of aspects. Similarly to any QTL mapping experiment it requires, for example, the choice of breeds or lines to be used, as well as the experimental design, such as backcross (BC), F_{2}, granddaughter design, etc. In addition, gene expression studies involve careful thought regarding the cell type(s), tissue(s), and the developmental stage(s) to be assayed.

Specifically with respect to gene expression microarray experiments, researchers are also faced with the questions of which microarray platform best suits their specific experiment goals and how many slides should be considered for the desired experiment efficiency. In addition, after the choice of a microarray platform and the number of slides to be used, researchers should still decide the subjects to be assayed, as well as how to pair samples within slides and how to assign dye labeling (e.g., Cy3 and Cy5 dyes), in the case of two-color technologies, and how to organize the hybridizations across assay batches.

The choice of microarray platform is generally guided by the availability of alternative technologies for the organism being studied. For example, cDNA microarrays are available only for species from which expressed sequence tags (ESTs) were obtained from cDNA libraries. Likewise, slides using oligonucleotide probes can be generated only if DNA sequence is available, and an optimal probe set can be designed only for those species with completely sequenced genomes. Under these circumstances, a broader range of commercial and homemade array platforms is generally available for experiments involving human subjects or model organisms, as opposed to most livestock or wildlife species. In addition, given the current costs associated with microarray hybridization experiments, the size of the experiments (i.e., number of slides considered) is most often dictated by a budget constraint.

Therefore, in this paper we focus our discussion on two specific statistical issues of the experimental design of genetical genomics studies: first, the subset selection of individuals for gene expression assaying (also known as selective phenotyping, or selective profiling), and second, the microarray experimental set-up, especially when making use of two-color systems, such as cDNA or long oligo arrays.

### Selective Profiling

In the past, genotyping costs used to limit the sample sizes of gene mapping studies. To overcome this problem and increase the power of such studies for a fixed number of experimental units, selective genotyping approaches have been considered (2, 12, 33, 34). In genetical genomics studies, phenotyping can also be extremely expensive because of the high costs of gene expression profiling via microarrays. Thus measuring gene expression for only a subset of available individuals is a natural strategy for reducing the cost of eQTL mapping experiments. In the next subsections we discuss alternative selective profiling approaches for different experimental goals of genetical genomics trials.

#### Genetic dissimilarity.

A proposed strategy for selective phenotyping uses marker information for subset selection for increased genetic dissimilarity (23), with the goal of maximizing the power for QTL detection. The procedure compares the marker genotypes of all subjects available (the so-called full mapping panel) and uses an algorithm based on the experimental design concept of minimum moment aberration (MMA) to select a subsample of individuals to be phenotyped. The procedure may either consider information on all available markers or alternatively target specific regions of the genome thought to be important for the trait(s) of interest. Given *m* markers, MMA measures similarity for a subsample of *n* individuals as the average of all pairwise similarities, given by:

$$K = \frac{1}{p} \sum_{i<j} s_{ij},$$

where s_{ij} is the similarity measurement between individuals i and j, and p = n(n − 1)/2 is the total number of pairs of individuals.

Jin and colleagues (23) used the number of alleles two individuals share (0, 1, or 2) as a measure of similarity, so that s_{ij} is the sum of the number of alleles in common over all markers considered. To allow comparisons across experiments of different sizes, the authors considered a standardized version of the similarity measure, called score (S), given by:

$$S = \frac{M - K}{R},$$

where M is the maximum possible value of K, and R is the difference between the maximum and the minimum possible values.
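The average pairwise similarity described above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that genotypes are coded as counts of the B allele (0 = A, 1 = H, 2 = B), so that the number of alleles shared at a marker is 2 minus the absolute difference of the two codes. The function names are hypothetical, and the exhaustive search shown here is not the algorithm of Jin and colleagues (23); it is only feasible for very small panels.

```python
from itertools import combinations


def pair_similarity(ind_i, ind_j):
    """Sum over markers of the number of alleles shared (0, 1, or 2).

    Assumed coding: each genotype is the count of B alleles (0, 1, or 2).
    """
    return sum(2 - abs(a - b) for a, b in zip(ind_i, ind_j))


def average_similarity(genotypes):
    """Average of all pairwise similarities over the p = n(n - 1)/2 pairs."""
    pairs = list(combinations(range(len(genotypes)), 2))
    total = sum(pair_similarity(genotypes[i], genotypes[j]) for i, j in pairs)
    return total / len(pairs)


def best_subsample(panel, n):
    """Brute-force search for the n individuals with the lowest average
    similarity (illustration only; impractical for real panel sizes)."""
    return min(
        combinations(range(len(panel)), n),
        key=lambda idx: average_similarity([panel[k] for k in idx]),
    )


# Toy single-marker panel with two A, two H, and two B individuals.
panel = [(0,), (0,), (1,), (1,), (2,), (2,)]
chosen = best_subsample(panel, 4)
print(chosen, average_similarity([panel[k] for k in chosen]))
```

In this toy panel the search keeps the four homozygous individuals in a 1:1 ratio, which agrees with the tendency of the dissimilarity criterion to favor homozygotes for different alleles.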

Jin and collaborators (23) used a simulation study to illustrate the increase in the efficiency of QTL detection when using this selective phenotyping approach, compared with a random sample from the mapping panel. An F_{2} experiment was considered, with a single chromosome and evenly spaced (10 cM) markers, and a single QTL with heritability h^{2} = 0.25, 0.50, and 0.75. Varying mapping panels (*N* = 50–200), sample sizes (*n* = 10–*N*), and proportion of individuals selected (10–90%) were assessed as well. Their results show that, for a fixed subsample size (*n* = 50), the sensitivity (i.e., the percentage of simulation runs in which the QTL was detected) increased with mapping panel size, leveling off when the proportion of selected subjects reached 50% (for h^{2} = 0.25). In a situation with a fixed mapping panel (*N* = 100), as the proportion of subjects selected increased, the sensitivity improved much faster when using the selective phenotyping approach than with a random sample from the mapping panel. But again there was not much improvement on sensitivity with >50% of subsampling, especially for higher heritability scenarios, suggesting that most of the information needed for QTL detection is retained with 50% of selective phenotyping.

As discussed by these authors, their genetic dissimilarity criterion tends to select individuals that are predominantly homozygous for different alleles. For example, in an F_{2} mapping panel population derived from inbred *lines A* and *B*, a 1:2:1 ratio of A:H:B genotypes is expected. Their genetic dissimilarity approach samples from the mapping panel such that a 1:1 ratio between homozygous individuals in the subsample is favored. This procedure, however, is recommended only if the focus of the experiment is on additive effects. While additive effects are usually considered the most important and most prevalent among all (23), they are also the easiest ones to detect. For estimating more complex gene action effects, however, other genotypes are required in the subsample, so alternative selection criteria should be considered instead. For example, heterozygous individuals are required in the subsample if one wants to infer dominance effects as well. Jin and colleagues (23) suggested that similarity could be defined as 1 for the same genotype and 0 for different genotypes at each marker if interest refers to general QTL effects. Moreover, the authors indicated that their MMA criterion, which corresponds to the first moment (or mean) of the similarity measure across the individuals, optimizes selection for nonepistatic effects. The second moment (or variance) would further optimize for epistatic QTL.

The MMA criterion is conceptually simple and easy to implement, but its current theory relies on complete data and independent factors. For dealing with missing genotypes, Jin and collaborators (23) used a data imputation approach using the Haldane mapping function and the information on flanking markers. To minimize the correlation from genetic linkage, the authors suggest the selection of widely spaced markers for computing similarity measures.

It is important to mention that classical interval mapping approaches may produce biased estimates for QTL effects when selective genotyping is considered (33). In contrast, as demonstrated by Jin and colleagues (23), interval mapping is robust against selective phenotyping, meaning that inference obtained by analyzing only the selected subjects is representative of the whole population. Finally, the authors suggest that a two-stage selective phenotyping could considerably reduce the cost and increase the power of large experiments: a first stage of genome-wide selection could identify promising genomic regions, which would then be used for marker-based selective phenotyping in a second stage. Applications of the selective phenotyping approach proposed by Jin and collaborators (23), in the context of genetical genomics, can be found for example in Lan and colleagues (32).

#### Genetic complementarity.

As discussed above, increasing genetic dissimilarity maximizes the power only for detecting additive effects. Important nonadditive effects, which are actually generally more difficult to estimate, may be missed if not specifically targeted when designing the experiment. To illustrate this concept, consider a situation (Fig. 2) with a single locus and three possible genotypes (homozygotes A and B and the heterozygote H), for which the expected phenotypic values are represented, respectively, by μ_{A}, μ_{B}, and μ_{H}.

The additive effect is defined as the difference between μ_{A} or μ_{B} and the average (μ) between μ_{A} and μ_{B} or, similarly, equal to half the difference between μ_{A} and μ_{B}. The dominance effect is defined as the difference between μ_{H} and μ. So, it is clear that with information on both homozygous groups one can calculate the additive effect, but the dominance effect can be computed only with information on the three genotypic groups. In practice, however, the phenotypic means (μ_{A}, μ_{B}, and μ_{H}) are unknown so they need to be inferred from experimental data. A linear model for the analysis of such data can be expressed as:

$$y_{ij} = \mu + G_{i} + e_{ij},$$

where y_{ij} represents the phenotypic observation on replication j of genotype i (with j = 1,…,n_{i}; i = A, B, and H; and n_{i} being the sample size for genotype i), μ is a general constant (defined here as the average between the expected phenotypic values of the homozygous genotypes), G_{i} is the effect of genotype i, and e_{ij} is a residual term associated with the observation y_{ij}. The residuals are generally assumed normally distributed with mean 0 and variance σ^{2}, representing polygenic and environmental factors affecting the phenotype. Note that in genetical genomics the phenotype can be the transcriptional abundance (often log transformed) relative to either the polymorphic locus A/B or any other locus of interest.

An analysis of variance could be considered to study the effect of genotype, as well as to estimate the residual variance σ^{2}. An estimate of the additive effect can be obtained by a linear contrast involving the averages of the homozygous groups, i.e., α̂ = (μ̂_{A} − μ̂_{B})/2 = (ȳ_{A} − ȳ_{B})/2. For estimating the dominance effect the contrast must also involve the heterozygous group average, as δ̂ = μ̂_{H} − (μ̂_{A} + μ̂_{B})/2 = ȳ_{H} − (ȳ_{A} + ȳ_{B})/2. It can be shown that the variances of α̂ and δ̂ are, respectively, (1/r_{A} + 1/r_{B}) × σ^{2}/(4N) and (4/r_{H} + 1/r_{A} + 1/r_{B}) × σ^{2}/(4N), where r_{i} is the proportion of individuals with genotype i and N = n_{A} + n_{B} + n_{H}. Figure 3*A* depicts the variance of the estimates of the additive and dominance effects as a function of the proportion of heterozygous individuals (r_{H}) in the sample, with the two homozygous groups (A and B) represented in equal proportions, i.e., r_{A} = r_{B} = (1 − r_{H})/2. The variance for additive effects is always smaller than that for the dominance effects. Also, the variance for additive effects is minimized when r_{H} is small, as proposed by the selection criterion discussed by Jin and colleagues (23). On the other hand, the variance for the dominance effects increases sharply as r_{H} decreases, and it goes to infinity as r_{H} approaches zero (i.e., the dominance effects simply cannot be estimated if there are no H individuals in the sample). Note also that the variances of both the additive and the dominance effects grow without bound as r_{H} approaches 1, reflecting the obvious conclusion that neither additive nor dominance effects can be estimated if only H individuals are represented in the sample.
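The contrasts and variance expressions given above can be checked numerically with a short Python sketch. The function names and the default values σ^{2} = 1 and N = 100 are assumptions chosen only for illustration.

```python
def contrast_estimates(y_a, y_b, y_h):
    """Additive and dominance estimates from the three genotype group means."""
    mean = lambda values: sum(values) / len(values)
    alpha_hat = (mean(y_a) - mean(y_b)) / 2
    delta_hat = mean(y_h) - (mean(y_a) + mean(y_b)) / 2
    return alpha_hat, delta_hat


def effect_variances(r_a, r_h, r_b, sigma2=1.0, n_total=100):
    """Variances of the additive and dominance estimates.

    r_a, r_h, r_b are the genotype proportions in the sample; a group with
    zero proportion makes the corresponding variance infinite.
    """
    scale = sigma2 / (4.0 * n_total)
    if r_a <= 0 or r_b <= 0:
        return float("inf"), float("inf")
    var_add = (1 / r_a + 1 / r_b) * scale
    var_dom = (4 / r_h + 1 / r_a + 1 / r_b) * scale if r_h > 0 else float("inf")
    return var_add, var_dom


# Sweep r_H with balanced homozygous groups, r_A = r_B = (1 - r_H) / 2.
for r_h in (0.1, 0.3, 0.5, 0.7, 0.9):
    r_hom = (1 - r_h) / 2
    var_add, var_dom = effect_variances(r_hom, r_h, r_hom)
    print(f"r_H={r_h:.1f}  var(add)={var_add:.4f}  var(dom)={var_dom:.4f}")
```

With balanced homozygous groups the dominance variance equals the additive variance plus σ^{2}/(N r_{H}), which is why the additive effect is always estimated more precisely.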

In Fig. 3*B*, the proportion of H individuals is fixed at its optimal value r_{H} = 0.5 (i.e., the proportion that minimizes the variance for dominance effects), and the proportion of each homozygous group is changed. As expected, the best scenario is a balance between the two homozygous groups, i.e., r_{A} = r_{B} = 0.25. But more importantly, note that the variance for the additive effects is always smaller than that for the dominance effects, even when there is a strong imbalance between the A and B groups.

With special interest on dominance effects, Keller and collaborators (26) and Piepho (41) discussed an alternative selective profiling criterion that favors a 1:2:1 ratio of A:H:B genotypes in the subsample. It is shown that even with an A:H:B ratio of 1:2:1 (i.e., the ratio that maximizes the precision of dominance effects estimates), the variance of the additive effect estimate for a specific locus is still half the variance of its dominance effects estimate (Fig. 3*A*). The authors, however, considered a situation with two inbred lines and their hybrids, such that only two haplotype configurations are possible for each chromosome. Under these circumstances, it is not possible to determine if a differential gene expression observed across the three genotypic groups for any specific gene is due to the allelic variation in that gene (*cis*-acting) or to allelic variation on other loci (*trans*-acting) of the genome or to a combination of such effects. For example, consider an experiment to compare the transcriptional activity of three genes (1, 2, and 3) between two lines (A and B) with genotypes A_{1}A_{1}A_{2}A_{2}A_{3}A_{3} and B_{1}B_{1}B_{2}B_{2}B_{3}B_{3} and their hybrid A_{1}B_{1}A_{2}B_{2}A_{3}B_{3}. If a higher transcriptional abundance on *gene 1* is observed for individuals with genotype A_{1}B_{1} compared with the average of the two homozygous genotypes (A_{1}A_{1} and B_{1}B_{1}), it may be due to a *cis*-acting dominance effect on *gene 1*, as well as *trans*-acting dominance effects of *genes 2* or *3*, such as transcriptional factors or regulatory effects.

Keller and collaborators (26) and Piepho (41) use the terms heterosis and dominance interchangeably when referring to the overexpression of genes on H individuals compared with the average of the parent *lines A* and *B*. We understand heterosis is a more appropriate terminology in this case because, as discussed above, one cannot ensure whether the overexpression of a specific gene is due to any specific locus (or small set of loci) or to polygenic effects. The only way an experiment could provide information to disentangle these effects would be by allowing loci to recombine, such as by carrying the crosses to at least an F_{2} generation. More generations may be necessary to increase the probability of recombination among closely linked loci. In any event, a selective profiling criterion may be used to select individuals carrying desired allelic combinations across target loci.

A general approach in this regard was proposed by Bueno and colleagues (5). Their selective phenotyping criterion is based on what is coined here genetic complementarity, in which the subset selection of subjects depends on the goal of the experiment. For example, if a central goal of an F_{2} line cross experiment is to infer *trans*-acting dominance effects of a candidate gene, the selection criterion will tend to sample (similarly to the above) a subset of subjects for a 1:2:1 ratio of A:H:B genotypes. Conversely, if both additive and dominance effects are equally sought, then the subset selection will tend toward a 0.293:0.414:0.293 ratio of A:H:B genotypes. It is important to notice that ideally (but not necessarily) the candidate gene(s) should be in linkage equilibrium with other genes probed in the microarray slide. Complete linkage disequilibrium, on the other hand, leads the effects to be confounded, as discussed above for the case with inbred lines.
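The 0.293:0.414:0.293 ratio can be recovered from the variance expressions for the additive and dominance estimates given earlier. The short derivation below assumes that "equally sought" means minimizing the sum of the two variances with balanced homozygous groups, r_{A} = r_{B} = (1 − r_{H})/2; this criterion is an assumption made here for illustration, since the text does not state which criterion Bueno and colleagues (5) used.

$$
\begin{aligned}
\operatorname{Var}(\hat{\alpha}) + \operatorname{Var}(\hat{\delta})
  &= \frac{\sigma^{2}}{4N}\left(\frac{4}{r_{H}} + \frac{2}{r_{A}} + \frac{2}{r_{B}}\right)
   = \frac{\sigma^{2}}{N}\left(\frac{1}{r_{H}} + \frac{2}{1 - r_{H}}\right), \\
\frac{d}{dr_{H}}\left(\frac{1}{r_{H}} + \frac{2}{1 - r_{H}}\right)
  &= -\frac{1}{r_{H}^{2}} + \frac{2}{(1 - r_{H})^{2}} = 0
  \;\Longrightarrow\; 1 - r_{H} = \sqrt{2}\, r_{H}, \\
r_{H} &= \sqrt{2} - 1 \approx 0.414, \qquad
r_{A} = r_{B} = \frac{1 - r_{H}}{2} = 1 - \frac{\sqrt{2}}{2} \approx 0.293 .
\end{aligned}
$$

Under the same setup, minimizing the dominance variance alone gives r_{H} = 0.5, i.e., the 1:2:1 ratio.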

Bueno and colleagues (5) also discussed situations with multiple loci, including epistatic effects, i.e., the combined effect of alleles in two or more loci on the expression of another locus. It is shown that if only additive and additive × additive effects are to be estimated, an experiment involving K biallelic loci will correspond to a factorial of the series 2^{K}. If dominance effects and interactions (epistasis) involving dominance effects are sought as well, a 3^{K} factorial experiment should be considered. Simpler experiment layouts can be utilized if epistatic effects are not of interest, but more complicated experimental scenarios (such as fractional factorial structures) may be necessary depending on the number of loci and the number of microarray slides considered, as well as the genetic material available (e.g., some specific allelic combinations across multiple loci may be absent due to rare allelic frequencies or to low recombination rates between closely linked loci).
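The factorial structures mentioned above can be enumerated with a brief Python sketch. The labels A, H, and B for the genotype classes and the function names are illustrative assumptions. The second function lists the cells of the factorial that are not represented in a given panel, the situation described above for rare alleles or closely linked loci.

```python
from itertools import product


def genotype_combinations(k, include_heterozygotes):
    """All cells of a 2^K (A, B) or 3^K (A, H, B) factorial over K loci."""
    levels = ("A", "H", "B") if include_heterozygotes else ("A", "B")
    return list(product(levels, repeat=k))


def missing_cells(panel, k, include_heterozygotes):
    """Factorial cells with no individual in the panel (multilocus tuples)."""
    observed = set(map(tuple, panel))
    return [
        cell
        for cell in genotype_combinations(k, include_heterozygotes)
        if cell not in observed
    ]


print(len(genotype_combinations(3, False)))  # 2^3 cells
print(len(genotype_combinations(3, True)))   # 3^3 cells

# Two linked loci where one recombinant class was never observed.
panel = [("A", "A"), ("A", "B"), ("B", "B"), ("A", "A")]
print(missing_cells(panel, 2, False))
```

When cells are missing, or when the number of slides is smaller than the number of cells, a fractional factorial structure would have to be chosen from the cells that are available.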

#### Recombination rates.

The genetic complementarity approaches discussed above consider situations with candidate genes (5), or when subjects belong to a few possible genotypic groups, such as inbred lines and F_{1} (26, 41). In either case, there is no need to estimate the location of eQTL. Many genetical genomics experiments, however, relate to eQTL mapping studies, such as the example depicted in Fig. 1. In such situations, interest refers not only to detection of eQTL, but also to localizing those putative eQTL, as well as to estimating their *cis*- or *trans*-acting effects.

The genetic dissimilarity methodology suggested by Jin and collaborators (23) for eQTL mapping maximizes the power of detection of eQTL with additive effects but may not be optimal for inferring the location of those eQTL. For example, consider a doubled haploid (DH) experiment with three linked, ordered loci. In such a situation, a pair of nonrecombinant individuals (i.e., individuals with genotypes A_{1}A_{1}A_{2}A_{2}A_{3}A_{3} and B_{1}B_{1}B_{2}B_{2}B_{3}B_{3}) has the same genetic dissimilarity value as a pair of double recombinant individuals (i.e., individuals with genotypes A_{1}A_{1}B_{2}B_{2}A_{3}A_{3} and B_{1}B_{1}A_{2}A_{2}B_{3}B_{3}). While these two pairs of individuals carry the same amount of information regarding additive effects of putative eQTL within the chromosomal segment between *loci 1* and *3*, only the second pair carries information regarding the number of eQTL (one vs. two), as well as the position of such eQTL.
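The three-locus example above can be reproduced with a small Python sketch. It assumes a string coding in which each character gives the parental origin (A or B) of one locus of a homozygous DH individual; the function names are hypothetical.

```python
def shared_alleles(ind_i, ind_j):
    """Alleles shared by two DH individuals, summed over loci.

    Each locus is homozygous, so a locus contributes 2 if the two
    individuals carry the same parental allele and 0 otherwise.
    """
    return sum(2 if a == b else 0 for a, b in zip(ind_i, ind_j))


def count_recombinations(individual):
    """Number of adjacent locus pairs with different parental origin."""
    return sum(1 for left, right in zip(individual, individual[1:]) if left != right)


nonrecombinant_pair = ("AAA", "BBB")
double_recombinant_pair = ("ABA", "BAB")

for pair in (nonrecombinant_pair, double_recombinant_pair):
    similarity = shared_alleles(*pair)
    recombinations = sum(count_recombinations(ind) for ind in pair)
    print(pair, "shared alleles:", similarity, "recombinations:", recombinations)
```

Both pairs share no alleles, so a dissimilarity criterion cannot distinguish them, whereas the recombination counts differ (none vs. four observed events).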

With the goal of maximizing the efficiency of localizing eQTL, de Leon and Rosa (15) proposed a selective phenotyping criterion to maximize the number of recombination events in the subset sample. In a simulation study involving a DH experiment with 10 evenly spaced markers, a single QTL, and different mapping panel sizes and subsampling rates, the authors concluded that selective profiling based on recombination rates substantially improved the precision of the QTL position estimates (even compared with a genetic dissimilarity-selective criterion focused on markers near the QTL), with no sizeable detrimental effect on either the detection power or the precision of inferences regarding the QTL effect. Similar results were presented by Jannink (21) and Xu and colleagues (52), who performed even broader simulations, with varying marker spacing, map length, and number of QTL.

More general methodologies of selective phenotyping based on recombination rates were proposed by Jannink (21) and Xu and collaborators (52). Their approaches favor not only an increased number of recombination events in the subsample, but also an even distribution of recombinations across the genome. With use of such procedures, a pair of individuals with genotypes A_{1}A_{1}B_{2}B_{2}B_{3}B_{3} and B_{1}B_{1}B_{2}B_{2}A_{3}A_{3} (i.e., two individuals showing recombinations in different chromosomal regions) would be preferred over a pair of individuals with genotypes A_{1}A_{1}B_{2}B_{2}B_{3}B_{3} and B_{1}B_{1}A_{2}A_{2}A_{3}A_{3} (i.e., individuals with recombinations observed in only one of the chromosomal segments).

Xu and colleagues (52) used as the objective function the so-called sum of squares of bin lengths (SSBL), where a bin was defined on a sample of individuals as an interval along the linkage group within which there were no crossovers in any sampled individual and which was bounded on each side either by a crossover in at least one individual or by the end of a linkage group. Minimizing the SSBL yields a sample of individuals in which crossovers are more frequent and the distance between them is less variable. Alternatively, Jannink (21) proposed a selective profiling criterion (coined uniRec) based on the overall sum of d_{ij} = c_{ij}/m_{i} (across subjects and marker intervals), where c_{ij} = 1 if progeny j is recombinant in marker interval i (and c_{ij} = 0 otherwise), and m_{i} is the map distance (cM) between the markers flanking interval i. In addition, the author considered a selection criterion (called maxRec) based on the number of recombinations only (i.e., the overall sum of c_{ij} values, without weighting them by map distances), like the one discussed by de Leon and Rosa (15). Both Xu and collaborators (52) and Jannink (21) concluded that selective profiling based on the number (and distribution) of recombinations significantly increased the accuracy of QTL position estimates, especially for smaller genetic map lengths.
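The three criteria can be sketched in Python directly from the definitions given above. Several simplifications are assumed here for illustration only: DH individuals are coded as strings of parental origins (A or B) at ordered markers, crossover positions for the SSBL are supplied directly in cM rather than inferred from marker data, and the function names are hypothetical rather than taken from the cited papers.

```python
def recombinant_flags(individual):
    """c_ij for one progeny: 1 if marker interval i is recombinant, else 0."""
    return [1 if left != right else 0 for left, right in zip(individual, individual[1:])]


def max_rec(sample):
    """maxRec-style score: total number of recombinations in the sample."""
    return sum(sum(recombinant_flags(ind)) for ind in sample)


def uni_rec(sample, interval_cm):
    """uniRec-style score: sum of c_ij / m_i over progeny and marker intervals."""
    return sum(
        flag / interval_cm[i]
        for ind in sample
        for i, flag in enumerate(recombinant_flags(ind))
    )


def ssbl(crossovers_cm, length_cm):
    """Sum of squares of bin lengths for one linkage group.

    Bins are bounded by the pooled crossover positions of all sampled
    individuals and by the two ends of the linkage group.
    """
    points = {0.0, float(length_cm)} | {float(x) for ind in crossovers_cm for x in ind}
    ordered = sorted(points)
    return sum((right - left) ** 2 for left, right in zip(ordered, ordered[1:]))


# Markers at 0, 50, and 100 cM; crossovers assumed at interval midpoints.
spread_pair = ["ABB", "BBA"]   # recombinations in different intervals
stacked_pair = ["ABB", "BAA"]  # both recombinations in the first interval
print(max_rec(spread_pair), max_rec(stacked_pair))
print(ssbl([[25], [75]], 100), ssbl([[25], [25]], 100))
```

Both pairs carry the same number of recombinations, but the pair with crossovers in different intervals has the smaller SSBL, so a criterion that minimizes SSBL prefers it.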

The selective profiling strategies discussed above improve eQTL mapping efficiency in different ways compared with a random sample from the full mapping panel. The approaches of Bueno et al. (5), Jin et al. (23), Keller et al. (26), and Piepho (41) maximize the power of detection of specific QTL effects, while the procedures suggested by de Leon and Rosa (15), Jannink (21), and Xu et al. (52) improve the QTL mapping resolution. No research, however, has been published on alternatives to combine both strategies, i.e., to increase the number and homogeneity of recombinations on the subsample while favoring specific genotypic proportions across the selected individuals. Additional research and simulations in this area would certainly be welcome to further improve the benefits and flexibility of selective phenotyping approaches in genomics research, especially for situations in which a large genotyped population is available [such as with recombinant inbred lines (RIL)], and when the phenotypic assays are onerous and costly (such as with transcriptional and translational output assays).

#### Covariance among subjects.

A completely distinct research goal that has been considered in the genetical genomics literature relates to inferring variance components and heritability of transcriptional activity (18, 37). In these circumstances, treatments (which may refer either to family structures or to related individuals in a complex pedigree) are treated as random effects. A selective phenotyping criterion for these cases should take into account the genetic relatedness among the available subjects to maximize the precision of the variance components or heritability estimates. Bueno and collaborators (5) discussed an algorithm for a subselection procedure with random treatments and presented some examples involving half sibs, full sibs, and complex pedigrees. The best designs are shown to be very specific for each pedigree and estimation objective. An R function was developed for finding optimal designs with any covariance structure among treatments.

### Microarray Experiment Layout

After a set of subjects has been selected for gene expression profiling, another important step in the experimental design is required, especially for two-color microarray platforms. Specifically, a microarray experiment layout should be optimized regarding the allocation of mRNA samples within assay batches, slides, dye labeling, and other local control factors (5, 51).

#### Reference and loop designs.

The most widely used experimental layouts for two-color microarray experiments are the so-called reference and loop structures (30, 53). In the reference design (Fig. 4*A*), a single sample (the reference sample) is hybridized with every sample from each of the treatments or experimental groups. Dye-swap is sometimes considered, but it is not mandatory. The main advantage of the reference design is its simplicity; it is straightforward to conduct in the lab. Its disadvantage, however, is that half of the observations refer to the reference sample, which is not of primary interest. Conversely, the loop design (Fig. 4*B*) is a more complex structure, in which each sample must be labeled with both dyes and cohybridized with samples from other experimental groups, with alternating dye labeling. Nonetheless, the loop structure has been shown to be generally more efficient than the reference design (29, 30, 48, 53).
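The efficiency gap between the two layouts can be checked numerically. The sketch below is only an illustration under simplifying assumptions that are not taken from the cited analyses: each slide contributes one log-ratio with independent error of variance σ², dye and array effects are ignored, and five groups are assayed on five slides in both layouts. The function names `contrast_variance` and `average_variance` are hypothetical.

```python
from fractions import Fraction
from itertools import combinations

def contrast_variance(n_nodes, slides, i, j):
    """Variance (in units of sigma^2) of the estimated contrast t_i - t_j
    when each slide (a, b) yields one log-ratio y = t_a - t_b + e."""
    # The information matrix X'X is the graph Laplacian of the design.
    lap = [[Fraction(0)] * n_nodes for _ in range(n_nodes)]
    for a, b in slides:
        lap[a][a] += 1
        lap[b][b] += 1
        lap[a][b] -= 1
        lap[b][a] -= 1
    # Contrast vector for t_i - t_j.
    c = [Fraction(0)] * n_nodes
    c[i] += 1
    c[j] -= 1
    # Fix the last node at zero to obtain a full-rank system, then solve
    # (reduced Laplacian) x = c by Gauss-Jordan elimination.
    m = n_nodes - 1
    aug = [lap[r][:m] + [c[r]] for r in range(m)]
    for col in range(m):
        piv = next(r for r in range(col, m) if aug[r][col] != 0)
        aug[col], aug[piv] = aug[piv], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(m):
            if r != col and aug[r][col] != 0:
                f = aug[r][col]
                aug[r] = [vr - f * vc for vr, vc in zip(aug[r], aug[col])]
    return sum(c[r] * aug[r][m] for r in range(m))

groups = range(5)
reference = [(g, 5) for g in groups]        # node 5 is the reference sample
loop = [(g, (g + 1) % 5) for g in groups]   # A-B, B-C, C-D, D-E, E-A

def average_variance(n_nodes, slides):
    """Average variance over all pairwise contrasts among the five groups."""
    pairs = list(combinations(groups, 2))
    return sum(contrast_variance(n_nodes, slides, i, j) for i, j in pairs) / len(pairs)

print(average_variance(6, reference))  # 2
print(average_variance(5, loop))       # 1
```

Under these assumptions, the loop layout halves the average variance of pairwise group comparisons for the same number of slides. The comparison would change if dye effects, unequal replication, or failed slides were modeled.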

Some alternatives to the classical reference and loop layouts are discussed in Rosa et al. (43), Steibel and Rosa (46), and Tempelman (47), who compare efficiency and robustness of designs combining different levels of biological replication (i.e., subjects within the experimental groups) and technical replication (e.g., replicated arrays for each biological sample). The reference design depicted in Fig. 4*A* presents a single replication for each experimental group, so evidently multiple replications of such an experiment would be necessary for statistical inference purposes. Moreover, in Fig. 4*B*, the two samples of each group (A to E) may refer to either two aliquots of the same mRNA sample from each group or to two independent samples (two subjects) from each group. In the first case there would be only technical replication for each mRNA sample, and the five slides in Fig. 4*B* would represent a single biological replication of the experiment. The second scenario would represent a minimally replicated balanced loop design with five experimental groups. More biological replications are generally needed for a reasonable experimental precision and efficiency. For a discussion on technical vs. biological replication please refer to, for example, Churchill (10), Rosa et al. (43), and Tempelman (47).

Most previous papers on the design of two-color microarray experiments, however, considered situations in which there was no genetic component distinguishing the experimental groups, so their results are not directly applicable to genetical genomics studies. Consequently, possibly due to unawareness of better design alternatives, genetical genomics experiments utilizing two-color microarrays are generally conducted using reference designs with dye-swap (4, 45, 54). Recently, however, some studies proposing more efficient experimental set-ups for genetical genomics were presented, which are discussed below.

#### Distant pairing.

In the genetical genomics context, Fu and Jansen (16) proposed a design strategy for allocating pairs of RIL samples to two-color microarray slides. Their approach, called distant pair design, is based on two basic principles. First, for a given number of slides, it is generally more efficient to increase biological replication rather than technical replication; and second, samples should be paired such that an increased ratio of within- over between-slides genotypic dissimilarity is obtained.

For example, consider an experiment with two slides to study the effects of allelic variation on the gene expression of three loci (1, 2, and 3). Consider also that four samples are available with the following genotypes: A_{1}A_{1}B_{2}B_{2}B_{3}B_{3}, A_{1}A_{1}A_{2}A_{2}B_{3}B_{3}, B_{1}B_{1}A_{2}A_{2}A_{3}A_{3}, and B_{1}B_{1}B_{2}B_{2}A_{3}A_{3}. In this case, a more efficient estimation of genetic effects is obtained by pairing *samples 1* and *3* in one slide, and *samples 2* and *4* in the other.
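The pairing in this example can be reproduced by enumerating every way of splitting the four samples into two slides. The sketch below is a simplified illustration: it scores a design by the total number of loci at which cohybridized samples differ, which is an assumed stand-in for the dissimilarity criterion of Fu and Jansen (16), and the helper names are hypothetical.

```python
def dissimilarity(g1, g2):
    """Number of loci at which two homozygous genotypes carry different alleles."""
    return sum(a != b for a, b in zip(g1, g2))

def pairings(items):
    """Yield every way of splitting `items` into unordered pairs."""
    if not items:
        yield []
        return
    first, rest = items[0], items[1:]
    for k, partner in enumerate(rest):
        remaining = rest[:k] + rest[k + 1:]
        for tail in pairings(remaining):
            yield [(first, partner)] + tail

# One letter per locus (loci 1, 2, 3); "A" stands for AA and "B" for BB.
samples = {1: "ABB", 2: "AAB", 3: "BAA", 4: "BBA"}

def within_slide_total(design):
    """Total within-slide genotypic dissimilarity of a design."""
    return sum(dissimilarity(samples[i], samples[j]) for i, j in design)

designs = list(pairings(sorted(samples)))
best = max(designs, key=within_slide_total)
print(best, within_slide_total(best))  # [(1, 3), (2, 4)] 6
```

Pairing *samples 1* and *3*, and *samples 2* and *4*, is the only design in which the cohybridized samples differ at all three loci; the two alternative designs reach totals of 2 and 4.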

The distant pairing approach presented by Fu and Jansen (16) can also be combined with a selective profiling step whenever the mapping panel is bigger than the number of samples intended to be used in the microarray experiment. The selective phenotyping can be performed using either the genotypic dissimilarity approach proposed by Jin et al. (23) or the recombination-based approaches of Jannink (21) or Xu et al. (52).

#### Efficient designs to estimate dominance effects.

The distant pair design approach is recommended only when interest lies exclusively in additive effects or when only two genotypes are possible for each locus, such as with RIL, BC, or DH populations. If other effects are of interest, an alternative pairing strategy may be necessary. For instance, Piepho (41) discussed efficient designs for two-color microarray experiments when the main interest is the estimation of dominance effects. It is shown that in such cases slides cohybridizing samples from heterozygous against homozygous individuals are more informative, and therefore preferable, than slides comparing homozygous subjects for different alleles. As an example, consider a single locus situation with three possible genotypes: A, B, and H. Efficient designs for inferring dominance effects should pair samples A with H, and B with H. Obviously, such experiments should include multiple slides of each pair comparison (using independent samples from each genotypic group, i.e., biological replication), with alternating dye labeling.
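A small calculation illustrates why slides pairing heterozygous with homozygous samples carry the information on dominance. The sketch below is not Piepho's derivation; it assumes genotypic values μ + a, μ − a, and μ + d for genotypes A, B, and H, one log-ratio per slide with independent error variance σ², and no dye effects. The names `CODING`, `slide_row`, and `variances` are hypothetical.

```python
from fractions import Fraction

# Coefficients of the (additive, dominance) effects in each genotypic value.
CODING = {"A": (1, 0), "B": (-1, 0), "H": (0, 1)}

def slide_row(g_red, g_green):
    """Coefficients of (a, d) in the expected log-ratio of one slide."""
    return tuple(r - g for r, g in zip(CODING[g_red], CODING[g_green]))

def variances(slides):
    """Return (var_additive, var_dominance) in units of sigma^2, or None
    when the two effects are not jointly estimable from the slides."""
    rows = [slide_row(*s) for s in slides]
    saa = sum(r[0] * r[0] for r in rows)
    sad = sum(r[0] * r[1] for r in rows)
    sdd = sum(r[1] * r[1] for r in rows)
    det = saa * sdd - sad * sad
    if det == 0:
        return None
    # Diagonal of the inverse of the 2 x 2 information matrix.
    return Fraction(sdd, det), Fraction(saa, det)

print(slide_row("A", "B"))                   # (2, 0): no information on d
print(variances([("A", "H"), ("B", "H")]))   # variance of d is 1/2
print(variances([("A", "B"), ("A", "H")]))   # variance of d is 5/4
print(variances([("A", "B"), ("A", "B")]))   # None
```

With two slides, the A vs. H plus B vs. H design estimates the dominance effect with variance σ²/2, compared with 5σ²/4 when one of the slides compares the two homozygotes, and slides comparing only homozygotes do not allow the dominance effect to be estimated at all.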

#### General approach.

As discussed previously, the approach proposed by Piepho (41), either for selecting subjects for microarray screening or for pairing samples within slides, applies only to inbred lines and their hybrids or to situations targeting a single biallelic locus. A more general approach for searching for optimal genetical genomics designs (including an optimal dye assignment and pairing of samples across slides) was proposed by Bueno and colleagues (5). Similarly to their selective phenotyping approach based on genetic complementarity, samples are allocated to slides and dyes favoring hybridizations that provide more informative contrasts relative to the genetic parameters of interest. The results presented using examples with multiple loci generalize the distant pair concept for inferring additive effects, as well as Piepho's design strategy for inferring dominance effects. It is shown that the optimal design depends on the effect(s) of interest and on how they are weighted in the optimality criterion.

For example, if inferences are focused on additive effects, the optimal design will resemble a distant pairing strategy. Likewise, if dominance effects are the main effects of interest, the resulting design will favor slides comparing heterozygous vs. homozygous subjects. However, if interaction terms (epistatic effects) are also considered, the design structures get more complex. In such situations, the allocation of samples within slides should take into account their combination of genotypes across multiple loci. For example, a distant pair design considering two loci in a DH population would tend to hybridize only slides with samples A_{1}A_{1}A_{2}A_{2} vs. B_{1}B_{1}B_{2}B_{2}, and A_{1}A_{1}B_{2}B_{2} vs. B_{1}B_{1}A_{2}A_{2}. Nonetheless, additional genotypes should be paired if additive × additive epistatic effects are of interest as well. For instance, a pair of slides comparing the genotype A_{1}A_{1}A_{2}A_{2} against A_{1}A_{1}B_{2}B_{2} and the genotype B_{1}B_{1}A_{2}A_{2} against B_{1}B_{1}B_{2}B_{2} provides more precise information regarding how the additive effect of *locus 2* changes according to the genotype on *locus 1*, i.e., A_{2}A_{2} vs. B_{2}B_{2} when *locus 1* genotype is A_{1}A_{1}, and A_{2}A_{2} vs. B_{2}B_{2} when *locus 1* genotype is B_{1}B_{1}, respectively.
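The two-locus DH example can be made concrete by writing out what each slide measures. The sketch below assumes a +1/−1 coding for the A and B alleles at each locus and a genotypic value x1·a1 + x2·a2 + x1·x2·aa, which is a common parameterization but an assumption here rather than the model used by Bueno and colleagues (5). Genotypes are written as two letters, one per locus, and the function names are hypothetical.

```python
def effect_coding(genotype):
    """Coefficients of (a1, a2, aa) for a two-locus homozygous genotype,
    e.g. "AB" means A1A1 at locus 1 and B2B2 at locus 2."""
    x1, x2 = (1 if allele == "A" else -1 for allele in genotype)
    return (x1, x2, x1 * x2)

def slide_contrast(sample_red, sample_green):
    """Coefficients of (a1, a2, aa) in the expected log-ratio of one slide."""
    red, green = effect_coding(sample_red), effect_coding(sample_green)
    return tuple(r - g for r, g in zip(red, green))

# Distant pairs: the epistatic coefficient cancels on both slides.
print(slide_contrast("AA", "BB"))  # (2, 2, 0)
print(slide_contrast("AB", "BA"))  # (2, -2, 0)

# Pairs differing only at locus 2: the epistatic effect enters with opposite signs.
print(slide_contrast("AA", "AB"))  # (0, 2, 2)
print(slide_contrast("BA", "BB"))  # (0, 2, -2)
```

Under this coding, the distant pair slides are uninformative about the additive × additive term because its coefficient is zero on both, whereas the difference between the last two slides isolates it.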

Another generalization of the distant pairing concept proposed by Fu and Jansen (16) for finite loci was also presented by Bueno and colleagues (5) for experiments aiming at the estimation of variance components and heritability of gene expression. It is shown that given a sample of subjects with a certain relatedness structure (pedigree), the search algorithm tends to pair less related individuals in each competitive hybridization.

## CONCLUDING REMARKS

This paper discusses the design of microarray experiments for different goals of genetical genomics studies, such as the comparison of expression levels of different genotypic groups, eQTL mapping studies, or estimation of heritabilities of mRNA transcript abundances. Choosing a good microarray design for a genetical genomics study consists of two steps: selective profiling (or selective phenotyping), i.e., the selection of subjects to be assayed; and the microarray experiment layout, which refers to the allocation of pairs of samples to slides and the assignment of dye labeling (red and green). These two steps are also referred to in the statistical literature as treatment choice and treatment-to-unit allocation, respectively.

The selective phenotyping step depends on the genetic or biological material available and on the goal of the experiment, so it is similar for any microarray platform being used, such as single channel high density oligonucleotide technology (e.g., Affymetrix) or two-color spotted slides (using either cDNA or long oligonucleotide probes). The microarray experiment layout, however, depends also on the microarray technology considered. In the case of single-channel platforms, as each sample is assayed in an independent slide, the experiment set-up is straightforward. Conversely, with two-color microarrays, there are always numerous ways of pairing samples and assigning dyes. The simplest alternative in this regard refers to the reference design, which resembles a single-channel experiment layout. However, it is possible to take advantage of the possibility of assaying two samples in each slide by searching for more general structures that provide increased efficiency and precision for the experiments.

In general (and especially from the statistical point of view), biological replication should be preferred over technical replication. For example, if a reference design with 2*n* microarray slides is considered, better statistical precision is obtained if 2*n* subjects are assayed instead of a dye-swap structure (i.e., reverse labeling of each biological sample and the reference) with *n* subjects in two slides each. As discussed in this paper, even more efficient experiments may be sought in the context of row-column (slides and dyes) structures, by searching for optimal (or near-optimal) designs for specific goals of the experiment (5, 51). Dye-swap (or any other level of technical replication) may be considered if a limited number of biological samples are available; however, statistical tests should take into account such hierarchical replication structure (i.e., subjects and slides within subjects) when performing significance testing (43). Another common strategy in microarray experiments is to pool samples as an attempt to reduce biological variability (10, 27). In genetical genomics studies, however, pooling is generally not advised, except in a few cases such as with RIL (28) or other experiments involving genetically identical individuals.
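The preference for biological replication follows from simple variance arithmetic. The sketch below assumes a model that is not spelled out in the text: each slide measurement equals the group mean plus a subject effect with variance `var_bio` plus an independent slide error with variance `var_tech`, and dye effects are ignored. The function name and the numerical values are hypothetical.

```python
from fractions import Fraction

def var_group_mean(n_subjects, slides_per_subject, var_bio, var_tech):
    """Variance of a group mean estimated from n_subjects subjects,
    each assayed on slides_per_subject slides."""
    return (Fraction(var_bio, n_subjects)
            + Fraction(var_tech, n_subjects * slides_per_subject))

n = 4                      # illustrative value; 2n = 8 slides in both cases
var_bio, var_tech = 3, 1   # illustrative variance components

print(var_group_mean(2 * n, 1, var_bio, var_tech))  # 1/2: 2n subjects, one slide each
print(var_group_mean(n, 2, var_bio, var_tech))      # 7/8: n subjects, dye-swapped
```

With the same number of slides, the two layouts share the technical term `var_tech / (2n)`, so the dye-swap layout is less precise by exactly `var_bio / (2n)` under this model.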

Another interesting issue with microarray experiments, which is not addressed in this paper but which certainly relates to genetical genomics studies as well, is that of power and sample size calculation. Because of the high-dimensional nature of microarray assays, power calculation should be based on the false discovery rate (FDR) concept, as proposed by Dobbin and Simon (14), Gadbury et al. (17), Hu et al. (19), Jung (24), and Muller et al. (39). However, genetical genomics studies usually involve multiple, hierarchical sources of variation within a mixed effects model context (43). Extensions of the FDR-based power calculations for genetical genomics studies are not yet available but would certainly be extremely useful.

In this paper we discuss selective phenotyping based on information on genetic markers or relatedness among individuals. Alternatively, selective transcriptional profiling may be based on traditional phenotypes (36) or on combinations of trait and marker data (49) or trait and family structures (44). The suitability of each approach will depend on the goal of the experiment and on the assumptions of the model. Generally, however, the subsample selection mechanism should be taken into account when analyzing the observed data (49).

While for small or simple experiments an analytical solution for the optimality problem may be possible, more complex scenarios (such as in situations with multiple allelic and interacting loci or with complex pedigrees) require a numerical solution using a search algorithm. An optimal design is guaranteed only with a full search (i.e., by comparing all possible designs), but this is generally infeasible for larger design spaces. Search algorithms (such as simulated annealing or genetic algorithms) can be used instead to find optimal (or near-optimal) designs, but they may be computationally demanding and time consuming. Common sense may then be used either to constrain the design space to be searched, by eliminating design structures that are clearly inadequate (such as designs with strong imbalance in dye labeling), or to come up with a reasonable starting point for the search algorithm.
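As a rough illustration of such a numerical search, the sketch below applies simulated annealing to the problem of pairing samples on slides. It is not the algorithm of any of the cited papers: the criterion (total within-slide allelic dissimilarity), the swap move, the cooling schedule, and all names and parameter values are assumptions chosen for brevity, and an even number of samples is required.

```python
import math
import random

def hamming(g1, g2):
    """Number of loci at which two genotypes differ."""
    return sum(a != b for a, b in zip(g1, g2))

def score(order, genotypes):
    """Total within-slide dissimilarity; slides are consecutive pairs in `order`."""
    return sum(hamming(genotypes[order[k]], genotypes[order[k + 1]])
               for k in range(0, len(order) - 1, 2))

def anneal_pairing(genotypes, n_iter=2000, temp0=1.0, cooling=0.995, seed=0):
    """Search for a pairing with high within-slide dissimilarity."""
    rng = random.Random(seed)
    current = list(range(len(genotypes)))   # starting design: samples in given order
    cur_score = score(current, genotypes)
    best, best_score = current[:], cur_score
    temp = temp0
    for _ in range(n_iter):
        # Propose a neighboring design by swapping two samples.
        i, j = rng.sample(range(len(current)), 2)
        cand = current[:]
        cand[i], cand[j] = cand[j], cand[i]
        delta = score(cand, genotypes) - cur_score
        # Always accept improvements; accept worse designs with a
        # probability that shrinks as the temperature decreases.
        if delta >= 0 or rng.random() < math.exp(delta / temp):
            current, cur_score = cand, cur_score + delta
            if cur_score > best_score:
                best, best_score = current[:], cur_score
        temp *= cooling
    return best, best_score

# Eight hypothetical inbred lines genotyped at six loci.
rng = random.Random(1)
genotypes = ["".join(rng.choice("AB") for _ in range(6)) for _ in range(8)]
start = list(range(len(genotypes)))
best, best_score = anneal_pairing(genotypes)
print("starting design:", score(start, genotypes), "best found:", best_score)
```

Because the best design visited is stored, the result is never worse than the starting design, but there is no guarantee that it is the global optimum. For a problem this small a full enumeration would be feasible and preferable.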

## GRANTS

This work was supported by United States Department of Agriculture Grant 2004-33120-15204 to G. J. M. Rosa.

## Footnotes

1 The 2nd International Symposium on Animal Functional Genomics was held May 16–19, 2006 at Michigan State University in East Lansing, MI, and was organized by Jeanne Burton of Michigan State University and Guilherme J. M. Rosa of University of Wisconsin-Madison (see meeting report by Drs. Burton and Rosa, *Physiol Genomics* 28: 1-4, 2006).

Address for reprint requests and other correspondence: G. J. M. Rosa, 460 Animal Science Bldg., 1675 Observatory Dr., Univ. of Wisconsin - Madison, Madison, WI 53706 (e-mail: grosa{at}wisc.edu).

Article published online before print. See web site for date of publication (http://physiolgenomics.physiology.org).

- Copyright © 2007 the American Physiological Society