# Info

72°C for 2 min

Repeat for 40 cycles. Final extension at the end: 5 min

5. Run the PCR products on an agarose gel. Blot the gel and probe the filter with a fragment representing YFG. If a positive signal is obtained for any of the 30 superpools, proceed to the next step. If no positive signals are obtained, and experimental error has been ruled out, the conclusion must be that there are no insertions in YFG in this collection. The process must start again using a different collection.

The probability of finding an insertion in YFG in a population of a certain size can be calculated from the Poisson distribution, assuming that the distribution of insertions is random. For example, let us say that the coding sequence of the gene of interest is 2 kb. If the size of the haploid genome is 126,000 kb, then the fraction of the genome that is the target is 2/126,000 = 0.0000159. Say that the population being screened contains 60,000 different insertions. The mean number of insertions in the gene of interest is 60,000 × 0.0000159 = 0.95. The probability that the population contains x insertions in YFG is given by the Poisson distribution: $p(x) = (e^{-\mu}\mu^{x})/x!$, where $\mu$ is the mean number of insertions and $x$ is a particular number of insertions. In our example, the probabilities of 0, 1, and 2 insertions in YFG are 0.39, 0.37, and 0.17, respectively.

Blotting and probing are necessary to confirm the presence or absence of product (some products may be present in vanishingly small quantities). The exercise also confirms that the product has the right sequence.

6. Each superpool is represented by nine subpools, each representing 225 T-DNA insertion lines. Repeat the PCR that gave a positive signal in Step 5 using the appropriate nine subpools as templates. Make sure that the product obtained is the same size as that obtained in Step 5; confirm that the sequence is correct by blotting and probing as in Step 5.

If possible, it is a good idea to sequence the fragment amplified from the subpools. This will determine exactly where the insertion lies. If it is outside the coding region, or otherwise in an undesirable position, the search for the individual plant carrying this particular insertion can be abandoned at this point, without wasting any more time.

7. Each of the nine subpools is represented by 25 seed pools, each pool containing seed from nine T-DNA insertion lines. Working with the positive subpool identified in Step 6, grow ~100 plants from each of the 25 seed pools. Combine tissue from the plants grown from each seed pool. Purify DNA for each seed pool, and repeat the PCR to determine which seed pool contains the desired insertion.

8. Prepare DNA from at least 100 individual plants from the positive seed pool. Repeat the PCR to identify at least one individual plant that contains the insertion.
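The arithmetic behind Steps 5 to 8 can be checked with a short sketch. It recomputes the Poisson probabilities from the worked example in Step 5 and counts how many lines each pooling level represents. The variable names are illustrative only, and the count of 100 individual plants uses the minimum given in Step 8.

```python
import math


def poisson_pmf(x: int, mean: float) -> float:
    """Probability of exactly x insertions when the expected number is `mean`."""
    return math.exp(-mean) * mean ** x / math.factorial(x)


# Values from the worked example in Step 5.
gene_kb = 2
genome_kb = 126_000
insertions = 60_000

# Mean number of insertions in the gene (about 0.95, unrounded here).
mean_hits = insertions * gene_kb / genome_kb

probs = [poisson_pmf(x, mean_hits) for x in range(3)]
p_at_least_one = 1 - probs[0]

# Pool hierarchy described in Steps 5 to 8.
superpools = 30
subpools_per_superpool = 9
seed_pools_per_subpool = 25
lines_per_seed_pool = 9
individual_plants = 100  # minimum number of plants tested in Step 8

lines_per_subpool = seed_pools_per_subpool * lines_per_seed_pool
lines_per_superpool = subpools_per_superpool * lines_per_subpool
total_lines = superpools * lines_per_superpool

# Templates screened if exactly one pool is followed at each level.
pcr_templates = (superpools + subpools_per_superpool
                 + seed_pools_per_subpool + individual_plants)

for x, p in enumerate(probs):
    print(f"P({x} insertions) = {p:.2f}")
print(f"P(at least one insertion) = {p_at_least_one:.2f}")
print(f"Lines per subpool: {lines_per_subpool}, total lines: {total_lines}")
print(f"PCR templates screened along one path: {pcr_templates}")
```

The subpool size computed here (225 lines) matches the figure given in Step 6, and the total is close to the 60,000 insertions assumed in the Poisson example.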

### Considerations for Phenotypic Characterization

Most T-DNA insertions that cause altered phenotypes do so by eliminating or reducing the function of the gene they interrupt. Such loss-of-function phenotypes are usually only apparent in homozygotes, and thus plants must be obtained that are homozygous for the insertion. This can be accomplished by using PCR. The PCR product used to isolate the insertion line serves to indicate whether or not at least one chromosome carries the insertion. In general, the 5' and 3' YFG primers can be used to amplify a product from the uninterrupted gene, but not the gene with the insertion. If these two primers are too far apart to yield a PCR product, new primers can be designed, one on either side of the insertion site. By carrying out two PCRs, it is possible to distinguish homozygous wild type (only the 5' and 3' YFG primers yield products) from hemizygous (both sets of primers give products) from homozygous mutants (only the set of primers that amplify the sequence flanking the insertion yield a product). It is possible that 1 of the 100 plants from the pool of nine in Step 6 is homozygous. If it is, this plant and its descendants can be used for phenotypic testing. Otherwise, homozygous plants can be identified among the progeny of a hemizygous plant.
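The two-PCR genotyping logic described above can be written as a small lookup. This is only a sketch: the function name and the boolean inputs (whether each primer set gave a product) are illustrative, and the "no call" case for two failed reactions is an added assumption that the text does not discuss.

```python
def call_genotype(wild_type_product: bool, insertion_product: bool) -> str:
    """Infer the genotype of one plant from two PCRs.

    wild_type_product: the 5' and 3' YFG primers gave a product
        (uninterrupted gene present).
    insertion_product: the primer set that amplifies the sequence
        flanking the insertion gave a product (insertion present).
    """
    if wild_type_product and insertion_product:
        return "hemizygous"
    if wild_type_product:
        return "homozygous wild type"
    if insertion_product:
        return "homozygous mutant"
    # Assumption: neither product most likely means a failed reaction.
    return "no call"


# Hypothetical results for three plants: (wild-type PCR, insertion PCR).
results = {"plant 1": (True, True), "plant 2": (False, True), "plant 3": (True, False)}
for plant, (wt, ins) in results.items():
    print(plant, "->", call_genotype(wt, ins))
```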

If the insertion mutant has a phenotype, the T-DNA insertion in YFG may or may not be the cause of the observed phenotype. There may be another T-DNA insertion that causes the phenotype, or it may be due to a point mutation. There are three ways to test whether the insertion in YFG is the cause of the phenotype: (1) Cross the homozygous insertion line to wild type, and look for cosegregation of the phenotype with the homozygous mutant genotype. If the mutation is the cause of the phenotype, they will always cosegregate. However, this test cannot exclude the possibility that the phenotype is caused by a second mutation that is closely linked to the insertion. (2) Find another insertion in YFG. If both insertion mutations, or even better trans-heterozygotes derived from crossing both insertions, cause the same phenotype, it is virtually certain that the mutations in YFG are the cause of the phenotype. (3) Transform plants homozygous for the insertion with the wild-type version of YFG. If the phenotype reverts to wild type, the mutation in YFG must have been the cause of the phenotype. Even if tests 2 and/or 3 indicate that the insertion causes the phenotype, it is a good idea to backcross the insertion line to wild type several (two to three) times, as there are often secondary mutations in T-DNA lines that may have subtle effects on phenotypes of interest.

## Beyond Mutants: Using Natural Variation to Identify Interesting Genes

Apart from mutations induced in the laboratory, an important resource for the identification of genes affecting different biological processes is the genetic variation that exists in natural populations. In these populations, special alleles, which have been selected for during evolution, might represent variants that are difficult or impossible to recreate by mutagenesis in the laboratory (Alonso-Blanco and Koornneef 2000).

There are principally two ways in which natural variations become manifest. In a simple case, natural variation between different strains is due to one or two loci of major effect. In such a case, the parental phenotypes will segregate in an F2 population 1:3 or 1:15. Examples of genes that were discovered as such major loci are CAL, FLC, and FRI (Bowman et al. 1993; Lee et al. 1993; Clarke and Dean 1994; Koornneef et al. 1994). In many cases, however, the genetic basis is complex and involves many loci. Because each locus contributes to the trait of interest in a quantitative manner, these loci are called quantitative trait loci (QTL).
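The 1:3 and 1:15 ratios follow from simple Mendelian arithmetic, sketched below. The sketch assumes unlinked loci with complete dominance and assumes that the recessive parental phenotype appears only in plants homozygous recessive at every locus; these assumptions are not stated in the text, and the function name and the sample size of 160 plants are illustrative.

```python
from fractions import Fraction


def recessive_class_ratio(n_loci: int) -> tuple[int, int]:
    """Expected F2 ratio of recessive to dominant phenotype classes.

    Assumes unlinked loci, complete dominance, and that the recessive
    phenotype requires homozygous recessive alleles at all n_loci.
    """
    recessive = Fraction(1, 4) ** n_loci
    return recessive.numerator, recessive.denominator - recessive.numerator


f2_plants = 160  # hypothetical F2 population size
for n in (1, 2):
    rec, dom = recessive_class_ratio(n)
    expected_recessive = f2_plants * rec / (rec + dom)
    print(f"{n} locus/loci: {rec}:{dom}, "
          f"about {expected_recessive:.0f} of {f2_plants} plants recessive")
```

Counts that deviate clearly from both expectations suggest a more complex genetic basis.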

If there is reason to believe that different natural populations of Arabidopsis, called accessions or ecotypes, show differences in the trait of interest, investigators can obtain many such accessions from the stock centers. To begin with, select accessions that come from geographically diverse locations, such as Southwestern Europe, Northern Europe, and Central Europe/Asia. Start out by measuring the trait of interest in these different accessions. If natural variation is found, the selected accession can be crossed to an accession that shows a different behavior, preferably one of the common laboratory strains such as Columbia. In the F1, it can be determined whether either of the parents behaves dominantly, and in the F2, it can be ascertained how complex the genetic basis is.

Further analysis depends on whether there are only one to two loci of major effect. If this is the case, it should be possible to use regular mapping strategies as described in Chapter 6. In the case of dominant alleles with strong effect, it is often helpful to reduce the background effects by other loci of minor effect through repeated backcrosses (five to ten) prior to further genetic analysis. Repeated backcrossing will create near isogenic lines (NILs); ideally, an NIL will contain only a small region of the genome from one of the parents in a background that is otherwise derived from the other parent.
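To see why five to ten backcrosses are suggested, the expected share of donor genome left after each backcross can be estimated. The sketch assumes that the F1 carries half of its genome from the donor parent and that each backcross to the recurrent parent halves the donor share in regions unlinked to the selected locus. It ignores linkage drag around the selected allele, so the real donor segment will be larger near that locus. The function name is illustrative.

```python
from fractions import Fraction


def expected_donor_fraction(n_backcrosses: int) -> Fraction:
    """Expected donor-genome fraction after n backcrosses of an F1.

    Assumes no selection and no linkage to the selected locus: the F1
    is 1/2 donor, and each backcross halves that share.
    """
    return Fraction(1, 2) ** (n_backcrosses + 1)


for n in (0, 1, 5, 10):
    frac = expected_donor_fraction(n)
    print(f"after {n} backcrosses: {frac} ({float(frac):.3%})")
```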

If there are more than two loci of major effect, statistical methods must normally be used for mapping. These cannot be described in detail here (for an introduction, see Alonso-Blanco and Koornneef 2000), but they require genotyping each F2 plant at many loci across the genome. Depending on how precisely the trait of interest can be measured, it may be better to carry out further mapping using recombinant inbred lines (RILs). RILs are generated by single-seed descent over a minimum of six generations from individual F2 plants derived from crosses between different accessions. Because of repeated selfing, recombinant chromosomes tend to become homozygous, and an RIL will therefore consist of individuals that are largely genetically identical. RILs have several advantages. First, they allow repeated measurements of a single genotype, which decreases experimental error and increases precision in QTL prediction. Second, genome-wide genotyping of RILs is performed only once, but many different traits can be measured in a given set of RILs. For this reason, a number of generous investigators are making their sets of RILs, together with the accompanying genotyping information, available through the stock centers.
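The effect of repeated selfing on heterozygosity can be illustrated with a short calculation. It assumes that an F2 plant is heterozygous at half of the loci that differ between the parents and that each generation of selfing halves the remaining heterozygosity. Reading "six generations" as six rounds of selfing starting from the F2 is an interpretation, and the function name is illustrative.

```python
def expected_heterozygosity(selfing_generations_after_f2: int) -> float:
    """Expected fraction of polymorphic loci still heterozygous.

    Assumes 1/2 heterozygosity in the F2 and a halving with every
    further generation of selfing.
    """
    return 0.5 * 0.5 ** selfing_generations_after_f2


for generations in range(0, 7):
    h = expected_heterozygosity(generations)
    print(f"F{generations + 2}: {h:.2%} heterozygous")
```

Under these assumptions, less than 1% of polymorphic loci remain heterozygous after six generations, which is why individuals of one RIL are largely genetically identical.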

Importantly, different accessions may contain different QTL, even if their phenotype is otherwise very similar. Each of them may contain different combinations of positive and negative QTL, whose net result is the same. A case in point is the EDI flowering-time locus discovered in a cross between Landsberg and Cape Verde Island; these two accessions, as pure-bred strains, have very similar flowering-time behaviors, indicating that the difference at the EDI locus is normally masked by other, balancing QTL (Alonso-Blanco et al. 1998). Consequently, it is worth checking available RILs for variation in the phenotype of interest, even if the parental lines have similar phenotypes.
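A toy additive model shows how balancing QTL can hide variation in the parents but reveal it in RILs. All numbers below are hypothetical and chosen only for illustration; they are not measurements from the Landsberg and Cape Verde Island cross, and real QTL need not act additively.

```python
from itertools import product

# Hypothetical additive effects (arbitrary units) of carrying the allele
# from parent A, instead of the allele from parent B, at two loci.
effects_of_a_allele = {"locus1": +5.0, "locus2": -5.0}
baseline = 20.0  # hypothetical trait value with parent B alleles at both loci


def phenotype(genotype: dict) -> float:
    """Trait value of a homozygous line under a purely additive model."""
    return baseline + sum(
        effects_of_a_allele[locus]
        for locus, allele in genotype.items()
        if allele == "A"
    )


parent_a = {"locus1": "A", "locus2": "A"}
parent_b = {"locus1": "B", "locus2": "B"}

# The four possible homozygous RIL genotypes at the two loci.
ril_phenotypes = {
    alleles: phenotype(dict(zip(effects_of_a_allele, alleles)))
    for alleles in product("AB", repeat=2)
}

print("parent A:", phenotype(parent_a), "parent B:", phenotype(parent_b))
for alleles, value in ril_phenotypes.items():
    print("RIL", "".join(alleles), "->", value)
```

In this sketch the two parents have identical trait values, yet the recombinant genotypes differ from both, which is the pattern that makes it worth screening RILs from phenotypically similar parents.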