Most statistical methods used to identify significant alterations assume that the data are normally distributed, yet distribution analysis of 2-DGE data is often neglected. Two types of distribution analysis ought to be performed on each data set: (i) the distribution of the spot volumes of each individual protein spot across replicate gels ("spot volume distribution") and (ii) the distribution of the resulting variances of the spot volumes across the replicate gels ("variance distribution"), that is, the values resulting from the "evaluation of software-induced variance" described above (Figure 8.4). The distribution pattern can be assessed visually through a histogram (Figure 8.4, upper panel) or a Q-Q-normal plot, in which the sample quantiles are plotted against the theoretical quantiles of the corresponding normal distribution (Figure 8.4, lower panel). A linear relationship in the Q-Q-normal plot suggests that the sample is normally distributed, but a formal goodness-of-fit test should be performed for verification. The Shapiro-Wilk test was developed for small sample sizes [23] and is a suitable choice for omics experiments (i.e., large-scale data approaches in which the number of replicates is typically low [16, 17, 24]). Under the null hypothesis that the data are normally distributed, the test statistic essentially quantifies the correlation of the points in the Q-Q-normal plot. A rejection of the null hypothesis (p < 0.05) thus implies a non-normal distribution, and an appropriate transformation of the data should be considered. The most commonly used transformation for 2-DGE data is the log transformation. However, Kreil et al. [25] have reported that the log transformation may lead to inflated variance at low signal levels. The many similarities between mRNA and protein global expression analyses have prompted the exploration of transformations common in microarray experiments for use on 2-DGE data, and the successful use of Arsinh transformation to achieve

