Readme file for WormBase Expression Data
This page describes the WormBase expression data from the paper "Genomic Analysis of Gene Expression in C. elegans" (Science 290, 809-812). Here we briefly summarize our experiments and the corresponding data available at WormBase. Complete descriptions of our experimental protocols and data analysis procedures are given in the original publication and on the Supplemental Data website for the paper, http://www.sciencemag.org/feature/data/1053496.shl.
1. Brief description of experiments
Populations of oocytes, eggs, and staged worms were prepared as described in the paper. Total RNA was isolated from each population. cRNA was generated from total RNA by oligo-dT primed reverse transcription to double-stranded cDNA followed by in-vitro transcription. This cRNA was then hybridized to high-density (24-um feature) oligonucleotide arrays. After hybridization, arrays were fluorescently stained and scanned with a laser scanner. The Affymetrix GeneChip software was used to analyze the resulting array images. There were 3 array designs ("A", "B", and "C") which monitored 5768-6646 genes each. In total, the 3 arrays monitored 98% of the 19,099 worm ORFs in the October 1998 C. elegans sequence release. Each sample was applied to all three arrays; measurements were replicated to determine experimental uncertainties in expression levels.
We report expression data for 8 populations: oocytes, mixed-stage eggs (0h), and staged worm populations at developmental times of 12h, 24h, 36h, 48h, 60h, and 2 weeks. For each gene in each experimental population we report 2 quantities: "frequency" (an integer number) and "present fraction" (either NP, PS, or PA). Finally, for each gene we also report one P-value for the null hypothesis that the gene's expression level did not change across the 8 samples.
2.1 Frequency
The Affymetrix GeneChip software provided a specific hybridization intensity value or "average difference" (AD) for each transcript that was proportional to transcript abundance. We normalized the AD values to "frequency" values by referring the ADs on each chip to a calibration curve constructed from the AD values for the 11 spike-in control transcripts with known abundances. This "frequency normalization" allows comparison of transcript measurements across multiple array experiments. Frequency values for each gene are expressed in number concentrations (transcripts per million, or ppm), under the assumptions described in the paper. Essentially, in order to calculate the number concentrations, we assume an average worm transcript length of 1 kb.
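As an illustration of the frequency normalization described above, the following Python sketch fits a calibration curve to spike-in controls and uses it to convert an AD value to ppm. It is not the procedure used in the paper: the straight-line fit in log-log space, the function names, and the example spike-in values are all assumptions made for illustration, and the sketch assumes positive AD values.

```python
import math

def fit_calibration(spike_ad, spike_ppm):
    """Fit log(ppm) = intercept + slope * log(AD) by least squares.

    spike_ad and spike_ppm are the AD values and known abundances of the
    spike-in controls on one chip. The log-log linear form is an
    assumption of this sketch, not a detail taken from the paper.
    """
    xs = [math.log(a) for a in spike_ad]
    ys = [math.log(p) for p in spike_ppm]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def ad_to_frequency(ad, slope, intercept):
    """Convert one positive AD value to a frequency in ppm."""
    return math.exp(intercept + slope * math.log(ad))

# Hypothetical spike-in values, for illustration only.
slope, intercept = fit_calibration([100, 1000, 10000], [10, 100, 1000])
frequency = ad_to_frequency(500, slope, intercept)
```

Because each chip gets its own calibration curve, frequencies from different chips end up on a common ppm scale, which is what makes comparison across array experiments possible.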
2.2 Present Fraction
The Affymetrix GeneChip software also provided for each transcript an "absolute decision", which predicted whether the gene was "present" or "absent" in any given sample. The algorithm used to calculate the absolute decision is described in the Affymetrix GeneChip Analysis Suite User Guide (Affymetrix, 1999). Essentially, a gene was called present if its specific hybridization intensity was significantly above array background and noise levels. Since we performed replicate hybridizations, we generated more than one absolute decision for each gene in each sample. To summarize these data, we report here one summary value, the "present fraction". If a gene was called present in all replicate hybridizations of a given population, the present fraction is "PA" (present always). If a gene was called present in only some of the replicate hybridizations of a given population, the present fraction is "PS" (present sometimes). Finally, if a gene was never called present in any of the hybridizations of a given population, the present fraction is "NP" (never present).
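The summary rule above can be sketched in a few lines of Python. The function name and the use of booleans for the per-replicate calls are assumptions of this sketch; how any call other than present or absent would be treated is not covered here.

```python
def present_fraction(calls):
    """Summarize replicate absolute decisions for one gene in one population.

    calls is a non-empty list of booleans, True where the gene was
    called present in that replicate hybridization.
    """
    if not calls:
        raise ValueError("at least one replicate call is required")
    n_present = sum(1 for call in calls if call)
    if n_present == len(calls):
        return "PA"  # present always
    if n_present > 0:
        return "PS"  # present sometimes
    return "NP"      # never present
```

For example, a gene called present in one of two replicates of the 12h population would be summarized as "PS" for that population.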
2.3 P-value
The P-value reported for each gene was determined by carrying out a one-way ANOVA of the log-transformed frequencies, testing the null hypothesis that frequency did not change across the 8 samples. A small P-value indicates that a gene's expression level changed significantly; large P-values (close to one) indicate little confidence that the observed variation in gene expression was significantly above assay noise.
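For readers unfamiliar with one-way ANOVA, the sketch below computes the F statistic for one gene from its replicate frequencies in each population. It is a textbook formulation, not the code used for the paper; the function name and input layout are assumptions. The P-value is the upper-tail probability of the F distribution with the returned degrees of freedom, which can be obtained from a statistics library (for example scipy.stats.f.sf, if SciPy is available).

```python
import math

def anova_f_log(groups):
    """One-way ANOVA F statistic on log-transformed frequencies.

    groups is a list with one entry per population; each entry is a
    list of positive replicate frequencies (ppm) for a single gene.
    Returns (F, df_between, df_within).
    """
    logs = [[math.log(v) for v in g] for g in groups]
    k = len(logs)                          # number of populations
    n = sum(len(g) for g in logs)          # total number of measurements
    grand_mean = sum(sum(g) for g in logs) / n
    means = [sum(g) / len(g) for g in logs]
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(logs, means))
    ss_within = sum((v - m) ** 2
                    for g, m in zip(logs, means) for v in g)
    df_between = k - 1
    df_within = n - k
    if ss_within == 0:
        raise ValueError("no within-population variation; F is undefined")
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

A large F (variation between populations much greater than variation among replicates) corresponds to a small P-value.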
2.4 Interpreting the data
A few key points should be kept in mind when viewing the data.
First, we estimate that the overall sensitivity of detection of the arrays varied from about 1:300,000 (~3 ppm) to 1:55,000 (~18 ppm). In each hybridization, frequencies that were below the sensitivity of the array were averaged with the array sensitivity, to provide a damped estimate of the gene frequency. In practice, this means that observed frequencies around 1-20 ppm must be interpreted with caution: these numbers may be "in the noise" of the array, and may have been rounded up from lower values, depending on the sensitivity of the array in question. The reported present fraction can also be used as a gauge of signal-to-noise ratio. Observations that have a present fraction "NP" were always below the basal noise level on the array, while observations with present fractions "PS" or "PA" were sometimes or always above array noise, respectively.
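The damping step described above might look like the following Python sketch. The function name is hypothetical, and the use of a simple arithmetic mean is an assumption based on the phrase "averaged with the array sensitivity".

```python
def damp_frequency(freq_ppm, sensitivity_ppm):
    """Return a damped frequency estimate for one hybridization.

    Frequencies below the array sensitivity are averaged with the
    sensitivity; frequencies at or above it are returned unchanged.
    """
    if freq_ppm < sensitivity_ppm:
        return (freq_ppm + sensitivity_ppm) / 2
    return freq_ppm
```

Under this sketch, a raw value of 1 ppm on an array with 3 ppm sensitivity would be reported as 2 ppm, which is why low reported frequencies can overstate the true abundance.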
Second, extensive replication of frequency measurements has shown that the long-run coefficient of variation of repeated measurements is typically ~10% for the most abundant genes, and can be ~40% for the lowest abundance genes (these numbers are approximate). This means that relative changes in observed gene expression that are smaller than 2-3 fold could arise from noise. The reported P-values are a direct way of estimating the probability that the observed changes in a given gene could be due to noise alone.
Combining these features of the data, one rule of thumb for deciding if a gene expression level "really changed" between two or more populations is to look at the following data quality indicators:
(1) Frequencies with corresponding present fractions that are "PA" or "PS".
(2) Absolute differences in frequency greater than ~20 ppm.
(3) Significant P-values.
(4) Relative increases or decreases in frequency that are greater than 2-3 fold.
Gene expression changes that meet these criteria are the most robust data that we collected. Observations that don't show some of these indicators should be interpreted more cautiously.
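The four indicators above can be checked programmatically when comparing one gene between two populations. The Python sketch below is only one possible reading of the rule of thumb. The function name, the P-value cutoff of 0.05, the choice of 2-fold as the threshold, and the decision to count indicator (1) as met when at least one of the two populations is "PA" or "PS" are all assumptions, not values taken from the paper.

```python
def quality_indicators(freq_a, freq_b, pf_a, pf_b, p_value,
                       min_abs_ppm=20, min_fold=2.0, alpha=0.05):
    """Evaluate the four data quality indicators for one gene.

    freq_a, freq_b: frequencies (ppm) in the two populations.
    pf_a, pf_b: present fractions ("PA", "PS", or "NP").
    p_value: the gene's reported ANOVA P-value.
    Thresholds are illustrative defaults (assumptions of this sketch).
    """
    high = max(freq_a, freq_b)
    low = min(freq_a, freq_b)
    # Guard against division by zero for a non-positive frequency.
    fold = high / low if low > 0 else float("inf")
    return {
        # (1) detected above noise in at least one population (assumed reading)
        "present": pf_a in ("PA", "PS") or pf_b in ("PA", "PS"),
        # (2) absolute difference greater than ~20 ppm
        "abs_difference": abs(freq_a - freq_b) > min_abs_ppm,
        # (3) significant P-value (cutoff is an assumption)
        "p_value": p_value < alpha,
        # (4) relative change greater than the fold threshold
        "fold_change": fold > min_fold,
    }

# Hypothetical gene: 10 ppm in one population, 100 ppm in another.
result = quality_indicators(10, 100, "PS", "PA", 0.001)
robust = all(result.values())
```

Returning the individual indicators rather than a single yes/no makes it easy to see which criteria a given observation fails, in keeping with the advice to treat partial matches more cautiously.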