Viewing RNA-Seq and expression tiling array data in the same view in IGB 6.3
This image shows three Arabidopsis expression tiling array data sets corresponding to cold (GSM243694), high salt (GSM243703), and drought (GSM243707) treatments, all assayed on the Affymetrix AtTile1R tiling array platform. The data were loaded in simple graph format, in which probe intensities are shown as vertical bars. The graphs were then configured via the Graph Adjuster tab's Graph Thresholding option to display a bar underneath each group of consecutive probes with intensity values above a chosen threshold. Note how the bars seem to correspond to known exons in the gene models displayed in the TAIR9 track. At the top of the display are short-read Illumina RNA-Seq data (75 bases per read) from plants undergoing severe drought stress. Graph thresholding seems to suggest that this gene contains a previously undiscovered five prime exon. However, the RNA-Seq data contain no reads that support this idea. This example illustrates some of the ambiguities that can arise when using data from high-throughput methods such as tiling arrays and Illumina sequencing. The data sets are enormous and noisy, so purely by chance we will observe at least some genes adjacent to what tiling arrays seem to suggest are unannotated exons. If the RNA-Seq data had supported the idea that this region of high probe intensity is indeed an exon, we would be much more likely to believe the conclusion, because the odds of two entirely different expression measurement technologies giving the same spurious result are very small.
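The thresholding idea described above, and the cross-check against RNA-Seq reads, can be sketched in a few lines of Python. This is only an illustration of the general technique, not IGB's actual implementation: the function names, the `min_probes` parameter, the 25-base probe length, and the sample coordinates are all assumptions made for the example.

```python
from typing import List, Sequence, Tuple

Interval = Tuple[int, int]  # half-open (start, end) in genomic coordinates


def threshold_regions(
    positions: Sequence[int],
    intensities: Sequence[float],
    threshold: float,
    min_probes: int = 2,      # hypothetical: minimum run length to report
    probe_length: int = 25,   # assumption: probe length in bases
) -> List[Interval]:
    """Return one interval per run of consecutive probes above threshold."""
    regions: List[Interval] = []
    run: List[int] = []  # start positions of probes in the current run

    def flush() -> None:
        # Report the run only if it contains enough consecutive probes.
        if len(run) >= min_probes:
            regions.append((run[0], run[-1] + probe_length))
        run.clear()

    for pos, value in zip(positions, intensities):
        if value > threshold:
            run.append(pos)
        else:
            flush()
    flush()  # close a run that extends to the last probe
    return regions


def count_overlapping_reads(region: Interval, reads: Sequence[Interval]) -> int:
    """Count reads whose half-open interval overlaps the region."""
    start, end = region
    return sum(1 for r_start, r_end in reads if r_start < end and r_end > start)


# Made-up probe positions and intensities, for illustration only.
positions = [100, 135, 170, 205, 240, 275, 310]
intensities = [50, 400, 420, 390, 60, 500, 30]
regions = threshold_regions(positions, intensities, threshold=300)
print(regions)  # [(135, 230)]

# Made-up 75-base reads; a region with zero overlapping reads lacks RNA-Seq support.
reads = [(140, 215), (200, 275)]
print(count_overlapping_reads(regions[0], reads))  # 2
```

A region reported by thresholding but overlapped by no reads corresponds to the situation in the image: the tiling array suggests an exon that the RNA-Seq data do not confirm. The single probe above threshold at position 275 is not reported because it does not form a run of at least `min_probes` probes.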