Introduction

A common analysis step is to determine meaningful regions on a sequence based on graph values being above or below a certain threshold.  To screen out non-meaningful values and to view meaningful features of a graph as annotations, use the thresholding feature in IGB. Thresholding displays graph features as annotation-like bars in locations where the graph value at the coordinate meets your defined threshold. 

You can convert this data into a standard annotation track in order to analyze it using all of the features in IGB for working with and comparing annotations.

In some cases, you may want to adjust your graph thresholds so the bars correlate to existing annotations; any threshold bars visible at that point that don't have known annotation counterparts (or vice-versa) may indicate areas for further investigation.

The Thresholding Window

To use thresholding, select one or more graphs, then click the Thresholding button in the Graph tab panel. This opens the Graph Thresholds window.

The first thing you will need to do is set the visibility to 'On'. By default, the threshold is set midway between the graph  minimum and maximum values.  To obtain more meaningful thresholds, adjust the threshold settings using the methods described in the following sections.

Direction

You can tell IGB to show either the data above the threshold (> thresh) or the data at or below the threshold (<= thresh). For example, in expression results the graph score represents the level of hybridization activity. For a hybridization activity level to be considered significant, the score must be above a certain threshold, so you would choose the > thresh option and set your desired threshold value. If you need help picking a threshold level, toggle the Y-axis on to see the graph's scale. See Graph.
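The default threshold and the two directions can be sketched in a few lines of Python. This is only an illustration of the behavior described above, not IGB's own code; the function names `default_threshold` and `meets_threshold` are hypothetical.

```python
def default_threshold(values):
    """Midpoint between the graph minimum and maximum (the described default)."""
    return (min(values) + max(values)) / 2


def meets_threshold(value, thresh, direction=">"):
    """Return True if a graph value passes in the chosen direction.

    direction is either ">" (above the threshold) or "<=" (at or below it),
    matching the two options named in the Graph Thresholds window.
    """
    if direction == ">":
        return value > thresh
    if direction == "<=":
        return value <= thresh
    raise ValueError("direction must be '>' or '<='")


scores = [0, 10, 40]
thresh = default_threshold(scores)
above = [s for s in scores if meets_threshold(s, thresh, ">")]
```

Note that a value exactly equal to the threshold passes only in the <= thresh direction.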

Changing the threshold

To change the thresholds of selected graphs so that more or fewer points meet the threshold limit, modify these settings in the Graph Thresholds window:

  • By Value specifies the exact threshold value.
  • By Percentile sets the threshold based on the percentage of graph values greater or less than your threshold.
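As a rough illustration of the By Percentile idea, the sketch below picks the threshold value at a given percentile of the graph values using the nearest-rank method. The choice of nearest-rank is an assumption made for this example; IGB may compute percentiles differently, and `threshold_by_percentile` is a hypothetical name.

```python
import math


def threshold_by_percentile(values, percentile):
    """Return the value at or below which `percentile` percent of values fall.

    Uses the nearest-rank method (an assumption for this sketch).
    """
    if not values:
        raise ValueError("values must not be empty")
    if not 0 <= percentile <= 100:
        raise ValueError("percentile must be between 0 and 100")
    ordered = sorted(values)
    # Nearest rank: smallest rank covering the requested share of values.
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1]


graph_values = [3, 7, 1, 9, 5, 2, 8, 10, 4, 6]
high_cutoff = threshold_by_percentile(graph_values, 90)
```

With a 90th-percentile threshold and the > thresh direction, only the top tenth of the graph values would produce threshold bars.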

Specify offsets

The Offsets for Thresholded Regions feature is designed specifically for working with tiling array data. Captured data from tiling array probes should usually be shifted for display. The probes are 25 base pairs long, but the x-coordinate given for each probe is its starting coordinate, so you should shift the threshold data so that it starts 12 base pairs past the given start and ends 13 base pairs past it. This is the default placement of all graph threshold bars; if you are viewing typical tiling array data, the IGB defaults are already correct. If these offsets are not correct for the data you are analyzing, you should change them.

When you view non-tiling array data in IGB, you should adjust the offset to 0 (set Start to 0 and End to 1) to correctly align the threshold bars with their actual coordinates.
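The effect of the two offsets can be shown with a small sketch. It assumes that the Start offset is added to the first coordinate that meets the threshold and the End offset to the last one; `apply_offsets` is a hypothetical helper, not an IGB function.

```python
def apply_offsets(first_x, last_x, start_offset=12, end_offset=13):
    """Shift a thresholded region by the Start and End offsets.

    first_x and last_x are the first and last graph x-coordinates in the
    region that meet the threshold. Defaults are the tiling array values.
    """
    return (first_x + start_offset, last_x + end_offset)


# Tiling array defaults: a single 25 bp probe starting at 1000 is drawn
# around its midpoint.
tiling_bar = apply_offsets(1000, 1000)

# Non-tiling data: Start 0 and End 1 keep the bar on the actual coordinates.
plain_bar = apply_offsets(1000, 1200, start_offset=0, end_offset=1)
```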

Control gaps

Experimental data can be noisy.  If you are looking for general trends and want to ignore small local variations, you can use the Max Gap and Min run settings. Slide the sliders to adjust, or enter values into the boxes for each parameter:

Max Gap: Groups of coordinates that meet the threshold may be separated by gaps where the threshold is not met. By default, thresholding bridges such gaps if they are less than 100 bps long. For example, if there are 300 bps that meet the threshold, then 99 bps that don't, followed by 100 that do, the threshold bar would bridge the 99 bps to produce a single threshold annotation 499 bps long. To change the length of this bridge, enter a different value into the Max Gap text-entry field. If your species or gene models have short introns, you may need to lower this value so that potential introns are drawn correctly.

Min run: The minimum number of bases in a row that must meet the threshold before an annotation bar will appear. For example, by default the Min run is 30 base pairs, so a sequence of 29 base pairs that meet the threshold will not be marked by an annotation bar. If your species or gene models typically have very short exons, you may need to reduce this number. On the other hand, if your species has a minimum exon length below which exons are not meaningful, you can raise it.
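The interaction of Max Gap and Min run can be sketched as follows. This is a simplified illustration, not IGB's algorithm: it assumes one pass/fail flag per base, bridges gaps strictly shorter than Max Gap (following the "less than 100 bps" wording above), and applies Min run to the bridged region as a whole. The function name `threshold_regions` is hypothetical.

```python
def threshold_regions(passes, max_gap=100, min_run=30):
    """Turn per-base pass/fail flags into (start, end) regions, end exclusive."""
    # Step 1: collect raw runs of consecutive passing bases.
    runs = []
    start = None
    for i, ok in enumerate(passes):
        if ok and start is None:
            start = i
        elif not ok and start is not None:
            runs.append((start, i))
            start = None
    if start is not None:
        runs.append((start, len(passes)))

    # Step 2: bridge neighboring runs separated by fewer than max_gap bases.
    merged = []
    for s, e in runs:
        if merged and s - merged[-1][1] < max_gap:
            merged[-1] = (merged[-1][0], e)
        else:
            merged.append((s, e))

    # Step 3: drop regions shorter than min_run (assumed to apply after bridging).
    return [(s, e) for s, e in merged if e - s >= min_run]


# The example from the text: 300 passing, 99 failing, 100 passing.
example = [True] * 300 + [False] * 99 + [True] * 100
bars = threshold_regions(example)
```

Under these assumptions the example yields one region 499 bases long, while widening the gap to 100 bases would leave two separate regions.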

Make an Annotation Track of the Threshold Results

While you are working with threshold values, the 'bars' appear in the same track as the graph itself. However, you may wish to record the threshold results before changing values again, to work with the thresholding results as a separate track, or to save the results to share with others. For any of these reasons, you can create an annotation (NOT a graph) track of the thresholding results. After you have captured threshold "annotations" from a graph or other set of data, you can examine them using all of the tools in IGB for working with and comparing annotations.

After you have set the threshold levels to yield the results you want, click the Make Track button. Once an annotation track has been created with Make Track, it is frozen and is no longer linked to changes in the Graph Thresholds window. However, you can continue to modify the criteria in the Graph Thresholds window, altering the threshold results for the same graph track, and then create another annotation track.

The new track is given a default label containing all of the relevant parameters, which you can change later if desired. An example name is:

"threshold, [60 to infinity] offsets: (+12, +13), max_gap=20, min_run=30, graph: depth: HotI2T1"

The name of the new track will always start with 'threshold' and end with the name of the graph track that was used (in this case, the graph was a depth graph made from a .bam track called HotI2T1). The value '60 to infinity' shows that the threshold level was set to 60 with the > thresh direction. The offsets (+12, +13) are still the tiling array defaults; because this graph is not tiling array data, they should have been changed to 0 and 1 before the track was made. Finally, the max gap and min run values are recorded.
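If you need to recover the parameters from such a label programmatically, a regular expression along these lines may help. The pattern is inferred only from the single example name above, so it is an assumption that other labels (for instance those made with the <= thresh direction) follow the same layout; `parse_threshold_label` is a hypothetical helper.

```python
import re

# Pattern inferred from the one example label; other layouts may exist.
LABEL_RE = re.compile(
    r"^threshold, \[(?P<low>[^\]\s]+) to (?P<high>[^\]\s]+)\] "
    r"offsets: \((?P<start>[+-]?\d+), (?P<end>[+-]?\d+)\), "
    r"max_gap=(?P<max_gap>\d+), min_run=(?P<min_run>\d+), "
    r"graph: (?P<graph>.+)$"
)


def parse_threshold_label(label):
    """Split a default threshold track label into its recorded parameters."""
    match = LABEL_RE.match(label)
    if match is None:
        raise ValueError("label does not match the assumed layout")
    fields = match.groupdict()
    # Offsets and run settings are integers; range bounds stay as text
    # because one of them may be the word 'infinity'.
    for key in ("start", "end", "max_gap", "min_run"):
        fields[key] = int(fields[key])
    return fields


label = ("threshold, [60 to infinity] offsets: (+12, +13), "
         "max_gap=20, min_run=30, graph: depth: HotI2T1")
params = parse_threshold_label(label)
```

A check such as `params["start"] == 12 and params["end"] == 13` would flag tracks that were made with the tiling array offsets still in place.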
