Loading Data

Introduction

IGB aims to be a truly integrated genome browser: it can display data from many different sources, all merged into the same view. These sources include data sets loaded from your computer, from URLs, or from various public (and private) DAS, DAS2 and Quickload servers. IGB can also display data from many file types, including:

Short-read alignment data, also known as "next-generation sequencing data" (BAM format).
Standard annotations such as RefSeq annotations, provided by public repositories.
Alignments of Affymetrix probe sets to the genome, provided by the NetAffx group at Affymetrix (link.psl format).
Alignments of ESTs or mRNAs produced by blat (psl formats).
Tiling array graphs, from the TAS program from Affymetrix.
Copy number graphs from the CNAT program from Affymetrix.
Data generated from other Affymetrix software tools, such as GCOS, Expression Console and ExACT.
Annotation and graph files prepared by any method in any of the supported formats.

The full list of supported file formats is here.

Choose Species and Genome Version

The first step to loading data is to choose Species and Genome Version. IGB uses this information to offer data sources with relevant data sets. To set the species and version, select them in the Data Access panel. Read about the Data Access panel.

Alternatively, if you choose to open a file with File > Open File... or Open URL..., the file selection window lets you specify the species and genome version using the drop-down menus at the bottom of the window.

While some file types contain information specifying their species and genome version, most do not. However, if you load a file that has the species and genome information, IGB will open the proper species and genome, even if it is not the one you have chosen.

Loading Data Sets

NOTE: IGB does NOT immediately display loaded files. Many of today's next-gen sequencing files are too big to display all at once. IGB handles this by waiting to visualize data sets until you ask it to refresh. You can refresh immediately to visualize most files, but for larger file types, such as BAM and WIG, you should first select a defined, smaller region before refreshing the image.

There are several ways to get data sets into IGB: from servers/sources, from URLs and from the local computer. To load data from a server, locate the data set in the folders of the Data Sources panel and put a check in the box next to the data you are interested in. The data set will then be entered into the Choose Load Mode list.

For files loaded from a URL or from the local computer, just drag and drop the file into the IGB interface; it will immediately appear in the Choose Load Mode list. Alternatively, use File > Open File... or Open URL... to find and load the file(s) you want. Be sure to set the species and genome at the bottom of the file selection window.

Choose Load Mode

IGB is capable of loading and displaying whole genomes, whole chromosomes or just portions of a data file, depending on the file type. Therefore, although there are many load options, not every file type can be loaded the same way. There is always the option of Don't Load, which will simply not load the individual file but will maintain it in the list.

Whole Genome is primarily used with sequence files; this is the default load mode that IGB uses for the reference sequences for most model organisms (e.g. TAIR10 mRNA).
Whole Chromosome is the standard load mode for most file types. We generally recommend selecting Whole Chromosome, unless it is a very large file (e.g. CoolI1T1.bed, A_thaliana_Jun_2009.fa).
Region in View tells IGB to load only the region in view (e.g. CoolI1T1.sm.wig). For large files, such as .bam, .wig or other short-read-associated files, we strongly recommend that you zoom into a small region of interest, usually under about 100 kb depending on your read density, and then refresh to see the data within this region. This is easily accomplished if the reference genome for your species is already loaded (IGB automatically loads the reference sequences for most model organisms).
Autoload is an option for BAM and SAM files (e.g. CoolI1T1.sm.bam), which will allow IGB to automatically load the data in view, if the view is zoomed below the Autoload threshold (this is marked with a white arrow in the zoom bar; red box). The Autoload zoom level can be adjusted to your needs. First, set the main view to the zoom level you want. Then, right click in any blank area of the view and select Set Autoload Threshold to Current View. This will change the needed zoom level AND will move the white arrow to indicate the new threshold setting. BE CAREFUL! If you set the Autoload threshold too high, IGB will try to load all of the indicated data, which could exceed the memory limit.
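
The recommendations above can be summarized as a simple decision rule. The following Python sketch is illustrative only and is not part of IGB: the helper names `suggest_load_mode` and `autoload_triggers`, the 100 MB cutoff for a "very large" file, and the idea of measuring the Autoload threshold as a view width in base pairs are all assumptions made for the example.

```python
from pathlib import Path

# Assumption: Autoload is suggested for BAM and SAM files, as described above.
ALIGNMENT_EXTS = {".bam", ".sam"}
# Assumption: the guide gives no number for "very large"; 100 MB is arbitrary.
LARGE_FILE_BYTES = 100 * 1024 * 1024


def suggest_load_mode(filename: str, size_bytes: int) -> str:
    """Suggest a load mode for a file (hypothetical helper, not an IGB API)."""
    ext = Path(filename).suffix.lower()
    if ext in ALIGNMENT_EXTS:
        return "Autoload"
    if ext == ".wig" or size_bytes > LARGE_FILE_BYTES:
        # Large or dense files: zoom in first, then load only what is visible.
        return "Region in View"
    return "Whole Chromosome"


def autoload_triggers(view_width_bp: int, threshold_width_bp: int) -> bool:
    """Return True when the view is zoomed in to the threshold width or narrower."""
    return view_width_bp <= threshold_width_bp
```

With these assumptions, `suggest_load_mode("CoolI1T1.sm.bam", 10)` suggests Autoload, and `autoload_triggers(5_000_000, 100_000)` is false because a 5 Mb view is wider than a 100 kb threshold. Setting the threshold too wide is the case the warning above describes: Autoload would then fire on views that contain more data than memory allows.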

Refresh to Visualize

After the data sets are set to the appropriate Load Modes, simply Refresh to see the data. If you want all the tracks to display at once, use the Refresh button in the Data Access panel or the refresh icon at the top of the window. Alternatively, if you want to visualize one track at a time, use the individual refresh icon next to each track.

You might partially load a track (Region in View, Autoload) and then zoom out to show a greater region of the data sets. IGB will show gray in the area of any track that is not fully loaded.
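
To make the gray areas concrete, the sketch below computes which parts of the current view have not been loaded, given the regions loaded so far. It is a minimal illustration of the idea and not IGB's implementation; the function name `unloaded_gaps` and the use of half-open base-pair intervals are assumptions.

```python
from typing import List, Tuple

Interval = Tuple[int, int]  # half-open [start, end) in base pairs (assumption)


def unloaded_gaps(view: Interval, loaded: List[Interval]) -> List[Interval]:
    """Return the parts of `view` not covered by any interval in `loaded`."""
    view_start, view_end = view
    gaps: List[Interval] = []
    cursor = view_start  # left edge of the region not yet known to be covered
    for start, end in sorted(loaded):
        if end <= cursor:
            continue  # lies entirely to the left of the uncovered region
        if start >= view_end:
            break  # lies entirely to the right of the view
        if start > cursor:
            gaps.append((cursor, start))  # uncovered stretch before this region
        cursor = max(cursor, end)
        if cursor >= view_end:
            break
    if cursor < view_end:
        gaps.append((cursor, view_end))  # uncovered tail of the view
    return gaps
```

For example, if regions 200-300 and 250-600 were loaded and you then zoom out to 0-1000, the call `unloaded_gaps((0, 1000), [(200, 300), (250, 600)])` reports 0-200 and 600-1000 as unloaded, which correspond to the areas that would be shown in gray.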
