This tutorial describes how to import sequence and annotations into IGB when your genome of interest is not available from an IGB QuickLoad or Distributed Annotation site.
In this tutorial, we will demonstrate importing a bacterial genome (E. coli) using files downloaded from NCBI. For IGB to display a custom genome and its annotations together, you will also need to create a synonym file.
First visit NCBI to retrieve the sequence data and annotations.
Save the file to your computer; by default, it will be named "sequence.gb".
If you change the name, be sure to use the ".gb" file extension so that IGB can recognize the file and the file format.
Save the file to your computer; by default, it will be named "sequence.fasta".
Make sure that the extension of the FASTA file is either '.fasta' or '.fa'. If the extension is '.txt', you can safely change it to '.fa'.
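If you have several downloaded files to check, the extension rule above can be sketched in Python. This is only an illustration: the function name with_fasta_extension is hypothetical, and it only computes the corrected file name rather than renaming anything on disk.

```python
from pathlib import Path


def with_fasta_extension(filename):
    """Return the file name IGB should recognize as FASTA.

    Hypothetical helper: keeps '.fasta' and '.fa' as they are,
    changes '.txt' to '.fa', and rejects anything else.
    """
    path = Path(filename)
    suffix = path.suffix.lower()
    if suffix in {".fasta", ".fa"}:
        return path
    if suffix == ".txt":
        return path.with_suffix(".fa")
    raise ValueError(f"unexpected extension for a FASTA file: {filename}")


print(with_fasta_extension("sequence.txt").name)
print(with_fasta_extension("sequence.fasta").name)
```

To rename a real file, you could pass the returned path to Path.rename, for example Path("sequence.txt").rename(with_fasta_extension("sequence.txt")).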
When IGB shows the sequence.gb file, it uses the 'LOCUS' name from that file. To show the sequence.fasta file, IGB uses the sequence name that follows the '>' in the header line. Note that when the names used by each file type (FASTA sequence files, annotation files, .bam/alignment files, etc.) for the exact same chromosome/sequence differ, IGB will treat each one as a separate chromosome, and they will not be visualized together.
To overcome this, we enter 'synonyms'. IGB already has some internal synonyms; for example, "1" and "chr1" are equivalent. You will need to know the sequence name used in each of your files; if you are not sure, a quick way to find out is to drag all of the files into IGB at the same time.
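If you would rather check the names outside of IGB, a short script can pull them from the files directly. The sketch below assumes the FASTA name is the first whitespace-delimited token after '>' and the GenBank name is the second field of the LOCUS line; IGB may parse the headers differently, so confirm against what IGB displays. The function names and the sample contents are made up for illustration.

```python
def fasta_names(text):
    """Collect the first token after '>' from each FASTA header line."""
    names = []
    for line in text.splitlines():
        if line.startswith(">"):
            fields = line[1:].split()
            if fields:
                names.append(fields[0])
    return names


def genbank_locus_names(text):
    """Collect the name field (second column) from each LOCUS line."""
    names = []
    for line in text.splitlines():
        if line.startswith("LOCUS"):
            fields = line.split()
            if len(fields) > 1:
                names.append(fields[1])
    return names


# Made-up sample contents, standing in for sequence.fasta and sequence.gb.
sample_fasta = ">exampleSeq.1 hypothetical description\nACGTACGT\nTTGA\n"
sample_genbank = (
    "LOCUS       exampleSeq   12 bp    DNA     circular\n"
    "DEFINITION  hypothetical record.\n"
)

print(fasta_names(sample_fasta))
print(genbank_locus_names(sample_genbank))
```

For real files, read the contents first, for example open("sequence.gb").read(), and pass the resulting text to the matching function.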
Now that we know the headers of the files we will create a tab-delimited 'personal synonym' file:
If you are making a synonym file for a multi-chromosomal organism, make a new line in the file for each chromosome and list all of the names associated with it, with a tab between each name. If you later include a file that uses a new name, open chromosome.txt and add the name to the proper line.
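The layout described above (one line per chromosome, names separated by tabs) can also be generated with a script, which helps avoid accidentally typing spaces instead of tabs. This is a minimal sketch: the function name synonym_lines and the example names are hypothetical, and it only builds the text; where IGB reads the file from is not covered here.

```python
def synonym_lines(groups):
    """Build tab-delimited synonym text: one line per chromosome.

    `groups` is a list of lists; each inner list holds every name
    that refers to the same chromosome/sequence.
    """
    lines = []
    for names in groups:
        # Drop repeated names while keeping their original order.
        unique = list(dict.fromkeys(names))
        for name in unique:
            if "\t" in name or "\n" in name:
                raise ValueError(f"name contains a tab or newline: {name!r}")
        lines.append("\t".join(unique))
    return "\n".join(lines) + "\n"


# Made-up names for two chromosomes.
text = synonym_lines([
    ["exampleSeq", "exampleSeq.1"],
    ["chrB", "B", "chrB"],
])
print(text, end="")
```

To save the result, write the returned text to your synonym file, for example with open("chromosome.txt", "w").write(text).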
When you open the new instance of IGB, your synonyms will be loaded for you. At this point, we will open the files sequence.fasta and sequence.gb so you can begin analysis of your own data.
To open the sequence and annotation files:
Both the sequence and the gene models will load. You will be zoomed out, so the sequence will appear as a grey bar; as you zoom in, the colors and nucleotides will become visible. At this point, you can begin viewing the gene models and sequence. Keep in mind that if the chromosome names differ between your files, you will have to add the new name to chromosome.txt and then reopen IGB so it can load the new information.