

By now, you've learned a great deal about how to work productively in a UNIX environment. You've also learned to perform one of the most important tasks in bioinformatics programming: creating data sets for visualization by end-user biologists.

This week, you'll take it a step further by learning to process and visualize data sets from RNA-Seq, a form of EST sequencing that not only provides sequence information for expressed genes but also gives us quantitative information about overall expression levels.

This week, you'll learn:

  • how to get RNA-Seq data from the NCBI Short Read Archive (SRA)
  • how to convert the NCBI-specific sequence format (.sra) to FASTQ
  • how to interpret quality scores and other information in FASTQ files
  • how to align sequences in FASTQ files onto a reference genome
  • how to visualize alignments


Sequencing background

FASTQ format

The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants
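
As a rough illustration of how quality scores in a FASTQ file can be interpreted, here is a minimal Python sketch. It assumes the Sanger encoding (Phred+33, where each quality character's ASCII code minus 33 gives the score Q, and the estimated error probability is 10^(-Q/10)) and assumes every record occupies exactly four lines. Older Solexa/Illumina variants use a different offset and would need a different constant. The function names and the sample read are hypothetical, made up for this example.

```python
def parse_fastq(lines):
    """Yield (name, sequence, quality) tuples from four-line FASTQ records.

    Assumption: each record is exactly four lines (header, sequence,
    '+' separator, quality) with no line wrapping.
    """
    records = [line.rstrip("\n") for line in lines if line.strip()]
    for i in range(0, len(records) - 3, 4):
        header, seq, _plus, qual = records[i:i + 4]
        yield header[1:], seq, qual  # drop the leading '@' from the name


def phred33_to_scores(quality):
    """Convert a Sanger (Phred+33) quality string to integer Phred scores."""
    return [ord(ch) - 33 for ch in quality]


def error_probability(q):
    """Estimated probability that a base call with Phred score q is wrong."""
    return 10 ** (-q / 10)


# Hypothetical record, invented for illustration only.
example = [
    "@read1\n",
    "GATTACA\n",
    "+\n",
    "IIII5!#\n",
]

for name, seq, qual in parse_fastq(example):
    scores = phred33_to_scores(qual)
    for base, q in zip(seq, scores):
        print(name, base, q, error_probability(q))
```

To read a real file, the same generator could be given an open file handle, for example `parse_fastq(open("reads.fastq"))`, where `reads.fastq` is a placeholder file name.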


NCBI Short Read Archive (SRA)

RNA-Seq tools


Aligning and visualizing RNA-Seq data
