
Introduction

By now, you've learned a great deal about how to work productively in a UNIX environment. You've also learned to perform one of the most important tasks in bioinformatics programming: creating data sets for visualization by end-user biologists.

This week, you'll take it a step further by learning to process and visualize data sets from RNA-Seq, a form of EST sequencing that not only provides sequence information for expressed genes but also gives us quantitative information about overall expression levels.

Specifically, you'll learn:

  • how to get RNA-Seq data from the NCBI Short Read Archive (SRA)
  • how to convert the NCBI-specific sequence format (.sra) to FASTQ
  • how to interpret quality scores and other information in FASTQ files
  • how to align sequences in FASTQ files onto a reference genome
  • how to visualize alignments
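As a preview of the quality-score topic above, here is a minimal sketch of how FASTQ quality characters map to numbers. It assumes the Sanger convention (each quality character is a Phred score plus an ASCII offset of 33) and the simple four-lines-per-record layout. The function names and the `read1` record are made up for illustration; files using the older Solexa/Illumina variants need a different offset.

```python
import io


def parse_fastq(handle):
    """Yield (name, sequence, quality) tuples from a four-line-per-record FASTQ stream."""
    while True:
        header = handle.readline()
        if not header:
            return  # end of file
        sequence = handle.readline().rstrip("\n")
        handle.readline()  # the '+' separator line, ignored here
        quality = handle.readline().rstrip("\n")
        # Drop the leading '@' from the header to get the read name.
        yield header.rstrip("\n")[1:], sequence, quality


def phred_scores(quality, offset=33):
    """Convert a quality string to integer Phred scores (offset 33 assumed: Sanger encoding)."""
    return [ord(ch) - offset for ch in quality]


def error_probability(q):
    """Phred score Q corresponds to an estimated base-call error probability of 10^(-Q/10)."""
    return 10 ** (-q / 10)


# A hypothetical single-read FASTQ record, held in memory for the example.
example = io.StringIO("@read1\nGATTACA\n+\nIIII5+!\n")

for name, sequence, quality in parse_fastq(example):
    scores = phred_scores(quality)
    print(name, sequence, scores)
```

Under these assumptions, the character `I` decodes to a score of 40 (an estimated error rate of 1 in 10,000), while `!` decodes to 0, the lowest possible score.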

References

Sequencing background

FASTQ format

The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants

samtools

NCBI Short Read Archive (SRA)

RNA-Seq tools

Assignments

Aligning and visualizing RNA-Seq data.
