Tutorial

The SeqAn tutorials are the best way to get started with learning how to develop using SeqAn. In contrast, the API Documentation gives more comprehensive but less verbose documentation about the library while the How-Tos are strictly task driven and narrower in scope.

The main audience of the tutorials is graduate students and professionals who want to learn how to use SeqAn. Previous programming knowledge is required; knowledge of C++ is recommended.

Introduction

These tutorials show you how to get started with SeqAn, including its installation. You can then learn about the background and motivation of SeqAn. After that, be sure to read the A First Example tutorial, which highlights many important concepts of the SeqAn library.

Getting Started
This tutorial will walk you through the installation of SeqAn and its dependencies. Then, you will create your first minimal SeqAn application!
Background and Motivation
This tutorial gives an overview of the design aims and principles of SeqAn and motivates the mechanisms it employs.
A First Example
This tutorial gives practical examples and applications of the most important basic techniques. You should read this tutorial if you are starting out with SeqAn.

We highly recommend that you follow the Getting Started instructions if you are starting out with SeqAn. Note that it is also possible to use SeqAn strictly as a library with your own build system. The article Integration with your own Build System contains detailed information about this.

A Stroll Through SeqAn

Sequences

Sequences
This tutorial introduces you to the basics of the fundamental concept of sequences, namely Strings and Segments.
Alphabets
This tutorial introduces you to SeqAn’s alphabets, or in other words, the contained types of sequences.
String Sets
This tutorial introduces you to SeqAn’s StringSet, an efficient data structure to store a set of sequences.
Sequences In-Depth
In this tutorial you will learn how to optimize the work with sequences, using different specializations of Strings and different overflow strategies for capacity changes.

Iterators

Iterators
This tutorial explains how to use iterators in SeqAn, using containers as examples.

Alignments

Alignment Representation
This section of the tutorial introduces you to the data structures that are used to represent alignments in SeqAn.
Pairwise Sequence Alignment
In this part of the tutorial we demonstrate how to compute pairwise sequence alignments in SeqAn. It shows the use of different scoring schemes, and which parameters can be used to customize the alignment algorithms.
Multiple Sequence Alignment
In the last section of this tutorial we show how to compute multiple sequence alignments in SeqAn using a scoring matrix.

Indices

Indices
This tutorial introduces you to the various indices in SeqAn, such as extended suffix arrays or k-mer indices.
Index Iterators
This tutorial introduces you to the various index iterators with which you can use indices as if traversing search trees or tries.
Q-gram Index
This tutorial introduces you to SeqAn’s q-gram index.

Pattern Matching

Pattern Matching
This section of the tutorial introduces you to the algorithms in SeqAn for exact and approximate pattern matching.

Graphs

Graphs
This section of the tutorial introduces you to the graph type in SeqAn. We will discuss the various graph specializations and show you how to create directed and undirected graphs as well as HMMs, how to store additional information for edges and vertices, and, last but not least, how to apply standard algorithms to the graphs.

Input/Output

File I/O Overview
This article gives an overview of the formatted file I/O functionality in SeqAn.
Sequence I/O
This tutorial explains how to access FASTA, FASTQ, EMBL and GenBank sequence files.
Indexed FASTA I/O
This tutorial explains how to use FASTA index files for quick random access within FASTA files, so that you can read contigs or just sections of them without having to read through the whole FASTA file.
SAM and BAM I/O
This tutorial explains how to access SAM and BAM files.
VCF I/O
This tutorial explains how to access VCF files.
BED I/O
This tutorial explains how to access BED files.
GFF and GTF I/O
This tutorial explains how to access GFF and GTF files.

Modifiers

Modifiers
Modifiers can be used to present the elements of a container in a changed form without modifying the container itself. Here you will see which modifiers are available in SeqAn.

Randomness

Randomness
This chapter introduces the random module, which provides pseudorandom number generation functionality.

Seed-and-Extend

Seed-and-Extend
In this part of the tutorial we will introduce SeqAn’s seed class, demonstrate seed extension and banded alignment with seeds, and finally show the usage of seed chaining algorithms.

Parsing Command Line Arguments

Parsing Command Line Arguments
In this tutorial, you will learn how to use the ArgumentParser class for parsing command line arguments.

Genome Annotations

Genome Annotations
You will learn how to work with and analyze annotations in SeqAn using the annotationStore, which is part of SeqAn’s FragmentStore.

Advanced Tutorials

Fragment Store
This tutorial shows how to use the fragment store, which is a database for read mapping, sequence assembly or gene annotation. It supports reading and writing multiple read alignments in SAM or AMOS format, as well as accessing and modifying them. It also supports reading and writing gene annotations in GFF/GTF and UCSC format, creating custom annotation types, and traversing and modifying the annotation tree.
Consensus Alignment
This tutorial describes how to compute consensus alignments from NGS reads or other nucleic acid sequences, such as transcripts. The DNA sequences are stored in a fragment store, so that rough alignment information is available.
Realignment
This tutorial describes how to use SeqAn’s realignment module for refining multi-read alignments (or alignments of other sequences) stored in a fragment store.
Simple RNA-Seq
In this tutorial you will learn how to implement a simple RNA-Seq based gene quantification tool that computes RPKM expression levels based on a given genome annotation and RNA-Seq read alignments.
Journaled Set
In this tutorial we demonstrate how you can handle multiple large sequences in main memory, while the data structures themselves support a certain form of parallel sequence analysis.
KNIME Nodes
Here you can learn how to use SeqAn apps in KNIME.

Developer’s Corner

First, congratulations on becoming an official SeqAn developer! After you have worked through the tutorials and before you start to develop your own application with SeqAn, you might want to learn about Writing Tests and read about the API documentation. In addition, we follow the SeqAn-specific SeqAn Style Guides. Information like this can be found in this section. There is plenty of information to complete your knowledge of SeqAn, so have a look!

Frequently used Software Techniques

We assume that you are acquainted with the basic data types of SeqAn, the introductory example and the demo programs. You should also be acquainted with the STL and template programming. In this section we introduce the three main techniques of programming in SeqAn, namely the global function interface, the use of Metafunctions, and the concept of Template subclassing.

Basic Techniques
Here we remind you of the basics of template programming and the use of the STL.
Metafunctions
In this section you find an introductory explanation of how Metafunctions are used in SeqAn to obtain, at compile time, information about the data types in use.
Generic Programming
In this section you find a short example that illustrates the power of template subclassing.
Global Function Interface
In this section you find a useful piece of code that shows you the flexibility of the global function interface.