Consensus Alignment¶
- Learning Objective
You will learn how to perform a consensus alignment of sequences (e.g. NGS reads) stored in a FragmentStore. After completing this tutorial, you will be able to perform a consensus alignment of reads with and without using alignment information. This is useful for the consensus step in sequence assembly.
- Difficulty
Advanced
- Duration
1 h
- Prerequisites
The SeqAn module <seqan/consensus.h>
allows the computation of consensus alignments based on the method by Rausch et al. [RKD+09].
It can be used for the consensus step in Overlap-Layout-Consensus assemblers.
Consensus with Approximate Positions¶
The consensus module has two modes. The first one is applicable when approximate positions of the reads are known. The following program demonstrates this functionality.
First, we include the necessary headers.
#include <iostream>
#include <seqan/store.h>
#include <seqan/consensus.h>
using namespace seqan2;
int main()
{
Next, the fragment store is filled with reads and approximate positions. The true alignment is shown in the comments.
FragmentStore<> store;
// Resize contigStore and contigNameStore (required for printing the first layout).
resize(store.contigStore, 1);
appendValue(store.contigNameStore, "ref");
// Actual read layout.
//
// AATGGATGGCAAAATAGTTGTTCCATGAATACATCTCTAAAGAGCTTT
// AAAATAGTTGTTCCATGAATACATCTCTAAAGAGCTTTGATGCTAATTT
// AGTTGTTCCATGAATACATCTCTAAAGAGCTTTGATGCTAATTTAGTCAAATTTTCAATACTGTA
// ACATCTCTAAAGAGCTTTGATGCTAATTTAGTCAAATT
// AGAGCTTTGATGCTAATTTAGTCAAATTTTCAATACTGTACAATCTTCTCTAG
// Append reads (includes small errors).
appendRead(store, "AATGGATGGCAAAATAGTTGTTCCATGAATACATCTCTAAAGAGCTTT");
appendRead(store, "AAAGTAGTTGTTCCATGAATACATCTCTAAAGAGCTTTGATGCTAATTT");
appendRead(store, "AGTTGTCCATGAATACATCTCTAAAGAGCTTTGATGCTAATTTAGTCAATTTTCAATACTGTA");
appendRead(store, "ACATCTCTTAAAGAGCTTTGATGCTAATTTAGTCAAATT");
appendRead(store, "AGAGCTTTGATGCTAATTTAGTCAAATTTTCAATACTGTACAATCTTCTCTAG");
// The position used in the following are only approximate and would
// not lead to the read layout above.
appendAlignedRead(store, 0, 0, 0, (int)length(store.readSeqStore[0]));
appendAlignedRead(store, 1, 0, 12, 12 + (int)length(store.readSeqStore[1]));
appendAlignedRead(store, 2, 0, 14, 14 + (int)length(store.readSeqStore[2]));
appendAlignedRead(store, 3, 0, 18, 18 + (int)length(store.readSeqStore[3]));
appendAlignedRead(store, 4, 0, 25, 25 + (int)length(store.readSeqStore[4]));
// Print the (wrong) alignment.
std::cout << "Initial alignment\n\n";
AlignedReadLayout layout;
layoutAlignment(layout, store);
printAlignment(std::cout, layout, store, /*contigID=*/ 0, /*beginPos=*/ 0, /*endPos=*/ 80, 0, 30);
This is followed by computing the consensus alignment using the function consensusAlignment.
ConsensusAlignmentOptions options;
options.useContigID = true;
consensusAlignment(store, options);
Finally, the alignment is printed using an AlignedReadLayout object.
std::cout << "Final alignment\n\n";
layoutAlignment(layout, store);
printAlignment(std::cout, layout, store, /*contigID=*/ 0, /*beginPos=*/ 0, /*endPos=*/ 80, 0, 30);
return 0;
}
Here is the program’s output:
Initial alignment
--------------------------------------------------------------------------------
AATGGATGGCAAAATAGTTGTTCCATGAATACATCTCTAAAGAGCTTT
AAAGTAGTTGTTCCATGAATACATCTCTAAAGAGCTTTGATGCTAATTT
AGTTGTCCATGAATACATCTCTAAAGAGCTTTGATGCTAATTTAGTCAATTTTCAATACTGTA
ACATCTCTTAAAGAGCTTTGATGCTAATTTAGTCAAATT
AGAGCTTTGATGCTAATTTAGTCAAATTTTCAATACTGTACAATCTTCTCTAG
Final alignment
AATGGATGGCAAAATAGTTGTTCCATGAATACATCTC-TAAAGAGCTTTGATGCTAATTTAGTCAAATTTTCAATACTGT
.....................................*...........
...G.......................*......................
.....*................*..........................*...............
.......T...............................
.......................................
Consensus without Approximate Positions¶
When setting the useContigID
member of the ConsensusAlignmentOptions object to false
then we can also omit adding approximate positions for the reads.
In this case, the consensus step performs an all-to-all alignment of all reads and then computes a consensus multi-read alignment for all of them.
This is demonstrated by the following program.
#include <iostream>
#include <seqan/store.h>
#include <seqan/consensus.h>
using namespace seqan2;
int main()
{
FragmentStore<> store;
// Actual read layout.
//
// AATGGATGGCAAAATAGTTGTTCCATGAATACATCTCTAAAGAGCTTT
// AAAATAGTTGTTCCATGAATACATCTCTAAAGAGCTTTGATGCTAATTT
// AGTTGTTCCATGAATACATCTCTAAAGAGCTTTGATGCTAATTTAGTCAAATTTTCAATACTGTA
// ACATCTCTAAAGAGCTTTGATGCTAATTTAGTCAAATT
// AGAGCTTTGATGCTAATTTAGTCAAATTTTCAATACTGTACAATCTTCTCTAG
// Append reads (includes small errors).
appendRead(store, "AATGGATGGCAAAATAGTTGTTCCATGAATACATCTCTAAAGAGCTTT");
appendRead(store, "AAAGTAGTTGTTCCATGAATACATCTCTAAAGAGCTTTGATGCTAATTT");
appendRead(store, "AGTTGTCCATGAATACATCTCTAAAGAGCTTTGATGCTAATTTAGTCAATTTTCAATACTGTA");
appendRead(store, "ACATCTCTTAAAGAGCTTTGATGCTAATTTAGTCAAATT");
appendRead(store, "AGAGCTTTGATGCTAATTTAGTCAAATTTTCAATACTGTACAATCTTCTCTAG");
ConsensusAlignmentOptions options;
options.useContigID = false;
consensusAlignment(store, options);
std::cout << "Final alignment\n\n";
AlignedReadLayout layout;
layoutAlignment(layout, store);
printAlignment(std::cout, layout, store, /*contigID=*/ 0, /*beginPos=*/ 0, /*endPos=*/ 80, 0, 30);
return 0;
}
Here is this modified programs’ output:
Final alignment
AATGGATGGCAAAATAGTTGTTCCATGAATACATCTC-TAAAGAGCTTTGATGCTAATTTAGTCAAATTTTCAATACTGT
.....................................*...........
...G.......................*......................
.....*................*..........................*...............
.......T...............................
.......................................