Pattern matching is about searching a known string or StringSet (
needle) in another string or StringSet (
This tutorial will introduce you into the SeqAn classes Finder and Pattern.
It will demonstrate how to use the spezializations of the class finder to perform either an online search or an index based seach.
And you will learn how to specify the search algorithm, which can be either exact or approximate.
In the case of approximate searching errors are allowed, which are either only mismatches or also indels.
Additionally there are filtration algorithms which return potential matches, i.e.
haystack segments possibly containing a pattern match.
All searching is done by calling the function find, which takes at least two arguments:
- A Finder that stores all necessary information about the
haystackand the last found position of the
needlewithin the haystack.
- A Pattern that stores all information about the
needle. Some variants of find support further arguments. The Finder and Pattern classes expect the underlying
needletypes as first template arguments. In addition, a second template argument specifies the search algorithm.
Each call of find finds only one match (or potential match) of the
needle within the haystack.
The Finder can be asked for the begin and end position of the last found match.
The Pattern can be asked for the number of the found sequence if the
needle is a StringSet.
Subsequent calls of find can be used to find more occurrences of the
needle, until no more occurrences can be found and find returns
In general, search algorithms can be divided into algorithms that preprocess the
needle (online search) or preprocess the
haystack (index search).
The additional section on Optimal Search Schemes is also an indexed search algorithm but due to a different interface separated into its own section.