C++ Code Style
The aim of this style guide is to enforce a certain level of canonicity on all SeqAn code. Besides good comments, having a common style guide is the key to being able to understand and change code written by others easily.
(The style guide partially follows the Google C++ Code Style Guide.)
C++ Features
Reference Arguments
We prefer reference arguments to pointer arguments.
Use const where possible.
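As a minimal sketch of both rules, assuming two hypothetical helper functions (sum and scale are not part of SeqAn), read-only arguments are taken by const reference and modified arguments by non-const reference:

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper: `values` is only read, so it is a const reference.
inline int sum(std::vector<int> const & values)
{
    int result = 0;
    for (std::size_t i = 0; i < values.size(); ++i)
        result += values[i];
    return result;
}

// Hypothetical helper: `values` is modified, so it is a non-const reference
// rather than a pointer.
inline void scale(std::vector<int> & values, int factor)
{
    for (std::size_t i = 0; i < values.size(); ++i)
        values[i] *= factor;
}
```

A caller writes scale(v, 5) instead of scale(&v, 5), and there is no null pointer case to handle.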
Use C-Style Logical Operators
Use &&, ||, and ! instead of and, or, and not.
Although the alternative spellings have been available since C++98, MSVC does not support them out of the box; a special header <iso646.h> has to be included.
Also, they are unfamiliar to most C++ programmers and nothing in SeqAn is using them.
Default Arguments
Default arguments to global functions are problematic with generated forwards. They can be replaced with function overloading. So do not use them!
You can replace default arguments with function overloading as follows. Do not do this.
inline double f(int x, double y = 1.0)
{
// ...
}
Do this instead.
inline double f(int x, double y)
{
// ...
}
inline double f(int x)
{
return f(x, 1.0);
}
Exceptions
SeqAn functions throw exceptions only to report unrecoverable errors, usually during I/O. Functions that are expected to either succeed or fail use boolean return values to report their status instead.
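The distinction could look roughly as follows. Both functions are hypothetical illustrations, not SeqAn API: parseDigit may legitimately fail and reports this through its return value, while requireFileName treats its error as unrecoverable.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical parser: failure is an expected outcome, so the status is the
// boolean return value and the result is written to the output parameter.
inline bool parseDigit(int & digit, char c)
{
    if (c < '0' || c > '9')
        return false;
    digit = c - '0';
    return true;
}

// Hypothetical I/O helper: an empty file name is assumed to be unrecoverable
// here, so it is reported with an exception.
inline void requireFileName(std::string const & fileName)
{
    if (fileName.empty())
        throw std::runtime_error("No file name given.");
}
```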
Virtual Member Functions
SeqAn heavily uses template subclassing instead of C++ built-in subclassing. This technique requires using global functions instead of in-class member functions.
If the design requires using in-class member functions, the keyword virtual should be avoided.
Virtual member functions cannot be inlined and are thus slow when used in tight loops.
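The following is a small sketch of the idea behind template subclassing, not SeqAn's actual class hierarchy. The names Greeter, Loud and volume are hypothetical, and Tag<> is redefined locally as a stand-in so the example is self-contained. The overload is selected at compile time, so no virtual dispatch is involved.

```cpp
// Hypothetical stand-in for the library's Tag<> template.
template <typename T>
struct Tag {};

struct Loud_;
typedef Tag<Loud_> Loud;

// "Base class": the specialization is selected through TSpec.
template <typename TSpec>
struct Greeter
{};

// Generic global function; applies to every Greeter<TSpec>.
template <typename TSpec>
inline int volume(Greeter<TSpec> const & /*greeter*/)
{
    return 1;
}

// Overload for the Loud specialization; resolved at compile time, so the
// call can be inlined.
inline int volume(Greeter<Loud> const & /*greeter*/)
{
    return 10;
}
```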
static_cast<>
Prefer static_cast<> to C-style casts.
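For illustration, a hypothetical helper (ratio is not a SeqAn function) that spells out the conversion instead of writing (double)numerator:

```cpp
// Hypothetical helper: the int-to-double conversion is made explicit with
// static_cast<> instead of a C-style cast.
inline double ratio(int numerator, int denominator)
{
    return static_cast<double>(numerator) / static_cast<double>(denominator);
}
```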
const_cast<>
Use const-casts only to make an object const. Do not remove consts.
Rather, use the mutable keyword on selected members.
const_cast<> is allowed for interfacing with external (C) APIs where the const keyword is missing but which do not modify the variable.
The following is an example where const_cast<> is OK:
template <typename T>
bool isXyz(T const & x)
{
return x._member == 0;
}
template <typename T>
bool isXyz(T & x)
{
return const_cast<T const &>(x)._member == 0;
}
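For the mutable alternative mentioned above, a minimal sketch could look like this. Squarer and square are hypothetical names; the point is only that the cache members can be updated through a const reference without any const_cast<>.

```cpp
// Hypothetical class: the cache is an implementation detail, so its members
// are declared mutable.
struct Squarer
{
    int _value;
    mutable bool _cacheValid;
    mutable int _cache;

    Squarer(int value) :
        _value(value),
        _cacheValid(false),
        _cache(0)
    {}
};

// Logically const: only the mutable cache members are written.
inline int square(Squarer const & squarer)
{
    if (!squarer._cacheValid)
    {
        squarer._cache = squarer._value * squarer._value;
        squarer._cacheValid = true;
    }
    return squarer._cache;
}
```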
reinterpret_cast<>
Only use reinterpret_cast<> when you absolutely have to and you know what you are doing!
Sometimes, it is useful for very low-level code but mostly it indicates a design flaw.
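One case that is generally considered legitimate low-level use is inspecting the bytes of an object through an unsigned char pointer. The helper below is a hypothetical illustration, not SeqAn code:

```cpp
#include <cstddef>

// Hypothetical low-level helper: looks at the object representation of an
// unsigned value byte by byte. Reading any object through an
// unsigned char pointer is permitted by the language.
inline bool allBytesZero(unsigned const & value)
{
    unsigned char const * bytes = reinterpret_cast<unsigned char const *>(&value);
    for (std::size_t i = 0; i < sizeof(unsigned); ++i)
    {
        if (bytes[i] != 0)
            return false;
    }
    return true;
}
```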
pre/post increment/decrement
Prefer the “pre” variants for decrement and increment, especially in loops. Their advantage is that no copy of an object has to be made.
Good:
typedef Iterator<TContainer>::Type TIterator;
for (TIterator it = begin(container); !atEnd(it); ++it)
{
    // do work
}
Bad:
typedef Iterator<TContainer>::Type TIterator;
for (TIterator it = begin(container); !atEnd(it); it++)
{
    // do work
}
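To see where the copy comes from, consider a hypothetical iterator-like type (CountingIterator is made up for this sketch) that counts how often it is copied. The post-increment operator has to keep the old state in order to return it:

```cpp
// Hypothetical iterator-like type that counts its copies.
struct CountingIterator
{
    int _position;
    static int copies;

    CountingIterator() : _position(0)
    {}

    CountingIterator(CountingIterator const & other) : _position(other._position)
    {
        ++copies;
    }

    // Pre-increment: modifies the object in place and returns a reference.
    CountingIterator & operator++()
    {
        ++_position;
        return *this;
    }

    // Post-increment: copies the old state so that it can be returned.
    CountingIterator operator++(int)
    {
        CountingIterator old(*this);
        ++_position;
        return old;
    }
};

int CountingIterator::copies = 0;
```

For built-in types a compiler will usually optimize the difference away, but for non-trivial iterators the extra copy may remain.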
Code Quality
Const-Correctness
Write const-correct code. Read the C++ FAQ const correctness article for more information. Among other things, this allows using temporary objects without copying in functions that do not need to change their arguments.
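A short sketch of the temporary-object case, using a hypothetical function countChar: because the text parameter is a const reference, callers can pass a temporary directly, which a non-const reference parameter would reject.

```cpp
#include <cstddef>
#include <string>

// Hypothetical function: `text` is not modified, so it is a const reference
// and can bind to temporaries.
inline std::size_t countChar(std::string const & text, char c)
{
    std::size_t result = 0;
    for (std::size_t i = 0; i < text.size(); ++i)
    {
        if (text[i] == c)
            ++result;
    }
    return result;
}
```

A call such as countChar(std::string("banana"), 'a') compiles without first storing the string in a named variable.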
Compiler Warnings
All code in the repository must compile without any warnings using the flags generated by the CMake system.
Currently, the GCC flags are:
-W -Wall -Wstrict-aliasing -pedantic -Wno-long-long -Wno-variadic-macros
Style Conformance
Follow this code style whenever possible. However, prefer consistency to conformance.
If you are editing non-conforming code, consider whether you could or should adapt the whole file to the new style. If this is not feasible, keep the file's existing style consistent.
Semantics
Parameter Ordering
The general parameter order should be (1) output, (2) non-const input (e.g. file handles), (3) input, (4) tags. Within these groups, the order should be from mandatory to optional.
In SeqAn, we read functions f(out1, out2, out3, ..., in1, in2, in3, ...) as (out1, out2, out3, ...) <- f(in1, in2, in3, ...).
E.g. assign():
template <typename T>
void assign(T & out, T const & in)
{
    out = in;
}
Scoping, Helper Code
Global Variables
Do not use global variables. They introduce hard-to-find bugs and require the introduction of a link-time library.
Structs and Classes
Visibility Specifiers
Visibility specifiers should go on the same indentation level as the class keyword.
Example:
class MyStruct
{
public:
protected:
private:
};
Tag Definitions
Tags that are possibly also used in other modules must not have additional parameters and be defined using the Tag<> template.
Tags that have parameters must only be used within the module they are defined in and have non-generic names.
Tags defined with the Tag<> template and a typedef can be defined multiple times.
These definitions must have the following pattern:
struct TagName_;
typedef Tag<TagName_> TagName;
This way, there can be multiple definitions of the same tag since the struct TagName_ is only declared but not defined and there can be duplicate typedefs.
For tags (also those used for specialization) that have template parameters, the case is different.
Here, we cannot wrap them inside the Tag<> template with a typedef since it still depends on parameters.
Also we want to be able to instantiate tags so we can pass them as function arguments.
Thus, we have to add a struct body and thus define the struct.
There cannot be multiple identical definitions in C++.
Thus, each tag with parameters must have a unique name throughout SeqAn.
Names that might be too generic should be avoided.
E.g. Chained should be reserved as the name for a global tag but ChainedFile<> can be used as a specialization tag in a file-related module.
Note that this restriction does not apply to internally used tags (e.g. those that have an underscore suffix) since these can be renamed without breaking the public API.
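The two kinds of tags can be sketched as follows. Tag<> is redefined locally as a stand-in for the library template, and dispatch is a hypothetical function that only shows how tag objects select an overload:

```cpp
// Hypothetical stand-in for the library's Tag<> template.
template <typename T>
struct Tag {};

// Parameterless tag: the struct is only declared and the typedef may be
// repeated, so this pair of lines can appear in several headers.
struct Standard_;
typedef Tag<Standard_> Standard;
struct Standard_;
typedef Tag<Standard_> Standard;

// Tag with a parameter: it needs a body so that it can be instantiated.
// It must therefore be defined exactly once and have a non-generic name.
template <typename TSpec>
struct ChainedFile {};

// Tags are passed as empty objects to select an overload.
inline int dispatch(Standard const & /*tag*/)
{
    return 1;
}

template <typename TSpec>
inline int dispatch(ChainedFile<TSpec> const & /*tag*/)
{
    return 2;
}
```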
In-Place Member Functions
Whenever possible, functions should be declared and defined outside the class. The constructor, the destructor and a few operators have to be defined inside the class, however.
The following have to be declared and defined within the class (also see Wikipedia):
constructors
destructors
function call operator: operator()
type cast operator: operator T()
array subscript operator: operator[]()
dereference-and-access-member operator: operator->()
assignment operator: operator=()
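As an illustration, a hypothetical container (SmallBuffer is not a SeqAn class) keeps only the constructor and the subscript operator inside the class, while everything else lives outside:

```cpp
#include <cstddef>

// Hypothetical fixed-size container.
class SmallBuffer
{
public:
    int _data[4];

    // The constructor and operator[] have to be members.
    SmallBuffer()
    {
        for (std::size_t i = 0; i < 4; ++i)
            _data[i] = 0;
    }

    int & operator[](std::size_t pos)
    {
        return _data[pos];
    }

    int const & operator[](std::size_t pos) const
    {
        return _data[pos];
    }
};

// Everything else is declared and defined outside the class.
inline std::size_t length(SmallBuffer const & /*buffer*/)
{
    return 4;
}

inline bool operator==(SmallBuffer const & lhs, SmallBuffer const & rhs)
{
    for (std::size_t i = 0; i < length(lhs); ++i)
    {
        if (lhs[i] != rhs[i])
            return false;
    }
    return true;
}
```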
Formatting
Constructor Initialization Lists
If the whole function prototype fits in one line, keep it in one line. Otherwise, wrap the line after the colon and put each initializer on its own line, indented by one level. Align the initialization list.
Example:
class MyClass
{
    MyClass() :
        member1(0),
        member2(1),
        member3(3)
    {}
};
Line Length
The maximum line length is 120. Use a line length of 80 for header comments and the code section separators.
Non-ASCII Characters
All files should be UTF-8 encoded. Nevertheless, non-ASCII characters should not occur in them.
In comments, use ss instead of ß and ae instead of ä etc.
In strings, use UTF-8 coding.
For example, "\xEF\xBB\xBF" is the Unicode zero-width no-break space character, which would be invisible if included in the source as straight UTF-8.
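A string containing a non-ASCII character could thus be written with escaped UTF-8 bytes, as in this sketch. The function aUmlaut is a hypothetical example:

```cpp
#include <cstring>

// Hypothetical example: the letter a-umlaut (U+00E4) written as its two
// UTF-8 bytes, so the source file itself stays pure ASCII.
inline char const * aUmlaut()
{
    return "\xC3\xA4";
}
```

Note that a hex escape extends over all following hexadecimal digits, so a string such as "\xC3\xA4bc" may need to be split into "\xC3\xA4" "bc".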
Spaces VS Tabs
Do not use tabs! Use spaces.
Use "\t" in strings instead of plain tabs.
After some discussion, we settled on this. All programmer’s editors can be configured to use spaces instead of tabs. We use four spaces instead of a tab.
There can be problems when indenting in for loops with tabs, for example.
Consider the following (-->| is a tab, _ is a space):
for (int i = 0, j = 0, k = 0, ...;
_____cond1 && cond2 &&; ++i)
{
// ...
}
Here, the second line is indented with spaces so that it lines up with the previous line. Mixing tabs and spaces works, too. However, since tabs are not shown in the editor, people who are free to mix both might end up re-indenting such a file with spaces only.
for (int i = 0, j = 0, k = 0, ...;
-->|_cond1 && cond2 &&; ++i)
{
// ...
}
Indentation
We use an indentation of four spaces per level.
Note that namespaces do not cause an increase in indentation level.
namespace seqan2 {
class SomeClass
{
};
} // namespace seqan2
Trailing Whitespace
Trailing whitespace is forbidden.
Trailing whitespace is not visible, whereas leading whitespace used for indentation is perceptible through the text following it. Anything that cannot be seen can lead to “trash changes” in the GitHub repository when somebody accidentally removes it.
Inline Comments
Use inline comments to document variables.
Possibly align inline comments.
short x;    // a short is enough!
int myVar;  // this is my variable, do not touch it
Brace Positions
Always put braces on the next line.
class MyClass
{
public:
int x;
MyClass() : x(10)
{}
};
void foo(char c)
{
switch (c)
{
case 'X':
break;
}
// ...
}
Conditionals
Use no spaces inside the parentheses. The else keyword belongs on a new line. Use block braces consistently.
Conditional statements should look like this:
if (a == b)
{
return 0;
}
else if (c == d)
{
int x = a + b + d;
return x;
}
if (a == b)
return 0;
else if (c == d)
return a + b + d;
Do not leave out the spaces before and after the parentheses, and do not put leading or trailing spaces inside the parentheses. The following is wrong:
if (foo){
return 0;
}
if(foo)
return 0;
if (foo )
return 0;
Make sure to add braces to all blocks if any block has one. The following is wrong:
if (a == b)
return 0;
else if (c == d)
{
int x = a + b + d;
return x;
}
Loops and Switch Statements
Switch statements may use braces for blocks.
Empty loop bodies should use {} or continue.
Format your switch statements as follows. The usage of blocks is optional. Blocks can be useful for declaring variables inside the switch statement.
switch (var)
{
case 0:
return 1;
case 1:
return 0;
default:
SEQAN_FAIL("Invalid value!");
}
switch (var2)
{
case 0:
return 1;
case 1:
{
int x = 0;
for (int i = 0; i < var3; ++i)
x += i;
return x;
}
default:
SEQAN_FAIL("Invalid value!");
}
Empty loop bodies should use {} or continue, but not a single semicolon.
while (condition)
{
// Repeat test until it returns false.
}
for (int i = 0; i < kSomeNumber; ++i)
{} // Good - empty body.
while (condition)
continue; // Good - continue indicates no logic.
Expressions
Binary expressions are surrounded by one space. Unary expressions are preceded by one space.
Example:
if (a == b || c == d || e == f || !x)
{
// ...
}
bool y = !x;
unsigned i = ~j;
Type Expressions
No spaces around period or arrow.
Add spaces before and after pointer and references.
const comes after the type.
The following are good examples:
int x = 0;
int * ptr = &x; // OK, spaces are good.
int const & ref = x; // OK, const after int
int main(int argc, char ** argv); // OK, group pointers.
Bad Examples:
int x = 0;
int* ptr = &x; // bad spaces
int *ptr = &x; // bad spaces
const int & ref = x; // wrong placement of const
int x = ptr -> z; // bad spaces
int x = obj. z; // bad spaces
Function Return Types
If a function definition is short, everything is on the same line. Otherwise, split.
Good example:
int foo();
template <typename TString>
typename Value<TString>::Type
anotherFunction(TString const & foo, TString const & bar, /*...*/)
{
// ...
}
Inline Functions
If a function definition is short, everything is on the same line. Otherwise put inline and return type in the same line.
Good example:
inline int foo();
template <typename TString>
inline typename Value<TString>::Type
anotherFunction(TString const & foo, TString const & bar, /*...*/)
{
// ...
}
Function Argument Lists
If it fits in one line, keep it in one line. Otherwise, wrap at the parenthesis and put each argument on its own line. For very long function names and parameter lines, break after the opening parenthesis.
Good example:
template <typename TA, typename TB>
inline void foo(TA & a, TB & b);
template </*...*/>
inline void foo2(TA & a,
TB & b,
...
TY & y,
TZ & z);
template </*...*/>
inline void _functionThisIsAVeryVeryLongFunctionNameSinceItsAHelper(
TThisTypeWasMadeToForceYouToWrapInTheLongNameMode & a,
TB & b,
TC & c,
TB & d,
...);
Template Argument Lists
Follow conventions of function parameter lists, no blank after opening <.
As for function parameters, try to fit everything on one line if possible, otherwise, break the template parameters over multiple lines and put the commas directly after the type names.
template <typename T1, typename T2>
void foo() {}
template <typename T1, typename T2, ...
typename T10, typename T11>
void bar() {}
Multiple closing > go to the same line and are only separated by spaces if two closing angular brackets come after each other.
typedef Iterator<Value<TValue>::Type,
Standard> ::Type
typedef String<char, Alloc<> > TMyString
// -------------------------^
Function Calls
Similar rules as in Function Argument Lists apply. When wrapped, not each parameter has to occur on its own line.
Example:
foo(a, b);
foo2(a, b, c, ...
x, y, z);
if (x)
{
if (y)
{
_functionThisIsAVeryVeryLongFunctionNameSinceItsAHelper(
firstParameterWithALongName, b, c, d);
}
}
Naming Rules
In the following, camel case means that the first letter of each word is written upper case, the remainder is written in lower case. Abbreviations of length 2 are kept in upper case, longer abbreviations are camel-cased.
Macros
Macros are all upper case, separated by underscores, prefixed with SEQAN_.
Example:
SEQAN_ASSERT_EQ(val1, val2);
#define SEQAN_MY_TMP_MACRO(x) f(x)
// ...
SEQAN_MY_TMP_MACRO(1);
// ...
#undef SEQAN_MY_TMP_MACRO
Variable Naming
Variables are named in camel case, starting with a lower-case letter. Internal member variables have an underscore prefix.
Example:
int x;
int myVar;
int saValue(/*...*/);
int getSAValue(/*...*/);
struct FooBar
{
int _x;
};
Constant / Enum Value Naming
Constant and enum values are named like macros: all upper case, separated by underscores.
Example:
enum MyEnum
{
MY_ENUM_VALUE1 = 1,
MY_ENUM_VALUE2 = 20
};
int const MY_VAR = 10;
Struct / Enum / Class Naming
Types are written in camel case, starting with an upper case character.
Internal library types have an underscore suffix.
Example:
struct InternalType_
{};
struct SAValue
{};
struct LcpTable
{};
Metafunction Naming
Metafunctions are named like structs, defined values are named VALUE, types Type.
Metafunctions should not export any other types or values publicly; such helper types and values should have an underscore suffix.
Example:
template <typename T>
struct MyMetaFunction
{
    typedef typename RemoveConst<T>::Type TNonConst_;
    typedef TNonConst_ Type;
};
template <typename T>
struct MyMetaFunction2
{
typedef True Type;
static bool const VALUE = false;
};
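To show how such metafunctions are evaluated, the sketch below defines two small metafunctions locally. RemoveConst and IsSame are written out here as hypothetical stand-ins so that the example is self-contained; the library's own definitions may differ.

```cpp
// Hypothetical stand-in: returns T with a top-level const removed.
template <typename T>
struct RemoveConst
{
    typedef T Type;
};

template <typename T>
struct RemoveConst<T const>
{
    typedef T Type;
};

// Hypothetical stand-in: VALUE is true if both types are identical.
template <typename T1, typename T2>
struct IsSame
{
    static bool const VALUE = false;
};

template <typename T>
struct IsSame<T, T>
{
    static bool const VALUE = true;
};
```

A type-returning metafunction is used as RemoveConst<int const>::Type, a value-returning one as IsSame<int, int>::VALUE; both are evaluated at compile time.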
Function Naming
The same naming rule as for variables applies.
Example:
void fooBar();
template <typename T>
int saValue(T & x);
template <typename T>
void lcpTable(T & x);
Names In Documentation
In the documentation, classes have the same name as in the source code, e.g. the class StringSet is documented as “class StringSet.”
Specializations are named “$SPEC $CLASS“, e.g. “Concat StringSet”, “Horspool Finder.”
Source Tree Structure
File Name Rules
Files and directories are named all-lower case, words are separated by underscores.
Exceptions are INFO, COPYING, README, … files.
Examples:
string_base.h
string_packed.h
suffix_array.h
lcp_table.h
File Structure
Header #define guard
The header #define include guards are constructed from the full path of the file relative to the repository root.
Example:
| filename | preprocessor symbol |
|---|---|
| seqan/include/seqan/basic/iterator_base.h | SEQAN_INCLUDE_SEQAN_BASIC_ITERATOR_BASE_H_ |
#ifndef SEQAN_INCLUDE_SEQAN_BASIC_ITERATOR_BASE_H_
#define SEQAN_INCLUDE_SEQAN_BASIC_ITERATOR_BASE_H_
#endif // #ifndef SEQAN_INCLUDE_SEQAN_BASIC_ITERATOR_BASE_H_
Include Order
The include order should be (1) standard library requirements, (2) external requirements, (3) required SeqAn modules.
In SeqAn module headers (e.g. basic.h), all files in the module are then included.
CPP File Structure
// ==========================================================================
// $APP_NAME
// ==========================================================================
// Copyright (c) 2006-2024, Knut Reinert, FU Berlin
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above copyright
// notice, this list of conditions and the following disclaimer in the
// documentation and/or other materials provided with the distribution.
// * Neither the name of Knut Reinert or the FU Berlin nor the names of
// its contributors may be used to endorse or promote products derived
// from this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
// ARE DISCLAIMED. IN NO EVENT SHALL KNUT REINERT OR THE FU BERLIN BE LIABLE
// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
// LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
// OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
// DAMAGE.
//
// ==========================================================================
// Author: $AUTHOR_NAME <$AUTHOR_EMAIL>
// ==========================================================================
// $FILE_COMMENT
// ==========================================================================
#include <seqan/basic.h>
#include <seqan/sequence.h>
#include "app_name.h"
using namespace seqan2;
// Program entry point
int main(int argc, char const ** argv)
{
// ...
}
Application Header Structure
// ==========================================================================
// $APP_NAME
// ==========================================================================
// Copyright (c) 2006-2024, Knut Reinert, FU Berlin
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above copyright
// notice, this list of conditions and the following disclaimer in the
// documentation and/or other materials provided with the distribution.
// * Neither the name of Knut Reinert or the FU Berlin nor the names of
// its contributors may be used to endorse or promote products derived
// from this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
// ARE DISCLAIMED. IN NO EVENT SHALL KNUT REINERT OR THE FU BERLIN BE LIABLE
// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
// LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
// OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
// DAMAGE.
//
// ==========================================================================
// Author: $AUTHOR_NAME <$AUTHOR_EMAIL>
// ==========================================================================
// $FILE_COMMENT
// ==========================================================================
#ifndef APPS_APP_NAME_HEADER_FILE_H_
#define APPS_APP_NAME_HEADER_FILE_H_
// ==========================================================================
// Forwards
// ==========================================================================
// ==========================================================================
// Tags, Classes, Enums
// ==========================================================================
// --------------------------------------------------------------------------
// Class ClassName
// --------------------------------------------------------------------------
// ==========================================================================
// Metafunctions
// ==========================================================================
// --------------------------------------------------------------------------
// Metafunction MetafunctionName
// --------------------------------------------------------------------------
// ==========================================================================
// Functions
// ==========================================================================
// --------------------------------------------------------------------------
// Function functionName()
// --------------------------------------------------------------------------
#endif // APPS_APP_NAME_HEADER_FILE_H_
Library Header Structure
// ==========================================================================
// SeqAn - The Library for Sequence Analysis
// ==========================================================================
// Copyright (c) 2006-2024, Knut Reinert, FU Berlin
// All rights reserved.
//
// Redistribution and use in source and binary forms, with or without
// modification, are permitted provided that the following conditions are met:
//
// * Redistributions of source code must retain the above copyright
// notice, this list of conditions and the following disclaimer.
// * Redistributions in binary form must reproduce the above copyright
// notice, this list of conditions and the following disclaimer in the
// documentation and/or other materials provided with the distribution.
// * Neither the name of Knut Reinert or the FU Berlin nor the names of
// its contributors may be used to endorse or promote products derived
// from this software without specific prior written permission.
//
// THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
// AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
// IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
// ARE DISCLAIMED. IN NO EVENT SHALL KNUT REINERT OR THE FU BERLIN BE LIABLE
// FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
// DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
// SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
// CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
// LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
// OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH
// DAMAGE.
//
// ==========================================================================
// Author: AUTHOR NAME <AUTHOR EMAIL>
// ==========================================================================
// SHORT COMMENT ON WHAT THIS FILE CONTAINS
// ==========================================================================
#ifndef INCLUDE_SEQAN_BASIC_ITERATOR_BASE_H_
#define INCLUDE_SEQAN_BASIC_ITERATOR_BASE_H_
namespace seqan2 {
// ==========================================================================
// Forwards
// ==========================================================================
// ==========================================================================
// Tags, Classes, Enums
// ==========================================================================
// --------------------------------------------------------------------------
// Class ClassName
// --------------------------------------------------------------------------
// ==========================================================================
// Metafunctions
// ==========================================================================
// --------------------------------------------------------------------------
// Metafunction MetafunctionName
// --------------------------------------------------------------------------
// ==========================================================================
// Functions
// ==========================================================================
// --------------------------------------------------------------------------
// Function functionName()
// --------------------------------------------------------------------------
} // namespace seqan2
#endif // INCLUDE_SEQAN_BASIC_ITERATOR_BASE_H_
Comments
File Comments
Each file should begin with a file header.
The file header has the format shown in the File Structure section. The skel.py tool automatically generates files with appropriate headers.
Class, Function, Metafunction, Enum, Macro DDDoc Comments
Each public class, function, metafunction, enum, and macro should be documented using dox API docs. Internal code should be documented, too.
Example:
Implementation Comments
All functions etc. should be well-documented. In most cases, it is more important how something is done than what is done.
TODO Comments
TODO comments have the format // TODO($USERNAME): $TODO_COMMENT.
The username is that of the person writing the item, not of the person who is supposed to fix it. Use GitHub issues for the latter.