Contributing guidelines#
We are glad that you are interested in contributing to Ginkgo. Please have a look at our coding guidelines before proposing a pull request.
Table of Contents#
Most important stuff (A TL;DR)#
GINKGO_DEVEL_TOOLSneeds to be set toonto commit. This requiresclang-formatto be installed. See Automatic code formatting for more details. Once installed, you can runmake formatin yourbuild/folder to automatically format your modified files. Asmake formatunstages your files post-formatting, you must stage the files again once you have verified thatmake formathas done the appropriate formatting, before committing the files.See Our git workflow to get a quick overview of our workflow.
See Creating, Reviewing and Merging Pull Requests on how to create a Pull request.
Project structure#
Ginkgo is divided into a core module with common functionalities independent
of the architecture, and several kernel modules (reference, omp, cuda,
hip, dpcpp) which contain low-level computational routines for each
supported architecture.
Extended header files#
Some header files from the core module have to be extended to include special
functionality for specific architectures. An example of this is
core/base/math.hpp, which has a GPU counterpart in cuda/base/math.hpp. For
such files you should always include the version from the module you are working
on, and this file will internally include its core counterpart.
Using library classes#
You can use and call functions of existing classes inside a kernel (that are
defined and not just declared in a header file), however, you are not allowed to
create new instances of a polymorphic class inside a kernel (or in general
inside any kernel module like cuda/hip/omp/reference) as this creates circular
dependencies between the core and the backend library. With this in mind, our
CI contains a job which checks if such a circular dependency exists.
These checks can be run manually using the -DGINKGO_CHECK_CIRCULAR_DEPS=ON
option in the CMake configuration.
For example, when creating a new matrix class AB by combining existing classes
A and B, the AB::apply() function composed of invocations to A::apply()
and B::apply() can only be defined in the core module, it is not possible to
create instances of A and B inside the AB kernel files. This is to avoid
the aforementioned circular dependency issue. An example for such a class is the
Hybrid matrix format, which uses the apply() of the Ell and Coo matrix
formats. Nevertheless, it is possible to call the kernels themselves directly
within the same executor. For example, cuda::dense::add_scaled() can be called
from any other cuda kernel.
Code style#
Automatic code formatting#
Ginkgo uses pre-commit to automatically apply
code formatting when committing changes to git. What formatting is applied
is managed through ClangFormat
with a custom .clang-format configuration file (mostly based on ClangFormat’s
Google style). Make sure you have pre-commit set up and running properly
before committing anything that will end up in a pull request against
ginkgo-project/ginkgo repository. In addition, you should never modify the
.clang-format configuration file shipped with Ginkgo. E.g. if ClangFormat has
trouble reading this file on your system, you should install a newer version of
ClangFormat, and avoid commenting out parts of the configuration file.
ClangFormat is the primary tool that helps us achieve a uniform look of Ginkgo’s codebase, while reducing the learning curve of potential contributors. However, ClangFormat configuration is not expressive enough to incorporate the entire coding style, so there are several additional rules that all contributed code should follow.
Note: To learn more about how ClangFormat will format your code, see existing
files in Ginkgo, .clang-format configuration file shipped with Ginkgo, and
ClangFormat’s documentation.
Naming scheme#
Filenames#
Filenames use snake_case and use the following extensions:
C++ source files:
.cppC++ header files:
.hppCUDA source files:
.cuCUDA header files:
.cuhHIP source files:
.hip.cppHIP header files:
.hip.hppCommon source files used by both CUDA and HIP:
.hpp.incCMake utility files:
.cmakeShell scripts:
.sh
Note: A C++ source/header file is considered a CUDA file if it contains CUDA
code that is not guarded with #if guards that disable this code in non-CUDA
compilers. I.e. if a file can be compiled by a general C++ compiler, it is not
considered a CUDA file.
Macros#
Macros (both object-like and function-like macros) use CAPITAL_CASE. They have
to start with GKO_ to avoid name clashes (even if they are #undef-ed in the
same file!).
Variables#
Variables use snake_case.
Constants#
Constants use snake_case.
Functions#
Functions use snake_case.
Structures and classes#
Structures and classes which do not experience polymorphic behavior (i.e. do not
contain virtual methods, nor members which experience polymorphic behavior) use
snake_case.
All other structures and classes use CamelCase.
Members#
All structure / class members use the same naming scheme as they would if they were not members:
methods use the naming scheme for functions
data members the naming scheme for variables or constants
type members for classes / structures
Additionally, non-public data members end with an underscore (_).
Namespaces#
Namespaces use snake_case.
Template parameters#
Type template parameters use
CamelCase, for exampleValueType.Non-type template parameters use
snake_case, for examplesubwarp_size.
Whitespace#
Spaces and tabs are handled by ClangFormat, but blank lines are only partially handled (the current configuration doesn’t allow for more than 2 blank lines). Thus, contributors should be aware of the following rules for blank lines:
Top-level statements and statements directly within namespaces are separated with 2 blank lines. The first / last statement of a namespace is separated by two blank lines from the opening / closing brace of the namespace.
exception: if the first or the last statement in the namespace is another namespace, then no blank lines are required example:
namespace foo { struct x { }; } // namespace foo namespace bar { namespace baz { void f(); } // namespace baz } // namespace bar
exception: in header files whose only purpose is to declare a bunch of functions (e.g. the
*_kernel.hppfiles) these declarations can be separated by only 1 blank line (note: standard rules apply for all other statements that might be present in that file)exception: “related” statement can have 1 blank line between them. “Related” is not a strictly defined adjective in this sense, but is in general one of:
overload of a same function,
function / class template and it’s specializations,
macro that modifies the meaning or adds functionality to the previous / following statement.
However, simply calling function
ffrom functiongdoes not imply thatfandgare “related”.
Statements within structures / classes are separated with 1 blank line. There are no blank lines between the first / last statement in the structure / class.
exception: there is no blank line between an access modifier (
private,protected,public) and the following statement. example:class foo { public: int get_x() const noexcept { return x_; } int &get_x() noexcept { return x_; } private: int x_; };
Function bodies cannot have multiple consecutive blank lines, and a single blank line can only appear between two logical sections of the function.
Unit tests should follow the AAA pattern, and a single blank line must appear between consecutive “A” sections. No other blank lines are allowed in unit tests.
Enumeration definitions should have no blank lines between consecutive enumerators.
Include statement grouping#
The concrete ordering will be done by clang-format.
Here are the rules that clang-format will follow.
In general, all include statements should be present on the top of the file,
ordered in the following groups, with one blank lines between each group:
Main header file (e.g.
core/foo/bar.hppincluded incore/foo/bar.cpp, or in the unit testcore/test/foo/bar.cpp)Standard library headers (e.g.
vector)Executor specific library headers (e.g.
omp.h)System third-party library headers (e.g.
papi.h)Public Ginkgo headers
Local headers
Example: A file core/base/my_file.cpp might have an include list like this:
#include "ginkgo/core/base/my_file.hpp"
#include <algorithm>
#include <vector>
#include <tuple>
#include <omp.h>
#include <papi.h>
#include <third_party/blas/cblas.hpp>
#include <third_party/lapack/lapack.hpp>
#include <ginkgo/core/base/executor.hpp>
#include <ginkgo/core/base/lin_op.hpp>
#include <ginkgo/core/base/types.hpp>
#include "core/base/my_file_kernels.hpp"
Main header#
This section presents the handling of the main header attributed to a file.
For a given file, the main header is the header that contains the declarations of the
functions, classes, etc., which are implemented in this file.
In the previous example, this would be #include "ginkgo/core/base/my_file.hpp".
The clang-format tool figures out the main header. The only intervention form
a contributor is to always include the main header using "...".
Please note that this only applies to implementation files, so files ending in .cpp or .cu.
Some general comments.#
Private headers of Ginkgo should not be included within the public Ginkgo header.
It is a good idea to keep the headers self-sufficient, See Google Style guide for reasoning. When compiling with
GINKGO_CHECK_CIRCULAR_DEPSenabled, this property is explicitly checked.The recommendations of the
iwyu(Include what you use) tool can be used to make sure that the headers are self-sufficient and that the compiled files (.cu,.cpp,.hip.cpp) include only what they use. A CI pipeline is available that runs with theiwyutool. Please be aware that this tool can be incorrect in some cases.
Other Code Formatting not handled by ClangFormat#
Control flow constructs#
Single line statements should be avoided in all cases. Use of brackets is
mandatory for all control flow constructs (e.g. if, for, while, …).
Variable declarations#
C++ supports declaring / defining multiple variables using a single type-specifier. However, this is often very confusing as references and pointers exhibit strange behavior:
template <typename T> using pointer = T *;
int * x, y; // x is a pointer, y is not
pointer<int> x, y; // both x and y are pointers
For this reason, always declare each variable on a separate line, with its own type-specifier.
CMake coding style#
Whitespaces#
All alignment in CMake files should use four spaces.
Use of macros vs functions#
Macros in CMake do not have a scope. This means that any variable set in this macro will be available to the whole project. In contrast, functions in CMake have local scope and therefore all set variables are local only. In general, wrap all piece of algorithms using temporary variables in a function and use macros to propagate variables to the whole project.
Naming style#
All Ginkgo specific variables should be prefixed with a GINKGO_ and all
functions by ginkgo_.
Helper scripts#
To facilitate easy development within Ginkgo and to encourage coders and scientists who do not want get bogged down by the details of the Ginkgo library, but rather focus on writing the algorithms and the kernels, Ginkgo provides the developers with a few helper scripts.
Create a new algorithm#
A create_new_algorithm.sh script is available for developers to facilitate
easy addition of new algorithms. The options it provides can be queried with
./create_new_algorithm.sh --help
The main objective of this script is to add files and boiler plate code for the
new algorithm using a model and an instance of that model. For example, models
can be any one of factorization, matrix, preconditioner or solver. For
example to create a new solver named my_solver similar to gmres, you would
set the ModelType to solver and set the ModelName to gmres. This would
duplicate the core algorithm and kernels of the gmres algorithm and replace
the naming to my_solver. Additionally, all the kernels of the new my_solver
are marked as GKO_NOT_IMPLEMENTED. For easy navigation and .txt file is created
in the folder where the script is run, which lists all the TODO’s. These TODO’s can
also be found in the corresponding files.
Converting CUDA code to HIP code#
We provide a cuda2hip script that converts cuda kernel code into hip kernel code.
Internally, this script calls the hipify script provided by HIP, converting the CUDA syntax
to HIP syntax. Additionally, it also automatically replaces the instances of
CUDA with HIP as appropriate. Hence, this script can be called on a Ginkgo CUDA
file. You can find this script in the dev_tools/scripts/ folder.
Writing Tests#
Ginkgo uses the GTest framework for the unit test framework within Ginkgo. Writing good tests are extremely important to verify the functionality of the new code and to make sure that none of the existing code has been broken.
Testing know-how#
GTest provides a comprehensive documentation of the functionality available within Gtest.
Reduce code duplication with Testing Fixtures,
TEST_FWrite templated tests using
TYPED_TEST.
Some general rules.#
Unit tests must follow the KISS principle.
Unit tests must follow the AAA pattern, and a single blank line must appear between consecutive “A” sections.
Writing tests for kernels#
Reference kernels, kernels on the
ReferenceExecutor, are meant to be single threaded reference implementations. Therefore, tests for reference kernels need to be performed with data that can be as small as possible. For example, matrices lesser than 5x5 are acceptable. This allows the reviewers to verify the results for exactness with tools such as MATLAB.OpenMP, CUDA, HIP and DPC++ kernels have to be tested against the reference kernels. Hence data for the tests of these kernels can be generated in the test files using helper functions or by using external files to be read through the standard input. In particular for CUDA, HIP and DPC++ the data size should be at least bigger than the architecture’s warp size to ensure there is no corner case in the kernels.
Documentation style#
Documentation uses standard Doxygen.
Developer targeted notes#
Make use of @internal doxygen tag. This can be used for any comment which is
not intended for users, but is useful to better understand a piece of code.
Whitespaces#
Documenting examples#
There are two main steps:
First, you can just copy over the
doc/folder (you can copy it from the example most relevant to you) and adapt your example names and such, then you can modify the actual documentation.
In
tooltip: A short description of the example.In
short-intro: The name of the example.In
results.dox: Run the example and write the output you get.In
kind: The kind of the example. For different kinds see the documentation. Examples can be ofbasic,techniques,logging,stopping_criteriaorpreconditioners. If your example does not fit any of these categories, feel free to create one.In
intro.dox: You write an explanation of your code with some introduction similar to what you see in an existing example most relevant to you.In
builds-on: You write the examples it builds on.
You also need to modify the examples.hpp.in file. You add the name of the example in the main section and in the section that you specified in the
doc/kindfile in the example documentation.
Other programming comments#
C++ standard stream objects#
These are global objects and are shared inside the same translation unit.
Therefore, whenever its state or formatting is changed (e.g. using std::hex or
floating point formatting) inside library code, make sure to restore the state
before returning the control to the user. See this stackoverflow
question
for examples on how to do it correctly. This is extremely important for header
files.
Warnings#
By default, the -DCMAKE_CXX_FLAGS should be set to -Wpedantic to emit
pedantic warnings by default. The CI system additionally also has a step
where it compiles for pedantic warnings to be errors.
Avoiding circular dependencies#
To facilitate finding circular dependencies issues (see Using library
classes for more details), a CI step no-circular-deps
was created. For more details on its usage, see this
pipeline,
where Ginkgo did not abide to this policy and PR
#278 which fixed this. Note
that doing so is not enough to guarantee with 100% accuracy that no circular
dependency is present. For an example of such a case, take a look at this
pipeline
where one of the compiler setups detected an incorrect dependency of the cuda
module (due to jacobi) on the core module.