Castor: A C++ library to code “à la Matlab”

The objective of the Castor framework is to propose high-level semantics, inspired by the Matlab language, allowing fast software prototyping in a low-level compiled language. It is nothing more than a matrix management layer using the tools of the standard C++ library (C++14 and later), in different storage formats (full, sparse and hierarchical). Linear algebra operations are built over the BLAS API and graphic rendering is performed in the VTK framework. The Castor framework is provided as an open source software under the LGPL 3.0, compiled and validated with clang and gcc.


Statement of need
Matlab is a software used worldwide in numerical prototyping, due to its particularly userfriendly semantics and its certified toolboxes. However, many use cases do not allow codes in Matlab format, for example multi-platform portability issues, proprieraty licensing and more generally code interfacing. To start meeting these needs, a header-only template library for matrix management has been developed, based on the standard C++14 and later library, by encapsulating the std::vector class. Many tools and algorithms are provided to simplify the development of prototypes: • dense, sparse and hierarchical matrices manipulations, • linear algebra computations (Anderson et al., 1999), • graphical representations (Schroeder et al., 2000).
This high-level semantic/low-level language coupling makes it possible to gain efficiency in the developpement, while ensuring performance for applications. In addition, direct access to data structures allows users to optimize the most critical parts of their code. Finally, a complete documentation is available, as well as continuous integration unit tests. All of this makes it possible to meet the needs of teaching (notebooks using a C++ interpreter such as Cling), academic research and industrial applications at the same time.

State of the field
For a developer accustomed to the Matlab language, it is natural to turn to prototyping tools such as Numpy or Julia, to produce open-source codes. Indeed, these tools today offer similar semantics and performance, with well-established user communities. To illustrate this similarity, the following codes perform the same tasks, with one implementation in Matlab (MATLAB, 2010) (left) and another in Julia (Bezanson et al., 2012)  end toc @time test(); disp("done."); display("done."); Despite the many advantages that these languages have and their high popularity, many codes are still developed natively in Fortran, C, and C++, for practical or historical reasons. Even if there are tools to automatically generate C/C++ code from a high-level language (as Matlab Coder ), this work is often done manually by specialists. To find high-level semantics in native C++, we can turn to libraries like Eigen (Guennebaud et al., 2010), which offers a matrix API and efficient algebra tools. However, as the comparison below shows, the transcription from a Matlab code to an Eigen-based C++ code is not immediate: auto toc = high_resolution_clock::now(); auto duration = duration_cast<microseconds>(toc -tic); std::cout « "Elapsed time: " « duration.count()*1e-6 « std::endl; std::cout « "done." « std::endl; return 0; } To complete this example, other references are available on this link. This is why all the features of the Castor library have been designed and developed so that the semantics at user level are as close to Matlab as what C++ allows. Moreover, to gain in portability, the manipulations of full matrices and the main algorithms depend only on the standard library which is available on the most majority of operating systems (MacOS, Linux, Windows, Android, etc.). Only advanced linear algebra tools require an external BLAS / LAPACK API, as well as graphical visualization functionality (VTK). The example below illustrates this goal:

Matlab
Castor

Matlab
Castor disp("done."); disp("done."); return 0; } Note: It is important to specify that the Castor library is far from offering today all the functionalities offered by Matlab and its many toolboxes.

Dense Matrix
The dense matrix part of the Castor framework implements its own templatized class matrix< T> in matrix.hpp, where T can be fundamental types of C++ as well as std::complex. This class is built over a std::vector<T> which holds the values (ISO/IEC, 2014). Note that the element of a matrix is stored in row-major order and that a vector is considered as a 1 × n or n × 1 matrix.
The class matrix<T> provides many useful functions and operators such as: • builders which can be used to initialize all coefficients (zeros, ones, eye, etc.), • standard algorithms over data stored in matrices (norm, max, sort, argsort, etc.), • mathematical functions which can be applied element-wise (cos, sqrt, conj, etc.), • matrix manipulations like concatenate matrices in all dimensions, find the non-zero elements or transpose them, reshape size, etc., • standard C++ operators which have been overloaded and work element-wise ( • values accessors and matrix views with linear and bi-linear indexing, • elements of linear algebra, such as the matrix product (mtimes or tgemm) and linear system resolution (multi-right-hand-side gmres), • many other tools to display elements (<<, disp), save and load elements from file (ASCII or binary), etc.
The API provides more than a hundred functions and is designed such that it should feel like using Matlab. For advanced users, direct access to the data stored in the std::vector<T> enables all or part of an algorithm to be optimized in native C++.
This example displays the sum of two matrices with implicit cast :

Linear Algebra
The linear algebra part of the framework, implemented in linalg.hpp, provides a set of useful functions to perform linear algebra computations by linking to optimized implementations of the BLAS and LAPACK standards (Anderson et al., 1999) (OpenBLAS, oneAPI MKL, etc.).
The BLAS part is a straightforward overlay of the C-BLAS type III API, which is compatible with row-major ordering. This is achieved by a template specialization of the tgemm function, which allows optimized implementations to take control of the computation using sgemm, dgemm, cgemm and zgemm. Thanks to this interface, naive implementations proposed in matrix.hpp for dense matrix-products mtimes and tgemm may be improved in term of performance, especially for large matrices.
The LAPACK part is a direct overlay over the Fortran LAPACK API, which uses a column ordering storage convention. This interface brings new high-level functionalities, such as a linear solver (linsolve), matrix inversion (inv, pinv), factorizations (qr, lu), the search for eigen or singular values decompositions (eig ,svd), aca compression (aca), etc. It uses templatized low-level functions following the naming convention close to the LAPACK one (like tgesdd, tgeqrf, etc. In addition, an iterative multi-right-hand-side solver gmres is available in matrix.hpp, without dependency on BLAS and LAPACK.

2D/3D Visualization
The graphic rendering part, provided by graphics.hpp, features 2D/3D customizable plotting and basic mesh generation. It is based on the well-known VTK library (Schroeder et al., 2000).
Here again, the approach tries to get as close as possible to Matlab semantics.
First, the user creates a figure, which is a dynamic container of data to display. The figure class is composed of a vtkContextView class, providing a view with a default interactor style, renderer, etc. Then, graphic representations can be added to the figure, using functions like plot, imagesc, plot3, mesh, etc. Options are available to customize the display of the results, such as the plotting style, legend, colorbar and others basic stuff. Finally, the drawnow function must be called to display all defined figures. The latters are displayed and manipulated in independent windows.
In addition, graphics exports are available in different compression formats (png,jpg, tiff, etc.), as well as video rendering (ogg).

Sparse matrices
Some matrices have sparse structures, with many (many) zeros that do not need to be stored (Tewarson, 1973). There are adapted storage formats for this type of structure (LIL, COO, CSR, etc.), the most natural being to store the indices of rows and columns for each non-zero value, as a list of triplet {i, j, v}. For the Castor framework, a dedicated template class to this kind of matrix has been developed (see smatrix.hpp). The storage format is based on a row major sorted linear indexing. Only non-zero values and their sorted linear indices are stored in a list of pairs {v, l}: for a m × n matrix, the following bijection is used to switch with the common bilinear indexation: Accessors to all the elements are provided so that sparse matrices can be manipulated in a similar way as the dense matrices. This operation is performed by dichotomy with a convergence in log 2 (nnz), where nnz is the number of non-zero elements. Just like dense matrices, numerical values are stored in a templatized std::vector<T>. For convenience, we provide classical builders (sparse, speye, spdiags, etc.), standard C++ operators overloading, views, display functions (disp, spy) and some linear algebra tools (transpose, mtimes, gmres, etc.).

Hierarchical matrices
To widen the field of applications, the H-matrix format, so-called hierachical matrices (Hackbusch, 1999), have been added in hmatrix.hpp. They are specially designed for matrices with localized rank defaults. It allows a fully-populated matrix to be assembled and stored in a lighter format by compressing some parts of the original dense matrix using a low-rank representation (Rjasanow, 2002). They are constructed by binary tree subdivisions in a recursive way, with a parallel assembly of the compressed and full leaves (using the OpenMP standard). This format features a complete algebra, from elementary operations to matrix inversion. An example is given in the application section that follows.

Application with a FEM/BEM simulation
As an application example, an acoustical scattering simulation was carried out using a boundary element method (BEM) tool, implemented with the Castor framework (see the fembem package (Aussal & Bakry, 2021)). We consider a smooth n-oriented surface Γ of some object Ω, illuminated by an incident plane wave u i with wave-number k. The scattered field u satisfies the Helmholtz equation in Ω, Neumann boundary conditions (sound-hard) and the Sommerfeld radiation condition: The scattered field u satisfies the integral representation (Neumann interior extension, see (Terrasse & Abboud, 2013)): for some density µ, with the Green kernel G(x, y) = e ik|x−y| 4π|x − y| . Using the boundary conditions we obtain : where the hypersingular operator H is defined by: Hµ(x) = Γ ∂ nx ∂ ny G(x, y)µ(y)d y .
The operator H is assembled using a P 1 finite element discretization on a triangular mesh of the surface Γ, stored using dense matrices (matrix.hpp) or hierarchical matrices (hmatrix.hpp). Finally, using all the tools provided by Castor to write and solve these equations, we are able to efficiently compute the acoustic diffraction of a harmonic plane wave at 8kHz, on a human head mesh (Jin et al., 2013). As shown in Figure 2, the simulation result highlights the role of the auditory pavilion as a resonator, modifying the timbre of a sound source to allow a listener's brain to precisely locate its direction.