BOOST Library: Difference between revisions
| Line 1: | Line 1: | ||
DRAFT (written by Alfredo Correa Copyright 2009) |
|||
== BOOST Libraries == |
|||
For those using C++ as a programming language, BOOST is a very useful set of libraries that help programming in C++ by providing tools and commonly used idioms that solve recurring problems that emerge when developing in this language, even at a fairly low level. It also encourages certain style of programming, where |
This document gives a practical introduction to the [http://www.boost.org/ BOOST Libraries] oriented to common tasks in scientific computing. For those using C++ as a programming language in general, BOOST is a very useful set of libraries that help programming in C++ by providing tools and commonly used idioms that solve recurring problems that emerge when developing in this language, even at a fairly low level. It also encourages certain style of programming, where generality and efficiency are priority. |
||
These tools include macro |
These tools include macro definitions, template functions and classes, functions, generic data containers, and common algorithms. It can be viewed as a natural extension of the C++ Standard Template Library (STL). |
||
== General Remarks == |
|||
In my opinion BOOST has a very steep learning curve and the syntax can become sometimes very ugly and complicated, but it faster to learn BOOST than to reinvent the wheel and come with buggy home-brew solutions that at the end of the day will look even more ugly and won't be that good anyway. (By good I mean generic and efficient.) Note that I am not comparing BOOST and C++ with other tools like Matlab, or Fortran, or Python. Those language are likely to do better (and quick development) than C++ in many specific areas. (BTW, Boost also closes that gap without introducing modifications to the language.) |
|||
BOOST, like C++, has a very steep learning curve and the syntax can become sometimes very ugly and complicated, but it faster to learn BOOST than to reinvent the wheel and come with buggy home-brew solutions that at the end of the day will look even more ugly and won't be that good anyway. (By good I mean generic and efficient.) Note that I am not comparing BOOST and C++ with other tools like Matlab, or Fortran, or Python. Those languages are likely to do better (and quick development) than C++ in many specific areas. Boost also closes the gap with those languages without introducing modifications to the core C++ language. |
|||
Libraries included in Boost range from very simple ones (like lexical_cast) to very complex ones like (boost.mpi and boost.gil). |
|||
Libraries included in Boost range from very simple ones (like [http://www.boost.org/doc/libs/1_37_0/libs/conversion/index.html Boost.Conversion]) to very complex ones (like [http://www.boost.org/doc/libs/1_37_0/doc/html/mpi.html Boost.MPI] and [http://www.boost.org/doc/libs/1_37_0/libs/graph/doc/table_of_contents.html Boost.BGL]). |
|||
The aim of this document/tutorial is to incrementally document the utility and working practical examples. Extra documentation added by users is very welcome, specially in the area of non-trivial but concise examples, related to our common programming problems (i.e. scientific programming). |
|||
The aim of this document/tutorial is to incrementally document the utility and working practical examples. Extra documentation added by users is very welcome, specially in the area of non-trivial but concise examples, related to our common programming problems (i.e. scientific computing). |
|||
Topics covered in the first stages of this document will include: Boost building and installation, boost.multi_array and boost.mpi. That are the libraries that I have to deal with at this point. Ideally I would like to add boost.ublas in the future and document examples of small libraries that are very usefull for everyday tasks. Ideally the focus will be to document the interaction of these libraries with numerical libraries such as FFTW and Lapack. |
|||
Topics covered in the first stages of this document will include: Boost building and installation, [http://www.boost.org/doc/libs/1_37_0/libs/multi_array/doc/user.html Boost.MultiArray] and Boost.MPI. That are the libraries that I have to deal with at this point. Ideally I would like to add [http://www.boost.org/doc/libs/1_37_0/libs/numeric/ublas/doc/index.htm Boost.uBlas] in the future and document examples of small libraries that are very usefull for numerical tasks. The focus will be to document the interaction of these libraries with numerical libraries such as FFTW and Lapack, and above all, the documentation of practical example code. |
|||
== BOOST build and installation == |
== BOOST build and installation == |
||
It is a bless that most Linux distributions now include some versions of Boost. Many Boost libraries are header only, which means that no linking is necessary (an #include<boost/ ... > line set us ready to use a particular library). |
It is a bless that most Linux distributions now include some versions of Boost, at least the compiled libraries. Many Boost libraries are [http://www.boost.org/doc/libs/1_37_0/more/getting_started/unix-variants.html#header-only-libraries header-only], which means that no linking is necessary (an #include<boost/ ... > line set us ready to use a particular library). For the rest of the libraries, [http://www.boost.org/doc/libs/1_37_0/more/getting_started/unix-variants.html#link-your-program-to-a-boost-library linking] is not hard either, it is just a matter of adding something like "-lboost_NAME" during compilation (with certain [http://www.boost.org/doc/libs/1_37_0/more/getting_started/unix-variants.html#library-naming naming convention]). The current tutorial has been tested in wcr.stanford.edu, with GCC 3 and in my personal Ubuntu with [[Install GCC|GCC 4]]. Some libraries seem to compile a little better (fewer warnings) with GCC 4. If you are going to install GCC 4, [[Install GCC|do it]] before going ahead with this tutorial. |
||
The difficulty with linking is that most Boost distributions in the operating system are outdated; for example the current version of Boost is 1.37, while the most recent distributions come with version 1.34 at best. Once in a while we will need a library not included in older releases (for example Boost.MPI). For "header-only" libraries the situation is not too bad because the header files can be downloaded and they ready to use. For binary libraries we need to compile them. There are tools to compile specific libraries but it will be easier for the moment to compile the whole Boost set. |
|||
Below are the instructions for version 1.37/8 or 1.39. |
|||
An alternative build instructions (including Windows system) is provided [http://shoddykid.blogspot.com/2008/07/getting-started-with-boost.html here]. |
|||
Also if you need Boost.MPI install the mpi compiler before continuing. |
|||
=== Versions 1.37 and 1.38 === |
|||
Before going thorugh these instructions be aware that if you need these versions of boost in your computer you can check whether your distribution already ships with them. For example Ubuntu 9.10 ships with Boost 1.38.1 and it can be installed just by doing |
|||
sudo apt-get install libboost-dev |
|||
which doesn't include the mpi libraries by default. |
|||
If the aforementioned instructions doesn't work or if you need Boost.MPI proceed as follows: |
|||
The sad part is that most distributions are usually outdated, for example the current version of Boost is 1.36, while the most recent distributions come with version 1.34 at best. For "header only" libraries the situation is not that bad because the header files can be downloaded and they ready to use. For binary libraries we need to compile them. There are tools to compile specific libraries but it will be easier for the moment to compile the whole Boost set. |
|||
As usual, to be kind with our usage in |
As usual, to be kind with our usage in administrated system, we will install Boost in our user space, for example in $HOME/usr. |
||
mkdir $HOME/usr |
mkdir $HOME/usr |
||
| Line 28: | Line 45: | ||
wget http://internap.dl.sourceforge.net/sourceforge/boost/boost_1_37_0.tar.gz |
wget http://internap.dl.sourceforge.net/sourceforge/boost/boost_1_37_0.tar.gz |
||
If the download fails, try to download the last version from |
If the download fails, try to download the last version from [http://sourceforge.net/project/showfiles.php?group_id=7586&package_id=8041 Sourceforge Boost download page]. |
||
Then decompress it: |
|||
tar -zxvf boost_1_37_0.tar.gz |
tar -zxvf boost_1_37_0.tar.gz |
||
cd boost_1_37_0 |
cd boost_1_37_0 |
||
./configure --prefix=$HOME/usr |
|||
Append the |
Append the following quoted line to user-config.jam to include Boost.MPI among the installed libraries. |
||
echo "using mpi : mpicxx ;" >> |
echo "using mpi : mpicxx ;" >> user-config.jam |
||
where mpicxx can be replaced by your mpi compiler wrapper, for example "/usr/lib/mpich/bin/mpiCC". |
where mpicxx can be replaced by your mpi compiler wrapper, for example "/usr/lib/mpich/bin/mpiCC", or first selected interactively with "mpi-selector-menu" if is available. |
||
Then compile and install the package |
Then compile and install the package |
||
make |
|||
bjam --with-mpi --prefix=$HOME/usr install |
|||
make install |
|||
Go lunch, that takes ~30 minutes. After successful compilation and installation, all the header files will be installed in ~/usr/include/boost-1-37/boost/*.hpp and the binary linkable files will be in ~/usr/lib/libboost_*. |
|||
Bjam is a Make-like tool specially designed for compiling Boost and I assume it is already installed in the system, if not, you will have to compile it yourself from sources in ./tools/jam. (Don't worry, you don't need bjam to compile bjam. Just do "cd ./tools/jam", "./build_dist.sh" and "cp stage/boost-jam-3.1.16-1-linuxx86/bjam ~/usr/local/bin") |
|||
Depending on how you want to compile your programs it might be necessary to add these directories to the include path of compilation for example, LD_LIBRARY_PATH. It can also be useful to make a link to the more standard usr/include/boost/: |
|||
After successful compilation and installation, all the header files will be installed in ~/usr/boost_1_37_0/boost, for example ~/usr/boost_1_37_0/boost/mpi.hpp and the binary linkable files will be in ~/usr/lib, to be specific: |
|||
cd ~/usr/include |
|||
~/usr/lib/libboost_serialization-gcc43-mt.a |
|||
ln -s boost-1-37/boost boost |
|||
~/usr/lib/libboost_mpi-gcc43-mt.a |
|||
~/usr/lib/libboost_serialization-gcc43-mt-1_37.so |
|||
~/usr/lib/libboost_mpi-gcc43-mt-1_37.so |
|||
=== Versions 1.39 and up === |
|||
depending on how you want to compile your programs is might be necesary that you change add these directories to the include path of compilation for example, LD_LIBRARY_PATH. |
|||
You can download an specific version of Boost |
|||
(To see what is the last version of Boost go to http://sourceforge.net/projects/boost/files/boost) |
|||
You need subversion, for example |
|||
sudo apt-get install subversion |
|||
BOOST_VER=1_47_0 |
|||
mkdir -p $HOME/usr |
|||
mkdir -p $HOME/soft |
|||
cd $HOME/soft |
|||
wget http://downloads.sourceforge.net/boost/boost_$BOOST_VER.tar.gz |
|||
tar -zxvf boost_$BOOST_VER.tar.gz |
|||
cd boost_$BOOST_VER |
|||
'''or''' you can use the SVN to download (and update) the latest version. You need subversion for that (for example <code>sudo apt-get install subversion</code> |
|||
mkdir -p $HOME/soft/boost.svn |
|||
cd $HOME/soft/boost.svn |
|||
time svn co http://svn.boost.org/svn/boost/trunk |
|||
grep BOOST_LIB_VERSION trunk/boost/version.hpp #will show the nominal version of the lib |
|||
cd trunk |
|||
(20 minutes) |
|||
The installation can slightly benefit from having libicu and other basic libraries installed, for example |
|||
sudo apt-get install libicu-dev libbz2-dev python-dev |
|||
Then execute |
|||
./bootstrap.sh --prefix=$HOME/usr --exec-prefix=$HOME/usr`uname -m` |
|||
#if you need mpi, replace "mpicxx" by your mpi compiler wrapper (e.g. mpicxx.mpich, or /opt/mvapich-gnu-gen2-0.9.9/bin/mpicxx) |
|||
echo "using mpi : mpic++ ;" >> project-config.jam |
|||
NUM_CORES=`cat /proc/cpuinfo | grep processor | wc -l` |
|||
time ./bjam -j $NUM_CORES cxxflags=-fPIC | tee bjam.out |
|||
rm -rf ~/usr/include/boost/* ~/usr`uname -m`/lib$CLUSTER/libboost_* |
|||
./bjam install |
|||
It takes ~30 minutes (or ~15 minutes in two processors or ~3 minutes in 8 processors). |
|||
Then create the following link (not necessary in version >=1.40): |
|||
cd ~/usr/include |
|||
ln -s boost-1_39/boost boost |
|||
Boost 1.40 can be installed in Ubuntu 9.10 with the line |
|||
sudo aptitude install libboost1.40-all-dev |
|||
which includes Boost MPI (compiled with openMPI). |
|||
Add the following variable to your environment |
|||
export LD_LIBRARY_PATH=$HOME/usr/lib$CLUSTER |
|||
If downloading from SVN, Boost can be easily updated by doing |
|||
cd ~/soft/boost.svn/trunk |
|||
grep BOOST_LIB_VERSION boost/version.hpp #will show the last nominal version of the lib |
|||
svn update |
|||
grep BOOST_LIB_VERSION boost/version.hpp #will show the new nominal version of the lib |
|||
./b2 install -j 2 |
|||
To delete the installation: |
|||
rm -rf ~/usr/include/boost ~/usr/lib$CLUSTER/libboost_* |
|||
=== Versions 1.48 and up === |
|||
<code>bjam</code> is replaced by <code>b2</code>. |
|||
If you are planning to use c++0x standard it may be better to compile with c++0x support: |
|||
./bootstrap.sh --prefix=$HOME/usr |
|||
NUM_CORES=`cat /proc/cpuinfo | grep processor | wc -l` |
|||
time ./b2 -j $NUM_CORES cxxflags='-fPIC -std=c++11 -O3' | tee bjam.out |
|||
#rm -rf ~/usr/include/boost/* ~/usr`uname -m`/lib$CLUSTER/libboost_* |
|||
./b2 install |
|||
(take only 3 minutes in 8-cores computer) |
|||
==== Possible Issues ==== |
|||
Boost compilation may require bzlib library installed (libbz2-dev package in Ubuntu, if it is not available use the <code> -sNO_COMPRESSION=1 </code> as a bjam option), or if it complain about python headers install python development (python-dev pack in Ubuntu) |
|||
==== Intel compiler for Boost.MPI ==== |
|||
The following steps allows us to compile boost-mpi 1.42.0 library using intel compiler (on su-ahpcrc). |
|||
tar zxvf boost_1_42_0.tar.gz |
|||
cd boost_1_42_0 |
|||
./bootstrap.sh --with-toolset=intel-linux --with-libraries=mpi --prefix=$HOME/usr-intel --libdir=$HOME/usr-intel/lib |
|||
echo "using mpi : mpicxx ;" >> project-config.jam |
|||
vi tools/build/v2/tools/mpi.jam |
|||
time ./bjam -j $NUM_CORES --with-mpi toolset=intel | tee bjam.out |
|||
./bjam install --prefix=$HOME/usr-intel |
|||
In the line with command <tt>vi</tt>, we need to modify the mpi.jam file (at line 208) to something like: |
|||
if $(otherflags) { |
|||
for unknown in $(unknown-features) |
|||
{ |
|||
result += "$(unknown)$(otherflags:J= )" ; |
|||
} |
|||
} |
|||
The line in the original mpi.jam file reads: |
|||
result += "$(unknown)$(otherflags)" ; |
|||
This bug fix is discussed at [http://old.nabble.com/-Boost.mpi--Installation-with-Intel-compilers-td28217964.html#a28217964 here]. |
|||
Also see [http://software.intel.com/en-us/articles/intel-c-compiler-for-linux-building-quantlib-with-intel-c-compiler-for-linux/ here]. |
|||
bjam "-sINTEL_PATH=/opt/intel/cce/10.0.014/" -sTOOLS=intel-linux -sINTEL_CC="icc -xT -O3 -no-prec-div " -sINTEL_CXX="icc -xT -O3 -no-prec-div" "-sBUILD=release <debug-symbols>on" >& bjam-build-icc-xT-O3-no-prec-div.log & |
|||
=== Remove Boost === |
|||
To remove an old or incomplete installation |
|||
rm -rf ~/usr/lib/libboost_* ~/usr/include/boost* |
|||
== Boost.Array Library == |
|||
[http://www.boost.org/doc/libs/1_40_0/doc/html/array.html Boost.Array] is one of the simplest Boost libraries yet it solves a common problem in C++. That is that C++ fixed size arrays are not objects but simple pointers whose size (number of elements) cannot be passed to functions unless we add an argument to the function with the number of elements. |
|||
double a[] = {1,2,3}; double* b=0; // Typical C-arrays |
|||
double f(double* c, int c_size){ ... } // Typical function taking C-array by reference |
|||
To alleviate the problem the Array library defines a class called boost::array<T,N> where T is the type of the elements and N is the (fixed) size of elements. |
|||
The syntax is very easy but it can be tricky sometimes |
|||
boost::array<double, 3> a; |
|||
a[0]=1; |
|||
a[1]=2; |
|||
a[2]=3; |
|||
They can be used as other STL-containers: |
|||
assert(a.size()==3); |
|||
for(double* i=a.begin(); i!=a.end(); ++i) std::cout<<*i<<std::endl; |
|||
The corresponding (internal) mutable C-array can be accessed by the member function <code>.c_array()</code>. The member <code>.data()</code> is a constant C-array view. Arrays of the same size can be compared, swapped, copied, filled, etc ([http://www.boost.org/doc/libs/1_40_0/doc/html/boost/array.html see detailed specification]), things that can not be directly done with C-arrays. |
|||
=== Comparison with other containers === |
|||
std::vector<T> in some sense plays a similar role and it has the same memory layout (stored elements are contiguous in memory). But in the case of std::vector the size is not set at compile time but it is dynamic. This extra flexibility comes at a cost. First, there is a (small) overhead in std::vector. Second, sometimes we want to express that some array has a constant size. |
|||
Programmatically, static information of this type can be accessed with <code>::static_size</code> and <code>::value</code> type (http://codepad.org/gOnoJebu). |
|||
Some technical differences with other containers: 1) initial elements may have an undetermined value even if they have default constructor 2) there is no efficient swap() (no constant complexity) 3) no allocator support. |
|||
=== Initialization === |
|||
==== Canonical ==== |
|||
There not much else to say about this library. The only tricky part is how to construct and define a boost::array 'in place'. To do so we still have to use the old literal C++-fixed arrays and put a double curly bracket: |
|||
boost::array<double, 3> a=<nowiki>{{1,2,3}}</nowiki>; |
|||
Let's say we have a function 'f' that takes a boost::array<int,3> and we want to call the function but do not want to create an explicit temporary, in this case some intuitive syntaxes don't work |
|||
f(<nowiki>{{1,2,3}}</nowiki>); //doesn't work |
|||
f(boost::array<int, 3>(1,2,3)) //doesn't work |
|||
f(boost::array<int, 3>({1,2,3})) //doesn't work |
|||
f(boost::array<int, 3>(<nowiki>{{1,2,3}}</nowiki>)) //doesn't work |
|||
what eventually works is the following (see http://codepad.org/IEHmUEmu) |
|||
f((boost::array<int,3>)<nowiki>{{1,2,3}}</nowiki>); //does work for some compilers, note the parenthesis |
|||
This trick is undocumented and it works in modern versions of the compilers, for example in g++ 4.3.3. For old compilers, a named variable has to be created: |
|||
boost::array<int,3> tmp=<nowiki>{{1,2,3}}</nowiki>; |
|||
f(tmp); |
|||
Another problem is that if we initialize the array with less elements than necessary the compiler will not complain. |
|||
boost::array<int,3> a=<nowiki>{{1,2}}</nowiki>; // a[2] may be initialized to zero without warning |
|||
Any C-array can be used to be copied to a <code>boost::array</code> but that requieres an additional line of code and it is not strictly an initialization. |
|||
boost::array<int, N> a; |
|||
std::copy(carr, carr+N, a.begin()); |
|||
Besides the tricky syntax for initialization, these objects are useful to explicitly say that some arrays are not just pointers and that they have a size that is determined during compilation and does not depend on the state of the program. Also we can forget about the problems associated with built-in fixed arrays and pointer errors. |
|||
==== Using Boost.Assign ==== |
|||
Alternatively a library called [http://www.boost.org/doc/libs/1_40_0/libs/assign/doc/index.html Boost.Assign] can be used |
|||
#include<boost/assign/list_of.hpp> |
|||
namespace assign = boost::assign; |
|||
boost::array<double,3> a=assing::list_of(1.)(2.)(3.); |
|||
[http://www.boost.org/doc/libs/1_40_0/libs/assign/test/array.cpp Boost.Assign works with boost::array], but it is also more general. The same method can be used to initialize other containers, such std::vector, std::list, or std::set |
|||
using boost::assing::list_of; |
|||
const std::list<int> primes = list_of(2)(3)(5)(7)(11); |
|||
It is one of the few ways to initialize constant (const) containers. |
|||
std::maps can by initialized with <code>map_list_of(key1,value1)(key2,value2)...(keyN,valueN)</code>, containers of tuples can be initialized with <code>tuple_list_of</code>. |
|||
Although not strickly connected with the problem of initialization the library can also be used to assign values to containers very easily. Without the library we usually need something like |
|||
std::vector<int> a(10); |
|||
a[0]=1; |
|||
a[1]=2; |
|||
... |
|||
a[9]=10; |
|||
With the library this long code can be replaced by |
|||
#include <boost/assign/std/vector.hpp> |
|||
#include <boost/assert.hpp> |
|||
using namespace boost::assign; //brings operator+= for std::vector into scope |
|||
vector<int> v; |
|||
v += 1,2,3,repeat(10,4),5,6,7,8,9; |
|||
// v = [1,2,3,4,4,4,4,4,4,4,4,4,4,5,6,7,8,9] |
|||
(This doesn't work with boost::array, because arrays can not grow dynamically). |
|||
=== Replacing raw c-arrays === |
|||
There are certain advantages in using boost::array instead of plain c-array. For example, boost::array are first class objects, they can be duplicated, or copied, and there is no complications with pointers. If you ever programmed in C you will remember how easily c-arrays decay into pointers, loosing track of its lengths, and how much code is then devoted to take care of the pointed elements. Boost.Array also provides range (out of bound) checking in debug mode. Also, boost::array can be used where other STL containers and algorithms can be used. |
|||
Then, a question arises, if have a lot of code that uses plain old c-arrays, for example, in function calls |
|||
void f(unsigned size, const double c[]) |
|||
how do we pass a boost array <code>a</code>. The answer is that the old function can be called in three different ways: |
|||
f(a.size(), a.data()); //gives a pointer to const element |
|||
f(a.size(), a.c_array()); //gives a raw pointer whose pointee can be modified |
|||
f(a.size(), &a[0]); //but it is not clear whether a elements can be modified in the |
|||
The other way around, in which we write a new function that takes advantage of boost::array |
|||
g(boost::array<double, 3>& a); |
|||
but the code is handling c-arrays from the caller side. Then one possibility is to |
|||
double c[3]; |
|||
... |
|||
boost::array<double, 3> const& c_as_barray = *(boost::array<double, 3>*)c; // = (boost::array<double, 3>&)c; may not work |
|||
g(c_as_barray); |
|||
This trick can be used even if the function modifies the argument, modifications are reflected in the original array. |
|||
The function can be generalised to different sizes by using templates |
|||
template<unsigned N> |
|||
g(boost::array<double, N>& a){ ... } |
|||
and calling <code> g<3>(a) </code>. If the size is not known at the time of coding (because the size is a variable) then boost::array is not the right container, you can use std::vector instead (part of STL). |
|||
g(std::vector<double>& a) ... |
|||
std::vector should be used when the size of the array is only known at runtime, otherwise use boost::array. |
|||
=== Static Multidimensional arrays via Boost.Array === |
|||
Another feauture of boost::array is that array of arrays (of arrays...) can be defined, and intialized with its elements. |
|||
using boost::array; |
|||
array<array<double, 2>,3> a={{ |
|||
<nowiki>{{1,2}}, |
|||
{{3,4}}, |
|||
{{5,6}}</nowiki> |
|||
}}; |
|||
then a[0..2][0..1] can be used. (Most brackets could be removed in GCC 4.). This should replace the c-array |
|||
double ca[3][2] = {1,2,3,4,5,6}; |
|||
In both cases the elements will be continuous in memory. |
|||
(A similar effect also can be achieved with vector<vector<double> >, the important difference is that in this case each vector can have a different size and the elements are not contiguous.) |
|||
A shortcut can be defined with template typedefs: |
|||
struct array2<class T, unsigned D1, unsigned D2>{ |
|||
typedef array< array<T, D2>, D1> type; |
|||
... //optional metainfo |
|||
}; |
|||
... |
|||
array2<double, 2, 3>::type a; |
|||
See example code here http://codepad.org/Gp2AOP92. |
|||
Using boost::array in general is slightly better than using c-arrays because boost::array can never be confused with a pointer. |
|||
For better support of multidimensional arrays (or for dynamic sizing) read the following section. |
|||
=== Printing Boost.Array === |
|||
Boost.Array has no default print mechanism, even if the contained elements are printable (i.e. <code>std::cout << elem</code> code is valid). One option is to write a loop that prints each element. An easier options is to use Boost.Fusion (covered in another section). Just by including |
|||
#include <boost/fusion/adapted/boost_array.hpp> |
|||
#include <boost/fusion/sequence/io.hpp> |
|||
using namespace boost::fusion; // or using boost::fusion::operator<<; // important! |
|||
int main(){ |
|||
boost::array<double, 3> a = {1.,2.,3.}; |
|||
std::clog << a << std::endl; |
|||
std::clog << tuple_open('[') << tuple_close(']') << tuple_delimiter(", ") << a << std::endl; |
|||
std::clog << "input array here [respect default format (x y z) !!] : "; |
|||
std::cin >> a; |
|||
std::clog << a; |
|||
return 0; |
|||
} |
|||
This works out-of-the-box as long as element_type is printable. An added benefit compared to home brew print functions is that it works also for input streams (default format) and also that the format of the delimiters can be specified. |
|||
=== C++0x Array === |
|||
The standard library of the C++0x compiler includes a class called <code>std::array</code> that is almost the same as <code>boost::array</code>. However the interface is not exactly the same, (for example there is no .data member in <code>std::array</code> and static_size is replaced by tuple_size<.>::size) and also. strictly speaking they are different types (for example the printing code using Fusion doesn't work). But besides these differences they can be used interchangeably. If you want your code to work with Boost.Array and also with with std.array one strategy can be to exploit the using declaration: |
|||
#if __cplusplus > 199711L // new C++ standard and library is used by the compiler |
|||
#include<array> // std::array |
|||
using std::array; |
|||
#else |
|||
#include<boost/array.hpp> |
|||
using boost::array; |
|||
#endif |
|||
array<double, 3> arr = {1.,2.,3.}; //use std/boost array without the namespace |
|||
== Boost.MultiArray Library == |
== Boost.MultiArray Library == |
||
The MultiArray |
The MultiArray is a library solves a very annoying problem with C/C++ for our area. C/C++ have a very limited support for arrays and the situation is even worst for multidimensional arrays (2, 3 or more indices). Passing arrays to functions is very error prone, and functions with endless parameters with array dimensions must be continuously passed around. Moreover, writing algorithms for arbitrary array dimensionality is very hard. |
||
For one-dimensional arrays we have Boost.Array, std::vector and std::valarray. For multidimensional arrays we have Boost.MultiArray. |
|||
The C++ (or C) language was designed to be very flexible, there are infinite ways to arrange multidimensional arrays in memory and infinite ways to resize them, so the C++ language did not enforce any particular standard. Yet for scientific computing (which is only a subset of all computing areas), we can agree that there are a few multidimensional arrangements that make sense in numerical problems: namely, contiguous blocks memory with either column-major ordering (Fortran-convention) or [http://en.wikipedia.org/wiki/Row-major_order row-major ordering] (C-convention). |
|||
Among this |
Among this reasonable constrains of the memory layout, MultiArray provides a very flexible, very small and nice interface. The official [http://www.boost.org/doc/libs/1_37_0/libs/multi_array/doc/user.html#sec_example MultiArray manual] provides rather simple but useful examples to begin with. Below there are more examples more suitable for existing numerical programs. |
||
From the simple examples it can be noticed MultiArray usage is very similar to the default C-arrays with few very convenient differences 1) no explicit allocation (ie. malloc/new) is needed 2) no explicit deallocation (free/delete[]) 3) indices ordering can be C-like or Fortran-like (useful for interfacing with Fortran routines) 4) Arrays keep track of their shape (no separate variables with size are needed). |
|||
EXAMPLES HERE |
|||
The typing system of MultiArray is quite complicated and can be confusing, one way to alleviate this problem is to make extensive use of [http://en.wikipedia.org/wiki/Typedef#Usage_in_C.2B.2B <code>typedef</code>s]. (You may seldom benefit from using [http://en.wikibooks.org/wiki/More_C%2B%2B_Idioms/Type_Generator Template typedefs].) |
|||
== Boost.MPI == |
|||
=== Basic Usage (<code>multi_array</code>) === |
|||
I will assume that you already know the very basics of C-MPI and you want to improve the readability and design of your code. If you don't know C-MPI, you may try to learn from this examples (the syntax is very nice to learn what mpi is supposed to do --in contrast with the ugly C version--) but be aware that Boost.mpi is not very widely. This tutorial is meant to make your life easier in the former case not in the second case. |
|||
MultiArray is a header-only library -- no special linking is necessary. You just need to include the header file: |
|||
I will try to work out an example that is not trivial by using fftw in parallel. The official boost.mpi pages are plenty of trivial examples. |
|||
#include<boost/multi_array.hpp> |
|||
Before going to the first example let us see the command line necessary to compile it |
|||
The usage of the multi_array class is pretty intuitive. First we declare and construct an variable of type multi_array, specifying its dimensionality (2) and its dimensions (100x100). |
|||
mpicxx fft_mpi_test.cpp -I/usr/lib/mpich/include/ \ |
|||
-L$HOME/usr/lib -lfftw3_mpi $HOME/usr/lib/libfftw3.a -lboost_mpi-gcc43-mt-1_37 -lboost_serialization-gcc43-mt \ |
|||
-lm -o fft_mpi_test |
|||
boost::multi_array<double, 2> A(boost::extents[10][10]); |
|||
besides the fftw part, -L$HOME/usr/lib -lboost_mpi-gcc43-mt-1_37 -lboost_serialization-gcc43-mt do the job of linking against boost.mpi and boost.serialization. |
|||
The number of arguments of extents has to agree with the dimensionality. Then you can use the matrix as usual, |
|||
The reason that we need boost.serialization (although we don't use it explicitly here) is that mpi is highly dependent on the serialization mechanism. Serialization is the capacity of converting any C++ object in memory into a linear stream of data. That is what we do over and over again when we save data to a file, in this case MPI communication is basically the same thing: one processes send streams of data to others. (Ideally will be a separate chapter on the powerful boost.serialization.) This integration allows to pass serializable (in the boost sense) complex objects between processes without having to convert (deconstruct) the objects to numerical (or primitive) types first and then reconstructing the object on the other side as we would do it with C-MPI. |
|||
for(int i=0; i!=A.shape()[0]; ++i) |
|||
This is the compilable example: |
|||
for(int j=0; j!=A.shape()[1]; ++j) |
|||
A[i][j]=i+j; |
|||
The elements can be accessed by a sequence of square brakets '[]' or passing a collection of integers with the parenthesis operator '()' as in this example [http://codepad.org/lHyP9eV3 example]: |
|||
#include </home/correaa/usr/include/fftw3-mpi.h> |
|||
#include <iostream> |
|||
#include <complex> |
|||
#include <boost/mpi.hpp> |
|||
boost::array<int, 2> idxs=<nowiki>{{1,2}}</nowiki>; |
|||
A(idxs)+=1.2; |
|||
For an N-dimensional array we would need N similar loops. For any dimensionality, the elements can be iterated in its underlying memory representation: |
|||
for(double* it=A.origin(); it!=A.origin()+A.num_elements(); ++it) |
|||
(*it)*=3; |
|||
In this example, the whole multiarray is multiplied by 3, regardless of its dimensionality. Note that num_elements()==shape()[0]*shape()[1]...shape()[N]. |
|||
The dimensionality of a *particular array* can be known by using A.num_dimensions(). The dimensionality of the *type* can be known by using the static constant "::dimensionality". |
|||
typedef boost::multi_array<double, 4> marray; |
|||
marray A; |
|||
assert( A.num_dimensions() == marray::dimensionality ); |
|||
The [http://www.boost.org/doc/libs/1_37_0/libs/multi_array/doc/user.html#sec_example official documentation] and [http://www3.interscience.wiley.com/journal/109793810/abstract?CRETRY=1&SRETRY=0 this paper] have additional examples and some explanation of the library internals. |
|||
=== Using MultiArray on existing code together with C-arrays (<code>multi_array_ref</code>) === |
|||
MultiArray can be adapted easily in programs that already use raw-arrays since the first element to the contiguous array in memory can be known for functions that expose the pointer to the array in memory; the shape of the array can be also accessed: |
|||
boost::multi_array<double, 2> A(boost::extents[10][10]); // allocates memory for 100 elements |
|||
... |
|||
double* A_ptr = A.origin(); |
|||
int A_N0 = array.shape()[0]; // A_N0 <- 10 |
|||
int A_N1 = array.shape()[1]; // A_N1 <- 10 |
|||
... |
|||
// use the array in a typical c-function |
|||
somecfunction(A_N0, A_N1, A_ptr); // or as cfunction(A.shape()[0], A.shape()[1], A.origin() ); |
|||
... |
|||
// .. no need to free memory, it is deallocated when A goes out of scope |
|||
Such function can even modify the elements of the array. In the example the array memory is managed by the multi_array A and therefore the memory is freed when A goes out of scope. Sometimes, for existing applications, this is not possible because the memory is managed in some other complicated way. In that case we can still take advantage of Boost.MultiArray by using the <code>boost::multi_array_ref</code> class, that is the same as a boost::array except that it does not manage the underlaying memory: |
|||
int A_N0 = 10; int A_N1=10; |
|||
double* A_arr = new double[N1*N2]; // or = malloc(sizeof(double)*A_N0*A_N1); |
|||
... |
|||
boost::multi_array_ref<double, 2> A_ref(A_arr, boost::extents[A_N0][A_N1]); |
|||
... |
|||
// access A_arr through A_ref or call a MultiArray aware function |
|||
somemultiarrayfunction(A_ref); |
|||
... |
|||
delete[] A_arr; //or free(A_arr); (do not use A_view after this point!) |
|||
In this way we can go back and forth form C-arrays (raw-arrays) to boost::multi_array. Generalizations to higher dimensions are similarly easy. (One of the strong points of MultiArray is that dimensionality can be arbitrary). Let me stress that in the first case the memory is automatically managed for us, while in the second case we have to take care of that by some external (even manual) process. |
|||
==== A simple FFTW example ==== |
|||
In the context of numerical libraries the following is an example to call [http://www.fftw.org/fftw3.3alpha_doc/ FFTW3] in combination with MultiArray. (Remember to [[Install FFTW3]] first.) |
|||
boost::multi_array<std::complex<double>, 2> A(boost::extents[10][10]); |
|||
for(int i=0; i!=10; i++) |
|||
for(int j=0; j!=10; j++) |
|||
A[i][j]= std::complex<double>(i,j); |
|||
... |
|||
fftw_plan p=fftw_plan_dft_2d(A.shape()[0], A.shape()[1], A.origin(), A.origin(), FFTW_FORWARD, FFTW_ESTIMATE); // in place |
|||
fftw_execute(p); |
|||
fftw_destroy_plan(p); |
|||
Advanced interface of FFTW take stride arguments as well, this is not a limitation for MultiArray since, for example, array_view's (see below) also keep track of internal strides (e.g. A_center.stride()[n] is the stride in direction n of the referenced memory). |
|||
==== Generic code, but not too generic (Boost.Utility enable_if and Boost.TypeTraits) ==== |
|||
As soon as we start working with MultiArray we will recognize than there are several kinds of multiarrays: 1) multi_array, 2) multi_array_ref, 3) const_multi_array_ref and in general they are useful in different contexts. The first is the plain object that manages its own memory, the second is brings read write access to an existing C-array and the last one gives read-only access. On top of that when specifying partial indices (e.g. a[3] in a 2-dimensional array) the type is 4) subarray, and if we do this is a const_ version we get a 5) const_subarray. And all these types are formally different. This complicated type system has the drawback that there is no way to write a generic function that takes any of them unless we use templates. |
|||
So instead of repeating 5 versions of some function |
|||
return_type some_function(multi_array<double, 2>& ma){ ... |
|||
return_type some_function(const_multi_array<double,2>& ma){ ... |
|||
return_type some_function(multi_array_ref<double,2>& ma){ ... |
|||
return_type some_function(subarray<double,2>& ma){ ... |
|||
we write a single template function |
|||
template<class MultiArray> |
|||
return_type some_function(MultiArray& ma){ ... |
|||
<code>return_type</code> is just to indicate the return type of the function, it could be for example <code>void</code>. |
|||
When calling this function it will work with any of the mentioned types. This solves the problem of the multiple types as long as the actual deduced type behaves like a multi_array the program will work. |
|||
The problem is that now the function is too generic. In our first five versions of the function we were specific enough to only work with arrays of doubles but now we lost that constrain and it maybe that the function logically shouldn't work for not-real numbers. |
|||
For example the some_function could rely on comparing (ordering) different elements, if that is the case then when using the function with an array of complex<double> the compiler will complain. This not an ideal solution because the error will usually appear in mysterious places. Or it can be more subtle than that the function may rely and we may not even get an error from the compiler but silently have a bug. |
|||
Here is were [http://www.boost.org/doc/libs/1_40_0/libs/utility/enable_if.html <code>enable_if</code>] (part of [http://www.boost.org/doc/libs/1_40_0/libs/utility/index.html Boost.Utility]) allows us to constraint function arguments is some situations. What enable_if does when within the function declaration/specification (see later) is to disable or enable a particular instantiation of the template function depending on some condition. For example this condition can be that the elements of the array is a double (an nothing else). There are several ways to use enable_if, but there are mainly two for functions definitions (the example described here). Continuing with the example, we can |
|||
#include <boost/utility/enable_if.hpp> |
|||
#include "boost/type_traits.hpp" // for is_same |
|||
using boost::enable_if_c; |
|||
using boost::is_same |
|||
1) Replace the return type (return_type) with an enable_if |
|||
template<typename MultiArray> |
|||
typename enable_if_c<is_same<typename MultiArray::element, double>::value, |
|||
return_type>::type return_type some_function(MultiArray& ma){ ... |
|||
This is the preferable syntax since the condition is right before the function name. Another possibility is to |
|||
2) Replace and input parameter with an enable_if that on success gives the type of that original parameter |
|||
template<typename MultiArray> |
|||
return_type some_function( |
|||
typename enable_if_c<is_same<typename MultiArray::element, double>::type, |
|||
MultiArray >::type & ma){ ... |
|||
This can be also preferable because the constrain on the generic type MultiArray is next to the parameter itself. |
|||
Option 1) is not always applicable: some special functions don't have return types (i.e. class constructors), in this case we can |
|||
3) Add a dummy ''optional'' parameter |
|||
template<typename MultiArray> |
|||
return_type some_function( |
|||
MultiArray& ma, typename enable_if_c<is_same<typename MultiArray::element, double>::value, void*>::type = 0 |
|||
){ ... |
|||
The '=0' makes the last parameter optional which means that we can still use the function by passing the usual parameter. This optional parameter is not meant to be used at all during the function call. |
|||
In the three cases the effect is the same, function is invisible if we try to call it with a kind of multi_array whose elements are not doubles. |
|||
boost::multi_array<double, 2> adbl; |
|||
some_function(adbl); //ok, some_function will be called |
|||
boost::multi_array<std::complex<double>, 2> acmplx; |
|||
some_function(acmplx); // compiler error, no such function (i.e. not for std::complex arrays) |
|||
In the example we constrained the multiarray by its elements type and relied in the the function is_same from [http://www.boost.org/doc/libs/1_40_0/libs/type_traits/doc/html/index.html Boost.TypeTraits library]. is_same<T1,T2>::value is (a constant) true if T1 is identical to T2 and (the constant) false otherwise. |
|||
Of course the value is obtained at compilation time. |
|||
We could have constrained the function to take only arrays with certain dimensionality, is that case we could also have used <code>enable_if_c<MultiArray::dimensionality==2, ...>::type</code> as an improvement over more crude approaches like compile-time checking <code>BOOST_STATIC_ASSERT(MultiArray::dimensionality==2)</code>. |
|||
One difference with respect to BOOST_STATIC_ASSERT is that when using enable_if the kind of error given by the compiler is the usual 'no matching function'. In short, use static assert to force the compiler to issue an error directly if the type is not correct but use enable_if if you want the compiler to try with a different (overloaded) function and if this fails then issue a function not found error. |
|||
The reason that makes <code>enable_if</code> work is too complicated to explain here, but I only will mention that it is because of a feature of C++ called SFINAE[http://en.wikipedia.org/wiki/Substitution_failure_is_not_an_error] (an acronym that doesn't seem to describe what it's supposed to describe). Luckly nowadays most compilers support this feature. I admit that the syntax becomes so complicated that I wouldn't call it a feature but an embarrassment of the language. Another rough corner of C++. |
|||
=== Fortran ordering and Fortran functions (<code>fortran_storage_order</code>) === |
|||
Functions in libraries like Lapack expect arrays in column-major storage order format, this option of storage is supported by Boost.MultiArray. (For linear algebra matrices it is more recommendable to use [http://www.boost.org/doc/libs/1_37_0/libs/numeric/ublas/doc/index.htm Boost.uBlas] which is more oriented to linear algebra operations and one or two dimensional arrays, i.e. vectors and matrices.) |
|||
The ordering of the elements in memory is controlled by an option that can be set when declaring the array. (By default, if not specified, boost::c_storage_order is used.) |
|||
boost::multi_array<double,2> A(boost::extents[10][10], boost::fortran_storage_order() ); |
|||
... // assign A value with Fortran index ordering |
|||
In order to imitate Fortran convention, there is even an option to declare the multi_array to be 1-based (instead of 0-based) (see below). |
|||
We then can call a Fortran function as: |
|||
int A_N1=A.shape()[0]; int A_N2=A.shape()[1]; |
|||
fortran_function(&A_N1, &A_N2, A.origin()); // Fortran arguments are passed by reference(pointer) |
|||
or write a wrapper to the Fortran function |
|||
wrapper_function(boost::multi_array<double, 2>& in){ |
|||
int in_N1=in.shape()[0]; int in_N2=A.shape()[1]; |
|||
fortran_function(&in_N1, &in_N2, in.origin()); // Fortran arguments are passed by reference(pointer) |
|||
} |
|||
and then call it simply by doing 'wrapper_function(A);'. |
|||
Calling a Fortran function that takes stride arguments can be done by means of subarray's and this case is not covered here. |
|||
The difference between c_storage_order and fortran_storage_order is illustrated in this [http://codepad.org/LgTLwMz2 paste]. |
|||
==== Lapack examples ==== |
|||
This section shows how to combine <code>fortran_storage_ordering</code>, <code>::shape()</code> and <code>::origin()</code> in order to call Fortran routines. There are better ways to use linear algebra routines from C++, like Boost.uBlas, this section is intended to show the usage of the multi_array. |
|||
Note that the Lapack routines are called twice, this is because the first call is used to estimate the working space, which is allocated manually. The real work is done by the second call. Lapack functions are declared using the 'extern "C" void' prefix. |
|||
===== Singular Value Decomposition ===== |
|||
The design of the SVD class for decomposing complex matrices follows the one from Numerical Recipe 3rd edition (page 67) and the actual lapack calling is taken from this [http://software.intel.com/sites/products/documentation/hpc/mkl/lapack/mkl_lapack_examples/zgesvd_ex.c.htm example]. The following is the complete program, note the use of fortran_storage_order in order to maintain the fortran convention. The decomposition (hard work) is called from the constructor of the lapack::svd class, then the decomposition matrices are the member multi_arrays, u (<math>\mathbf{U}</math>), vh (<math> \mathbf{V}^\dagger</math>) (2-dimensional) and w (<math>\mathbf{w}</math>) (1-dimensional vector) (<math>\mathbf{A} = \mathbf{U}\cdot diag(\mathbf{w}) \cdot \mathbf{V}^\dagger</math>). The contents of the original matrix are destroyed. |
|||
#if 0 |
|||
c++ -I$HOME/usr/include -D_TEST $0 -L$HOME/usr/lib -llapack -lptf77blas -latlas -lgfortran -lpthread -Wfatal-errors && ./a.out |
|||
exit |
|||
#endif |
|||
#include<boost/multi_array.hpp> |
|||
#include<iostream> |
|||
/* ZGESVD prototype */ |
|||
struct _dcomplex{double re, im;}; |
|||
extern "C" void zgesvd_( char* jobu, char* jobvt, int* m, int* n, _dcomplex* a, |
|||
int* lda, double* s, _dcomplex* u, int* ldu, _dcomplex* vt, int* ldvt, |
|||
_dcomplex* work, int* lwork, double* rwork, int* info ); |
|||
namespace lapack{ |
|||
using boost::multi_array; |
|||
using boost::extents; |
|||
class svd{ |
|||
public: |
|||
multi_array<std::complex<double>, 2> u; |
|||
multi_array<std::complex<double>, 2> vh; |
|||
multi_array<double, 1> w; |
|||
svd(multi_array<std::complex<double>, 2>& a) : |
|||
u (extents[a.shape()[0]][a.shape()[0]], boost::fortran_storage_order()), |
|||
vh(extents[a.shape()[1]][a.shape()[1]], boost::fortran_storage_order()), |
|||
w (extents[std::min(a.shape()[0], a.shape()[1])]){ |
|||
int lwork = -1; |
|||
std::complex<double> wkopt; |
|||
std::vector<double> rwork(std::max(1u, 5*std::min(a.shape()[0], a.shape()[1]))); |
|||
int info=0; |
|||
char All='A'; |
|||
zgesvd_(&All, &All, &(int&)a.shape()[0], &(int&)a.shape()[1], |
|||
(_dcomplex*)a.origin(), &(int&)a.shape()[0], |
|||
w.origin(), (_dcomplex*)u.origin(), &(int&)a.shape()[0], |
|||
(_dcomplex*)vh.origin(), &(int&)a.shape()[1], |
|||
&(_dcomplex&)wkopt, &lwork, &rwork[0], &info); |
|||
lwork = (int)wkopt.real(); |
|||
std::vector<std::complex<double> > work(lwork); |
|||
zgesvd_(&All, &All, &(int&)a.shape()[0], &(int&)a.shape()[1], |
|||
(_dcomplex*)a.origin(), &(int&)a.shape()[0], |
|||
w.origin(), (_dcomplex*)u.origin(), &(int&)a.shape()[0], |
|||
(_dcomplex*)vh.origin(), &(int&)a.shape()[1], |
|||
&(_dcomplex&)work[0], &lwork, &rwork[0], &info); |
|||
assert(info<1); |
|||
} |
|||
}; |
|||
} |
|||
int main(){ |
|||
boost::multi_array<std::complex<double>,2 > a(boost::extents[2][2], boost::fortran_storage_order() ); |
|||
double data[2][2][2]={ |
|||
{{1., 0.}, {1., 0}}, |
|||
{{2., 0.}, {2., 0}} |
|||
}; |
|||
a.assign((std::complex<double>*)data, (std::complex<double>*)data + a.num_elements()); |
|||
lapack::svd uvh(ma); |
|||
std::clog |
|||
<<"u = {{"<<uvh.u[0][0]<<","<<uvh.u[0][1]<<"},\n" |
|||
<<" {"<<uvh.u[1][0]<<","<<uvh.u[1][1]<<"}}\n"; |
|||
std::clog |
|||
<<"v^h = {{"<<uvh.vh[0][0]<<","<<uvh.vh[0][1]<<"\n" |
|||
<<" {"<<uvh.vh[1][0]<<","<<uvh.vh[1][1]<<"}}\n"; |
|||
std::clog |
|||
<<"w = {{"<<uvh.w[0]<<",0},\n" |
|||
<<" {0,"<<uvh.w[1]<<"}}\n"; |
|||
/*output |
|||
u = {{(-0.707107,0),(-0.707107,0)}, |
|||
{(-0.707107,0),(0.707107,0)}} |
|||
v^h = {{(-0.447214,0),(-0.894427,0) |
|||
{(0.894427,-0),(-0.447214,0)}} |
|||
w = {{3.16228,0}, |
|||
{0,1.98603e-16}} |
|||
*/ |
|||
return 0; |
|||
} |
|||
===== QR Decomposition ===== |
|||
Complex QR decomposition is performed in two stages, first the matrix R (upper triangular) and an intermediate representation of Q (unitary <math>\mathbf{Q}\mathbf{Q}^\dagger=\mathbf{1}</math>) is obtained via [http://www.netlib.org/lapack/explore-html/zgeqrf.f.html Lapack ZGEQRF], second the proper Q is calculated with [http://www.netlib.org/lapack/explore-html/zungqr.f.html ZUNGQR]. The design of the qrd class follows that of Numerical Recipes. After the decomposition the members q and r contains the respective decomposition (<math>\mathbf{A} = \mathbf{Q}\mathbf{R}</math>, no conjugate, no transposition). During construction, it is checked that the multi_array has fortran ordering (the algorithm doesn't give QR decomposition for c-ordering). Output arrays also have fortran ordering. |
|||
#if 0 |
|||
cp $0 $0.cpp |
|||
g++ -Wl,-rpath=$HOME/usr/lib -I$HOME/usr/include -I.. -D_TEST $0.cpp -o $0.cpp.x -L$HOME/usr/lib -llapack -lptf77blas -latlas -lgfortran -lpthread -lboost_system -Wfatal-errors && ./$0.cpp.x |
|||
rm -f $0.cpp |
|||
exit |
|||
#endif |
|||
#include<boost/multi_array.hpp> |
|||
#include<iostream> |
|||
#include<mathematica/archive.hpp> |
|||
/* ZGEQRF and ZUNGQR prototype */ |
|||
struct _dcomplex{double re, im;}; |
|||
extern "C" void zgeqrf_(int* m, int* n, _dcomplex* a, int* lda, _dcomplex* tau, _dcomplex* work, int* lwork, int* info); |
|||
extern "C" void zungqr_(int* m, int* n, int* k, _dcomplex* a, int* lda, _dcomplex* tau, _dcomplex* work, int* lwork, int* info); |
|||
namespace lapack{ |
|||
using boost::multi_array; |
|||
using boost::extents; |
|||
class qrd{ |
|||
public: |
|||
multi_array<std::complex<double>, 2> q; |
|||
multi_array<std::complex<double>, 2> r; |
|||
multi_array<double, 1> w; |
|||
qrd(multi_array<std::complex<double>, 2> const& a) : // q.r = a (a is destroyed after the fact, no transpose no conjugate) |
|||
q (extents[a.shape()[0]][a.shape()[1]], boost::fortran_storage_order()), |
|||
r (extents[a.shape()[0]][a.shape()[1]], boost::fortran_storage_order()){ |
|||
assert(a.storage_order() == boost::fortran_storage_order() ); |
|||
std::vector<std::complex<double> > tau(std::min(a.shape()[0],a.shape()[1])); |
|||
int m=a.shape()[0]; |
|||
int lda=std::max(1u,a.shape()[0]); |
|||
_dcomplex* tau_=(_dcomplex*)&tau[0]; |
|||
_dcomplex work_size={0,0}; |
|||
_dcomplex* a_=(_dcomplex*)a.origin(); |
|||
{ |
|||
int n=a.shape()[1]; |
|||
int lwork=-1; |
|||
int info=0; |
|||
//see http://www.intel.com/software/products/mkl/docs/webhelp/lse/functn_geqrf.html |
|||
zgeqrf_(&m, &n, a_, &lda, tau_, &work_size, &lwork, &info); |
|||
assert(info==0); |
|||
lwork=(int)work_size.re; |
|||
std::vector<std::complex<double> > work(lwork); |
|||
_dcomplex* work_=(_dcomplex*)&work[0]; |
|||
zgeqrf_(&m, &n, a_, &lda, tau_, work_, &lwork, &info); |
|||
assert(info==0); |
|||
r=a; |
|||
for(unsigned i=0; i!=r.shape()[0]; ++i) |
|||
for(unsigned j=0; j!=std::min(i, r.shape()[1]); ++j) |
|||
r[i][j]=0.; |
|||
} |
|||
{ |
|||
int n=std::min(a.shape()[1], a.shape()[0]); |
|||
int k=n; |
|||
int lwork = -1; |
|||
int info=0; |
|||
//see http://www.intel.com/software/products/mkl/docs/webhelp/lse/functn_ungqr.html |
|||
zungqr_(&m, &n, &k, a_, &lda, tau_, &work_size, &lwork, &info); |
|||
assert(info==0); |
|||
lwork=(int)work_size.re; |
|||
std::vector<std::complex<double> > work(lwork); |
|||
_dcomplex* work_=(_dcomplex*)&work[0]; |
|||
zungqr_(&m, &n, &k, a_, &lda, tau_, work_, &lwork, &info); |
|||
assert(info==0); |
|||
q=a; |
|||
q.resize(extents[a.shape()[0]][a.shape()[0]]); //exploits the fact that the extra elements are set to zero |
|||
} |
|||
} |
|||
}; |
|||
} |
|||
#ifdef _TEST |
|||
int main(){ |
|||
boost::multi_array<std::complex<double>,2 > a(boost::extents[3][3], boost::fortran_storage_order() ); |
|||
mathematica::oarchive oa(std::cout); |
|||
double c_data[3][3][2]={ |
|||
{{11., 0.}, {21., 0.},{32.,0}}, |
|||
{{31., 0.}, {41., 1.},{21.,0}}, |
|||
{{20., 1.}, {32., 2.},{45.,1.}} |
|||
}; |
|||
boost::multi_array<std::complex<double>,2 > data(boost::extents[3][3], boost::c_storage_order() ); |
|||
data.assign((std::complex<double>*)c_data, (std::complex<double>*)c_data+data.num_elements()); |
|||
a=data; |
|||
oa<<"a = "<<a; |
|||
lapack::qrd qr(a); |
|||
oa<<"q = "<<qr.q; |
|||
oa<<"r = "<<qr.r; |
|||
return 0; |
|||
} |
|||
#endif |
|||
===== Orthogonalization via Cholesky Decomposition ===== |
|||
Given a set n linearly vectors v (stored as rows in V) a new set of n vectors w (stored in the rows of W) can be generated by the transformation |
|||
<math> |
|||
\mathbf{W} = {\mathbf{U}^\dagger}^{-1} \mathbf{V} |
|||
</math> |
|||
where <math>\mathbf{U}</math> is the upper triangular [http://en.wikipedia.org/wiki/Cholesky_decomposition Cholesky decomposition] of the square matrix <math>\mathbf{O}</math> |
|||
<math> |
|||
\mathbf{U}^\dagger \mathbf{U} = \mathbf{O} |
|||
</math> |
|||
in its turn <math>\mathbf{O}</math> is the array of overlaps. |
|||
<math> |
|||
\mathbf{O} = \mathbf{V} \mathbf{V}^\dagger |
|||
</math> |
|||
The result is that |
|||
<math> |
|||
\mathbf{W}\mathbf{W}^\dagger = |
|||
\mathbf{U}^{-1}\mathbf{V}\mathbf{V}^\dagger{\mathbf{U}^{-1}}^\dagger = |
|||
\mathbf{U}^{-1}\mathbf{O}{\mathbf{U}^{-1}}^\dagger = \mathbf{U}^{-1}\mathbf{U}\mathbf{U}^\dagger{\mathbf{U}^{-1}}^\dagger = \mathbf{1} |
|||
</math>, i.e. <math>\mathbf{W}</math> is an orthonormal set, as long as <math>\mathbf{V}</math> rows are linearly independent. |
|||
The array <math>\mathbf{O}</math> is the Hermitian square of <math>\mathbf{V}</math> and can be obtained with Lapack [http://www.netlib.org/lapack/explore-html/zherk.f.html ZHERK], the Cholesky decomposition (<math>\mathbf{U}</math>) is obtained via Lapack [http://www.netlib.org/lapack/explore-html/zpotrf.f.html ZPOTR], finally the orthonormal (actually unitary because the implementation is for complex numbers) set is obtained with the solution of |
|||
<math> |
|||
\mathbf{U}^\dagger \mathbf{W} = \mathbf{V} |
|||
</math> |
|||
given by [http://www.netlib.org/lapack/explore-html/ztrsm.f.html ZTRSM]. |
|||
The previous examples (SV and QR decompositions) only work properly for fortran_storage_order (if the order is c_storage_order the functions don't run). In contrast this implementation of orthogonalize works for both fortran and c-ordering and is this feature what makes the code complicated. (A more complicated code can even achive column orthogonalization also). In both cases the result is a set of orthonormal rows. (The original matrix is destroyed). The function uses a temporary square array with order equal to the number of rows in the input. Other operations are in-place and the output replaces the input. |
|||
namespace lapack{ |
|||
void orthogonalize_rows(multi_array<std::complex<double>,2>& v){ |
|||
multi_array<std::complex<double>,2> c(extents[v.shape()[0]][v.shape()[0]], boost::fortran_storage_order() ); |
|||
{// hermitian_square |
|||
char uplo=(v.storage_order()==boost::fortran_storage_order())?'U':'L'; |
|||
char trans=(v.storage_order()==boost::fortran_storage_order())?'N':'C'; |
|||
int n=((trans=='N') or (trans=='n'))?v.shape()[0]:v.shape()[1]; |
|||
int k=((trans=='N') or (trans=='n'))?v.shape()[1]:v.shape()[0]; |
|||
if(not(v.storage_order()==boost::fortran_storage_order())) std::swap(n,k); |
|||
double alpha=1.; |
|||
_dcomplex* a_=(_dcomplex*)v.origin(); |
|||
int lda=((trans=='N') or (trans=='n'))?std::max(1, n):std::max(1, k); |
|||
double beta=0.; |
|||
_dcomplex* c_=(_dcomplex*)c.origin(); |
|||
int ldc=std::max(1, n); |
|||
zherk_(&uplo, &trans, &n, &k, &alpha, a_, &lda, &beta, c_, &ldc); |
|||
} |
|||
{//cholesky decomposition |
|||
char uplo=(v.storage_order()==boost::fortran_storage_order())?'U':'L'; |
|||
int n=c.shape()[0]; |
|||
_dcomplex* a_ = (_dcomplex*)c.origin(); |
|||
int lda=std::max(1, n); |
|||
int info=0; |
|||
zpotrf_(&uplo, &n, a_, &lda, &info); |
|||
if(info>0) throw std::domain_error("matrix is not positive definite"); |
|||
} |
|||
{//inverse of cholesky applied to original array via solution of hermitian system |
|||
char side=(v.storage_order()==boost::fortran_storage_order())?'L':'R'; |
|||
char uplo=(v.storage_order()==boost::fortran_storage_order())?'U':'L'; |
|||
char transa='C'; |
|||
char diag='N'; |
|||
int m=v.shape()[0]; |
|||
int n=v.shape()[1]; |
|||
if(not(v.storage_order()==boost::fortran_storage_order())) std::swap(n,m); |
|||
double alpha=1.; |
|||
_dcomplex* a_=(_dcomplex*)c.origin(); |
|||
int lda=(side=='L' or side=='l')?std::max(1,m):std::max(1,n); |
|||
_dcomplex* b_=(_dcomplex*)v.origin(); |
|||
int ldb=std::max(1,m); |
|||
ztrsm_(&side, &uplo, &transa, &diag, &m, &n, &alpha, a_, &lda, b_, &ldb); |
|||
} |
|||
//now v is orthonormal (input is overwritten with orthonormal set) |
|||
}catch(std::domain_error& e){ |
|||
throw std::domain_error("can not orthogonalize because vectors are not linearly independent"); |
|||
} |
|||
} |
|||
The code can be used as follows, |
|||
multi_array<std::complex<double>,2> v_fortr(extents[2][3], boost::fortran_storage_order() ); |
|||
multi_array<std::complex<double>,2> v_clang(extents[2][3], boost::c_storage_order()); |
|||
... fill v with values ... |
|||
both arrays formally contain 2 vectors of 3 components. |
|||
lapack::orthogonalize_rows(a_fortr); |
|||
lapack::orthogonalize_rows(a_clang); |
|||
in both cases the answer is the same, the only difference is how the numbers are arranged internally in memory. |
|||
The trick is achieved by passing different options to the Lapack routines depending on the known ordering instead of explicitly transposing elements --which requieres copying the full arrays-- to accomodate for one or the other ordering). |
|||
(The algorithm throws if the input is not a linearly independent set, this is detected by the Cholesky decomposition, sometimes the lack of linear dependence is not detected but the routine returns vectors with almost zero components --in particular not normalized --). |
|||
As a reference, if the input is linearly independent, the output is numerically similar to Mathematica's <code>Orthogonalize[A]</code>, except that Mathematica accepts linearly dependent input (giving zero vectors in the output). |
|||
=== Arbitrary ordering (and direction) === |
|||
For 2 dimensional arrays the ordering can be either fortran or C-type, for higher dimensionality there are other possibilities. |
|||
Sometimes we may need to work with some arbitrary ordering, for example some 3D FFT routines do effectively transpose the first and second index in order to optimize for speed. Instead of taking care of this transposition (either by actually performing the transposition in memory or by adding some ad-hoc logic to the program), it might be convenient to just declare a different memory storage: |
|||
int ordering[] = {1,0,2}; |
|||
bool ascending[] = {true,true,true}; // we can choose to completely invert the memory order by 'false, false, false' |
|||
boost::multi_array A(extents[3][4][2], boost::general_storage_order<3>(ordering,ascending)); |
|||
This also works for multi_array_ref |
|||
boost::multi_array_ref Aref(ptr, extents[3][4][2], boost::general_storage_order<3>(ordering,ascending)); |
|||
In this case we can treat A and Aref with the conventional ordering of the index even if some external routine is transposing the internal memory layout. |
|||
The storage and ascending order can be accessed a posteriori by storage_order(), for example we can ensure fortran-ordering by doing |
|||
assert( a.storage_order()==boost::fortran_storage_order()); |
|||
We can check that all dimensions are in ascending order (independently of fortran or c-ordering) by |
|||
assert( a.storage_order().all_dims_ascending() == true); |
|||
or we can check that the second dimension (index) is ascending and that it is the slowest-index. |
|||
assert( a.storage_order().ascending(1)==true and a.storage_order().ordering(1)==0 ); |
|||
finally we can check that two arrays follow the exact same convention. |
|||
assert( a.storage_order() == b.storage_order() ); |
|||
=== Assign from c-arrays === |
|||
The assign method can be used to assign element from other containers (including c-arrays) into multi_array. |
|||
multi_array<double,2> a(extents[3][3]); |
|||
double a_data[2][2] = //this is what I call a c-array |
|||
{{1,2}, |
|||
{3,4}}; |
|||
a.assign(a_data, a_data+4); |
|||
The trick can be used to assign complex elements too |
|||
multi_array<std::complex<double>, 2> b(extents[2][2]); |
|||
double b_data[2][2][2] = |
|||
{ { {1,0}, {2,0} }, |
|||
{ {3,0}, {4,0} } }; |
|||
b.assign( (std::complex<double>*)b_data, (std::complex<double>*)b_data+4); |
|||
(as a pointer to std::complex<double> b_data has effectivelly 4 elements). The trick exploits the fact that a complex number can be viewed as a struct with 2 double elements. |
|||
The ordering (X_storage_ordering) and the direction (asc/descending) does not affect how assign works. Assign fills the internal memory of the array independently of the storage convention. This is confusing because in the previous example if 'a' is declared as fortran_storage_order the transpose array is obtained. |
|||
A workaround for fortran_storage_order is to declare an intermediate array with c_storage_order and then copy the contents of one array into the other. Elements are copied element by element in the right order. |
|||
multi_array<double> d(extents[2][2], fortran_storage_order()); |
|||
multi_array<double> d_corder(extents[2][2], c_storage_order()); |
|||
d_coder.assign(a_data, a_data+4); |
|||
d = d_coder; |
|||
=== Custom index ranges (<code>reindex</code>) === |
|||
In previous examples we used some obscure entity called <code>boost::extents</code>, it is obviously used to pass the valid range of indices to the matrix declaration. |
|||
Technically <code>extents</code> is an (empty) global variable of type <code>boost::extent_gen</code>; this is just a trick to allow a nice syntax with arbitrary number of parameters, like passing <code>extents[10][10][10][10]</code> for a 4D-array. |
|||
The real power of this syntax is that we can really pass other types than just integers declaring the size in each dimension, namely we can declare ranges instead. For example, this will create a 1-based array of size 10x10. |
|||
using boost::extents; |
|||
typedef boost::multi_array_types::extent_range range; |
|||
boost::multi_array<double, 2> A(extents[range(1,11)][range(1,11)]); |
|||
In this case the elements A[0][0], A[0][1], A[1][0] are out-of-bounds, just like A[11][11]. Furthermore, we can create an array that is behaves completely like a Fortran-array by also indicating the ordering. |
|||
boost::multi_array<double, 2> B_Flike(extents[range(1,11)][range(1,11)], boost::fortran_storage_order() ); |
|||
Let also mention that we can reindex an array on the fly by calling the <code>.reindex</code> method: |
|||
A.reindex(0); // now all dimensions are 0-based, valid elements 0..9 x 0..9 |
|||
... // take advantage of 0-based array |
|||
A.reindex(1); // 0-based index restored, valid elements 1..10 x 1..10 |
|||
After reindexing the range of each dimension <code>i</code> is given by <code>A.index_bases()[i]</code> ... <code>A.index_bases()[i]+A.shape()[i]</code>. |
|||
(Reindixing is just a new logical rearrangement of element referencing, no copies of the matrix or the elements is made.) |
|||
The indices can start an any arbitrary point (not just 0 or 1), even negative values -- in this example (taken from [http://www.boost.org/doc/libs/1_37_0/libs/multi_array/doc/user.html#sec_base here]), the indexing is totally arbitrary in each dimension: |
|||
typedef boost::multi_array<double, 3> array3D; |
|||
array3D A(extents[8][range(1,4)][range(-1,3)][range(10,20)]); |
|||
(Note that, as an argument, <code>[8]</code> is the same as <code>[range(0,8)]</code>.) |
|||
It is also good to note that the memory allocated by an array is always located between <code>A.origin()</code> and <code>A.origin()+A.num_elements()</code> (non-inclusive). |
|||
Let's clarify the difference between <code>A.data()</code> and <code>A.origin()</code>. As we saw before origin() returns a pointer to the first element of the array, what ever this is. On the other hand, A.data() returns a pointer to the eventual location of <code>A[0][0]..[0]</code>, regardless of whether that element is the first or not; such element may not even be part of the array (for example in 1-based arrays). In general we have to deal <code>origin()</code> and rarely with <code>data()</code>, but it is very easy to confuse one with the other. |
|||
==== Range Checking (and Boost.Assert) ==== |
|||
Once the ranges for a multi_array are defined, the elements can be accessed using the square bracket operators. |
|||
By default the library does check whether the indexes correspond to a valid element (index is not out of bounds). If it is not a valid element, the whole program stops with a failed assertion that looks like this: |
|||
./a.out: ... /boost/multi_array/base.hpp:136: |
|||
Reference boost::detail::multi_array::value_accessor_n< ... >::access( ... ) const [with Reference = |
|||
boost::detail::multi_array::sub_array< ... >, 1u>, ... , unsigned int NumDims = 2u]: |
|||
Assertion `size_type(idx - index_bases[0]) < extents[0]' failed. |
|||
Thanks to the [http://www.boost.org/doc/libs/1_40_0/libs/utility/assert.html Boost.Assert] facilities, this behavior can be suppressed (in which case there is not checking and overrun memory will be accessed) or changed to an C++ exception mode. |
|||
To disable the checking completely just <code>#define BOOST_DISABLE_ASSERT</code> ''before'' <code>#including multi_array.hpp</code>. |
|||
#define BOOST_DISABLE_ASSERT |
|||
#include<boost/multi_array.hpp> |
|||
The library will run faster because there is no range checking each time an element is accessed. |
|||
To change the behavior such that an exception is thrown if index is out of bounds, define the tag <code>BOOST_ENABLE_ASSERT_HANDLER</code> and the function <code>void boost::assertion_failed(char const * expr, char const * function, char const * file, long line)</code> ''before'' including the MultiArray library (http://codepad.org/JgXciACG): |
|||
#define BOOST_ENABLE_ASSERT_HANDLER |
|||
#include<stdexcept> |
|||
namespace boost{ |
|||
void assertion_failed(char const * expr, char const * function, char const * file, long line){ |
|||
if(string(expr)=="size_type(idx - index_bases[0]) < extents[0]"){ |
|||
throw std::range_error("multi_array index range error")"); |
|||
}else{ |
|||
throw std::runtime_error("generic multi_array error"); |
|||
} |
|||
} |
|||
} |
|||
#include<boost/multi_array.hpp> |
|||
(Unfortunately the information about the actual index value that caused the error is not part of the information carried by the exception.) |
|||
This change in behavior may not work if the program is linked to files already compiled without this definition, or if multi_array.hpp is included before this block is defined. |
|||
Now we are able to catch range errors: |
|||
multi_array<double, 3> a(extents[10][10][10]); |
|||
try{ |
|||
a[11][-1][20]=99; |
|||
}catch(std::range_error&){ ... } |
|||
Most Boost libraries use Boost.Assert to indicate errors. With this mechanisms users can choose the way to handle these errors. Using the exception version of Boost.Assert doesn't make the program run slower and it allow us to recover the program from some types of errors. |
|||
Boost.Assert should not be confused with Boost.StaticAssert. |
|||
Bound checks for multi_arrays is more critical than bounds checking for unidimensional containers (std::vector, etc); the reason being that the logic behind the indices can be more complicated and difficult to get right (for example exchange columns by rows in the indices). std::vector provides operator[] which doesn't raise exceptions and .at() which throws an instance of std::out_of_range exception. |
|||
In most implementations std::vector<...>::operator[] doesn't check bounds at all. To have bound checks, the alternative is to change all the v[n] occurrences by v.at(n). If that is not an option the flag -D_GLIBCXX_DEBUG or #define _GLIBCXX_DEBUG before #include<vector> can add bounds check temporarily. |
|||
A third option is to use __gnu_debug::vector in the declarion (instead of std::vector) which header is #include<debug/vector>. |
|||
=== Accessing elements === |
|||
Elements of multiarrays can be accessed in two ways, by using <code>operator[]</code> in sequence (as in the above examples) or by using <code>operator()</code> with an array of indices as described here. This last form has the advantage that if a given combination of indexes needs to be used repeatedly then it can be defined only once. For example the following code applies a certain operation that depends on a complicated combination of indices. |
|||
for(int i=0; i!=a.shape()[0]; ++i){ |
|||
for(int j=0; j!=a.shape()[1]; ++j){ |
|||
for(int k=0; k!=a.shape()[2]; ++k){ |
|||
array<int, 3> idx=<nowiki>{{i,j,k}}</nowiki>, idx2 = <nowiki>{{i+1==a.shape()[0]?0:i+1, j,k}}</nowiki>; |
|||
A(idx) = f(idx) * B(idx2); |
|||
A(idx) += g(idx) + C(idx2); //f and g are functions defined elsewhere |
|||
} |
|||
} |
|||
} |
|||
=== Regular subarrays === |
|||
One of the useful properties of raw-arrays is that *regular* subarrays can be defined simply by defining offsets and strides. MultiArray provides this kind of creation of subarray in a quite elegant manner. For example, given the array defined in the example above, the most simple subarray is the type of the object <code>A[5]</code> which is automatically a one-dimensional array containing a reference to the sixth (indices start at 0) row of the array, so for example <code>(A[5])[1] = 0</code> changes the value of element <code>A[5][1]</code>. |
|||
There are more complicated subarray views that can be defined in terms the original array. |
|||
The interesting thing is that these are only 'views' of the original array, no copy of elements are made. |
|||
The basic examples are given in [http://www.boost.org/doc/libs/1_37_0/libs/multi_array/doc/user.html#sec_views the official MultiArray tutorial]. The following are some digested examples, which at the same time introduces 'typedef' statements to reduce verbosity. The following examples will create arrays from the original 2D array A: |
|||
typedef std::complex<double> complex; |
|||
typedef boost::multi_array<complex,2> array2d; |
|||
using boost::extents; |
|||
array2d A(extents[10][10]); // 2D, 100 allocated elements |
|||
==== Subarrays (<code>sub_array</code>) ==== |
|||
These are simply references to the original array when the first n indexes are specified. The dimension of the resulting subarray is the number of dimensions of the original array minus n. |
|||
typedef array2d::subarray<1>::type subarray1d; |
|||
subarray1d A_row5 = A[5]; // 1D, 10 elements |
|||
==== Views (<code>array_view</code>) ==== |
|||
typedef array2d::array_view<1>::type view1d; |
|||
typedef boost::multi_array_types::index_range range; |
|||
boost::multi_array_types::index_gen indices; |
|||
view1d A_row3_5to10 = A[3][indices[range(5,10)]]; // or A[indices[3][range(5,10)] |
|||
view1d A_row3_5to10odd = A[indices[3][range(5,10,2)]; |
|||
typedef array2d::array_view<2>::type view2d; |
|||
view2d A_center = A[indices[range(2,7)][range(2,7)]]; // 2D, 25 elements |
|||
view2d A_upperleft_quadrant_oddonly = A[indices[range(0,5,2)][range(0,5,2)]]; // 2D, 4 (2x2) elements |
|||
view2d A_lowerright_quadrant_evenonly = A(range(5,10,2),range(5,10,2)); // 2D, 4 (2x2) elements |
|||
Last two arrays refer to odd and even elements of the original array. |
|||
Also note that something like |
|||
view1d A_error = A[indices[range(2,7)][range(2,7)]];' |
|||
is not allowed because the RHS is 2-dimensional and the LHS is declared to be 1-dimensional. |
|||
None of the arrays defined here make any copy of elements, remember that sub_arrays and array_view are just references, just another way to interpret (view) the data of original array. |
|||
Views are more general than subarrays. For example A[1] is an object of type subarray, which can be implicitly converted into a view. However A[1][indices[range(0,5)]] is an array_view object which can not be converted into a subarray. |
|||
Array views can be recursive, one can create an array view of an existing array view. For example, the even element of the even element view, is the 'every 4 elements' view of the original array. |
|||
==== Wrap view (code example) (<code>array_wrap_view</code>) ==== |
|||
Most discrete Fourier transform routines (e.g. [http://www.fftw.org/fftw3_doc/The-1d-Discrete-Fourier-Transform-_0028DFT_0029.html#The-1d-Discrete-Fourier-Transform-_0028DFT_0029 FFTW3] and Numerical Recipes) really compute the so-called 'in-order' transform. This means that the 'positive' frequencies are stored in the first half of the output and the 'negative' frequencies are stored in the second half of the output. |
|||
If the input data represents a continuous peridic function then applying operations on the output requires special code to treat the first and second half of the output data (quadrants in 2 dimensions). Numerical Recipes 3 (page 614), for example, implements a very basic class WrapVecDoub (one dimensional) that takes care of this ordering without affecting the output memory. In the same spirit we will implement a class <code>wrap_array_view</code> that takes care of the indexing in the context of array views. |
|||
namespace boost{ |
|||
template<class MultiArray> class wrap_array_view; |
|||
template<typename Element, typename Value> |
|||
struct return_helper{ |
|||
typedef wrap_array_view<Value> type; |
|||
}; |
|||
template<typename Element> |
|||
struct return_helper<Element, Element>{ |
|||
typedef Element type; |
|||
}; |
|||
template<class MultiArrayRef> |
|||
class wrap_array_view{ |
|||
MultiArrayRef ma_; |
|||
public: |
|||
wrap_array_view(MultiArrayRef const&ma) : ma_(ma){}; |
|||
typedef typename return_helper< |
|||
typename MultiArrayRef::element&, |
|||
typename MultiArrayRef::reference>::type return_helper_type; |
|||
return_helper_type operator[](int i){ |
|||
if(i<ma_.index_bases()[0]) i+=ma_.shape()[0]; |
|||
return return_helper_type(ma_[i]); |
|||
} |
|||
}; |
|||
}//namespace boost |
|||
When accessing each element (e.g. rows) the code takes care of shifting the index to wrap the second half of the array, the object returned is also wrapped (recursive), if the return type is an element of the original array the return type is a reference to the element instead. (Wrapped version of an element --e.g. double-- usually do not make sense, thats why the ofuscated code of <code>return_helper</code>). |
|||
The <code>wrap_array_view</code> object behaves as a reference to the original array and modifying its elements modify the original array. A small example code follows: |
|||
multi_array<double, 2> mm(extents[10][10]); |
|||
wrap_array_view<multi_array_ref<double, 2> > wmm(mm); |
|||
wmm[-5][-5]=111; |
|||
assert(mm[5][5]==111); |
|||
wmm[-1][-1]=222; |
|||
assert(mm[9][9]==222); |
|||
=== A simple FFTW wrapper === |
|||
Here I show that it is very easy to build a wrapper of known libraries that will make the syntax of the main program very easy to understand by packaging common tasks. |
|||
The packaging of common tasks is traditionally achieved by writing simple C-like functions. Sometimes need more flexibility, like the one given by C++ classes. Naturally, this will need a deep understanding of the language and of the wrapped libraries. But once written they will make easier to program. |
|||
For example, in the case of FFTW, although written in C, it is obvious that 'fftw_plan' structure behaves almost as a C++ object: 1) It has a certain state [pointers and size of arrays], 2) lifetime [until fftw_destroy_plan] 3) and can be referred recurrently during it lifetime [can be executed several times during it lifetime]. If we combine this with what we already know of Boost.MultiArray we can design a robust interface. This is a draft of the wrapper classes: |
|||
namespace fftw3{ |
|||
using namespace boost; |
|||
class plan : private counted<plan>{ |
|||
plan(fftw_plan const& p) : p_(p) {} |
|||
public: |
|||
template<class ConstMultiArray, class MultiArray> |
|||
plan(ConstMultiArray const& in, MultiArray const& out, int direction, unsigned flags=FFTW_ESTIMATE){ |
|||
assert(boost::shape(in)==boost::shape(out)); |
|||
if(in.num_dimensions()>1){ |
|||
for(int i=in.num_dimensions()-1, dense_in=in.strides()[i], dense_out=out.strides()[i]; i!=-1; dense_in*=in.shape()[i], dense_out[i]*=out.shape()[i], --i){ |
|||
assert(dense_in==in.strides()[i] and dense_out==out.strides()[i]); // check for dense representation of multidimensional |
|||
} |
|||
} |
|||
p_ = fftw_plan_many_dft(in.num_dimensions(), (const int *)in.shape(), 1 /*int howmany*/, |
|||
(double(*)[2])in.origin() , 0 /*const int *inembed, 0=in.shape()*/, |
|||
in.strides()[in.num_dimensions()-1] /*int istride*/, 0 /*int idist ignored for howmany=1*/, |
|||
(double(*)[2])out.origin(), 0 /*const int *onembed, 0=out.shape()*/, |
|||
out.strides()[out.num_dimensions()-1] /*int ostride*/, 0 /*int odist ignored for howmany=1*/, |
|||
direction, flags); |
|||
} |
|||
void operator()() const{return fftw_execute(p_);} |
|||
virtual ~plan(){fftw_destroy_plan(p_);} |
|||
private: |
|||
fftw_plan p_; |
|||
public: |
|||
static void cleanup(){ |
|||
if(get_count()<=0) throw std::logic_error("trying to call fftw3::cleanup() with "+boost::lexical_cast<std::string>(get_count())+" active plans, should be zero"); |
|||
fftw_cleanup(); |
|||
} |
|||
}; |
|||
void execute(plan const& p) {p.operator()();} |
|||
} |
|||
Then we can use it in the program: |
|||
boost::multi_array<std::complex<double>,2> A(boost::extents[100][100]); |
|||
boost::multi_array<std::complex<double>,2> B(boost::extents[100][100]); // must have the same shape! |
|||
fttw3::plan p(A,B, FFTW_FORWARD, FFTW_ESTIMATE); |
|||
... // assign values to A |
|||
execute(p); // repeated times |
|||
... // no need to explicitly destroy plan |
|||
The wrapper is elegant in some aspects, 1) it is uniform for all input/output dimensionalities and for in-place/out-of-place transform, 2) special cases where array_views are passed can be handled by the same syntax without having to call the advanced FFTW functions, 3) no explicit destruction needed (no risk to execute a destroyed plan). |
|||
The wrapper can be used with the condition that the input and output arrays are stored densely in memory. (This is not the case for some array_views.) This is a limitation of the library (which was designed to take advantage of contiguous memory) and this limitation must be reflected in the wrapper. Besides that, the wrapper can deal with arrays, sub_arrays and array_views (with strides==1 if multidimensional). |
|||
In-place transform can be achieved simply by using the same argument in the first and second placeholder. The wrapper will also check for size compatibility. |
|||
Using the original arrays, there are many possible ways to use the wrapper: |
|||
fftw3::plan p(A, B, FFTW_FORWARD); // 2D FFT 100x100 |
|||
fftw3::plan p(A, A, FFTW_FORWARD); // 2D inplace-FFT |
|||
fftw3::plan p(A[21], B[32], FFTW_FORWARD); // 1D FFT 100 elements, |
|||
fftw3::plan p(A[10][indices[range(5,10)], B[20][indices[range(5,10)]], FFTW_FORWARD); // 1D FFT on 5 elements |
|||
If not specified, the option FFTW_ESTIMATE is used. If using this default option the point at which the plan is created is unimportant. Instead, when using the options FFTW_MEASURE or FFTW_PATIENT, the location of plan creation matters. This is so because in these cases the arrays can be overwritten when FFTW looks for the optimal algorithm to use. That means that the plans should be created before the matrix is assigned values and after the matrix has the desired size and shape. |
|||
multi_array<comple,2> A(extents[100][100]); |
|||
fftw3::plan p(A,A, FFTW_FORWARD, FFTW_MEASURE); //overwrites values |
|||
... // assign values; |
|||
execute(p); // performs the transform |
|||
Once the plans are created over existing arrays these arrays must be not reallocated! That means not resize operations please. Assignment (global or partial) is OK because in this case MultiArray guarranties no internal reallocation is done. (Global assignment is only allowed for arrays with same sizes). The is reflects a limitation of the FFTW in which case the array pointer doesn't have to change after the plan is created. |
|||
==== 'Many' Transforms ==== |
|||
The wrapper can also be asked to perform multiple individual transform, by assuming that the first index of the array is used to index smaller arrays and using the following syntax: |
|||
fftw3::plan p=fftw3::plan::many(A, B, FFTW_FORWARD); // 100 1D-FFT of 100 elements each |
|||
note that this is not equivalent to the first example. |
|||
Again, the hard work is done by the library, which uses certain optimizations to perform repeated transforms of the same size. |
|||
==== Special Memory Allocation ==== |
|||
FFTW recommends to use its own version of malloc to allocate arrays. This is supposedly to improve speed taking advantage of some special allocation that is exploited by the numerical routines. For example they recommend to allocate the arrays to be used in the following way: |
|||
typedef std::complex<double> complex; |
|||
complex* in=(complex*)fftw_malloc(sizeof(complex)*N); //instead of malloc(sizeof(complex)*N); |
|||
fftw_free(in); //instead of free(in) |
|||
Up to this point we tried not to deal with memory allocation at all. |
|||
So, the question is: How can we take advantage of allocation of memory in a context of C++? (where manual allocation can --and should-- be avoided in most cases). |
|||
The answer is '[http://www2.roguewave.com/support/docs/leif/sourcepro/html/toolsug/12-6.html custom allocators]'. For STL containers and for multi_arrays, the allocator is passed as an extra template parameter. |
|||
In this example, I pass an special allocator that uses fftw_malloc internally, for the construction of an STL vector and also for a Boost.MultiArray multi_array. |
|||
std::vector<complex, fftw3::allocator<complex> > v(100); |
|||
boost::multi_array<complex, 3, fftw3::allocator<complex> > A(extents[100][100][100]); |
|||
The memory is freed (automatically) by fftw_free. After this, the object can be used like any other standard object. If the allocator is not specified, std::allocator<T> is used. |
|||
Declaring the objects in this way is optional in the same way as the usage of fftw_malloc instead of malloc is optional. |
|||
=== Reshape (<code>reshape</code>) === |
|||
Although MultiArray objects can be resized, the underlying memory layout is not specially good for resizing operations. In general, for resizing, the memory is reallocated and *all* the elements copied. This is even true for operations reducing the size of the matrix (since elements have to be contiguous in memory). So, resizing operations should be avoided as much as possible and they will not be covered here (otherwise see [http://www.boost.org/doc/libs/1_37_0/libs/multi_array/doc/user.html#sec_resize here]). |
|||
Multidimensional arrays are different from one dimensional arrays because they can be reshaped. In the library, reshape operations are different from resize in the sense that they preserve the total number of elements. The contained elements are rearranged logically but no copies or reallocations are performed. The reshape operation is done as follows: |
|||
boost::multi_array<complex, 3> a(extents[5][4][3]); //assert(a.shape()[0]==5 and a.shape()[1]==4 and a.shape()[2]==3); |
|||
{ |
|||
boost::array<int,3> new_shape=<nowiki>{{5,6,2}}</nowiki>; //assert(5*6*2==5*4*3); |
|||
a.reshape(new_shape); //assert(a.shape()[0]==5 and a.shape()[1]==6 and a.shape()[2]==2); |
|||
} |
|||
In this example, the reshape operation is allowed because the number of elements is preserved since 5*6*2==5*4*3. If the number of elements is not preserved the operation will fail during execution. |
|||
The following three syntaxes are ''not'' allowed due to very strange technical limitations of C++: |
|||
a.reshape(<nowiki>{{5,6,2}}</nowiki>); // does not compile |
|||
a.reshape({5,6,2}); // neither |
|||
a.reshape(boost::array<int, 3>(<nowiki>{{5,6,2}}</nowiki>)); // neither |
|||
What eventually works (and only in new compilers) is the following syntax: |
|||
a.reshape((boost::array<int,3>)<nowiki>{{5,6,2}}</nowiki>); // watch for parenthesis |
|||
or, depending on the context, it might be better to use: |
|||
using boost::array; |
|||
typedef array<int,3> array3; |
|||
a.reshape((array3)<nowiki>{{5,6,2}}</nowiki>); |
|||
Note 1: boost::array is not the same as boost::multi_array. boost::arrays objects are part of a much simpler library called [http://www.boost.org/doc/libs/1_37_0/doc/html/array.html Boost.Array]. In this other library all the arrays are one-dimensional and the number after the element type indicates the (constant) number of elements and not the number of dimensions. |
|||
So far, the only constrain for reshaping is that it preserves the number of elements. The other requirement that is not very obvious is that the reshape operation must preserve the number of dimensions (in this case 3). |
|||
This is not a big limitation since we can choose one of the dimensions to be 1 in size |
|||
a.reshape((array<int,3>)<nowiki>{{5,1,12}}</nowiki>); // or <nowiki>{{5,12,1}}</nowiki> depending on your application |
|||
Note 2: Do not try to do something like a.reshape((array<int,2>)<nowiki>{{5,12}}</nowiki>) for a 3D array to change dimensionality, it compiles but gives unexpected results. In fact, there is no way to have 'a' change its dimensionality during the program execution, since it is of type multi_array<double,3>. Another thing that is very confusing is that reshape does not work with extents; do not do <code>A.reshape(extents[5][4][3])</code>. Again it compiles and gives the wrong result. |
|||
If you really need an object with smaller dimensionality with the same elements you can do |
|||
multi_array<complex, 3> a3D(extents[5][4][3]); |
|||
multi_array_ref<complex, 2> a2D_ref(a3D.origin(), extents[5][12]); |
|||
and use 'a2D_ref' instead of 'a3D'. In this case nobody will check that the sizes are compatible unless you add something like <code> assert(a3D.num_elements()==a2D_ref.num_elements()); </code>. Alternativelly we can do |
|||
multi_array_ref<complex, 2> a2D_ref(a3D.origin(), extents[a3D.shape()[0]][a3D.num_elements()/a3D.shape()[0]]); |
|||
// num_elements is always divisible by shape()[0] |
|||
to ensure that all elements are referenced. |
|||
Note 3: Of course you can always copy element by element from one type of array to another with different shape or different dimensionality. But this section is exactly about avoiding copies. |
|||
==== Example of reshape with FFTW ==== |
|||
The FFTW wrapper presented earlier honors the dimensionality of the arrays passed. I mean by this that the dimensionality of the transform follows the dimensionality of the passed array instead of for example interpreting the input as a one dimensional array which would of course give a completely different result. In this section I will illustrate how to force FFTW to interpret the data differently. |
|||
For simplicity let us assume that we have a two dimensional array, and that we want to do our transforms in-place on complex data: |
|||
typedef std::complex<double> complex; |
|||
using boost::multi_array; |
|||
using boost::extents; |
|||
multi_array<complex, 2> A(extents[100][100]); |
|||
... // fill A with data |
|||
The simplest case is that we want a 2D FFT on the 2D data structure, I present it here as a review: |
|||
fftw3::plan pf_A2A_2D(A, A, FFTW_FORWARD); // implicitly 2D transform |
|||
execute(pf_A2A_2D); |
|||
Now suppose that we need to do a different thing that is to transform each row of the the array independently, the approach in this case will be different, for example we could iterate over the rows and perform a 1D transform (this is allowed because each A row is contiguous data in memory): |
|||
for(int i=0; i!=A.shape()[0]; i++){ |
|||
fftw3::plan pf_Arow2Arow_1D(A[i],A[i], FFTW_FORWARD); // implicitly 1D transform |
|||
execute(pf_Arow2Arow_1D); |
|||
} |
|||
Below there is a note against using FFTW plans inside loops. |
|||
A different approach, *to do the same thing* would be to use the 'many' option of the library and the wrapper. This can be faster because the library can take advantage of how the overall data is distributed. |
|||
fftw3::plan pf_Arow2Arow_many_1D=fftw3::plan::many(A, A, FFTW_FORWARD); // many 1D transforms |
|||
execute(pf_Arow2Arow_many_1D); |
|||
The last option is faster, there is no explicit loop, and only one plan is created, this is more elegant because the plan can be created just after the array is constructed, this is handy if want to create plans with the FFTW_MEASURE option. (But keep in mind that this is possible only because all the 1D transforms are performed inside a single multi_array.) |
|||
Note: There is also an insidious issue with creating many FFTW plans that I discovered accidentally. The library keeps track of all the created plans (even the 'destroyed' ones). Each created/destroyed plan leaves a small memory footprint. FFTW authors claim that this is for performance optimization purposes. But when creating, using and destroying plans repeatedly, let's say few million times, the used memory will grow up and the application crash. Authors of the library say this is not a memory leak, and in part they are right, just because there a function called |
|||
fftw_cleanup() [called fftw3::plan::cleanup() in the wrapper] that makes sure all that memory is reclaimed. The problem is that the function must be called at some point where it is known that there are no plans active (the wrapper will checks for this condition). It is your responsibility to call this function and since there are only few points in the program at which we know that there are no plans active, where to call cleanup(), this *effectively* behaves like a memory leak. The way to avoid this altogether is to keep the number of created plans controlled, and specially avoid creating plans inside loops. The 'many' plan can save lots of plans to be created. |
|||
Now suppose that we have our data in a 2D multi_array but that we want to interpret the data differently and perform a long 1D transform over a contiguous sequence of rows in the array. This can be motivated by special boundary conditions in the problem but also illustrates the utility of reshape. |
|||
There are two options here, one using reshape and the other using multi_array_ref (see above). By using reshape we *temporarely* rearrange the array data (logically) and perform the transform: |
|||
using boost::array; |
|||
A.reshape((array<int,2>)<nowiki>{{1,10000}}</nowiki>); //remember that original was 100x100 |
|||
fftw3::plan pf_Aunwrappedrows_1D(A[0],A[0], FFTW_FORWARD); // implicitly 1D |
|||
A.reshape((array<int,2>)<nowiki>{{100,100}}</nowiki>); |
|||
execute(pf_Aunwrappedrows_1D); |
|||
First, we reshape the original matrix to be effectively a one dimensional array. Strictly speaking A is 2D, but it has only one row A[0] and we create a 1D FFT plan for that single row that contains all the data. After the plan is created we can reshape the array again, since FFTW will remember the shape of the array. Another variant is to simply declare a plan: |
|||
fftw3::plan pf_Aunwrappedrows_2D(A, A, FFTW_FORWARD); |
|||
Although in the case the arrays and the plans are 2D, the result should be the same as a 1D transform since there is only one row. |
|||
The second option is to create a multi_array reference with a different 'view' and work on that matrix: |
|||
using boost::multi_array_ref; |
|||
multi_array_ref<complex, 1> Aunwrapped_ref(A.origin(), extents[10000]); |
|||
assert(Aunwrapped_ref.num_elements()==A.num_elements()); |
|||
fftw3::plan pf_Aunwrapped_1D(Aunwrapped_ref, Aunwrapped_ref, FFTW_FORWARD); // inplicitly 1D |
|||
execute(pf_Aunwrapped_1D); |
|||
The FFTW will modify the reference matrix which is only a different way of changing the original array but the element access is logically different. |
|||
== Boost.MPI Library == |
|||
I will assume that it is the case that you know the very basics of C-MPI and you want to improve the readability and design of your code. If you don't know C-MPI, you may try to learn from this examples (the syntax is very nice to learn what MPI is supposed to do --in contrast with the ugly C version--) but be aware that Boost.MPI is not widely used. |
|||
I will try to work out an example that is not trivial by using FFTW in parallel. Hopefully this approach will be more related to our application area, while the [http://www.boost.org/doc/libs/1_37_0/doc/html/mpi.html official Boost.MPI pages] are plenty of (deceptively) trivial examples. |
|||
=== Simple MPI FFTW example === |
|||
In my opinion, using a real MPI numerical library is the only way to learn MPI and make it do useful work. |
|||
The example in this section uses the MPI version of FFTW3 which must be installed prior to begin. See detailed instructions to [[Install FFTW3]] first if necessary. |
|||
Before going to the first example let us review the command line to compile it |
|||
mpicxx -I$HOME/usr/include fft_mpi_test.cpp \ |
|||
-L$HOME/usr/lib -lfftw3_mpi -lfftw3 \ |
|||
-lboost_mpi-gcc43-mt-1_37 -lboost_serialization-gcc43-mt \ |
|||
-o fft_mpi_test |
|||
besides the libfftw part, -L$HOME/usr/lib -lboost_mpi-gcc43-mt-1_37 -lboost_serialization-gcc43-mt do the job of linking against Boost.MPI and Boost.Serialization. |
|||
The reason we need Boost.Serialization (although we don't use it explicitly here) is that MPI is highly dependent on the serialization mechanism. Serialization is the capacity of converting any object or structure in memory into a linear stream of data. That is what we do over and over again when we save data to a file, in this case MPI communication is basically the same thing: one processes send streams of data to others. This integration allows to pass complex objects between processes without having to convert (deconstruct) the objects to numerical (or primitive) types first and then reconstructing the object on the other side as we would do it with C-MPI. |
|||
The following compilable example is based on [http://www.fftw.org/fftw3.3alpha_doc/Simple-MPI-example.html#Simple-MPI-example the fftw3 simple mpi example] ([[media:fftw_simple_mpi.tar |plain-C sources]]). It can be regarded as the parallel version of this [http://micro.stanford.edu/wiki/Install_FFTW3#Basic_Usage serial example]. In the following example we just add the Boost.MPI syntax (instead of the C-MPI): |
|||
#include <complex> /* must include before fftw3*.h */ |
|||
#include <fftw3-mpi.h> |
|||
#include <boost/mpi.hpp> |
|||
#include <iostream> |
|||
namespace mpi = boost::mpi; /* if not defined have to use names as boost::mpi::... */ |
namespace mpi = boost::mpi; /* if not defined have to use names as boost::mpi::... */ |
||
using std::clog; |
using std::clog; |
||
using std::endl; |
using std::endl; |
||
int main(int argc, char **argv){ /* initialize mpi world and fftw_mpi */ |
int main(int argc, char **argv){ /* initialize mpi world and fftw_mpi */ |
||
mpi::environment env(argc, argv); |
mpi::environment env(argc, argv); /* in C: MPI_Init(&argc, &argv); */ |
||
mpi::communicator world; |
mpi::communicator world; /* this will work as MPI_COMM_WORLD */ |
||
fftw_mpi_init(); |
fftw_mpi_init(); /* initialize fftw_mpi library */ |
||
const ptrdiff_t N0 = 18, N1 = 18; /* size of parallel array */ |
const ptrdiff_t N0 = 18, N1 = 18; /* size of parallel array */ |
||
fftw_plan plan; /* this is the usual fftw plan */ |
fftw_plan plan; /* this is the usual fftw plan */ |
||
std::complex<double> *data; /* in C: fftw_complex *data; this is the local (to process) data */ |
std::complex<double> *data; /* in C: fftw_complex *data; this is the local (to process) data */ |
||
// given the matrix size (N0, N1) ask fftw_mpi how to split the matrix |
// given the matrix size (N0, N1) ask fftw_mpi how to split the matrix |
||
ptrdiff_t alloc_local, local_n0, local_0_start, i, j; |
ptrdiff_t alloc_local, local_n0, local_0_start, i, j; |
||
alloc_local = fftw_mpi_local_size_2d(N0, N1, world, |
alloc_local = fftw_mpi_local_size_2d(N0, N1, world, |
||
&local_n0, &local_0_start); |
&local_n0, &local_0_start); // alloc_local isn't always local_no*N1!! |
||
//report allocation size, example of output from each process |
//report allocation size, example of output from each process |
||
if(world.rank()==0){ /* only process 0 prints this */ |
if(world.rank()==0){ /* only process 0 prints this */ |
||
| Line 112: | Line 1,396: | ||
clog<<"Process "<<world.rank()<<" has the array of size ["<<local_0_start<<"," |
clog<<"Process "<<world.rank()<<" has the array of size ["<<local_0_start<<"," |
||
<<local_0_start+local_n0<<")x[0,"<<N1<<"), total local size "<<(N0*N1)<<endl; |
<<local_0_start+local_n0<<")x[0,"<<N1<<"), total local size "<<(N0*N1)<<endl; |
||
//now allocates data in the old fashioned way |
//now allocates data in the old fashioned way |
||
data = (std::complex<double>*) fftw_malloc(sizeof(std::complex<double>)*alloc_local); |
data = (std::complex<double>*) fftw_malloc(sizeof(std::complex<double>)*alloc_local); |
||
// in C: data = (fftw_complex *) fftw_malloc(sizeof(fftw_complex) * alloc_local); |
// in C: data = (fftw_complex *) fftw_malloc(sizeof(fftw_complex) * alloc_local); |
||
//create fftw plan, note "_mpi_" and note the "_2d" |
//create fftw plan, note "_mpi_" and note the "_2d" |
||
plan = fftw_mpi_plan_dft_2d(N0, N1, (double(*)[2])data, (double(*)[2])data, world, |
plan = fftw_mpi_plan_dft_2d(N0, N1, (double(*)[2])data, (double(*)[2])data, world, |
||
FFTW_FORWARD, FFTW_ESTIMATE); |
FFTW_FORWARD, FFTW_ESTIMATE); |
||
double pdata=0; |
double pdata=0; |
||
//initialize data and compute local power |
//initialize data and compute local power |
||
| Line 132: | Line 1,416: | ||
clog<<"Process "<<world.rank() /* each process prints its local power*/ |
clog<<"Process "<<world.rank() /* each process prints its local power*/ |
||
<<" has a power of "<<pdata<<" of the original data"<<endl; |
<<" has a power of "<<pdata<<" of the original data"<<endl; |
||
//compute transform according to the plan |
//compute transform according to the plan |
||
fftw_execute(plan); |
fftw_execute(plan); |
||
double ptransform=0; |
double ptransform=0; |
||
// normalize data and compute local power of the transform |
// normalize data and compute local power of the transform |
||
| Line 145: | Line 1,429: | ||
} |
} |
||
} |
} |
||
clog<<"Process "<<world.rank() //each process print its local power |
clog<<"Process "<<world.rank() //each process print its local power |
||
<<" has a power of "<<ptransform<<" of the transformed data"<<endl; |
<<" has a power of "<<ptransform<<" of the transformed data"<<endl; |
||
fftw_destroy_plan(plan); /* destroy plan */ |
fftw_destroy_plan(plan); /* destroy plan */ |
||
fftw_free(data); |
|||
// boost::mpi::environment doesn't need finalization it is automatic, in C: MPI_Finalize(); |
// boost::mpi::environment doesn't need finalization it is automatic, in C: MPI_Finalize(); |
||
} |
} |
||
The example is rather long, this is mainly because we are using the plain-C fftw interface, and manually allocated tables. |
The example is rather long, this is mainly because we are using the plain-C fftw interface, and manually allocated tables. Note that the only change relative to the original FFTW example is the usage of Boost.MPI and the use of C++ complex type. |
||
(Imagine how much cleaner would be the code if we could take advantage of boost.multiarray and of a C++ wrapper of FFTW. Eventually, this will be the topic of another section.) |
|||
Note that using C++ and Boost does not enforce us to leave the known C-interfaces, the idea is that we can reuse the old interfaces and improve them gradually as we need it. For example we used <code>std::complex<double></code> instead of <code>double[2]</code>, (this is possible because both types are bit-compatible); the former has a much [http://www.cplusplus.com/reference/std/complex/ support] for treating complex numbers. |
|||
What the program does is to initialize parts of matrix in different processes, calculate local power (sum of squares) of the data, do a global Fourier transform of this matrix (in place) and the compute the local power of the transformed data. The global power (sum of locals powers) must be equal in the original data and in the transformed data. (See above for compilation command.) |
|||
Also note that although this is our first parallel MPI program the real MPI communication is performed by the FFTW routines and not by us. Rarely we will want to deal with passing very large amount of data ourselves between processes, libraries like FFTW and Lapack will do a faster/better job than us. The idea here is to set the environment for the usage of those well tunned numerical libraries. |
|||
Now let us download [[Media:simple_boost_mpi_example.tar.gz | simple_boost_mpi_example.tar.gz]] and run the example: |
|||
wget http://micro.stanford.edu/mediawiki-1.11.0/images/Simple_boost_mpi_example.tar.gz -O simple_boost_mpi_example.tar.gz |
|||
tar -zxvf simple_boost_mpi_example.tar.gz |
|||
cd simple_boost_mpi_example |
|||
make simple_boost_mpi_example |
|||
(Checkout the Makefile to see the compilation line). Run the example to see an output like the following: |
|||
$ mpirun -np 2 ./simple_boost_mpi_example |
|||
Process 1 has the array of size [9,18)x[0,18), total local size = 162, local allocated size = 162 |
|||
Global size of array : [0:18)x[0:18), total size 324 |
|||
Process 0 has the array of size [0,9)x[0,18), total local size = 162, local allocated size = 162 |
|||
Process 0 has a power of 3672 of the original data |
|||
Process 0 has a power of 27729 of the transformed data |
|||
Process 1 has a power of 28458 of the original data |
|||
Process 1 has a power of 4401 of the transformed data |
|||
Two copies of the program are run at the same time, FFTW takes care of the communication between them. Before looking at any number, we can see that the output messages are not well ordered. we learned the first lesson of MPI in general: not synchronized instructions can be executed in any order. |
|||
Besides this the program works as expected. The original 18x18 matrix is split among the two processors in two matrices of 9x18, and a memory for 162 elements is allocated in each process. |
|||
Also we can check that the power of the global data is the same before and after the FFT, 3672+28458 == 27729+4401 == 32130. |
|||
We can check running the program in one process. |
|||
$ mpirun -np 1 simple_boost_mpi_example |
|||
Global size of array : [0:18)x[0:18), total size 324 |
|||
Process 0 has the array of size [0,18)x[0,18), total local size = 324, local allocated size = 324 |
|||
Process 0 has a power of 32130 of the original data |
|||
Process 0 has a power of 32130 of the transformed data |
|||
==== Important notes about MPI-FFTW3 memory distribution (do not skip) ==== |
|||
Although the distribution of the matrix and allocated memory seems to very reasonable in the previous example, this is only because of the simple dimensions and process number used in the example. There are some characteristics of the memory distribution for which the previous example can be misleading, and it is worth noting them: |
|||
* Only FFTW (via [http://www.fftw.org/fftw3.3alpha_doc/Basic-and-advanced-distribution-interfaces.html fftw_mpi_local_size_2d]) knows how to split the data and allocated memory size among processes, so don't try to predict the splitting yourself. |
|||
* The splitting of the global matrix is only done over rows (first dimension, for N>2-dimensional arrays). |
|||
* 1D FFT's usage is totally different, because the splitting depends on the type of transform to be performed (e.g. in-place, out-of-place). |
|||
* The splitting of the matrix can be uneven (for example, #columns is not divisible by #processes). |
|||
* Some processes can be assigned zero columns (for example if there are more processes than columns). Even in this case, the allocated memory in that process can be different from zero! (in general, it is 1*sizeof(complex)). |
|||
* The logical size of the submatrix (local matrix) can be different from the local allocated memory, i.e. local_n0*N1 <= alloc_local. The outputs of local_size are not redundant. FFTW uses the (small) extra memory as a scratch space. |
|||
The next step in complexity could be to intercommunicate the processes in order to compute the total (global) power of the data. |
|||
=== Collective Reduce operation === |
|||
There are several kinds of MPI operations, [http://www.boost.org/doc/libs/1_37_0/doc/html/mpi/tutorial.html#mpi.point_to_point point-to-point] and [http://www.boost.org/doc/libs/1_37_0/doc/html/mpi/tutorial.html#mpi.collectives collective]. Among the collective operations, the reduce operation results in the number we need to compute: the global power of the data. Reduction means that we take *all* the objects/numbers from different processes and convert it into a *single* object/number by means of some function. More often than not, this function is simply a summation (<math>\Sigma_n a_n</math>). |
|||
In practice, a summation is a very simple function, because it is associative and commutative (sequence-product <math>\Pi_n a_n</math> is as well), so many issues are simplified in this case because the result will not depend (mostly) on how the order in which the data is gathered. This kind of operations are defined only in terms of a binary function. That is, the summation is defined in terms of the binary addition alone. |
|||
I said *mostly* because for machine numbers the usual operations are not *exactly* associative, and the result can depend on how the data is split among processes. This is the second lesson of parallel programming, numeric results *can* depend on the number of processes. In fact, if the result completely changes with the number of processes then that may point to very deep conceptual errors in your numeric algorithm, since the result depend on how the data is summed (for example, when you sum small and large numbers together). This is very different from a 'bug' in the program (or in the MPI library). |
|||
==== <code>all_reduce</code> ==== |
|||
Going back to the global power example we can make all the processes receive the summation of values by inserting the lines: |
|||
double global_pdata=-1; |
|||
mpi::all_reduce(world, pdata, global_pdata, std::plus<double>()); |
|||
double global_ptransform=-1; |
|||
mpi::all_reduce(world, ptransform, global_ptransform, std::plus<double>()); |
|||
... |
|||
cout<<"Process "<<world.rank()<<" knows that global power of original data is "<< global_pdata << endl; |
|||
cout<<"Process "<<world.rank()<<" knows that global power of transformed data is "<< global_ptransform << endl; |
|||
Now the program will compute the power of the data for us, which makes easier to verify that the Fourier transform is consistent. Note that the third argument is the variable which will be overwriten with the result and the second is the contribution from the current process to the reduction. |
|||
From this point on we can easily adapt the reduction to other needs. We can pass a different function, like std::multiply<double>() or, just as an example, selectively pass data if it fulfills certain condition: |
|||
mpi::all_reduce(world, pdata>0.?pdata:0, std::plus<double>()); // only pass positive values. |
|||
mpi::all_reduce(world, world.rank()<4?pdata:0, std::plus<double>()); //only pass data of the first 4 processes. |
|||
Note that in both cases the condition of passing the data is *inside* the function call (via the ternary function <code>condition?casetrue:casefalse</code>). If you put the condition outside the function call, there exists the possibility of not fulfilling such condition call the MPI communication will hang because all the process will be waiting for a process that refused to send the data. This is the wrong code |
|||
if(pdata>5000){ // wrong way to try to selective pass data larger than 5000 |
|||
mpi::all_reduce(world, pdata, std::plus<double>()); |
|||
} // WRONG |
|||
What is worst: the program may or may not hang depending on the actual numeric value of the data. So, third lesson of parallel programming: it is too easy to create a deadlock. In particular all blocking-communication have to be executed in non-conditional branches of the program. (Another good reason to program without <code>if</code>'s.) |
|||
==== Note on function objects (Boost.Lambda and Boost.Phoenix) ==== |
|||
In the C++ standard library, the binary addition is represented by a function object called <code>std::plus<T></code> which represents the addition over type T, we are interested in the case in which T is the double type. This functions object have become the standard way of passing arbitrary functions to other functions. Function-object seem misterious sometimes but they are simply a representation of the operation, for example this code works as expected: |
|||
std::binary_function<double, double, double>* myop; |
|||
myop=new std::plus<double>(); |
|||
double plus_result=(*myop)(4,5); // or myop->operator()(4,5); |
|||
assert(plus_result==9); |
|||
delete myop; |
|||
myop=new std::multiply<double>(); |
|||
double multiply_result=(*myop)(3,5); |
|||
assert(multiply_result==15); |
|||
delete myop; |
|||
In fact it is very easy to create your own function object, they look almost like functions: |
|||
struct concat{ //concatenation |
|||
std::string operator()(std::string const& s1, std::string const& s2){ |
|||
return s1 + s2; |
|||
} |
|||
}; |
|||
Passing such function object to reduce will make <code>all_reduce</code> concatenate the strings produced by different processes. Note that this operation is not commutative and will depend on some standard convention used by MPI to collect the data (fortunately it is well defined). |
|||
The only difference between our function object an <code>std::plus<std::string></code>, is that the later is already included in some standard library. |
|||
They differ from built-in (normal C) functions in two aspects, 1) you can parametrize them, 2) you can make copies of them, 3) you can compare them: |
|||
struct concat_with_sep{ // concatenation with separation |
|||
std::string separator_; |
|||
concat_with_sep(std::string const& separator) : separator_(separator){} |
|||
std::string operator()(std::string const& s1, std::string const& s2){ |
|||
return s1 + separator_ + s2; |
|||
} |
|||
}; |
|||
The beauty of this is that, although the separator string is arbitrary, 'concat_with_sep', viewed as function, still takes only two parameters. Function object are more general than regular functions, because they can have internal parameters. Function objects are sometimes called Functors. |
|||
Free functions (the normal C-functions) can be converted to function objects in many ways. The standard way is to use [http://www.boost.org/doc/libs/1_38_0/doc/html/function.html Boost.Function] library. Function objects can be combined with one another with the [http://www.boost.org/doc/libs/1_38_0/libs/bind/bind.html Boost.Bind] or the (more advanced) [http://www.boost.org/doc/libs/1_38_0/doc/html/lambda.html Boost.Lambda]. |
|||
For very complex functions we should define our own function objects which requires a separate definition of a class or function. There is however some cases where the function that we need to apply is simple but not trivial. It is in this case where Boost.Lambda plays a nice role by allowing us to define a function ''in-place''. For example, suppose that we need to compute not the sum but the sum of square modulus, in such case the function object can be created by replacing <code>... std::plus<double>() ...</code> by |
|||
... _1+bind(&std::norm<double>, _2) ... |
|||
where we should include and declare |
|||
#include<boost/lambda/lambda.hpp> |
|||
#include<boost/lambda/bind.hpp> |
|||
using namespace boost::lambda; |
|||
at some earlier point. If we were interested in the sum of squares, we could use just use |
|||
... _1 + _2 *_2 ... |
|||
Even simpler is the simple addition, plus<double> is exactly identical to _1+_2 once we use Boost.Lambda. |
|||
Note that this nice syntax is achieved by the Lambda library and it is not built-in in the language. How the Lambda library works internally is quite complex to explain, however it is designed to be easy to use (note the natural syntax). On the other hand it is difficult to debug (compilation errors are difficult to interpret). A working example of this library can be downloaded from [[media:Boost_lambda.tar | here]]. |
|||
The example also illustrates the usage of [http://spirit.sourceforge.net/dl_docs/phoenix-2/libs/spirit/phoenix/doc/html/index.html Boost.Phoenix] which is intended to supercede Boost.Lambda. |
|||
==== <code>reduce</code> ==== |
|||
Another variation of the example can be introduced if we realize that sometimes the result is only needed in one process. In this case the reduction result is sent to one process only, which is specified as the last argument in mpi::reduce: |
|||
double global_pdata=-1; |
|||
mpi::reduce(world, pdata, global_pdata, std::plus<double>(),0); // process 0 receives the data |
|||
cout<<"Process "<<world.rank()<<" *thinks* that global power of the original data is "<< global_pdata << endl; |
|||
In this example, only process 0 will have a meaningful value of the result; other processes will have the global_pdata variable unchanged. The process 0 will output the correct value, while the rest will output -1. Essentially, all process but one will have an uninitialized, invalid, or at best flagged to some magical value. |
|||
This selective data collection sounds reasonable idea but really it can cause trouble because we have to keep track of the process that has the right value by some conditional. There are two basic possibilities, either we "promise" to ignore the value in processes with the wrong value. |
|||
double global_pdata=-1; |
|||
mpi::reduce(world, pdata, global_pdata, std::plus<double>(),0); // process 0 receives (and send to itself) the data |
|||
if(world.rank()==0){ //always check for process number *before* using the variable. |
|||
cout<<"Process 0 knows that the global power of the origin data is "<<global_pdata<<", others do not really know"<<endl; |
|||
} |
|||
or we put the result variable inside the scope of an 'if' statement: |
|||
if(world.rank()==0){ |
|||
double global_pdata=-1; |
|||
mpi::reduce(world, pdata, global_pdata, std::plus<double>(),0); // process 0 sends and receives the data |
|||
... // do other operations with result |
|||
cout<<"Process 0 knows that the global power of the origin data is "<<global_pdata<<", others do not know at all"<<endl; |
|||
}else{ |
|||
mpi::reduce(world, pdata, std::plus<double>(),0); // others process only send the data |
|||
} |
|||
Note that the number of arguments is different in the two different calls to reduce. The first branch will receive in the result of all the processes (including itself) while the second will only send values. |
|||
In this way, only process 0 can ever access the value. Trying to use the the variable outside the first branch of the conditional will not even compile. In this case we do not have to promise about the usage of the variable but we have to promise that we will call similar versions of <code>reduce</code> in both branches of the conditional. If we forget the <code>else</code> branch (second branch), the program will hang in the same way as described above. |
|||
Next subsection will deal with this problem of 'dangling' variables. You can skip the following subsection in a first read, it will not improve your understanding of MPI. Just be aware the problem for the moment. |
|||
==== The issue of dangling variables (and Boost.Optional)==== |
|||
A variable with a value that makes sense in only one process seems to be a natural situation, after all, each process is supposed to do different part of the total work. The problem is that, within MPI, all processes must share the same code and therefore will be forced to carry variables that may not make sense to them. This situation creates tension. |
|||
Note that in the first place, this issue arises because by collecting data in only one process we are imposing some asymmetry; it is nevertheless necessary sometimes. The situation is annoying for those used to serial programming, but we have to understand that this kind of problems are at the core of parallel programming difficulties. The first step to try to avoid the problem is to think whether it may be really useful --or at least not harming-- for all the processes to know the result of reduce/all_reduce.) |
|||
It seems that the second solution above (if/else block) is more elegant because there are no dangling undefined variables around, but is not free of problems. That is because if we need to compute something non-trivial with the result, the conditional statement can become several lines long and it can be difficult to handle cases in which more than one (but not all) process receive the value. We can put all these lines in a separate function, but that function will be forced to be serial code (because it is executed only for one process). Note that I refer this as "dangling variable", in the same sense we refer to "dangling pointers", both are equally dangerous. The following proposed solution uses the library Boost.Optional and it can be also used as an introduction to this other Boost library. |
|||
[http://www.boost.org/doc/libs/1_37_0/libs/optional/doc/html/index.html Boost.Optional] introduces a kind of objects which carry information on whether their in internal value is valid (e.g. initialized or not). For example the object |
|||
boost::optional<double> a; |
|||
can hold any real value *or* be in a 'undefined state'. If it is undefined then (bool)a==false and true otherwise and its actual value is given by *a. If we try to extract a value *a for an undefined variable then we will encounter a runtime error. |
|||
There is another possibility to consider that I discovered, that is the usage of [http://www.boost.org/doc/libs/1_37_0/libs/optional/doc/html/index.html Boost.Optional]. (This solution is not suggested anywhere else to my knowledge.) In this case, the valid state of the variable is handled by the variable itself. |
|||
boost::optional<double> global_pdata(world.rank()==0, double()); //non empty at node 0 only. |
|||
if(global_pdata) mpi::reduce(world, pdata, *global_pdata, std::plus<double>(),0); |
|||
else mpi::reduce(world, pdata, std::plus<double>(),0); |
|||
... |
|||
if(global_pdata){ // only "use" if global_pdata is it is initialized |
|||
cout<<"Process "<<world.rank() |
|||
<<" knows that the global power of the origin data is "<<*global_pdata<<", others will die if they try to 'know'"<<endl; |
|||
} |
|||
// cout<<*global_pdata<<endl;//unchecked usage will throw and exception and possibly abort the program in other processes |
|||
For those used to C, they will recognize that 'global_pdata' looks like a 'nullable' pointer (whose pointee value is valid when !=0 by convention). The difference is that boost::optional behaves better than a pointer for this application, for example, boost::optional can be assigned values, or uninitialized with no memory allocation necessary, no dangling pointers can be created in this way. |
|||
If we respect this check-then-use syntax, the variable becomes transparent to all the process where it is undefined. |
|||
Note that global_pdata is not associated with any particular process number, we can forget in which process the variable was valid, the validity test is done in the variable itself. We have to remember to test whether the global_pdata is locally valid. What if we do not check and use *global_pdata without test? Unless we compile in production mode (NDEBUG=0) the calling process will terminate after an 'uninitialized' fail assertion is called from Boost.Optional. This is still better than other options, like having a dangling pointer (nullable pointer solution), using an uninitialized or invalid value (first solution), or a deadlock (second solution). |
|||
So far I mentioned three (or four) possible solutions, none of them can save us at compilation time, i.e. before running our program. The situation will not improve until someone smart writes a compiler (and a language) capable of *really* interpreting parallel code and figuring out these issues before actually running the program. (Notice that MPI is really like low level programming.) |
|||
If you think that my suggested solution (using Boost.Optional) is in the right direction, there are at least two ways to improve it further. First, we can encapsulate the two calls to 'reduce' in a single function (a template function will be even more general): |
|||
boost::optional<double> optional_reduce(boost::mpi::communicator const& world, double const& in_value, int root){ |
|||
boost::optional<double> ret(world.rank()==0, double()); |
|||
if(ret) mpi::reduce(world, in_value, *ret, std::plus<double>(),0); |
|||
else mpi::reduce(world, in_value, std::plus<double>(),0); |
|||
return ret; |
|||
} |
|||
and then use it as |
|||
boost::optional<double> global_pdata = optional_reduce(world, pdata, 0); |
|||
... |
|||
if(global_pdata) cout<<"global_pdata = "<<*global_pdata<<endl; |
|||
The second is to use exceptions. An exceptions that is thrown when trying to use a variable in a process in which is not defined. |
|||
For output purposes we can make boost::optional support output by |
|||
#include<boost/optional/optional_io.hpp> ... |
|||
cout<<"global_pdata = "<<global_pdata<<endl; |
|||
will print '--' if the value is unitialized and the actual value otherwise, note the lack of '*'. |
|||
==== Transparent blocks via exceptions (boost/assert.hpp) ==== |
|||
By default, access to an uninitialized optional variable results in a failed assert and end of execution. If you want to be able to recover from this failed assert, you can use [http://www.boost.org/doc/libs/1_38_0/libs/utility/assert.html boost/assert.hpp] to change the default behavior. |
|||
Just add the following code (extracted from [http://www.boost.org/doc/libs/1_38_0/libs/multi_array/test/assert.cpp here]) before and after <code>#include<boost/optional.hpp></code> |
|||
#define BOOST_ENABLE_ASSERT_HANDLER |
|||
#include "boost/assert.hpp" |
|||
#include "boost/otional.hpp" |
|||
// #undef BOOST_ENABLE_ASSERT_HANDLER //to constrain the new default to optional.hpp only |
|||
#include <stdexcept> |
|||
namespace boost { |
|||
void assertion_failed(char const* expr, char const* function, |
|||
char const* file, long line) { |
|||
throw std::runtime_error(expr); |
|||
} |
|||
} |
|||
This will allow you to throw an exception (instead of failing after the assert) when accessing uninitialized variable, this gives the opportunity to recover from failed executions. For example: |
|||
try{ |
|||
boost::optional<double> global_pdata = optional_reduce(world, pdata, 0); |
|||
}catch(...){ |
|||
clog<<"sorry I am a process that don't know about global_pdata"<<endl; |
|||
} |
|||
The idea is that the invalid part of the try-block becomes transparent to processes. (The valid part should be exception safe though.) |
|||
=== Primitive or point-to-point operations (<code>barrier</code>, <code>send</code> & <code>recv</code>) === |
|||
The section describe primitive operations, i.e. operation that are very basic. In principle we should be able to do all the communications with this primitives, but then many things will have to be handled manually. The result of using this operations in excess can result in a very unelegant code. (That is why this section is at the end). |
|||
The simplest operation is the member function barrier it serves to the purpose of synchronizing all the processes in a certain communicator. Then there is the member send and recv, to send an receive messages. |
|||
In pure C-MPI, the last two operators are only implemented to pass small types, like single numbers or pointers. That means that to pass a simple string we have to allocate the memory manually and hope for successful communication and that we didn't introduce any mistake when dealing with the pointers. |
|||
With Boost MPI the situation is much better thank to the automatic serialization and deserialization. |
|||
The following example shows the use of barrier, send and recv to print (from process 0) a list of the nodes involved in a MPI execution: |
|||
#include<boost/mpi.hpp> |
|||
int main(int argc, char* argv[]){ |
|||
boost::mpi::environment env(argc, argv); |
|||
boost::mpi::communicator world; |
|||
std::clog<<"greetings from process #"<<world.rank()<<" in processor named "<<boost::mpi::environment::processor_name()<<std::endl; |
|||
world.barrier(); |
|||
if(world.rank()==0) std::clog<<"all processes syncronized here"<<std::endl; |
|||
world.send(0,world.rank(), boost::mpi::environment::processor_name()); |
|||
for(int i=0; i!=world.size(); ++i){ |
|||
//std::clog<<"process #"<<i<<" sends the name of its processor to process #0"<<std::endl; |
|||
if(world.rank()==0){ |
|||
std::string buffer; |
|||
world.recv(i, i, buffer); |
|||
std::clog<<"process #0 receives string "<<buffer<<" from process #"<<i<<std::endl; |
|||
} |
|||
} |
|||
return 0; |
|||
} |
|||
compile with |
|||
mpicxx main.cpp -L$HOME/usr/lib -lboost_mpi-gcc44-mt -lboost_serialization-gcc44-mt -Wl,-rpath=$HOME/usr/lib |
|||
and run with |
|||
./a.out |
|||
The first set of messages ("greetings...") will appear in a random order, however thanks to the barrier call, all processes will wait for the others to reach that point in the execution. After that all processes will send the name of the processor in which they are executing to process number 0, which will receive the message and print the string. |
|||
Both send and recv send a tag (integer in the second argument), this number can be for example the origin process of the message itself, but the important thing is that send and recv must agree in the tag, otherwise recv will wait for a message that has the correct tag. If you make the tag disagree at send and receive the program will hang. Tag is useful to distiguish between 'different' message sent by the same process. |
|||
Although much cleaner than its counterpart pure C version, the code is still quite fragile and prone to errors. For these reason it is recommendable to use other ways to send messages as described in the other sections of this chapter. |
|||
=== Other collective operations (<code>gather</code> and <code>broadcast</code>) === |
|||
To make clear that options other than send and recv should be considered I will reimplement the program using the 'gather' operation. [http://www.boost.org/doc/libs/1_42_0/doc/html/mpi/tutorial.html#mpi.gather Gather] takes the elements send by other processes and stacks them in a vector type. |
|||
Then the vector object can be transversed to inspect its contents. |
|||
#include<boost/lexical_cast.hpp> |
|||
#include<boost/mpi.hpp> |
|||
int main(int argc, char* argv[]){ |
|||
boost::mpi::environment env(argc, argv); |
|||
boost::mpi::communicator world; |
|||
std::clog<<"greetings from process #"<<world.rank()<<" in processor named "<<boost::mpi::environment::processor_name()<<std::endl; |
|||
std::string message="process #0 receives string "+boost::mpi::environment::processor_name()+" from process #"+boost::lexical_cast<std::string>(world.rank()); |
|||
if(world.rank()==0){ |
|||
std::vector<std::string> lines; |
|||
boost::mpi::gather(world, message, lines, 0); |
|||
for(int i=0; i!=lines.size(); ++i) |
|||
std::cout<<lines[i]<<std::endl; |
|||
}else{ |
|||
boost::mpi::gather(world, message, 0); |
|||
} |
|||
return 0; |
|||
} |
|||
[http://www.boost.org/doc/libs/1_42_0/doc/html/mpi/tutorial.html#mpi.broadcast Broadcast] simply sends a value to all the processes in a certain communicator, the values are stored in the same variable as in the broadcasting process. |
|||
string v = "empty"; |
|||
if(world.rank()==0) v="special value"; |
|||
boost::mpi::broadcast(world, v,0); |
|||
assert(v=="special value"); //all processes have this value for v now |
|||
In other words it synchronizes the value of a variable with respect to the value in a certain process. |
|||
=== Parallel logging example === |
|||
One of the first problems that appears when writting our first parallel program is that when we try to print messages the usually don't appear as we intend. The reason is that logging and printing messages to the user is an inherently serial job and once we are in a parallel environment we have to be specific that such message printing is done in a serial way. We have to synchronize the processes make each process send its message (hopefully in a prescribed order) and then recover the parallel execution. What is worst is that sometimes we need all the processes to print the same message, in that case many copies of the same message will be printed. |
|||
It would be great if we can solve this problem once and for all and make the code read our mind and understand these issues and print what we intend. |
|||
The following code will make that possible, the code seems complicated because it uses advanced C++ stream techniques but what it does is very simple, for a newly defined stream (boost::mpi::ostream) all processes will accumulate strings send by each process, when a special end-of-line signal is sent (boost::mpi::endl) is send the root process collects the messages and make them unique. In this sense repeated messages are only shown once. Messages are displayed in alphabetical order. If we want the messages displayed in the order of the issuing process we can just add that information at the beginning of the message (e.g. "proc 5:..."). |
|||
In this sense we can forget that we are running a program in parallel if all messages are the same. But if the messages are different (line-by-line) all the messages are printed recovering the parallel behavior. Also, in the parallel case, messages are shown in a prescribed order. |
|||
#include<boost/mpi.hpp> |
|||
#include<set> |
|||
namespace boost{namespace mpi{ |
|||
class ostream{ |
|||
communicator const& comm_; |
|||
std::ostream& os_; |
|||
std::ostringstream local_buffer_; |
|||
public: |
|||
ostream(communicator const& c=communicator(), std::ostream& o=std::clog) : comm_(c), os_(o){} |
|||
template<typename T> ostream& operator<<(T const& t){ |
|||
local_buffer_<<t; |
|||
return *this; |
|||
} |
|||
typedef ostream& (*ostream_manipulator)(ostream&); |
|||
ostream& operator<<(ostream_manipulator manip){ |
|||
return manip(*this); |
|||
} |
|||
~ostream(){endl(*this);} |
|||
static ostream& endl(ostream& os){ |
|||
os.local_buffer_<<std::endl; |
|||
os.comm_.send(0, os.comm_.rank(), os.local_buffer_.str()); |
|||
os.local_buffer_.str(""); |
|||
if(os.comm_.rank()==0){ |
|||
std::set<std::string> messages; |
|||
for(int i=0; i!=os.comm_.size(); ++i){ |
|||
std::string remote_buffer_; |
|||
os.comm_.recv(i, i, remote_buffer_); |
|||
messages.insert(remote_buffer_); |
|||
} |
|||
for(std::set<std::string>::const_iterator i=messages.begin(); i!=messages.end(); ++i){ |
|||
os.os_<<(*i);//<<std::endl; |
|||
} |
|||
} |
|||
os.comm_.barrier(); |
|||
return os; |
|||
} |
|||
typedef std::ostream& (*std_manipulator)(std::ostream&); |
|||
ostream& operator<<(std_manipulator manip){ |
|||
manip(local_buffer_); |
|||
return *this; |
|||
} |
|||
}; |
|||
ostream& endl(ostream& os){return ostream::endl(os);} |
|||
}} |
|||
The stream is used as follows: |
|||
int main(int argc, char* argv[]){ |
|||
boost::mpi::environment env(argc, argv); |
|||
boost::mpi::communicator world; |
|||
using boost::mpi::endl; //or using std::endl; //* |
|||
boost::mpi::ostream clog(world); // or using std::clog; //* |
|||
clog<<"world says: we are the world"<<endl; //this messages is unique |
|||
clog<<"rank "<<world.rank()<<" says: I am me"<<endl; //this message is different for each process. |
|||
return 0; |
|||
} |
|||
The ouput is |
|||
$ mpirun -np 3 ./a.out |
|||
world says: we are the world |
|||
rank 0 says: I am me |
|||
rank 1 says: I am me |
|||
rank 2 says: I am me |
|||
Note that the code looks almost like the serial code, by changing the two lines marked we recover the usual behavior (i.e. multiple messages in random order). Anoether way to recover the serial behavior is to use the communicator MPI_COMM_SELF in the constructor of the stream: |
|||
boost::mpi::communicator self(MPI_COMM_SELF, boost::mpi::comm_attach); |
|||
boost::mpi::ostream clog(self); // or using std::clog; |
|||
There are a couple of assumptions in the design. First, the messages are all different or all the same, i.e. we don't care about distinguishing messages from 'groups' of processes. In any case that 'group' information can be passed explicitly as part of the string. Second, the smallest indivisible messages is an entire line and not part of it, i.e. uniqueness is relative to a full line, i.e. until boost::mpi::endl is found. |
|||
The class can be defined and used inline: |
|||
boost::mpi::ostream(world)<<world.rank()<<" says: I am me"<<endl; |
|||
By default it uses the std::clog output (stderr), if we want to use a different output stream, for example a file or std::cout (stdout) we can specify: |
|||
boost::mpi::ostream(world, std::cout)<<world.rank()<<" says: I am me"<<endl; |
|||
Finally, we have to remember that all processes (in a certain communicator) will wait to synchronize to print the message. So the parallel performance can be affected by printing these messages, even if the message is unique. If this is a price we don't want to pay, then we will have to roll back to the usual conditional statement: |
|||
if(world.rank()==0) std::clog<<"here"<<std::endl; |
|||
But then the message will not contain information about the current state of processes other than process number 0. |
|||
NB: The idea of the behavior is taken from the [https://computation.llnl.gov/casc/ppf/ppf.html Parallel Print Function] and the implementation of the stream is bases on this [http://stackoverflow.com/questions/1134388?sort=votes#sort-top generic stream class]. |
|||
=== Other Resources === |
|||
* http://daveabrahams.com/2010/09/03/whats-so-cool-about-boost-mpi/ |
|||
== Boost.Filesystem == |
|||
Sooner or later every program has to deal with the task of managing files or even directories in the disk. The following tasks are quite tedious to perform in C or C++: |
|||
* Create/[http://www.brainbell.com/tutorials/C++/Removing_A_Directory.htm#cplusplusckbk-CHP-10-SECT-11 remove] directories |
|||
* Iterate through files in a given directory |
|||
* Create filenames from other names |
|||
* Change/add filename extensions |
|||
* Copy files |
|||
* Delete files |
|||
* Check if a file exists |
|||
Besides, there is no portable way (across operating systems) to do it in general with the core language and libraries. Here is where [http://www.boost.org/doc/libs/1_38_0/libs/filesystem/doc/index.htm Boost.Filesystem] proves useful, it not only facilitates programming such tasks but also does it in a portable way by hiding the OS-specific file management. Besides it is likely that this library will become part of the next C++ standard library. If you are using the latest boost, it is better to use this manual http://www.boost.org/doc/libs/1_46_1/libs/filesystem/v3/doc/index.htm which references version 3 of the library. |
|||
This library is also described briefly in http://www.ibm.com/developerworks/aix/library/au-boostfs/index.html and http://en.highscore.de/cpp/boost/filesystem.html |
|||
=== Path === |
|||
The path object can represent a concrete file or a directory. It is basically a string that represents the path to an existing or non existing file or directory. |
|||
boost::filesystem::path p("/home/user/data.dat"); |
|||
It has certain advantages to the usual way of representing a file by means of char* or std::string. The advantage over char* is that the object is not a pointer so it can be copied without worrying about the pointer semantics. |
|||
The advantage over std::string is that path names can be combined very easily, even using the intuitive operator '/', for example: |
|||
using boost::filename::path; |
|||
path myhome("/home/user"); |
|||
path mydata("data.dat"); |
|||
path fullname = myhome / mydata; |
|||
Other facilities of boost::filename::path is that the extension of the file can be easily extracted and changed. |
|||
Traditionally filename are passed as char* in C and std::string in C++. filesystem::path can be converted back and forth from this types. Thanks to the constructors |
|||
const char c[]="file.dat"; |
|||
path p(c); |
|||
path p("file.dat"); // from literal; |
|||
std::string s="file.dat"; |
|||
path p(s); |
|||
and back from member functions |
|||
std::string s=p.string(); |
|||
const char c[99]=p.string().c_str(); |
|||
so, passing filenames to old routines is as simple as passing p.string().c_str(). (note that c_str() is a member function of std::string class). |
|||
=== Common path name functions === |
|||
The operations listed in this section manipulate paths, that is manipulates the string that *represent* files (not the files themselves). |
|||
p.extension() // was extension( path ) -> string //returns the extension of a file path (useful for guessing the file type), includes the 'dot'. |
|||
p.stem() // was basename( path ) -> string //returns everything in the filename except for the extension |
|||
p.filename() // name of file without path |
|||
note that <code>p.string() == p.stem() + p.extension()</code>. |
|||
p.replace_extension(".newext"); //(was change_extension(const path, string new_extension_w_dot) -> path) // changes the extension of a path name and returns the path |
|||
Note that these functions return strings, not paths. Strings can be manipulated as such, and eventually transformed to paths. With the path constructor: |
|||
path newp(string); |
|||
Also to interface with old routines that take C-strings we could use |
|||
p.string().c_str(); |
|||
the first function produces an std::string with the path the second call productes a C-string (char*). |
|||
=== Common file actions === |
|||
The operations listed in this section make actual changes on the file in the disk. |
|||
Most frequently file actions are straight forward once we remember the particular names given, here it is a list of functions with obvious side effects: |
|||
create_directory( path ); |
|||
create_symlink( realfile, linkname ); |
|||
create_directory_symlink( realfile, linkname ); |
|||
copy_file( origin, destination ); // exception on exisisting destination |
|||
remove( file ); |
|||
==== Caveats ==== |
|||
* Exceptions |
|||
Some actions could raise exceptions upon errors, if not catch they can abort the program. For example create_symlink raises an exception if the link already exists (but doesn't raise an exception if the destination file does exist). Another typical case is copy_file when destination already exists; for that copy_file can be given an option: |
|||
copy_file( origin, destination, copy_option::overwrite_if_exists ); |
|||
* Symlinks |
|||
Directories symlink should be created with create_directory_symlink in general, (although in some filesystems/OS they are the same). |
|||
First argument of create_symlink is basically the content of the symlink, therefore for symlinks created in an arbitrary directory should be relative to that location. |
|||
If symlink path exists (as a real file or as a symlink) an exception is raised. Therefore destinations must be removed before creating the symlink (there is no symlink overwrite option). |
|||
=== Directory Iterator === |
|||
So far, the features described are simply path name manipulations and file actions. On top of that the library gives the opportunity to scan files in a certain directory as if the directory were a typical STL container. The following code shows how to read several files with a given filename in different directories: |
|||
using namespace boost::filesystem; // ::directory_iterator ::path, etc |
|||
for(directory_iterator it("../datadir"); it == directory_iterator() /*end iterator*/; ++it){ |
|||
if(not is_directory(*it)) continue; //ignore if the entry is not a directory |
|||
for(directory_iterator it2(*it); it2 == directory_iterator(); ++it2){ |
|||
remove((*it2)/"data.txt"); |
|||
} |
|||
} |
|||
This example removes the file named "data.txt" in the directory "../datadir/*/". |
|||
There is also a 'recursive_directory_iterator' which avoids the complication of nesting different loops to reach a directory of arbitrary nesting. |
|||
This kind of directory)iterator should not be confused with other type of iterator which exists for a completely different purpose, the 'path::iterator', (which, given a path e.g. ("/a/b/c/d/e/") iterates over the components "a", "b", "c", etc.) |
|||
This is a minimal example in which the files in the current directory are backed up: |
|||
//c++ -std=c++0x -Wfatal-errors $0 -o $0.x -lboost_date_time -lboost_filesystem -lboost_system && $0.x $@ |
|||
#include<boost/filesystem.hpp> |
|||
using namespace boost::filesystem; |
|||
using std::cout; using std::endl; |
|||
int main(){ |
|||
initial_path(); // keeps information of the initial path where the program is executed |
|||
cout << "iterating over entries in directory " << initial_path() << endl; |
|||
for(directory_iterator it(initial_path()); it != directory_iterator(); ++it){ |
|||
cout << "entry " << *it << " found "; |
|||
if(is_directory(*it)){ |
|||
cout << "is a directory and will be ignored" << endl; |
|||
}else if((*it).path().extension() == ".backup"){ |
|||
cout << *it << " is already a backup and will be ignored " << endl; |
|||
}else{ |
|||
cout << "is not a directory and will be backed up" << endl; |
|||
path destination = (*it).path().string() + ".backup"; |
|||
cout << "copying from " << *it << " to " << destination << endl; |
|||
boost::filesystem::copy_file(*it, destination, copy_option::overwrite_if_exists); |
|||
} |
|||
} |
|||
return 0; |
|||
} |
|||
=== Compilation (Dynamic vs. Static Linking) === |
|||
The Filesystem library is not a "header only" library, that means that the executable must be linked to the libraries. There are two options, either dynamic linking (default) or static linking. Dynamic linking generates smaller executable files although it needs to locate the libraries at runtime, static on the other hand allows the executable to be shipped to another compatible system where the libraries are not present. In this example a small program using Boost.Filesystem is linked in two different ways, showing the difference in size: |
|||
$ c++ sizeme.cpp -lboost_filesystem -lboost_system |
|||
$ ls -all ./a.out |
|||
-rwxr-xr-x 1 user user 27867 2011-06-21 20:36 ./a.out* |
|||
$ ldd ./a.out |
|||
libboost_filesystem.so.1.47.0 => /home/user/usr/lib/libboost_filesystem.so.1.47.0 (0x00007f7ec382e000) |
|||
libboost_system.so.1.47.0 => /home/user/usr/lib/libboost_system.so.1.47.0 (0x00007f7ec362a000) |
|||
... |
|||
$ c++ sizeme.cpp -lboost_filesystem -lboost_system -static |
|||
$ ls -all ./a.out |
|||
-rwxr-xr-x 1 user user 1611409 2011-06-21 20:40 ./a.out* |
|||
$ ldd ./a.out |
|||
not a dynamic executable |
|||
Static linking produces a file 1.6 MB big, roughly the size of the Boost.Filesystem plus several other system libraries. In orther to be able to compile statically and dynamically you must have both the *.so and *.a files, this is the case if you followed the installation instructions for boost. |
|||
$; ls ~/usr/lib/libboost_filesystem.* -all |
|||
-rw-r--r-- 1 user user 390006 2011-04-13 13:17 /home/correaa/usr/lib/libboost_filesystem.a |
|||
lrwxrwxrwx 1 user user 29 2011-04-13 13:17 /home/correaa/usr/lib/libboost_filesystem.so -> libboost_filesystem.so.1.47.0* |
|||
-rwxr-xr-x 1 user user 173253 2011-04-13 13:17 /home/correaa/usr/lib/libboost_filesystem.so.1.47.0* |
|||
To have the static (and dynamic) version of the libraries in Ubuntu for example you must have the *-dev version of the packages. For example: |
|||
$ sudo apt-get install boost-all-dev libboost-filesystem1.42-dev |
|||
This library must be compiled with the C++0x support if you want to compile it with C++0x compiler. (See installation notes.) |
|||
== Boost.StaticAssert == |
|||
We can check for certain consistency conditions before some complicated code is called by means of 'assert'. For example: |
|||
#include<cassert> // usually not needed |
|||
double square_root(double x){ |
|||
assert(x>=0); |
|||
... |
|||
} |
|||
Under certain circumstances what we need is something that is checked during compilation, and not at runtime like the previous example. In other words, some conditions on macro definitions, types information and numeric constants must be checked even before the program is executed. It is here were Boost.StaticAssert can be useful. |
|||
The macro BOOST_STATIC_ASSERT, defined in [http://www.boost.org/doc/libs/1_39_0/doc/html/boost_staticassert.html#boost_staticassert.intro Boost.StaticAssert library] checks for certain conditions to be met at compile time. Of course the predicate must have the possibility to be evaluated at compile time, so dynamic variables cannot be used. This library should not be confused with Boost.Assert (BOOST_ASSERT). |
|||
Although most applications of static assert are for template instantiation, here I will give a simpler example. Suppose that we have two compilation flags -D_HOT and -D_ICECREAM that although it makes sense to use them separately it doesn't make sense to use them together because of the internal logic of the code. HOT is not compatible with ICECREAM. |
|||
The code needed to ensure that both flags are not defined simultaneously is the following: |
|||
#ifdef _HOT |
|||
#ifdef _ICECREAM |
|||
BOOST_STATIC_ASSERT(0); |
|||
#endif |
|||
#endif |
|||
The compilation will fail (at that point) if both compilation flags are defined. |
|||
A more complicated example is the following suppose that we want to make sure that the code is never compiled in a machine were the type 'int' has less than 32 digits: |
|||
BOOST_STATIC_ASSERT(std::numeric_limits<int>::digits >= 32); |
|||
Note that the condition can be evaluated at compile time (there are no variables involved). |
|||
In order to use BOOST_STATIC_ASSERT you must <code>#include<boost/static_assert.hpp></code>. |
|||
Due to technical reasons the authors of the library recommend, when possible (specially in header files), to put the condition in a special namespace. That also gives the opportunity to add some meaning to the condition: |
|||
#include <climits> |
|||
#include <cwchar> |
|||
#include <limits> |
|||
#include <boost/static_assert.hpp> |
|||
namespace my_type_conditions { |
|||
BOOST_STATIC_ASSERT(std::numeric_limits<int>::digits >= 32); |
|||
BOOST_STATIC_ASSERT(WCHAR_MIN >= 0); |
|||
} |
|||
A built-in version of static_assert is supposed to be part of the next [http://gcc.gnu.org/projects/cxx0x.html C++0x language]. |
|||
== Boost static computations (Metacomputations) == |
|||
Boost is above all a template library, that means that it concentrates in computations about types and compile time constants rather than in runtime algorithms. |
|||
Pure C is a programming language that allows to write almost any runtime algorithm, but the language doesn't give facilities to write algorithms that are performed during compilation (i.e. by the compiler). |
|||
On the other hand, C++ defined a whole set of template syntax that allows us to force the compiler to perform certain operations. Programming templates is called 'metaprogramming' in the jargon, because it is like programming the code generation of a program. In this sense many computations can be performed even before the program is run for the first time. |
|||
[http://www.boost.org/doc/libs/1_39_0/libs/mpl/doc/index.html Boost.MPL] is a great example of what it can be achieved by template metaprogramming. Here it is a basic example: |
|||
#include <boost/mpl/int.hpp> |
|||
int main(){ |
|||
typedef boost::mpl::int_<8> eight; |
|||
std::clog<<eight::value<<std::endl; |
|||
std::clog<<boost::mpl::next<eight>::type::value<<std::endl; |
|||
} |
|||
The output of this program is |
|||
8 |
|||
9 |
|||
The template class [http://www.boost.org/doc/libs/1_39_0/libs/mpl/doc/refmanual/int.html mpl::int] works like a metatype while 'eight' is a metavariable (and a type). |
|||
The operation 8 + 1 is never performed by our program. So, who did the operation? The answer is the compiler it self, in fact the program is exactly (in the binary sense) as having explicitly programmed manually: |
|||
int main(){ |
|||
std::clog<<8<<std::endl; |
|||
std::clog<<9<<std::endl; |
|||
} |
|||
The operation 'next' is a metafunction that returns a new type which wraps the value that follows the value wrapped by the argument type. The following program, makes the compiler compute the number 17. When executed, the program just prints the constant 17 (but it doesn't need to compute it). |
|||
#include <boost/mpl/int.hpp> |
|||
#include <boost/mpl/plus.hpp> |
|||
int main(){ |
|||
typedef boost::mpl::int_<8> eight; |
|||
typedef boost::mpl::int_<9> nine; |
|||
std::clog<<boost::mpl::plus<eight, nine>::type::value<<std::endl; |
|||
} |
|||
Boost.MPL have many more facilities, like compiler time containers (e.g [http://www.boost.org/doc/libs/1_39_0/libs/mpl/doc/refmanual/vector.html mpl::vector]) and algorithms (e.g. [http://www.boost.org/doc/libs/1_39_0/libs/mpl/doc/refmanual/max-element.html mpl::max_element]). |
|||
That it means that we can precompute anything at compile time so that at runtime we save many computations? Can we, for example, precompute the square root of 2 so we don't have to type the constant 1.4142... manually? no, floating point numbers are a limitation of the template metaprogramming and they are not allowed for technical reasons. But otherwise we can compute anything that involves integer arithmetic, including rational numbers and integer linear algebra. |
|||
A different kind of limitation of template metaprogramming is that the efficiency of the operations is usually much worst than runtime operations, therefore compilation can take longer as we use more complicated techniques. In fact we can make the template program so complicated that the compilation never ends (most likely fail due to recursion limits set by the compiler). |
|||
For example [http://www.boost.org/doc/libs/1_39_0/doc/html/boost_units.html Boost.Units] is a compile-time dimensional analysis library with absolutelly no-runtime cost. Dimensional analysis requires to solve linear system of equations with integer coefficients. |
|||
[http://www.boost.org/doc/libs/1_39_0/libs/math/doc/gcd/html/gcd_and_lcm/gcd_lcm/compile_time.html Parts of Boost.Math] implement static Greatest Common Divisor and Lowest Common Multiple algorithms, i.e. the numbers are computed during program compilation. |
|||
=== Static Rationals === |
|||
In this section I will describe a library that is auxiliar to the Boost.Units library which is not documented. The Boost.Units library requires an implementation detail of compile time (static) rationals and that implementation can be used if we include the boost/units/static_rational.hpp. Recently Boost accepted Boost.Ratio which implement static rationals as well but it is not yet included in the Boost distribution. |
|||
In this example, we define C++ *types* that represent compile time rationals, binary operations can be performed to define new rationals. Values can be inspected at runtime. |
|||
#include<boost/units/static_rational.hpp> |
|||
int main(){ |
|||
typedef boost::units::static_rational<6,8> six_eigths; |
|||
std::clog<<six_eigths::Numerator<<"/"<<six_eigths::Denominator<<std::endl; |
|||
typedef boost::units::static_rational<1,4> one_fourth; |
|||
typedef boost::mpl::plus<six_eigths, one_fourth>::type result; |
|||
std::clog<<result::Numerator<<"/"<<result::Denominator<<std::endl; |
|||
} |
|||
First, the program outputs |
|||
3/4 |
|||
(because 3/4==6/8). Note that it is the compiler through the static_rational.hpp metaalgorithm who computed the non-trivial simplification of the fraction. |
|||
Later it prints |
|||
1/1 |
|||
which is the result of 6/8 + 1/4 (or 3/4 + 1/4) which is computed using the mpl::plus operation (and some implementation in static_rational.hpp). Again the non trivial sum and the simplication is done at the time of the compilation. |
|||
Also note that, like in the case of mpl::int_c, the class static_rational is essentially empty, all the useful information is contained in the class and not in instances of that class. In fact there may not be instances of the class at all in the program. |
|||
Now a simple application. We mentioned earlier that these metaprograms cannot operate on floating point numbers, however we can exploit it anyway to improve floating point performance with minimal user code. |
|||
Suppose that we want an arbitrary exponentiation of a number, to a power that is known a priori but that can change later while we are writing the code. A typical line of code for this purpose looks like this: |
|||
std::cin>>base; |
|||
double const power = 5/6.; |
|||
double result=std::pow(base, power); |
|||
std::cout<<result; |
|||
so far so good, there are not many ways to improve the code, however we may want to change the value of p manually and yet obtain the maximum performance possible. If p=2 or p=1/2 calling std::pow is a waste, since it is not optimal. (Even if std::pow contains an 'if' conditional statement the code is not optimal since it will have to check for a condition at runtime.) |
|||
To obtain the maximal performance we will have to change our code ''manually'' depending on the value of 'p'. |
|||
std::cin>>base; |
|||
/* arbitrary case */ |
|||
//double const power = 5/6.; |
|||
//double result=std::pow(base, power); |
|||
/* square */ |
|||
double result=base*base; //more efficient than std::pow(base, 2); |
|||
/* square root */ |
|||
//double result=std::sqrt(base); //more efficient than std::pow(base, 0.5); |
|||
std::cout<<result; |
|||
Not very elegant, i it? Here is were our solution comes in. We can write a template function parametrized by the known-at-compile-time rational |
|||
#include<cmath> //for std::pow |
|||
template<class RationalPower> |
|||
double smart_pow(double const& base){ |
|||
return std::pow(base, (double)RationalPower::type::Numerator/(double)RationalPower::type::Denominator); |
|||
} |
|||
using boost::units::static_rational; |
|||
template<> double smart_pow<static_rational<1,2>::type>(double const& base){ |
|||
return std::sqrt(base); |
|||
} |
|||
template<> double smart_pow<static_rational<1,1>::type>(double const& base){ |
|||
return base; |
|||
} |
|||
template<> double smart_pow<static_rational<2,1>::type>(double const& base){ |
|||
return base*base; |
|||
} |
|||
then we can use the function with the confidence that the optimal function is always called, because these cases are handled efficiently. This can approach can even handle cases were the numerator and the denominator are calculated independently by some other computation. |
|||
using boost::units::static_rational; |
|||
double base; |
|||
std::cin>>base; |
|||
double result=smart_pow<static_rational<1,2>::type >(base); //automatically calls std::sqrt |
|||
std::cout<<result; |
|||
we don't have to worry about defining the case 4/2 or 2/4 or 9/9 since the static_rational metatype takes care of the simplification and reduces the case to 2, 1/2 and 1 respectively. |
|||
=== Boost.Fusion === |
|||
[http://www.boost.org/doc/libs/1_44_0/libs/mpl/doc/refmanual.html Boost.MPL] and [http://www.boost.org/doc/libs/1_35_0/libs/tuple/doc/tuple_users_guide.html Boost.Tuples] are library without which doing template programming would be difficult. MPL was designed to be the STL (Standard Template Library) of compile time, for example for having containers of types. On the other hand Tuples was the library for heterogeneous (static) containers. |
|||
I will skip MPL container and Tuples and go straight to [http://www.boost.org/doc/libs/1_44_0/libs/fusion/doc/html/index.html Boost.Fusion] which combines the best of both libraries and it is more general. |
|||
Boost.Fusion is a library of compile-time/runtime containers, that is, compile-time objects that contain other compile-time objects. In its turn each static element (types) of the container is also a placeholder for a runtime object. |
|||
#include <boost/fusion/container.hpp> |
|||
#include <boost/fusion/include/container.hpp> |
|||
#include<string> |
|||
#include<iostream> |
|||
int main(){ |
|||
using namespace boost::fusion; |
|||
vector<int, char, std::string> stuff(1, 'x', "howdy"); |
|||
int i = at_c<0>(stuff); |
|||
char ch = at_c<1>(stuff); |
|||
std::string s = at_c<2>(stuff); |
|||
return 0; |
|||
} |
|||
In this example, the object stuff is a boost::fusion::vector (analogous but different from to std::vector), as a type it contains a sequence of types and as a runtime object it contains different types, the int 1, the char 'x' and the string "howdy". Notice how the elements of the fusion::vector are accessed by the fusion:at_c<N> funcion and how therefore N must be a compile time constant. (This is the reason stuff[N] notation wouldn't work). So far, the usage is strictly equivalent to Boost.Tuple's tuples<...> type. |
|||
In parallel to the std::vector, the container fusion::vector has a size function which returns the number of elements. Now here it is where the interesting happens. As mentioned before there is a runtime aspect and a compile-time aspect of the object. There are two size functions, one returns a runtime variable and operates on instances of the class. |
|||
typedef boost::fusion::vector<int, char, std::string> stuff_type; |
|||
stuff_type stuff(1, 'x', "howdy"); |
|||
std::cout << size(stuff) << std::endl; //return a size of 3 |
|||
The other version statically operates on the type and returns a compile time mpl::int_, a compile time integer can be obtained from this mpl::int_ by the value member. MPL basic metatypes were described in a previous section. |
|||
std::cout << stuff::size::value << std::endl; |
|||
The difference between the two is obvious when we want to pass the size of the fusion::vector (which is known at compile time) as the template parameter. |
|||
boost::array<double, stuff_type::size::value> a; |
|||
Now going back to the typical C++ class, you can see that since fusion::vector can hold an arbitrary number of types and elements then any class with member fields can be replaced by a fusion::vector. |
|||
class model{ |
|||
double a; |
|||
int n; |
|||
... |
|||
double calculate() const{return pow(a, n);} |
|||
}; |
|||
class model : fusion::vector<double, int>{ |
|||
double calculate() const{return pow( at_c<0>(*this), at_c<1>(*this) );} |
|||
}; |
|||
quite a generalization, now another metafunction can inspect the parameter types of "model". The price to pay is that we lost the nice names of the parameters (a and n). If we want to give names to the parameters we should use fusion::map. |
|||
=== fusion::map === |
|||
fusion::map is analogous to std::map. In std::map we have values acceded by a key, now the keys are types. Since the types can be arbitrary we can use this mechanism to assign "names" to the keys. In practice this key type can be auxiliary. |
|||
struct a; |
|||
struct n; |
|||
class model : fusion::map< |
|||
== Boost Numerics == |
|||
Boost is not intended specifically to be a numerical library, although there are some useful tools scattered within the library. The [http://www.boost.org/doc/libs?view=category_Math numeric capabilities] random number generation, linear algebra, interval arithmetics, special functions. For numeric applications, it worths also looking at the well known C library [http://www.gnu.org/software/gsl/ GNU Scientific Library (GSL)] which is [http://www.gnu.org/software/gsl/manual/html_node/ documented] in great detail. |
|||
Most math/numeric functions in Boost are generic with respect to the numeric type with different precisions. Of course it can be used with either double or float precision, but also with arbitrary precision types, for example [http://www.boost.org/doc/libs/1_38_0/libs/math/doc/sf_and_dist/html/math_toolkit/using_udt/use_ntl.html it can be used with NTL], although in practice only with 100 digit precision at most. |
|||
=== Boost.Interval === |
|||
[http://www.boost.org/doc/libs/1_37_0/libs/numeric/interval/doc/interval.htm Boost.Interval] provides us with a support for manipulation of intervals and [http://en.wikipedia.org/wiki/Interval_arithmetic interval arithmetic]. It can be used to represent simple numeric intervals (including the empty, infinite and semi infinite intervals) or to do a kind of error propagation for example. |
|||
To create an interval choose a number type and the lower and upper bounds |
|||
#include<boost/numeric/interval.hpp> |
|||
boost::numeric::interval<double> unit(0., 1.); |
|||
boost::numeric::interval<double> two(2.); //interval contains one point |
|||
boost::numeric::interval<double> w = boost::numeric::interval<double>::whole(); |
|||
if the, yet to be, lower limit is larger than the upper limit an exception (runtime error) is raised. Different properties of the interval can be accessed by the functions <code>lower, upper, width (or norm), median, empty, singleton, equal, in, zero_in, subset, proper_subset, overlap, intersection, hull, bisect</code>. Most interesting intervals can be manipulated by arithmetic operators (+,-,*,/) with numbers or with other intervals. |
|||
using boost::numeric::interval; |
|||
interval<double> i1 = unit*2.; //duplicates the unit interval [0.,2.] |
|||
interval<double> i2 = unit-1.; //shifts the unit interval to the left [-.5,.5] |
|||
A curious (and correct) result of the interval arithmetic is that |
|||
<code> whole()*whole()==[-inf,inf]</code> but <code> pow(whole(),2) == [0,inf]</code> |
|||
A semi-infinite interval can be generated (for types that have infinity), |
|||
interval<double> si(0.,std::numeric_limits<double>::infinity()); |
|||
Some transcendental functions (<code>exp, log, sin, etc</code>) can be applied to intervals. |
|||
Intervals can be printed to the screen if we include an extra header file, |
|||
#include<boost/numeric/interval/io.hpp> |
|||
... |
|||
std::cout << unit << std::endl; |
|||
To check whether an element belongs to the interval use the in function |
|||
if ( in( 0.5, unit) ) cout << "0.5 is in " << unit; |
|||
Intervals are considered closed (e.g. according to the 'in' function). Infinite intervals are also closed in the sense that they formally contain the floating point infinity. |
|||
Here it is an example http://codepad.org/nwo9Z2Be. |
|||
The library seems simple at first sight but the details (and the implementation) are quite tricky, for example regarding the proper rounding of the bounds or whether empty intervals are allowed or not. Some special cases are handled in the library by using special (template) options (policies). Notes on the design can be found [http://www.lri.fr/~melquion/doc/06-tcs-rnc5.pdf here]. |
|||
The library doesn't consider union of intervals, since it is not always another interval. The closest thing to the union operator is the 'hull' function, which gives the union of two intervals if they overlap. |
|||
Recently, another interval library, called [http://www.boost.org/doc/libs/1_46_1/libs/icl/doc/html/index.html Interval Container Library (ICL)] was accepted and included in Boost. It has more abstraction, open/closed, multidimensional, and, in particular, interval sets and map (containers specific for intervals, which effectively work as interval unions, for example). |
|||
=== Boost.Random === |
|||
[http://www.boost.org/doc/libs/1_40_0/libs/random/index.html Boost.Random] is a library that allows us to pick random numbers from specific statistical distributions. It has the flexibility to separate the abstraction of the distribution and the randomness source. |
|||
For example, you can choose a deterministic random source (pseudo-random generator) and plug in it to a specific distribution from where you can pick up numbers one at a time. For some systems Boost can even provide non-deterministic randomness from external devices and then plug it to your desired abstract distribution (not covered here). |
|||
A typical usage is the following |
|||
#include<boost/random.hpp> |
|||
... |
|||
boost::mt19937 mt; //this is the randomness source |
|||
boost::uniform_real<> rm1p1(-1.,1.); //this is the abstract distribution |
|||
boost::variate_generator<boost::mt19937&, boost::uniform_real<> > die_real(mt, rm1p1); //glue randomness with distribution |
|||
clog<< die_real() << endl; // pick up a random real from distribution |
|||
as you can see, the distribution (uniform in the interval [-1:1) ) is independent of the randomness source (Mersenne Twister #19937). The library allows to specify a certain set of predefined and user defined distribution. |
|||
This library (with minor changes) will become part of the standard library of C++0x. |
|||
(for complete code of examples: http://codepad.org/Uj56naPF). |
|||
==== Normal distribution ==== |
|||
We can change the distribution to a normal one (gaussian) with a certain mean and deviation |
|||
using boost::mt19937; using boost::normal_distribution; using boost::variate_generator; |
|||
normal_distribution<> n_10_4(10., 4.); |
|||
variate_generator<mt19937&, normal_distribution<> > die_normal(mt, n_10_4); //glue |
|||
clog<< die_normal() <<endl; //pick a random real from normal distribution |
|||
As a side note, it is well known that when we generate a normal distributed random number from a uniform one, we get a second normal random number for the same price. In other words, generating normal numbers one by one is less efficient than generating pairs. The Boost implementation of normal_distribution takes advantage of this but gives the illusion that the numbers are generated one at time, internally only pairs are generated. The trick is that the second number is cached and not calculated in the next round (i.e. second call of die_normal()). In the third call a new pair is generated, the first number of the pair is returned and the second one is stored for the following call. |
|||
The omitted template parameter of the distribution is the type of the returned value, by default it is <code>double</code> but it can be other floating point number (namely float). In fact other useful distributions can return multiple numbers. For example, the following code returns two numbers corresponding to a random point in a 2-sphere: |
|||
using boost::rand48; using boost::uniform_on_sphere; |
|||
rand48 rand; |
|||
uniform_on_sphere<> us(2); |
|||
variate_generator<rand48&, uniform_on_sphere<> > die_place_in_the_world(rand, us); |
|||
clog<< v2[0]<<" "<<v2[1]<<endl; |
|||
v2 is normalized (v2[0]*v2[0]+v2[1]*v2[1]==1). The dimension of the sphere can be arbitrary. The example also shows how to change the source of randomness (a.k.a. the underlying random number generator). In this case we change a particular Meressene Twister generator (mt19937) by a particular linear congruential (rand48). |
|||
==== Random Sources ==== |
|||
The list of possible random generators is given [http://www.boost.org/doc/libs/1_40_0/libs/random/random-generators.html here]. For extra control on the number generator, the parameters can be passed as template parameters, for example <code>mt19937</code> and <code>ecuyer1988</code> are internally defined as |
|||
typedef mersenne_twister<uint32_t,624,397,31,0x9908b0df,11,7,0x9d2c5680,15,0xefc60000,18, 3346425566U> mt19937; |
|||
typedef linear_congruential<int32_t, 40692, 0, 2147483399, 0>, 0> ecuyer1988; |
|||
All the random generators mentioned are deterministic and return the same sequence after each run unless the seed changes. To change the seed the object can be initialized with an integer that represents the seed or it can be changed at any point with the <code>.seed</code> member function. |
|||
boost::mt19937 mt(123); //this is the randomness source |
|||
... |
|||
mt.seed(321); |
|||
==== Distributions ==== |
|||
A list of possible distributions is given [http://www.boost.org/doc/libs/1_40_0/libs/random/random-distributions.html here]. Since it is very common, it worths mentioning that integer uniform distributions can be also defined. |
|||
uniform_int<> six(1,6); // distribution that maps to 1..6 |
|||
variate_generator<boost::mt19937&, uniform_int<> > die(mt, six); |
|||
int x = die(); // simulate rolling a die |
|||
An interesting example is the Bernoulli distribution, whose associated generator returns a bool (true or false) with a given probability. |
|||
bernoulli_distribution<> mostly_true(0.75); //will return true 75% of the time |
|||
variate_generator<boost::mt19937&, bernoulli_distribution<> > acceptance(mt, mostly_true); |
|||
if(acceptance()==true) call_accept(); else call_reject(); |
|||
Another distribution of integers is the [http://en.wikipedia.org/wiki/Binomial_distribution binomial distribution] (related to the Bernoulli distribution) |
|||
binomial_distribution<> fortydrops(40,0.5); //a fair coin will be tossed 40 times. |
|||
which returns the (random) number of face outcomes of 40 coin tossings. (This distribution is undocumented.) |
|||
==== Note on syntax (<code>&</code>)==== |
|||
The ampersand (&) at the end of the random source type in the template parameter (boost::mt19937&) is important if the function where the line is, is reentrant (called many times). Because it will guaranty that the seed is not reset between calls. (It is possible that we always want the same random sequence when calling a function, but that is not very common.) |
|||
It is also important not to put it when the random generator is declared in one line. |
|||
using namespace boost; |
|||
variate_generator<mt19937, uniform_real<> > die_real(mt19937(), uniform_real<>(-0.2,0.2)); |
|||
=== Boost Math Constants === |
|||
(Not to be confused with Boost Units Constants.) This is not a library per se, is an internal part of Boost.Math. It just defines a few mathematical constants numerically to some high literal precision. The advantage compared to other libraries is that it provides the numerical constants with arbitrary type (always based in a hard-coded 100 digit expansion). |
|||
For example if we want pi to 'float' accuracy we call |
|||
#include <boost/math/constants/constants.hpp> |
|||
float pi_float = boost::math::constants::pi<float>(); |
|||
if we want the truncation at 'long double' precision we call |
|||
using boost::math::constant::pi; |
|||
long double pi_ldouble = pi<long double>(); |
|||
the translation to the right precession is made during compilation. |
|||
Besides pi, there are several other constants predefined template as listed in the [http://www.boost.org/doc/libs/1_41_0/libs/math/doc/sf_and_dist/html/math_toolkit/toolkit/internals1/constants.html reference page]: |
|||
Why as many as 100 digits? Because this works also with user defined numeric type that can exploit extended precision (for example 100 digits or more), such as mpfr numbers. |
|||
#include <iostream> |
|||
#include <boost/math/bindings/mpfr.hpp> |
|||
#include <boost/math/special_functions/gamma.hpp> |
|||
#include <boost/math/constants/constants.hpp> |
|||
int main(){ |
|||
mpfr_class::set_dprec(500); // 500 bit precision |
|||
mpfr_class v = boost::math::tgamma(sqrt(mpfr_class(2))); |
|||
mpfr_class pi = boost::math::constants::pi<mpfr_class>(); |
|||
std::cout << std::setprecision(50) << v << std::endl; |
|||
std::cout << std::setprecision(50) << pi << std::endl; |
|||
} |
|||
// output: |
|||
// 8.8658142871925912508091761239199431116828551763805e-1 |
|||
// 3.1415926535897932384626433832795028841971693993751 |
|||
(the example needs GMP, MPFR, and GMPFRXX libraries installed to work, it also shows that many Boost.Math functions work with multiprecision types and not only with doubles). |
|||
=== Boost Math Special Floating Points === |
|||
There some tricks to determine special floating point conditions (these include 'nan', 'inf' and zero). Unfortunately there is no standard way to do it since it depends on the system. In the systems in which such detection is possible Boost implements a series of functions to determine such special conditions: |
|||
#include<boost/math/special_functions/fpclassify.hpp> |
|||
using namespace boost::math |
|||
isinf |
|||
isnan |
|||
It can also detect normal conditions |
|||
isfinite |
|||
isnormal |
|||
The difference between the last two is that zero (0) is not considered a normal floating point (but 0 is finite). |
|||
Due to compatibility with macros defined elsewhere it is recommended to enclose the function in parenthesis |
|||
#include<boost/math/special_functions/fpclassify.hpp> |
|||
(boost::math::isnan)(x) // test that the value held by x represents a number, for example 5.2, 0., or inf. |
|||
Additionally, the function <code>fpclassify<></code> returns an integer code (FP_ZERO, FP_NORMAL, FP_INFINITE, FP_NAN, FP_SUBNORMAL) that tells what kind of floating point is passed, as documented [http://www.boost.org/doc/libs/1_41_0/libs/math/doc/sf_and_dist/html/math_toolkit/special/fpclass.html here]. |
|||
(The distinction between -inf and +inf can still be done with the comparison to zero). |
|||
Note that the best way to filter <code>inf</code> and <code>nan</code> together is to test agaist <code>isfinite</code> condition: 'inf' is not finite and neither is <code>nan</code>. (technically, <code>nan</code> is not infinite either). |
|||
=== Boost Math Tools === |
|||
Many math functions in Boost.Math use some [http://www.boost.org/doc/libs/1_47_0/libs/math/doc/sf_and_dist/html/math_toolkit/toolkit.html auxiliary algorithms], some of them are very useful, this is a synopsis of the root finding algorithms without derivatives. There many other libraries that provides this funcionalities, the advantange of this implementation is that the algorithm works with arbitrary types, for example extended precision integers or Boost.Units quantity types. |
|||
To be general, lets assume one has a functoid f of type F that can be evaluate at any point t of type T, and that the result of F(t) is a numeric quantity that can be ordered, i.e. f(t1) < f(t2) or f(t2) < f(t1). All these functions return a bracket contained in a pair. F can be a function pointer or a class with operator()(T t) defined. |
|||
#include <boost/math/tools/roots.hpp> |
|||
using namespace boost::math::tools |
|||
bisect(f, min, max, tol[, max_iter[, policy] ]); // needs to be bracket initaly |
|||
toms748_solve(f, a, b, tol[, max_iter[, policy] ]); // solves given the known initial bracket |
|||
toms748_solve(f, a, b, fa, fb,tol[, max_iter[, policy] ]); // idem but the initial values are known (saves two evaluations) |
|||
bracket_and_solve_root(f, guess, factor, rising, tol[, max_iter[, policy] ]); |
|||
The last does both a bracket search and uses it to obtain a root via the toms748, needs f to be monotonic and whether it is a raising function or not. The parameter factor is used to scale the value of the guess in order to first bracket the root. Therefore the guess bracket only does so within positive or negative numbers depending on guess. (Although restrictive in many cases the sign of the root is known in advance.) Internally uses the [http://www.netlib.org/toms/748 TOMS748 algorithm], i.e. toms748_solve. |
|||
=== Boost Basic Linear Algebra Math (Boost.Ublas) === |
|||
Boost.Ublas is a special kind of library, since it is an interface library. Ublas interfaces the good old BLAS library to modern C++ notation and without loss of efficiency. |
|||
First a little bit of plain old BLAS to produce the following operation |
|||
<math>C \leftarrow \alpha A B + \beta C </math> |
|||
// c++ dgemm_example.cpp -lblas |
|||
#include<iostream> |
|||
#include<cassert> |
|||
using std::cout; using std::endl; |
|||
// external declaration (to bind with fortran) |
|||
extern "C" { |
|||
void dgemm_(const char* transa, const char* transb, const int* l, const int * n, const int* m, |
|||
const double* alpha, const void* a, int* lda, |
|||
void* b, int* ldb, |
|||
double* beta, void* c, int* ldc); |
|||
} |
|||
int main(){ |
|||
// declaration |
|||
int Ac = 3; |
|||
int Ar = 2; |
|||
double* A = new double[Ac*Ar]; |
|||
int Bc = 2; |
|||
int Br = 3; |
|||
double* B = new double[Bc*Br]; |
|||
int Cc = 2; |
|||
int Cr = 2; |
|||
double* C = new double[Cc*Cr]; |
|||
// definition (and printing) |
|||
cout << "A = \n"; |
|||
for(unsigned ic = 0; ic != Ac; ++ic){ |
|||
for(unsigned ir = 0; ir != Ar; ++ir){ |
|||
cout << (A[ic*Ar + ir] = ic + ir) << ", "; |
|||
} |
|||
cout << "\n"; |
|||
} |
|||
cout << "B = \n"; |
|||
for(unsigned ic = 0; ic != Bc; ++ic){ |
|||
for(unsigned ir = 0; ir != Br; ++ir){ |
|||
cout << (B[ic*Br + ir] = ic + ir) << ", "; |
|||
} |
|||
cout << "\n"; |
|||
} |
|||
cout << "C = \n"; |
|||
for(unsigned ic = 0; ic != Cc; ++ic){ |
|||
for(unsigned ir = 0; ir != Cr; ++ir){ |
|||
cout << (C[ic*Cr + ir] = ic + ir) << ", "; |
|||
} |
|||
cout << "\n"; |
|||
} |
|||
// function call |
|||
{ |
|||
// parameters |
|||
double alpha = 1.; |
|||
double beta = 0.; |
|||
char transA = 'N'; |
|||
char transB = 'N'; |
|||
cout << "C = alpha*A^" << transA << "*B^" << transB << " + beta*C, with alpha = " << alpha << " and beta = " << beta << "\n"; |
|||
int ldA = Ar; |
|||
int ldB = Br; |
|||
int ldC = Cr; |
|||
// runtime dimension check |
|||
assert( Ac == Br ); int k = Cc; |
|||
assert( Ar == Cr ); int n = Cr; |
|||
assert( Cc == Bc ); int m = Ar; |
|||
// value to reference transformation (to bind with fortran) |
|||
dgemm_(&transA, &transB, &n, &m, &k, &alpha, A, &ldA, B, &ldB, &beta, C, &ldC); |
|||
} |
|||
// print output |
|||
cout << "new C = \n"; |
|||
for(unsigned ic = 0; ic != Cc; ++ic){ |
|||
for(unsigned ir = 0; ir != Cr; ++ir){ |
|||
cout << C[ic*Cr + ir] << ", "; |
|||
} |
|||
cout << "\n"; |
|||
} |
|||
return 0; |
|||
} |
|||
The previous code has many drawbacks, in fact all the operation have a very low level of abstraction. For these reasons, several interfaces have been created cblas (cblas_dgemm) in the context of C, and gsl_blas_dgemm in the context of GSL. Both are partial solutions to the abstraction problem. |
|||
Now let's see how to perform the same operations with Ublas. |
|||
#include <boost/numeric/ublas/matrix.hpp> |
|||
#include <boost/numeric/ublas/io.hpp> |
|||
using std::cout; using std::endl; |
|||
int main () { |
|||
using namespace boost::numeric::ublas; |
|||
matrix<double> A(3, 3); |
|||
for (unsigned i = 0; i != A.size1(); ++ i) |
|||
for (unsigned j = 0; j != A.size2(); ++ j) |
|||
A(i, j) = 3. * i + j; |
|||
cout << "A = " << A << endl; |
|||
matrix<double> B(3, 3); |
|||
for (unsigned i = 0; i != B.size1(); ++ i) |
|||
for (unsigned j = 0; j != B.size2(); ++ j) |
|||
B(i, j) = 3. * i + j; |
|||
cout << "B = " << B << endl; |
|||
matrix<double> C(2, 2); |
|||
for (unsigned i = 0; i != C.size1(); ++ i) |
|||
for (unsigned j = 0; j != C.size2(); ++ j) |
|||
C(i, j) = 3. * i + j; |
|||
cout << "C = " << C << endl; |
|||
C = 0.5*prod(A, B) + 1.*C; |
|||
cout << "new C = " << C << endl; |
|||
return 0; |
|||
} |
|||
==== Using uBlas on existing code together with C-arrays (shallow_array_adaptor) ==== |
|||
uBlas matrix and vector can be adapted to programs that already use raw C-arrays, by passing the address of the memory to be managed. The key is to #define BOOST_UBLAS_SHALLOW_ARRAY_ADAPTOR before #includeing the uBlas headers. After that a third optional parameter is passed to the matrix or vector declaration. Note that the data in memory is not copied but managed by uBlas. |
|||
#if compile |
|||
time c++ $0 -o $0.x && $0.x $@; exit |
|||
#endif |
|||
#define BOOST_UBLAS_SHALLOW_ARRAY_ADAPTOR //activate array adaptor |
|||
#include <boost/numeric/ublas/vector.hpp> |
|||
#include <boost/numeric/ublas/matrix.hpp> |
|||
#include <boost/numeric/ublas/io.hpp> |
|||
#include <iostream> |
|||
using std::cout; using std::endl; |
|||
int main() { |
|||
using namespace boost::numeric::ublas; |
|||
unsigned N = 4, M = 5; |
|||
double *data_matrix = new double[N*M]; |
|||
data_matrix[1] = -3.14159; |
|||
// Create a boost ublas matrix using the data_matrix pointer as storage. |
|||
matrix<double, row_major, shallow_array_adaptor<double> > A( |
|||
N, M, shallow_array_adaptor<double>(N*M,data_matrix) |
|||
); |
|||
A(3, 3) = 3.0; |
|||
A(0, 2) = 4.0; |
|||
// Should print -3.14159, 3 |
|||
cout << A(0,1) << ", " << A(0,2) << ", " << A(3,3) << endl; |
|||
cout << "A = " << A << endl; |
|||
cout << data_matrix[2] << endl; |
|||
return 0; |
|||
}; |
|||
=== Boost.Accumulators === |
|||
During a computer simulation it is often useful to 'collect' or 'accumulate' statistics. This accumulated statistics can be reported at the end of the simulation or used immediately on-the-fly to monitor or check for convergence or steady states. For example, the accumulated statistics can be the mean, variance and extremes of a certain variable. |
|||
Writing code that does this is indeed trivial but since it is not the main function of the code it can interfere with the normal flow of the code, we have to introduce many new variables that keep track of the statistics, such as counter, total sum, sum of squares, etc. Then these quantities have to be combined in certain ways when reporting the results. |
|||
The [http://www.boost.org/doc/libs/1_40_0/doc/html/accumulators/user_s_guide.html Boost.Accumulators] library provides a consistent way of collecting this information. Accumulators can also keep track of extreme values (min/max), element count and other user defined features. |
|||
The first step is to create an accumulator object with certain 'features'. For example if we want to have an accumulator that reports the mean value, the second moment and min and max values we declare |
|||
#include <boost/accumulators/accumulators.hpp> |
|||
#include <boost/accumulators/statistics/mean.hpp> // or other stat features |
|||
... |
|||
using namespace boost::accumulators; |
|||
accumulator_set<double, stats<tag::mean, tag::variance, tag::min> > acc; |
|||
The first template parameter is the type of the data elements that will be feed to the accumulator, the second argument is the list of features or statistics that the will be accumulated. Note that the list of features is specified at compile time, therefore the library is able to know the number of parameters that this particular accumulator will store during book-keeping. For example, if the only feature is the 'mean' value the sum of squares is not necessary and never computed (only the count and the sum of elements is stored). On the other hand, if tag::min or tag::max are the only features it is not necessary to keep the count internaly, etc. |
|||
As a second step we need to actually accumulate the information, this is done simply by enclosing the data in the parenthesis operator of the accumulator, for example in a loop |
|||
for( ... ){ |
|||
double data_point = function(...); |
|||
acc(data_point); |
|||
... |
|||
} |
|||
As third and final step and after some (or all) data is collected we need to extract the final result that the object was comited to accumulate. To extract the information we must apply a function with the name of the statistics we wanted, |
|||
std::cout<<"mean: "<<mean(acc)<<std::endl; |
|||
std::cout<<"variance: "<<variance(acc)<<std::endl; |
|||
only statistics or features that were declared are allowed to be applied to the accumulator, otherwise a compile error happens. |
|||
std::cout<<"max: "<<max(acc)<<std::endl; // <-- compile error, max feature was not requested when acc was declared |
|||
The accumulators also support covariance, i.e. how two or variables change together. |
|||
==== Special numerical operations (boost::numeric::functional) ==== |
|||
Boost.Accumulators and statistics in general assume a set of numerical operations on the accumulated objects that may be larger than the number of operators defined for that object, or may change the meaning of defined operations in the context of statistics. For example, if we accumulate <code>int</code>egers, what is supposed to be the average of integers, is the average a real number? or another integer? In any case we should be explicit because otherwise the native C operations will perform integer-division (where 3/2==1). |
|||
For Boost.Accumulators, these special operations are defined in the namespace [http://www.boost.org/doc/libs/1_40_0/doc/html/accumulators/user_s_guide.html#accumulators.user_s_guide.the_accumulators_framework.extending_the_accumulators_framework.operators_ex boost::numeric::functional]. |
|||
Another example is an object that doesn't carry information on how to divide by an integer, for example std::complex<double>. In these cases we have to extend the operations available to those objects. |
|||
An interesting case that doesn't define division by integer is boost::units::quantity<Unit>. |
|||
In this example, we specify how the division by an integer is carried on: |
|||
namespace boost {namespace numeric{namespace functional{ |
|||
using namespace boost::units; |
|||
template<class Unit, typename Y> struct quantity_tag{}; |
|||
template<class Unit, typename Y> struct tag<quantity<Unit,Y> >{ |
|||
typedef quantity_tag<Unit,Y> type; |
|||
}; |
|||
template<typename Left, typename Right> |
|||
struct average<Left, Right, quantity_tag<typename Left::unit_type, typename Left::value_type>, void>{ |
|||
typedef Left result_type; |
|||
result_type operator()(Left & left, Right & right) const{ |
|||
return left/(typename Left::value_type)right; |
|||
} |
|||
}; |
|||
}}} |
|||
(This is a technique is known as 'static tag dispatching'.) |
|||
Fortunately this step is not necessary for std::complex<double> and std::vector<double> (taken as a vector in a linear space) if we include these adaptor headers before the accumulator headers: |
|||
#include <boost/accumulators/numeric/functional/complex.hpp> |
|||
#include <boost/accumulators/numeric/functional/vector.hpp> |
|||
These headers tell the library how to average complex numbers and vector of numbers. |
|||
==== Other Resources ==== |
|||
http://www.bnikolic.co.uk/boostqf/accumulators.html |
|||
Official Manual http://boost-sandbox.sourceforge.net/libs/accumulators/doc/html/index.html |
|||
=== Boost.Format === |
|||
The C-way to print formatted variable numbers is <code>printf("...format string...", a, b, c, ...)</code>. This function call tends to have error since we have to make sure that the format string agrees with the actual type of the variable. In other words, printf is not type-safe. The C++ (which still allows to call printf) changed the preferred syntax to the std::cout (and std::cerr) shift(<<) operator: |
|||
std::cout<<"text "<<float_number<<" text"<<std::flush<<" more text "<<integer_number<<std::endl; |
|||
The user doesn't have to worry about the type of the variable that is printed and the operator<< can be overloaded to print user defined classes. This was, in my opinion, a great improvement. Yet, sometimes it is necessary to specify the format in a way that is constructed at runtime or the format is specified in a separate part of the code. Enter, Boost.Format. |
|||
[http://www.boost.org/doc/libs/1_42_0/libs/format/doc/format.html Boost.Format] allows to specify printing format in a similar way as printf used to do, with similar and even greater flexibility, But in a way that is type-safe. |
|||
The format is specified by a string were the entries are represented by a number (starting at 1) enclosed in percent symbol, like %1%, %2%, %3 etc. After the format is specified the variables (which can be any type in principle) are passer via the operator % (not the string %). For example |
|||
#include<boost/format.hpp> |
|||
... |
|||
cout<< boost::format(" %1% | %2% | %3% ") % a % b % c << "rest of the line" <<endl; |
|||
If the number of variables passed after 'operator%' is different from the maximum fields index present in the format string the program will give an error (i.e. raise an exception of type 'boost::io::too_few_args' or 'boost::io::too_many_args' respectively). |
|||
So far, in the example, the use are no different than the normal operator<< in C++. The first difference to note is that |
|||
1. The format string can be specified at runtime |
|||
std::string frmt; |
|||
... //assign frmt from user input, e.g. frmt = "%1% | %2%"; |
|||
cout<<boost::format(frmt) % a % b ; |
|||
2. Format can repeat fields |
|||
cout<< boost::format("%1%, %2% | %1%") % a+b % c ; //a+b is printed twice, e.g. '4, 5, 4' |
|||
The advantage with respect to printf is that the boost::format is flexible but still type-safe. |
|||
3. For numeric types boost::format (like printf) can specify the format of the numeric output. |
|||
cout<< boost::format("%1$+05u") % 234; //prints '+0234' |
|||
For variables of any type the field number is enclosed in '%', for numeric fields with specific format the field number is enclosed by '%' on the left and '$' on the right followed by the numeric format string. This numeric format string follows the [http://www.opengroup.org/onlinepubs/007908799/xsh/fprintf.html printf conventions] like in these [http://www.cplusplus.com/reference/clibrary/cstdio/printf/ examples]: |
|||
%[flags][width][.precision][length]specifier |
|||
For example '%1$+05u' prints an integer (unsigned or int) field using (at least) '5' characters (sign included) completing with left zeros '0'. Plus symbol '+' means that the sign is always specified even if the number is positive. If the field is not an integer (for example, it is double or string) the numeric format is ignored and no error is reported. If the integer needs more than three characters it will use them. |
|||
Invalid format strings give an error at run time. So it is a good idea to test the formatting in a small program, for example here http://codepad.org/BINppZ4i. |
|||
==== Lexical Cast (convert to string) ==== |
|||
In [http://www.boost.org/doc/libs/1_46_0/libs/conversion/lexical_cast.htm <code>boost/lexical_cast.hpp</code>] there is a function that converts any type of variable in any other as long as the the types can be used as output or inputs in a stream. A typical use is |
|||
double a = 5.123; |
|||
std::string a_string = boost::lexical_cast<std::string>(a); |
|||
assert( a == "5.123"); |
|||
The function is very useful for converting any variable in a string, which can by manipulated as such. Unfortunately that simple function does not control the format in which numbers are converted to string. A combination of the function boost::str and the boost::format object can by used to overcome this limitation: |
|||
double a = 5.123; |
|||
std::string a = boost::str(boost::format("%011.2f")%a); // str(format("%011.2f")%a); |
|||
assert( a == "00000005.12" ); |
|||
== Boost.Units == |
|||
Most physical quantities have units, if we don't take proper care of units, [http://www.cnn.com/TECH/space/9909/30/mars.metric/ bad things will happen]. When programming we usually don't care about the units because we tend to work in consistent systems of units. Sometimes, we need to output some variables in a different system of units that the main program is working on. We do this by doing some adhoc conversion with some multiplication in the output. Other times the input is an different sets of units. |
|||
To give an example, when working in atomic problems it makes complete sense to program in atomic units (au) (Hartree, Bohr, etc.). Sometimes we would like to use input parameters in Angstroms instead or more often we need to output the stress tensor in Pascals (which is not au). |
|||
Physical units are some sort of typing system of physical model, under some physical models it doesn't make sense to sum length and time. On the other hand, if we are working in a relativistic model it does make sense to sum lenght and time (and of course since there is no disticition this is called space-time). So this "typing system" is not unique an depends on the model we are using. |
|||
Once we stablish a model by can give this dimensional information to the program. The ad hoc way is to define conversion factors all over and add comments or same specific names to the variable. But this doesn't guaranties that at some point we will sum and energy for example. This function calculates the area |
|||
double area(double x, double y){return x*y;} |
|||
here we implicitly assume that the two doubles in the argument are lenghts (L), and the return type is an area (L*L) but the program doesn't know. So the subroutine is really a multiplication routine, not an area-calculating routine (it is just an interpretation). Somebody can without your knowledge can modify the routine and return x+y and you will interpret the result as an area. |
|||
So, what is the solution, in C++ we could create classes whose instance will keep track of the units. This runtime approach is very inefficient, suppose we have an array of 1000 length (e.g. measurements) we don't want each of these variables to have memory used with the redundant that all of them are lengths. |
|||
We also want to a routine that is as efficient as the routine that works with built-in double. |
|||
The solution is to store the dimension of the variable in the type it self. Then all the dimensional correctness is enforced at compilation-time and runtime efficiency is preserved. |
|||
The idea is more or less to define type that will have dimensional information in such a way that: |
|||
double_area area(double_length x, double_length y){return x*y;} |
|||
Manually writting all the possible resulting dimensions, handling products sums, etc, is not practical. Here is where Boost.Units comes to the rescue. Boost.Units is a very complicated library that enforces dimensional correctness. Dimensional analysis requires a linear algebra solver, this is implemented in Boost.Units by means of template technology. The [http://www.boost.org/doc/libs/1_39_0/doc/html/boost_units.html official manual] can also be of interest. |
|||
=== Definining Units === |
|||
==== Systems of Units ==== |
|||
To use Boost.Units we don't necessarelly need to define a system of units, most common systems SI and CGS ara already defined. But in order to give the complete example here it is the code to define the [http://en.wikipedia.org/wiki/Atomic_units#Fundamental_units atomic units] (au) system. This section gives an idea of how the library works but it can skipped. |
|||
#include <boost/units/base_unit.hpp> |
|||
#include <boost/units/physical_dimensions/mass.hpp> |
|||
#include <boost/units/physical_dimensions/length.hpp> |
|||
#include <boost/units/physical_dimensions/electric_charge.hpp> |
|||
#include <boost/units/physical_dimensions/angular_momentum.hpp> |
|||
#include <boost/units/physical_dimensions/energy.hpp> |
|||
#include <boost/units/make_system.hpp> |
|||
#include <boost/units/io.hpp> //do not remove: '...has initializer but incomplete type...' error |
|||
namespace boost{ |
|||
namespace units{ |
|||
namespace au{ |
|||
using std::string; |
|||
// taken from http://en.wikipedia.org/wiki/Atomic_units#Fundamental_units |
|||
struct electron_mass_base_unit : base_unit<electron_mass_base_unit, mass_dimension, 1>{ |
|||
static string name() {return ("electron_mass");} |
|||
static string symbol() {return ("me");} |
|||
}; |
|||
struct bohr_radius_base_unit : base_unit<bohr_radius_base_unit, length_dimension, 2>{ |
|||
static string name() {return ("bohr_radius");} |
|||
static string symbol() {return ("a0");} |
|||
}; |
|||
struct elementary_charge_base_unit : base_unit<elementary_charge_base_unit, electric_charge_dimension, 3>{ |
|||
static string name() {return ("elementary_charge");} |
|||
static string symbol() {return ("e");} |
|||
}; |
|||
struct reduced_planck_constant_base_unit : base_unit<reduced_planck_constant_base_unit, a ngular_momentum_dimension, 4>{ |
|||
static string name() {return ("reduced_planck_constant");} |
|||
static string symbol() {return ("hbar");} |
|||
}; |
|||
struct hartree_energy_base_unit : base_unit<hartree_energy_base_unit, energy_dimension, 5>{ |
|||
static string name() {return ("hartree_energy");} |
|||
static string symbol() {return ("Eh");} |
|||
}; |
|||
typedef make_system< |
|||
electron_mass_base_unit, |
|||
bohr_radius_base_unit, |
|||
elementary_charge_base_unit, |
|||
reduced_planck_constant_base_unit, |
|||
hartree_energy_base_unit |
|||
>::type system; |
|||
typedef unit<dimensionless_type, system> dimensionless; |
|||
typedef unit<mass_dimension, system> mass; |
|||
typedef unit<length_dimension, system> length; |
|||
typedef unit<electric_charge_dimension, system> electric_charge; |
|||
typedef unit<angular_momentum_dimension,system> angular_momentum; |
|||
typedef unit<energy_dimension, system> energy; |
|||
static const mass electron_mass; |
|||
static const length bohr_radius, bohr; |
|||
static const electric_charge elementary_charge; |
|||
static const angular_momentum reduced_planck_constant; |
|||
static const energy hartree_energy, hartree; |
|||
} |
|||
} |
|||
} |
|||
The code seems complicated but it has its logic. There are four important blocks. The first is basically a table of names and symbols for the fundamental units in that system, with some abstract information on the physical dimension. The symbols will be used when printing the values to the screen. The numbers on the left are conventional numbers to identify the units internally, any positive number is ok as long as it doesn't clash with other units defined, negative numbers are reserved. (For example, the SI system --included in the library-- uses negative numbers.) The <code>make_system</code> line creates the system of units, and the following block gives short type names to the physical dimension of the fundamental units. Finally last block defines unit constants (magnitudes with value 1) in the system of units. Synonyms can be defined here very easily. (For example, hartree_energy and hartree). |
|||
==== Free standing units ==== |
|||
Sometimes we need to define units that are not part of a larger system of units. In this way we can define units such as electron volts (eV). The quantities with such units will print nicely with the symbol 'eV' after it. At the same time it is useful that these quantities convert automatically to units with the same dimensions in a system: |
|||
namespace boost{namespace units{ |
|||
namespace atomic{ //optional, this free unit can be part or not of the namespace of a system |
|||
struct electron_volt_base_unit : base_unit<electron_volt_base_unit, energy_dimension, 43289 /*unique positive number*/ > { |
|||
static const char* name() { return "electronvolt"; } |
|||
static const char* symbol() { return "eV"; } |
|||
}; |
|||
typedef electron_volt_base_unit::unit_type electron_volt_unit; |
|||
static const electron_volt_unit electron_volt, eV; |
|||
} |
|||
}} |
|||
BOOST_UNITS_DEFINE_CONVERSION_FACTOR( |
|||
boost::units::atomic::electron_volt_base_unit, |
|||
boost::units::atomic::energy, |
|||
double, 0.03674932 // 1 eV = 0.03...E_h |
|||
); |
|||
BOOST_UNITS_DEFAULT_CONVERSION( |
|||
boost::units::atomic::electron_volt_base_unit, |
|||
boost::units::atomic::energy |
|||
); |
|||
use as |
|||
quantity<atomic::electron_volt_unit> ene = 1.*atomic::eV; |
|||
cout<<(double)quantity<atomic::dimensionless>(ene/atomic::hartree)<<endl; //automatic conversion to dimensionless |
|||
==== Scaled units ==== |
|||
All units (including free standing units) can defined scaled counterparts, moreover, by default a standard SI prefix (kilo/k, Mega/M, Giga/G) is added to the name of the variable if the scale is a standard power of ten (i.e. 10, 100, 1000, 1000000, but not 10000). Supported prefixes are [http://www.boost.org/doc/libs/1_42_0/doc/html/boost_units/Reference.html#header.boost.units.systems.si.prefixes_hpp zepto (-21 zeros), atto (-15), pico (-12), nano (-9) ... peta (15), exa (18), zetta (21), yotta (24)]. ([http://www.telegraph.co.uk/science/science-news/7352204/Hella-number-scientists-call-for-new-word-for-1000000000000000000000000000.html Hella (27)] not supported but can be added easily :) ) |
|||
typedef make_scaled_unit<si::pressure, scale<10, static_rational<9> > >::type gpa_unit; |
|||
static const gpa_unit gigapascal, GPa; //use as quantity<gpa_unit> P = 10.*GPa; |
|||
(si::pressure is defined in boost/units/system/si/pressure.hpp). |
|||
automatically prints to '10 GPa' ('Pa' --for pascal-- string is defined in boost/units/system/si/io.hpp, the 'G',--for giga-- is added by the scaled unit character). |
|||
A similar code can be used to define MeV (megaelectronvolt) automaticaly based on the definition of eV in the previous section. |
|||
=== Defining physical quantities === |
|||
==== Quantities with Fundamental Units ==== |
|||
Using fundamental quantities in a given system is almost trival, first we will easy the notation by specifing that we will use Boost.Units intensively and the system of units. |
|||
#include<boost/units.hpp> |
|||
using namespace boost::units; //for quantity |
|||
The we tell the compiler that a certain quantity has physical dimensions and how to store the actual value. |
|||
quantity<au::energy, double> a; |
|||
so defined 'a' will have the value 1 in the fundamental units of the system (Hartree). Note that at this point the program doesn't know what a Hartree is, it is just some abstract unit of a certain dimension called "energy". The we can assign a value |
|||
a = 30.*au::hartree; |
|||
We can print the values, by using the usual <code>cout<< "a = "<< a <<endl;</code>. The output will read |
|||
'a = 30 Eh' |
|||
where "Eh" is the symbol we defined in our table. |
|||
In the same way we can define a quantity with unit of length |
|||
quantity<au::length> b = 5.*au::bohr; |
|||
Note that we can omit 'double' since this is assumed to be the default. |
|||
We declared 'a' to have units of energy, and this can not change in the future. Here it is the first implicance of using Boost.Units, if we try to do |
|||
a = b; // compile error here |
|||
which is wrong, we will get an error... from the compiler! We don't have to wait until the program is executed to realize what we just did. In fact the program will not compile until we fix this dimensional inconsistency. |
|||
Here it the main difference with the typical procedure, we could have used the type double for a and b; but then the previous line would have been completelly ignored. |
|||
Although the previous line was wrong, different can be combined in correct ways for example |
|||
std::cout<< "a * b = "<< a * b << std::endl; |
|||
will print <code>150. Eh a0</code>. As expected, other combinations, like <code>a + b</code> will fail to compile. |
|||
==== Quantities with Derived Units ==== |
|||
Obviously that to preserve the dimensional consistency de result of 'a / b' has to have units of force. So, if we want to store the result we need to declare a new variable with units of force, this is achieved that to a the Boost.Units metafunction |
|||
quantity<unit<force_dimension, atomic> > c = a/b; |
|||
or we can shorten the syntax with using |
|||
typedef unit<force_dimension, atomic> atomic_force_unit; |
|||
quantity<atomic_force_unit> c = a/b; |
|||
There are many physical dimensions predefined in <code>boost/units/physical_dimensions/</code>. For example: acceleration, area, magnetic_flux, stress, surface_tension, velocity, volume and many others. |
|||
Still, there maybe some quantities that we need to define that don't have a name. For example if we need something with units of inverse energy times inverse volume we don't have a predefined dimension for that, although this is the units of the electronic density of states. In this case we need to define the quantity ourselves. Boost.Units uses the syntax of Boost.MPL to express such composition. For example |
|||
using mpl::times; using mpl::divides; |
|||
typedef boost::mpl::divides<dimensionless_type, times<lenght_dimension, volume_dimension>::type>::type dos_dimension; |
|||
typedef unit<dos_dimension, atomic::system> atomic_dos_unit; //or one_over_hartree_Bohr3 |
|||
quantity<atomic_dos_unit> mydos = 1./en/vol; |
|||
A big shortcut could that avoid all the previous, be to use |
|||
BOOST_AUTO(c, a/b); // or 'auto c=a/b;' in C++0x (compile with 'g++-4.4 -std=c++0x') |
|||
to let the compiler the compiler "detect" the correct type of 'c' (which can have a very complicated specification) but this can lead to problems as we loose track of the units of the quantities involved. |
|||
==== Mathematical Funtions on Physical Quantities [[http://www.boost.org/doc/libs/1_40_0/doc/html/boost_units/Reference.html#header.boost.units.cmath_hpp <code>boost/units/cmath.hpp</code>]] ==== |
|||
Many special functions for floating point numbers (not quantities) are defined in the header file <code>cmath</code> for example: |
|||
#include<cmath> |
|||
... |
|||
double a = sqrt(b); |
|||
double b = pow(b, 3); //see below why this can not be used with units |
|||
In the same way Boost.Units implement its own cmath header (called boost/units/cmath.hpp) to take care of unit correctness: |
|||
* Square Root (<code>sqrt</code>) |
|||
#include<boost/units/cmath.hpp> |
|||
... |
|||
quantity<unit<area_dimension, au::system> > square_area = 64*au::bohr*au::bohr; |
|||
quantity<unit<lenght_dimension, au::system> square_side = boost::units::sqrt(square_area); // square_size = 8*au::bohr |
|||
or much shorter |
|||
using boost::units::sqrt; |
|||
auto square_side = sqrt(square_area); |
|||
In general the 'using' statement is not even necessary since the namespace of sqrt is deduced from the argument type. 'auto' is a keyword of C++0x, if it not used then we have to explicitly declare the quantity with the right units (and dimension). |
|||
* Integer Power (<code>pow</code>) |
|||
In the case of power it is different because to track units at compile-time the value of the integer power must be a constant (i.e. a value known at compile-time), therefore it can not be passed as a second argument, instead it is passed as a template argument |
|||
auto cube_volume = boost::units::pow<3>(cube_side); |
|||
* Rational Power (<code>root</code> and <code>pow</code>) |
|||
It is not uncommon that sometimes rational power of units are used in intermediate calculations, the first example is sqrt above. More generally: |
|||
auto cube_side = boost::units::root<3>(cube_volume); |
|||
The most general case is handled with |
|||
auto cube_face = boost::unit::pow< boost::unit::static_rational<2,3> >(cube_volume); |
|||
or for short: |
|||
auto cube_face = pow<2>(root<3>(cube_volume)); |
|||
* Trigonometric functions (<code>sin, cos, atan, atan2, ...</code>) |
|||
Trigonometric functions are implemented for arguments for which its units are of plane_angle_dimension. |
|||
Using units for angles can add an extra layer of type checking but it can be counter-intuitive, specially if you are already used to work in radians. If the argument of these functions is dimensionless then the standard (e.g. std::sin is called). |
|||
* Absolute value (<code>fabs</code>) |
|||
quantity<si::length> distance = fabs(x1 - x2); |
|||
==== Use existing numerical functions, subvert dimensional checking (<code>::from_value</code> and <code>.value</code>) ==== |
|||
An arbitrary user defined function |
|||
double scale(double x, double factor){ ... ; return result;} |
|||
does not work by default with quantities, unless the quantity is dimensionless. Even if we managed to make it work by some kind of trick we would still like to keep track of the resulting dimension, which may o may not be of the same units as the input. |
|||
For simplicity let's assume that the output must have the same dimension as the input. The we can overload the function in this way and internally call the good old function |
|||
quantity<si::length> scale(quantity<si::length> x, double factor){ |
|||
return quantity<si::length>::from_value( scale(x.value(),factor) ); |
|||
} |
|||
Note that 'from_value' and 'value' subvert the dimensional checking but can be nonetheless necessary for forcing the internal arithmetic. |
|||
The function can be generalized to arbitrary quantity dimension via templates. Even more challenging is to determinine (also from template programming) the correct type for the returned quantity. For this metafuctions such as <code><nowiki>power_typeof_helper<Unit, static_rational<P,Q> >::type</nowiki></code> and <code>mpl::times</code> can be useful. |
|||
==== Display internal unit types for debugging (<code>simplify_typename</code>) ==== |
|||
Boost Units relies in creating an extremely complicated type system, which in principle should be transparent to us since as long as we know the dimensionality and the system units we are working with. In general we never refer to the complicated types that the library generates. In reality, sometimes we need to know what is the exact type of a certain variable, for example for debugging. There is a function called simplify_typename that prints the exact type of the variable (and gets rid of the ubiquitous 'boost::units' string). |
|||
#include<boost/units/io.hpp> |
|||
... |
|||
quantity<atomic::volume> v(4.*pow<3>(atomic::bohr)); |
|||
cout<<boost::units::simplify_typename(v)<<endl; |
|||
cout<<v<<endl; |
|||
In this example we know that the exact type of the variable 'v' is |
|||
quantity<unit<list<dim<length_base_dimension, static_rational<3l, 1l> >, dimensionless_type>, homogeneous_system<list<atomic::electron_mass_base_unit, list<atomic::elementary_charge_base_unit, list<atomic::reduced_planck_constant_base_unit, list<atomic::coulomb_force_constant_base_unit, list<atomic::boltzman_constant_base_unit, dimensionless_type> > > > > >, void>, double> |
|||
and its value is |
|||
4 (a_0^3) |
|||
Internally it is based on abi::__cxa_demangle from abicxx.h, which is a more general way to print type names at runtime. However 'simplify_typename' also simplifies the names by removing the 'boost::units' string. |
|||
=== Display quantities === |
|||
Boost.Units supports pretty well the display of quantities with units: |
|||
cout<< "E = " << E << endl; // prints E = 3.55791e-12 m^2 kg s^-2 |
|||
Depending on the system units can try to simplify the units before printing; this done if we #include<boost/units/systems/si/io.hpp>, in this case the printing is simplified to |
|||
... // prints E = 3.55791e-12 J |
|||
(J is for Joule unit). |
|||
Sometimes is necessary to separate the actual numerical value and the unit tag of the string representation of a quantity, for example when printing data files. We want columns of numbers to print without the units, eventually the unit string can go only once at the header of the column. So achieve this, suppose that we have a |
|||
quantity<si::energy> E = 4.*si::joules; |
|||
then we can print the numerical value and the unit separatelly in the following way: |
|||
cout<< E.value() << " * " << si::energy() ; |
|||
// or << quantity<si::energy>::unit_type(); |
|||
// or << E_type::unit_type(); //after typedef typeof(E) E_type; |
|||
// or << symbol_name(si::energy()) |
|||
Note the parenthesis () after the unit type. |
|||
(The library support also Boost.Serialization.) |
|||
=== Unit Conversions === |
|||
Units is not just about keeping dimensional consistency; it is also about being able to convert between different unit systems. |
|||
You can notice that we use the unit system prefix when declaring a quantity, for example in atomic units: |
|||
quantity<au::energy> E1 = 10.*au::hartree; |
|||
or in SI system |
|||
quantity<si::energy> E2 = 20.*si::joules; |
|||
We may wonder why we need to specify the unit system in the right hand side. We would like to abstract ourselves from the unit system, after all, both are energies! That is correct in an ideal world, in practice we work with numbers of finite precision therefore storing an atomic quantity in SI units leads to a very small number that can even go below the precision limit or at least affect the effective precision. This is the practical reason for which we use different unit systems in the first place. |
|||
In some sense giving information on the unit system to the quantity is telling the system which order of magnitude are que quantities. Nevertheless sometimes we need to convert units. |
|||
==== Ad hoc transformation ==== |
|||
There is simple way of transforming quantities without much additional code. Suppose that we have a data file with energies and we know those energies are in electron volts (eV). But our system of choise is Hartree, we can express this information in the code by just defining a quantity called eV in our unit system. In this case eV is not a unit but a physical constant. |
|||
const quantity<au::energy> eV = 0.03674932540*au::hartree; //http://physics.nist.gov/cgi-bin/cuu/Value?evhr |
|||
double numerical_input; |
|||
cin>>numerical_input; //e.g. read from file |
|||
quantity<au::energy> E = numerical_input*eV; //this tells that original data was in eV |
|||
Now for the output we can use the opposite trick. If we print E, the output will be in Hartree. |
|||
cout<<E; // shows for example 4. Eh |
|||
we can force the output to we in eV by doing |
|||
cout<< (double)(E/eV) << " eV"; //shows 108.84 eV |
|||
The cast operation '(double)' is needed to convert the dimensionless quantity to the built-in variable. Otherwise the printing will print 'dimensionless' after the numerical value. |
|||
Cast operation to built-in type are useful in other context and only and it is only permitted if the quantity is dimensionless. |
|||
This solution is not optimal but it works pretty well for quantities output. |
|||
=== Automatic Type Declaration in C++0x (auto) === |
|||
A word of caution about the 'auto' declaration type: In this example the type (and units) of x is determined automatically: |
|||
auto x = 0*au::hartree; //valid C++0x |
|||
x+=0.1*au::hartree; |
|||
cout<<x<<endl; |
|||
but outputs '0 Eh'!! This is because this declaration is equivalent to |
|||
quantity<au::energy, int> x = 0*au::hartree; |
|||
Note the 'int'. That is, the internal value is an integer. This is most likely not the desired behavior. Although the value initialized is zero the type the variable carries is not a floating point. Remember that '0' is an integer literal in C++. There are few alternatives. |
|||
* Declare the type explicitly |
|||
quantity<au::energy, double> x = 0*au::hartree; |
|||
* Add a decimal point or a literal indicator to the literal number |
|||
auto x = 0.*au::hartree; // or 0.0*au::hartree; // or 0f*au::hartree; |
|||
=== Physical Constants (SI codata) === |
|||
For convenience the library includes the most common physical constants. For the moment physical data is given only for the SI systems, but with the implicit conversions it can be used in any unit system. The definitions are contained in <code>boost/units/systems/si/codata_constants.hpp</code>. |
|||
Most popular ones include |
|||
si::constants::codata::k_B; //Boltzmann constant |
|||
si::constants::codata::c; //Speed of light |
|||
si::constants::codata::G; //Gravitational constant |
|||
si::constants::codata::hbar; //Reduced Plank constant |
|||
si::constants::codata::m_e; //Electron Mass |
|||
si::constants::codata::e; //Elementary Charge (positive electron charge). |
|||
all defined in SI units with the proper dimension. Note that these are not units but global constant C++ variables. For example k_B has units of Joule over Kelvin and has a specific numeric (double) value. |
|||
For detailed reference see http://physics.nist.gov/cuu/Constants/index.html |
|||
The 'using', and 'using namespace' declaration can by used to simplify the notation: |
|||
using si::constant::codata::k_B; |
|||
quantity<si::energy> e = 3.*k_B * 1000.*si::kelvin; |
|||
== Boost.Operators and Boost.Concepts == |
|||
This section is more 'mathematical'. When we program we tend to forget all that we know about mathematical theory we do so because too much abstraction at the coding level does not improve the code quality and speed, also C++ (or any other compiled language) does not provide by default enough facilities to allow this abstraction. |
|||
These two libraries, when combined, provide a nice framework to introduce some abstractions and at the same time a certain level of mathematical correctness to our code. |
|||
Abstraction will not provide faster code or readability for the most part, it only generalizes our code to the maximum extend that the mathematics behind the problem can provide. |
|||
Another constrain that I will impose is that the mathematical abstraction should not be introduced at the cost of program execution speed. |
|||
Let's start with a very simple example, let's try to construct a type that represents a vector space of fixed dimension (think of a 3D vector), a naive and perfectly valid implementation is the following |
|||
class v3d{ |
|||
public: |
|||
boost::array<double, 3> coord_; |
|||
v3d(boost::array<double, 3> const& init) : coord_(init){} |
|||
friend v3d operator+(v3d const& , v3d const&){ ... elementwise addition ... } |
|||
friend v3d operator*(double const&, v3d const&){ ... elementwise scalar multiplication ... } |
|||
}; |
|||
(I will use the 'friend' version for non-member operations). |
|||
In principle this a perfectly valid three-dimensional (real) linear space. This implementation allows us to perform any operation possible in this space. (There is no inner product defined, but this is not necessary to have a vector space, we are trying to be mathematical here, remember?). |
|||
The aim this section will be to create a class (or more exactly a template class) that is almost mathematically correct. |
|||
There are obvious ways to make this class more general, first of all. We may want to have a complex space or different number of dimensions. Since we don't want to affect the performance if we need only a real space or a small number of dimensions, the way to implement this is with C++ templates: |
|||
template<typename T, unsigned NDim> |
|||
class vs{ |
|||
boost::array<T, NDim> coords_; |
|||
... |
|||
}; |
|||
This is a huge generalization. Yet we realize very soon that the use of the library is quite limited in the syntax. For example the user can not substract directly two vectors (v1-v2), it can not increment a given vector (v1+=v2), etc. The list of possible operation between elements is very large, in principle we need to implement +=,+,-=,-,*=,*,/=,/,==,!= and furthermore the implementations have to be consistent with each other. This is prone to errors. On the other hand we know that many of these operations have redundant implementations. |
|||
This is where Boost.Operators come to the rescue, it implements several operators in terms of others in a way that is consistent with the usual arithmetics. Let's start with += and +. We can have both operators at the price of one with inheriting our class from a template class provided by boost operatos. |
|||
template<typename T, unsigned D> |
|||
class vs : boost::addable< vs<T,NDim> >{ |
|||
... |
|||
friend vs& operator+=(vs&, vs const&){ ... } |
|||
}; |
|||
The trick is that the operator+ (for v1+v2) will be available to the class thanks to boost::addable even if we don't implement such operation explicitly. Two for the price of one. We can go further and do the same with operator -= and have operator- defined automatically. |
|||
template<typename T, unsigned D> |
|||
class vs : boost::substractable< vs<T,D> >, boost::addable< vs<T,D> >{ |
|||
... |
|||
friend vs& operator+=(vs&, vs const&){ ... } |
|||
friend vs& operator-=(vs&, vs const&){ ... } |
|||
}; |
|||
for each set of operation we can add a new dependency, in order for the list not to become to long, the library offers shortcuts, for example a class that additive is substractable and addable. This shortcut is provided because this is a usual case (mathematically speaking) |
|||
template<typename T, unsigned D> |
|||
class vs : boost::additive< vs<T,D> >{ |
|||
... |
|||
}; |
|||
This way of defining mathematical classes does not only provides ''syntactic sugar'' but also ''consistency'' within the syntaxis. |
|||
To complete the operations we need a couple of additional operations: for scalar multiplication/division and for comparisons. It turns out that this operators also can be defined with a minimal effort. |
|||
template<typename T, unsigned D> |
|||
class vs : |
|||
boost::additive< vs<T,D> >, //provides + (binary) and - (binary) given += and -= |
|||
boost::multiplicative< vs<T,D> , T>, //provides * and / given *= and /= |
|||
boost::equality_comparable< vs<T,D> , T> //provides != given == |
|||
{ |
|||
boost::array<T, D> data_; |
|||
friend vs& operator+=(vs&, vs const&){ ... } |
|||
friend vs& operator*=(vs const&, T const&){ ... } |
|||
friend vs& operator-=(vs& self, vs const& other){return self+=-other;} |
|||
friend vs& operator/=(vs const&, T const& t){return self*=(T(1)/t);} |
|||
// operator== is defined by default, and is correct if implementation is value based |
|||
} |
|||
Note that that we only need to implement two operators '+=' and '*='. |
|||
=== Concrete implementation vs. mathematical abstraction === |
|||
An obvious implementation for operator += and *= looks like this |
|||
for(std::size_t i=0; i!=self.size(); ++i) |
|||
self.data_[i] += other.data_[i]; //or *=t; |
|||
This is also completely acceptable but it is not the degree of abstraction we want. The produced class has the expected mathematical behavior but it has information extraneous to a mathematical description. |
|||
For example, why are we forced to use boost::array, for example a user may want to implement the linear space with std::vector or some other container or not even a container. Second, even if we stick to a container implementation, why element by element addition has to be the concept of sum (there are other composition laws that allow for a correct concept of sum). |
|||
Furthermore, why the linear space has to have finite dimensionality!! (finite dimensionality is not a mathematical requirement for a linear-space.) |
|||
To solve the first problem we have to parametrize the template class in the implementation, the class is changed to something like |
|||
template<class Element> |
|||
class vs{ |
|||
Element data_; |
|||
friend vs& operator+=(vs&, vs const&){ ... ??? ... } |
|||
friend vs& operator*=(vs&, T??? const&){ ... ??? ... } |
|||
}; |
|||
Now, this is completely generic regarding the internal representation of the linear space; but it creates a big problem. There is no way to implement += in a generic way, even if we constrain Element to be a container we don't know what is the type of the 'scalar' (T???) in the scalar multiplication!! |
|||
The only way out is to add this information explicitly to the class template parametrization. |
|||
This is becoming really ''mathematical''. Let's review for a moment the mathematical definition of a |
|||
linear space. Using Nakahara's [http://books.google.com/books?id=cH-XQB0Ex5wC&dq=nakahara] definition |
|||
"A vector space (or linear space) <math>V</math> over a ''field'' <math>K</math> is a ''set'' in which two operations, addition and multiplication by an element of <math>K</math> (called scalar), are defined. The elements (called vectors) of <math>V</math> satisfy the following axioms |
|||
* <math>\mathbf{u} + \mathbf{v} == \mathbf{v} + \mathbf{u} </math> (commutative addition) |
|||
* <math>(\mathbf{u} + \mathbf{v}) + \mathbf{w} == \mathbf{u} + (\mathbf{v} + \mathbf{w}) </math> (associative addition) |
|||
* <math>\exists \mathbf{0}_V / \mathbf{v} + \mathbf{0}_V == \mathbf{v} </math> (additive identity/zero) |
|||
* <math>\exists (-\mathbf{u}) / \mathbf{u} + (-\mathbf{u}) == \mathbf{0}_V </math> (additive inverse/negation) |
|||
* <math>c( \mathbf{u} + \mathbf{v} ) = c \mathbf{u} + c \mathbf{v}</math> (linear scalar product) |
|||
* <math>(cd) \mathbf{u} == c(d \mathbf{u})</math> (distributive scalar product) |
|||
* <math>1_K \mathbf{u} == \mathbf{u}</math> (scalar product multiplicative) |
|||
(where <math> \mathbf{u},\mathbf{v} \in V</math> and <math> c,d \in K</math>)" |
|||
The first paragraph tell us that we have to define a ''set'' of elements (the equivalent of our implementation data type), then we need to define a scalar ''field'' and then implicitly it says that we should introduce a concept of ''addition'' and ''multiplication'' with certain properties. |
|||
These four pieces of information are in general independent, so we need to parametrize the vector space with all of them. |
|||
template< |
|||
class Element, |
|||
typename Field_Scalar, |
|||
class Commutative_Associative_Addition, |
|||
class Associative_ScalarMultiplication> |
|||
... |
|||
At the same time we want a desirable mathematical property, that is that when we provide the linear space properties to a given set, the elements of the set preserve certain properties. For example, given the set of one variable functions, we can define a linear space by providing a concept of sum and multiplication (e.g. <math>(a f + b g)(x) = a f(x) + b g(x)</math>) in the process we still think of the elements of the set as functions. The way to preserve the properties of the original set is to use public inheritance from the set type instead of class member of that type. |
|||
... |
|||
class vs : public Element, |
|||
boost::additive< vs<T,D> >, |
|||
boost::multiplicative< vs<T,D> , T>, |
|||
boost::equality_comparable< vs<T,D> , T> |
|||
{ ... |
|||
We follow by defining certain details for the space |
|||
... |
|||
typedef Element element; |
|||
typedef Field_Scalar scalar; |
|||
typedef Commutative_Associative_Addition addition; |
|||
typedef ScalarMultiplication scalar_multiplication; |
|||
... |
|||
The addition and scalar multiplications are of course mean to be function objects that are shared by all the class elements and that are inmutable (constant) from the point of view of the vector space. |
|||
... |
|||
static addition add_; |
|||
static scalar_multiplication mult_; |
|||
... |
|||
We also need to be able to generate a vector in the linear space from an element of the set |
|||
... |
|||
vs(element const& elem) : Element(elem){} |
|||
... |
|||
Then we follow with |
|||
... |
|||
friend vs& operator+=(vs& self, vs const& other){return self = add_(self, other ) ;} |
|||
friend vs& operator*=(vs& self, scalar const& s){return self = mult_(self, s ) ;} |
|||
friend vs& operator-=(vs& self, vs const& other){return self += (-scalar(1) )*other ;} |
|||
friend vs& operator/=(vs& self, scalar const& s){return self *= ( scalar(1)/s) ;} |
|||
... |
|||
Thanks to Boost.Operators the other operators are defined consistently. |
|||
Following the mathematical definition we have to define (or at least be able to generate) the zero element (it must exists regardless of anything else) and the negative element |
|||
... |
|||
friend vs zero(); |
|||
friend vs operator-(vs const& v){vs ret(v); ret*=-scalar(1); return ret;} |
|||
friend vs const& operator+(vs const& v){return v; } |
|||
}; |
|||
which finishes for the moment the definition of the class. The definition of the function 'zero()' is left undefined. This is because it is not always needed, second it depends on the details of the linear space we want to define. |
|||
The complete code looks like [http://codepad.org/0r61CTiY this]. |
|||
=== Example 1: boost::array as a finite dimensional linear space === |
|||
By default boost::array doesn't have any arithmetical operator associated to it. So any arbitrary concept of addition in a linear space formed for such elements has to be defined manually. In the context of the class defined above there are three different way to achieve this depending on the syntax we want to achieve. The first two options produce the most compact syntax but requieres us to define classes in the std namespace or the boost namespace |
|||
a) Specialize <code>std::plus</code> |
|||
using boost::array |
|||
namespace std{ |
|||
template<> |
|||
array<double,3> plus<array<double, 3> >::operator()(array<double, 3> const& a1, array<double, 3> const& a2) const{ |
|||
array<double, 3> ret; |
|||
for(unsigned i=0; i!=3; ++i){ |
|||
ret[i]=a1[i]+a2[i]; |
|||
} |
|||
return ret; |
|||
} |
|||
} |
|||
... |
|||
b) Overload + in namespace boost (the namespace of <code>boost::array</code>) |
|||
namespace boost{ |
|||
array<double, 3> operator+(array<double, 3> const& a1, array<double, 3> const& a2){ |
|||
.. same function body ... |
|||
} |
|||
In both cases the linear space class can be used ''without'' explicitly specifying the operation |
|||
array<double,3> a_init=<nowiki>{{1,2,3}}</nowiki>; array<double, 3> b_init=<nowiki>{{2,3,4}}</nowiki>; |
|||
linear_space<array<double, 3> > a=a_init; |
|||
linear_space<array<double, 3> > b=b_init; |
|||
c) Define an independent object-function |
|||
class elementwise_add{ |
|||
array<double,3> operator()(array<double> const& a1, array<double,3> const a2) const{ |
|||
... same function body ... |
|||
} |
|||
}; |
|||
In this last case the addition operation has to be explicit |
|||
linear_space<array<double, 3>, elementwise_add> a=a_init; |
|||
linear_space<array<double, 3>, elementwise_add> b=b_init; |
|||
After either a), b) or c) is chosen we can apply the + operation in the sense prescribed for this linear space: |
|||
linear_space<array<double, 3>, elementwise_add> c=a+b; |
|||
Note that the example the linear space works even though the scalar multiplication was never specified. This is fine, since such operation was not used in the example. (You don't pay for what you don't use.) |
|||
To define a multiplication we just need to define a new class and pass it as argument in the linear space. |
|||
class elementwise_mult{ |
|||
array<double,3> operator()(double const& s, array<double, 3> const& a) const{ |
|||
array<double, 3> ret=a; |
|||
for(unsigned i=0; i!=3; ++i){ |
|||
ret[i]*=s; |
|||
} |
|||
return ret; |
|||
}; |
|||
... |
|||
linear_space<array<double, 3>, elementwise_addition, double, elementwise_multiplication> a=a_init; |
|||
At this point it may be beneficial to use typedef |
|||
typedef linear_space<array<double, 3>, elementwise_add, double, elementwise_mult> mylinear_space; |
|||
mylinear_space a, b; |
|||
mylinear_space c=a+b; |
|||
=== Example 2: Implement an infinite dimensional space with <code>std::vector</code> === |
|||
Given the famous vector class, we can define an addition concept that accept vectors of any size, provided that we interpret out of bound elements as zeros. std::vector class allow us to return vectors of arbitrary size. NB: std::vector is not a vector space, it is called vector for other reasons, it doesn't carry an addition concept for example. |
|||
class size_variable_elementwise_add{ |
|||
std::vector operator()(std::vector const& v1, std::vector const& v2) const{ |
|||
std::vector ret(std::max(v1.size(), v2.size())); |
|||
for(unsigned i=0; i!=ret.size(); ++i){ |
|||
if(i>=v2.size()) ret[i]=v1[i]; |
|||
else if(i>=v1.size()) ret[i]=v2[i]; |
|||
else ret[i]=v1[i]+v2[i]; |
|||
} |
|||
return ret; |
|||
} |
|||
}; |
|||
This particular implementation of addition induces a linear space |
|||
linear_space<std::vector, size_variable_elementwise_add> |
|||
which dimensionality is infinite in the mathematical sense that the number of independent vectors is not bounded by the program, (well ... it is bounded by the available memory and the runtime.) |
|||
=== Example 3: implement an efficient infinite dimensional linear with <code>std::map</code> === |
|||
For practical causes when we think of a potentially infinite dimensional vector we probably think of a sparse vector. A sparse vector (many elements are zero) a good choice is to use something different from an array, for example std::map provides a way to defining a sparse option, the reason is that zero elements are omitted. |
|||
class elementwise_add_map{ |
|||
std::map<unsigned, double> operator()(std::map<unsigned, double> const& m1, std::map<unsigned, double> m2) const{ |
|||
std::map ret; |
|||
for(std::map<unsigned, double>::const_iterator im1=m1.begin(); im1!=m1.end(); ++im1){ |
|||
ret[im1->first]=im1->second; |
|||
} |
|||
for(std::map<unsigned, double>::const_iterator im2=m2.begin(); im2!=m2.end(); ++im2){ |
|||
ret[im2->first]+=im2->second; if(ret[im1->first]==0) ret.erase(im2->first); |
|||
} |
|||
return ret; |
|||
} |
|||
}; |
|||
... |
|||
linear_space<std::map<unsigned, double> , elementwise_add_map> |
|||
Multiplication also can be defined for non-zero elements only. |
|||
This is an optimal implementation of a sparse vector, the point of this example is to show that the efficiency of the program is related to the actual implementation of the class and not to the linear_space template class itself. |
|||
We can temporarily view any object as an element as a linear space without constructing a new object by casting to a reference type. This allows to perform linear space operations for arbitrary objects. |
|||
array<double,3> a ,b, c; |
|||
typedef linear_space<array<double,3>, element_wise_add, element_wise_mult >& ls_elem; |
|||
(ls_elem&)c=(ls_elem&)a+(ls_elem&)b; |
|||
(section not finished) |
|||
== Boost.Spirit == |
|||
One of the more tedious and unfortunately unavoidable task in programming is parsing. |
|||
Parsing means that given a "free form" text string we have to convert that into data that the program can manipulate, in order to do that we need to "interpret" the text. The parsing includes walking the input text and executing actions each time a certain pattern in found. |
|||
Doing this manually is prone to errors and lack of generality. Boost.Spirit is a complex library that allows parsing text quite easily. |
|||
The following parser a line that is expected to be comprised by two integers and a word. As this values are read and interpreted they are assigned to target variables. The beauty of it is that the syntax is straight forward and we will know if the parsing failed right away because of the bool returned, in that case the iterator first will contain the point at which parsing failed. This is the simplest use of Boost.Spirit, more complex parse patterns can be defined. |
|||
#include <string> |
|||
#include <boost/spirit/include/qi.hpp> |
|||
#include <boost/spirit/include/phoenix_core.hpp> |
|||
#include <boost/spirit/include/phoenix_operator.hpp> |
|||
namespace qi = boost::spirit::qi; |
|||
namespace px = boost::phoenix; |
|||
namespace ascii = boost::spirit::ascii; |
|||
int main(){ |
|||
std::string source = " 1 13 Al"; |
|||
int idx; |
|||
int atomic_number; |
|||
std::string name; |
|||
std::string::iterator first = source.begin(), last = source.end(); |
|||
qi::phrase_parse(first, last, |
|||
( qi::int_ >> qi::int_ >> qi::lexeme[+~qi::char_(' ')] ), ascii::space, |
|||
idx, atomic_number, name |
|||
); |
|||
std::cout << idx << " " << atomic_number << " " << name << std::endl; |
|||
return 0; |
|||
} |
|||
== Boost.Proto == |
|||
{{{#!comment |
|||
A language inside a language. It is common that after a code matures to a certain level the program itself defines a new language. We may not realize it, but even an innocent "input parameter" file is a language in itself. |
|||
This stage is natural because each problem at hand has a language that is specific to that domain. The technical term for it is DSEL (Domain Specific Language), that is a language that is suited to pose and solve problems in a certain domain of problems. |
|||
Building a language is perfectly valid and is the only natural way to proceed. But as soon as we try to build this language we encounter at least two major problems. First, the common approach is to program an interpreter. |
|||
This interpreter can be interactive or just read an input file. In both cases the parser is responsible for enforcing a certain syntax and drive the core of the code to perform certain actions. |
|||
Second, except for very simple syntaxes the language ideally can specify some operations (e.g. arithmetic operations), language types (e.g. some input parameters are vectors, others are numbers) and even programming blocks (loops, conditionals, etc). |
|||
Needless to say that most of these DSELs (at leasts the ones I know) are terrible, their syntax is not very well defined and the flexibility is really limited. DSELs can be improved, sure but building an ideal DSEL is as complicated that building a new programming language. |
|||
Programming language... we are already writing our program in a programming language! So there is obviously a lot of redundant work, we are writing a language that at best will have the same capabilities as the main language we are using (although it will have a syntax better suited for specific purposes, that's the main point). |
|||
Wouldn't be ideal if we can use C++ itself to implement our own DSEL? Can our "input file" be written in C++ itself? How domain specific we can be by doing that? |
|||
The answer is positive in general. C++, by means of operator overloading, is a language to actually allow many nice compact and rich syntaxes that can be interpreted and converted to concrete code by the C++ compiler. For example by defining the right operators and functions the following becomes a valid C++ statement: |
|||
sum_index i; |
|||
double result = sum(i=from(0).to(10))(1./i); |
|||
(On the other hand a syntax like <code>double result = sum i = 0 to 10 of 1/i</code> is not a valid C++ syntax. So a DSEL syntax written in C++ will have its limitations.) |
|||
Making a DSEL syntax possible with C++ can take lot of code but the resulting syntax may worth it depending on how expressive or compact we want to be. |
|||
The effort can be aleviated with Boost.Proto which makes building such syntax much easier. Boost.Proto is a collection of template meta-functions and macros that help us to define the syntax. |
|||
Another intriguing valid C++ expression that can be part of a syntax is: |
|||
Vexpectation = (psi|V|psi); |
|||
(compare with the quantum mechanical syntax <math>V = \langle \psi | \hat{V} |\psi \rangle</math>.) |
|||
The first thing to understand is that the syntax is not necesarelly bound to an operation. It might be that the syntax is used to construct a complex expression which is only evaluated when necessary. |
|||
In the last example, it is most efficient that the oparation "<code>V|phi</code>" does not perform any action but just stores the abtract concept of "applying a certain operator to a wavevector (ket)". |
|||
For example, only when the other wavevector (bra) is added to the expression a certain real computation is performed. |
|||
... |
|||
}}} |
|||
== Boost.ResultOf, generic return types == |
|||
A great deal of generic programming is related to the ability to deduce the return type of a given C++ expression. For example suppose that we have an object that behaves like a function called 'f' and of type 'F'. When I say that it behaves like function it means that it can be a function pointer, and overloaded function pointer, a member function pointer, a function reference or a object with member operator '()' either monomorphic (only one operator() defined) or polymorphic. |
|||
To be specific let us say that I want a '''generic''' function that evaluates that entity at a given value. |
|||
eval(f, 5.); // to be identical in behaviour to f(5.); |
|||
Of course the 'eval' function body is almost trivial |
|||
template<class F, class T> |
|||
??? |
|||
eval(F f, T t){ |
|||
return f(t); |
|||
} |
|||
But what about the return type ???. |
|||
If you are using c++0x just use |
|||
decltype(F()(T()) |
|||
or use the new C++0x function declaration style |
|||
auto eval(F t, T t) -> decltype(f(t)){ return f(t);} |
|||
and that's it, you can stop reading this section unless you are interested in making the code C++98-compatible. |
|||
If you are not using C++0x or want to make your code compatible with C++ then it is where Boost.ResultOf comes to the rescue to achieve something that otherwise is not possible because there is no decltype in C++. The library provides a type given by the expression |
|||
boost::result_of<F(T)>::type |
|||
that can be used in ???. As described here http://www.boost.org/doc/libs/1_46_1/libs/utility/utility.htm#result_of |
|||
But ResultOf is more a standardized protocol than a library. It works provided that you follow a certain rule, that all the function objects that you write have a nested class called result that gives information of the return type of each overload of operator(). It works on function objects and also on free functions (or overloaded free functions). |
|||
For example the following class has basically a template result struct for each version of operator(). The example is rather artificial but it the most complicated case that can appear. |
|||
struct F{ |
|||
template<class Signature> struct result; // protocol kick-in |
|||
template<class This, class T> struct result<This(T const&)>{typedef T type;}; |
|||
template<class T> |
|||
typename result<F(T const&)>::type operator()(T const& x) const{return x;} |
|||
template<class This> struct result<This(double const&)>{typedef double type;}; |
|||
result<F(double const&)>::type operator()(double const& x) const{return x*x;} |
|||
template<class This> struct result<This(int const&)>{typedef std::string type;}; |
|||
result<F(int const&)>::type operator()(int const& x) const{return '"' + boost::lexical_cast<std::string>(x+x) + '"';} |
|||
}; |
|||
note that the template declaration (Signature) is necessary to specialize the class 'result' for each case and how the redundant 'This' is necessary for the first specialization. The series of template definitions and specializations is the so called "result_of protocol". |
|||
The other good thing is that boost::result_of is backward compatible with C++ and C++0x. In fact in C++0x std::result_of (from TR1) is internally implemented in terms of decltype. |
|||
In summary, depending on the level of compatibility we have three options: |
|||
#include<boost/utility/result_of.hpp> //boost::result_of |
|||
#include<functional> //std::result_of |
|||
template<class F, class T> |
|||
option 1: decltype(F()(T())) //works in C++0x only |
|||
option 2: typename std::result_of<F(T const&)>::type //works in C++ with TR1 (needs the result_of protocol for F) and in C++0x |
|||
option 3: typename boost::result_of<F(T const&)>::type // works in C++ and C++0x (always needs the result_of protocol for F) |
|||
eval_at(F f, T const& t){ |
|||
return f(t); |
|||
} |
|||
Note that in any case, within C++0x the most general technique is to use 'auto' and 'decltype' on expressions, because it will work on overloaded functions as well. |
|||
Based on: http://stackoverflow.com/questions/2689709/difference-between-stdresult-of-and-decltype |
|||
== Boost Toys == |
|||
Boost has a few hidden small libraries, some hidden as implementation details in other libraries |
|||
=== Version === |
|||
Once in a while we use features that are available in one version of Boost but not in previous versions. Different systems have different version of Boost, for example, by default: |
|||
* Chaos 4.3/RedHat 5.5: Boost 1.33 |
|||
* Ubuntu 8.04/10.04/10.10/11.04: Boost 1.34/1.40/1.42/1.42 |
|||
* Fedora 13/15: Boost 1.41/1.46 |
|||
Using new features generally in a system with an older version usually results in a compilation error. Unfortunately the error messages are cryptic and this can be very confusing for the user. This can be resolved by a static (compile time) assert that verifies that the system is using a certain version of Boost. |
|||
The version information of the Boost distribution is included in |
|||
#include<boost/version.hpp> |
|||
in a macro defined (for example) as |
|||
#define BOOST_VERSION 103301 |
|||
So, for example to check that one is using Boost version 1.42 *at least* we can add the following code before using such feature. |
|||
BOOST_STATIC_ASSERT( BOOST_VERSION / 100000 >= 1 && BOOST_VERSION / 100 % 1000 >= 42); |
|||
.. code valid only in 1.42 here .. |
|||
The expression is verified at compile time and it will give and error message that will be easy to spot. BOOST_VERSION can also be used in #if/#endif blocks. |
|||
Such error message will make clear that to use that code we need a newer version of Boost installed. (For example manually, see at the beginning on how to install the current version.) |
|||
=== Progress Bar === |
|||
One of them is [http://www.boost.org/doc/libs/1_43_0/libs/timer/timer.htm#Class%20progress_display Boost.Progress]. It gives a convenient way of printing a progress bar to output. We only need to supply an estimated completion count and increment the counted as progress in the calculation is made: |
|||
#include<boost/progress.hpp> |
|||
... |
|||
boost::progress_display show_progress( max_counter, std::cout, "title of bar:\n" ); |
|||
for (int count = 0; count < max_counter; ++count){ |
|||
... |
|||
++show_progress; |
|||
} |
|||
Each time ++show_progress the internal counter is incremented proportionally and a nicely formatted percentage bar will be updated: |
|||
0% 10 20 30 40 50 60 70 80 90 100% |
|||
|----|----|----|----|----|----|----|----|----|----| |
|||
**************************** |
|||
The last two arguments are optional. |
|||
If the process requires more counts than the estimate provided (max_counter) it will keep adding stars beyond the 100% mark. |
|||
Printing other stuff from inside the loop (e.g. using cout) can ruin the formatting of the bar. |
|||
=== Printing type names (<code>boost::units::detail::demangle</code>) === |
|||
Obtaining type information at runtime is implemented in C++ via typeid. Unfortunatelly the typename results extraction can give cryptic results, probably associated with internal strings used to represent a type. To transform a name into an human redeable string we need a name demangler. The function described here returns the undecorated (no reference or const qualifier) of a type. |
|||
For example, simply calling <code>typeid(variable).name()</code> can give something like |
|||
N5boost5units8quantityINS0_4unitINS0_4listINS0_3dimINS0_19time_base_dimensionENS0_15static_rationalILln1ELl1EEEEENS0_18dimensionless_typeEEENS0_18homogeneous_systemINS3_INS0_6atomic23electron_mass_base_unitENS3_INSC_27elementary_charge_base_unitENS3_INSC_33reduced_planck_constant_base_unitENS3_INSC_32coulomb_force_constant_base_unitENS3_INSC_27boltzman_constant_base_unitES9_EEEEEEEEEEEEvEEdEE |
|||
which has straneous symbols contained in it. Instead |
|||
#include<boost/units/detail/utility.hpp> //not necessary if boost.units is already included |
|||
using boost::units::detail::demangle; |
|||
... |
|||
cout << demangle(typeid(var).name()) << endl; |
|||
gives |
|||
quantity<unit<list<dim<time_base_dimension, static_rational<-1l, 1l> >, dimensionless_type>, homogeneous_system<list<atomic::electron_mass_base_unit, list<atomic::elementary_charge_base_unit, list<atomic::reduced_planck_constant_base_unit, list<atomic::coulomb_force_constant_base_unit, list<atomic::boltzman_constant_base_unit, dimensionless_type> > > > > >, void>, double> |
|||
this particular function is specifically written to simplify the names of boost.Units quantities (by removing the string 'boost::units::') but can be used for any type. |
|||
=== Boost.ProgramOptions === |
|||
This is not a small library but its description is small and covers also a small but frequent feature of every pogram, the command line arguments. We start with a small program, and quickly we realize that we need to pass options at the command-line. In C (and C++) command line options are received by main program via the argc and argv parameters |
|||
int main(int argc, char* argv[]) |
|||
the first argument is the number of parameters (words) and argv is a c-array of c-string passed (included the name of the executable itself). Typically we would implement some sort of parser that reads the strings one by one, reading names of variables, etc. This process is tedious and prone to errors. Even after much effort the solution will be short of features. For example suppose that we want to have an option file instead of command line options, we will need to rewrite the parser for the file, etc. An after we finish we will need to implement a help message describing all the options available and maintain it. |
|||
Boost.ProgramOptions solves this problem elegantly. It parses the options from a source (command line, config file or environment variables) and stores it in local variables from where the values can be interpreted more easily. |
|||
The simplest example is one in which the main program needs to know the value of several input variables, some of the with default values. |
|||
// program.cpp |
|||
int main(int argc, char* argv[]){ |
|||
double kT=-1; |
|||
double EF=-1; |
|||
{ |
|||
using namespace boost::program_options; |
|||
options_description desc("options"); |
|||
desc.add_options() |
|||
("temp,T" , boost::program_options::value<double>(&kT), "set electronic temperature") |
|||
("fermi,EF", boost::program_options::value<double>(&EF)->default_value(10.), "set fermi energy"); |
|||
variables_map vm; |
|||
try{ |
|||
store(parse_command_line(argc, argv, desc),vm); |
|||
}catch(...){clog<< desc <<endl; throw;} |
|||
notify(vm); //don't forget this line |
|||
} |
|||
cout<<"EF is "<<EF<<endl; |
|||
cout<<"kT is "<<kT<<endl; |
|||
... |
|||
The program can be called as |
|||
./program --temp=300. --fermi=10. #or as ./program -T 300. -EF 10. |
|||
Note that this follows the [GNU Argument Syntax] for long and short options. |
|||
If we pass options that are not listed (or have a typo) or the parameter has an incorrect value type (string instead of number) the program will give an error: |
|||
what(): unknown option dada |
|||
what(): in option 'fermi': invalid option value |
|||
Also the "help" message is printed by the line |
|||
clog << desc << end; |
|||
In this case: |
|||
options: |
|||
-T [ --temp ] arg set electronic temperature |
|||
-E [ --fermi ] arg (=10) set fermi energy |
|||
This message is constructed automatically from the description object. Option names, default values, assignment to local variables and options description (help message) are all in one place. |
|||
This example covers most of the simplest cases. For positional arguments (arguments with no names intepreted by position in command line), other ways to read the information on the variables_map, reading options from config files and non-convetional syntax, see the [http://www.boost.org/doc/libs/1_43_0/doc/html/program_options.html manual]. |
|||
Note: the executable needs to be linked to the boost_program_options (<code>c++ a.cpp -lboost_program_options</code>). |
|||
Here it is a more complicated example in which we add positional (as opposed to named arguments), |
|||
#include<boost/program_options.hpp> |
|||
using namespace boost::program_options; |
|||
variables_map interpret_command_line(int argc, char* argv[]){ |
|||
using namespace boost::program_options; |
|||
options_description desc("Usage " + std::string(argv[0]) + " [OPTION]... FILE...\nRuns several projectile velocities."); |
|||
desc.add_options() |
|||
("help,H", "produce this help") |
|||
("input-file", "input file, also positional parameter") |
|||
; |
|||
positional_options_description p; |
|||
p.add("input-file", -1); |
|||
variables_map vm; |
|||
store(command_line_parser(argc, argv).options(desc).positional(p).run(), vm); |
|||
notify(vm); |
|||
if (vm.count("help")){ |
|||
std::cout << desc << std::endl; |
|||
exit(-1); |
|||
} |
|||
if (not vm.count("input-file")){ |
|||
std::cout << argv[0] << ": missing operand\n Try `" << argv[0] << " --help' for more information." << std::endl; |
|||
exit(-1); |
|||
} |
|||
return vm; |
|||
} |
|||
int main(int argc, char* argv[]){ |
|||
boost::program_options::variables_map vm = interpret_command_line(argc, argv); |
|||
std::cout << vm["input-file"].as<std::string>() << std::endl; |
|||
// do something with vm["input-file"].as<std::string>() |
|||
return 0; |
|||
} |
|||
=== Boost.RegEx (regular expressions) === |
|||
[http://www.boost.org/doc/libs/1_44_0/libs/regex/doc/html/index.html Boost.RegEx] is a another major complex library for string manipulation by regular expressions. In this context can be useful sometimes to process outputs from other programs. RegEx is also part of TR1 the next C++ standard library. |
|||
Regular expressions are ways to define complicated text patterns. There are several regular expression definitions (syntaxes), Perl regular expressions is one of them. Regular expression can be used to validate string input, extract substrings and do string replacement. |
|||
In the following example, the numbers contained in the input string are isolated and enclosed in curly brackets. |
|||
#include <iostream> |
|||
#include <string> |
|||
#include "boost/regex.hpp" |
|||
int main() { |
|||
boost::regex reg("[+,-]*\\d{1,}", boost::regex::icase|boost::regex::perl); |
|||
std::string s = "a_0^-1 b^2 c^-32 d^+44"; |
|||
s = boost::regex_replace(s,reg,"{$0}"); |
|||
std::cout << s << std::endl; |
|||
} |
|||
will print "a_0^{-1} b^{2} c^{-32} d^{+44}". |
|||
(This is a simple way to convert unit outputs (see Boost.Units) to something that understandable for LaTeX code.) |
|||
There is not much else to say about Boost.RegEx except that it can save you lots of coding for text processing and that you actually have to learn about regular expressions before using Boost.RegEx. Just some notes of caution: |
|||
* a regular expression involving the character <code>\</code> has to by typed as <code>\\</code> in the code. |
|||
* not all strings are valid regular expressions, if the string is not a valid regular expression the program will abort the execution (it won't give an error at compile time). If you want compile-time (static) regular expressions you have to use Boost.Xpressive. |
|||
* Make sure you specify the right regular expression syntax (in this example <code>boost::regex::perl</code>) of your regular expression. |
|||
* You have to link the executable to libboost_regex (<code>c++ -lboost_regex ...</code>). Make sure (with command ldd) that you are linking to the binary library (libboost_regex) consistent with header file (regex.hpp). If it not correctly linked the library can return the wrong result without warning. |
|||
* Some characters used in regular expressions (eg. \) are reserved characters in C, therefore they must be preceded by another '\'. |
|||
Other sources: |
|||
* http://onlamp.com/pub/a/onlamp/2006/04/06/boostregex.html |
|||
* http://www.boost.org/doc/libs/1_43_0/libs/regex/doc/html/index.html |
|||
* http://regexpal.com/ (Regular expression tester, good for learning regular expressions) |
|||
* http://gskinner.com/RegExr/ (Another tester with replace options) |
|||
* http://en.highscore.de/cpp/boost/stringhandling.html#stringhandling_regex |
|||
* http://www.johndcook.com/cpp_regex.html (actually for the new standard --i.e. not boost-- version of regex, but based on the Boost's one) |
|||
=== boost::to_string === |
|||
Boost provides a template function called <code>lexical_cast</code> that can potentially convert any type into any other type as long as the source type is printable to stream and the destination type is readable from a stream. Most of the time this function is applied to obtain the string representation of a number: |
|||
std::string s = boost::lexical_cast<std::string>(5.); // s == "5." |
|||
which evidences some syntactic redundancy is long to write. When the destination is a string there is a function called to_string |
|||
#include<boost/exception/to_string.hpp> |
|||
... |
|||
std::string s = boost::to_string(5.); |
|||
This is particularly useful to construct error messages or exceptions descriptions. |
|||
using boost::to_string; |
|||
... |
|||
throw std::domain_error("value x = " + to_string(x) + " is outside allowed domain [0,1)"); |
|||
Anther alternative is to use |
|||
#include<boost/units/io.hpp> |
|||
... |
|||
std::string s = boost::units::to_string(5.); |
|||
== (not)Boost.Process == |
|||
This library is not included officially in Boost, hopefully it will be accepted in the future. It can be added to Boost very easily by |
|||
The library is obtained here http://www.highscore.de/boost/gsoc2010/process.zip with documentation at http://www.highscore.de/boost/gsoc2010/. Also available as svn: |
|||
mkdir -p ~/soft/boost_process.svn |
|||
cd ~/soft/boost_process.svn |
|||
svn co http://svn.boost.org/svn/boost/sandbox/SOC/2010/process/boost |
|||
cd boost |
|||
mkdir -p ~/usr/include/boost/ |
|||
cp -r * ~/usr/include/boost/ |
|||
(No compilation of the library is necessary but ~/usr/include must be in the include-path --e.g. export CPATH=$CPATH:/home/correaa/usr/include.--) |
|||
There is an older version of the library that can create confusion, check that you are using at least he GSOC version. To remove use |
|||
rm -r ~/usr/include/boost/process* |
|||
== Boost.Phoenix3 == |
|||
This library is now officially in Boost. (Boost includes also Phoenix2 as part of Boost.Spirit.) |
|||
Its goal is to be able to define expression templates that resemble (as much as possible) normal C++ expressions. The documentation is available [https://svn.boost.org/svn/boost/trunk/libs/phoenix/doc/html/index.html here]. |
|||
=== Coexisting with Spirit/Phoenix2 === |
|||
Spirit uses Phoenix internally and, by default, it uses Phoenix2. Phoenix2 and Phoenix3 can not be uses simultaneously therefore if you are using Spirit at all you have to tell Spirit too use Phoenix3. This is achieved by defining this before including Spirit. |
|||
#define BOOST_SPIRIT_USE_PHOENIX_V3 |
|||
#define BOOST_PROTO_MAX_ARITY 11 |
|||
#define BOOST_PROTO_MAX_LOGICAL_ARITY 11 |
|||
#define BOOST_MPL_LIMIT_METAFUNCTION_ARITY 12 |
|||
#include<boost/spirit.hpp> |
|||
=== For Boost < 1.46 === |
|||
For old versions of Boost, it can be added to Boost by |
|||
cd ~/usr/include/boost # assuming that the rest of Boost is in ~/usr/include/boost |
|||
svn co https://svn.boost.org/svn/boost/sandbox/SOC/2010/phoenix3/boost/phoenix |
|||
(No compilation of the library is required.) Although there had been some changes in the library since then. |
|||
== (not)Boost.Geometry == |
|||
[Boost.Geometry http://geometrylibrary.geodan.nl/index.html] is a library that is accepted to Boost but it not yet shipped with the Boost distribution (but available from trunk). It is a library designed to manipulate geometric objects, vectors, polygons, etc. independently of it internal representation, coordinate system or dimension. |
|||
For the moment it can be installed together with Boost by doing: |
|||
mkdir -p ~/usr/include/boost |
|||
cd ~/usr/include/boost |
|||
svn co http://svn.boost.org/svn/boost/trunk/boost/geometry |
|||
svn co http://svn.boost.org/svn/boost/trunk/boost/range |
|||
(this is not necessary if you get Boost from the svn trunk). |
|||
== P-Stade Oven == |
|||
The library P-Stade is not part of Boost but it follows its design and it deserves honorable mention. Technically it is a "range" library that operates on ranges of elements in a container. It is a header only library and therefore the installation is very easy. Just download and copy the header files from http://sourceforge.net/projects/p-stade/ your local directory (e.g. $HOME/usr/include) |
|||
cd ~/usr/include |
|||
wget http://downloads.sourceforge.net/project/p-stade/pstade/1.04.3/pstade_1_04_3.zip |
|||
unzip pstade_1_04_3.zip |
|||
rm pstade_1_04_3.zip |
|||
all the files will be installed in ~/usr/include/pstade. This library needs Boost to be installed already. |
|||
The official manual is located [http://p-stade.sourceforge.net/oven/doc/html/index.html here] while a nice set of examples can be found in the entries of this [http://d.hatena.ne.jp/faith_and_brave/ blog]. Another source of examples is [http://www.codeproject.com/KB/stl/oven.aspx here]. |
|||
With pstade/oven/initial_values.hpp every sequence (except c-arrays) becomes directly intializable with a list of "initial values". This is particularly useful to initialize const containers (this is similar to Boost.Assign). Once we have a collection of elements (in some container) we can apply functions to this object. |
|||
The trivial function is pstade/oven/'''identities'''.hpp, it does nothing to the elements. This function is useful anyway because internally it changes the representation of the collection to something that can be printed independently of the source. In the following example I use identities in order to print the original sequence. Other functions used are reversed, sorted, filtered and partitioned. |
|||
The function names are self-explanatory. The only one that need explanation is partitioned wich returns a std::pair with two subsets depending in some boolean function. Note the intensive use of the pipe operator |, the functions can be applied in any sequence. |
|||
#include <iostream> |
|||
#include <pstade/oven/initial_values.hpp> |
|||
#include <pstade/oven/identities.hpp> |
|||
#include <pstade/oven/reversed.hpp> |
|||
#include <pstade/oven/sorted.hpp> |
|||
#include <pstade/oven/filtered.hpp> |
|||
#include <pstade/oven/partitioned.hpp> |
|||
#include <pstade/oven/io.hpp> |
|||
#include <pstade/oven/equals.hpp> |
|||
using namespace pstade::oven; |
|||
//a free function |
|||
bool is_even(int const& x) { return x % 2 == 0; } |
|||
int main(){ |
|||
using std::cout; using std::endl; |
|||
//works with std::vector |
|||
const std::vector<int> v = initial_values(3, 1, 4, 5, 2); |
|||
cout << (v | identities) << endl; //prints {3,1,4,5,2} |
|||
//works with boost::array |
|||
const boost::array<double, 3> a = initial_values(1, 3, 5); |
|||
cout<< (a | reversed ) <<endl; //prints {5,3,1} |
|||
cout<< (a | identities) <<endl; //prints the original {1,3,5} |
|||
//works with a c-array |
|||
const double arr[3] = {22,33,11}; |
|||
cout<< (arr | sorted) <<std::endl; //prints {11,22,33} |
|||
cout<< (v | filtered(&is_even) ) <<endl; //prints {4,2} |
|||
cout<< (v | partitioned(&is_even)).first <<endl; //prints {4,2} |
|||
cout<< (v | partitioned(&is_even)).second <<endl; //prints {3,1,5} |
|||
(v | back) = 2; //assing a value to the last element |
|||
//(v | back_value) = 3; //doesn't work because back_value returns a copy (and not the original element) |
|||
cout<< (equals(v, initial_values(1,2,3)) <<endl; |
|||
return 0; |
|||
} |
|||
= Other Resources = |
|||
What the program does is to initialize a matrix in different processes, calculate local power of the data, do a global Fourier transform of this matrix (in place) and the compute the local power of the transformed data. The global power (sum of locals) is equal in the original data and in the transformed data. |
|||
* Online book on selected Boost libraries http://en.highscore.de/cpp/boost/index.html |
|||
Also note that although this is our first parallel mpi program the real mpi communication is performed by the fftw and not by us. Rarely we will want to deal with passing data ourselves between processes, libraries like fftw and Lapack will do a faster/better job than us. The idea here is to set the environment for the usage of those highly tunned numerical libraries. |
|||
* Reference on C++ and the Standard Template Library http://www.cppreference.com/wiki/start |
|||
* The classic SGI Standard Template Library Programmer's Guide http://www.sgi.com/tech/stl/ |
|||
* Boost Users group list http://groups.google.com/group/boostusers?hl=en |
|||
Latest revision as of 09:58, 23 July 2014
DRAFT (written by Alfredo Correa Copyright 2009)
This document gives a practical introduction to the BOOST Libraries oriented to common tasks in scientific computing. For those using C++ as a programming language in general, BOOST is a very useful set of libraries that help programming in C++ by providing tools and commonly used idioms that solve recurring problems that emerge when developing in this language, even at a fairly low level. It also encourages certain style of programming, where generality and efficiency are priority. These tools include macro definitions, template functions and classes, functions, generic data containers, and common algorithms. It can be viewed as a natural extension of the C++ Standard Template Library (STL).
General Remarks
BOOST, like C++, has a very steep learning curve and the syntax can become sometimes very ugly and complicated, but it faster to learn BOOST than to reinvent the wheel and come with buggy home-brew solutions that at the end of the day will look even more ugly and won't be that good anyway. (By good I mean generic and efficient.) Note that I am not comparing BOOST and C++ with other tools like Matlab, or Fortran, or Python. Those languages are likely to do better (and quick development) than C++ in many specific areas. Boost also closes the gap with those languages without introducing modifications to the core C++ language.
Libraries included in Boost range from very simple ones (like Boost.Conversion) to very complex ones (like Boost.MPI and Boost.BGL).
The aim of this document/tutorial is to incrementally document the utility and working practical examples. Extra documentation added by users is very welcome, specially in the area of non-trivial but concise examples, related to our common programming problems (i.e. scientific computing).
Topics covered in the first stages of this document will include: Boost building and installation, Boost.MultiArray and Boost.MPI. That are the libraries that I have to deal with at this point. Ideally I would like to add Boost.uBlas in the future and document examples of small libraries that are very usefull for numerical tasks. The focus will be to document the interaction of these libraries with numerical libraries such as FFTW and Lapack, and above all, the documentation of practical example code.
BOOST build and installation
It is a bless that most Linux distributions now include some versions of Boost, at least the compiled libraries. Many Boost libraries are header-only, which means that no linking is necessary (an #include<boost/ ... > line set us ready to use a particular library). For the rest of the libraries, linking is not hard either, it is just a matter of adding something like "-lboost_NAME" during compilation (with certain naming convention). The current tutorial has been tested in wcr.stanford.edu, with GCC 3 and in my personal Ubuntu with GCC 4. Some libraries seem to compile a little better (fewer warnings) with GCC 4. If you are going to install GCC 4, do it before going ahead with this tutorial.
The difficulty with linking is that most Boost distributions in the operating system are outdated; for example the current version of Boost is 1.37, while the most recent distributions come with version 1.34 at best. Once in a while we will need a library not included in older releases (for example Boost.MPI). For "header-only" libraries the situation is not too bad because the header files can be downloaded and they ready to use. For binary libraries we need to compile them. There are tools to compile specific libraries but it will be easier for the moment to compile the whole Boost set.
Below are the instructions for version 1.37/8 or 1.39. An alternative build instructions (including Windows system) is provided here.
Also if you need Boost.MPI install the mpi compiler before continuing.
Versions 1.37 and 1.38
Before going thorugh these instructions be aware that if you need these versions of boost in your computer you can check whether your distribution already ships with them. For example Ubuntu 9.10 ships with Boost 1.38.1 and it can be installed just by doing
sudo apt-get install libboost-dev
which doesn't include the mpi libraries by default.
If the aforementioned instructions doesn't work or if you need Boost.MPI proceed as follows:
As usual, to be kind with our usage in administrated system, we will install Boost in our user space, for example in $HOME/usr.
mkdir $HOME/usr
the directories ~/usr/include/boost and ~/usr/lib will be created and populated after the building of Boost. Be brave and install the last current version from the link bellow:
mkdir $HOME/soft cd $HOME/soft wget http://internap.dl.sourceforge.net/sourceforge/boost/boost_1_37_0.tar.gz
If the download fails, try to download the last version from Sourceforge Boost download page.
Then decompress it:
tar -zxvf boost_1_37_0.tar.gz cd boost_1_37_0 ./configure --prefix=$HOME/usr
Append the following quoted line to user-config.jam to include Boost.MPI among the installed libraries.
echo "using mpi : mpicxx ;" >> user-config.jam
where mpicxx can be replaced by your mpi compiler wrapper, for example "/usr/lib/mpich/bin/mpiCC", or first selected interactively with "mpi-selector-menu" if is available. Then compile and install the package
make make install
Go lunch, that takes ~30 minutes. After successful compilation and installation, all the header files will be installed in ~/usr/include/boost-1-37/boost/*.hpp and the binary linkable files will be in ~/usr/lib/libboost_*.
Depending on how you want to compile your programs it might be necessary to add these directories to the include path of compilation for example, LD_LIBRARY_PATH. It can also be useful to make a link to the more standard usr/include/boost/:
cd ~/usr/include ln -s boost-1-37/boost boost
Versions 1.39 and up
You can download an specific version of Boost (To see what is the last version of Boost go to http://sourceforge.net/projects/boost/files/boost)
You need subversion, for example
sudo apt-get install subversion
BOOST_VER=1_47_0 mkdir -p $HOME/usr mkdir -p $HOME/soft cd $HOME/soft wget http://downloads.sourceforge.net/boost/boost_$BOOST_VER.tar.gz tar -zxvf boost_$BOOST_VER.tar.gz cd boost_$BOOST_VER
or you can use the SVN to download (and update) the latest version. You need subversion for that (for example sudo apt-get install subversion
mkdir -p $HOME/soft/boost.svn cd $HOME/soft/boost.svn time svn co http://svn.boost.org/svn/boost/trunk grep BOOST_LIB_VERSION trunk/boost/version.hpp #will show the nominal version of the lib cd trunk
(20 minutes)
The installation can slightly benefit from having libicu and other basic libraries installed, for example
sudo apt-get install libicu-dev libbz2-dev python-dev
Then execute
./bootstrap.sh --prefix=$HOME/usr --exec-prefix=$HOME/usr`uname -m` #if you need mpi, replace "mpicxx" by your mpi compiler wrapper (e.g. mpicxx.mpich, or /opt/mvapich-gnu-gen2-0.9.9/bin/mpicxx) echo "using mpi : mpic++ ;" >> project-config.jam NUM_CORES=`cat /proc/cpuinfo | grep processor | wc -l` time ./bjam -j $NUM_CORES cxxflags=-fPIC | tee bjam.out rm -rf ~/usr/include/boost/* ~/usr`uname -m`/lib$CLUSTER/libboost_* ./bjam install
It takes ~30 minutes (or ~15 minutes in two processors or ~3 minutes in 8 processors).
Then create the following link (not necessary in version >=1.40):
cd ~/usr/include ln -s boost-1_39/boost boost
Boost 1.40 can be installed in Ubuntu 9.10 with the line
sudo aptitude install libboost1.40-all-dev
which includes Boost MPI (compiled with openMPI).
Add the following variable to your environment
export LD_LIBRARY_PATH=$HOME/usr/lib$CLUSTER
If downloading from SVN, Boost can be easily updated by doing
cd ~/soft/boost.svn/trunk grep BOOST_LIB_VERSION boost/version.hpp #will show the last nominal version of the lib svn update grep BOOST_LIB_VERSION boost/version.hpp #will show the new nominal version of the lib ./b2 install -j 2
To delete the installation:
rm -rf ~/usr/include/boost ~/usr/lib$CLUSTER/libboost_*
Versions 1.48 and up
bjam is replaced by b2.
If you are planning to use c++0x standard it may be better to compile with c++0x support:
./bootstrap.sh --prefix=$HOME/usr NUM_CORES=`cat /proc/cpuinfo | grep processor | wc -l` time ./b2 -j $NUM_CORES cxxflags='-fPIC -std=c++11 -O3' | tee bjam.out #rm -rf ~/usr/include/boost/* ~/usr`uname -m`/lib$CLUSTER/libboost_* ./b2 install
(take only 3 minutes in 8-cores computer)
Possible Issues
Boost compilation may require bzlib library installed (libbz2-dev package in Ubuntu, if it is not available use the -sNO_COMPRESSION=1 as a bjam option), or if it complain about python headers install python development (python-dev pack in Ubuntu)
Intel compiler for Boost.MPI
The following steps allows us to compile boost-mpi 1.42.0 library using intel compiler (on su-ahpcrc).
tar zxvf boost_1_42_0.tar.gz cd boost_1_42_0 ./bootstrap.sh --with-toolset=intel-linux --with-libraries=mpi --prefix=$HOME/usr-intel --libdir=$HOME/usr-intel/lib echo "using mpi : mpicxx ;" >> project-config.jam vi tools/build/v2/tools/mpi.jam time ./bjam -j $NUM_CORES --with-mpi toolset=intel | tee bjam.out ./bjam install --prefix=$HOME/usr-intel
In the line with command vi, we need to modify the mpi.jam file (at line 208) to something like:
if $(otherflags) {
for unknown in $(unknown-features)
{
result += "$(unknown)$(otherflags:J= )" ;
}
}
The line in the original mpi.jam file reads:
result += "$(unknown)$(otherflags)" ;
This bug fix is discussed at here.
Also see here.
bjam "-sINTEL_PATH=/opt/intel/cce/10.0.014/" -sTOOLS=intel-linux -sINTEL_CC="icc -xT -O3 -no-prec-div " -sINTEL_CXX="icc -xT -O3 -no-prec-div" "-sBUILD=release <debug-symbols>on" >& bjam-build-icc-xT-O3-no-prec-div.log &
Remove Boost
To remove an old or incomplete installation
rm -rf ~/usr/lib/libboost_* ~/usr/include/boost*
Boost.Array Library
Boost.Array is one of the simplest Boost libraries yet it solves a common problem in C++. That is that C++ fixed size arrays are not objects but simple pointers whose size (number of elements) cannot be passed to functions unless we add an argument to the function with the number of elements.
double a[] = {1,2,3}; double* b=0; // Typical C-arrays
double f(double* c, int c_size){ ... } // Typical function taking C-array by reference
To alleviate the problem the Array library defines a class called boost::array<T,N> where T is the type of the elements and N is the (fixed) size of elements.
The syntax is very easy but it can be tricky sometimes
boost::array<double, 3> a; a[0]=1; a[1]=2; a[2]=3;
They can be used as other STL-containers:
assert(a.size()==3); for(double* i=a.begin(); i!=a.end(); ++i) std::cout<<*i<<std::endl;
The corresponding (internal) mutable C-array can be accessed by the member function .c_array(). The member .data() is a constant C-array view. Arrays of the same size can be compared, swapped, copied, filled, etc (see detailed specification), things that can not be directly done with C-arrays.
Comparison with other containers
std::vector<T> in some sense plays a similar role and it has the same memory layout (stored elements are contiguous in memory). But in the case of std::vector the size is not set at compile time but it is dynamic. This extra flexibility comes at a cost. First, there is a (small) overhead in std::vector. Second, sometimes we want to express that some array has a constant size.
Programmatically, static information of this type can be accessed with ::static_size and ::value type (http://codepad.org/gOnoJebu).
Some technical differences with other containers: 1) initial elements may have an undetermined value even if they have default constructor 2) there is no efficient swap() (no constant complexity) 3) no allocator support.
Initialization
Canonical
There not much else to say about this library. The only tricky part is how to construct and define a boost::array 'in place'. To do so we still have to use the old literal C++-fixed arrays and put a double curly bracket:
boost::array<double, 3> a={{1,2,3}};
Let's say we have a function 'f' that takes a boost::array<int,3> and we want to call the function but do not want to create an explicit temporary, in this case some intuitive syntaxes don't work
f({{1,2,3}}); //doesn't work
f(boost::array<int, 3>(1,2,3)) //doesn't work
f(boost::array<int, 3>({1,2,3})) //doesn't work
f(boost::array<int, 3>({{1,2,3}})) //doesn't work
what eventually works is the following (see http://codepad.org/IEHmUEmu)
f((boost::array<int,3>){{1,2,3}}); //does work for some compilers, note the parenthesis
This trick is undocumented and it works in modern versions of the compilers, for example in g++ 4.3.3. For old compilers, a named variable has to be created:
boost::array<int,3> tmp={{1,2,3}};
f(tmp);
Another problem is that if we initialize the array with less elements than necessary the compiler will not complain.
boost::array<int,3> a={{1,2}}; // a[2] may be initialized to zero without warning
Any C-array can be used to be copied to a boost::array but that requieres an additional line of code and it is not strictly an initialization.
boost::array<int, N> a; std::copy(carr, carr+N, a.begin());
Besides the tricky syntax for initialization, these objects are useful to explicitly say that some arrays are not just pointers and that they have a size that is determined during compilation and does not depend on the state of the program. Also we can forget about the problems associated with built-in fixed arrays and pointer errors.
Using Boost.Assign
Alternatively a library called Boost.Assign can be used
#include<boost/assign/list_of.hpp> namespace assign = boost::assign; boost::array<double,3> a=assing::list_of(1.)(2.)(3.);
Boost.Assign works with boost::array, but it is also more general. The same method can be used to initialize other containers, such std::vector, std::list, or std::set
using boost::assing::list_of; const std::list<int> primes = list_of(2)(3)(5)(7)(11);
It is one of the few ways to initialize constant (const) containers.
std::maps can by initialized with map_list_of(key1,value1)(key2,value2)...(keyN,valueN), containers of tuples can be initialized with tuple_list_of.
Although not strickly connected with the problem of initialization the library can also be used to assign values to containers very easily. Without the library we usually need something like
std::vector<int> a(10); a[0]=1; a[1]=2; ... a[9]=10;
With the library this long code can be replaced by
#include <boost/assign/std/vector.hpp> #include <boost/assert.hpp> using namespace boost::assign; //brings operator+= for std::vector into scope vector<int> v; v += 1,2,3,repeat(10,4),5,6,7,8,9; // v = [1,2,3,4,4,4,4,4,4,4,4,4,4,5,6,7,8,9]
(This doesn't work with boost::array, because arrays can not grow dynamically).
Replacing raw c-arrays
There are certain advantages in using boost::array instead of plain c-array. For example, boost::array are first class objects, they can be duplicated, or copied, and there is no complications with pointers. If you ever programmed in C you will remember how easily c-arrays decay into pointers, loosing track of its lengths, and how much code is then devoted to take care of the pointed elements. Boost.Array also provides range (out of bound) checking in debug mode. Also, boost::array can be used where other STL containers and algorithms can be used.
Then, a question arises, if have a lot of code that uses plain old c-arrays, for example, in function calls
void f(unsigned size, const double c[])
how do we pass a boost array a. The answer is that the old function can be called in three different ways:
f(a.size(), a.data()); //gives a pointer to const element f(a.size(), a.c_array()); //gives a raw pointer whose pointee can be modified f(a.size(), &a[0]); //but it is not clear whether a elements can be modified in the
The other way around, in which we write a new function that takes advantage of boost::array
g(boost::array<double, 3>& a);
but the code is handling c-arrays from the caller side. Then one possibility is to
double c[3]; ... boost::array<double, 3> const& c_as_barray = *(boost::array<double, 3>*)c; // = (boost::array<double, 3>&)c; may not work g(c_as_barray);
This trick can be used even if the function modifies the argument, modifications are reflected in the original array.
The function can be generalised to different sizes by using templates
template<unsigned N>
g(boost::array<double, N>& a){ ... }
and calling g<3>(a) . If the size is not known at the time of coding (because the size is a variable) then boost::array is not the right container, you can use std::vector instead (part of STL).
g(std::vector<double>& a) ...
std::vector should be used when the size of the array is only known at runtime, otherwise use boost::array.
Static Multidimensional arrays via Boost.Array
Another feauture of boost::array is that array of arrays (of arrays...) can be defined, and intialized with its elements.
using boost::array;
array<array<double, 2>,3> a={{
{{1,2}},
{{3,4}},
{{5,6}}
}};
then a[0..2][0..1] can be used. (Most brackets could be removed in GCC 4.). This should replace the c-array
double ca[3][2] = {1,2,3,4,5,6};
In both cases the elements will be continuous in memory.
(A similar effect also can be achieved with vector<vector<double> >, the important difference is that in this case each vector can have a different size and the elements are not contiguous.)
A shortcut can be defined with template typedefs:
struct array2<class T, unsigned D1, unsigned D2>{
typedef array< array<T, D2>, D1> type;
... //optional metainfo
};
...
array2<double, 2, 3>::type a;
See example code here http://codepad.org/Gp2AOP92.
Using boost::array in general is slightly better than using c-arrays because boost::array can never be confused with a pointer. For better support of multidimensional arrays (or for dynamic sizing) read the following section.
Printing Boost.Array
Boost.Array has no default print mechanism, even if the contained elements are printable (i.e. std::cout << elem code is valid). One option is to write a loop that prints each element. An easier options is to use Boost.Fusion (covered in another section). Just by including
#include <boost/fusion/adapted/boost_array.hpp> #include <boost/fusion/sequence/io.hpp> using namespace boost::fusion; // or using boost::fusion::operator<<; // important!
int main(){
boost::array<double, 3> a = {1.,2.,3.};
std::clog << a << std::endl;
std::clog << tuple_open('[') << tuple_close(']') << tuple_delimiter(", ") << a << std::endl;
std::clog << "input array here [respect default format (x y z) !!] : ";
std::cin >> a;
std::clog << a;
return 0;
}
This works out-of-the-box as long as element_type is printable. An added benefit compared to home brew print functions is that it works also for input streams (default format) and also that the format of the delimiters can be specified.
C++0x Array
The standard library of the C++0x compiler includes a class called std::array that is almost the same as boost::array. However the interface is not exactly the same, (for example there is no .data member in std::array and static_size is replaced by tuple_size<.>::size) and also. strictly speaking they are different types (for example the printing code using Fusion doesn't work). But besides these differences they can be used interchangeably. If you want your code to work with Boost.Array and also with with std.array one strategy can be to exploit the using declaration:
#if __cplusplus > 199711L // new C++ standard and library is used by the compiler #include<array> // std::array using std::array; #else #include<boost/array.hpp> using boost::array; #endif
array<double, 3> arr = {1.,2.,3.}; //use std/boost array without the namespace
Boost.MultiArray Library
The MultiArray is a library solves a very annoying problem with C/C++ for our area. C/C++ have a very limited support for arrays and the situation is even worst for multidimensional arrays (2, 3 or more indices). Passing arrays to functions is very error prone, and functions with endless parameters with array dimensions must be continuously passed around. Moreover, writing algorithms for arbitrary array dimensionality is very hard. For one-dimensional arrays we have Boost.Array, std::vector and std::valarray. For multidimensional arrays we have Boost.MultiArray.
The C++ (or C) language was designed to be very flexible, there are infinite ways to arrange multidimensional arrays in memory and infinite ways to resize them, so the C++ language did not enforce any particular standard. Yet for scientific computing (which is only a subset of all computing areas), we can agree that there are a few multidimensional arrangements that make sense in numerical problems: namely, contiguous blocks memory with either column-major ordering (Fortran-convention) or row-major ordering (C-convention).
Among this reasonable constrains of the memory layout, MultiArray provides a very flexible, very small and nice interface. The official MultiArray manual provides rather simple but useful examples to begin with. Below there are more examples more suitable for existing numerical programs.
From the simple examples it can be noticed MultiArray usage is very similar to the default C-arrays with few very convenient differences 1) no explicit allocation (ie. malloc/new) is needed 2) no explicit deallocation (free/delete[]) 3) indices ordering can be C-like or Fortran-like (useful for interfacing with Fortran routines) 4) Arrays keep track of their shape (no separate variables with size are needed).
The typing system of MultiArray is quite complicated and can be confusing, one way to alleviate this problem is to make extensive use of typedefs. (You may seldom benefit from using Template typedefs.)
Basic Usage (multi_array)
MultiArray is a header-only library -- no special linking is necessary. You just need to include the header file:
#include<boost/multi_array.hpp>
The usage of the multi_array class is pretty intuitive. First we declare and construct an variable of type multi_array, specifying its dimensionality (2) and its dimensions (100x100).
boost::multi_array<double, 2> A(boost::extents[10][10]);
The number of arguments of extents has to agree with the dimensionality. Then you can use the matrix as usual,
for(int i=0; i!=A.shape()[0]; ++i)
for(int j=0; j!=A.shape()[1]; ++j)
A[i][j]=i+j;
The elements can be accessed by a sequence of square brakets '[]' or passing a collection of integers with the parenthesis operator '()' as in this example example:
boost::array<int, 2> idxs={{1,2}};
A(idxs)+=1.2;
For an N-dimensional array we would need N similar loops. For any dimensionality, the elements can be iterated in its underlying memory representation:
for(double* it=A.origin(); it!=A.origin()+A.num_elements(); ++it) (*it)*=3;
In this example, the whole multiarray is multiplied by 3, regardless of its dimensionality. Note that num_elements()==shape()[0]*shape()[1]...shape()[N].
The dimensionality of a *particular array* can be known by using A.num_dimensions(). The dimensionality of the *type* can be known by using the static constant "::dimensionality".
typedef boost::multi_array<double, 4> marray; marray A; assert( A.num_dimensions() == marray::dimensionality );
The official documentation and this paper have additional examples and some explanation of the library internals.
Using MultiArray on existing code together with C-arrays (multi_array_ref)
MultiArray can be adapted easily in programs that already use raw-arrays since the first element to the contiguous array in memory can be known for functions that expose the pointer to the array in memory; the shape of the array can be also accessed:
boost::multi_array<double, 2> A(boost::extents[10][10]); // allocates memory for 100 elements ... double* A_ptr = A.origin(); int A_N0 = array.shape()[0]; // A_N0 <- 10 int A_N1 = array.shape()[1]; // A_N1 <- 10 ... // use the array in a typical c-function somecfunction(A_N0, A_N1, A_ptr); // or as cfunction(A.shape()[0], A.shape()[1], A.origin() ); ... // .. no need to free memory, it is deallocated when A goes out of scope
Such function can even modify the elements of the array. In the example the array memory is managed by the multi_array A and therefore the memory is freed when A goes out of scope. Sometimes, for existing applications, this is not possible because the memory is managed in some other complicated way. In that case we can still take advantage of Boost.MultiArray by using the boost::multi_array_ref class, that is the same as a boost::array except that it does not manage the underlaying memory:
int A_N0 = 10; int A_N1=10; double* A_arr = new double[N1*N2]; // or = malloc(sizeof(double)*A_N0*A_N1); ... boost::multi_array_ref<double, 2> A_ref(A_arr, boost::extents[A_N0][A_N1]); ... // access A_arr through A_ref or call a MultiArray aware function somemultiarrayfunction(A_ref); ... delete[] A_arr; //or free(A_arr); (do not use A_view after this point!)
In this way we can go back and forth form C-arrays (raw-arrays) to boost::multi_array. Generalizations to higher dimensions are similarly easy. (One of the strong points of MultiArray is that dimensionality can be arbitrary). Let me stress that in the first case the memory is automatically managed for us, while in the second case we have to take care of that by some external (even manual) process.
A simple FFTW example
In the context of numerical libraries the following is an example to call FFTW3 in combination with MultiArray. (Remember to Install FFTW3 first.)
boost::multi_array<std::complex<double>, 2> A(boost::extents[10][10]);
for(int i=0; i!=10; i++)
for(int j=0; j!=10; j++)
A[i][j]= std::complex<double>(i,j);
...
fftw_plan p=fftw_plan_dft_2d(A.shape()[0], A.shape()[1], A.origin(), A.origin(), FFTW_FORWARD, FFTW_ESTIMATE); // in place
fftw_execute(p);
fftw_destroy_plan(p);
Advanced interface of FFTW take stride arguments as well, this is not a limitation for MultiArray since, for example, array_view's (see below) also keep track of internal strides (e.g. A_center.stride()[n] is the stride in direction n of the referenced memory).
Generic code, but not too generic (Boost.Utility enable_if and Boost.TypeTraits)
As soon as we start working with MultiArray we will recognize than there are several kinds of multiarrays: 1) multi_array, 2) multi_array_ref, 3) const_multi_array_ref and in general they are useful in different contexts. The first is the plain object that manages its own memory, the second is brings read write access to an existing C-array and the last one gives read-only access. On top of that when specifying partial indices (e.g. a[3] in a 2-dimensional array) the type is 4) subarray, and if we do this is a const_ version we get a 5) const_subarray. And all these types are formally different. This complicated type system has the drawback that there is no way to write a generic function that takes any of them unless we use templates.
So instead of repeating 5 versions of some function
return_type some_function(multi_array<double, 2>& ma){ ...
return_type some_function(const_multi_array<double,2>& ma){ ...
return_type some_function(multi_array_ref<double,2>& ma){ ...
return_type some_function(subarray<double,2>& ma){ ...
we write a single template function
template<class MultiArray>
return_type some_function(MultiArray& ma){ ...
return_type is just to indicate the return type of the function, it could be for example void.
When calling this function it will work with any of the mentioned types. This solves the problem of the multiple types as long as the actual deduced type behaves like a multi_array the program will work.
The problem is that now the function is too generic. In our first five versions of the function we were specific enough to only work with arrays of doubles but now we lost that constrain and it maybe that the function logically shouldn't work for not-real numbers.
For example the some_function could rely on comparing (ordering) different elements, if that is the case then when using the function with an array of complex<double> the compiler will complain. This not an ideal solution because the error will usually appear in mysterious places. Or it can be more subtle than that the function may rely and we may not even get an error from the compiler but silently have a bug.
Here is were enable_if (part of Boost.Utility) allows us to constraint function arguments is some situations. What enable_if does when within the function declaration/specification (see later) is to disable or enable a particular instantiation of the template function depending on some condition. For example this condition can be that the elements of the array is a double (an nothing else). There are several ways to use enable_if, but there are mainly two for functions definitions (the example described here). Continuing with the example, we can
#include <boost/utility/enable_if.hpp> #include "boost/type_traits.hpp" // for is_same using boost::enable_if_c; using boost::is_same
1) Replace the return type (return_type) with an enable_if
template<typename MultiArray>
typename enable_if_c<is_same<typename MultiArray::element, double>::value,
return_type>::type return_type some_function(MultiArray& ma){ ...
This is the preferable syntax since the condition is right before the function name. Another possibility is to
2) Replace and input parameter with an enable_if that on success gives the type of that original parameter
template<typename MultiArray>
return_type some_function(
typename enable_if_c<is_same<typename MultiArray::element, double>::type,
MultiArray >::type & ma){ ...
This can be also preferable because the constrain on the generic type MultiArray is next to the parameter itself.
Option 1) is not always applicable: some special functions don't have return types (i.e. class constructors), in this case we can
3) Add a dummy optional parameter
template<typename MultiArray>
return_type some_function(
MultiArray& ma, typename enable_if_c<is_same<typename MultiArray::element, double>::value, void*>::type = 0
){ ...
The '=0' makes the last parameter optional which means that we can still use the function by passing the usual parameter. This optional parameter is not meant to be used at all during the function call.
In the three cases the effect is the same, function is invisible if we try to call it with a kind of multi_array whose elements are not doubles.
boost::multi_array<double, 2> adbl; some_function(adbl); //ok, some_function will be called boost::multi_array<std::complex<double>, 2> acmplx; some_function(acmplx); // compiler error, no such function (i.e. not for std::complex arrays)
In the example we constrained the multiarray by its elements type and relied in the the function is_same from Boost.TypeTraits library. is_same<T1,T2>::value is (a constant) true if T1 is identical to T2 and (the constant) false otherwise. Of course the value is obtained at compilation time.
We could have constrained the function to take only arrays with certain dimensionality, is that case we could also have used enable_if_c<MultiArray::dimensionality==2, ...>::type as an improvement over more crude approaches like compile-time checking BOOST_STATIC_ASSERT(MultiArray::dimensionality==2).
One difference with respect to BOOST_STATIC_ASSERT is that when using enable_if the kind of error given by the compiler is the usual 'no matching function'. In short, use static assert to force the compiler to issue an error directly if the type is not correct but use enable_if if you want the compiler to try with a different (overloaded) function and if this fails then issue a function not found error.
The reason that makes enable_if work is too complicated to explain here, but I only will mention that it is because of a feature of C++ called SFINAE[1] (an acronym that doesn't seem to describe what it's supposed to describe). Luckly nowadays most compilers support this feature. I admit that the syntax becomes so complicated that I wouldn't call it a feature but an embarrassment of the language. Another rough corner of C++.
Fortran ordering and Fortran functions (fortran_storage_order)
Functions in libraries like Lapack expect arrays in column-major storage order format, this option of storage is supported by Boost.MultiArray. (For linear algebra matrices it is more recommendable to use Boost.uBlas which is more oriented to linear algebra operations and one or two dimensional arrays, i.e. vectors and matrices.)
The ordering of the elements in memory is controlled by an option that can be set when declaring the array. (By default, if not specified, boost::c_storage_order is used.)
boost::multi_array<double,2> A(boost::extents[10][10], boost::fortran_storage_order() ); ... // assign A value with Fortran index ordering
In order to imitate Fortran convention, there is even an option to declare the multi_array to be 1-based (instead of 0-based) (see below).
We then can call a Fortran function as:
int A_N1=A.shape()[0]; int A_N2=A.shape()[1]; fortran_function(&A_N1, &A_N2, A.origin()); // Fortran arguments are passed by reference(pointer)
or write a wrapper to the Fortran function
wrapper_function(boost::multi_array<double, 2>& in){
int in_N1=in.shape()[0]; int in_N2=A.shape()[1];
fortran_function(&in_N1, &in_N2, in.origin()); // Fortran arguments are passed by reference(pointer)
}
and then call it simply by doing 'wrapper_function(A);'.
Calling a Fortran function that takes stride arguments can be done by means of subarray's and this case is not covered here.
The difference between c_storage_order and fortran_storage_order is illustrated in this paste.
Lapack examples
This section shows how to combine fortran_storage_ordering, ::shape() and ::origin() in order to call Fortran routines. There are better ways to use linear algebra routines from C++, like Boost.uBlas, this section is intended to show the usage of the multi_array.
Note that the Lapack routines are called twice, this is because the first call is used to estimate the working space, which is allocated manually. The real work is done by the second call. Lapack functions are declared using the 'extern "C" void' prefix.
Singular Value Decomposition
The design of the SVD class for decomposing complex matrices follows the one from Numerical Recipe 3rd edition (page 67) and the actual lapack calling is taken from this example. The following is the complete program, note the use of fortran_storage_order in order to maintain the fortran convention. The decomposition (hard work) is called from the constructor of the lapack::svd class, then the decomposition matrices are the member multi_arrays, u (), vh () (2-dimensional) and w () (1-dimensional vector) (). The contents of the original matrix are destroyed.
#if 0
c++ -I$HOME/usr/include -D_TEST $0 -L$HOME/usr/lib -llapack -lptf77blas -latlas -lgfortran -lpthread -Wfatal-errors && ./a.out
exit
#endif
#include<boost/multi_array.hpp>
#include<iostream>
/* ZGESVD prototype */
struct _dcomplex{double re, im;};
extern "C" void zgesvd_( char* jobu, char* jobvt, int* m, int* n, _dcomplex* a,
int* lda, double* s, _dcomplex* u, int* ldu, _dcomplex* vt, int* ldvt,
_dcomplex* work, int* lwork, double* rwork, int* info );
namespace lapack{
using boost::multi_array;
using boost::extents;
class svd{
public:
multi_array<std::complex<double>, 2> u;
multi_array<std::complex<double>, 2> vh;
multi_array<double, 1> w;
svd(multi_array<std::complex<double>, 2>& a) :
u (extents[a.shape()[0]][a.shape()[0]], boost::fortran_storage_order()),
vh(extents[a.shape()[1]][a.shape()[1]], boost::fortran_storage_order()),
w (extents[std::min(a.shape()[0], a.shape()[1])]){
int lwork = -1;
std::complex<double> wkopt;
std::vector<double> rwork(std::max(1u, 5*std::min(a.shape()[0], a.shape()[1])));
int info=0;
char All='A';
zgesvd_(&All, &All, &(int&)a.shape()[0], &(int&)a.shape()[1],
(_dcomplex*)a.origin(), &(int&)a.shape()[0],
w.origin(), (_dcomplex*)u.origin(), &(int&)a.shape()[0],
(_dcomplex*)vh.origin(), &(int&)a.shape()[1],
&(_dcomplex&)wkopt, &lwork, &rwork[0], &info);
lwork = (int)wkopt.real();
std::vector<std::complex<double> > work(lwork);
zgesvd_(&All, &All, &(int&)a.shape()[0], &(int&)a.shape()[1],
(_dcomplex*)a.origin(), &(int&)a.shape()[0],
w.origin(), (_dcomplex*)u.origin(), &(int&)a.shape()[0],
(_dcomplex*)vh.origin(), &(int&)a.shape()[1],
&(_dcomplex&)work[0], &lwork, &rwork[0], &info);
assert(info<1);
}
};
}
int main(){
boost::multi_array<std::complex<double>,2 > a(boost::extents[2][2], boost::fortran_storage_order() );
double data[2][2][2]={
{{1., 0.}, {1., 0}},
{{2., 0.}, {2., 0}}
};
a.assign((std::complex<double>*)data, (std::complex<double>*)data + a.num_elements());
lapack::svd uvh(ma);
std::clog
<<"u = {{"<<uvh.u[0][0]<<","<<uvh.u[0][1]<<"},\n"
<<" {"<<uvh.u[1][0]<<","<<uvh.u[1][1]<<"}}\n";
std::clog
<<"v^h = {{"<<uvh.vh[0][0]<<","<<uvh.vh[0][1]<<"\n"
<<" {"<<uvh.vh[1][0]<<","<<uvh.vh[1][1]<<"}}\n";
std::clog
<<"w = {{"<<uvh.w[0]<<",0},\n"
<<" {0,"<<uvh.w[1]<<"}}\n";
/*output
u = {{(-0.707107,0),(-0.707107,0)},
{(-0.707107,0),(0.707107,0)}}
v^h = {{(-0.447214,0),(-0.894427,0)
{(0.894427,-0),(-0.447214,0)}}
w = {{3.16228,0},
{0,1.98603e-16}}
*/
return 0;
}
QR Decomposition
Complex QR decomposition is performed in two stages, first the matrix R (upper triangular) and an intermediate representation of Q (unitary ) is obtained via Lapack ZGEQRF, second the proper Q is calculated with ZUNGQR. The design of the qrd class follows that of Numerical Recipes. After the decomposition the members q and r contains the respective decomposition (, no conjugate, no transposition). During construction, it is checked that the multi_array has fortran ordering (the algorithm doesn't give QR decomposition for c-ordering). Output arrays also have fortran ordering.
#if 0
cp $0 $0.cpp
g++ -Wl,-rpath=$HOME/usr/lib -I$HOME/usr/include -I.. -D_TEST $0.cpp -o $0.cpp.x -L$HOME/usr/lib -llapack -lptf77blas -latlas -lgfortran -lpthread -lboost_system -Wfatal-errors && ./$0.cpp.x
rm -f $0.cpp
exit
#endif
#include<boost/multi_array.hpp>
#include<iostream>
#include<mathematica/archive.hpp>
/* ZGEQRF and ZUNGQR prototype */
struct _dcomplex{double re, im;};
extern "C" void zgeqrf_(int* m, int* n, _dcomplex* a, int* lda, _dcomplex* tau, _dcomplex* work, int* lwork, int* info);
extern "C" void zungqr_(int* m, int* n, int* k, _dcomplex* a, int* lda, _dcomplex* tau, _dcomplex* work, int* lwork, int* info);
namespace lapack{
using boost::multi_array;
using boost::extents;
class qrd{
public:
multi_array<std::complex<double>, 2> q;
multi_array<std::complex<double>, 2> r;
multi_array<double, 1> w;
qrd(multi_array<std::complex<double>, 2> const& a) : // q.r = a (a is destroyed after the fact, no transpose no conjugate)
q (extents[a.shape()[0]][a.shape()[1]], boost::fortran_storage_order()),
r (extents[a.shape()[0]][a.shape()[1]], boost::fortran_storage_order()){
assert(a.storage_order() == boost::fortran_storage_order() );
std::vector<std::complex<double> > tau(std::min(a.shape()[0],a.shape()[1]));
int m=a.shape()[0];
int lda=std::max(1u,a.shape()[0]);
_dcomplex* tau_=(_dcomplex*)&tau[0];
_dcomplex work_size={0,0};
_dcomplex* a_=(_dcomplex*)a.origin();
{
int n=a.shape()[1];
int lwork=-1;
int info=0;
//see http://www.intel.com/software/products/mkl/docs/webhelp/lse/functn_geqrf.html
zgeqrf_(&m, &n, a_, &lda, tau_, &work_size, &lwork, &info);
assert(info==0);
lwork=(int)work_size.re;
std::vector<std::complex<double> > work(lwork);
_dcomplex* work_=(_dcomplex*)&work[0];
zgeqrf_(&m, &n, a_, &lda, tau_, work_, &lwork, &info);
assert(info==0);
r=a;
for(unsigned i=0; i!=r.shape()[0]; ++i)
for(unsigned j=0; j!=std::min(i, r.shape()[1]); ++j)
r[i][j]=0.;
}
{
int n=std::min(a.shape()[1], a.shape()[0]);
int k=n;
int lwork = -1;
int info=0;
//see http://www.intel.com/software/products/mkl/docs/webhelp/lse/functn_ungqr.html
zungqr_(&m, &n, &k, a_, &lda, tau_, &work_size, &lwork, &info);
assert(info==0);
lwork=(int)work_size.re;
std::vector<std::complex<double> > work(lwork);
_dcomplex* work_=(_dcomplex*)&work[0];
zungqr_(&m, &n, &k, a_, &lda, tau_, work_, &lwork, &info);
assert(info==0);
q=a;
q.resize(extents[a.shape()[0]][a.shape()[0]]); //exploits the fact that the extra elements are set to zero
}
}
};
}
#ifdef _TEST
int main(){
boost::multi_array<std::complex<double>,2 > a(boost::extents[3][3], boost::fortran_storage_order() );
mathematica::oarchive oa(std::cout);
double c_data[3][3][2]={
{{11., 0.}, {21., 0.},{32.,0}},
{{31., 0.}, {41., 1.},{21.,0}},
{{20., 1.}, {32., 2.},{45.,1.}}
};
boost::multi_array<std::complex<double>,2 > data(boost::extents[3][3], boost::c_storage_order() );
data.assign((std::complex<double>*)c_data, (std::complex<double>*)c_data+data.num_elements());
a=data;
oa<<"a = "<<a;
lapack::qrd qr(a);
oa<<"q = "<<qr.q;
oa<<"r = "<<qr.r;
return 0;
}
#endif
Orthogonalization via Cholesky Decomposition
Given a set n linearly vectors v (stored as rows in V) a new set of n vectors w (stored in the rows of W) can be generated by the transformation
where is the upper triangular Cholesky decomposition of the square matrix
in its turn is the array of overlaps.
The result is that
, i.e. is an orthonormal set, as long as rows are linearly independent.
The array is the Hermitian square of and can be obtained with Lapack ZHERK, the Cholesky decomposition () is obtained via Lapack ZPOTR, finally the orthonormal (actually unitary because the implementation is for complex numbers) set is obtained with the solution of
given by ZTRSM.
The previous examples (SV and QR decompositions) only work properly for fortran_storage_order (if the order is c_storage_order the functions don't run). In contrast this implementation of orthogonalize works for both fortran and c-ordering and is this feature what makes the code complicated. (A more complicated code can even achive column orthogonalization also). In both cases the result is a set of orthonormal rows. (The original matrix is destroyed). The function uses a temporary square array with order equal to the number of rows in the input. Other operations are in-place and the output replaces the input.
namespace lapack{
void orthogonalize_rows(multi_array<std::complex<double>,2>& v){
multi_array<std::complex<double>,2> c(extents[v.shape()[0]][v.shape()[0]], boost::fortran_storage_order() );
{// hermitian_square
char uplo=(v.storage_order()==boost::fortran_storage_order())?'U':'L';
char trans=(v.storage_order()==boost::fortran_storage_order())?'N':'C';
int n=((trans=='N') or (trans=='n'))?v.shape()[0]:v.shape()[1];
int k=((trans=='N') or (trans=='n'))?v.shape()[1]:v.shape()[0];
if(not(v.storage_order()==boost::fortran_storage_order())) std::swap(n,k);
double alpha=1.;
_dcomplex* a_=(_dcomplex*)v.origin();
int lda=((trans=='N') or (trans=='n'))?std::max(1, n):std::max(1, k);
double beta=0.;
_dcomplex* c_=(_dcomplex*)c.origin();
int ldc=std::max(1, n);
zherk_(&uplo, &trans, &n, &k, &alpha, a_, &lda, &beta, c_, &ldc);
}
{//cholesky decomposition
char uplo=(v.storage_order()==boost::fortran_storage_order())?'U':'L';
int n=c.shape()[0];
_dcomplex* a_ = (_dcomplex*)c.origin();
int lda=std::max(1, n);
int info=0;
zpotrf_(&uplo, &n, a_, &lda, &info);
if(info>0) throw std::domain_error("matrix is not positive definite");
}
{//inverse of cholesky applied to original array via solution of hermitian system
char side=(v.storage_order()==boost::fortran_storage_order())?'L':'R';
char uplo=(v.storage_order()==boost::fortran_storage_order())?'U':'L';
char transa='C';
char diag='N';
int m=v.shape()[0];
int n=v.shape()[1];
if(not(v.storage_order()==boost::fortran_storage_order())) std::swap(n,m);
double alpha=1.;
_dcomplex* a_=(_dcomplex*)c.origin();
int lda=(side=='L' or side=='l')?std::max(1,m):std::max(1,n);
_dcomplex* b_=(_dcomplex*)v.origin();
int ldb=std::max(1,m);
ztrsm_(&side, &uplo, &transa, &diag, &m, &n, &alpha, a_, &lda, b_, &ldb);
}
//now v is orthonormal (input is overwritten with orthonormal set)
}catch(std::domain_error& e){
throw std::domain_error("can not orthogonalize because vectors are not linearly independent");
}
}
The code can be used as follows,
multi_array<std::complex<double>,2> v_fortr(extents[2][3], boost::fortran_storage_order() ); multi_array<std::complex<double>,2> v_clang(extents[2][3], boost::c_storage_order()); ... fill v with values ...
both arrays formally contain 2 vectors of 3 components.
lapack::orthogonalize_rows(a_fortr); lapack::orthogonalize_rows(a_clang);
in both cases the answer is the same, the only difference is how the numbers are arranged internally in memory. The trick is achieved by passing different options to the Lapack routines depending on the known ordering instead of explicitly transposing elements --which requieres copying the full arrays-- to accomodate for one or the other ordering).
(The algorithm throws if the input is not a linearly independent set, this is detected by the Cholesky decomposition, sometimes the lack of linear dependence is not detected but the routine returns vectors with almost zero components --in particular not normalized --).
As a reference, if the input is linearly independent, the output is numerically similar to Mathematica's Orthogonalize[A], except that Mathematica accepts linearly dependent input (giving zero vectors in the output).
Arbitrary ordering (and direction)
For 2 dimensional arrays the ordering can be either fortran or C-type, for higher dimensionality there are other possibilities. Sometimes we may need to work with some arbitrary ordering, for example some 3D FFT routines do effectively transpose the first and second index in order to optimize for speed. Instead of taking care of this transposition (either by actually performing the transposition in memory or by adding some ad-hoc logic to the program), it might be convenient to just declare a different memory storage:
int ordering[] = {1,0,2};
bool ascending[] = {true,true,true}; // we can choose to completely invert the memory order by 'false, false, false'
boost::multi_array A(extents[3][4][2], boost::general_storage_order<3>(ordering,ascending));
This also works for multi_array_ref
boost::multi_array_ref Aref(ptr, extents[3][4][2], boost::general_storage_order<3>(ordering,ascending));
In this case we can treat A and Aref with the conventional ordering of the index even if some external routine is transposing the internal memory layout.
The storage and ascending order can be accessed a posteriori by storage_order(), for example we can ensure fortran-ordering by doing
assert( a.storage_order()==boost::fortran_storage_order());
We can check that all dimensions are in ascending order (independently of fortran or c-ordering) by
assert( a.storage_order().all_dims_ascending() == true);
or we can check that the second dimension (index) is ascending and that it is the slowest-index.
assert( a.storage_order().ascending(1)==true and a.storage_order().ordering(1)==0 );
finally we can check that two arrays follow the exact same convention.
assert( a.storage_order() == b.storage_order() );
Assign from c-arrays
The assign method can be used to assign element from other containers (including c-arrays) into multi_array.
multi_array<double,2> a(extents[3][3]);
double a_data[2][2] = //this is what I call a c-array
{{1,2},
{3,4}};
a.assign(a_data, a_data+4);
The trick can be used to assign complex elements too
multi_array<std::complex<double>, 2> b(extents[2][2]);
double b_data[2][2][2] =
{ { {1,0}, {2,0} },
{ {3,0}, {4,0} } };
b.assign( (std::complex<double>*)b_data, (std::complex<double>*)b_data+4);
(as a pointer to std::complex<double> b_data has effectivelly 4 elements). The trick exploits the fact that a complex number can be viewed as a struct with 2 double elements.
The ordering (X_storage_ordering) and the direction (asc/descending) does not affect how assign works. Assign fills the internal memory of the array independently of the storage convention. This is confusing because in the previous example if 'a' is declared as fortran_storage_order the transpose array is obtained.
A workaround for fortran_storage_order is to declare an intermediate array with c_storage_order and then copy the contents of one array into the other. Elements are copied element by element in the right order.
multi_array<double> d(extents[2][2], fortran_storage_order()); multi_array<double> d_corder(extents[2][2], c_storage_order()); d_coder.assign(a_data, a_data+4); d = d_coder;
Custom index ranges (reindex)
In previous examples we used some obscure entity called boost::extents, it is obviously used to pass the valid range of indices to the matrix declaration.
Technically extents is an (empty) global variable of type boost::extent_gen; this is just a trick to allow a nice syntax with arbitrary number of parameters, like passing extents[10][10][10][10] for a 4D-array.
The real power of this syntax is that we can really pass other types than just integers declaring the size in each dimension, namely we can declare ranges instead. For example, this will create a 1-based array of size 10x10.
using boost::extents; typedef boost::multi_array_types::extent_range range; boost::multi_array<double, 2> A(extents[range(1,11)][range(1,11)]);
In this case the elements A[0][0], A[0][1], A[1][0] are out-of-bounds, just like A[11][11]. Furthermore, we can create an array that is behaves completely like a Fortran-array by also indicating the ordering.
boost::multi_array<double, 2> B_Flike(extents[range(1,11)][range(1,11)], boost::fortran_storage_order() );
Let also mention that we can reindex an array on the fly by calling the .reindex method:
A.reindex(0); // now all dimensions are 0-based, valid elements 0..9 x 0..9 ... // take advantage of 0-based array A.reindex(1); // 0-based index restored, valid elements 1..10 x 1..10
After reindexing the range of each dimension i is given by A.index_bases()[i] ... A.index_bases()[i]+A.shape()[i].
(Reindixing is just a new logical rearrangement of element referencing, no copies of the matrix or the elements is made.)
The indices can start an any arbitrary point (not just 0 or 1), even negative values -- in this example (taken from here), the indexing is totally arbitrary in each dimension:
typedef boost::multi_array<double, 3> array3D; array3D A(extents[8][range(1,4)][range(-1,3)][range(10,20)]);
(Note that, as an argument, [8] is the same as [range(0,8)].)
It is also good to note that the memory allocated by an array is always located between A.origin() and A.origin()+A.num_elements() (non-inclusive).
Let's clarify the difference between A.data() and A.origin(). As we saw before origin() returns a pointer to the first element of the array, what ever this is. On the other hand, A.data() returns a pointer to the eventual location of A[0][0]..[0], regardless of whether that element is the first or not; such element may not even be part of the array (for example in 1-based arrays). In general we have to deal origin() and rarely with data(), but it is very easy to confuse one with the other.
Range Checking (and Boost.Assert)
Once the ranges for a multi_array are defined, the elements can be accessed using the square bracket operators. By default the library does check whether the indexes correspond to a valid element (index is not out of bounds). If it is not a valid element, the whole program stops with a failed assertion that looks like this:
./a.out: ... /boost/multi_array/base.hpp:136: Reference boost::detail::multi_array::value_accessor_n< ... >::access( ... ) const [with Reference = boost::detail::multi_array::sub_array< ... >, 1u>, ... , unsigned int NumDims = 2u]: Assertion `size_type(idx - index_bases[0]) < extents[0]' failed.
Thanks to the Boost.Assert facilities, this behavior can be suppressed (in which case there is not checking and overrun memory will be accessed) or changed to an C++ exception mode.
To disable the checking completely just #define BOOST_DISABLE_ASSERT before #including multi_array.hpp.
#define BOOST_DISABLE_ASSERT #include<boost/multi_array.hpp>
The library will run faster because there is no range checking each time an element is accessed.
To change the behavior such that an exception is thrown if index is out of bounds, define the tag BOOST_ENABLE_ASSERT_HANDLER and the function void boost::assertion_failed(char const * expr, char const * function, char const * file, long line) before including the MultiArray library (http://codepad.org/JgXciACG):
#define BOOST_ENABLE_ASSERT_HANDLER
#include<stdexcept>
namespace boost{
void assertion_failed(char const * expr, char const * function, char const * file, long line){
if(string(expr)=="size_type(idx - index_bases[0]) < extents[0]"){
throw std::range_error("multi_array index range error")");
}else{
throw std::runtime_error("generic multi_array error");
}
}
}
#include<boost/multi_array.hpp>
(Unfortunately the information about the actual index value that caused the error is not part of the information carried by the exception.) This change in behavior may not work if the program is linked to files already compiled without this definition, or if multi_array.hpp is included before this block is defined.
Now we are able to catch range errors:
multi_array<double, 3> a(extents[10][10][10]);
try{
a[11][-1][20]=99;
}catch(std::range_error&){ ... }
Most Boost libraries use Boost.Assert to indicate errors. With this mechanisms users can choose the way to handle these errors. Using the exception version of Boost.Assert doesn't make the program run slower and it allow us to recover the program from some types of errors.
Boost.Assert should not be confused with Boost.StaticAssert.
Bound checks for multi_arrays is more critical than bounds checking for unidimensional containers (std::vector, etc); the reason being that the logic behind the indices can be more complicated and difficult to get right (for example exchange columns by rows in the indices). std::vector provides operator[] which doesn't raise exceptions and .at() which throws an instance of std::out_of_range exception.
In most implementations std::vector<...>::operator[] doesn't check bounds at all. To have bound checks, the alternative is to change all the v[n] occurrences by v.at(n). If that is not an option the flag -D_GLIBCXX_DEBUG or #define _GLIBCXX_DEBUG before #include<vector> can add bounds check temporarily.
A third option is to use __gnu_debug::vector in the declarion (instead of std::vector) which header is #include<debug/vector>.
Accessing elements
Elements of multiarrays can be accessed in two ways, by using operator[] in sequence (as in the above examples) or by using operator() with an array of indices as described here. This last form has the advantage that if a given combination of indexes needs to be used repeatedly then it can be defined only once. For example the following code applies a certain operation that depends on a complicated combination of indices.
for(int i=0; i!=a.shape()[0]; ++i){
for(int j=0; j!=a.shape()[1]; ++j){
for(int k=0; k!=a.shape()[2]; ++k){
array<int, 3> idx={{i,j,k}}, idx2 = {{i+1==a.shape()[0]?0:i+1, j,k}};
A(idx) = f(idx) * B(idx2);
A(idx) += g(idx) + C(idx2); //f and g are functions defined elsewhere
}
}
}
Regular subarrays
One of the useful properties of raw-arrays is that *regular* subarrays can be defined simply by defining offsets and strides. MultiArray provides this kind of creation of subarray in a quite elegant manner. For example, given the array defined in the example above, the most simple subarray is the type of the object A[5] which is automatically a one-dimensional array containing a reference to the sixth (indices start at 0) row of the array, so for example (A[5])[1] = 0 changes the value of element A[5][1].
There are more complicated subarray views that can be defined in terms the original array. The interesting thing is that these are only 'views' of the original array, no copy of elements are made.
The basic examples are given in the official MultiArray tutorial. The following are some digested examples, which at the same time introduces 'typedef' statements to reduce verbosity. The following examples will create arrays from the original 2D array A:
typedef std::complex<double> complex; typedef boost::multi_array<complex,2> array2d; using boost::extents; array2d A(extents[10][10]); // 2D, 100 allocated elements
Subarrays (sub_array)
These are simply references to the original array when the first n indexes are specified. The dimension of the resulting subarray is the number of dimensions of the original array minus n. typedef array2d::subarray<1>::type subarray1d; subarray1d A_row5 = A[5]; // 1D, 10 elements
Views (array_view)
typedef array2d::array_view<1>::type view1d; typedef boost::multi_array_types::index_range range; boost::multi_array_types::index_gen indices; view1d A_row3_5to10 = A[3][indices[range(5,10)]]; // or A[indices[3][range(5,10)] view1d A_row3_5to10odd = A[indices[3][range(5,10,2)]; typedef array2d::array_view<2>::type view2d; view2d A_center = A[indices[range(2,7)][range(2,7)]]; // 2D, 25 elements view2d A_upperleft_quadrant_oddonly = A[indices[range(0,5,2)][range(0,5,2)]]; // 2D, 4 (2x2) elements view2d A_lowerright_quadrant_evenonly = A(range(5,10,2),range(5,10,2)); // 2D, 4 (2x2) elements
Last two arrays refer to odd and even elements of the original array.
Also note that something like
view1d A_error = A[indices[range(2,7)][range(2,7)]];'
is not allowed because the RHS is 2-dimensional and the LHS is declared to be 1-dimensional.
None of the arrays defined here make any copy of elements, remember that sub_arrays and array_view are just references, just another way to interpret (view) the data of original array.
Views are more general than subarrays. For example A[1] is an object of type subarray, which can be implicitly converted into a view. However A[1][indices[range(0,5)]] is an array_view object which can not be converted into a subarray.
Array views can be recursive, one can create an array view of an existing array view. For example, the even element of the even element view, is the 'every 4 elements' view of the original array.
Wrap view (code example) (array_wrap_view)
Most discrete Fourier transform routines (e.g. FFTW3 and Numerical Recipes) really compute the so-called 'in-order' transform. This means that the 'positive' frequencies are stored in the first half of the output and the 'negative' frequencies are stored in the second half of the output.
If the input data represents a continuous peridic function then applying operations on the output requires special code to treat the first and second half of the output data (quadrants in 2 dimensions). Numerical Recipes 3 (page 614), for example, implements a very basic class WrapVecDoub (one dimensional) that takes care of this ordering without affecting the output memory. In the same spirit we will implement a class wrap_array_view that takes care of the indexing in the context of array views.
namespace boost{
template<class MultiArray> class wrap_array_view;
template<typename Element, typename Value>
struct return_helper{
typedef wrap_array_view<Value> type;
};
template<typename Element>
struct return_helper<Element, Element>{
typedef Element type;
};
template<class MultiArrayRef>
class wrap_array_view{
MultiArrayRef ma_;
public:
wrap_array_view(MultiArrayRef const&ma) : ma_(ma){};
typedef typename return_helper<
typename MultiArrayRef::element&,
typename MultiArrayRef::reference>::type return_helper_type;
return_helper_type operator[](int i){
if(i<ma_.index_bases()[0]) i+=ma_.shape()[0];
return return_helper_type(ma_[i]);
}
};
}//namespace boost
When accessing each element (e.g. rows) the code takes care of shifting the index to wrap the second half of the array, the object returned is also wrapped (recursive), if the return type is an element of the original array the return type is a reference to the element instead. (Wrapped version of an element --e.g. double-- usually do not make sense, thats why the ofuscated code of return_helper).
The wrap_array_view object behaves as a reference to the original array and modifying its elements modify the original array. A small example code follows:
multi_array<double, 2> mm(extents[10][10]); wrap_array_view<multi_array_ref<double, 2> > wmm(mm); wmm[-5][-5]=111; assert(mm[5][5]==111); wmm[-1][-1]=222; assert(mm[9][9]==222);
A simple FFTW wrapper
Here I show that it is very easy to build a wrapper of known libraries that will make the syntax of the main program very easy to understand by packaging common tasks.
The packaging of common tasks is traditionally achieved by writing simple C-like functions. Sometimes need more flexibility, like the one given by C++ classes. Naturally, this will need a deep understanding of the language and of the wrapped libraries. But once written they will make easier to program.
For example, in the case of FFTW, although written in C, it is obvious that 'fftw_plan' structure behaves almost as a C++ object: 1) It has a certain state [pointers and size of arrays], 2) lifetime [until fftw_destroy_plan] 3) and can be referred recurrently during it lifetime [can be executed several times during it lifetime]. If we combine this with what we already know of Boost.MultiArray we can design a robust interface. This is a draft of the wrapper classes:
namespace fftw3{
using namespace boost;
class plan : private counted<plan>{
plan(fftw_plan const& p) : p_(p) {}
public:
template<class ConstMultiArray, class MultiArray>
plan(ConstMultiArray const& in, MultiArray const& out, int direction, unsigned flags=FFTW_ESTIMATE){
assert(boost::shape(in)==boost::shape(out));
if(in.num_dimensions()>1){
for(int i=in.num_dimensions()-1, dense_in=in.strides()[i], dense_out=out.strides()[i]; i!=-1; dense_in*=in.shape()[i], dense_out[i]*=out.shape()[i], --i){
assert(dense_in==in.strides()[i] and dense_out==out.strides()[i]); // check for dense representation of multidimensional
}
}
p_ = fftw_plan_many_dft(in.num_dimensions(), (const int *)in.shape(), 1 /*int howmany*/,
(double(*)[2])in.origin() , 0 /*const int *inembed, 0=in.shape()*/,
in.strides()[in.num_dimensions()-1] /*int istride*/, 0 /*int idist ignored for howmany=1*/,
(double(*)[2])out.origin(), 0 /*const int *onembed, 0=out.shape()*/,
out.strides()[out.num_dimensions()-1] /*int ostride*/, 0 /*int odist ignored for howmany=1*/,
direction, flags);
}
void operator()() const{return fftw_execute(p_);}
virtual ~plan(){fftw_destroy_plan(p_);}
private:
fftw_plan p_;
public:
static void cleanup(){
if(get_count()<=0) throw std::logic_error("trying to call fftw3::cleanup() with "+boost::lexical_cast<std::string>(get_count())+" active plans, should be zero");
fftw_cleanup();
}
};
void execute(plan const& p) {p.operator()();}
}
Then we can use it in the program:
boost::multi_array<std::complex<double>,2> A(boost::extents[100][100]); boost::multi_array<std::complex<double>,2> B(boost::extents[100][100]); // must have the same shape! fttw3::plan p(A,B, FFTW_FORWARD, FFTW_ESTIMATE); ... // assign values to A execute(p); // repeated times ... // no need to explicitly destroy plan
The wrapper is elegant in some aspects, 1) it is uniform for all input/output dimensionalities and for in-place/out-of-place transform, 2) special cases where array_views are passed can be handled by the same syntax without having to call the advanced FFTW functions, 3) no explicit destruction needed (no risk to execute a destroyed plan).
The wrapper can be used with the condition that the input and output arrays are stored densely in memory. (This is not the case for some array_views.) This is a limitation of the library (which was designed to take advantage of contiguous memory) and this limitation must be reflected in the wrapper. Besides that, the wrapper can deal with arrays, sub_arrays and array_views (with strides==1 if multidimensional).
In-place transform can be achieved simply by using the same argument in the first and second placeholder. The wrapper will also check for size compatibility.
Using the original arrays, there are many possible ways to use the wrapper:
fftw3::plan p(A, B, FFTW_FORWARD); // 2D FFT 100x100 fftw3::plan p(A, A, FFTW_FORWARD); // 2D inplace-FFT fftw3::plan p(A[21], B[32], FFTW_FORWARD); // 1D FFT 100 elements, fftw3::plan p(A[10][indices[range(5,10)], B[20][indices[range(5,10)]], FFTW_FORWARD); // 1D FFT on 5 elements
If not specified, the option FFTW_ESTIMATE is used. If using this default option the point at which the plan is created is unimportant. Instead, when using the options FFTW_MEASURE or FFTW_PATIENT, the location of plan creation matters. This is so because in these cases the arrays can be overwritten when FFTW looks for the optimal algorithm to use. That means that the plans should be created before the matrix is assigned values and after the matrix has the desired size and shape.
multi_array<comple,2> A(extents[100][100]); fftw3::plan p(A,A, FFTW_FORWARD, FFTW_MEASURE); //overwrites values ... // assign values; execute(p); // performs the transform
Once the plans are created over existing arrays these arrays must be not reallocated! That means not resize operations please. Assignment (global or partial) is OK because in this case MultiArray guarranties no internal reallocation is done. (Global assignment is only allowed for arrays with same sizes). The is reflects a limitation of the FFTW in which case the array pointer doesn't have to change after the plan is created.
'Many' Transforms
The wrapper can also be asked to perform multiple individual transform, by assuming that the first index of the array is used to index smaller arrays and using the following syntax:
fftw3::plan p=fftw3::plan::many(A, B, FFTW_FORWARD); // 100 1D-FFT of 100 elements each
note that this is not equivalent to the first example.
Again, the hard work is done by the library, which uses certain optimizations to perform repeated transforms of the same size.
Special Memory Allocation
FFTW recommends to use its own version of malloc to allocate arrays. This is supposedly to improve speed taking advantage of some special allocation that is exploited by the numerical routines. For example they recommend to allocate the arrays to be used in the following way:
typedef std::complex<double> complex; complex* in=(complex*)fftw_malloc(sizeof(complex)*N); //instead of malloc(sizeof(complex)*N); fftw_free(in); //instead of free(in)
Up to this point we tried not to deal with memory allocation at all.
So, the question is: How can we take advantage of allocation of memory in a context of C++? (where manual allocation can --and should-- be avoided in most cases). The answer is 'custom allocators'. For STL containers and for multi_arrays, the allocator is passed as an extra template parameter.
In this example, I pass an special allocator that uses fftw_malloc internally, for the construction of an STL vector and also for a Boost.MultiArray multi_array.
std::vector<complex, fftw3::allocator<complex> > v(100); boost::multi_array<complex, 3, fftw3::allocator<complex> > A(extents[100][100][100]);
The memory is freed (automatically) by fftw_free. After this, the object can be used like any other standard object. If the allocator is not specified, std::allocator<T> is used.
Declaring the objects in this way is optional in the same way as the usage of fftw_malloc instead of malloc is optional.
Reshape (reshape)
Although MultiArray objects can be resized, the underlying memory layout is not specially good for resizing operations. In general, for resizing, the memory is reallocated and *all* the elements copied. This is even true for operations reducing the size of the matrix (since elements have to be contiguous in memory). So, resizing operations should be avoided as much as possible and they will not be covered here (otherwise see here).
Multidimensional arrays are different from one dimensional arrays because they can be reshaped. In the library, reshape operations are different from resize in the sense that they preserve the total number of elements. The contained elements are rearranged logically but no copies or reallocations are performed. The reshape operation is done as follows:
boost::multi_array<complex, 3> a(extents[5][4][3]); //assert(a.shape()[0]==5 and a.shape()[1]==4 and a.shape()[2]==3);
{
boost::array<int,3> new_shape={{5,6,2}}; //assert(5*6*2==5*4*3);
a.reshape(new_shape); //assert(a.shape()[0]==5 and a.shape()[1]==6 and a.shape()[2]==2);
}
In this example, the reshape operation is allowed because the number of elements is preserved since 5*6*2==5*4*3. If the number of elements is not preserved the operation will fail during execution.
The following three syntaxes are not allowed due to very strange technical limitations of C++:
a.reshape({{5,6,2}}); // does not compile
a.reshape({5,6,2}); // neither
a.reshape(boost::array<int, 3>({{5,6,2}})); // neither
What eventually works (and only in new compilers) is the following syntax:
a.reshape((boost::array<int,3>){{5,6,2}}); // watch for parenthesis
or, depending on the context, it might be better to use:
using boost::array;
typedef array<int,3> array3;
a.reshape((array3){{5,6,2}});
Note 1: boost::array is not the same as boost::multi_array. boost::arrays objects are part of a much simpler library called Boost.Array. In this other library all the arrays are one-dimensional and the number after the element type indicates the (constant) number of elements and not the number of dimensions.
So far, the only constrain for reshaping is that it preserves the number of elements. The other requirement that is not very obvious is that the reshape operation must preserve the number of dimensions (in this case 3). This is not a big limitation since we can choose one of the dimensions to be 1 in size
a.reshape((array<int,3>){{5,1,12}}); // or {{5,12,1}} depending on your application
Note 2: Do not try to do something like a.reshape((array<int,2>){{5,12}}) for a 3D array to change dimensionality, it compiles but gives unexpected results. In fact, there is no way to have 'a' change its dimensionality during the program execution, since it is of type multi_array<double,3>. Another thing that is very confusing is that reshape does not work with extents; do not do A.reshape(extents[5][4][3]). Again it compiles and gives the wrong result.
If you really need an object with smaller dimensionality with the same elements you can do
multi_array<complex, 3> a3D(extents[5][4][3]); multi_array_ref<complex, 2> a2D_ref(a3D.origin(), extents[5][12]);
and use 'a2D_ref' instead of 'a3D'. In this case nobody will check that the sizes are compatible unless you add something like assert(a3D.num_elements()==a2D_ref.num_elements()); . Alternativelly we can do
multi_array_ref<complex, 2> a2D_ref(a3D.origin(), extents[a3D.shape()[0]][a3D.num_elements()/a3D.shape()[0]]);
// num_elements is always divisible by shape()[0]
to ensure that all elements are referenced.
Note 3: Of course you can always copy element by element from one type of array to another with different shape or different dimensionality. But this section is exactly about avoiding copies.
Example of reshape with FFTW
The FFTW wrapper presented earlier honors the dimensionality of the arrays passed. I mean by this that the dimensionality of the transform follows the dimensionality of the passed array instead of for example interpreting the input as a one dimensional array which would of course give a completely different result. In this section I will illustrate how to force FFTW to interpret the data differently.
For simplicity let us assume that we have a two dimensional array, and that we want to do our transforms in-place on complex data:
typedef std::complex<double> complex; using boost::multi_array; using boost::extents; multi_array<complex, 2> A(extents[100][100]); ... // fill A with data
The simplest case is that we want a 2D FFT on the 2D data structure, I present it here as a review:
fftw3::plan pf_A2A_2D(A, A, FFTW_FORWARD); // implicitly 2D transform execute(pf_A2A_2D);
Now suppose that we need to do a different thing that is to transform each row of the the array independently, the approach in this case will be different, for example we could iterate over the rows and perform a 1D transform (this is allowed because each A row is contiguous data in memory):
for(int i=0; i!=A.shape()[0]; i++){
fftw3::plan pf_Arow2Arow_1D(A[i],A[i], FFTW_FORWARD); // implicitly 1D transform
execute(pf_Arow2Arow_1D);
}
Below there is a note against using FFTW plans inside loops. A different approach, *to do the same thing* would be to use the 'many' option of the library and the wrapper. This can be faster because the library can take advantage of how the overall data is distributed.
fftw3::plan pf_Arow2Arow_many_1D=fftw3::plan::many(A, A, FFTW_FORWARD); // many 1D transforms execute(pf_Arow2Arow_many_1D);
The last option is faster, there is no explicit loop, and only one plan is created, this is more elegant because the plan can be created just after the array is constructed, this is handy if want to create plans with the FFTW_MEASURE option. (But keep in mind that this is possible only because all the 1D transforms are performed inside a single multi_array.)
Note: There is also an insidious issue with creating many FFTW plans that I discovered accidentally. The library keeps track of all the created plans (even the 'destroyed' ones). Each created/destroyed plan leaves a small memory footprint. FFTW authors claim that this is for performance optimization purposes. But when creating, using and destroying plans repeatedly, let's say few million times, the used memory will grow up and the application crash. Authors of the library say this is not a memory leak, and in part they are right, just because there a function called fftw_cleanup() [called fftw3::plan::cleanup() in the wrapper] that makes sure all that memory is reclaimed. The problem is that the function must be called at some point where it is known that there are no plans active (the wrapper will checks for this condition). It is your responsibility to call this function and since there are only few points in the program at which we know that there are no plans active, where to call cleanup(), this *effectively* behaves like a memory leak. The way to avoid this altogether is to keep the number of created plans controlled, and specially avoid creating plans inside loops. The 'many' plan can save lots of plans to be created.
Now suppose that we have our data in a 2D multi_array but that we want to interpret the data differently and perform a long 1D transform over a contiguous sequence of rows in the array. This can be motivated by special boundary conditions in the problem but also illustrates the utility of reshape.
There are two options here, one using reshape and the other using multi_array_ref (see above). By using reshape we *temporarely* rearrange the array data (logically) and perform the transform:
using boost::array;
A.reshape((array<int,2>){{1,10000}}); //remember that original was 100x100
fftw3::plan pf_Aunwrappedrows_1D(A[0],A[0], FFTW_FORWARD); // implicitly 1D
A.reshape((array<int,2>){{100,100}});
execute(pf_Aunwrappedrows_1D);
First, we reshape the original matrix to be effectively a one dimensional array. Strictly speaking A is 2D, but it has only one row A[0] and we create a 1D FFT plan for that single row that contains all the data. After the plan is created we can reshape the array again, since FFTW will remember the shape of the array. Another variant is to simply declare a plan:
fftw3::plan pf_Aunwrappedrows_2D(A, A, FFTW_FORWARD);
Although in the case the arrays and the plans are 2D, the result should be the same as a 1D transform since there is only one row.
The second option is to create a multi_array reference with a different 'view' and work on that matrix:
using boost::multi_array_ref; multi_array_ref<complex, 1> Aunwrapped_ref(A.origin(), extents[10000]); assert(Aunwrapped_ref.num_elements()==A.num_elements()); fftw3::plan pf_Aunwrapped_1D(Aunwrapped_ref, Aunwrapped_ref, FFTW_FORWARD); // inplicitly 1D execute(pf_Aunwrapped_1D);
The FFTW will modify the reference matrix which is only a different way of changing the original array but the element access is logically different.
Boost.MPI Library
I will assume that it is the case that you know the very basics of C-MPI and you want to improve the readability and design of your code. If you don't know C-MPI, you may try to learn from this examples (the syntax is very nice to learn what MPI is supposed to do --in contrast with the ugly C version--) but be aware that Boost.MPI is not widely used.
I will try to work out an example that is not trivial by using FFTW in parallel. Hopefully this approach will be more related to our application area, while the official Boost.MPI pages are plenty of (deceptively) trivial examples.
Simple MPI FFTW example
In my opinion, using a real MPI numerical library is the only way to learn MPI and make it do useful work.
The example in this section uses the MPI version of FFTW3 which must be installed prior to begin. See detailed instructions to Install FFTW3 first if necessary.
Before going to the first example let us review the command line to compile it
mpicxx -I$HOME/usr/include fft_mpi_test.cpp \
-L$HOME/usr/lib -lfftw3_mpi -lfftw3 \
-lboost_mpi-gcc43-mt-1_37 -lboost_serialization-gcc43-mt \
-o fft_mpi_test
besides the libfftw part, -L$HOME/usr/lib -lboost_mpi-gcc43-mt-1_37 -lboost_serialization-gcc43-mt do the job of linking against Boost.MPI and Boost.Serialization.
The reason we need Boost.Serialization (although we don't use it explicitly here) is that MPI is highly dependent on the serialization mechanism. Serialization is the capacity of converting any object or structure in memory into a linear stream of data. That is what we do over and over again when we save data to a file, in this case MPI communication is basically the same thing: one processes send streams of data to others. This integration allows to pass complex objects between processes without having to convert (deconstruct) the objects to numerical (or primitive) types first and then reconstructing the object on the other side as we would do it with C-MPI.
The following compilable example is based on the fftw3 simple mpi example (plain-C sources). It can be regarded as the parallel version of this serial example. In the following example we just add the Boost.MPI syntax (instead of the C-MPI):
#include <complex> /* must include before fftw3*.h */
#include <fftw3-mpi.h>
#include <boost/mpi.hpp>
#include <iostream>
namespace mpi = boost::mpi; /* if not defined have to use names as boost::mpi::... */
using std::clog;
using std::endl;
int main(int argc, char **argv){ /* initialize mpi world and fftw_mpi */
mpi::environment env(argc, argv); /* in C: MPI_Init(&argc, &argv); */
mpi::communicator world; /* this will work as MPI_COMM_WORLD */
fftw_mpi_init(); /* initialize fftw_mpi library */
const ptrdiff_t N0 = 18, N1 = 18; /* size of parallel array */
fftw_plan plan; /* this is the usual fftw plan */
std::complex<double> *data; /* in C: fftw_complex *data; this is the local (to process) data */
// given the matrix size (N0, N1) ask fftw_mpi how to split the matrix
ptrdiff_t alloc_local, local_n0, local_0_start, i, j;
alloc_local = fftw_mpi_local_size_2d(N0, N1, world,
&local_n0, &local_0_start); // alloc_local isn't always local_no*N1!!
//report allocation size, example of output from each process
if(world.rank()==0){ /* only process 0 prints this */
clog<<"Global size of array : [0:"<<N0<<")x[0:"<<N1<<"), total size "<<N0*N1<<endl;
}
//all processes print the following but each prints a different report (there are better ways to print)
clog<<"Process "<<world.rank()<<" has the array of size ["<<local_0_start<<","
<<local_0_start+local_n0<<")x[0,"<<N1<<"), total local size "<<(N0*N1)<<endl;
//now allocates data in the old fashioned way
data = (std::complex<double>*) fftw_malloc(sizeof(std::complex<double>)*alloc_local);
// in C: data = (fftw_complex *) fftw_malloc(sizeof(fftw_complex) * alloc_local);
//create fftw plan, note "_mpi_" and note the "_2d"
plan = fftw_mpi_plan_dft_2d(N0, N1, (double(*)[2])data, (double(*)[2])data, world,
FFTW_FORWARD, FFTW_ESTIMATE);
double pdata=0;
//initialize data and compute local power
for (i = 0; i < local_n0; ++i){
for (j = 0; j < N1; ++j){
std::complex<double> val=std::complex<double>(local_0_start + i, 0);
data[i*N1 + j] = val;
pdata+=norm(val);
}
}
clog<<"Process "<<world.rank() /* each process prints its local power*/
<<" has a power of "<<pdata<<" of the original data"<<endl;
//compute transform according to the plan
fftw_execute(plan);
double ptransform=0;
// normalize data and compute local power of the transform
for (i = 0; i < local_n0; ++i){
for (j = 0; j < N1; ++j){
data[i*N1+j]/=sqrt((double)(N0*N1)); // this is the normalization of the transform
std::complex<double> val=data[i*N1+j];
ptransform+=norm(val);
}
}
clog<<"Process "<<world.rank() //each process print its local power
<<" has a power of "<<ptransform<<" of the transformed data"<<endl;
fftw_destroy_plan(plan); /* destroy plan */
fftw_free(data);
// boost::mpi::environment doesn't need finalization it is automatic, in C: MPI_Finalize();
}
The example is rather long, this is mainly because we are using the plain-C fftw interface, and manually allocated tables. Note that the only change relative to the original FFTW example is the usage of Boost.MPI and the use of C++ complex type. (Imagine how much cleaner would be the code if we could take advantage of boost.multiarray and of a C++ wrapper of FFTW. Eventually, this will be the topic of another section.)
Note that using C++ and Boost does not enforce us to leave the known C-interfaces, the idea is that we can reuse the old interfaces and improve them gradually as we need it. For example we used std::complex<double> instead of double[2], (this is possible because both types are bit-compatible); the former has a much support for treating complex numbers.
What the program does is to initialize parts of matrix in different processes, calculate local power (sum of squares) of the data, do a global Fourier transform of this matrix (in place) and the compute the local power of the transformed data. The global power (sum of locals powers) must be equal in the original data and in the transformed data. (See above for compilation command.)
Also note that although this is our first parallel MPI program the real MPI communication is performed by the FFTW routines and not by us. Rarely we will want to deal with passing very large amount of data ourselves between processes, libraries like FFTW and Lapack will do a faster/better job than us. The idea here is to set the environment for the usage of those well tunned numerical libraries.
Now let us download simple_boost_mpi_example.tar.gz and run the example:
wget http://micro.stanford.edu/mediawiki-1.11.0/images/Simple_boost_mpi_example.tar.gz -O simple_boost_mpi_example.tar.gz tar -zxvf simple_boost_mpi_example.tar.gz cd simple_boost_mpi_example make simple_boost_mpi_example
(Checkout the Makefile to see the compilation line). Run the example to see an output like the following:
$ mpirun -np 2 ./simple_boost_mpi_example Process 1 has the array of size [9,18)x[0,18), total local size = 162, local allocated size = 162 Global size of array : [0:18)x[0:18), total size 324 Process 0 has the array of size [0,9)x[0,18), total local size = 162, local allocated size = 162 Process 0 has a power of 3672 of the original data Process 0 has a power of 27729 of the transformed data Process 1 has a power of 28458 of the original data Process 1 has a power of 4401 of the transformed data
Two copies of the program are run at the same time, FFTW takes care of the communication between them. Before looking at any number, we can see that the output messages are not well ordered. we learned the first lesson of MPI in general: not synchronized instructions can be executed in any order. Besides this the program works as expected. The original 18x18 matrix is split among the two processors in two matrices of 9x18, and a memory for 162 elements is allocated in each process.
Also we can check that the power of the global data is the same before and after the FFT, 3672+28458 == 27729+4401 == 32130.
We can check running the program in one process.
$ mpirun -np 1 simple_boost_mpi_example Global size of array : [0:18)x[0:18), total size 324 Process 0 has the array of size [0,18)x[0,18), total local size = 324, local allocated size = 324 Process 0 has a power of 32130 of the original data Process 0 has a power of 32130 of the transformed data
Important notes about MPI-FFTW3 memory distribution (do not skip)
Although the distribution of the matrix and allocated memory seems to very reasonable in the previous example, this is only because of the simple dimensions and process number used in the example. There are some characteristics of the memory distribution for which the previous example can be misleading, and it is worth noting them:
- Only FFTW (via fftw_mpi_local_size_2d) knows how to split the data and allocated memory size among processes, so don't try to predict the splitting yourself.
- The splitting of the global matrix is only done over rows (first dimension, for N>2-dimensional arrays).
- 1D FFT's usage is totally different, because the splitting depends on the type of transform to be performed (e.g. in-place, out-of-place).
- The splitting of the matrix can be uneven (for example, #columns is not divisible by #processes).
- Some processes can be assigned zero columns (for example if there are more processes than columns). Even in this case, the allocated memory in that process can be different from zero! (in general, it is 1*sizeof(complex)).
- The logical size of the submatrix (local matrix) can be different from the local allocated memory, i.e. local_n0*N1 <= alloc_local. The outputs of local_size are not redundant. FFTW uses the (small) extra memory as a scratch space.
The next step in complexity could be to intercommunicate the processes in order to compute the total (global) power of the data.
Collective Reduce operation
There are several kinds of MPI operations, point-to-point and collective. Among the collective operations, the reduce operation results in the number we need to compute: the global power of the data. Reduction means that we take *all* the objects/numbers from different processes and convert it into a *single* object/number by means of some function. More often than not, this function is simply a summation ().
In practice, a summation is a very simple function, because it is associative and commutative (sequence-product is as well), so many issues are simplified in this case because the result will not depend (mostly) on how the order in which the data is gathered. This kind of operations are defined only in terms of a binary function. That is, the summation is defined in terms of the binary addition alone.
I said *mostly* because for machine numbers the usual operations are not *exactly* associative, and the result can depend on how the data is split among processes. This is the second lesson of parallel programming, numeric results *can* depend on the number of processes. In fact, if the result completely changes with the number of processes then that may point to very deep conceptual errors in your numeric algorithm, since the result depend on how the data is summed (for example, when you sum small and large numbers together). This is very different from a 'bug' in the program (or in the MPI library).
all_reduce
Going back to the global power example we can make all the processes receive the summation of values by inserting the lines:
double global_pdata=-1; mpi::all_reduce(world, pdata, global_pdata, std::plus<double>()); double global_ptransform=-1; mpi::all_reduce(world, ptransform, global_ptransform, std::plus<double>()); ... cout<<"Process "<<world.rank()<<" knows that global power of original data is "<< global_pdata << endl; cout<<"Process "<<world.rank()<<" knows that global power of transformed data is "<< global_ptransform << endl;
Now the program will compute the power of the data for us, which makes easier to verify that the Fourier transform is consistent. Note that the third argument is the variable which will be overwriten with the result and the second is the contribution from the current process to the reduction.
From this point on we can easily adapt the reduction to other needs. We can pass a different function, like std::multiply<double>() or, just as an example, selectively pass data if it fulfills certain condition:
mpi::all_reduce(world, pdata>0.?pdata:0, std::plus<double>()); // only pass positive values. mpi::all_reduce(world, world.rank()<4?pdata:0, std::plus<double>()); //only pass data of the first 4 processes.
Note that in both cases the condition of passing the data is *inside* the function call (via the ternary function condition?casetrue:casefalse). If you put the condition outside the function call, there exists the possibility of not fulfilling such condition call the MPI communication will hang because all the process will be waiting for a process that refused to send the data. This is the wrong code
if(pdata>5000){ // wrong way to try to selective pass data larger than 5000
mpi::all_reduce(world, pdata, std::plus<double>());
} // WRONG
What is worst: the program may or may not hang depending on the actual numeric value of the data. So, third lesson of parallel programming: it is too easy to create a deadlock. In particular all blocking-communication have to be executed in non-conditional branches of the program. (Another good reason to program without if's.)
Note on function objects (Boost.Lambda and Boost.Phoenix)
In the C++ standard library, the binary addition is represented by a function object called std::plus<T> which represents the addition over type T, we are interested in the case in which T is the double type. This functions object have become the standard way of passing arbitrary functions to other functions. Function-object seem misterious sometimes but they are simply a representation of the operation, for example this code works as expected:
std::binary_function<double, double, double>* myop; myop=new std::plus<double>(); double plus_result=(*myop)(4,5); // or myop->operator()(4,5); assert(plus_result==9); delete myop; myop=new std::multiply<double>(); double multiply_result=(*myop)(3,5); assert(multiply_result==15); delete myop;
In fact it is very easy to create your own function object, they look almost like functions:
struct concat{ //concatenation
std::string operator()(std::string const& s1, std::string const& s2){
return s1 + s2;
}
};
Passing such function object to reduce will make all_reduce concatenate the strings produced by different processes. Note that this operation is not commutative and will depend on some standard convention used by MPI to collect the data (fortunately it is well defined).
The only difference between our function object an std::plus<std::string>, is that the later is already included in some standard library.
They differ from built-in (normal C) functions in two aspects, 1) you can parametrize them, 2) you can make copies of them, 3) you can compare them:
struct concat_with_sep{ // concatenation with separation
std::string separator_;
concat_with_sep(std::string const& separator) : separator_(separator){}
std::string operator()(std::string const& s1, std::string const& s2){
return s1 + separator_ + s2;
}
};
The beauty of this is that, although the separator string is arbitrary, 'concat_with_sep', viewed as function, still takes only two parameters. Function object are more general than regular functions, because they can have internal parameters. Function objects are sometimes called Functors.
Free functions (the normal C-functions) can be converted to function objects in many ways. The standard way is to use Boost.Function library. Function objects can be combined with one another with the Boost.Bind or the (more advanced) Boost.Lambda.
For very complex functions we should define our own function objects which requires a separate definition of a class or function. There is however some cases where the function that we need to apply is simple but not trivial. It is in this case where Boost.Lambda plays a nice role by allowing us to define a function in-place. For example, suppose that we need to compute not the sum but the sum of square modulus, in such case the function object can be created by replacing ... std::plus<double>() ... by
... _1+bind(&std::norm<double>, _2) ...
where we should include and declare
#include<boost/lambda/lambda.hpp> #include<boost/lambda/bind.hpp> using namespace boost::lambda;
at some earlier point. If we were interested in the sum of squares, we could use just use
... _1 + _2 *_2 ...
Even simpler is the simple addition, plus<double> is exactly identical to _1+_2 once we use Boost.Lambda.
Note that this nice syntax is achieved by the Lambda library and it is not built-in in the language. How the Lambda library works internally is quite complex to explain, however it is designed to be easy to use (note the natural syntax). On the other hand it is difficult to debug (compilation errors are difficult to interpret). A working example of this library can be downloaded from here. The example also illustrates the usage of Boost.Phoenix which is intended to supercede Boost.Lambda.
reduce
Another variation of the example can be introduced if we realize that sometimes the result is only needed in one process. In this case the reduction result is sent to one process only, which is specified as the last argument in mpi::reduce:
double global_pdata=-1; mpi::reduce(world, pdata, global_pdata, std::plus<double>(),0); // process 0 receives the data cout<<"Process "<<world.rank()<<" *thinks* that global power of the original data is "<< global_pdata << endl;
In this example, only process 0 will have a meaningful value of the result; other processes will have the global_pdata variable unchanged. The process 0 will output the correct value, while the rest will output -1. Essentially, all process but one will have an uninitialized, invalid, or at best flagged to some magical value.
This selective data collection sounds reasonable idea but really it can cause trouble because we have to keep track of the process that has the right value by some conditional. There are two basic possibilities, either we "promise" to ignore the value in processes with the wrong value.
double global_pdata=-1;
mpi::reduce(world, pdata, global_pdata, std::plus<double>(),0); // process 0 receives (and send to itself) the data
if(world.rank()==0){ //always check for process number *before* using the variable.
cout<<"Process 0 knows that the global power of the origin data is "<<global_pdata<<", others do not really know"<<endl;
}
or we put the result variable inside the scope of an 'if' statement:
if(world.rank()==0){
double global_pdata=-1;
mpi::reduce(world, pdata, global_pdata, std::plus<double>(),0); // process 0 sends and receives the data
... // do other operations with result
cout<<"Process 0 knows that the global power of the origin data is "<<global_pdata<<", others do not know at all"<<endl;
}else{
mpi::reduce(world, pdata, std::plus<double>(),0); // others process only send the data
}
Note that the number of arguments is different in the two different calls to reduce. The first branch will receive in the result of all the processes (including itself) while the second will only send values.
In this way, only process 0 can ever access the value. Trying to use the the variable outside the first branch of the conditional will not even compile. In this case we do not have to promise about the usage of the variable but we have to promise that we will call similar versions of reduce in both branches of the conditional. If we forget the else branch (second branch), the program will hang in the same way as described above.
Next subsection will deal with this problem of 'dangling' variables. You can skip the following subsection in a first read, it will not improve your understanding of MPI. Just be aware the problem for the moment.
The issue of dangling variables (and Boost.Optional)
A variable with a value that makes sense in only one process seems to be a natural situation, after all, each process is supposed to do different part of the total work. The problem is that, within MPI, all processes must share the same code and therefore will be forced to carry variables that may not make sense to them. This situation creates tension.
Note that in the first place, this issue arises because by collecting data in only one process we are imposing some asymmetry; it is nevertheless necessary sometimes. The situation is annoying for those used to serial programming, but we have to understand that this kind of problems are at the core of parallel programming difficulties. The first step to try to avoid the problem is to think whether it may be really useful --or at least not harming-- for all the processes to know the result of reduce/all_reduce.)
It seems that the second solution above (if/else block) is more elegant because there are no dangling undefined variables around, but is not free of problems. That is because if we need to compute something non-trivial with the result, the conditional statement can become several lines long and it can be difficult to handle cases in which more than one (but not all) process receive the value. We can put all these lines in a separate function, but that function will be forced to be serial code (because it is executed only for one process). Note that I refer this as "dangling variable", in the same sense we refer to "dangling pointers", both are equally dangerous. The following proposed solution uses the library Boost.Optional and it can be also used as an introduction to this other Boost library.
Boost.Optional introduces a kind of objects which carry information on whether their in internal value is valid (e.g. initialized or not). For example the object
boost::optional<double> a;
can hold any real value *or* be in a 'undefined state'. If it is undefined then (bool)a==false and true otherwise and its actual value is given by *a. If we try to extract a value *a for an undefined variable then we will encounter a runtime error.
There is another possibility to consider that I discovered, that is the usage of Boost.Optional. (This solution is not suggested anywhere else to my knowledge.) In this case, the valid state of the variable is handled by the variable itself.
boost::optional<double> global_pdata(world.rank()==0, double()); //non empty at node 0 only.
if(global_pdata) mpi::reduce(world, pdata, *global_pdata, std::plus<double>(),0);
else mpi::reduce(world, pdata, std::plus<double>(),0);
...
if(global_pdata){ // only "use" if global_pdata is it is initialized
cout<<"Process "<<world.rank()
<<" knows that the global power of the origin data is "<<*global_pdata<<", others will die if they try to 'know'"<<endl;
}
// cout<<*global_pdata<<endl;//unchecked usage will throw and exception and possibly abort the program in other processes
For those used to C, they will recognize that 'global_pdata' looks like a 'nullable' pointer (whose pointee value is valid when !=0 by convention). The difference is that boost::optional behaves better than a pointer for this application, for example, boost::optional can be assigned values, or uninitialized with no memory allocation necessary, no dangling pointers can be created in this way. If we respect this check-then-use syntax, the variable becomes transparent to all the process where it is undefined.
Note that global_pdata is not associated with any particular process number, we can forget in which process the variable was valid, the validity test is done in the variable itself. We have to remember to test whether the global_pdata is locally valid. What if we do not check and use *global_pdata without test? Unless we compile in production mode (NDEBUG=0) the calling process will terminate after an 'uninitialized' fail assertion is called from Boost.Optional. This is still better than other options, like having a dangling pointer (nullable pointer solution), using an uninitialized or invalid value (first solution), or a deadlock (second solution).
So far I mentioned three (or four) possible solutions, none of them can save us at compilation time, i.e. before running our program. The situation will not improve until someone smart writes a compiler (and a language) capable of *really* interpreting parallel code and figuring out these issues before actually running the program. (Notice that MPI is really like low level programming.)
If you think that my suggested solution (using Boost.Optional) is in the right direction, there are at least two ways to improve it further. First, we can encapsulate the two calls to 'reduce' in a single function (a template function will be even more general):
boost::optional<double> optional_reduce(boost::mpi::communicator const& world, double const& in_value, int root){
boost::optional<double> ret(world.rank()==0, double());
if(ret) mpi::reduce(world, in_value, *ret, std::plus<double>(),0);
else mpi::reduce(world, in_value, std::plus<double>(),0);
return ret;
}
and then use it as
boost::optional<double> global_pdata = optional_reduce(world, pdata, 0); ... if(global_pdata) cout<<"global_pdata = "<<*global_pdata<<endl;
The second is to use exceptions. An exceptions that is thrown when trying to use a variable in a process in which is not defined.
For output purposes we can make boost::optional support output by
#include<boost/optional/optional_io.hpp> ... cout<<"global_pdata = "<<global_pdata<<endl;
will print '--' if the value is unitialized and the actual value otherwise, note the lack of '*'.
Transparent blocks via exceptions (boost/assert.hpp)
By default, access to an uninitialized optional variable results in a failed assert and end of execution. If you want to be able to recover from this failed assert, you can use boost/assert.hpp to change the default behavior.
Just add the following code (extracted from here) before and after #include<boost/optional.hpp>
#define BOOST_ENABLE_ASSERT_HANDLER
#include "boost/assert.hpp"
#include "boost/otional.hpp"
// #undef BOOST_ENABLE_ASSERT_HANDLER //to constrain the new default to optional.hpp only
#include <stdexcept>
namespace boost {
void assertion_failed(char const* expr, char const* function,
char const* file, long line) {
throw std::runtime_error(expr);
}
}
This will allow you to throw an exception (instead of failing after the assert) when accessing uninitialized variable, this gives the opportunity to recover from failed executions. For example:
try{
boost::optional<double> global_pdata = optional_reduce(world, pdata, 0);
}catch(...){
clog<<"sorry I am a process that don't know about global_pdata"<<endl;
}
The idea is that the invalid part of the try-block becomes transparent to processes. (The valid part should be exception safe though.)
Primitive or point-to-point operations (barrier, send & recv)
The section describe primitive operations, i.e. operation that are very basic. In principle we should be able to do all the communications with this primitives, but then many things will have to be handled manually. The result of using this operations in excess can result in a very unelegant code. (That is why this section is at the end).
The simplest operation is the member function barrier it serves to the purpose of synchronizing all the processes in a certain communicator. Then there is the member send and recv, to send an receive messages.
In pure C-MPI, the last two operators are only implemented to pass small types, like single numbers or pointers. That means that to pass a simple string we have to allocate the memory manually and hope for successful communication and that we didn't introduce any mistake when dealing with the pointers. With Boost MPI the situation is much better thank to the automatic serialization and deserialization.
The following example shows the use of barrier, send and recv to print (from process 0) a list of the nodes involved in a MPI execution:
#include<boost/mpi.hpp>
int main(int argc, char* argv[]){
boost::mpi::environment env(argc, argv);
boost::mpi::communicator world;
std::clog<<"greetings from process #"<<world.rank()<<" in processor named "<<boost::mpi::environment::processor_name()<<std::endl;
world.barrier();
if(world.rank()==0) std::clog<<"all processes syncronized here"<<std::endl;
world.send(0,world.rank(), boost::mpi::environment::processor_name());
for(int i=0; i!=world.size(); ++i){
//std::clog<<"process #"<<i<<" sends the name of its processor to process #0"<<std::endl;
if(world.rank()==0){
std::string buffer;
world.recv(i, i, buffer);
std::clog<<"process #0 receives string "<<buffer<<" from process #"<<i<<std::endl;
}
}
return 0;
}
compile with
mpicxx main.cpp -L$HOME/usr/lib -lboost_mpi-gcc44-mt -lboost_serialization-gcc44-mt -Wl,-rpath=$HOME/usr/lib
and run with
./a.out
The first set of messages ("greetings...") will appear in a random order, however thanks to the barrier call, all processes will wait for the others to reach that point in the execution. After that all processes will send the name of the processor in which they are executing to process number 0, which will receive the message and print the string.
Both send and recv send a tag (integer in the second argument), this number can be for example the origin process of the message itself, but the important thing is that send and recv must agree in the tag, otherwise recv will wait for a message that has the correct tag. If you make the tag disagree at send and receive the program will hang. Tag is useful to distiguish between 'different' message sent by the same process.
Although much cleaner than its counterpart pure C version, the code is still quite fragile and prone to errors. For these reason it is recommendable to use other ways to send messages as described in the other sections of this chapter.
Other collective operations (gather and broadcast)
To make clear that options other than send and recv should be considered I will reimplement the program using the 'gather' operation. Gather takes the elements send by other processes and stacks them in a vector type. Then the vector object can be transversed to inspect its contents.
#include<boost/lexical_cast.hpp>
#include<boost/mpi.hpp>
int main(int argc, char* argv[]){
boost::mpi::environment env(argc, argv);
boost::mpi::communicator world;
std::clog<<"greetings from process #"<<world.rank()<<" in processor named "<<boost::mpi::environment::processor_name()<<std::endl;
std::string message="process #0 receives string "+boost::mpi::environment::processor_name()+" from process #"+boost::lexical_cast<std::string>(world.rank());
if(world.rank()==0){
std::vector<std::string> lines;
boost::mpi::gather(world, message, lines, 0);
for(int i=0; i!=lines.size(); ++i)
std::cout<<lines[i]<<std::endl;
}else{
boost::mpi::gather(world, message, 0);
}
return 0;
}
Broadcast simply sends a value to all the processes in a certain communicator, the values are stored in the same variable as in the broadcasting process.
string v = "empty"; if(world.rank()==0) v="special value"; boost::mpi::broadcast(world, v,0); assert(v=="special value"); //all processes have this value for v now
In other words it synchronizes the value of a variable with respect to the value in a certain process.
Parallel logging example
One of the first problems that appears when writting our first parallel program is that when we try to print messages the usually don't appear as we intend. The reason is that logging and printing messages to the user is an inherently serial job and once we are in a parallel environment we have to be specific that such message printing is done in a serial way. We have to synchronize the processes make each process send its message (hopefully in a prescribed order) and then recover the parallel execution. What is worst is that sometimes we need all the processes to print the same message, in that case many copies of the same message will be printed.
It would be great if we can solve this problem once and for all and make the code read our mind and understand these issues and print what we intend.
The following code will make that possible, the code seems complicated because it uses advanced C++ stream techniques but what it does is very simple, for a newly defined stream (boost::mpi::ostream) all processes will accumulate strings send by each process, when a special end-of-line signal is sent (boost::mpi::endl) is send the root process collects the messages and make them unique. In this sense repeated messages are only shown once. Messages are displayed in alphabetical order. If we want the messages displayed in the order of the issuing process we can just add that information at the beginning of the message (e.g. "proc 5:...").
In this sense we can forget that we are running a program in parallel if all messages are the same. But if the messages are different (line-by-line) all the messages are printed recovering the parallel behavior. Also, in the parallel case, messages are shown in a prescribed order.
#include<boost/mpi.hpp>
#include<set>
namespace boost{namespace mpi{
class ostream{
communicator const& comm_;
std::ostream& os_;
std::ostringstream local_buffer_;
public:
ostream(communicator const& c=communicator(), std::ostream& o=std::clog) : comm_(c), os_(o){}
template<typename T> ostream& operator<<(T const& t){
local_buffer_<<t;
return *this;
}
typedef ostream& (*ostream_manipulator)(ostream&);
ostream& operator<<(ostream_manipulator manip){
return manip(*this);
}
~ostream(){endl(*this);}
static ostream& endl(ostream& os){
os.local_buffer_<<std::endl;
os.comm_.send(0, os.comm_.rank(), os.local_buffer_.str());
os.local_buffer_.str("");
if(os.comm_.rank()==0){
std::set<std::string> messages;
for(int i=0; i!=os.comm_.size(); ++i){
std::string remote_buffer_;
os.comm_.recv(i, i, remote_buffer_);
messages.insert(remote_buffer_);
}
for(std::set<std::string>::const_iterator i=messages.begin(); i!=messages.end(); ++i){
os.os_<<(*i);//<<std::endl;
}
}
os.comm_.barrier();
return os;
}
typedef std::ostream& (*std_manipulator)(std::ostream&);
ostream& operator<<(std_manipulator manip){
manip(local_buffer_);
return *this;
}
};
ostream& endl(ostream& os){return ostream::endl(os);}
}}
The stream is used as follows:
int main(int argc, char* argv[]){
boost::mpi::environment env(argc, argv);
boost::mpi::communicator world;
using boost::mpi::endl; //or using std::endl; //*
boost::mpi::ostream clog(world); // or using std::clog; //*
clog<<"world says: we are the world"<<endl; //this messages is unique
clog<<"rank "<<world.rank()<<" says: I am me"<<endl; //this message is different for each process.
return 0;
}
The ouput is
$ mpirun -np 3 ./a.out world says: we are the world rank 0 says: I am me rank 1 says: I am me rank 2 says: I am me
Note that the code looks almost like the serial code, by changing the two lines marked we recover the usual behavior (i.e. multiple messages in random order). Anoether way to recover the serial behavior is to use the communicator MPI_COMM_SELF in the constructor of the stream:
boost::mpi::communicator self(MPI_COMM_SELF, boost::mpi::comm_attach); boost::mpi::ostream clog(self); // or using std::clog;
There are a couple of assumptions in the design. First, the messages are all different or all the same, i.e. we don't care about distinguishing messages from 'groups' of processes. In any case that 'group' information can be passed explicitly as part of the string. Second, the smallest indivisible messages is an entire line and not part of it, i.e. uniqueness is relative to a full line, i.e. until boost::mpi::endl is found.
The class can be defined and used inline:
boost::mpi::ostream(world)<<world.rank()<<" says: I am me"<<endl;
By default it uses the std::clog output (stderr), if we want to use a different output stream, for example a file or std::cout (stdout) we can specify:
boost::mpi::ostream(world, std::cout)<<world.rank()<<" says: I am me"<<endl;
Finally, we have to remember that all processes (in a certain communicator) will wait to synchronize to print the message. So the parallel performance can be affected by printing these messages, even if the message is unique. If this is a price we don't want to pay, then we will have to roll back to the usual conditional statement:
if(world.rank()==0) std::clog<<"here"<<std::endl;
But then the message will not contain information about the current state of processes other than process number 0.
NB: The idea of the behavior is taken from the Parallel Print Function and the implementation of the stream is bases on this generic stream class.
Other Resources
Boost.Filesystem
Sooner or later every program has to deal with the task of managing files or even directories in the disk. The following tasks are quite tedious to perform in C or C++:
- Create/remove directories
- Iterate through files in a given directory
- Create filenames from other names
- Change/add filename extensions
- Copy files
- Delete files
- Check if a file exists
Besides, there is no portable way (across operating systems) to do it in general with the core language and libraries. Here is where Boost.Filesystem proves useful, it not only facilitates programming such tasks but also does it in a portable way by hiding the OS-specific file management. Besides it is likely that this library will become part of the next C++ standard library. If you are using the latest boost, it is better to use this manual http://www.boost.org/doc/libs/1_46_1/libs/filesystem/v3/doc/index.htm which references version 3 of the library.
This library is also described briefly in http://www.ibm.com/developerworks/aix/library/au-boostfs/index.html and http://en.highscore.de/cpp/boost/filesystem.html
Path
The path object can represent a concrete file or a directory. It is basically a string that represents the path to an existing or non existing file or directory.
boost::filesystem::path p("/home/user/data.dat");
It has certain advantages to the usual way of representing a file by means of char* or std::string. The advantage over char* is that the object is not a pointer so it can be copied without worrying about the pointer semantics. The advantage over std::string is that path names can be combined very easily, even using the intuitive operator '/', for example:
using boost::filename::path;
path myhome("/home/user");
path mydata("data.dat");
path fullname = myhome / mydata;
Other facilities of boost::filename::path is that the extension of the file can be easily extracted and changed.
Traditionally filename are passed as char* in C and std::string in C++. filesystem::path can be converted back and forth from this types. Thanks to the constructors
const char c[]="file.dat";
path p(c);
path p("file.dat"); // from literal;
std::string s="file.dat";
path p(s);
and back from member functions
std::string s=p.string(); const char c[99]=p.string().c_str();
so, passing filenames to old routines is as simple as passing p.string().c_str(). (note that c_str() is a member function of std::string class).
Common path name functions
The operations listed in this section manipulate paths, that is manipulates the string that *represent* files (not the files themselves).
p.extension() // was extension( path ) -> string //returns the extension of a file path (useful for guessing the file type), includes the 'dot'. p.stem() // was basename( path ) -> string //returns everything in the filename except for the extension p.filename() // name of file without path
note that p.string() == p.stem() + p.extension().
p.replace_extension(".newext"); //(was change_extension(const path, string new_extension_w_dot) -> path) // changes the extension of a path name and returns the path
Note that these functions return strings, not paths. Strings can be manipulated as such, and eventually transformed to paths. With the path constructor:
path newp(string);
Also to interface with old routines that take C-strings we could use
p.string().c_str();
the first function produces an std::string with the path the second call productes a C-string (char*).
Common file actions
The operations listed in this section make actual changes on the file in the disk. Most frequently file actions are straight forward once we remember the particular names given, here it is a list of functions with obvious side effects:
create_directory( path ); create_symlink( realfile, linkname ); create_directory_symlink( realfile, linkname ); copy_file( origin, destination ); // exception on exisisting destination remove( file );
Caveats
- Exceptions
Some actions could raise exceptions upon errors, if not catch they can abort the program. For example create_symlink raises an exception if the link already exists (but doesn't raise an exception if the destination file does exist). Another typical case is copy_file when destination already exists; for that copy_file can be given an option:
copy_file( origin, destination, copy_option::overwrite_if_exists );
- Symlinks
Directories symlink should be created with create_directory_symlink in general, (although in some filesystems/OS they are the same).
First argument of create_symlink is basically the content of the symlink, therefore for symlinks created in an arbitrary directory should be relative to that location.
If symlink path exists (as a real file or as a symlink) an exception is raised. Therefore destinations must be removed before creating the symlink (there is no symlink overwrite option).
Directory Iterator
So far, the features described are simply path name manipulations and file actions. On top of that the library gives the opportunity to scan files in a certain directory as if the directory were a typical STL container. The following code shows how to read several files with a given filename in different directories:
using namespace boost::filesystem; // ::directory_iterator ::path, etc
for(directory_iterator it("../datadir"); it == directory_iterator() /*end iterator*/; ++it){
if(not is_directory(*it)) continue; //ignore if the entry is not a directory
for(directory_iterator it2(*it); it2 == directory_iterator(); ++it2){
remove((*it2)/"data.txt");
}
}
This example removes the file named "data.txt" in the directory "../datadir/*/". There is also a 'recursive_directory_iterator' which avoids the complication of nesting different loops to reach a directory of arbitrary nesting.
This kind of directory)iterator should not be confused with other type of iterator which exists for a completely different purpose, the 'path::iterator', (which, given a path e.g. ("/a/b/c/d/e/") iterates over the components "a", "b", "c", etc.)
This is a minimal example in which the files in the current directory are backed up:
//c++ -std=c++0x -Wfatal-errors $0 -o $0.x -lboost_date_time -lboost_filesystem -lboost_system && $0.x $@
#include<boost/filesystem.hpp>
using namespace boost::filesystem;
using std::cout; using std::endl;
int main(){
initial_path(); // keeps information of the initial path where the program is executed
cout << "iterating over entries in directory " << initial_path() << endl;
for(directory_iterator it(initial_path()); it != directory_iterator(); ++it){
cout << "entry " << *it << " found ";
if(is_directory(*it)){
cout << "is a directory and will be ignored" << endl;
}else if((*it).path().extension() == ".backup"){
cout << *it << " is already a backup and will be ignored " << endl;
}else{
cout << "is not a directory and will be backed up" << endl;
path destination = (*it).path().string() + ".backup";
cout << "copying from " << *it << " to " << destination << endl;
boost::filesystem::copy_file(*it, destination, copy_option::overwrite_if_exists);
}
}
return 0;
}
Compilation (Dynamic vs. Static Linking)
The Filesystem library is not a "header only" library, that means that the executable must be linked to the libraries. There are two options, either dynamic linking (default) or static linking. Dynamic linking generates smaller executable files although it needs to locate the libraries at runtime, static on the other hand allows the executable to be shipped to another compatible system where the libraries are not present. In this example a small program using Boost.Filesystem is linked in two different ways, showing the difference in size:
$ c++ sizeme.cpp -lboost_filesystem -lboost_system $ ls -all ./a.out -rwxr-xr-x 1 user user 27867 2011-06-21 20:36 ./a.out* $ ldd ./a.out libboost_filesystem.so.1.47.0 => /home/user/usr/lib/libboost_filesystem.so.1.47.0 (0x00007f7ec382e000) libboost_system.so.1.47.0 => /home/user/usr/lib/libboost_system.so.1.47.0 (0x00007f7ec362a000) ... $ c++ sizeme.cpp -lboost_filesystem -lboost_system -static $ ls -all ./a.out -rwxr-xr-x 1 user user 1611409 2011-06-21 20:40 ./a.out* $ ldd ./a.out not a dynamic executable
Static linking produces a file 1.6 MB big, roughly the size of the Boost.Filesystem plus several other system libraries. In orther to be able to compile statically and dynamically you must have both the *.so and *.a files, this is the case if you followed the installation instructions for boost.
$; ls ~/usr/lib/libboost_filesystem.* -all -rw-r--r-- 1 user user 390006 2011-04-13 13:17 /home/correaa/usr/lib/libboost_filesystem.a lrwxrwxrwx 1 user user 29 2011-04-13 13:17 /home/correaa/usr/lib/libboost_filesystem.so -> libboost_filesystem.so.1.47.0* -rwxr-xr-x 1 user user 173253 2011-04-13 13:17 /home/correaa/usr/lib/libboost_filesystem.so.1.47.0*
To have the static (and dynamic) version of the libraries in Ubuntu for example you must have the *-dev version of the packages. For example:
$ sudo apt-get install boost-all-dev libboost-filesystem1.42-dev
This library must be compiled with the C++0x support if you want to compile it with C++0x compiler. (See installation notes.)
Boost.StaticAssert
We can check for certain consistency conditions before some complicated code is called by means of 'assert'. For example:
#include<cassert> // usually not needed
double square_root(double x){
assert(x>=0);
...
}
Under certain circumstances what we need is something that is checked during compilation, and not at runtime like the previous example. In other words, some conditions on macro definitions, types information and numeric constants must be checked even before the program is executed. It is here were Boost.StaticAssert can be useful.
The macro BOOST_STATIC_ASSERT, defined in Boost.StaticAssert library checks for certain conditions to be met at compile time. Of course the predicate must have the possibility to be evaluated at compile time, so dynamic variables cannot be used. This library should not be confused with Boost.Assert (BOOST_ASSERT).
Although most applications of static assert are for template instantiation, here I will give a simpler example. Suppose that we have two compilation flags -D_HOT and -D_ICECREAM that although it makes sense to use them separately it doesn't make sense to use them together because of the internal logic of the code. HOT is not compatible with ICECREAM. The code needed to ensure that both flags are not defined simultaneously is the following:
#ifdef _HOT #ifdef _ICECREAM BOOST_STATIC_ASSERT(0); #endif #endif
The compilation will fail (at that point) if both compilation flags are defined.
A more complicated example is the following suppose that we want to make sure that the code is never compiled in a machine were the type 'int' has less than 32 digits:
BOOST_STATIC_ASSERT(std::numeric_limits<int>::digits >= 32);
Note that the condition can be evaluated at compile time (there are no variables involved).
In order to use BOOST_STATIC_ASSERT you must #include<boost/static_assert.hpp>.
Due to technical reasons the authors of the library recommend, when possible (specially in header files), to put the condition in a special namespace. That also gives the opportunity to add some meaning to the condition:
#include <climits>
#include <cwchar>
#include <limits>
#include <boost/static_assert.hpp>
namespace my_type_conditions {
BOOST_STATIC_ASSERT(std::numeric_limits<int>::digits >= 32);
BOOST_STATIC_ASSERT(WCHAR_MIN >= 0);
}
A built-in version of static_assert is supposed to be part of the next C++0x language.
Boost static computations (Metacomputations)
Boost is above all a template library, that means that it concentrates in computations about types and compile time constants rather than in runtime algorithms. Pure C is a programming language that allows to write almost any runtime algorithm, but the language doesn't give facilities to write algorithms that are performed during compilation (i.e. by the compiler). On the other hand, C++ defined a whole set of template syntax that allows us to force the compiler to perform certain operations. Programming templates is called 'metaprogramming' in the jargon, because it is like programming the code generation of a program. In this sense many computations can be performed even before the program is run for the first time.
Boost.MPL is a great example of what it can be achieved by template metaprogramming. Here it is a basic example:
#include <boost/mpl/int.hpp>
int main(){
typedef boost::mpl::int_<8> eight;
std::clog<<eight::value<<std::endl;
std::clog<<boost::mpl::next<eight>::type::value<<std::endl;
}
The output of this program is
8 9
The template class mpl::int works like a metatype while 'eight' is a metavariable (and a type).
The operation 8 + 1 is never performed by our program. So, who did the operation? The answer is the compiler it self, in fact the program is exactly (in the binary sense) as having explicitly programmed manually:
int main(){
std::clog<<8<<std::endl;
std::clog<<9<<std::endl;
}
The operation 'next' is a metafunction that returns a new type which wraps the value that follows the value wrapped by the argument type. The following program, makes the compiler compute the number 17. When executed, the program just prints the constant 17 (but it doesn't need to compute it).
#include <boost/mpl/int.hpp>
#include <boost/mpl/plus.hpp>
int main(){
typedef boost::mpl::int_<8> eight;
typedef boost::mpl::int_<9> nine;
std::clog<<boost::mpl::plus<eight, nine>::type::value<<std::endl;
}
Boost.MPL have many more facilities, like compiler time containers (e.g mpl::vector) and algorithms (e.g. mpl::max_element).
That it means that we can precompute anything at compile time so that at runtime we save many computations? Can we, for example, precompute the square root of 2 so we don't have to type the constant 1.4142... manually? no, floating point numbers are a limitation of the template metaprogramming and they are not allowed for technical reasons. But otherwise we can compute anything that involves integer arithmetic, including rational numbers and integer linear algebra.
A different kind of limitation of template metaprogramming is that the efficiency of the operations is usually much worst than runtime operations, therefore compilation can take longer as we use more complicated techniques. In fact we can make the template program so complicated that the compilation never ends (most likely fail due to recursion limits set by the compiler).
For example Boost.Units is a compile-time dimensional analysis library with absolutelly no-runtime cost. Dimensional analysis requires to solve linear system of equations with integer coefficients.
Parts of Boost.Math implement static Greatest Common Divisor and Lowest Common Multiple algorithms, i.e. the numbers are computed during program compilation.
Static Rationals
In this section I will describe a library that is auxiliar to the Boost.Units library which is not documented. The Boost.Units library requires an implementation detail of compile time (static) rationals and that implementation can be used if we include the boost/units/static_rational.hpp. Recently Boost accepted Boost.Ratio which implement static rationals as well but it is not yet included in the Boost distribution.
In this example, we define C++ *types* that represent compile time rationals, binary operations can be performed to define new rationals. Values can be inspected at runtime.
#include<boost/units/static_rational.hpp>
int main(){
typedef boost::units::static_rational<6,8> six_eigths;
std::clog<<six_eigths::Numerator<<"/"<<six_eigths::Denominator<<std::endl;
typedef boost::units::static_rational<1,4> one_fourth;
typedef boost::mpl::plus<six_eigths, one_fourth>::type result;
std::clog<<result::Numerator<<"/"<<result::Denominator<<std::endl;
}
First, the program outputs
3/4
(because 3/4==6/8). Note that it is the compiler through the static_rational.hpp metaalgorithm who computed the non-trivial simplification of the fraction.
Later it prints
1/1
which is the result of 6/8 + 1/4 (or 3/4 + 1/4) which is computed using the mpl::plus operation (and some implementation in static_rational.hpp). Again the non trivial sum and the simplication is done at the time of the compilation.
Also note that, like in the case of mpl::int_c, the class static_rational is essentially empty, all the useful information is contained in the class and not in instances of that class. In fact there may not be instances of the class at all in the program.
Now a simple application. We mentioned earlier that these metaprograms cannot operate on floating point numbers, however we can exploit it anyway to improve floating point performance with minimal user code. Suppose that we want an arbitrary exponentiation of a number, to a power that is known a priori but that can change later while we are writing the code. A typical line of code for this purpose looks like this:
std::cin>>base; double const power = 5/6.; double result=std::pow(base, power); std::cout<<result;
so far so good, there are not many ways to improve the code, however we may want to change the value of p manually and yet obtain the maximum performance possible. If p=2 or p=1/2 calling std::pow is a waste, since it is not optimal. (Even if std::pow contains an 'if' conditional statement the code is not optimal since it will have to check for a condition at runtime.)
To obtain the maximal performance we will have to change our code manually depending on the value of 'p'.
std::cin>>base; /* arbitrary case */ //double const power = 5/6.; //double result=std::pow(base, power); /* square */ double result=base*base; //more efficient than std::pow(base, 2); /* square root */ //double result=std::sqrt(base); //more efficient than std::pow(base, 0.5); std::cout<<result;
Not very elegant, i it? Here is were our solution comes in. We can write a template function parametrized by the known-at-compile-time rational
#include<cmath> //for std::pow
template<class RationalPower>
double smart_pow(double const& base){
return std::pow(base, (double)RationalPower::type::Numerator/(double)RationalPower::type::Denominator);
}
using boost::units::static_rational;
template<> double smart_pow<static_rational<1,2>::type>(double const& base){
return std::sqrt(base);
}
template<> double smart_pow<static_rational<1,1>::type>(double const& base){
return base;
}
template<> double smart_pow<static_rational<2,1>::type>(double const& base){
return base*base;
}
then we can use the function with the confidence that the optimal function is always called, because these cases are handled efficiently. This can approach can even handle cases were the numerator and the denominator are calculated independently by some other computation.
using boost::units::static_rational; double base; std::cin>>base; double result=smart_pow<static_rational<1,2>::type >(base); //automatically calls std::sqrt std::cout<<result;
we don't have to worry about defining the case 4/2 or 2/4 or 9/9 since the static_rational metatype takes care of the simplification and reduces the case to 2, 1/2 and 1 respectively.
Boost.Fusion
Boost.MPL and Boost.Tuples are library without which doing template programming would be difficult. MPL was designed to be the STL (Standard Template Library) of compile time, for example for having containers of types. On the other hand Tuples was the library for heterogeneous (static) containers.
I will skip MPL container and Tuples and go straight to Boost.Fusion which combines the best of both libraries and it is more general. Boost.Fusion is a library of compile-time/runtime containers, that is, compile-time objects that contain other compile-time objects. In its turn each static element (types) of the container is also a placeholder for a runtime object.
#include <boost/fusion/container.hpp>
#include <boost/fusion/include/container.hpp>
#include<string>
#include<iostream>
int main(){
using namespace boost::fusion;
vector<int, char, std::string> stuff(1, 'x', "howdy");
int i = at_c<0>(stuff);
char ch = at_c<1>(stuff);
std::string s = at_c<2>(stuff);
return 0;
}
In this example, the object stuff is a boost::fusion::vector (analogous but different from to std::vector), as a type it contains a sequence of types and as a runtime object it contains different types, the int 1, the char 'x' and the string "howdy". Notice how the elements of the fusion::vector are accessed by the fusion:at_c<N> funcion and how therefore N must be a compile time constant. (This is the reason stuff[N] notation wouldn't work). So far, the usage is strictly equivalent to Boost.Tuple's tuples<...> type.
In parallel to the std::vector, the container fusion::vector has a size function which returns the number of elements. Now here it is where the interesting happens. As mentioned before there is a runtime aspect and a compile-time aspect of the object. There are two size functions, one returns a runtime variable and operates on instances of the class.
typedef boost::fusion::vector<int, char, std::string> stuff_type; stuff_type stuff(1, 'x', "howdy"); std::cout << size(stuff) << std::endl; //return a size of 3
The other version statically operates on the type and returns a compile time mpl::int_, a compile time integer can be obtained from this mpl::int_ by the value member. MPL basic metatypes were described in a previous section.
std::cout << stuff::size::value << std::endl;
The difference between the two is obvious when we want to pass the size of the fusion::vector (which is known at compile time) as the template parameter.
boost::array<double, stuff_type::size::value> a;
Now going back to the typical C++ class, you can see that since fusion::vector can hold an arbitrary number of types and elements then any class with member fields can be replaced by a fusion::vector.
class model{
double a;
int n;
...
double calculate() const{return pow(a, n);}
};
class model : fusion::vector<double, int>{
double calculate() const{return pow( at_c<0>(*this), at_c<1>(*this) );}
};
quite a generalization, now another metafunction can inspect the parameter types of "model". The price to pay is that we lost the nice names of the parameters (a and n). If we want to give names to the parameters we should use fusion::map.
fusion::map
fusion::map is analogous to std::map. In std::map we have values acceded by a key, now the keys are types. Since the types can be arbitrary we can use this mechanism to assign "names" to the keys. In practice this key type can be auxiliary.
struct a; struct n;
class model : fusion::map<
Boost Numerics
Boost is not intended specifically to be a numerical library, although there are some useful tools scattered within the library. The numeric capabilities random number generation, linear algebra, interval arithmetics, special functions. For numeric applications, it worths also looking at the well known C library GNU Scientific Library (GSL) which is documented in great detail.
Most math/numeric functions in Boost are generic with respect to the numeric type with different precisions. Of course it can be used with either double or float precision, but also with arbitrary precision types, for example it can be used with NTL, although in practice only with 100 digit precision at most.
Boost.Interval
Boost.Interval provides us with a support for manipulation of intervals and interval arithmetic. It can be used to represent simple numeric intervals (including the empty, infinite and semi infinite intervals) or to do a kind of error propagation for example.
To create an interval choose a number type and the lower and upper bounds
#include<boost/numeric/interval.hpp> boost::numeric::interval<double> unit(0., 1.); boost::numeric::interval<double> two(2.); //interval contains one point boost::numeric::interval<double> w = boost::numeric::interval<double>::whole();
if the, yet to be, lower limit is larger than the upper limit an exception (runtime error) is raised. Different properties of the interval can be accessed by the functions lower, upper, width (or norm), median, empty, singleton, equal, in, zero_in, subset, proper_subset, overlap, intersection, hull, bisect. Most interesting intervals can be manipulated by arithmetic operators (+,-,*,/) with numbers or with other intervals.
using boost::numeric::interval; interval<double> i1 = unit*2.; //duplicates the unit interval [0.,2.] interval<double> i2 = unit-1.; //shifts the unit interval to the left [-.5,.5]
A curious (and correct) result of the interval arithmetic is that
whole()*whole()==[-inf,inf] but pow(whole(),2) == [0,inf]
A semi-infinite interval can be generated (for types that have infinity),
interval<double> si(0.,std::numeric_limits<double>::infinity());
Some transcendental functions (exp, log, sin, etc) can be applied to intervals.
Intervals can be printed to the screen if we include an extra header file,
#include<boost/numeric/interval/io.hpp> ... std::cout << unit << std::endl;
To check whether an element belongs to the interval use the in function
if ( in( 0.5, unit) ) cout << "0.5 is in " << unit;
Intervals are considered closed (e.g. according to the 'in' function). Infinite intervals are also closed in the sense that they formally contain the floating point infinity.
Here it is an example http://codepad.org/nwo9Z2Be.
The library seems simple at first sight but the details (and the implementation) are quite tricky, for example regarding the proper rounding of the bounds or whether empty intervals are allowed or not. Some special cases are handled in the library by using special (template) options (policies). Notes on the design can be found here.
The library doesn't consider union of intervals, since it is not always another interval. The closest thing to the union operator is the 'hull' function, which gives the union of two intervals if they overlap.
Recently, another interval library, called Interval Container Library (ICL) was accepted and included in Boost. It has more abstraction, open/closed, multidimensional, and, in particular, interval sets and map (containers specific for intervals, which effectively work as interval unions, for example).
Boost.Random
Boost.Random is a library that allows us to pick random numbers from specific statistical distributions. It has the flexibility to separate the abstraction of the distribution and the randomness source. For example, you can choose a deterministic random source (pseudo-random generator) and plug in it to a specific distribution from where you can pick up numbers one at a time. For some systems Boost can even provide non-deterministic randomness from external devices and then plug it to your desired abstract distribution (not covered here).
A typical usage is the following
#include<boost/random.hpp> ... boost::mt19937 mt; //this is the randomness source boost::uniform_real<> rm1p1(-1.,1.); //this is the abstract distribution boost::variate_generator<boost::mt19937&, boost::uniform_real<> > die_real(mt, rm1p1); //glue randomness with distribution clog<< die_real() << endl; // pick up a random real from distribution
as you can see, the distribution (uniform in the interval [-1:1) ) is independent of the randomness source (Mersenne Twister #19937). The library allows to specify a certain set of predefined and user defined distribution.
This library (with minor changes) will become part of the standard library of C++0x.
(for complete code of examples: http://codepad.org/Uj56naPF).
Normal distribution
We can change the distribution to a normal one (gaussian) with a certain mean and deviation
using boost::mt19937; using boost::normal_distribution; using boost::variate_generator; normal_distribution<> n_10_4(10., 4.); variate_generator<mt19937&, normal_distribution<> > die_normal(mt, n_10_4); //glue clog<< die_normal() <<endl; //pick a random real from normal distribution
As a side note, it is well known that when we generate a normal distributed random number from a uniform one, we get a second normal random number for the same price. In other words, generating normal numbers one by one is less efficient than generating pairs. The Boost implementation of normal_distribution takes advantage of this but gives the illusion that the numbers are generated one at time, internally only pairs are generated. The trick is that the second number is cached and not calculated in the next round (i.e. second call of die_normal()). In the third call a new pair is generated, the first number of the pair is returned and the second one is stored for the following call.
The omitted template parameter of the distribution is the type of the returned value, by default it is double but it can be other floating point number (namely float). In fact other useful distributions can return multiple numbers. For example, the following code returns two numbers corresponding to a random point in a 2-sphere:
using boost::rand48; using boost::uniform_on_sphere; rand48 rand; uniform_on_sphere<> us(2); variate_generator<rand48&, uniform_on_sphere<> > die_place_in_the_world(rand, us); clog<< v2[0]<<" "<<v2[1]<<endl;
v2 is normalized (v2[0]*v2[0]+v2[1]*v2[1]==1). The dimension of the sphere can be arbitrary. The example also shows how to change the source of randomness (a.k.a. the underlying random number generator). In this case we change a particular Meressene Twister generator (mt19937) by a particular linear congruential (rand48).
Random Sources
The list of possible random generators is given here. For extra control on the number generator, the parameters can be passed as template parameters, for example mt19937 and ecuyer1988 are internally defined as
typedef mersenne_twister<uint32_t,624,397,31,0x9908b0df,11,7,0x9d2c5680,15,0xefc60000,18, 3346425566U> mt19937; typedef linear_congruential<int32_t, 40692, 0, 2147483399, 0>, 0> ecuyer1988;
All the random generators mentioned are deterministic and return the same sequence after each run unless the seed changes. To change the seed the object can be initialized with an integer that represents the seed or it can be changed at any point with the .seed member function.
boost::mt19937 mt(123); //this is the randomness source ... mt.seed(321);
Distributions
A list of possible distributions is given here. Since it is very common, it worths mentioning that integer uniform distributions can be also defined.
uniform_int<> six(1,6); // distribution that maps to 1..6 variate_generator<boost::mt19937&, uniform_int<> > die(mt, six); int x = die(); // simulate rolling a die
An interesting example is the Bernoulli distribution, whose associated generator returns a bool (true or false) with a given probability.
bernoulli_distribution<> mostly_true(0.75); //will return true 75% of the time variate_generator<boost::mt19937&, bernoulli_distribution<> > acceptance(mt, mostly_true); if(acceptance()==true) call_accept(); else call_reject();
Another distribution of integers is the binomial distribution (related to the Bernoulli distribution)
binomial_distribution<> fortydrops(40,0.5); //a fair coin will be tossed 40 times.
which returns the (random) number of face outcomes of 40 coin tossings. (This distribution is undocumented.)
Note on syntax (&)
The ampersand (&) at the end of the random source type in the template parameter (boost::mt19937&) is important if the function where the line is, is reentrant (called many times). Because it will guaranty that the seed is not reset between calls. (It is possible that we always want the same random sequence when calling a function, but that is not very common.)
It is also important not to put it when the random generator is declared in one line.
using namespace boost; variate_generator<mt19937, uniform_real<> > die_real(mt19937(), uniform_real<>(-0.2,0.2));
Boost Math Constants
(Not to be confused with Boost Units Constants.) This is not a library per se, is an internal part of Boost.Math. It just defines a few mathematical constants numerically to some high literal precision. The advantage compared to other libraries is that it provides the numerical constants with arbitrary type (always based in a hard-coded 100 digit expansion).
For example if we want pi to 'float' accuracy we call
#include <boost/math/constants/constants.hpp> float pi_float = boost::math::constants::pi<float>();
if we want the truncation at 'long double' precision we call
using boost::math::constant::pi; long double pi_ldouble = pi<long double>();
the translation to the right precession is made during compilation.
Besides pi, there are several other constants predefined template as listed in the reference page:
Why as many as 100 digits? Because this works also with user defined numeric type that can exploit extended precision (for example 100 digits or more), such as mpfr numbers.
#include <iostream>
#include <boost/math/bindings/mpfr.hpp>
#include <boost/math/special_functions/gamma.hpp>
#include <boost/math/constants/constants.hpp>
int main(){
mpfr_class::set_dprec(500); // 500 bit precision
mpfr_class v = boost::math::tgamma(sqrt(mpfr_class(2)));
mpfr_class pi = boost::math::constants::pi<mpfr_class>();
std::cout << std::setprecision(50) << v << std::endl;
std::cout << std::setprecision(50) << pi << std::endl;
}
// output:
// 8.8658142871925912508091761239199431116828551763805e-1
// 3.1415926535897932384626433832795028841971693993751
(the example needs GMP, MPFR, and GMPFRXX libraries installed to work, it also shows that many Boost.Math functions work with multiprecision types and not only with doubles).
Boost Math Special Floating Points
There some tricks to determine special floating point conditions (these include 'nan', 'inf' and zero). Unfortunately there is no standard way to do it since it depends on the system. In the systems in which such detection is possible Boost implements a series of functions to determine such special conditions:
#include<boost/math/special_functions/fpclassify.hpp> using namespace boost::math isinf isnan
It can also detect normal conditions
isfinite isnormal
The difference between the last two is that zero (0) is not considered a normal floating point (but 0 is finite).
Due to compatibility with macros defined elsewhere it is recommended to enclose the function in parenthesis
#include<boost/math/special_functions/fpclassify.hpp> (boost::math::isnan)(x) // test that the value held by x represents a number, for example 5.2, 0., or inf.
Additionally, the function fpclassify<> returns an integer code (FP_ZERO, FP_NORMAL, FP_INFINITE, FP_NAN, FP_SUBNORMAL) that tells what kind of floating point is passed, as documented here.
(The distinction between -inf and +inf can still be done with the comparison to zero).
Note that the best way to filter inf and nan together is to test agaist isfinite condition: 'inf' is not finite and neither is nan. (technically, nan is not infinite either).
Boost Math Tools
Many math functions in Boost.Math use some auxiliary algorithms, some of them are very useful, this is a synopsis of the root finding algorithms without derivatives. There many other libraries that provides this funcionalities, the advantange of this implementation is that the algorithm works with arbitrary types, for example extended precision integers or Boost.Units quantity types.
To be general, lets assume one has a functoid f of type F that can be evaluate at any point t of type T, and that the result of F(t) is a numeric quantity that can be ordered, i.e. f(t1) < f(t2) or f(t2) < f(t1). All these functions return a bracket contained in a pair. F can be a function pointer or a class with operator()(T t) defined.
#include <boost/math/tools/roots.hpp> using namespace boost::math::tools
bisect(f, min, max, tol[, max_iter[, policy] ]); // needs to be bracket initaly
toms748_solve(f, a, b, tol[, max_iter[, policy] ]); // solves given the known initial bracket
toms748_solve(f, a, b, fa, fb,tol[, max_iter[, policy] ]); // idem but the initial values are known (saves two evaluations)
bracket_and_solve_root(f, guess, factor, rising, tol[, max_iter[, policy] ]);
The last does both a bracket search and uses it to obtain a root via the toms748, needs f to be monotonic and whether it is a raising function or not. The parameter factor is used to scale the value of the guess in order to first bracket the root. Therefore the guess bracket only does so within positive or negative numbers depending on guess. (Although restrictive in many cases the sign of the root is known in advance.) Internally uses the TOMS748 algorithm, i.e. toms748_solve.
Boost Basic Linear Algebra Math (Boost.Ublas)
Boost.Ublas is a special kind of library, since it is an interface library. Ublas interfaces the good old BLAS library to modern C++ notation and without loss of efficiency.
First a little bit of plain old BLAS to produce the following operation
// c++ dgemm_example.cpp -lblas #include<iostream> #include<cassert> using std::cout; using std::endl;
// external declaration (to bind with fortran)
extern "C" {
void dgemm_(const char* transa, const char* transb, const int* l, const int * n, const int* m,
const double* alpha, const void* a, int* lda,
void* b, int* ldb,
double* beta, void* c, int* ldc);
}
int main(){
// declaration
int Ac = 3;
int Ar = 2;
double* A = new double[Ac*Ar];
int Bc = 2;
int Br = 3;
double* B = new double[Bc*Br];
int Cc = 2;
int Cr = 2;
double* C = new double[Cc*Cr];
// definition (and printing)
cout << "A = \n";
for(unsigned ic = 0; ic != Ac; ++ic){
for(unsigned ir = 0; ir != Ar; ++ir){
cout << (A[ic*Ar + ir] = ic + ir) << ", ";
}
cout << "\n";
}
cout << "B = \n";
for(unsigned ic = 0; ic != Bc; ++ic){
for(unsigned ir = 0; ir != Br; ++ir){
cout << (B[ic*Br + ir] = ic + ir) << ", ";
}
cout << "\n";
}
cout << "C = \n";
for(unsigned ic = 0; ic != Cc; ++ic){
for(unsigned ir = 0; ir != Cr; ++ir){
cout << (C[ic*Cr + ir] = ic + ir) << ", ";
}
cout << "\n";
}
// function call
{
// parameters
double alpha = 1.;
double beta = 0.;
char transA = 'N';
char transB = 'N';
cout << "C = alpha*A^" << transA << "*B^" << transB << " + beta*C, with alpha = " << alpha << " and beta = " << beta << "\n";
int ldA = Ar;
int ldB = Br;
int ldC = Cr;
// runtime dimension check
assert( Ac == Br ); int k = Cc;
assert( Ar == Cr ); int n = Cr;
assert( Cc == Bc ); int m = Ar;
// value to reference transformation (to bind with fortran)
dgemm_(&transA, &transB, &n, &m, &k, &alpha, A, &ldA, B, &ldB, &beta, C, &ldC);
}
// print output
cout << "new C = \n";
for(unsigned ic = 0; ic != Cc; ++ic){
for(unsigned ir = 0; ir != Cr; ++ir){
cout << C[ic*Cr + ir] << ", ";
}
cout << "\n";
}
return 0;
}
The previous code has many drawbacks, in fact all the operation have a very low level of abstraction. For these reasons, several interfaces have been created cblas (cblas_dgemm) in the context of C, and gsl_blas_dgemm in the context of GSL. Both are partial solutions to the abstraction problem.
Now let's see how to perform the same operations with Ublas.
#include <boost/numeric/ublas/matrix.hpp>
#include <boost/numeric/ublas/io.hpp>
using std::cout; using std::endl;
int main () {
using namespace boost::numeric::ublas;
matrix<double> A(3, 3);
for (unsigned i = 0; i != A.size1(); ++ i)
for (unsigned j = 0; j != A.size2(); ++ j)
A(i, j) = 3. * i + j;
cout << "A = " << A << endl;
matrix<double> B(3, 3);
for (unsigned i = 0; i != B.size1(); ++ i)
for (unsigned j = 0; j != B.size2(); ++ j)
B(i, j) = 3. * i + j;
cout << "B = " << B << endl;
matrix<double> C(2, 2);
for (unsigned i = 0; i != C.size1(); ++ i)
for (unsigned j = 0; j != C.size2(); ++ j)
C(i, j) = 3. * i + j;
cout << "C = " << C << endl;
C = 0.5*prod(A, B) + 1.*C;
cout << "new C = " << C << endl;
return 0;
}
Using uBlas on existing code together with C-arrays (shallow_array_adaptor)
uBlas matrix and vector can be adapted to programs that already use raw C-arrays, by passing the address of the memory to be managed. The key is to #define BOOST_UBLAS_SHALLOW_ARRAY_ADAPTOR before #includeing the uBlas headers. After that a third optional parameter is passed to the matrix or vector declaration. Note that the data in memory is not copied but managed by uBlas.
#if compile
time c++ $0 -o $0.x && $0.x $@; exit
#endif
#define BOOST_UBLAS_SHALLOW_ARRAY_ADAPTOR //activate array adaptor
#include <boost/numeric/ublas/vector.hpp>
#include <boost/numeric/ublas/matrix.hpp>
#include <boost/numeric/ublas/io.hpp>
#include <iostream>
using std::cout; using std::endl;
int main() {
using namespace boost::numeric::ublas;
unsigned N = 4, M = 5;
double *data_matrix = new double[N*M];
data_matrix[1] = -3.14159;
// Create a boost ublas matrix using the data_matrix pointer as storage.
matrix<double, row_major, shallow_array_adaptor<double> > A(
N, M, shallow_array_adaptor<double>(N*M,data_matrix)
);
A(3, 3) = 3.0;
A(0, 2) = 4.0;
// Should print -3.14159, 3
cout << A(0,1) << ", " << A(0,2) << ", " << A(3,3) << endl;
cout << "A = " << A << endl;
cout << data_matrix[2] << endl;
return 0;
};
Boost.Accumulators
During a computer simulation it is often useful to 'collect' or 'accumulate' statistics. This accumulated statistics can be reported at the end of the simulation or used immediately on-the-fly to monitor or check for convergence or steady states. For example, the accumulated statistics can be the mean, variance and extremes of a certain variable.
Writing code that does this is indeed trivial but since it is not the main function of the code it can interfere with the normal flow of the code, we have to introduce many new variables that keep track of the statistics, such as counter, total sum, sum of squares, etc. Then these quantities have to be combined in certain ways when reporting the results.
The Boost.Accumulators library provides a consistent way of collecting this information. Accumulators can also keep track of extreme values (min/max), element count and other user defined features.
The first step is to create an accumulator object with certain 'features'. For example if we want to have an accumulator that reports the mean value, the second moment and min and max values we declare
#include <boost/accumulators/accumulators.hpp> #include <boost/accumulators/statistics/mean.hpp> // or other stat features ... using namespace boost::accumulators; accumulator_set<double, stats<tag::mean, tag::variance, tag::min> > acc;
The first template parameter is the type of the data elements that will be feed to the accumulator, the second argument is the list of features or statistics that the will be accumulated. Note that the list of features is specified at compile time, therefore the library is able to know the number of parameters that this particular accumulator will store during book-keeping. For example, if the only feature is the 'mean' value the sum of squares is not necessary and never computed (only the count and the sum of elements is stored). On the other hand, if tag::min or tag::max are the only features it is not necessary to keep the count internaly, etc.
As a second step we need to actually accumulate the information, this is done simply by enclosing the data in the parenthesis operator of the accumulator, for example in a loop
for( ... ){
double data_point = function(...);
acc(data_point);
...
}
As third and final step and after some (or all) data is collected we need to extract the final result that the object was comited to accumulate. To extract the information we must apply a function with the name of the statistics we wanted,
std::cout<<"mean: "<<mean(acc)<<std::endl; std::cout<<"variance: "<<variance(acc)<<std::endl;
only statistics or features that were declared are allowed to be applied to the accumulator, otherwise a compile error happens.
std::cout<<"max: "<<max(acc)<<std::endl; // <-- compile error, max feature was not requested when acc was declared
The accumulators also support covariance, i.e. how two or variables change together.
Special numerical operations (boost::numeric::functional)
Boost.Accumulators and statistics in general assume a set of numerical operations on the accumulated objects that may be larger than the number of operators defined for that object, or may change the meaning of defined operations in the context of statistics. For example, if we accumulate integers, what is supposed to be the average of integers, is the average a real number? or another integer? In any case we should be explicit because otherwise the native C operations will perform integer-division (where 3/2==1).
For Boost.Accumulators, these special operations are defined in the namespace boost::numeric::functional.
Another example is an object that doesn't carry information on how to divide by an integer, for example std::complex<double>. In these cases we have to extend the operations available to those objects. An interesting case that doesn't define division by integer is boost::units::quantity<Unit>.
In this example, we specify how the division by an integer is carried on:
namespace boost {namespace numeric{namespace functional{
using namespace boost::units;
template<class Unit, typename Y> struct quantity_tag{};
template<class Unit, typename Y> struct tag<quantity<Unit,Y> >{
typedef quantity_tag<Unit,Y> type;
};
template<typename Left, typename Right>
struct average<Left, Right, quantity_tag<typename Left::unit_type, typename Left::value_type>, void>{
typedef Left result_type;
result_type operator()(Left & left, Right & right) const{
return left/(typename Left::value_type)right;
}
};
}}}
(This is a technique is known as 'static tag dispatching'.)
Fortunately this step is not necessary for std::complex<double> and std::vector<double> (taken as a vector in a linear space) if we include these adaptor headers before the accumulator headers:
#include <boost/accumulators/numeric/functional/complex.hpp> #include <boost/accumulators/numeric/functional/vector.hpp>
These headers tell the library how to average complex numbers and vector of numbers.
Other Resources
http://www.bnikolic.co.uk/boostqf/accumulators.html Official Manual http://boost-sandbox.sourceforge.net/libs/accumulators/doc/html/index.html
Boost.Format
The C-way to print formatted variable numbers is printf("...format string...", a, b, c, ...). This function call tends to have error since we have to make sure that the format string agrees with the actual type of the variable. In other words, printf is not type-safe. The C++ (which still allows to call printf) changed the preferred syntax to the std::cout (and std::cerr) shift(<<) operator:
std::cout<<"text "<<float_number<<" text"<<std::flush<<" more text "<<integer_number<<std::endl;
The user doesn't have to worry about the type of the variable that is printed and the operator<< can be overloaded to print user defined classes. This was, in my opinion, a great improvement. Yet, sometimes it is necessary to specify the format in a way that is constructed at runtime or the format is specified in a separate part of the code. Enter, Boost.Format.
Boost.Format allows to specify printing format in a similar way as printf used to do, with similar and even greater flexibility, But in a way that is type-safe.
The format is specified by a string were the entries are represented by a number (starting at 1) enclosed in percent symbol, like %1%, %2%, %3 etc. After the format is specified the variables (which can be any type in principle) are passer via the operator % (not the string %). For example
#include<boost/format.hpp>
...
cout<< boost::format(" %1% | %2% | %3% ") % a % b % c << "rest of the line" <<endl;
If the number of variables passed after 'operator%' is different from the maximum fields index present in the format string the program will give an error (i.e. raise an exception of type 'boost::io::too_few_args' or 'boost::io::too_many_args' respectively).
So far, in the example, the use are no different than the normal operator<< in C++. The first difference to note is that
1. The format string can be specified at runtime
std::string frmt; ... //assign frmt from user input, e.g. frmt = "%1% | %2%"; cout<<boost::format(frmt) % a % b ;
2. Format can repeat fields
cout<< boost::format("%1%, %2% | %1%") % a+b % c ; //a+b is printed twice, e.g. '4, 5, 4'
The advantage with respect to printf is that the boost::format is flexible but still type-safe.
3. For numeric types boost::format (like printf) can specify the format of the numeric output.
cout<< boost::format("%1$+05u") % 234; //prints '+0234'
For variables of any type the field number is enclosed in '%', for numeric fields with specific format the field number is enclosed by '%' on the left and '$' on the right followed by the numeric format string. This numeric format string follows the printf conventions like in these examples:
%[flags][width][.precision][length]specifier
For example '%1$+05u' prints an integer (unsigned or int) field using (at least) '5' characters (sign included) completing with left zeros '0'. Plus symbol '+' means that the sign is always specified even if the number is positive. If the field is not an integer (for example, it is double or string) the numeric format is ignored and no error is reported. If the integer needs more than three characters it will use them.
Invalid format strings give an error at run time. So it is a good idea to test the formatting in a small program, for example here http://codepad.org/BINppZ4i.
Lexical Cast (convert to string)
In boost/lexical_cast.hpp there is a function that converts any type of variable in any other as long as the the types can be used as output or inputs in a stream. A typical use is
double a = 5.123; std::string a_string = boost::lexical_cast<std::string>(a); assert( a == "5.123");
The function is very useful for converting any variable in a string, which can by manipulated as such. Unfortunately that simple function does not control the format in which numbers are converted to string. A combination of the function boost::str and the boost::format object can by used to overcome this limitation:
double a = 5.123;
std::string a = boost::str(boost::format("%011.2f")%a); // str(format("%011.2f")%a);
assert( a == "00000005.12" );
Boost.Units
Most physical quantities have units, if we don't take proper care of units, bad things will happen. When programming we usually don't care about the units because we tend to work in consistent systems of units. Sometimes, we need to output some variables in a different system of units that the main program is working on. We do this by doing some adhoc conversion with some multiplication in the output. Other times the input is an different sets of units.
To give an example, when working in atomic problems it makes complete sense to program in atomic units (au) (Hartree, Bohr, etc.). Sometimes we would like to use input parameters in Angstroms instead or more often we need to output the stress tensor in Pascals (which is not au).
Physical units are some sort of typing system of physical model, under some physical models it doesn't make sense to sum length and time. On the other hand, if we are working in a relativistic model it does make sense to sum lenght and time (and of course since there is no disticition this is called space-time). So this "typing system" is not unique an depends on the model we are using.
Once we stablish a model by can give this dimensional information to the program. The ad hoc way is to define conversion factors all over and add comments or same specific names to the variable. But this doesn't guaranties that at some point we will sum and energy for example. This function calculates the area
double area(double x, double y){return x*y;}
here we implicitly assume that the two doubles in the argument are lenghts (L), and the return type is an area (L*L) but the program doesn't know. So the subroutine is really a multiplication routine, not an area-calculating routine (it is just an interpretation). Somebody can without your knowledge can modify the routine and return x+y and you will interpret the result as an area.
So, what is the solution, in C++ we could create classes whose instance will keep track of the units. This runtime approach is very inefficient, suppose we have an array of 1000 length (e.g. measurements) we don't want each of these variables to have memory used with the redundant that all of them are lengths. We also want to a routine that is as efficient as the routine that works with built-in double.
The solution is to store the dimension of the variable in the type it self. Then all the dimensional correctness is enforced at compilation-time and runtime efficiency is preserved.
The idea is more or less to define type that will have dimensional information in such a way that:
double_area area(double_length x, double_length y){return x*y;}
Manually writting all the possible resulting dimensions, handling products sums, etc, is not practical. Here is where Boost.Units comes to the rescue. Boost.Units is a very complicated library that enforces dimensional correctness. Dimensional analysis requires a linear algebra solver, this is implemented in Boost.Units by means of template technology. The official manual can also be of interest.
Definining Units
Systems of Units
To use Boost.Units we don't necessarelly need to define a system of units, most common systems SI and CGS ara already defined. But in order to give the complete example here it is the code to define the atomic units (au) system. This section gives an idea of how the library works but it can skipped.
#include <boost/units/base_unit.hpp>
#include <boost/units/physical_dimensions/mass.hpp>
#include <boost/units/physical_dimensions/length.hpp>
#include <boost/units/physical_dimensions/electric_charge.hpp>
#include <boost/units/physical_dimensions/angular_momentum.hpp>
#include <boost/units/physical_dimensions/energy.hpp>
#include <boost/units/make_system.hpp>
#include <boost/units/io.hpp> //do not remove: '...has initializer but incomplete type...' error
namespace boost{
namespace units{
namespace au{
using std::string;
// taken from http://en.wikipedia.org/wiki/Atomic_units#Fundamental_units
struct electron_mass_base_unit : base_unit<electron_mass_base_unit, mass_dimension, 1>{
static string name() {return ("electron_mass");}
static string symbol() {return ("me");}
};
struct bohr_radius_base_unit : base_unit<bohr_radius_base_unit, length_dimension, 2>{
static string name() {return ("bohr_radius");}
static string symbol() {return ("a0");}
};
struct elementary_charge_base_unit : base_unit<elementary_charge_base_unit, electric_charge_dimension, 3>{
static string name() {return ("elementary_charge");}
static string symbol() {return ("e");}
};
struct reduced_planck_constant_base_unit : base_unit<reduced_planck_constant_base_unit, a ngular_momentum_dimension, 4>{
static string name() {return ("reduced_planck_constant");}
static string symbol() {return ("hbar");}
};
struct hartree_energy_base_unit : base_unit<hartree_energy_base_unit, energy_dimension, 5>{
static string name() {return ("hartree_energy");}
static string symbol() {return ("Eh");}
};
typedef make_system<
electron_mass_base_unit,
bohr_radius_base_unit,
elementary_charge_base_unit,
reduced_planck_constant_base_unit,
hartree_energy_base_unit
>::type system;
typedef unit<dimensionless_type, system> dimensionless;
typedef unit<mass_dimension, system> mass;
typedef unit<length_dimension, system> length;
typedef unit<electric_charge_dimension, system> electric_charge;
typedef unit<angular_momentum_dimension,system> angular_momentum;
typedef unit<energy_dimension, system> energy;
static const mass electron_mass;
static const length bohr_radius, bohr;
static const electric_charge elementary_charge;
static const angular_momentum reduced_planck_constant;
static const energy hartree_energy, hartree;
}
}
}
The code seems complicated but it has its logic. There are four important blocks. The first is basically a table of names and symbols for the fundamental units in that system, with some abstract information on the physical dimension. The symbols will be used when printing the values to the screen. The numbers on the left are conventional numbers to identify the units internally, any positive number is ok as long as it doesn't clash with other units defined, negative numbers are reserved. (For example, the SI system --included in the library-- uses negative numbers.) The make_system line creates the system of units, and the following block gives short type names to the physical dimension of the fundamental units. Finally last block defines unit constants (magnitudes with value 1) in the system of units. Synonyms can be defined here very easily. (For example, hartree_energy and hartree).
Free standing units
Sometimes we need to define units that are not part of a larger system of units. In this way we can define units such as electron volts (eV). The quantities with such units will print nicely with the symbol 'eV' after it. At the same time it is useful that these quantities convert automatically to units with the same dimensions in a system:
namespace boost{namespace units{
namespace atomic{ //optional, this free unit can be part or not of the namespace of a system
struct electron_volt_base_unit : base_unit<electron_volt_base_unit, energy_dimension, 43289 /*unique positive number*/ > {
static const char* name() { return "electronvolt"; }
static const char* symbol() { return "eV"; }
};
typedef electron_volt_base_unit::unit_type electron_volt_unit;
static const electron_volt_unit electron_volt, eV;
}
}}
BOOST_UNITS_DEFINE_CONVERSION_FACTOR(
boost::units::atomic::electron_volt_base_unit,
boost::units::atomic::energy,
double, 0.03674932 // 1 eV = 0.03...E_h
);
BOOST_UNITS_DEFAULT_CONVERSION(
boost::units::atomic::electron_volt_base_unit,
boost::units::atomic::energy
);
use as
quantity<atomic::electron_volt_unit> ene = 1.*atomic::eV; cout<<(double)quantity<atomic::dimensionless>(ene/atomic::hartree)<<endl; //automatic conversion to dimensionless
Scaled units
All units (including free standing units) can defined scaled counterparts, moreover, by default a standard SI prefix (kilo/k, Mega/M, Giga/G) is added to the name of the variable if the scale is a standard power of ten (i.e. 10, 100, 1000, 1000000, but not 10000). Supported prefixes are zepto (-21 zeros), atto (-15), pico (-12), nano (-9) ... peta (15), exa (18), zetta (21), yotta (24). (Hella (27) not supported but can be added easily :) )
typedef make_scaled_unit<si::pressure, scale<10, static_rational<9> > >::type gpa_unit; static const gpa_unit gigapascal, GPa; //use as quantity<gpa_unit> P = 10.*GPa;
(si::pressure is defined in boost/units/system/si/pressure.hpp).
automatically prints to '10 GPa' ('Pa' --for pascal-- string is defined in boost/units/system/si/io.hpp, the 'G',--for giga-- is added by the scaled unit character).
A similar code can be used to define MeV (megaelectronvolt) automaticaly based on the definition of eV in the previous section.
Defining physical quantities
Quantities with Fundamental Units
Using fundamental quantities in a given system is almost trival, first we will easy the notation by specifing that we will use Boost.Units intensively and the system of units.
#include<boost/units.hpp> using namespace boost::units; //for quantity
The we tell the compiler that a certain quantity has physical dimensions and how to store the actual value.
quantity<au::energy, double> a;
so defined 'a' will have the value 1 in the fundamental units of the system (Hartree). Note that at this point the program doesn't know what a Hartree is, it is just some abstract unit of a certain dimension called "energy". The we can assign a value
a = 30.*au::hartree;
We can print the values, by using the usual cout<< "a = "<< a <<endl;. The output will read
'a = 30 Eh'
where "Eh" is the symbol we defined in our table.
In the same way we can define a quantity with unit of length
quantity<au::length> b = 5.*au::bohr;
Note that we can omit 'double' since this is assumed to be the default. We declared 'a' to have units of energy, and this can not change in the future. Here it is the first implicance of using Boost.Units, if we try to do
a = b; // compile error here
which is wrong, we will get an error... from the compiler! We don't have to wait until the program is executed to realize what we just did. In fact the program will not compile until we fix this dimensional inconsistency. Here it the main difference with the typical procedure, we could have used the type double for a and b; but then the previous line would have been completelly ignored.
Although the previous line was wrong, different can be combined in correct ways for example
std::cout<< "a * b = "<< a * b << std::endl;
will print 150. Eh a0. As expected, other combinations, like a + b will fail to compile.
Quantities with Derived Units
Obviously that to preserve the dimensional consistency de result of 'a / b' has to have units of force. So, if we want to store the result we need to declare a new variable with units of force, this is achieved that to a the Boost.Units metafunction
quantity<unit<force_dimension, atomic> > c = a/b;
or we can shorten the syntax with using
typedef unit<force_dimension, atomic> atomic_force_unit; quantity<atomic_force_unit> c = a/b;
There are many physical dimensions predefined in boost/units/physical_dimensions/. For example: acceleration, area, magnetic_flux, stress, surface_tension, velocity, volume and many others.
Still, there maybe some quantities that we need to define that don't have a name. For example if we need something with units of inverse energy times inverse volume we don't have a predefined dimension for that, although this is the units of the electronic density of states. In this case we need to define the quantity ourselves. Boost.Units uses the syntax of Boost.MPL to express such composition. For example
using mpl::times; using mpl::divides; typedef boost::mpl::divides<dimensionless_type, times<lenght_dimension, volume_dimension>::type>::type dos_dimension; typedef unit<dos_dimension, atomic::system> atomic_dos_unit; //or one_over_hartree_Bohr3 quantity<atomic_dos_unit> mydos = 1./en/vol;
A big shortcut could that avoid all the previous, be to use
BOOST_AUTO(c, a/b); // or 'auto c=a/b;' in C++0x (compile with 'g++-4.4 -std=c++0x')
to let the compiler the compiler "detect" the correct type of 'c' (which can have a very complicated specification) but this can lead to problems as we loose track of the units of the quantities involved.
Mathematical Funtions on Physical Quantities [boost/units/cmath.hpp]
Many special functions for floating point numbers (not quantities) are defined in the header file cmath for example:
#include<cmath> ... double a = sqrt(b); double b = pow(b, 3); //see below why this can not be used with units
In the same way Boost.Units implement its own cmath header (called boost/units/cmath.hpp) to take care of unit correctness:
- Square Root (
sqrt)
#include<boost/units/cmath.hpp> ... quantity<unit<area_dimension, au::system> > square_area = 64*au::bohr*au::bohr; quantity<unit<lenght_dimension, au::system> square_side = boost::units::sqrt(square_area); // square_size = 8*au::bohr
or much shorter
using boost::units::sqrt; auto square_side = sqrt(square_area);
In general the 'using' statement is not even necessary since the namespace of sqrt is deduced from the argument type. 'auto' is a keyword of C++0x, if it not used then we have to explicitly declare the quantity with the right units (and dimension).
- Integer Power (
pow)
In the case of power it is different because to track units at compile-time the value of the integer power must be a constant (i.e. a value known at compile-time), therefore it can not be passed as a second argument, instead it is passed as a template argument
auto cube_volume = boost::units::pow<3>(cube_side);
- Rational Power (
rootandpow)
It is not uncommon that sometimes rational power of units are used in intermediate calculations, the first example is sqrt above. More generally:
auto cube_side = boost::units::root<3>(cube_volume);
The most general case is handled with
auto cube_face = boost::unit::pow< boost::unit::static_rational<2,3> >(cube_volume);
or for short:
auto cube_face = pow<2>(root<3>(cube_volume));
- Trigonometric functions (
sin, cos, atan, atan2, ...)
Trigonometric functions are implemented for arguments for which its units are of plane_angle_dimension. Using units for angles can add an extra layer of type checking but it can be counter-intuitive, specially if you are already used to work in radians. If the argument of these functions is dimensionless then the standard (e.g. std::sin is called).
- Absolute value (
fabs)
quantity<si::length> distance = fabs(x1 - x2);
Use existing numerical functions, subvert dimensional checking (::from_value and .value)
An arbitrary user defined function
double scale(double x, double factor){ ... ; return result;}
does not work by default with quantities, unless the quantity is dimensionless. Even if we managed to make it work by some kind of trick we would still like to keep track of the resulting dimension, which may o may not be of the same units as the input.
For simplicity let's assume that the output must have the same dimension as the input. The we can overload the function in this way and internally call the good old function
quantity<si::length> scale(quantity<si::length> x, double factor){
return quantity<si::length>::from_value( scale(x.value(),factor) );
}
Note that 'from_value' and 'value' subvert the dimensional checking but can be nonetheless necessary for forcing the internal arithmetic.
The function can be generalized to arbitrary quantity dimension via templates. Even more challenging is to determinine (also from template programming) the correct type for the returned quantity. For this metafuctions such as power_typeof_helper<Unit, static_rational<P,Q> >::type and mpl::times can be useful.
Display internal unit types for debugging (simplify_typename)
Boost Units relies in creating an extremely complicated type system, which in principle should be transparent to us since as long as we know the dimensionality and the system units we are working with. In general we never refer to the complicated types that the library generates. In reality, sometimes we need to know what is the exact type of a certain variable, for example for debugging. There is a function called simplify_typename that prints the exact type of the variable (and gets rid of the ubiquitous 'boost::units' string).
#include<boost/units/io.hpp> ... quantity<atomic::volume> v(4.*pow<3>(atomic::bohr)); cout<<boost::units::simplify_typename(v)<<endl; cout<<v<<endl;
In this example we know that the exact type of the variable 'v' is
quantity<unit<list<dim<length_base_dimension, static_rational<3l, 1l> >, dimensionless_type>, homogeneous_system<list<atomic::electron_mass_base_unit, list<atomic::elementary_charge_base_unit, list<atomic::reduced_planck_constant_base_unit, list<atomic::coulomb_force_constant_base_unit, list<atomic::boltzman_constant_base_unit, dimensionless_type> > > > > >, void>, double>
and its value is
4 (a_0^3)
Internally it is based on abi::__cxa_demangle from abicxx.h, which is a more general way to print type names at runtime. However 'simplify_typename' also simplifies the names by removing the 'boost::units' string.
Display quantities
Boost.Units supports pretty well the display of quantities with units:
cout<< "E = " << E << endl; // prints E = 3.55791e-12 m^2 kg s^-2
Depending on the system units can try to simplify the units before printing; this done if we #include<boost/units/systems/si/io.hpp>, in this case the printing is simplified to
... // prints E = 3.55791e-12 J
(J is for Joule unit).
Sometimes is necessary to separate the actual numerical value and the unit tag of the string representation of a quantity, for example when printing data files. We want columns of numbers to print without the units, eventually the unit string can go only once at the header of the column. So achieve this, suppose that we have a
quantity<si::energy> E = 4.*si::joules;
then we can print the numerical value and the unit separatelly in the following way:
cout<< E.value() << " * " << si::energy() ; // or << quantity<si::energy>::unit_type(); // or << E_type::unit_type(); //after typedef typeof(E) E_type; // or << symbol_name(si::energy())
Note the parenthesis () after the unit type.
(The library support also Boost.Serialization.)
Unit Conversions
Units is not just about keeping dimensional consistency; it is also about being able to convert between different unit systems.
You can notice that we use the unit system prefix when declaring a quantity, for example in atomic units:
quantity<au::energy> E1 = 10.*au::hartree;
or in SI system
quantity<si::energy> E2 = 20.*si::joules;
We may wonder why we need to specify the unit system in the right hand side. We would like to abstract ourselves from the unit system, after all, both are energies! That is correct in an ideal world, in practice we work with numbers of finite precision therefore storing an atomic quantity in SI units leads to a very small number that can even go below the precision limit or at least affect the effective precision. This is the practical reason for which we use different unit systems in the first place.
In some sense giving information on the unit system to the quantity is telling the system which order of magnitude are que quantities. Nevertheless sometimes we need to convert units.
Ad hoc transformation
There is simple way of transforming quantities without much additional code. Suppose that we have a data file with energies and we know those energies are in electron volts (eV). But our system of choise is Hartree, we can express this information in the code by just defining a quantity called eV in our unit system. In this case eV is not a unit but a physical constant.
const quantity<au::energy> eV = 0.03674932540*au::hartree; //http://physics.nist.gov/cgi-bin/cuu/Value?evhr double numerical_input; cin>>numerical_input; //e.g. read from file quantity<au::energy> E = numerical_input*eV; //this tells that original data was in eV
Now for the output we can use the opposite trick. If we print E, the output will be in Hartree.
cout<<E; // shows for example 4. Eh
we can force the output to we in eV by doing
cout<< (double)(E/eV) << " eV"; //shows 108.84 eV
The cast operation '(double)' is needed to convert the dimensionless quantity to the built-in variable. Otherwise the printing will print 'dimensionless' after the numerical value. Cast operation to built-in type are useful in other context and only and it is only permitted if the quantity is dimensionless.
This solution is not optimal but it works pretty well for quantities output.
Automatic Type Declaration in C++0x (auto)
A word of caution about the 'auto' declaration type: In this example the type (and units) of x is determined automatically:
auto x = 0*au::hartree; //valid C++0x x+=0.1*au::hartree; cout<<x<<endl;
but outputs '0 Eh'!! This is because this declaration is equivalent to
quantity<au::energy, int> x = 0*au::hartree;
Note the 'int'. That is, the internal value is an integer. This is most likely not the desired behavior. Although the value initialized is zero the type the variable carries is not a floating point. Remember that '0' is an integer literal in C++. There are few alternatives.
- Declare the type explicitly
quantity<au::energy, double> x = 0*au::hartree;
- Add a decimal point or a literal indicator to the literal number
auto x = 0.*au::hartree; // or 0.0*au::hartree; // or 0f*au::hartree;
Physical Constants (SI codata)
For convenience the library includes the most common physical constants. For the moment physical data is given only for the SI systems, but with the implicit conversions it can be used in any unit system. The definitions are contained in boost/units/systems/si/codata_constants.hpp.
Most popular ones include
si::constants::codata::k_B; //Boltzmann constant si::constants::codata::c; //Speed of light si::constants::codata::G; //Gravitational constant si::constants::codata::hbar; //Reduced Plank constant si::constants::codata::m_e; //Electron Mass si::constants::codata::e; //Elementary Charge (positive electron charge).
all defined in SI units with the proper dimension. Note that these are not units but global constant C++ variables. For example k_B has units of Joule over Kelvin and has a specific numeric (double) value. For detailed reference see http://physics.nist.gov/cuu/Constants/index.html
The 'using', and 'using namespace' declaration can by used to simplify the notation:
using si::constant::codata::k_B; quantity<si::energy> e = 3.*k_B * 1000.*si::kelvin;
Boost.Operators and Boost.Concepts
This section is more 'mathematical'. When we program we tend to forget all that we know about mathematical theory we do so because too much abstraction at the coding level does not improve the code quality and speed, also C++ (or any other compiled language) does not provide by default enough facilities to allow this abstraction.
These two libraries, when combined, provide a nice framework to introduce some abstractions and at the same time a certain level of mathematical correctness to our code.
Abstraction will not provide faster code or readability for the most part, it only generalizes our code to the maximum extend that the mathematics behind the problem can provide.
Another constrain that I will impose is that the mathematical abstraction should not be introduced at the cost of program execution speed.
Let's start with a very simple example, let's try to construct a type that represents a vector space of fixed dimension (think of a 3D vector), a naive and perfectly valid implementation is the following
class v3d{
public:
boost::array<double, 3> coord_;
v3d(boost::array<double, 3> const& init) : coord_(init){}
friend v3d operator+(v3d const& , v3d const&){ ... elementwise addition ... }
friend v3d operator*(double const&, v3d const&){ ... elementwise scalar multiplication ... }
};
(I will use the 'friend' version for non-member operations).
In principle this a perfectly valid three-dimensional (real) linear space. This implementation allows us to perform any operation possible in this space. (There is no inner product defined, but this is not necessary to have a vector space, we are trying to be mathematical here, remember?).
The aim this section will be to create a class (or more exactly a template class) that is almost mathematically correct. There are obvious ways to make this class more general, first of all. We may want to have a complex space or different number of dimensions. Since we don't want to affect the performance if we need only a real space or a small number of dimensions, the way to implement this is with C++ templates:
template<typename T, unsigned NDim>
class vs{
boost::array<T, NDim> coords_;
...
};
This is a huge generalization. Yet we realize very soon that the use of the library is quite limited in the syntax. For example the user can not substract directly two vectors (v1-v2), it can not increment a given vector (v1+=v2), etc. The list of possible operation between elements is very large, in principle we need to implement +=,+,-=,-,*=,*,/=,/,==,!= and furthermore the implementations have to be consistent with each other. This is prone to errors. On the other hand we know that many of these operations have redundant implementations.
This is where Boost.Operators come to the rescue, it implements several operators in terms of others in a way that is consistent with the usual arithmetics. Let's start with += and +. We can have both operators at the price of one with inheriting our class from a template class provided by boost operatos.
template<typename T, unsigned D>
class vs : boost::addable< vs<T,NDim> >{
...
friend vs& operator+=(vs&, vs const&){ ... }
};
The trick is that the operator+ (for v1+v2) will be available to the class thanks to boost::addable even if we don't implement such operation explicitly. Two for the price of one. We can go further and do the same with operator -= and have operator- defined automatically.
template<typename T, unsigned D>
class vs : boost::substractable< vs<T,D> >, boost::addable< vs<T,D> >{
...
friend vs& operator+=(vs&, vs const&){ ... }
friend vs& operator-=(vs&, vs const&){ ... }
};
for each set of operation we can add a new dependency, in order for the list not to become to long, the library offers shortcuts, for example a class that additive is substractable and addable. This shortcut is provided because this is a usual case (mathematically speaking)
template<typename T, unsigned D>
class vs : boost::additive< vs<T,D> >{
...
};
This way of defining mathematical classes does not only provides syntactic sugar but also consistency within the syntaxis.
To complete the operations we need a couple of additional operations: for scalar multiplication/division and for comparisons. It turns out that this operators also can be defined with a minimal effort.
template<typename T, unsigned D>
class vs :
boost::additive< vs<T,D> >, //provides + (binary) and - (binary) given += and -=
boost::multiplicative< vs<T,D> , T>, //provides * and / given *= and /=
boost::equality_comparable< vs<T,D> , T> //provides != given ==
{
boost::array<T, D> data_;
friend vs& operator+=(vs&, vs const&){ ... }
friend vs& operator*=(vs const&, T const&){ ... }
friend vs& operator-=(vs& self, vs const& other){return self+=-other;}
friend vs& operator/=(vs const&, T const& t){return self*=(T(1)/t);}
// operator== is defined by default, and is correct if implementation is value based
}
Note that that we only need to implement two operators '+=' and '*='.
Concrete implementation vs. mathematical abstraction
An obvious implementation for operator += and *= looks like this
for(std::size_t i=0; i!=self.size(); ++i) self.data_[i] += other.data_[i]; //or *=t;
This is also completely acceptable but it is not the degree of abstraction we want. The produced class has the expected mathematical behavior but it has information extraneous to a mathematical description. For example, why are we forced to use boost::array, for example a user may want to implement the linear space with std::vector or some other container or not even a container. Second, even if we stick to a container implementation, why element by element addition has to be the concept of sum (there are other composition laws that allow for a correct concept of sum). Furthermore, why the linear space has to have finite dimensionality!! (finite dimensionality is not a mathematical requirement for a linear-space.)
To solve the first problem we have to parametrize the template class in the implementation, the class is changed to something like
template<class Element>
class vs{
Element data_;
friend vs& operator+=(vs&, vs const&){ ... ??? ... }
friend vs& operator*=(vs&, T??? const&){ ... ??? ... }
};
Now, this is completely generic regarding the internal representation of the linear space; but it creates a big problem. There is no way to implement += in a generic way, even if we constrain Element to be a container we don't know what is the type of the 'scalar' (T???) in the scalar multiplication!! The only way out is to add this information explicitly to the class template parametrization. This is becoming really mathematical. Let's review for a moment the mathematical definition of a linear space. Using Nakahara's [2] definition
"A vector space (or linear space) over a field is a set in which two operations, addition and multiplication by an element of (called scalar), are defined. The elements (called vectors) of satisfy the following axioms
- (commutative addition)
- (associative addition)
- (additive identity/zero)
- (additive inverse/negation)
- (linear scalar product)
- (distributive scalar product)
- (scalar product multiplicative)
(where and )"
The first paragraph tell us that we have to define a set of elements (the equivalent of our implementation data type), then we need to define a scalar field and then implicitly it says that we should introduce a concept of addition and multiplication with certain properties. These four pieces of information are in general independent, so we need to parametrize the vector space with all of them.
template< class Element, typename Field_Scalar, class Commutative_Associative_Addition, class Associative_ScalarMultiplication> ...
At the same time we want a desirable mathematical property, that is that when we provide the linear space properties to a given set, the elements of the set preserve certain properties. For example, given the set of one variable functions, we can define a linear space by providing a concept of sum and multiplication (e.g. ) in the process we still think of the elements of the set as functions. The way to preserve the properties of the original set is to use public inheritance from the set type instead of class member of that type.
...
class vs : public Element,
boost::additive< vs<T,D> >,
boost::multiplicative< vs<T,D> , T>,
boost::equality_comparable< vs<T,D> , T>
{ ...
We follow by defining certain details for the space
... typedef Element element; typedef Field_Scalar scalar; typedef Commutative_Associative_Addition addition; typedef ScalarMultiplication scalar_multiplication; ...
The addition and scalar multiplications are of course mean to be function objects that are shared by all the class elements and that are inmutable (constant) from the point of view of the vector space.
... static addition add_; static scalar_multiplication mult_; ...
We also need to be able to generate a vector in the linear space from an element of the set
...
vs(element const& elem) : Element(elem){}
...
Then we follow with
...
friend vs& operator+=(vs& self, vs const& other){return self = add_(self, other ) ;}
friend vs& operator*=(vs& self, scalar const& s){return self = mult_(self, s ) ;}
friend vs& operator-=(vs& self, vs const& other){return self += (-scalar(1) )*other ;}
friend vs& operator/=(vs& self, scalar const& s){return self *= ( scalar(1)/s) ;}
...
Thanks to Boost.Operators the other operators are defined consistently. Following the mathematical definition we have to define (or at least be able to generate) the zero element (it must exists regardless of anything else) and the negative element
...
friend vs zero();
friend vs operator-(vs const& v){vs ret(v); ret*=-scalar(1); return ret;}
friend vs const& operator+(vs const& v){return v; }
};
which finishes for the moment the definition of the class. The definition of the function 'zero()' is left undefined. This is because it is not always needed, second it depends on the details of the linear space we want to define. The complete code looks like this.
Example 1: boost::array as a finite dimensional linear space
By default boost::array doesn't have any arithmetical operator associated to it. So any arbitrary concept of addition in a linear space formed for such elements has to be defined manually. In the context of the class defined above there are three different way to achieve this depending on the syntax we want to achieve. The first two options produce the most compact syntax but requieres us to define classes in the std namespace or the boost namespace
a) Specialize std::plus
using boost::array
namespace std{
template<>
array<double,3> plus<array<double, 3> >::operator()(array<double, 3> const& a1, array<double, 3> const& a2) const{
array<double, 3> ret;
for(unsigned i=0; i!=3; ++i){
ret[i]=a1[i]+a2[i];
}
return ret;
}
}
...
b) Overload + in namespace boost (the namespace of boost::array)
namespace boost{
array<double, 3> operator+(array<double, 3> const& a1, array<double, 3> const& a2){
.. same function body ...
}
In both cases the linear space class can be used without explicitly specifying the operation
array<double,3> a_init={{1,2,3}}; array<double, 3> b_init={{2,3,4}};
linear_space<array<double, 3> > a=a_init;
linear_space<array<double, 3> > b=b_init;
c) Define an independent object-function
class elementwise_add{
array<double,3> operator()(array<double> const& a1, array<double,3> const a2) const{
... same function body ...
}
};
In this last case the addition operation has to be explicit
linear_space<array<double, 3>, elementwise_add> a=a_init; linear_space<array<double, 3>, elementwise_add> b=b_init;
After either a), b) or c) is chosen we can apply the + operation in the sense prescribed for this linear space:
linear_space<array<double, 3>, elementwise_add> c=a+b;
Note that the example the linear space works even though the scalar multiplication was never specified. This is fine, since such operation was not used in the example. (You don't pay for what you don't use.)
To define a multiplication we just need to define a new class and pass it as argument in the linear space.
class elementwise_mult{
array<double,3> operator()(double const& s, array<double, 3> const& a) const{
array<double, 3> ret=a;
for(unsigned i=0; i!=3; ++i){
ret[i]*=s;
}
return ret;
};
...
linear_space<array<double, 3>, elementwise_addition, double, elementwise_multiplication> a=a_init;
At this point it may be beneficial to use typedef
typedef linear_space<array<double, 3>, elementwise_add, double, elementwise_mult> mylinear_space; mylinear_space a, b; mylinear_space c=a+b;
Example 2: Implement an infinite dimensional space with std::vector
Given the famous vector class, we can define an addition concept that accept vectors of any size, provided that we interpret out of bound elements as zeros. std::vector class allow us to return vectors of arbitrary size. NB: std::vector is not a vector space, it is called vector for other reasons, it doesn't carry an addition concept for example.
class size_variable_elementwise_add{
std::vector operator()(std::vector const& v1, std::vector const& v2) const{
std::vector ret(std::max(v1.size(), v2.size()));
for(unsigned i=0; i!=ret.size(); ++i){
if(i>=v2.size()) ret[i]=v1[i];
else if(i>=v1.size()) ret[i]=v2[i];
else ret[i]=v1[i]+v2[i];
}
return ret;
}
};
This particular implementation of addition induces a linear space
linear_space<std::vector, size_variable_elementwise_add>
which dimensionality is infinite in the mathematical sense that the number of independent vectors is not bounded by the program, (well ... it is bounded by the available memory and the runtime.)
Example 3: implement an efficient infinite dimensional linear with std::map
For practical causes when we think of a potentially infinite dimensional vector we probably think of a sparse vector. A sparse vector (many elements are zero) a good choice is to use something different from an array, for example std::map provides a way to defining a sparse option, the reason is that zero elements are omitted.
class elementwise_add_map{
std::map<unsigned, double> operator()(std::map<unsigned, double> const& m1, std::map<unsigned, double> m2) const{
std::map ret;
for(std::map<unsigned, double>::const_iterator im1=m1.begin(); im1!=m1.end(); ++im1){
ret[im1->first]=im1->second;
}
for(std::map<unsigned, double>::const_iterator im2=m2.begin(); im2!=m2.end(); ++im2){
ret[im2->first]+=im2->second; if(ret[im1->first]==0) ret.erase(im2->first);
}
return ret;
}
};
...
linear_space<std::map<unsigned, double> , elementwise_add_map>
Multiplication also can be defined for non-zero elements only.
This is an optimal implementation of a sparse vector, the point of this example is to show that the efficiency of the program is related to the actual implementation of the class and not to the linear_space template class itself.
We can temporarily view any object as an element as a linear space without constructing a new object by casting to a reference type. This allows to perform linear space operations for arbitrary objects.
array<double,3> a ,b, c; typedef linear_space<array<double,3>, element_wise_add, element_wise_mult >& ls_elem; (ls_elem&)c=(ls_elem&)a+(ls_elem&)b;
(section not finished)
Boost.Spirit
One of the more tedious and unfortunately unavoidable task in programming is parsing. Parsing means that given a "free form" text string we have to convert that into data that the program can manipulate, in order to do that we need to "interpret" the text. The parsing includes walking the input text and executing actions each time a certain pattern in found.
Doing this manually is prone to errors and lack of generality. Boost.Spirit is a complex library that allows parsing text quite easily.
The following parser a line that is expected to be comprised by two integers and a word. As this values are read and interpreted they are assigned to target variables. The beauty of it is that the syntax is straight forward and we will know if the parsing failed right away because of the bool returned, in that case the iterator first will contain the point at which parsing failed. This is the simplest use of Boost.Spirit, more complex parse patterns can be defined.
#include <string>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/phoenix_core.hpp>
#include <boost/spirit/include/phoenix_operator.hpp>
namespace qi = boost::spirit::qi;
namespace px = boost::phoenix;
namespace ascii = boost::spirit::ascii;
int main(){
std::string source = " 1 13 Al";
int idx;
int atomic_number;
std::string name;
std::string::iterator first = source.begin(), last = source.end();
qi::phrase_parse(first, last,
( qi::int_ >> qi::int_ >> qi::lexeme[+~qi::char_(' ')] ), ascii::space,
idx, atomic_number, name
);
std::cout << idx << " " << atomic_number << " " << name << std::endl;
return 0;
}
Boost.Proto
V
Boost.ResultOf, generic return types
A great deal of generic programming is related to the ability to deduce the return type of a given C++ expression. For example suppose that we have an object that behaves like a function called 'f' and of type 'F'. When I say that it behaves like function it means that it can be a function pointer, and overloaded function pointer, a member function pointer, a function reference or a object with member operator '()' either monomorphic (only one operator() defined) or polymorphic.
To be specific let us say that I want a generic function that evaluates that entity at a given value.
eval(f, 5.); // to be identical in behaviour to f(5.);
Of course the 'eval' function body is almost trivial
template<class F, class T>
???
eval(F f, T t){
return f(t);
}
But what about the return type ???.
If you are using c++0x just use
decltype(F()(T())
or use the new C++0x function declaration style
auto eval(F t, T t) -> decltype(f(t)){ return f(t);}
and that's it, you can stop reading this section unless you are interested in making the code C++98-compatible.
If you are not using C++0x or want to make your code compatible with C++ then it is where Boost.ResultOf comes to the rescue to achieve something that otherwise is not possible because there is no decltype in C++. The library provides a type given by the expression
boost::result_of<F(T)>::type
that can be used in ???. As described here http://www.boost.org/doc/libs/1_46_1/libs/utility/utility.htm#result_of
But ResultOf is more a standardized protocol than a library. It works provided that you follow a certain rule, that all the function objects that you write have a nested class called result that gives information of the return type of each overload of operator(). It works on function objects and also on free functions (or overloaded free functions).
For example the following class has basically a template result struct for each version of operator(). The example is rather artificial but it the most complicated case that can appear.
struct F{
template<class Signature> struct result; // protocol kick-in
template<class This, class T> struct result<This(T const&)>{typedef T type;};
template<class T>
typename result<F(T const&)>::type operator()(T const& x) const{return x;}
template<class This> struct result<This(double const&)>{typedef double type;};
result<F(double const&)>::type operator()(double const& x) const{return x*x;}
template<class This> struct result<This(int const&)>{typedef std::string type;};
result<F(int const&)>::type operator()(int const& x) const{return '"' + boost::lexical_cast<std::string>(x+x) + '"';}
};
note that the template declaration (Signature) is necessary to specialize the class 'result' for each case and how the redundant 'This' is necessary for the first specialization. The series of template definitions and specializations is the so called "result_of protocol".
The other good thing is that boost::result_of is backward compatible with C++ and C++0x. In fact in C++0x std::result_of (from TR1) is internally implemented in terms of decltype.
In summary, depending on the level of compatibility we have three options:
#include<boost/utility/result_of.hpp> //boost::result_of #include<functional> //std::result_of
template<class F, class T>
option 1: decltype(F()(T())) //works in C++0x only
option 2: typename std::result_of<F(T const&)>::type //works in C++ with TR1 (needs the result_of protocol for F) and in C++0x
option 3: typename boost::result_of<F(T const&)>::type // works in C++ and C++0x (always needs the result_of protocol for F)
eval_at(F f, T const& t){
return f(t);
}
Note that in any case, within C++0x the most general technique is to use 'auto' and 'decltype' on expressions, because it will work on overloaded functions as well.
Based on: http://stackoverflow.com/questions/2689709/difference-between-stdresult-of-and-decltype
Boost Toys
Boost has a few hidden small libraries, some hidden as implementation details in other libraries
Version
Once in a while we use features that are available in one version of Boost but not in previous versions. Different systems have different version of Boost, for example, by default:
- Chaos 4.3/RedHat 5.5: Boost 1.33
- Ubuntu 8.04/10.04/10.10/11.04: Boost 1.34/1.40/1.42/1.42
- Fedora 13/15: Boost 1.41/1.46
Using new features generally in a system with an older version usually results in a compilation error. Unfortunately the error messages are cryptic and this can be very confusing for the user. This can be resolved by a static (compile time) assert that verifies that the system is using a certain version of Boost.
The version information of the Boost distribution is included in
#include<boost/version.hpp>
in a macro defined (for example) as
#define BOOST_VERSION 103301
So, for example to check that one is using Boost version 1.42 *at least* we can add the following code before using such feature.
BOOST_STATIC_ASSERT( BOOST_VERSION / 100000 >= 1 && BOOST_VERSION / 100 % 1000 >= 42); .. code valid only in 1.42 here ..
The expression is verified at compile time and it will give and error message that will be easy to spot. BOOST_VERSION can also be used in #if/#endif blocks.
Such error message will make clear that to use that code we need a newer version of Boost installed. (For example manually, see at the beginning on how to install the current version.)
Progress Bar
One of them is Boost.Progress. It gives a convenient way of printing a progress bar to output. We only need to supply an estimated completion count and increment the counted as progress in the calculation is made:
#include<boost/progress.hpp>
...
boost::progress_display show_progress( max_counter, std::cout, "title of bar:\n" );
for (int count = 0; count < max_counter; ++count){
...
++show_progress;
}
Each time ++show_progress the internal counter is incremented proportionally and a nicely formatted percentage bar will be updated:
0% 10 20 30 40 50 60 70 80 90 100% |----|----|----|----|----|----|----|----|----|----| ****************************
The last two arguments are optional.
If the process requires more counts than the estimate provided (max_counter) it will keep adding stars beyond the 100% mark.
Printing other stuff from inside the loop (e.g. using cout) can ruin the formatting of the bar.
Printing type names (boost::units::detail::demangle)
Obtaining type information at runtime is implemented in C++ via typeid. Unfortunatelly the typename results extraction can give cryptic results, probably associated with internal strings used to represent a type. To transform a name into an human redeable string we need a name demangler. The function described here returns the undecorated (no reference or const qualifier) of a type.
For example, simply calling typeid(variable).name() can give something like
N5boost5units8quantityINS0_4unitINS0_4listINS0_3dimINS0_19time_base_dimensionENS0_15static_rationalILln1ELl1EEEEENS0_18dimensionless_typeEEENS0_18homogeneous_systemINS3_INS0_6atomic23electron_mass_base_unitENS3_INSC_27elementary_charge_base_unitENS3_INSC_33reduced_planck_constant_base_unitENS3_INSC_32coulomb_force_constant_base_unitENS3_INSC_27boltzman_constant_base_unitES9_EEEEEEEEEEEEvEEdEE
which has straneous symbols contained in it. Instead
#include<boost/units/detail/utility.hpp> //not necessary if boost.units is already included using boost::units::detail::demangle; ... cout << demangle(typeid(var).name()) << endl;
gives
quantity<unit<list<dim<time_base_dimension, static_rational<-1l, 1l> >, dimensionless_type>, homogeneous_system<list<atomic::electron_mass_base_unit, list<atomic::elementary_charge_base_unit, list<atomic::reduced_planck_constant_base_unit, list<atomic::coulomb_force_constant_base_unit, list<atomic::boltzman_constant_base_unit, dimensionless_type> > > > > >, void>, double>
this particular function is specifically written to simplify the names of boost.Units quantities (by removing the string 'boost::units::') but can be used for any type.
Boost.ProgramOptions
This is not a small library but its description is small and covers also a small but frequent feature of every pogram, the command line arguments. We start with a small program, and quickly we realize that we need to pass options at the command-line. In C (and C++) command line options are received by main program via the argc and argv parameters
int main(int argc, char* argv[])
the first argument is the number of parameters (words) and argv is a c-array of c-string passed (included the name of the executable itself). Typically we would implement some sort of parser that reads the strings one by one, reading names of variables, etc. This process is tedious and prone to errors. Even after much effort the solution will be short of features. For example suppose that we want to have an option file instead of command line options, we will need to rewrite the parser for the file, etc. An after we finish we will need to implement a help message describing all the options available and maintain it.
Boost.ProgramOptions solves this problem elegantly. It parses the options from a source (command line, config file or environment variables) and stores it in local variables from where the values can be interpreted more easily.
The simplest example is one in which the main program needs to know the value of several input variables, some of the with default values.
// program.cpp
int main(int argc, char* argv[]){
double kT=-1;
double EF=-1;
{
using namespace boost::program_options;
options_description desc("options");
desc.add_options()
("temp,T" , boost::program_options::value<double>(&kT), "set electronic temperature")
("fermi,EF", boost::program_options::value<double>(&EF)->default_value(10.), "set fermi energy");
variables_map vm;
try{
store(parse_command_line(argc, argv, desc),vm);
}catch(...){clog<< desc <<endl; throw;}
notify(vm); //don't forget this line
}
cout<<"EF is "<<EF<<endl;
cout<<"kT is "<<kT<<endl;
...
The program can be called as
./program --temp=300. --fermi=10. #or as ./program -T 300. -EF 10.
Note that this follows the [GNU Argument Syntax] for long and short options.
If we pass options that are not listed (or have a typo) or the parameter has an incorrect value type (string instead of number) the program will give an error:
what(): unknown option dada what(): in option 'fermi': invalid option value
Also the "help" message is printed by the line
clog << desc << end;
In this case:
options: -T [ --temp ] arg set electronic temperature -E [ --fermi ] arg (=10) set fermi energy
This message is constructed automatically from the description object. Option names, default values, assignment to local variables and options description (help message) are all in one place.
This example covers most of the simplest cases. For positional arguments (arguments with no names intepreted by position in command line), other ways to read the information on the variables_map, reading options from config files and non-convetional syntax, see the manual.
Note: the executable needs to be linked to the boost_program_options (c++ a.cpp -lboost_program_options).
Here it is a more complicated example in which we add positional (as opposed to named arguments),
#include<boost/program_options.hpp>
using namespace boost::program_options;
variables_map interpret_command_line(int argc, char* argv[]){
using namespace boost::program_options;
options_description desc("Usage " + std::string(argv[0]) + " [OPTION]... FILE...\nRuns several projectile velocities.");
desc.add_options()
("help,H", "produce this help")
("input-file", "input file, also positional parameter")
;
positional_options_description p;
p.add("input-file", -1);
variables_map vm;
store(command_line_parser(argc, argv).options(desc).positional(p).run(), vm);
notify(vm);
if (vm.count("help")){
std::cout << desc << std::endl;
exit(-1);
}
if (not vm.count("input-file")){
std::cout << argv[0] << ": missing operand\n Try `" << argv[0] << " --help' for more information." << std::endl;
exit(-1);
}
return vm;
}
int main(int argc, char* argv[]){
boost::program_options::variables_map vm = interpret_command_line(argc, argv);
std::cout << vm["input-file"].as<std::string>() << std::endl;
// do something with vm["input-file"].as<std::string>()
return 0;
}
Boost.RegEx (regular expressions)
Boost.RegEx is a another major complex library for string manipulation by regular expressions. In this context can be useful sometimes to process outputs from other programs. RegEx is also part of TR1 the next C++ standard library.
Regular expressions are ways to define complicated text patterns. There are several regular expression definitions (syntaxes), Perl regular expressions is one of them. Regular expression can be used to validate string input, extract substrings and do string replacement.
In the following example, the numbers contained in the input string are isolated and enclosed in curly brackets.
#include <iostream>
#include <string>
#include "boost/regex.hpp"
int main() {
boost::regex reg("[+,-]*\\d{1,}", boost::regex::icase|boost::regex::perl);
std::string s = "a_0^-1 b^2 c^-32 d^+44";
s = boost::regex_replace(s,reg,"{$0}");
std::cout << s << std::endl;
}
will print "a_0^{-1} b^{2} c^{-32} d^{+44}". (This is a simple way to convert unit outputs (see Boost.Units) to something that understandable for LaTeX code.)
There is not much else to say about Boost.RegEx except that it can save you lots of coding for text processing and that you actually have to learn about regular expressions before using Boost.RegEx. Just some notes of caution:
- a regular expression involving the character
\has to by typed as\\in the code. - not all strings are valid regular expressions, if the string is not a valid regular expression the program will abort the execution (it won't give an error at compile time). If you want compile-time (static) regular expressions you have to use Boost.Xpressive.
- Make sure you specify the right regular expression syntax (in this example
boost::regex::perl) of your regular expression. - You have to link the executable to libboost_regex (
c++ -lboost_regex ...). Make sure (with command ldd) that you are linking to the binary library (libboost_regex) consistent with header file (regex.hpp). If it not correctly linked the library can return the wrong result without warning. - Some characters used in regular expressions (eg. \) are reserved characters in C, therefore they must be preceded by another '\'.
Other sources:
- http://onlamp.com/pub/a/onlamp/2006/04/06/boostregex.html
- http://www.boost.org/doc/libs/1_43_0/libs/regex/doc/html/index.html
- http://regexpal.com/ (Regular expression tester, good for learning regular expressions)
- http://gskinner.com/RegExr/ (Another tester with replace options)
- http://en.highscore.de/cpp/boost/stringhandling.html#stringhandling_regex
- http://www.johndcook.com/cpp_regex.html (actually for the new standard --i.e. not boost-- version of regex, but based on the Boost's one)
boost::to_string
Boost provides a template function called lexical_cast that can potentially convert any type into any other type as long as the source type is printable to stream and the destination type is readable from a stream. Most of the time this function is applied to obtain the string representation of a number:
std::string s = boost::lexical_cast<std::string>(5.); // s == "5."
which evidences some syntactic redundancy is long to write. When the destination is a string there is a function called to_string
#include<boost/exception/to_string.hpp> ... std::string s = boost::to_string(5.);
This is particularly useful to construct error messages or exceptions descriptions.
using boost::to_string;
...
throw std::domain_error("value x = " + to_string(x) + " is outside allowed domain [0,1)");
Anther alternative is to use
#include<boost/units/io.hpp> ... std::string s = boost::units::to_string(5.);
(not)Boost.Process
This library is not included officially in Boost, hopefully it will be accepted in the future. It can be added to Boost very easily by
The library is obtained here http://www.highscore.de/boost/gsoc2010/process.zip with documentation at http://www.highscore.de/boost/gsoc2010/. Also available as svn:
mkdir -p ~/soft/boost_process.svn cd ~/soft/boost_process.svn svn co http://svn.boost.org/svn/boost/sandbox/SOC/2010/process/boost cd boost mkdir -p ~/usr/include/boost/ cp -r * ~/usr/include/boost/
(No compilation of the library is necessary but ~/usr/include must be in the include-path --e.g. export CPATH=$CPATH:/home/correaa/usr/include.--)
There is an older version of the library that can create confusion, check that you are using at least he GSOC version. To remove use
rm -r ~/usr/include/boost/process*
Boost.Phoenix3
This library is now officially in Boost. (Boost includes also Phoenix2 as part of Boost.Spirit.)
Its goal is to be able to define expression templates that resemble (as much as possible) normal C++ expressions. The documentation is available here.
Coexisting with Spirit/Phoenix2
Spirit uses Phoenix internally and, by default, it uses Phoenix2. Phoenix2 and Phoenix3 can not be uses simultaneously therefore if you are using Spirit at all you have to tell Spirit too use Phoenix3. This is achieved by defining this before including Spirit.
#define BOOST_SPIRIT_USE_PHOENIX_V3 #define BOOST_PROTO_MAX_ARITY 11 #define BOOST_PROTO_MAX_LOGICAL_ARITY 11 #define BOOST_MPL_LIMIT_METAFUNCTION_ARITY 12 #include<boost/spirit.hpp>
For Boost < 1.46
For old versions of Boost, it can be added to Boost by
cd ~/usr/include/boost # assuming that the rest of Boost is in ~/usr/include/boost svn co https://svn.boost.org/svn/boost/sandbox/SOC/2010/phoenix3/boost/phoenix
(No compilation of the library is required.) Although there had been some changes in the library since then.
(not)Boost.Geometry
[Boost.Geometry http://geometrylibrary.geodan.nl/index.html] is a library that is accepted to Boost but it not yet shipped with the Boost distribution (but available from trunk). It is a library designed to manipulate geometric objects, vectors, polygons, etc. independently of it internal representation, coordinate system or dimension.
For the moment it can be installed together with Boost by doing:
mkdir -p ~/usr/include/boost cd ~/usr/include/boost svn co http://svn.boost.org/svn/boost/trunk/boost/geometry svn co http://svn.boost.org/svn/boost/trunk/boost/range
(this is not necessary if you get Boost from the svn trunk).
P-Stade Oven
The library P-Stade is not part of Boost but it follows its design and it deserves honorable mention. Technically it is a "range" library that operates on ranges of elements in a container. It is a header only library and therefore the installation is very easy. Just download and copy the header files from http://sourceforge.net/projects/p-stade/ your local directory (e.g. $HOME/usr/include)
cd ~/usr/include wget http://downloads.sourceforge.net/project/p-stade/pstade/1.04.3/pstade_1_04_3.zip unzip pstade_1_04_3.zip rm pstade_1_04_3.zip
all the files will be installed in ~/usr/include/pstade. This library needs Boost to be installed already.
The official manual is located here while a nice set of examples can be found in the entries of this blog. Another source of examples is here.
With pstade/oven/initial_values.hpp every sequence (except c-arrays) becomes directly intializable with a list of "initial values". This is particularly useful to initialize const containers (this is similar to Boost.Assign). Once we have a collection of elements (in some container) we can apply functions to this object. The trivial function is pstade/oven/identities.hpp, it does nothing to the elements. This function is useful anyway because internally it changes the representation of the collection to something that can be printed independently of the source. In the following example I use identities in order to print the original sequence. Other functions used are reversed, sorted, filtered and partitioned.
The function names are self-explanatory. The only one that need explanation is partitioned wich returns a std::pair with two subsets depending in some boolean function. Note the intensive use of the pipe operator |, the functions can be applied in any sequence.
#include <iostream>
#include <pstade/oven/initial_values.hpp>
#include <pstade/oven/identities.hpp>
#include <pstade/oven/reversed.hpp>
#include <pstade/oven/sorted.hpp>
#include <pstade/oven/filtered.hpp>
#include <pstade/oven/partitioned.hpp>
#include <pstade/oven/io.hpp>
#include <pstade/oven/equals.hpp>
using namespace pstade::oven;
//a free function
bool is_even(int const& x) { return x % 2 == 0; }
int main(){
using std::cout; using std::endl;
//works with std::vector
const std::vector<int> v = initial_values(3, 1, 4, 5, 2);
cout << (v | identities) << endl; //prints {3,1,4,5,2}
//works with boost::array
const boost::array<double, 3> a = initial_values(1, 3, 5);
cout<< (a | reversed ) <<endl; //prints {5,3,1}
cout<< (a | identities) <<endl; //prints the original {1,3,5}
//works with a c-array
const double arr[3] = {22,33,11};
cout<< (arr | sorted) <<std::endl; //prints {11,22,33}
cout<< (v | filtered(&is_even) ) <<endl; //prints {4,2}
cout<< (v | partitioned(&is_even)).first <<endl; //prints {4,2}
cout<< (v | partitioned(&is_even)).second <<endl; //prints {3,1,5}
(v | back) = 2; //assing a value to the last element
//(v | back_value) = 3; //doesn't work because back_value returns a copy (and not the original element)
cout<< (equals(v, initial_values(1,2,3)) <<endl;
return 0;
}
Other Resources
- Online book on selected Boost libraries http://en.highscore.de/cpp/boost/index.html
- Reference on C++ and the Standard Template Library http://www.cppreference.com/wiki/start
- The classic SGI Standard Template Library Programmer's Guide http://www.sgi.com/tech/stl/
- Boost Users group list http://groups.google.com/group/boostusers?hl=en