Chapter VI.1. Algebraic expressions

Table of Contents

VI.1.1. Basic expressions
VI.1.2. Merge and assign

Kicksey-Winsey uses the Portable Expression Template Engine (PETE), to handle algebraic expressions. PETE is a small library that was originally developed at Los Alamos as part of the much bigger Pooma project, another generic and distributed array library. The website credits Steve Karmesin and Scott Haney as the main developers.

For most purposes, you should not need to worry about PETE, but if you want to implement new operators you do have to specialize a few classes. You can find some guidelines below.

VI.1.1. Basic expressions

We denote by f1, f2, f3 three instances of coefficents<Engine,Basis>. Their engines are assumed to be compatible. Here are some examples of valid expressions :

	f1 *= f2 + f3;
	f1 += 2*f2 - f3;
	f1 -= f3*2 / f2;
	f1 /= f2 * f3/2;
	

What happens when such expressions are executed ? First, PETE creates a class template based on the right hand side, storing a raw pointer to the data of f2 and f3, and a reference to the scalar constants. Then it calls a special function called evaluate, and passes it both the left hand side and the pre-processed right hand side. This function then loops over the elements of f1, and in each iteration, dereferences the pointers to f2's and f3's data to compute the required value.

A part from the syntaxical convenience and the good readability, this technique has the advantage of factorizing the "loops" involved in all algebraic expressions in a single function. In this way they are all OpenMP parallelized seemlessly with a single #pragma.

Obviously, if the sequence of events described above would actually occur during the code's execution, performances would be dreadful compared to a good old loop over C-style arrays. The technique relies heavily on the ability of the compiler to optimize out most function calls. Depending on your compiler, this may require some parameter tweaking.

Note that contrary to Blitz library, no effort has been done yet to optimize memory access order.

VI.1.2. Merge and assign

Sometimes some expressions have a circular dependency that would require the use of a temporary array to avoid overwriting:

	tmp = f1; //1
	f1 = f2+f1; //2
	f2 = tmp+f3; //3

But if one were using loops, there would be no need to store a whole temporary array:

	for(int i = 0; i != N; ++i)
	{
		double tmp = f1[i];
		f1[i] = f2[i]+f1[i];
		f2[i] = tmp+f3[i];
	}

To achieve the same effect with expression templates, expressions (2) and (3) should be merged and executed within a single loop. This is exactly what the "merge_and_assign" template function does. Thus the following call is equivalent to (1), (2), (3) above:

	merge_and_assign(
		f1, f2+f1,
		f2, f1+f3
	);