CHAPTER 5: Integration using the Trapezoidal Rule

5.1 Estimating the Area Under a Curve

As our next example, let’s look at the problem of estimating the area under a curve. If you have taken a calculus course, you may recognize this problem as the Riemann sum or Trapezoidal Rule, which approximates the area under the curve (i.e. the integral) by splitting the area under the curve into a series of trapezoids.

In the following programs, we attempt to use the trapezoid rule to approximate the integral

\[\int_0^{\pi} sin(x)_{dx}\]

using \(2^{20}\) equal subdivisions. The answer from this computation should be 2.0. The following video shows how a single thread would solve this problem:

In this example, the single thread serially computes the area of each trapezoid and adds all the trapezoids together to one value.

A C implementation of this program may look like the following:

 1#include <stdio.h>
 2#include <math.h>
 3#include <stdlib.h>
 4
 5//function we are calculating the area under
 6double f(double x) {
 7    return sin(x);
 8}
 9
10//estimation of pi
11const double pi = 3.141592653589793238462643383079;
12
13int main(int argc, char** argv) {
14    //Variables
15    double a = 0.0, b = pi;         //limits of integration
16    int n = 1048576;                //number of subdivisions = 2^20
17    double h = (b - a) / n;         //width of each subdivision
18    double integral;                // accumulates answer
19
20    integral = (f(a) + f(b))/2.0;  //initial value
21
22    //sum up all the trapezoids
23    int i;
24    for(i = 1; i < n; i++) {
25        integral += f(a+i*h);
26    }
27
28    integral = integral * h;
29    printf("With %d trapezoids, our estimate of the integral from \n", n);
30    printf("%f to %f is %f\n", a,b,integral);
31
32    return 0;
33}

Let’s now consider how we can use multiple threads to approximate the area under the curve in parallel. One strategy would be to assign each thread a subset of the total set of subdivisions, so that each thread separately computes its assigned set of trapezoids.

The following video illustrates how 4 threads would work together to approximate the area under the curve:

5.2 Parallel Integration - First Attempt

Let’s visit the integration directory by using the following command (note that this directory is on the same level as the patternlets directory) in the reduction directory:

cd ../../integration

The following program attempts to update the serial version of the integration program with OpenMP pragmas:

 1#include <math.h>
 2#include <stdio.h> // printf()
 3#include <stdlib.h> // atoi()
 4#include <omp.h> // OpenMP
 5
 6/* Demo program for OpenMP: 
 7computes trapezoidal approximation to an integral*/
 8
 9const double pi = 3.141592653589793238462643383079;
10
11int main(int argc, char** argv) {
12    /* Variables */
13    double a = 0.0, b = pi; /* limits of integration */;
14    int n = 1048576; /* number of subdivisions = 2^20 */
15    double h = (b - a) / n; /* width of subdivision */
16    double integral; /* accumulates answer */
17    int threadcnt = 1;
18
19    double f(double x);
20
21    /* parse command-line arg for number of threads */
22    if (argc > 1) {
23        threadcnt = atoi(argv[1]);
24    }
25
26    #ifdef _OPENMP
27        omp_set_num_threads( threadcnt );
28        printf("OMP defined, threadct = %d\n", threadcnt);
29    #else
30        printf("OMP not defined");
31    #endif
32
33    integral = (f(a) + f(b))/2.0;
34    int i;
35
36    #pragma omp parallel for \
37    private(i) shared (a, n, h, integral)
38    for(i = 1; i < n; i++) {
39        integral += f(a+i*h);
40    }
41
42    integral = integral * h;
43    printf("With %d trapezoids, our estimate of the integral from \n", n);
44    printf("%f to %f is %f\n", a,b,integral);
45}
46
47double f(double x) {
48    return sin(x);
49}

Lines 36-37 contain the omp parallel for pragma that parallelizes the loop that is computing the integral. This example is different from others you have seen so far, in that the variables are explicitly declared to be either private (each thread gets its own copy) or shared (all threads use the same copy of the variable in memory).

If our parallel program is implemented correctly, the program should estimate the area under the curve as 2.00, which would be identical to the output of the serial program.

First, run this program serially on 1 thread by typing

make
./trap-notworking 1

Next run this code in parallel using 4 threads by typing:

make
./trap-notworking 4

As with other examples you have done so far, the command-line argument of 4 above indicates use four threads.

5.3 Parallel Integration - Second Attempt

We will use the previously discussed reduction clause to fix the issue. Here is the improved version, called trap-omp-working.c:

 1#include <math.h>
 2#include <stdio.h> // printf()
 3#include <stdlib.h> // atoi()
 4#include <omp.h> // OpenMP
 5
 6
 7/* Demo program for OpenMP:
 8computes trapezoidal approximation to an integral*/
 9
10const double pi = 3.141592653589793238462643383079;
11
12int main(int argc, char** argv) {
13    /* Variables */
14    double a = 0.0, b = pi; /* limits of integration */;
15    int n = 1048576; /* number of subdivisions = 2^20 */
16    double h = (b - a) / n; /* width of subdivision */
17    double integral; /* accumulates answer */
18    int threadcnt = 1;
19
20    double f(double x);
21
22    /* parse command-line arg for number of threads */
23    if (argc > 1) {
24        threadcnt = atoi(argv[1]);
25    }
26
27#ifdef _OPENMP
28    omp_set_num_threads( threadcnt );
29    printf("OMP defined, threadct = %d\n", threadcnt);
30#else
31    printf("OMP not defined");
32#endif
33
34integral = (f(a) + f(b))/2.0;
35int i;
36
37#pragma omp parallel \
38for private(i) shared (a, n, h) reduction(+: integral)
39    for(i = 1; i < n; i++) {
40        integral += f(a+i*h);
41    }
42
43    integral = integral * h;
44    printf("With %d trapezoids, our estimate of the integral from \n", n);
45    printf("%f to %f is %f\n", a,b,integral);
46}
47
48double f(double x) {
49    return sin(x);
50}

As with the previous reduction patternlet example, we again have an accumulator variable in the loop, this time called integral. Thus our reduction clause reads reduction(+:integral).

The above program was compiled with our first use of the make command. Run the above program using the following command:

./trap-working 4

Note

Because we did not explicitly add an additional static or dynamic clause to the pragma on lines 37-38 on the working version above, the default behavior of assigning equal amounts of work doing consecutive iterations of the loop (i.e. static scheduling) was used to decompose the problem onto threads. In this case, since the number of trapezoids used was 1048576, then with 4 threads, thread 0 will do the first 1048576/4 trapezoids, thread 1 the next 1048576/4 trapezoids, and so on.

You have attempted of activities on this page