Here I will consider firstprivate and lastprivate. Recall one of the earlier entries about private variables. When a variable is declared as private, each thread gets its own copy of it, stored at its own memory address, for use inside the parallel region. When the parallel region ends, that memory is freed and the private copies no longer exist. Consider the following bit of code as an example:
#include <stdio.h>
#include <stdlib.h>
#include <omp.h>

int main(void){
    int i;
    int x;
    x=44;
    #pragma omp parallel for private(x)
    for(i=0;i<=10;i++){
        x=i;
        printf("Thread number: %d x: %d\n",omp_get_thread_num(),x);
    }
    printf("x is %d\n", x);
}
Yields…
Thread number: 0 x: 0
Thread number: 0 x: 1
Thread number: 0 x: 2
Thread number: 3 x: 9
Thread number: 3 x: 10
Thread number: 2 x: 6
Thread number: 2 x: 7
Thread number: 2 x: 8
Thread number: 1 x: 3
Thread number: 1 x: 4
Thread number: 1 x: 5
x is 44
You’ll notice that x is exactly the value it was before the parallel region.
Suppose we wanted to keep the last value of x after the parallel region. This can be achieved with lastprivate. Replace private(x) with lastprivate(x) and this is the result:
Thread number: 3 x: 9
Thread number: 3 x: 10
Thread number: 1 x: 3
Thread number: 1 x: 4
Thread number: 1 x: 5
Thread number: 0 x: 0
Thread number: 0 x: 1
Thread number: 0 x: 2
Thread number: 2 x: 6
Thread number: 2 x: 7
Thread number: 2 x: 8
x is 10
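Notice that x is 10 and not 8, even though the line for x: 8 was printed last. That is to say, lastprivate keeps the value from the sequentially last iteration (i=10), not the value from whichever assignment happened to execute last. Now what if we replace lastprivate(x) with firstprivate(x)? What do you think it will do? This: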
Thread number: 3 x: 9
Thread number: 3 x: 10
Thread number: 1 x: 3
Thread number: 1 x: 4
Thread number: 1 x: 5
Thread number: 0 x: 0
Thread number: 0 x: 1
Thread number: 0 x: 2
Thread number: 2 x: 6
Thread number: 2 x: 7
Thread number: 2 x: 8
x is 44
If you were like me, you were expecting x to be 0, i.e. the value of x on the first iteration. No. Here is the definition:
firstprivate: Specifies that each thread should have its own instance of a variable, and that the variable should be initialized with the value of the variable, because it exists before the parallel construct.
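That is, every thread gets its own instance of x, and that instance starts out equal to 44. In the example each thread immediately overwrites its copy with x=i, so the initial value never shows up in the output. Nothing is copied back when the region ends, which is why x is still 44 afterwards.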