Ex1 - An Introduction to Parallel Programming
Ex1 - An Introduction to Parallel Programming
1.1
Q: Devise formulas for the functions that calculate my_first_i and my_last_i in the global sum example. Remember that each core should be assigned roughly the same number of elements of computations in the loop. Hint: First consider the case when n is evenly divisible by p.
A:
$id = get_thread_num()$
$base = (n + p - 1) / p = \lceil \frac{n}{p} \rceil$
$my_fisrt_i = base * id$
$my_last_i = min{base * (id+1),\ n}$
In this way, the number of elements of computations assigned on the last kernel may be a little bit smaller than other core (no more than p). Since p << n, this difference can be ignored.
1.2
Q: We’ve implicitly assumed that each call to Compute_next_value requires roughly the same amount of work as the other calls. How would you change your answer to the preceding question if call i = k requires k + 1 times as much work as the call with i = 0? So if the first call (i = 0) requires 2 milliseconds, the second call (i = 1) requires 4, the third (i = 2) requires 6, and so on.
A:
Assume that the function call cost 1 time when i=0.
We can divide n calls into n/2 pairs:
${1, n}$, ${2, n-1}$, ${3, n-2}$, … ${n/2+1 (when\ n\ is\ odd)}$ or ${n/2, n/2+1 (when\ n\ is
\ even)}$
The cost in any pair is $n+1$.
The number of elements of calculations is the same for each pair (except when $n$ is odd, $n/2+1$ cannot be paired).
So we can assign tasks like we did in the first question for each core. Each time the $i$-th call is evaluated, the $(n-i)$-th call should also be evaluated.
1.3
Reference
https://github.com/rootusercop/An-Introduction-to-Parallel-Programming