8.2.4 : Performance
Figure 16 : Left panel : total average time of the reduction function. Right panel : average time to compute a single element.
OK, let's be realistic : there is a problem with these performance results.
- Performance with -O0 : slow but reasonable
- Performance with the other levels (-O1, -O2, -O3, -Ofast) : too fast (nonsense)
GCC is smart or guileful, depending on your point of view.
- GCC noticed that you do not use the result of the reduction function.
- The call to reduction is therefore considered dead code (code with no effect on the program) and is removed.
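The situation described above can be sketched as follows. This is only a minimal illustration, assuming the reduction is a plain sum over a float array; the names `reduction` and `time_unused_reduction` and their signatures are hypothetical and not taken from the original benchmark.

```c
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Hypothetical reduction (assumption): sums the n floats of data. */
float reduction(const float *data, size_t n) {
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        sum += data[i];
    }
    return sum;
}

/* Hypothetical timing loop (assumption): calls reduction nb_repeat times
 * and throws the result away. Because the body of reduction is visible
 * here and has no side effect, an optimizing compiler is allowed to
 * delete these calls, so the measured time may drop to almost nothing.
 * Returns the elapsed CPU time in seconds. */
double time_unused_reduction(const float *data, size_t n, size_t nb_repeat) {
    assert(nb_repeat > 0);
    clock_t begin = clock();
    for (size_t r = 0; r < nb_repeat; ++r) {
        reduction(data, n);  /* result ignored : candidate for removal */
    }
    clock_t end = clock();
    return (double)(end - begin) / CLOCKS_PER_SEC;
}
```

With -O0 the loop really runs; at higher levels the compiler may legitimately leave nothing to time, which would match the "too fast" curves.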
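To avoid that, you have to compile the reduction function in another file (a separate translation unit) : GCC then cannot see the body of the function when it compiles the caller, so it has to keep the call (as long as link-time optimization is not enabled).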
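A possible layout for this separation is sketched below, shown as one listing to be split at the marked comments. The file names `reduction.c` and `benchmark.c`, the function `time_reduction` and the sum-based `reduction` are assumptions made for the illustration, not the original sources.

```c
/* ---- reduction.c (hypothetical name) : gcc -O3 -c reduction.c ---- */
#include <stddef.h>

/* Hypothetical reduction (assumption): sums the n floats of data. */
float reduction(const float *data, size_t n) {
    float sum = 0.0f;
    for (size_t i = 0; i < n; ++i) {
        sum += data[i];
    }
    return sum;
}

/* ---- benchmark.c (hypothetical name) : gcc -O3 -c benchmark.c ---- */
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Declaration only : once the files are split, the compiler does not
 * see the body of reduction here and cannot prove the call is useless. */
float reduction(const float *data, size_t n);

/* Returns the average CPU time in seconds of one call to reduction. */
double time_reduction(const float *data, size_t n, size_t nb_repeat) {
    assert(nb_repeat > 0);
    clock_t begin = clock();
    for (size_t r = 0; r < nb_repeat; ++r) {
        reduction(data, n);  /* kept : the callee is opaque from here */
    }
    clock_t end = clock();
    return ((double)(end - begin) / CLOCKS_PER_SEC) / (double)nb_repeat;
}
```

Each file is compiled to its own object file and the two are linked afterwards, together with a `main` that calls `time_reduction`. Note that building with `-flto` would let GCC look across the two files again, so this sketch assumes it is left off.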