Visualising N-Body Simulation in OpenCV: Multicore Processing with OpenMP

The good side of OpenMP is its inherent simplicity. It makes writing parallel code remarkably easy.





This post is an update to a previous post: Building OpenCV with OpenMP. This one is all about analysing the performance of OpenCV with and without OpenMP.

The code used is the popular N-body simulation, which models gravitational effects on large particle systems. I have used the source code available on Mark Harris's GitHub repo, with a few modifications.

The modifications are as follows:
  • A 2-D coordinate system in place of the 3-D system, for visualisation in OpenCV.
  • Integer precision for coordinate calculation, for visualisation.
  • An integer time step dt, which determines the speed of the simulation.
  • Circles drawn with radius = 0, giving particles the size of a single pixel.
  • Dynamic window size declaration and adaptation.
  • A rewritten random coordinate generation method for the 2-D coordinate system.
The modified source code is available on my GitHub profile/visualise-nbody-opencv
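To give a feel for why OpenMP fits this problem so well, here is a minimal sketch of a 2-D N-body step. It is not the code from the repo: the Body struct, the SOFTENING constant, the function names and the use of float (rather than the integer coordinates mentioned above) are all assumptions made for illustration. The point is that each iteration of the outer loop only writes to its own body, so a single pragma is enough to spread the work across cores.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical 2-D body with unit mass: position and velocity only.
struct Body { float x, y, vx, vy; };

// Assumed softening term; keeps the distance from being exactly zero.
const float SOFTENING = 1e-9f;

// Accumulate the pairwise pull on every body and update its velocity.
// Iteration i reads all positions but writes only p[i].vx and p[i].vy,
// so the outer loop can safely run in parallel.
void bodyForce(std::vector<Body>& p, float dt) {
    assert(dt > 0.0f);  // this sketch assumes a forward time step
    const int n = static_cast<int>(p.size());
    #pragma omp parallel for schedule(dynamic)
    for (int i = 0; i < n; i++) {
        float fx = 0.0f, fy = 0.0f;
        for (int j = 0; j < n; j++) {
            float dx = p[j].x - p[i].x;
            float dy = p[j].y - p[i].y;
            float distSqr = dx * dx + dy * dy + SOFTENING;
            float invDist = 1.0f / std::sqrt(distSqr);
            float invDist3 = invDist * invDist * invDist;
            fx += dx * invDist3;
            fy += dy * invDist3;
        }
        p[i].vx += dt * fx;
        p[i].vy += dt * fy;
    }
}

// Move every body along its velocity for one time step.
void integrate(std::vector<Body>& p, float dt) {
    for (auto& b : p) {
        b.x += b.vx * dt;
        b.y += b.vy * dt;
    }
}
```

One simulation iteration would call bodyForce(p, dt) followed by integrate(p, dt). When compiling with GCC, the pragma only takes effect with the -fopenmp flag; without it the pragma is ignored and the loop runs serially.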

Benchmarking: 

Approach #1: Total Execution Time
The total execution time for computing a fixed number of iterations of the particle system is a direct indicator of performance for the N-body simulation. In general terms, if the iterations take time T on a single core, then the same iterations should theoretically take time T/4 on a quad-core CPU. This might not always be the case in practice, but it is a good parameter to evaluate.
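If you would rather measure this from inside the program than from the shell, a small helper along these lines could do it. This is only a sketch: timeIterations and idealTime are hypothetical names, and the step callback stands in for whatever one simulation iteration looks like in your code.

```cpp
#include <cassert>
#include <chrono>
#include <functional>

// Run `step` `iterations` times and return the elapsed wall-clock seconds.
double timeIterations(const std::function<void()>& step, int iterations) {
    assert(iterations > 0);
    auto start = std::chrono::steady_clock::now();
    for (int i = 0; i < iterations; i++) {
        step();
    }
    auto end = std::chrono::steady_clock::now();
    return std::chrono::duration<double>(end - start).count();
}

// Theoretical best case: single-core time T divided evenly across the cores.
double idealTime(double singleCoreSeconds, int cores) {
    assert(cores > 0);
    return singleCoreSeconds / cores;
}
```

For example, idealTime(100.0, 4) gives 25.0 seconds, which is the T/4 figure from the paragraph above. Real runs usually land above that bound because of serial sections and scheduling overhead.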

Approach #2: CPU Resource Monitor
Most operating systems ship with a resource monitor that plots CPU utilisation over time. It is an effective tool for visually examining per-core CPU usage.

A combination of approach #1 and #2 is used to examine OpenCV with and without OpenMP parallelization.

Have a look at what my N-body simulation looks like for N = 2500 particles:



The following benchmark has been evaluated for 1000 iterations of 2500 particles.

EVALUATION:
  • Without OpenMP Parallelization
Time:
time is a command on Unix-like operating systems. It reports how long a given command takes to execute.

  time ./nbody
  real        2m7.258s
  user        1m37.096s
  sys         0m0.500s  

CPU Core Usage:

The calculations are done on a single core, with occasional core switching.
The CPU usage graph shows that at any given time only a single core is being used for the calculation. The OS also occasionally switches which core is used.
  • With OpenMP Parallelization
Time:

  time ./nbody
  real        1m23.373s
  user        3m31.596s
  sys         0m0.696s

CPU Core Usage:



The CPU usage graph with OpenMP enabled shows clearly that the code runs in parallel on multiple cores. The time output says the same: the real time (wall-clock time) drops from 2m7.258s to 1m23.373s by running the computations on 4 cores of the CPU, while the user time rises from 1m37.096s to 3m31.596s because it adds up the CPU time spent on every core. For details on how to interpret the time output, refer to this answer on Stack Overflow.
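To put numbers on the two time outputs, the sketch below computes the wall-clock speedup and the average number of busy cores from the figures reported above. The helper names (toSeconds, speedup, averageCoresBusy) are made up for this example.

```cpp
#include <cassert>

// Convert a "XmY.YYYs" reading from time into seconds.
double toSeconds(int minutes, double seconds) {
    return minutes * 60.0 + seconds;
}

// Wall-clock speedup: serial real time divided by parallel real time.
double speedup(double serialReal, double parallelReal) {
    assert(parallelReal > 0.0);
    return serialReal / parallelReal;
}

// (user + sys) / real approximates how many cores were busy on average.
double averageCoresBusy(double user, double sys, double real) {
    assert(real > 0.0);
    return (user + sys) / real;
}
```

With the readings above, speedup(toSeconds(2, 7.258), toSeconds(1, 23.373)) comes out at roughly 1.5, and averageCoresBusy(toSeconds(3, 31.596), toSeconds(0, 0.696), toSeconds(1, 23.373)) at roughly 2.5. So the OpenMP build is clearly faster, though still well short of the theoretical T/4.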

OpenMP is therefore an easy-to-use framework for cases where code needs to be distributed across multiple cores. Since its inception, it has matured considerably and has been adopted by developers looking to leverage improved hardware capabilities.
To know more about OpenMP, visit their official website.
Kudos.

Push Yourself Again and Again. Don't give an inch until the final buzzer sounds.

Stats:
Ubuntu 17.04
Intel(R) Core(TM) i5-4200U CPU @ 1.60GHz
8 GB DDR3 RAM
Code::Blocks 16.01
GCC 6.3.0