C# Parallel Programming .NET 4.0 - learn by examples
August 21, 2013

Have you noticed that the hardware world is booming and processing power keeps increasing? Multi-core (and multi-processor) machines are dominant these days.
The responsibility lies with developers to utilize that power. Does running a program on a multi-core machine automatically improve its performance? The answer is NO! A serial program gains almost nothing from running on a multi-core machine, because by default it executes on a single core. Programs have to be written in a way that puts multiple cores to work. Multi-threaded programming is what allows developers to run work on several cores concurrently.

What is parallelism? Parallelism is taking a certain task and dividing it into a set of related tasks to be executed concurrently. Sweet! But since these tasks run as threads, parallel programming still has the same problems as concurrent programming (deadlocks, data races, etc.). This makes parallel programs harder to write and much harder to debug :(
However, the good news is that parallel programming in .NET 4.0 is a great step forward. The new API solves a lot of the problems (but not all) of parallel programming.

Parallel programming in .NET 4.0 consists of the following:
 - The Task Parallel Library (TPL), which is divided into:
        - Task class: the unit of work you code against, instead of working with threads directly.
        - Parallel class: a static class that exposes task-based versions of common parallel patterns. It contains the following methods:
             - For
             - ForEach
             - Invoke

 - Parallel LINQ (PLINQ): exposes the familiar LINQ operators as parallel extensions; it is built on top of the TPL.
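
To make the list above concrete, here is a minimal sketch that touches each piece once. The class name TplTour and the helper Square are hypothetical, made up for illustration; the library calls (Task.Factory.StartNew, Parallel.Invoke, Parallel.For, AsParallel) are the standard .NET 4.0 ones.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

static class TplTour
{
    // Hypothetical helper: stands in for any unit of work.
    public static int Square(int x) { return x * x; }

    // Task class: start a unit of work and read its result later.
    public static int SquareWithTask(int x)
    {
        Task<int> task = Task.Factory.StartNew(() => Square(x));
        return task.Result; // blocks until the task has completed
    }

    // Parallel.Invoke: run independent actions, possibly at the same time.
    public static int SumOfTwoSquares(int a, int b)
    {
        int first = 0, second = 0;
        Parallel.Invoke(
            () => first = Square(a),
            () => second = Square(b));
        return first + second; // Invoke returns only after both actions finish
    }

    // Parallel.For: each iteration writes to its own slot, so no locking is needed.
    public static int[] SquareAll(int[] input)
    {
        int[] output = new int[input.Length];
        Parallel.For(0, input.Length, i => { output[i] = Square(input[i]); });
        return output;
    }

    // PLINQ: AsParallel() turns an ordinary LINQ query into a parallel one.
    public static int SumOfSquares(int[] input)
    {
        return input.AsParallel().Select(x => Square(x)).Sum();
    }

    public static void Main()
    {
        int[] numbers = { 1, 2, 3, 4 };
        Console.WriteLine(SquareWithTask(5));
        Console.WriteLine(SumOfTwoSquares(3, 4));
        Console.WriteLine(string.Join(", ", SquareAll(numbers)));
        Console.WriteLine(SumOfSquares(numbers));
    }
}
```

Parallel.ForEach follows the same shape as Parallel.For, taking a collection instead of an index range.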

Example - 'Parallel For' in a 'sleepy' application

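    using System;
    using System.Diagnostics;
    using System.Threading;
    using System.Threading.Tasks;

    class Program
    {
        static void Main(string[] args)
        {
            // Serial implementation: 100 iterations of 1 second each
            Stopwatch watch = Stopwatch.StartNew();
            for (int i = 0; i < 100; i++)
            {
                Thread.Sleep(1000); // 1 sec
            }
            watch.Stop();
            // TotalSeconds is the whole elapsed time; Seconds is only the 0-59 component
            Console.WriteLine("Normal Time: " + watch.Elapsed.TotalSeconds.ToString("F1"));

            // Parallel implementation: the same 100 iterations spread across worker threads
            watch = Stopwatch.StartNew();
            Parallel.For(0, 100, i =>
            {
                Thread.Sleep(1000); // 1 sec
            });
            watch.Stop();
            Console.WriteLine("Parallel Time: " + watch.Elapsed.TotalSeconds.ToString("F1"));

            Console.ReadLine();
        }
    }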

   
The above code shows the difference in execution time between performing a unit of work serially and in parallel.
On my dual-core machine, the parallel version cuts the time in half. Now, this is a huge improvement! Imagine all the work you usually do inside for loops (or foreach loops) and think of the time savings you can now get by using the new Parallel class.

Bad side of Parallel

But this new feature of .NET 4.0 has some limitations: there are cases where the parallel loop does not run any faster than the regular loop.
Some simple questions I had to answer in my development process: does it make sense to replace every normal foreach with a Parallel.ForEach loop? When should I start using Parallel.ForEach, only when iterating over 1000000 items?
There is no lower limit for doing parallel operations. If you have only 2 items to work on but each one takes a while, it might still make sense to use Parallel.ForEach. On the other hand, if you have 1000000 items but each one does very little, the parallel loop might not go any faster than the regular loop.
For example, I wrote a simple program to time nested loops, where the outer loop ran both as a for loop and with Parallel.ForEach.

Here's a run with only 2 items to work on, but each takes a while:

    2 outer iterations, 1000000 inner iterations:
    for loop: 00:00:00.1460441
    ForEach : 00:00:00.0842240

Here's a run with millions of items to work on, but they don't do very much:

    1000000 outer iterations, 2 inner iterations:
    for loop: 00:00:00.0866330
    ForEach : 00:00:02.1303315

The important point here is not really the number of iterations, but the amount of work done in each one. If the work is really simple, executing 1000000 delegates in parallel adds a huge overhead and will most likely be slower than a traditional single-threaded solution.
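
The timing program itself is not shown here, so the following is only a sketch of what such a test might look like, not the original code. The class name LoopTiming and the inner workload (summing square roots) are assumptions chosen for illustration. Timings depend heavily on the machine, so no expected output is given.

```csharp
using System;
using System.Diagnostics;
using System.Linq;
using System.Threading.Tasks;

static class LoopTiming
{
    // Hypothetical inner workload: sums the square roots of 0..inner-1.
    public static double Work(int inner)
    {
        double sum = 0;
        for (int j = 0; j < inner; j++) sum += Math.Sqrt(j);
        return sum;
    }

    // Outer loop as a plain for loop.
    public static TimeSpan TimeSerial(int outer, int inner)
    {
        double[] results = new double[outer];
        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < outer; i++) results[i] = Work(inner);
        watch.Stop();
        return watch.Elapsed;
    }

    // Outer loop as Parallel.ForEach: one delegate call per outer item.
    public static TimeSpan TimeParallel(int outer, int inner)
    {
        double[] results = new double[outer];
        Stopwatch watch = Stopwatch.StartNew();
        Parallel.ForEach(Enumerable.Range(0, outer), i => { results[i] = Work(inner); });
        watch.Stop();
        return watch.Elapsed;
    }

    static void Report(int outer, int inner)
    {
        Console.WriteLine("{0} outer iterations, {1} inner iterations:", outer, inner);
        Console.WriteLine("for loop: {0}", TimeSerial(outer, inner));
        Console.WriteLine("ForEach : {0}", TimeParallel(outer, inner));
    }

    public static void Main()
    {
        Report(2, 1000000);   // few items, each with a lot of work
        Report(1000000, 2);   // many items, each with almost no work
    }
}
```

In the second call, the cost of scheduling and invoking a delegate for each item is far larger than the work inside it, which is the overhead described above.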
It doesn't make sense to replace every foreach with Parallel.ForEach. Keep the following in mind:
- Your code may not actually be parallelizable. For example, if each iteration uses the "results so far" from the previous one and the order is important
- If you're aggregating (e.g. counting values), there are ways of using Parallel.ForEach for this, but you shouldn't just do it blindly
- If your work will complete very fast anyway, there's no benefit, and it may well slow things down
So, the only real way to know whether Parallel.ForEach helps is to try it and measure, because the running time is the improvement the end user will actually notice in the application.
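
As an illustration of the aggregation point in the list above, here is a minimal sketch that sums an array using the Parallel.ForEach overload with thread-local state. The class name ParallelSum is hypothetical. Simply writing total += v inside a parallel loop body would be a data race; in this version each worker keeps a private subtotal and merges it into the shared total once, using Interlocked.Add.

```csharp
using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

static class ParallelSum
{
    // Baseline: ordinary serial loop.
    public static long SumSerial(int[] values)
    {
        long total = 0;
        foreach (int v in values) total += v;
        return total;
    }

    public static long SumParallel(int[] values)
    {
        long total = 0;
        Parallel.ForEach<int, long>(
            values,
            () => 0L,                                          // localInit: each worker's subtotal starts at 0
            (v, loopState, subtotal) => subtotal + v,          // body: touches only the private subtotal
            subtotal => Interlocked.Add(ref total, subtotal)); // localFinally: merge once per worker task
        return total;
    }

    public static void Main()
    {
        int[] values = Enumerable.Range(1, 1000).ToArray();
        Console.WriteLine("Serial:   " + SumSerial(values));
        Console.WriteLine("Parallel: " + SumParallel(values));
    }
}
```

For a sum this cheap the serial loop is likely to be faster; the pattern pays off only when the per-item work is significant.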

