.NET Task Parallel Library Advanced Data Parallel

Tuesday Jul 5th 2011 by Jeffrey Juday


Much of the .NET Task Parallel Library (TPL) Data Parallel functionality is encapsulated in Parallel Loops. Parallel Loops are conceptually similar to regular loops: like a regular loop, a Parallel Loop usually operates on a collection of data, performing a computation on each element in the collection. Controlling a Parallel Loop's execution is more complicated, though.

Unlike a regular loop, a Parallel Loop must partition a collection so computations can be doled out to worker Tasks. Parallel Loops also require a developer to address concurrency issues like cancellation and thread-safe operations.

The TPL Parallel class includes methods and overloads to execute and control Parallel loops. Wading through all the overloads, methods, and options can take time. Luckily, the methods and overloads are variations on a handful of core classes and concepts. What follows will introduce the TPL Data Parallel core classes and concepts.

Data Parallel

Prior articles introduced Data Parallel, so a complete introduction is beyond the scope of this article. However, some context is important.

Data Parallel algorithms take advantage of the natural properties of collections and of the computations commonly applied to collection members. Often collection contents are uniform and independent of one another. So as long as each invocation of an operation works on a different collection member, the invocations can often be carried out in parallel.

Data Parallel algorithms often follow this pattern:

  • The collection is broken into chunks or many smaller collections.
  • Each chunk is distributed to a separate Thread or worker task.
  • Worker tasks operate on their individual chunks.
  • The whole algorithm is complete when all worker tasks are complete.
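
The four steps above can be sketched by hand, without the Parallel class, to show what a Parallel Loop automates. This is only an illustrative sketch: the ChunkedSum class, its Sum method, and the fixed chunk count are hypothetical names and choices, not part of TPL, and TPL's own partitioning is more sophisticated than equal-sized slices.

```csharp
using System;
using System.Linq;
using System.Threading.Tasks;

public static class ChunkedSum
{
    // Sums the values by splitting them into chunkCount slices, one worker Task per slice.
    public static long Sum(int[] values, int chunkCount)
    {
        // Step 1: break the collection into chunks (index ranges here).
        int chunkSize = (values.Length + chunkCount - 1) / chunkCount;
        var workers = new Task<long>[chunkCount];

        for (int c = 0; c < chunkCount; c++)
        {
            int start = Math.Min(c * chunkSize, values.Length);
            int end = Math.Min(start + chunkSize, values.Length);

            // Steps 2 and 3: hand each chunk to its own worker Task, which
            // reads only its own slice of the array.
            workers[c] = Task.Run(() =>
            {
                long partial = 0;
                for (int i = start; i < end; i++) { partial += values[i]; }
                return partial;
            });
        }

        // Step 4: the algorithm is complete when every worker is complete.
        // Reading Result blocks until that worker has finished.
        long total = 0;
        foreach (var worker in workers) { total += worker.Result; }
        return total;
    }

    public static void Main()
    {
        int[] values = Enumerable.Range(1, 1000).ToArray();
        Console.WriteLine(Sum(values, 4)); // prints 500500
    }
}
```

Because each worker reads only its own slice and returns a private subtotal, no locking is needed until the results are combined.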

As stated earlier, much of the TPL Data Parallel functionality is encapsulated in the Parallel class.


The For and ForEach methods comprise the Parallel class's Data Parallel functionality. ForEach operates over a collection and is addressed later in the article. For encapsulates a more general Data Parallel algorithm. A sample For implementation doing a parallel string concatenation appears below:

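using System;
using System.Threading;
using System.Threading.Tasks;

int startCount = 0;
string reportedResults = "";
object resultsLock = new object();

var result =
    Parallel.For<string>(0, 100

    // localInit: runs once per worker Task and seeds that Task's local string.
    , () => { return "Start " + Interlocked.Increment(ref startCount).ToString(); }

    // body: runs once per iteration and returns the updated local string.
    , (i, loopState, inVal) =>
    {
        Console.WriteLine("Thread == " + Thread.CurrentThread.ManagedThreadId.ToString()
            + " At " + i.ToString() + " "
            + inVal + "\r\n");

        return inVal + " " + i.ToString();
    }

    // localFinally: runs once per worker Task. It may run on several Tasks
    // at once, so the shared string is updated under a lock.
    , (s) => { lock (resultsLock) { reportedResults = reportedResults + " -- " + s; } }
    );

Console.WriteLine("Reported results are " + reportedResults);
Console.WriteLine("Result was " + result.IsCompleted.ToString());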

The For signature utilized in the sample above appears below.

public static ParallelLoopResult For<TLocal>(
    int fromInclusive
    , int toExclusive
    , Func<TLocal> localInit
    , Func<int, ParallelLoopState, TLocal, TLocal> body
    , Action<TLocal> localFinally
)

Like a traditional for loop, the first two values define the bounds of the workload. In the sample above, the code concatenates the numbers 0 to 99, separated by spaces, with each Task building the string for its own share of the range. Internally, TPL determines how to divide the workload range and allocates Tasks to process the resulting ranges.

Each executing Task first calls the localInit Func delegate. This parameter allows a developer to initialize the Task's local state. The delegate's return type is dictated by the type argument supplied to For<TLocal>, which is string in the sample above. Had the sample operated on a collection, the Func could have returned a reference to the collection or to a smaller segment of a larger collection.

The body delegate executes on each workload iteration and runs in parallel across the Tasks allocated to the workload. ParallelLoopState will be discussed later in the article. As would be expected, each body invocation receives the iteration index and the local value: the value returned by localInit on a Task's first iteration, and the value returned by the previous body invocation on that Task thereafter.

The localFinally delegate runs once when each allocated Task has finished its iterations, receiving the Task's final local value. This delegate supports a Scatter-Gather scenario where each Task performs some portion of the work and returns the results of its efforts. TPL does not serialize these calls: localFinally may be invoked concurrently on multiple Tasks, so access to any shared variable it updates must be synchronized.
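
The interplay of localInit, body, and localFinally may be easier to see in a numeric Scatter-Gather sketch than in the string sample. The LocalStateSum class and SumRange method below are hypothetical names chosen for illustration; the Parallel.For overload is the one whose signature appears above.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LocalStateSum
{
    // Adds up the integers in [fromInclusive, toExclusive) using per-Task subtotals.
    public static long SumRange(int fromInclusive, int toExclusive)
    {
        long total = 0;

        Parallel.For<long>(fromInclusive, toExclusive
            // localInit: each worker Task starts with its own subtotal of 0.
            , () => 0L
            // body: adds the index to this Task's subtotal. No lock is needed
            // because the subtotal is private to the Task.
            , (i, loopState, subtotal) => subtotal + i
            // localFinally: merges the subtotal into the shared total. This can
            // run on several Tasks at once, so the update is atomic.
            , subtotal => Interlocked.Add(ref total, subtotal)
            );

        return total;
    }

    public static void Main()
    {
        Console.WriteLine(SumRange(0, 100)); // prints 4950
    }
}
```

Keeping the running value in Task-local state means the shared total is touched only once per Task rather than once per iteration.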

ForEach works a lot like the For method. The biggest difference is that ForEach operates on IEnumerable classes. ForEach has overloads that accept localInit and localFinally delegates as well, and because its source is a collection rather than an index range, TPL also includes overloads that accept a Partitioner class implementation.


A Partitioner intelligently divides and balances a collection so different Tasks can operate independently on segments of the collection. A Parallel.ForEach sample appears below.

using System;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

// An array of 20 strings; the values "0" through "19" are assumed here.
var enumerable = Enumerable.Range(0, 20).Select(n => n.ToString()).ToArray();
var opts = new ParallelOptions();
var cancelS = new CancellationTokenSource(); // Covered under cancellations below.

opts.CancellationToken = cancelS.Token;

var result = Parallel.ForEach<string>(enumerable

    , opts

    // s is the element, loopState controls the loop, on is the element's index.
    , (s, loopState, on) =>
    {
        // How you can use loopState
        if (loopState.ShouldExitCurrentIteration) { loopState.Break(); return; }

        enumerable[on] = s + " Thread "
            + Thread.CurrentThread.ManagedThreadId.ToString()
            + " on == " + on.ToString();
    }
    );

foreach (var val in enumerable) { Console.WriteLine(val); }
Console.WriteLine("The result was " + result.IsCompleted.ToString());

Running the earlier For sample may have yielded some strange output. The For method does not guarantee ordered execution. In fact, a Task allocated to the middle of the For range may finish after the Task handling the end of the range. A Partitioner can impose an ordering on results.
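
Independent of any Partitioner, one common way to obtain ordered results from an unordered loop is to have each iteration write into the slot that matches its index. This is a different technique from partitioning, shown here only as a minimal sketch; the OrderedResults class and Squares method are hypothetical names.

```csharp
using System;
using System.Threading.Tasks;

public static class OrderedResults
{
    // Returns the squares of 0..count-1 in index order.
    public static int[] Squares(int count)
    {
        var results = new int[count];

        // Iterations may run in any order, but each one writes only its own
        // slot, so the array ends up ordered by index and no lock is needed.
        Parallel.For(0, count, i => { results[i] = i * i; });

        return results;
    }

    public static void Main()
    {
        Console.WriteLine(string.Join(",", Squares(5))); // prints 0,1,4,9,16
    }
}
```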

TPL includes some standard Partitioners, but a developer may need something more specialized. Collections of more complex objects may require special handling to load balance a workload across executing Tasks. A complete Partitioning review could fill an entire article; good resources for one are listed at the end of this article. The sample above also demonstrates the ParallelLoopState and ParallelOptions classes.
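
As one example of a standard Partitioner, Partitioner.Create(int, int) in System.Collections.Concurrent splits an index range into Tuple<int, int> sub-ranges that Parallel.ForEach hands out to Tasks. The sketch below assumes a simple per-element computation; the RangePartitionDemo class and SquareRoots method are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

public static class RangePartitionDemo
{
    // Fills an array with square roots, processing one index range per body call.
    // count must be greater than 0, because the range partitioner rejects empty ranges.
    public static double[] SquareRoots(int count)
    {
        var results = new double[count];

        // Each partition is a Tuple: Item1 is inclusive, Item2 is exclusive.
        var ranges = Partitioner.Create(0, count);

        Parallel.ForEach(ranges, range =>
        {
            // A plain loop over the sub-range keeps delegate calls to one per
            // range instead of one per element.
            for (int i = range.Item1; i < range.Item2; i++)
            {
                results[i] = Math.Sqrt(i);
            }
        });

        return results;
    }

    public static void Main()
    {
        double[] roots = SquareRoots(10);
        Console.WriteLine(roots[9]); // prints 3
    }
}
```

Range partitioning of this kind tends to pay off when the work per element is small, since the delegate invocation cost is spread over a whole sub-range.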

ParallelLoopState and ParallelOptions

ParallelOptions cancellation is demonstrated in the ForEach sample. Cancellation allows external code to abort a loop before or during execution. Including the CancellationToken in the ParallelOptions surfaces the cancellation through the ParallelLoopState.ShouldExitCurrentIteration property. To react promptly to a cancellation, a Parallel Loop body must query ShouldExitCurrentIteration at some point during execution; once the loop winds down, the Parallel method throws an OperationCanceledException. ParallelLoopState also includes Break and Stop methods. As might be expected, both halt execution, but not identically: Break lets iterations with a lower index than the breaking iteration still run, while Stop asks the loop to end at the earliest convenience. Unlike traditional looping, however, other Tasks may be executing in parallel and in various completion stages, so iterations already under way are not interrupted.
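
The two control paths can be observed from the calling side. The sketch below is illustrative only: the LoopControlDemo class and its methods are hypothetical names, and the token is cancelled before the loop starts purely to make the outcome deterministic.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

public static class LoopControlDemo
{
    // Returns true when the loop ends by throwing OperationCanceledException.
    public static bool RunCanceledLoop()
    {
        var cancelS = new CancellationTokenSource();
        var opts = new ParallelOptions { CancellationToken = cancelS.Token };

        // Cancel up front so the result does not depend on timing.
        cancelS.Cancel();

        try
        {
            Parallel.For(0, 1000, opts, i => { });
            return false;
        }
        catch (OperationCanceledException)
        {
            return true;
        }
    }

    // Breaks at iteration 100 and returns the loop result for inspection.
    public static ParallelLoopResult RunBreakingLoop()
    {
        return Parallel.For(0, 1000, (i, loopState) =>
        {
            if (i == 100) { loopState.Break(); }
        });
    }

    public static void Main()
    {
        Console.WriteLine(RunCanceledLoop());          // prints True
        ParallelLoopResult result = RunBreakingLoop();
        Console.WriteLine(result.IsCompleted);         // prints False
        Console.WriteLine(result.LowestBreakIteration); // prints 100
    }
}
```

After a Break, IsCompleted is false and LowestBreakIteration reports the lowest index at which Break was called, which lets the caller tell an early exit from a full run.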


Task Parallel Library Data Parallel loops are encapsulated in the Parallel class. Data Parallel loops require more complicated control mechanisms than traditional loops.


"Custom Parallel Partitioning with .NET 4.0"

"Task Parallel Library"

"Data Parallelism (Task Parallel Library)"
