
Overview of the Task Parallel Library


In this article, I will give an overview of the TPL in .NET 4, just enough to get you started with it.

TPL stands for Task Parallel Library. It was introduced in .NET Framework 4.
In today’s world, the devices our code runs on are powerful yet compact. These machines come with multiple processors or multiple cores, so to harness their power we need to write smarter, more efficient code.

In the .NET world we have had threads all these years. So why do we need TPL anyway?

Well, we needed a much simpler programming interface, so that multi-threaded code is easier to read. Also, the objects we created (threads) were not lightweight. And we needed a better way of harnessing concurrency than the one we had.

Earlier, you needed in-depth knowledge of the multi-threading concepts supported by the language, and of the machine architecture as well, to write code that used the full power of the hardware.

Before TPL, it was really hard to write code that distributed work evenly across processors or cores. Trying to manage by hand how jobs were assigned to processors/cores was also a bad idea, and many times things would end in chaos.

Hence TPL lets developers write such code in a new and much better way. It helps achieve better results on the same hardware with cleaner, simpler code.

So basically you can think of TPL as a higher-level library built on top of the System.Threading APIs, but with much more intelligence added. TPL has made writing multi-threaded code in .NET a lot easier.

So, I’ll show you some basic code using the traditional API vs the TPL API!

Below is a small example in which your application needs to check the status of a service running on 5 different machines in your LAN. So how do we code it?

Traditional way:
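A minimal sketch of what the traditional approach might look like is given here. The machine names and the CheckService method are hypothetical placeholders (the real check would be some network call), so treat this as an illustration rather than production code.

```csharp
using System;
using System.Threading;

static class TraditionalCheck
{
    // Hypothetical machine names; replace with real hosts on your LAN.
    static readonly string[] Machines =
        { "machine1", "machine2", "machine3", "machine4", "machine5" };

    // Hypothetical stand-in for the real status check (e.g. a network call).
    public static bool CheckService(string machine)
    {
        Thread.Sleep(100); // simulate network latency
        return true;
    }

    public static bool[] CheckAll()
    {
        var results = new bool[Machines.Length];
        var threads = new Thread[Machines.Length];

        for (int i = 0; i < Machines.Length; i++)
        {
            int index = i; // copy so each thread captures its own value
            threads[i] = new Thread(() => results[index] = CheckService(Machines[index]));
            threads[i].Name = "Checker-" + index; // threads can be named
            threads[i].Start();                   // one dedicated thread per machine
        }

        // Wait for every thread to finish before reading the results.
        foreach (Thread t in threads)
            t.Join();

        return results;
    }

    static void Main()
    {
        bool[] results = CheckAll();
        for (int i = 0; i < results.Length; i++)
            Console.WriteLine("{0}: {1}", Machines[i], results[i] ? "running" : "down");
    }
}
```

Note that this sketch always creates one thread per machine, regardless of how many cores are actually available.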

With this way of coding, it is not clear whether your code is really optimized, whether it exploits the cores of your processor efficiently, or whether it is lightweight with respect to threads. To make sure, you would need to write more code to find out how many cores/processors your machine has and which of them are free, and then assign the work accordingly. Basically, the developer has to write more algorithms/intelligent code to exploit all the processors and get the best out of them.

Thankfully, TPL removes this overhead. So how do you code it in TPL, you ask?

To use the TPL APIs, we use the System.Threading.Tasks namespace. This namespace has many APIs. The one we use now to solve the above problem is Parallel.For.
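A minimal sketch of the same status check with Parallel.For could look like this. As before, the machine names and CheckService are hypothetical placeholders and not part of any real API.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class ParallelCheck
{
    // Hypothetical machine names; replace with real hosts on your LAN.
    static readonly string[] Machines =
        { "machine1", "machine2", "machine3", "machine4", "machine5" };

    // Hypothetical stand-in for the real status check (e.g. a network call).
    public static bool CheckService(string machine)
    {
        Thread.Sleep(100); // simulate network latency
        return true;
    }

    public static bool[] CheckAll()
    {
        var results = new bool[Machines.Length];

        // Parallel.For(fromInclusive, toExclusive, body):
        // the third argument is the loop body, called once per index.
        Parallel.For(0, Machines.Length, i =>
        {
            results[i] = CheckService(Machines[i]); // each iteration writes its own slot
        });

        // Parallel.For returns only after all iterations have completed.
        return results;
    }

    static void Main()
    {
        bool[] results = CheckAll();
        for (int i = 0; i < results.Length; i++)
            Console.WriteLine("{0}: {1}", Machines[i], results[i] ? "running" : "down");
    }
}
```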

See how simple and easy the code is. As said above, the TPL is smart enough to distribute the work across the available cores, so developers need not worry about it. Tasks are also much lighter than threads, and TPL runs them on the .NET 4 thread pool, which was improved with work-stealing queues.

In the above code, I have passed a lambda expression as the third argument. This is because Parallel.For accepts an Action<int> delegate as its loop body, so the compiler converts the lambda into a delegate instance, as the IL in the image shows.
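To make the delegate conversion a bit more concrete, here is a small sketch that passes the same loop body to Parallel.For in three ways: as a lambda, as an anonymous method, and as an explicitly constructed Action<int> over a named method. The class and member names are made up for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class DelegateForms
{
    static int namedTotal;

    // Named method with the shape required by Action<int>.
    static void AddToNamedTotal(int i)
    {
        Interlocked.Add(ref namedTotal, i);
    }

    public static int[] Run()
    {
        int lambdaTotal = 0;
        int anonymousTotal = 0;
        namedTotal = 0;

        // 1. Lambda expression.
        Parallel.For(0, 5, i => { Interlocked.Add(ref lambdaTotal, i); });

        // 2. Anonymous method (C# 2.0 syntax).
        Parallel.For(0, 5, delegate(int i) { Interlocked.Add(ref anonymousTotal, i); });

        // 3. Explicit delegate over a named method.
        Parallel.For(0, 5, new Action<int>(AddToNamedTotal));

        return new[] { lambdaTotal, anonymousTotal, namedTotal };
    }

    static void Main()
    {
        // Each form sums the indices 0..4, so all three totals are 10.
        Console.WriteLine(string.Join(", ", Run()));
    }
}
```

Interlocked.Add is used because several iterations may update the same total at the same time.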

Note that TPL won’t give you a speedup if the hardware doesn’t have multiple cores/processors. On a single-core, single-processor machine the tasks cannot run in parallel; they just take turns on the one core.

The beauty of TPL, unlike the traditional threading API, is that creating a task does not create a new thread each time. Rather, the task is queued to the thread pool and runs on an existing worker thread, which improves overall efficiency. The thread pool is smart enough to figure out by itself when to add a new worker thread, for instance when it detects that all the existing ones are busy. Everything happens behind the scenes. Thus, TPL helps the developer breathe easy.

If you take a look at the above Parallel.For code, we are not explicitly using any ThreadPool. Rather, the runtime internally uses a task scheduler and pool threads to get the job done. But I must warn you that the iterations will not run in order, and you cannot control the order either.
This is because in TPL each task is an independent unit of work, and the scheduler decides when and on which thread it runs.
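The ordering caveat can be observed with a small sketch like the one below, which records the order in which Parallel.For happens to run its iterations. The class name and the use of ConcurrentQueue are my own choices for illustration.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

static class OrderDemo
{
    // Runs 'count' iterations in parallel and returns the indices
    // in the order in which they were actually executed.
    public static int[] RunAndRecordOrder(int count)
    {
        var order = new ConcurrentQueue<int>(); // thread-safe collection
        Parallel.For(0, count, i => order.Enqueue(i));
        return order.ToArray();
    }

    static void Main()
    {
        int[] order = RunAndRecordOrder(10);

        // Every index appears exactly once, but the sequence may differ
        // from run to run and from machine to machine.
        Console.WriteLine(string.Join(", ", order));
    }
}
```

If the order matters, collect the results by index (as in an array) instead of relying on execution order.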

As I already said, TPL is a convenience layer rather than a new threading model. It ships as part of the .NET 4 Base Class Library (the System.Threading.Tasks namespace), and internally all the existing concepts, viz. ThreadPool, worker threads, etc., are still used, but in a much more efficient way.

Okay, so the above code shows a better way of starting tasks in a loop. But in our daily coding life not every problem looks like this; in other words, Parallel.For is not enough to solve every problem. So we need a way to control/handle tasks individually.

So how do we do it, you ask? Well, see the code below.

We will use the Task class for that. The Task class provides various APIs to get the job done easily and cleanly.
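A minimal sketch of the status check written with individual tasks might look like this. The machine names and CheckService are hypothetical placeholders, as in the earlier examples.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class TaskCheck
{
    // Hypothetical machine names; replace with real hosts on your LAN.
    static readonly string[] Machines =
        { "machine1", "machine2", "machine3", "machine4", "machine5" };

    // Hypothetical stand-in for the real status check (e.g. a network call).
    public static bool CheckService(string machine)
    {
        Thread.Sleep(100); // simulate network latency
        return true;
    }

    public static bool[] CheckAll()
    {
        var tasks = new Task<bool>[Machines.Length];

        for (int i = 0; i < Machines.Length; i++)
        {
            string machine = Machines[i]; // copy so each task captures its own value

            // StartNew creates the task and schedules it right away.
            tasks[i] = Task.Factory.StartNew<bool>(() => CheckService(machine));
        }

        Task.WaitAll(tasks); // block until every task has finished

        // Result holds the value returned by each task's delegate.
        return Array.ConvertAll(tasks, t => t.Result);
    }

    static void Main()
    {
        bool[] results = CheckAll();
        for (int i = 0; i < results.Length; i++)
            Console.WriteLine("{0}: {1}", Machines[i], results[i] ? "running" : "down");
    }
}
```

Because each task is a separate object, you can wait on, inspect, or combine them individually, which Parallel.For does not let you do.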

So as you can see from the above code, it’s much simpler and nicer, and all the spaghetti scheduling logic is left to the framework to take care of.

Now if you take a closer look at the above code, we are using Task.Factory. This property returns the default TaskFactory, which, as already said, uses the thread pool to get the job done. Now you might wonder how the task gets started. Well, as soon as you call StartNew() on the factory, the task is created and scheduled to run in the background.

You may ask: can you create the task now and start it later? Yes, you can, but it is not recommended. Just to show you:
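A minimal sketch of creating a task first and starting it later could look like this. The class name and the printed message are made up for illustration.

```csharp
using System;
using System.Threading.Tasks;

static class DeferredStart
{
    // Returns the task's status before Start() and after completion.
    public static TaskStatus[] Demo()
    {
        // The constructor only creates the task; nothing runs yet.
        var task = new Task(() => Console.WriteLine("Task body running"));
        TaskStatus before = task.Status;

        task.Start(); // schedule the task for execution
        task.Wait();  // block until it has finished
        TaskStatus after = task.Status;

        return new[] { before, after };
    }

    static void Main()
    {
        TaskStatus[] statuses = Demo();
        Console.WriteLine(statuses[0]); // Created
        Console.WriteLine(statuses[1]); // RanToCompletion
    }
}
```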

As you can see from the above code, no factory or scheduler is specified explicitly; Start() schedules the task on the current TaskScheduler. But the trick comes when you have to choose a TaskScheduler to do your job.

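TaskScheduler provides two static properties, Current and Default, and they do not always return the same scheduler: Default is the thread-pool scheduler, while Current is the scheduler of the task that is currently executing (or Default when no task is executing). Hence it is recommended to use Task.Factory.StartNew rather than creating the task and starting it separately.
You can read more in depth about StartNew vs new Task() on the TPL team’s engineering blog: http://blogs.msdn.com/b/pfxteam/archive/2010/06/13/10024153.aspx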
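If you want to rule out any surprise about which scheduler is used, one option is to name it explicitly. The sketch below passes TaskScheduler.Default to StartNew; the class name and the computed value are only for illustration.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

static class SchedulerDemo
{
    // Starts a task on an explicitly chosen scheduler and returns its result.
    public static int ComputeOnDefaultScheduler()
    {
        Task<int> task = Task.Factory.StartNew<int>(
            () => 6 * 7,
            CancellationToken.None,      // no cancellation in this sketch
            TaskCreationOptions.None,    // no special creation options
            TaskScheduler.Default);      // always the thread-pool scheduler

        return task.Result; // Result waits for the task to finish
    }

    // Outside of any task, Current falls back to Default.
    public static bool CurrentIsDefaultHere()
    {
        return ReferenceEquals(TaskScheduler.Current, TaskScheduler.Default);
    }

    static void Main()
    {
        Console.WriteLine(ComputeOnDefaultScheduler()); // 42
        Console.WriteLine(CurrentIsDefaultHere());      // True
    }
}
```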

Now if you remember, with the traditional Thread class you could assign your own name to the threads created in your code. With TPL’s Task style this is not possible, since a task has no name and simply runs on whichever pool thread is available. Instead, each task gets an ID that is unique to it. You can read it from the Id property of the task object, or from the static Task.CurrentId property inside the running task.
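A small sketch of reading the task ID both ways is shown here; the class and variable names are made up for illustration.

```csharp
using System;
using System.Threading.Tasks;

static class TaskIdDemo
{
    // Returns true when the ID seen inside the task equals task.Id.
    public static bool IdsMatch()
    {
        int? idSeenInside = null;

        Task task = Task.Factory.StartNew(() =>
        {
            // Static property: the ID of the task running this code.
            idSeenInside = Task.CurrentId;
        });
        task.Wait();

        // Instance property: the ID of this particular task object.
        return idSeenInside == task.Id;
    }

    static void Main()
    {
        Console.WriteLine(IdsMatch());               // True
        Console.WriteLine(Task.CurrentId.HasValue);  // False: Main is not running inside a task
    }
}
```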

Now let’s get under the hood and see what’s really happening at the IL level. The same Task line is converted by the compiler to Task t = Task.Factory.StartNew(new Action(MainClass.MyTask));

Upon disassembling the code, it looks like the image below:

Do not worry if you don’t understand anything in the above image. It’s the Intermediate Language (IL) that every .NET language compiles its high-level code into.

So as you can see in line 15, Task.Factory is a property. Hence the compiler first calls the get_Factory() method. Why? Because at the IL level all property accessors, i.e. set and get, are converted to equivalent methods. At lines 17 and 18, it loads the MyTask() method into an Action delegate. Then in line 19 it calls the StartNew method with this delegate to start the task execution.
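You can confirm the accessor naming without a disassembler. The sketch below uses reflection to look up the getter method behind the Task.Factory property; the class name is made up for illustration.

```csharp
using System;
using System.Reflection;
using System.Threading.Tasks;

static class AccessorDemo
{
    // Returns the name of the method that implements the Task.Factory getter.
    public static string FactoryGetterName()
    {
        PropertyInfo property = typeof(Task).GetProperty("Factory");
        MethodInfo getter = property.GetGetMethod();
        return getter.Name;
    }

    static void Main()
    {
        Console.WriteLine(FactoryGetterName()); // get_Factory
    }
}
```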

Well, that’s all for now, friends. There’s no use in my writing more in depth, since it would be the same as what others have already written on the web. To dig deeper into TPL, you can read the excellent article by Sacha Barber here: http://www.codeproject.com/KB/cs/TPL1.aspx

Enjoy 🙂

