jocPTask

Idea


PTask (for Parallel Task) is a library that aims to simplify the implementation of parallelized algorithms. The basic concept is to define a description of your algorithm with its parallel and serial sections and then execute these descriptions on a fixed number of worker threads.


The main goals are:

  • a fixed number of threads if possible, usually the same as the number of CPUs
  • no waits/joins unless unavoidable
  • exception handling
  • multiple independent algorithm descriptions (from different threads) executed in parallel transparently


The current PTask implementation uses internal bookkeeping to achieve these goals. Waits or joins (which would mean more threads) are not required, except when tasks inside one description execute their own independent algorithm descriptions.



Project Status


The PTask library is open source and is released under the LGPL license. You can also find it on sourceforge.net. Your feedback is highly appreciated! Please do not hesitate to mail me about bugs or implementation faults. I will provide more in-depth documentation about the current implementation soon.



How To Use


An algorithm description is defined by nesting ptask queues and ptasks according to the workflow of your algorithm. PTask queues are themselves ptasks, so in fact you are just nesting ptasks. As an example, consider computing statistics on a data array, then finding a proper threshold, then applying a filter to each data item using this threshold. In the PTask framework this can be defined as a serial queue consisting of one parallel queue containing a number of ptasks gathering statistics, followed by the threshold ptask, followed by a parallel queue containing a number of ptasks applying the filter.


This is how it would look in code:


// Outer serial queue: its children run one after another.
SerialPTaskQueue sq = new SerialPTaskQueue();

// Stage 1: gather statistics on both halves of the data in parallel.
ParallelPTaskQueue pq1 = new ParallelPTaskQueue();
pq1.addPTask(new StatPTask(0, 99999));
pq1.addPTask(new StatPTask(100000, 199999));
sq.addPTask(pq1);

// Stage 2: derive the threshold from the gathered statistics.
sq.addPTask(new ThresholdPTask());

// Stage 3: apply the filter to both halves in parallel.
ParallelPTaskQueue pq2 = new ParallelPTaskQueue();
pq2.addPTask(new FilterPTask(0, 99999));
pq2.addPTask(new FilterPTask(100000, 199999));
sq.addPTask(pq2);

try {
    sq.process(myData); // waits until all the tasks are executed
} catch (ExecutionException e) {
    … // process exceptions from your code here
}


The PTask interface defines just one method that you must implement to get work done:


void process(Object data) throws InterruptedException, ExecutionException;
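
To make this contract concrete, here is a minimal sketch of a task that sums one slice of an array. The PTask interface declared inside the example is only a local stand-in that mirrors the signature quoted above, so that the code compiles without the library. RangeSumExample, RangeSumTask and getSum are hypothetical names invented for illustration and are not part of PTask.

```java
import java.util.concurrent.ExecutionException;

final class RangeSumExample {

    // Local stand-in mirroring the signature quoted above (NOT the library's own type).
    interface PTask {
        void process(Object data) throws InterruptedException, ExecutionException;
    }

    // Hypothetical task: sums values[from..to], both bounds inclusive.
    static final class RangeSumTask implements PTask {
        private final int from;
        private final int to;
        private long sum;

        RangeSumTask(int from, int to) {
            this.from = from;
            this.to = to;
        }

        @Override
        public void process(Object data) throws ExecutionException {
            // The framework hands over a plain Object, so the task checks the type itself.
            if (!(data instanceof int[])) {
                throw new ExecutionException(new IllegalArgumentException("expected int[]"));
            }
            int[] values = (int[]) data;
            long local = 0;
            for (int i = from; i <= to; i++) {
                local += values[i];
            }
            sum = local; // publish the result once, after the loop
        }

        long getSum() {
            return sum;
        }
    }

    public static void main(String[] args) throws Exception {
        int[] values = {1, 2, 3, 4, 5, 6};
        RangeSumTask task = new RangeSumTask(2, 4);
        task.process(values);
        System.out.println(task.getSum()); // 3 + 4 + 5 = 12
    }
}
```

In this sketch the task keeps its result in a field. With the real framework you would more likely write results into the shared data object that is passed to process.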


The number of tasks to use depends on the number of CPUs in your system. There is also a DynamicParallelPTaskQueue, which is configured with a factory that generates the appropriate number of parallel tasks for a given number of parallel threads (CPUs) on demand. In my first tests it runs considerably faster than creating threads and joining them for the parallel portions, and still somewhat faster than pre-creating the threads using a ThreadPoolExecutor.
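
As a rough illustration of sizing the work to the machine, the sketch below splits an index range into one contiguous chunk per available processor. It relies only on the standard Runtime.availableProcessors() call. The ChunkPlanner class and its split helper are hypothetical and say nothing about how DynamicParallelPTaskQueue or its factory are actually implemented.

```java
import java.util.ArrayList;
import java.util.List;

final class ChunkPlanner {

    // Splits the index range [0, length) into at most `parts` contiguous chunks.
    // Each chunk is returned as {first, last} with both bounds inclusive.
    static List<int[]> split(int length, int parts) {
        if (length < 0 || parts < 1) {
            throw new IllegalArgumentException("length must be >= 0 and parts >= 1");
        }
        List<int[]> chunks = new ArrayList<>();
        int base = length / parts;   // minimum chunk size
        int extra = length % parts;  // the first `extra` chunks get one more item
        int start = 0;
        for (int i = 0; i < parts && start < length; i++) {
            int size = base + (i < extra ? 1 : 0);
            chunks.add(new int[] {start, start + size - 1});
            start += size;
        }
        return chunks;
    }

    public static void main(String[] args) {
        int cpus = Runtime.getRuntime().availableProcessors();
        for (int[] chunk : split(200000, cpus)) {
            // Each pair could become the bounds of one task, e.g. one StatPTask.
            System.out.println(chunk[0] + " .. " + chunk[1]);
        }
    }
}
```

With two processors this yields the same two ranges used in the example above, 0 to 99999 and 100000 to 199999.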


Input data and results are managed using just an Object reference passed to the process methods of the tasks. I used a synchronized Map in my tests and some synchronized custom objects inside the map. What you use is entirely up to you, the framework classes just pass the data object along without using it. 


IMPORTANT:

Accessing synchronized objects is slow! If you use synchronized objects for communication between your tasks, make sure to access them sparingly, as otherwise your parallelized code might run slower than the serial version. If you compute statistics, for example, compute local stats first and merge them into the global one once at the end of each ptask. You can find samples for this in the provided sample code.

Do not block your tasks with waits or joins, as the framework currently does not detect this and may stall.
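
The advice on synchronized objects can be sketched in plain Java as follows. Each worker sums its slice into a local variable and touches the shared, synchronized accumulator exactly once at the end. The LocalMergeExample and SharedStats classes are hypothetical, and raw threads are used only to keep the example self-contained; the sample code shipped with PTask may be organized differently.

```java
final class LocalMergeExample {

    // Hypothetical shared result object; every method call takes the lock.
    static final class SharedStats {
        private long sum;
        private long count;

        synchronized void merge(long partialSum, long partialCount) {
            sum += partialSum;
            count += partialCount;
        }

        synchronized long sum() {
            return sum;
        }

        synchronized long count() {
            return count;
        }
    }

    static SharedStats computeStats(int[] values, int workers) throws InterruptedException {
        SharedStats shared = new SharedStats();
        Thread[] threads = new Thread[workers];
        int chunk = (values.length + workers - 1) / workers; // ceiling division
        for (int w = 0; w < workers; w++) {
            final int from = Math.min(values.length, w * chunk);
            final int to = Math.min(values.length, from + chunk); // exclusive bound
            threads[w] = new Thread(() -> {
                long localSum = 0; // thread-local, no locking inside the loop
                for (int i = from; i < to; i++) {
                    localSum += values[i];
                }
                shared.merge(localSum, to - from); // one synchronized call per worker
            });
            threads[w].start();
        }
        // The join happens in the driver code here, not inside a task.
        for (Thread t : threads) {
            t.join();
        }
        return shared;
    }

    public static void main(String[] args) throws InterruptedException {
        int[] values = new int[1000];
        for (int i = 0; i < values.length; i++) {
            values[i] = i + 1;
        }
        SharedStats stats = computeStats(values, 4);
        System.out.println(stats.sum() + " / " + stats.count());
    }
}
```

Calling merge inside the loop instead would acquire the lock once per element, which is exactly the pattern the warning above is about.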



PTask Home


You can find the latest update here at http://www.jocware.com and of course at http://ptask.sourceforge.net.

Please send mail to support _at_ jocware _dot_ com with questions, bug reports, and suggestions!


Have fun with PTasks!


…Jochen Riekhof

Version 0.84 Released

Future cancellation support is finally implemented. I also added the MultiJuliaApp sample application and fixed a bug in Ticket interrupt handling that caused InterruptedExceptions to be swallowed.

The intended features for 1.0 are now all implemented. Only more testing is needed.

Version 0.82 Released

Just uploaded version 0.82. Among the new features are support for Futures, a convenience QueueBuilder for even easier setup of your parallel tasks, a new real-time fractal generator with nice graphics, and of course more unit tests.

Version 0.81 Released

Version 0.81 brings several important bug fixes. I also started on Future support, but this is not tested yet.

Initial Public Version 0.8 Released

Just uploaded the first version, 0.8. The status is pre-alpha. A couple of unit tests and some small test apps run, but no real-life application uses this framework yet. I will now implement a simple GUI demo application for some real-life testing.

Copyright © 2009 by Jochen Riekhof