PARALLELCPU(1)	      User Contributed Perl Documentation	PARALLELCPU(1)

NAME
       PDL::ParallelCPU - Parallel Processor MultiThreading Support in PDL
       (Experimental)

DESCRIPTION
       PDL has support (currently experimental) for splitting up numerical
       processing between multiple parallel processor threads (or pthreads)
       using the set_autopthread_targ and set_autopthread_size functions.
       This can improve processing performance (by 2-4X or more in most
       cases) by taking advantage of multi-core and/or multi-processor
       machines.

SYNOPSIS
	 use PDL;

	 # Set target of 4 parallel pthreads to create, with a lower limit of
	 #  5Meg elements for splitting processing into parallel pthreads.
	 set_autopthread_targ(4);
	 set_autopthread_size(5);

	 $a = zeroes(5000,5000); # Create 25Meg element array

	 $b = $a + 5; # Processing will be split up into multiple pthreads

	 # Get the actual number of pthreads for the last
	 #  processing operation.
	 $actualPthreads = get_autopthread_actual();

Terminology
       The use of the term threading can be confusing with PDL, because it can
       refer to PDL threading, as defined in the PDL::Threading docs, or to
       processor multi-threading.

       To reduce confusion with the existing PDL threading terminology, this
       document uses pthreading to refer to processor multi-threading, which
       is the use of multiple processor threads to split up numerical
       processing into parallel operations.

Functions that control PDL PThreads
       This is a brief listing and description of the PDL pthreading
       functions; see the PDL::Core docs for detailed information.

       set_autopthread_targ
	    Set the target number of processor-threads (pthreads) for multi-
	    threaded processing. Setting the target to 0 means that no
	    pthreading will occur.

	    See PDL::Core for details.

       set_autopthread_size
	    Set the minimum size (in Meg-elements, i.e. units of 2**20
	    elements) that the largest PDL involved in a function must have
	    for auto-pthreading to be performed. For small PDLs, it probably
	    isn't worth starting multiple pthreads, so this function defines
	    a minimum threshold below which auto-pthreading won't be attempted.

	    See PDL::Core for details.

       get_autopthread_actual
	    Get the actual number of pthreads executed for the last PDL
	    processing function.

	    See PDL::Core for details.
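
       To make the Meg-element unit concrete, the following plain-Perl sketch
       (no PDL required) converts array dimensions to Meg-elements. The helper
       meg_elements is hypothetical and only illustrates the arithmetic; it is
       not a PDL function.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDL): size of an array in Meg-elements,
# where 1 Meg-element = 2**20 elements.
sub meg_elements {
    my @dims = @_;
    my $n = 1;
    $n *= $_ for @dims;      # total element count = product of the dims
    return $n / 2**20;
}

printf "%.2f Meg-elements\n", meg_elements(5000, 5000);   # 23.84
printf "%.2f Meg-elements\n", meg_elements(1000, 1000);   # 0.95
```

       With set_autopthread_size(5), the 5000x5000 array from the SYNOPSIS is
       above the threshold, while a 1000x1000 array is below it.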

Global Control of PDL PThreading using Environment Variables
       PDL PThreading can be globally turned on, without modifying existing
       code, by setting the environment variables PDL_AUTOPTHREAD_TARG and
       PDL_AUTOPTHREAD_SIZE before running a PDL script.  These environment
       variables are checked when PDL starts up, and calls to the
       set_autopthread_targ and set_autopthread_size functions are made with
       the environment variables' values.

       For example, if the environment variable PDL_AUTOPTHREAD_TARG is set
       to 3, and PDL_AUTOPTHREAD_SIZE is set to 10, then any PDL script will
       run as if the following lines were at the top of the file:

	set_autopthread_targ(3);
	set_autopthread_size(10);

How It Works
       The auto-pthreading process works by analyzing threaded array
       dimensions in PDL operations and splitting up processing based on the
       thread dimension sizes and desired number of pthreads (i.e. the pthread
       target or pthread_targ). The offsets and increments that PDL uses to
       step thru the data in memory are modified for each pthread so each one
       sees a different set of data when performing processing.

       Example

	$a = sequence(20,4,3); # Small 3-D Array, size 20,4,3

	# Setup auto-pthreading:
	set_autopthread_targ(2); # Target of 2 pthreads
	set_autopthread_size(0); # Zero so that the small PDLs in this example will be pthreaded

	# This will be split up into 2 pthreads
	$c = maximum($a);

       For the above example, the maximum function has a signature of "(a(n);
       [o]c())", which means that the first dimension of $a (size 20) is a
       core dimension of the maximum function. The other dimensions of $a
       (sizes 4 and 3) are threaded dimensions (i.e. they will be threaded
       over in the maximum function).

       The auto-pthreading algorithm examines the threaded dims of size (4,3)
       and picks the size-4 dimension, since it is evenly divisible by the
       autopthread_targ of 2. The processing of the maximum function is then
       split into two pthreads on the size-4 dimension, with dim indexes 0,2
       processed by one pthread and dim indexes 1,3 processed by the other
       pthread.
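
       The index assignment in this example can be reproduced with a small
       plain-Perl sketch (no PDL required). The helper split_indexes is
       hypothetical; it assumes each pthread starts at its own offset and
       steps by the number of pthreads, which matches the 0,2 / 1,3 split
       stated above but is not taken from the PDL source.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDL): assign the indexes of the split
# dimension to pthreads. Pthread $p starts at offset $p and advances by
# the number of pthreads, so the indexes are interleaved.
sub split_indexes {
    my ($dim_size, $n_pthreads) = @_;
    my @work;
    for my $p (0 .. $n_pthreads - 1) {
        for (my $i = $p; $i < $dim_size; $i += $n_pthreads) {
            push @{ $work[$p] }, $i;
        }
    }
    return @work;    # one array ref of indexes per pthread
}

my @work = split_indexes(4, 2);
printf "pthread %d: indexes %s\n", $_, join(',', @{ $work[$_] })
    for 0 .. $#work;
# pthread 0: indexes 0,2
# pthread 1: indexes 1,3
```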

Limitations
   Must have POSIX Threads Enabled
       Auto-PThreading only works if your PDL installation was compiled with
       POSIX threads enabled. This is normally the case if you are running on
       Linux or other Unix variants.

   Non-Threadsafe Code
       Not all the libraries that PDL interfaces to are thread-safe, i.e. they
       aren't written to operate in a multi-threaded environment without
       crashing or causing side-effects. Some examples in the PDL core are the
       fft function and the pnmout functions.

       To operate properly with these types of functions, the PPCode flag
       NoPthread has been introduced to indicate a function as not being
       pthread-safe. See PDL::PP docs for details.

   Size of PDL Dimensions and PThread Target
       Due to the way a PDL is split up for operation using multiple pthreads,
       the size of a dimension must be evenly divisible by the pthread target.
       For example, if a PDL has threaded dimension sizes of (4,3,3) and the
       autopthread_targ has been set to 2, then the first threaded dimension
       (size 4) will be picked to be split up into two pthreads of size 2 and
       2. However, if the threaded dimension sizes are (3,3,3) and the
       autopthread_targ is still 2, then pthreading won't occur, because no
       threaded dimensions are divisible by 2.

       The algorithm that picks the actual number of pthreads has some smarts
       (but could probably be improved) to adjust down from the
       autopthread_targ to get a number of pthreads that can evenly divide
       one of the threaded dimensions. For example, if a PDL has threaded
       dimension sizes of (9,2,2) and the autopthread_targ is 4, the
       algorithm will see that no dimension is divisible by 4, then adjust
       down the target to 3, resulting in splitting up the first threaded
       dimension (size 9) into 3 pthreads.
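
       The three cases discussed in this section can be checked with the
       following plain-Perl sketch (no PDL required). The helper
       pick_pthreads is hypothetical and assumes the search simply steps down
       from the target until some threaded dimension divides evenly; the
       actual PDL implementation may differ in detail.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of PDL): choose the actual pthread count.
# Steps down from the target and returns the first count that evenly
# divides one of the threaded dimensions, as (count, dimension index).
# A count of 1 means that no pthreading will occur.
sub pick_pthreads {
    my ($target, @thread_dims) = @_;
    for (my $n = $target; $n > 1; $n--) {
        for my $i (0 .. $#thread_dims) {
            return ($n, $i) if $thread_dims[$i] % $n == 0;
        }
    }
    return (1, undef);
}

# Each case is [target, threaded dimension sizes...]
for my $case ([2, 4, 3, 3], [2, 3, 3, 3], [4, 9, 2, 2]) {
    my ($target, @dims) = @$case;
    my ($n, $i) = pick_pthreads($target, @dims);
    printf "target %d, dims (%s): %s\n", $target, join(',', @dims),
        $n > 1 ? "$n pthreads on dim $i (size $dims[$i])" : "no pthreading";
}
# target 2, dims (4,3,3): 2 pthreads on dim 0 (size 4)
# target 2, dims (3,3,3): no pthreading
# target 4, dims (9,2,2): 3 pthreads on dim 0 (size 9)
```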

   Speed Improvement Might Be Less Than You Expect
       If you have an 8-core machine and call set_autopthread_targ with 8 to
       generate 8 parallel pthreads, you probably won't get an 8X improvement
       in speed, due to memory bandwidth issues. Even though you have 8
       separate CPUs crunching away on data, you will have (for most common
       machine architectures) common RAM that now becomes your bottleneck. For
       simple calculations (e.g. simple additions) you can run into a
       performance limit at about 4 pthreads. For more complex calculations
       the limit will be higher.

COPYRIGHT
       Copyright 2011 John Cerney. You can distribute and/or modify this
       document under the same terms as the current Perl license.

       See: http://dev.perl.org/licenses/

perl v5.18.1			  2013-05-12			PARALLELCPU(1)