PDL::PP man page on Peanut

PDL::PP man page on Peanut
Man page or keyword search:
man Server 7435 pages
apropos Keyword Search (all sections)
Output format
PP(1)		      User Contributed Perl Documentation		 PP(1)

NAME
       PDL::PP - Generate PDL routines from concise descriptions

SYNOPSIS
       e.g.

	       pp_def(
		       'sumover',
		       Pars => 'a(n); [o]b();',
		       Code => 'double tmp=0;
			       loop(n) %{ tmp += $a(); %}
			       $b() = tmp;
			       '
	       );

	       pp_done();

DESCRIPTION
       In much of what follows we will assume familiarity of the reader with
       the concepts of implicit and explicit threading and index manipulations
       within PDL. If you have not yet heard of these concepts or are not very
       comfortable with them it is time to check PDL::Indexing.

       As you may appreciate from its name PDL::PP is a Pre-Processor, i.e.
       it expands code via substitutions to make real C-code (well, actually
       it outputs XS code (See perlxs) but that is very close to C).

Overview
       Why do we need PP? Several reasons: firstly, we want to be able to
       generate subroutine code for each of the PDL datatypes (PDL_Byte,
       PDL_Short,. etc).  AUTOMATICALLY.  Secondly, when referring to slices
       of PDL arrays in Perl (e.g. "$a->slice('0:10:2,:')" or other things
       such as transposes) it is nice to be able to do this transparently and
       to be able to do this 'in-place' - i.e, not to have to make a memory
       copy of the section. PP handles all the necessary element and offset
       arithmetic for you. There are also the notions of threading (repeated
       calling of the same routine for multiple slices, see PDL::Indexing) and
       dataflow (see PDL::Dataflow) which use of PP allows.

       So how do you use PP? Well for the most part you just write ordinary C
       code except for special PP constructs which take the form:

	  $something(something else)

       or:

	  PPfunction %{
	    <stuff>
	  %}

       The most important PP construct is the form "$array()". Consider the
       very simple PP function to sum the elements of a 1D vector (in fact
       this is very similar to the actual code used by 'sumover'):

	  pp_def('sumit',
		  Pars => 'a(n);  [o]b();',
		  Code => '
		       double tmp;
		       tmp = 0;
		       loop(n) %{
			 tmp += $a();
		       %}
		       $b() = tmp;
	  ');

       What's going on? The "Pars =>" line is very important for PP - it
       specifies all the arguments and their dimensionality. We call this the
       signature of the PP function (compare also the explanations in
       PDL::Indexing).	In this case the routine takes a 1-D function as input
       and returns a 0-D scalar as output.  The "$a()" PP construct is used to
       access elements of the array a(n) for you - PP fills in all the
       required C code.

       [Aside: since PP used "$var()" for its parsing you must single-quote
       all Code=> arguments since you don't want perl to interpolate "$var()"
       into another string - i.e. don't use "" unless you know what you are
       doing! Tjl: it's usually easiest to use single quotes and
       'something'.$interpolatable.'somethingelse']

       In the simple case here where all elements are accessed the PP
       construct "loop(n) %{ ... %}" is used to loop over all elements in
       dimension "n".  Note this feature of PP: ALL DIMENSIONS ARE SPECIFIED
       BY NAME.

       This is made clearer if we avoid the PP loop() construct and write the
       loop explicitly using conventional C:

	  pp_def('sumit',
		  Pars => 'a(n);  [o]b();',
		  Code => '
		       int i,n_size;
		       double tmp;
		       n_size = $SIZE(n);
		       tmp = 0;
		       for(i=0; i<n_size; i++) {
			 tmp += $a(n=>i);
		       }
		       $b() = tmp;
	  ');

       which does the same as before, except more long-windedly.  You can see
       to get element "i" of a() we say "$a(n=>i)" - we are specifying the
       dimension by name "n". In 2D we might say:

	  Pars=>'a(m,n);',
	     ...
	     tmp += $a(m=>i,n=>j);
	     ...

       The syntax 'm=>i' borrows from Perl hashes (which are in fact used in
       the implementation of PP). One could also say "$a(n=>j,m=>i)" as order
       is not important.

       You can also see in the above example the use of another PP construct -
       $SIZE(n) to get the length of the dimension "n".

       It should, however, be noted that you shouldn't write an explicit
       C-loop when you could have used the PP "loop" construct since PDL::PP
       checks automatically the loop limits for you, usage of "loop" makes the
       code more concise, etc. But there are certainly situations where you
       need explicit control of the loop and now you know how to do it ;).

       To revisit 'Why PP?' - the above code for sumit() will be generated for
       each data-type. It will operate on slices of arrays 'in-place'. It will
       thread automatically - e.g. if a 2D array is given it will be called
       repeatedly for each 1D row (again check PDL::Indexing for the details
       of threading).  And then b() will be a 1D array of sums of each row.
       We could call it with $a->xchg(0,1) to sum the colums instead.  And
       Dataflow tracing etc. will be available.

       You can see PP saves the programmer from writing a lot of needlessly
       repetitive C-code -- in our opinion this is one of the best features of
       PDL making writing new C subroutines for PDL an amazingly concise
       exercise. A second reason is the ability to make PP expand your concise
       code definitions into different C code based on the needs of the
       computer architecture in question. Imagine for example you are lucky to
       have a supercomputer at your hands; in that case you want PDL::PP
       certainly to generate code that takes advantage of the
       vectorising/parallel computing features of your machine (this a project
       for the future). In any case, the bottom line is that your unchanged
       code should still expand to working XS code even if the internals of
       PDL changed.

       Also, because you are generating the code in an actual Perl script,
       there are many fun things that you can do. Let's say that you need to
       write both sumit (as above) and multit. With a little bit of
       inventivity, we can do

	  for({Name => 'sumit', Init => '0', Op => '+='},
	      {Name => 'multit', Init => '1', Op => '*='}) {
		  pp_def($_->{Name},
			  Pars => 'a(n);  [o]b();',
			  Code => '
			       double tmp;
			       tmp = '.$_->{Init}.';
			       loop(n) %{
				 tmp '.$_->{Op}.' $a();
			       %}
			       $b() = tmp;
		  ');
	  }

       which defines both the functions easily. Now, if you later need to
       change the signature or dimensionality or whatever, you only need to
       change one place in your code.  Yeah, sure, your editor does have 'cut
       and paste' and 'search and replace' but it's still less bothersome and
       definitely more difficult to forget just one place and have strange
       bugs creep in.  Also, adding 'orit' (bitwise or) later is a one-liner.

       And remember, you really have perl's full abilities with you - you can
       very easily read any input file and make routines from the information
       in that file. For simple cases like the above, the author (Tjl)
       currently favors the hash syntax like the above - it's not too much
       more characters than the corresponding array syntax but much easier to
       understand and change.

       We should mention here also the ability to get the pointer to the
       beginning of the data in memory - a prerequisite for interfacing PDL to
       some libraries. This is handled with the "$P(var)" directive, see
       below.

       So, after this quick overview of the general flavour of programming PDL
       routines using PDL::PP let's summarise in which circumstances you
       should actually use this preprocessor/precompiler. You should use
       PDL::PP if you want to

       ·  interface PDL to some external library

       ·  write some algorithm that would be slow if coded in perl (this is
	  not as often as you think; take a look at threading and dataflow
	  first).

       ·  be a PDL developer (and even then it's not obligatory)

WARNING
       Because of its architecture, PDL::PP can be both flexible and easy to
       use (yet exuberantly complicated) at the same time. Currently, part of
       the problem is that error messages are not very informative and if
       something goes wrong, you'd better know what you are doing and be able
       to hack your way through the internals (or be able to figure out by
       trial and error what is wrong with your args to "pp_def").

       An alternative, of course, is to ask someone about it (e.g., through
       the mailing lists).

ABANDON ALL HOPE, YE WHO ENTER HERE (DESCRIPTION)
       Now that you have some idea how to use "pp_def" to define new PDL
       functions it is time to explain the general syntax of "pp_def".
       "pp_def" takes as arguments first the name of the function you are
       defining and then a hash list that can contain various keys.

       Based on these keys PP generates XS code and a .pm file. The function
       "pp_done" (see example in the SYNOPSIS) is used to tell PDL::PP that
       there are no more definitions in this file and it is time to generate
       the .xs and
	.pm file.

       As a consequence, there may be several pp_def() calls inside a file (by
       convention files with PP code have the extension .pd or .pp) but
       generally only one pp_done().

       There are two main different types of usage of pp_def(), the 'data
       operation' and 'slice operation' prototypes.

       The 'data operation' is used to take some data, mangle it and output
       some other data; this includes for example the '+' operation, matrix
       inverse, sumover etc and all the examples we have talked about in this
       document so far. Implicit and explicit threading and the creation of
       the result are taken care of automatically in those opeartions. You can
       even do dataflow with "sumit", "sumover", etc (don't be dismayed if you
       don't understand the concept of dataflow in PDL very well yet; it is
       still very much experimental).

       The 'slice operation' is a different kind of operation: in a slice
       operation, you are not changing any data, you are defining
       correspondences between different elements of two piddles (examples
       include the index manipulation/slicing function definitions in the file
       slices.pd that is part of the PDL distribution; but beware, this is not
       introductory level stuff).

       If PDL was compiled with support for bad values (ie "WITH_BADVAL =>
       1"), then additional keys are required for "pp_def", as explained
       below.

       If you are just interested in communicating with some external library
       (for example some linear algebra/matrix library), you'll usually want
       the 'data operation' so we are going to discuss that first.

Data operation
       A simple example

       In the data operation, you must know what dimensions of data you need.
       First, an example with scalars:

	       pp_def('add',
		       Pars => 'a(); b(); [o]c();',
		       Code => '$c() = $a() + $b();'
	       );

       That looks a little strange but let's dissect it. The first line is
       easy: we're defining a routine with the name 'add'.  The second line
       simply declares our parameters and the parentheses mean that they are
       scalars. We call the string that defines our parameters and their
       dimensionality the signature of that function. For its relevance with
       regard to threading and index manipulations check the PDL::Indexing
       manpage.

       The third line is the actual operation. You need to use the dollar
       signs and parentheses to refer to your parameters (this will probably
       change at some point in the future, once a good syntax is found).

       These lines are all that is necessary to actually define the function
       for PDL (well, actually it isn't; you aditionally need to write a
       Makefile.PL (see below) and build the module (something like 'perl
       Makefile.PL; make'); but let's ignore that for the moment). So now you
       can do

	       use MyModule;
	       $a = pdl 2,3,4;
	       $b = pdl 5;

	       $c = add($a,$b);
	       # or
	       add($a,$b,($c=null)); # Alternative form, useful if $c has been
				     # preset to something big, not useful here.

       and have threading work correctly (the result is $c == [7 8 9]).

       The Pars section: the signature of a PP function

       Seeing the above example code you will most probably ask: what is this
       strange "$c=null" syntax in the second call to our new "add" function?
       If you take another look at the definition of "add" you will notice
       that the third argument "c" is flagged with the qualifier "[o]" which
       tells PDL::PP that this is an output argument. So the above call to add
       means 'create a new $c from scratch with correct dimensions' - "null"
       is a special token for 'empty piddle' (you might ask why we haven't
       used the value "undef" to flag this instead of the PDL specific "null";
       we are currently thinking about it ;).

       [This should be explained in some other section of the manual as
       well!!]	The reason for having this syntax as an alternative is that if
       you have really huge piddles, you can do

	       $c = PDL->null;
	       for(some long loop) {
		       # munge a,b
		       add($a,$b,$c);
		       # munge c, put something back to a,b
	       }

       and avoid allocating and deallocating $c each time. It is allocated
       once at the first add() and thereafter the memory stays until $c is
       destroyed.

       If you just say

	 $c =  add($a,$b);

       the code generated by PP will automatically fill in "$c=null" and
       return the result. If you want to learn more about the reasons why
       PDL::PP supports this style where output arguments are given as last
       arguments check the PDL::Indexing manpage.

       "[o]" is not the only qualifier a pdl argument can have in the
       signature.  Another important qualifier is the "[t]" option which flags
       a pdl as temporary.  What does that mean? You tell PDL::PP that this
       pdl is only used for temporary results in the course of the calculation
       and you are not interested in its value after the computation has been
       completed. But why should PDL::PP want to know about this in the first
       place?  The reason is closely related to the concepts of pdl auto
       creation (you heard about that above) and implicit threading. If you
       use implicit threading the dimensionality of automatically created pdls
       is actually larger than that specified in the signature. With "[o]"
       flagged pdls will be created so that they have the additional
       dimensions as required by the number of implicit thread dimensions.
       When creating a temporary pdl, however, it will always only be made big
       enough so that it can hold the result for one iteration in a
       threadloop, i.e. as large as required by the signature.	So less memory
       is wasted when you flag a pdl as temporary. Secondly, you can use
       output auto creation with temporary pdls even when you are using
       explicit threading which is forbidden for normal output pdls flagged
       with "[o]" (see PDL::Indexing).

       Here is an example where we use the [t] qualifier. We define the
       function "callf" that calls a C routine "f" which needs a temporary
       array of the same size and type as the array "a" (sorry about the
       forward reference for $P; it's a pointer access, see below) :

	 pp_def('callf',
	       Pars => 'a(n); [t] tmp(n); [o] b()',
	       Code => 'int ns = $SIZE(n);
			f($P(a),$P(b),$P(tmp),ns);
		       '
	 );

       Argument dimensions and the signature

       Now we have just talked about dimensions of pdls and the signature. How
       are they related? Let's say that we want to add a scalar + the index
       number to a vector:

	       pp_def('add2',
		       Pars => 'a(n); b(); [o]c(n);',
		       Code => 'loop(n) %{
				       $c() = $a() + $b() + n;
				%}'
	       );

       There are several points to notice here: first, the "Pars" argument now
       contains the n arguments to show that we have a single dimensions in a
       and c. It is important to note that dimensions are actual entities that
       are accessed by name so this declares a and c to have the same first
       dimensions. In most PP definitions the size of named dimensions will be
       set from the respective dimensions of non-output pdls (those with no
       "[o]" flag) but sometimes you might want to set the size of a named
       dimension explicitly through an integer parameter. See below in the
       description of the "OtherPars" section how that works.

       Type conversions and the signature

       The signature also determines the type conversions that will be
       performed when a PP function is invoked. So what happens when we invoke
       one of our previously defined functions with pdls of different type,
       e.g.

	 add2($a,$b,($ret=null));

       where $a is of type "PDL_Float" and $b of type "PDL_Short"? With the
       signature as shown in the definition of "add2" above the datatype of
       the operation (as determined at runtime) is that of the pdl with the
       'highest' type (sequence is byte < short < ushort < long < float <
       double). In the add2 example the datatype of the operation is float ($a
       has that datatype). All pdl arguments are then type converted to that
       datatype (they are not converted inplace but a copy with the right type
       is created if a pdl argument doesn't have the type of the operation).
       Null pdls don't contribute a type in the determination of the type of
       the operation.  However, they will be created with the datatype of the
       operation; here, for example, $ret will be of type float. You should be
       aware of these rules when calling PP functions with pdls of different
       types to take the additional storage and runtime requirements into
       account.

       These type conversions are correct for most functions you normally
       define with "pp_def". However, there are certain cases where slightly
       modified type conversion behaviour is desired. For these cases
       additional qualifiers in the signature can be used to specify the
       desired properties with regard to type conversion. These qualifiers can
       be combined with those we have encountered already (the creation
       qualifiers "[o]" and "[t]"). Let's go through the list of qualifiers
       that change type conversion behaviour.

       The most important is the "int" qualifier which comes in handy when a
       pdl argument represents indices into another pdl. Let's take a look at
       an example from "PDL::Ufunc":

	  pp_def('maximum_ind',
		 Pars => 'a(n); int [o] b()',
		 Code => '$GENERIC() cur;
			  int curind;
			  loop(n) %{
			   if (!n || $a() > cur) {cur = $a(); curind = n;}
			  %}
			  $b() = curind;',
	  );

       The function "maximum_ind" finds the index of the largest element of a
       vector. If you look at the signature you notice that the output
       argument "b" has been declared with the additional "int" qualifier.
       This has the following consequences for type conversions: regardless of
       the type of the input pdl "a" the output pdl "b" will be of type
       "PDL_Long" which makes sense since "b" will represent an index into
       "a". Furthermore, if you call the function with an existing output pdl
       "b" its type will not influence the datatype of the operation (see
       above). Hence, even if "a" is of a smaller type than "b" it will not be
       converted to match the type of "b" but stays untouched, which saves
       memory and CPU cycles and is the right thing to do when "b" represents
       indices. Also note that you can use the 'int' qualifier together with
       other qualifiers (the "[o]" and "[t]" qualifiers). Order is significant
       -- type qualifiers precede creation qualifiers ("[o]" and "[t]").

       The above example also demonstrates typical usage of the "$GENERIC()"
       macro.  It expands to the current type in a so called generic loop.
       What is a generic loop? As you already heard a PP function has a
       runtime datatype as determined by the type of the pdl arguments it has
       been invoked with.  The PP generated XS code for this function
       therefore contains a switch like "switch (type) {case PDL_Byte: ...
       case PDL_Double: ...}" that selects a case based on the runtime
       datatype of the function (it's called a type ``loop'' because there is
       a loop in PP code that generates the cases).  In any case your code is
       inserted once for each PDL type into this switch statement. The
       "$GENERIC()" macro just expands to the respective type in each copy of
       your parsed code in this "switch" statement, e.g., in the "case
       PDL_Byte" section "cur" will expand to "PDL_Byte" and so on for the
       other case statements. I guess you realise that this is a useful macro
       to hold values of pdls in some code.

       There are a couple of other qualifiers with similar effects as "int".
       For your convenience there are the "float" and "double" qualifiers with
       analogous consequences on type conversions as "int". Let's assume you
       have a very large array for which you want to compute row and column
       sums with an equivalent of the "sumover" function.  However, with the
       normal definition of "sumover" you might run into problems when your
       data is, e.g. of type short. A call like

	 sumover($large_pdl,($sums = null));

       will result in $sums be of type short and is therefore prone to
       overflow errors if $large_pdl is a very large array. On the other hand
       calling

	 @dims = $large_pdl->dims; shift @dims;
	 sumover($large_pdl,($sums = zeroes(double,@dims)));

       is not a good alternative either. Now we don't have overflow problems
       with $sums but at the expense of a type conversion of $large_pdl to
       double, something bad if this is really a large pdl. That's where
       "double" comes in handy:

	 pp_def('sumoverd',
		Pars => 'a(n); double [o] b()',
		Code => 'double tmp=0;
			 loop(n) %{ tmp += a(); %}
			 $b() = tmp;',
	 );

       This gets us around the type conversion and overflow problems. Again,
       analogous to the "int" qualifier "double" results in "b" always being
       of type double regardless of the type of "a" without leading to a
       typeconversion of "a" as a side effect.

       Finally, there are the "type+" qualifiers where type is one of "int" or
       "float". What shall that mean. Let's illustrate the "int+" qualifier
       with the actual definition of sumover:

	 pp_def('sumover',
		Pars => 'a(n); int+ [o] b()',
		Code => '$GENERIC(b) tmp=0;
			 loop(n) %{ tmp += a(); %}
			 $b() = tmp;',
	 );

       As we had already seen for the "int", "float" and "double" qualifiers,
       a pdl marked with a "type+" qualifier does not influence the datatype
       of the pdl operation. Its meaning is "make this pdl at least of type
       "type" or higher, as required by the type of the operation". In the
       sumover example this means that when you call the function with an "a"
       of type PDL_Short the output pdl will be of type PDL_Long (just as
       would have been the case with the "int" qualifier). This again tries to
       avoid overflow problems when using small datatypes (e.g. byte images).
       However, when the datatype of the operation is higher than the type
       specified in the "type+" qualifier "b" will be created with the
       datatype of the operation, e.g. when "a" is of type double then "b"
       will be double as well. We hope you agree that this is sensible
       behaviour for "sumover". It should be obvious how the "float+"
       qualifier works by analogy.  It may become necessary to be able to
       specify a set of alternative types for the parameters. However, this
       will probably not be implemented until someone comes up with a
       reasonable use for it.

       Note that we now had to specify the $GENERIC macro with the name of the
       pdl to derive the type from that argument. Why is that? If you
       carefully followed our explanations you will have realised that in some
       cases "b" will have a different type than the type of the operation.
       Calling the '$GENERIC' macro with "b" as argument makes sure that the
       type will always the same as that of "b" in that part of the generic
       loop.

       This is about all there is to say about the "Pars" section in a
       "pp_def" call. You should remember that this section defines the
       signature of a PP defined function, you can use several options to
       qualify certain arguments as output and temporary args and all
       dimensions that you can later refer to in the "Code" section are
       defined by name.

       It is important that you understand the meaning of the signature since
       in the latest PDL versions you can use it to define threaded functions
       from within perl, i.e. what we call perl level threading. Please check
       PDL::Indexing for details.

       The Code section

       The "Code" section contains the actual XS code that will be in the
       innermost part of a threadloop (if you don't know what a thread loop is
       then you still haven't read PDL::Indexing; do it now ;) after any PP
       macros (like $GENERIC) and PP functions have been expanded (like the
       "loop" function we are going to explain next).

       Let's quickly reiterate the "sumover" example:

	 pp_def('sumover',
		Pars => 'a(n); int+ [o] b()',
		Code => '$GENERIC(b) tmp=0;
			 loop(n) %{ tmp += a(); %}
			 $b() = tmp;',
	 );

       The "loop" construct in the "Code" section also refers to the dimension
       name so you don't need to specify any limits: the loop is correctly
       sized and everything is done for you, again.

       Next, there is the surprising fact that "$a()" and "$b()" do not
       contain the index. This is not necessary because we're looping over n
       and both variables know which dimensions they have so they
       automatically know they're being looped over.

       This feature comes in very handy in many places and makes for much
       shorter code. Of course, there are times when you want to circumvent
       this; here is a function which symmetrizes a matrix and serves as an
       example of how to code explicit looping:

	       pp_def('symm',
		       Pars => 'a(n,n); [o]c(n,n);',
		       Code => 'loop(n) %{
				       int n2;
				       for(n2=n; n2<$SIZE(n); n2++) {
					       $c(n0 => n, n1 => n2) =
					       $c(n0 => n2, n1 => n) =
						$a(n0 => n, n1 => n2);
				       }
			       %}
		       '
	       );

       Let's dissect what is happening. Firstly, what is this function
       supposed to do? From its signature you see that it takes a 2D matrix
       with equal numbers of columns and rows and outputs a matrix of the same
       size. From a given input matrix $a it computes a symmetric output
       matrix $c (symmetric in the matrix sense that A^T = A where ^T means
       matrix transpose, or in PDL parlance $c == $c->xchg(0,1)). It does this
       by using only the values on and below the diagonal of $a. In the output
       matrix $c all values on and below the diagonal are the same as those in
       $a while those above the diagonal are a mirror image of those below the
       diagonal (above and below are here interpreted in the way that PDL
       prints 2D pdls). If this explanation still sounds a bit strange just go
       ahead, make a little file into which you write this definition, build
       the new PDL extension (see section on Makefiles for PP code) and try it
       out with a couple of examples.

       Having explained what the function is supposed to do there are a couple
       of points worth noting from the syntactical point of view. First, we
       get the size of the dimension named "n" again by using the $SIZE macro.
       Second, there are suddenly these funny "n0" and "n1" index names in the
       code though the signature defines only the dimension "n". Why this? The
       reason becomes clear when you note that both the first and second
       dimension of $a and $b are named "n" in the signature of "symm". This
       tells PDL::PP that the first and second dimension of these arguments
       should have the same size. Otherwise the generated function will raise
       a runtime error.	 However, now in an access to $a and $c PDL::PP cannot
       figure out which index "n" refers to any more just from the name of the
       index.  Therefore, the indices with equal dimension names get numbered
       from left to right starting at 0, e.g. in the above example "n0" refers
       to the first dimension of $a and $c, "n1" to the second and so on.

       In all examples so far, we have only used the "Pars" and "Code" members
       of the hash that was passed to "pp_def". There are certainly other keys
       that are recognised by PDL::PP and we will hear about some of them in
       the course of this document. Find a (non-exhaustive) list of keys in
       Appendix A.  A list of macros and PPfunctions (we have only encountered
       some of those in the examples above yet) that are expanded in values of
       the hash argument to "pp_def" is summarised in Appendix B.

       At this point, it might be appropriate to mention that PDL::PP is not a
       completely static, well designed set of routines (as Tuomas puts it:
       "stop thinking of PP as a set of routines carved in stone") but rather
       a collection of things that the PDL::PP author (Tuomas J. Lukka)
       considered he would have to write often into his PDL extension
       routines. PP tries to be expandable so that in the future, as new needs
       arise, new common code can be abstracted back into it. If you want to
       learn more on why you might want to change PDL::PP and how to do it
       check the section on PDL::PP internals.

       Handling bad values

       If you do not have bad-value support compiled into PDL you can ignore
       this section and the related keys: "BadCode", "HandleBad", ...  (try
       printing out the value of $PDL::Bad::Status - if it equals 0 then move
       straight on).

       There are several keys and macros used when writing code to handle bad
       values. The first one is the "HandleBad" key:

       HandleBad => 0
	   This flags a pp-routine as NOT handling bad values. If this routine
	   is sent piddles with their "badflag" set, then a warning message is
	   printed to STDOUT and the piddles are processed as if the value
	   used to represent bad values is a valid number. The "badflag" value
	   is not propogated to the output piddles.

	   An example of when this is used is for FFT routines, which
	   generally do not have a way of ignoring part of the data.

       HandleBad => 1
	   This causes PDL::PP to write extra code that ensures the BadCode
	   section is used, and that the "$ISBAD()" macro (and its brethren)
	   work.

       HandleBad is not given
	   If any of the input piddles have their "badflag" set, then the
	   output piddles will have their "badflag" set, but any supplied
	   BadCode is ignored.

       The value of "HandleBad" is used to define the contents of the "BadDoc"
       key, if it is not given.

       To handle bad values, code must be written somewhat differently; for
       instance,

	$c() = $a() + $b();

       becomes something like

	if ( $a() != BADVAL && $b() != BADVAL ) {
	   $c() = $a() + $b();
	} else {
	   $c() = BADVAL;
	}

       However, we only want the second version if bad values are present in
       the input piddles (and that bad-value support is wanted!) - otherwise
       we actually want the original code. This is where the "BadCode" key
       comes in; you use it to specify the code to execute if bad values may
       be present, and PP uses both it and the "Code" section to create
       something like:

	if ( bad_values_are_present ) {
	   fancy_threadloop_stuff {
	      BadCode
	   }
	} else {
	   fancy_threadloop_stuff {
	      Code
	   }
	}

       This approach means that there is virtually no overhead when bad values
       are not present (ie the badflag routine returns 0).

       The BadCode section can use the same macros and looping constructs as
       the Code section.  However, it wouldn't be much use without the
       following additional macros:

       $ISBAD(var)
	   To check whether a piddle's value is bad, use the $ISBAD macro:

	    if ( $ISBAD(a()) ) { printf("a() is bad\n"); }

	   You can also access given elements of a piddle:

	    if ( $ISBAD(a(n=>l)) ) { printf("element %d of a() is bad\n", l); }

       $ISGOOD(var)
	   This is the opposite of the $ISBAD macro.

       $SETBAD(var)
	   For when you want to set an element of a piddle bad.

       $ISBADVAR(c_var,pdl)
	   If you have cached the value of a piddle "$a()" into a c-variable
	   ("foo" say), then to check whether it is bad, use
	   "$ISBADVAR(foo,a)".

       $ISGOODVAR(c_var,pdl)
	   As above, but this time checking that the cached value isn't bad.

       $SETBADVAR(c_var,pdl)
	   To copy the bad value for a piddle into a c variable, use
	   "$SETBADVAR(foo,a)".

       TODO: mention "$PPISBAD()" etc macros.

       Using these macros, the above code could be specified as:

	Code => '$c() = $a() + $b();',
	BadCode => '
	   if ( $ISBAD(a()) || $ISBAD(b()) ) {
	      $SETBAD(c());
	   } else {
	      $c() = $a() + $b();
	   }',

       Since this is perl, TMTOWTDI, so you could also write:

	BadCode => '
	   if ( $ISGOOD(a()) && $ISGOOD(b()) ) {
	      $c() = $a() + $b();
	   } else {
	      $SETBAD(c());
	   }',

       If you want access to the value of the badflag for a given piddle, you
       can use the "$PDLSTATExxxx()" macros:

       $PDLSTATEISBAD(pdl)
       $PDLSTATEISGOOD(pdl)
       $PDLSTATESETBAD(pdl)
       $PDLSTATESETGOOD(pdl)

       TODO: mention the "FindBadStatusCode" and "CopyBadStatusCode" options
       to "pp_def", as well as the "BadDoc" key.

       Interfacing your own/library functions using PP

       Now, consider the following: you have your own C function (that may in
       fact be part of some library you want to interface to PDL) which takes
       as arguments two pointers to vectors of double:

	       void myfunc(int n,double *v1,double *v2);

       The correct way of defining the PDL function is

	       pp_def('myfunc',
		       Pars => 'a(n); [o]b(n);',
		       GenericTypes => [D],
		       Code => 'myfunc($SIZE(n),$P(a),$P(b));'
	       );

       The "$P("par")" syntax returns a pointer to the first element and the
       other elements are guaranteed to lie after that.

       Notice that here it is possible to make many mistakes. First, $SIZE(n)
       must be used instead of "n". Second, you shouldn't put any loops in
       this code. Third, here we encounter a new hash key recognised by
       PDL::PP : the "GenericTypes" declaration tells PDL::PP to ONLY GENERATE
       THE TYPELOOP FOP THE LIST OF TYPES SPECIFIED. In this case "double".
       This has two advantages. Firstly the size of the compiled code is
       reduced vastly, secondly if non-double arguments are passed to
       "myfunc()" PDL will automatically convert them to double before passing
       to the external C routine and convert them back afterwards.

       One can also use "Pars" to qualify the types of individual arguments.
       Thus one could also write this as:

	       pp_def('myfunc',
		       Pars => 'double a(n); double [o]b(n);',
		       Code => 'myfunc($SIZE(n),$P(a),$P(b));'
	       );

       The type specification in "Pars" exempts the argument from variation in
       the typeloop - rather it is automatically converted too and from the
       type specified. This is obviously useful in a more general example,
       e.g.:

	       void myfunc(int n,float *v1,long *v2);

	       pp_def('myfunc',
		       Pars => 'float a(n); long [o]b(n);',
		       GenericTypes => [F],
		       Code => 'myfunc($SIZE(n),$P(a),$P(b));'
	       );

       Note we still use "GenericTypes" to reduce the size of the type loop,
       obviously PP could in principle spot this and do it automatically
       though the code has yet to attain that level of sophistication!

       Finally note when types are converted automatically one MUST use the
       "[o]" qualifier for output variables or you hard one changes will get
       optimised away by PP!

       If you interface a large library you can automate the interfacing even
       further. Perl can help you again(!) in doing this. In many libraries
       you have certain calling conventions. This can be exploited. In short,
       you can write a little parser (which is really not difficult in perl)
       that then generates the calls to "pp_def" from parsed descriptions of
       the functions in that library. For an example, please check the Slatec
       interface in the "Lib" tree of the PDL distribution. If you want to
       check (during debugging) which calls to PP functions your perl code
       generated a little helper package comes in handy which replaces the PP
       functions by identically named ones that dump their arguments to
       stdout.

       Just say

	  perl -MPDL::PP::Dump myfile.pd

       to see the calls to "pp_def" and friends. Try it with ops.pd and
       slatec.pd. If you're interested (or want to enhance it), the source is
       in Basic/Gen/PP/Dump.pm

       Other macros and functions in the Code section

       Macros: So far we have encountered the $SIZE, $GENERIC and $P macros.
       Now we are going to quickly explain the other macros that are expanded
       in the "Code" section of PDL::PP along with examples of their usage.

       $T The $T macro is used for type switches. This is very useful when you
	  have to use different external (e.g. library) functions depending on
	  the input type of arguments. The general syntax is

		  $Ttypeletters(type_alternatives)

	  where "typeletters" is a permutation of a subset of the letters
	  "BSULFD" which stand for Byte, Short, Ushort, etc. and
	  "type_alternatives" are the expansions when the type of the PP
	  operation is equal to that indicated by the respective letter. Let's
	  illustrate this incomprehensible description by an example. Assuming
	  you have two C functions with prototypes

	    void float_func(float *in, float *out);
	    void double_func(double *in, double *out);

	  which do basically the same thing but one accepts float and the
	  other double pointers. You could interface them to PDL by defining a
	  generic function "foofunc" (which will call the correct function
	  depending on the type of the transformation):

	    pp_def('foofunc',
		  Pars => ' a(n); [o] b();',
		  Code => ' $TFD(float_func,double_func) ($P(a),$P(b));'
		  GenericTypes => [F,D],
	    );

	  Please note that you can't say

		 Code => ' $TFD(float,double)_func ($P(a),$P(b));'

	  since the $T macro expands with trailing spaces, analogously to C
	  preprocessor macros.	The slightly longer form illustrated above is
	  correct.  If you really want brevity, you can of course do

		  '$TBSULFD('.(join ',',map {"long_identifier_name_$_"}
			  qw/byt short unseigned lounge flotte dubble/).');'

       $PP
	  The $PP macro is used for a so called physical pointer access. The
	  physical refers to some internal optimisations of PDL (for those who
	  are familiar with the PDL core we are talking about the vaffine
	  optimisations). This macro is mainly for internal use and you
	  shouldn't need to use it in any of your normal code.

       $COMP (and the "OtherPars" section)
	  The $COMP macro is used to access non-pdl values in the code
	  section. Its name is derived from the implementation of
	  transformations in PDL. The variables you can refer to using $COMP
	  are members of the ``compiled'' structure that represents the PDL
	  transformation in question but does not yet contain any information
	  about dimensions (for further details check PDL::Internals).
	  However, you can treat $COMP just as a black box without knowing
	  anything about the implementation of transformations in PDL. So when
	  would you use this macro? Its main usage is to access values of
	  arguments that are declared in the "OtherPars" section of a "pp_def"
	  definition. But then you haven't heard about the "OtherPars" key
	  yet?!	 Let's have another example that illustrates typical usage of
	  both new features:

	    pp_def('pnmout',
		  Pars => 'a(m)',
		  OtherPars => "char* fd",
		  GenericTypes => [B,U,S,L],
		  Code => 'PerlIO *fp;
			   IO *io;

			 io = GvIO(gv_fetchpv($COMP(fd),FALSE,SVt_PVIO));
			   if (!io || !(fp = IoIFP(io)))
				  croak("Can\'t figure out FP");

			   if (PerlIO_write(fp,$P(a),len) != len)
					  croak("Error writing pnm file");
	    ');

	  This function is used to write data from a pdl to a file. The file
	  descriptor is passed as a string into this function. This parameter
	  does not go into the "Pars" section since it cannot be usefully
	  treated like a pdl but rather into the aptly named "OtherPars"
	  section. Parameters in the "OtherPars" section follow those in the
	  "Pars" section when invoking the function, i.e.

	     open FILE,">out.dat" or die "couldn't open out.dat";
	     pnmout($pdl,'FILE');

	  When you want to access this parameter inside the code section you
	  have to tell PP by using the $COMP macro, i.e. you write "$COMP(fd)"
	  as in the example. Otherwise PP wouldn't know that the "fd" you are
	  referring to is the same as that specified in the "OtherPars"
	  section.

	  Another use for the "OtherPars" section is to set a named dimension
	  in the signature. Let's have an example how that is done:

	    pp_def('setdim',
		  Pars => '[o] a(n)',
		  OtherPars => 'int ns => n',
		  Code => 'loop(n) %{ $a() = n; %}',
	    );

	  This says that the named dimension "n" will be initialised from the
	  value of the other parameter "ns" which is of integer type (I guess
	  you have realised that we use the "CType From => named_dim" syntax).
	  Now you can call this function in the usual way:

	    setdim(($a=null),5);
	    print $a;
	      [ 0 1 2 3 4 ]

	  Admittedly this function is not very useful but it demonstrates how
	  it works. If you call the function with an existing pdl and you
	  don't need to explicitly specify the size of "n" since PDL::PP can
	  figure it out from the dimensions of the non-null pdl. In that case
	  you just give the dimension parameter as "-1":

	    $a = hist($b);
	    setdim($a,-1);

	  That should do it.

       The only PP function that we have used in the examples so far is
       "loop".	Additionally, there are currently two other functions which
       are recognised in the "Code" section:

       threadloop
	 As we heard above the signature of a PP defined function defines the
	 dimensions of all the pdl arguments involved in a primitive
	 operation.  However, you often call the functions that you defined
	 with PP with pdls that have more dimensions than those specified in
	 the signature. In this case the primitive operation is performed on
	 all subslices of appropriate dimensionality in what is called a
	 threadloop (see also overview above and PDL::Indexing). Assuming you
	 have some notion of this concept you will probably appreciate that
	 the operation specified in the code section should be optimised since
	 this is the tightest loop inside a threadloop.	 However, if you
	 revisit the example where we define the "pnmout" function, you will
	 quickly realise that looking up the "IO" file descriptor in the inner
	 threadloop is not very efficient when writing a pdl with many rows. A
	 better approach would be to look up the "IO" descriptor once outside
	 the threadloop and use its value then inside the tightest threadloop.
	 This is exactly where the "threadloop" function comes in handy. Here
	 is an improved definition of "pnmout" which uses this function:

	   pp_def('pnmout',
		 Pars => 'a(m)',
		 OtherPars => "char* fd",
		 GenericTypes => [B,U,S,L],
		 Code => 'PerlIO *fp;
			  IO *io;
			  int len;

			io = GvIO(gv_fetchpv($COMP(fd),FALSE,SVt_PVIO));
			  if (!io || !(fp = IoIFP(io)))
				 croak("Can\'t figure out FP");

			  len = $SIZE(m) * sizeof($GENERIC());

			  threadloop %{
			     if (PerlIO_write(fp,$P(a),len) != len)
					 croak("Error writing pnm file");
			  %}
	   ');

	 This works as follows. Normally the C code you write inside the
	 "Code" section is placed inside a threadloop (i.e., PP generates the
	 appropriate wrapping XS code around it). However, when you explicitly
	 use the "threadloop" function, PDL::PP recognises this and doesn't
	 wrap your code with an additional threadloop. This has the effect
	 that code you write outside the threadloop is only executed once per
	 transformation and just the code with in the surrounding "%{ ... %}"
	 pair is placed within the tightest threadloop. This also comes in
	 handy when you want to perform a decision (or any other code,
	 especially CPU intensive code) only once per thread, i.e.

	   pp_addhdr('
	     #define RAW 0
	     #define ASCII 1
	   ');
	   pp_def('do_raworascii',
		  Pars => 'a(); b(); [o]c()',
		  OtherPars => 'int mode',
		Code => ' switch ($COMP(mode)) {
			     case RAW:
				 threadloop %{
				     /* do raw stuff */
				 %}
				 break;
			     case ASCII:
				 threadloop %{
				     /* do ASCII stuff */
				 %}
				 break;
			     default:
				 croak("unknown mode");
			    }'
	    );

       types
	 The types function works similar to the $T macro. However, with the
	 "types" function the code in the following block (delimited by "%{"
	 and "%}" as usual) is executed for all those cases in which the
	 datatype of the operation is any of the types represented by the
	 letters in the argument to "type", e.g.

	      Code => '...

		      types(BSUL) %{
			  /* do integer type operation */
		      %}
		      types(FD) %{
			  /* do floating point operation */
		      %}
		      ...'

       Typemap handling in the "OtherPars" section

       The "OtherPars" section discussed above is very often absolutely
       crucial when you interface external libraries with PDL. However in many
       cases the external libraries either use derived types or pointers of
       various types.

       The standard way to handle this in Perl is to use a "typemap" file.
       This is discussed in some detail in perlxs in the standard Perl
       documentation. In PP the functionality is very similar, so you can
       create a "typemap" file in the directory where your PP file resides and
       when it is built it is automatically read in to figure out the
       appropriate translation between the C type and Perl's built-in type.

       That said, there are a couple of important differences from the general
       handling of types in XS. The first, and probably most important, is
       that at the moment pointers to types are not allowed in the "OtherPars"
       section. To get around this limitation you must use the "IV" type
       (thanks to Judd Taylor for pointing out that this is necessary for
       portability).

       It is probably best to illustrate this with a couple of code-snippets:

       For instance the "gsl_spline_init" function has the following C
       declaration:

	   int	gsl_spline_init(gsl_spline * spline,
		 const double xa[], const double ya[], size_t size);

       Clearly the "xa" and "ya" arrays are candidates for being passed in as
       piddles and the "size" argument is just the length of these piddles so
       that can be handled by the "$SIZE()" macro in PP. The problem is the
       pointer to the "gsl_spline" type. The natural solution would be to
       write an "OtherPars" declaration of the form

	   OtherPars => 'gsl_spline *spl'

       and write a short "typemap" file which handled this type. This does not
       work at present however! So what you have to do is to go around the
       problem slightly (and in some ways this is easier too!):

       The solution is to declare "spline" in the "OtherPars" section using an
       "Integer Value", "IV". This hides the nature of the variable from PP
       and you then need to (well to avoid compiler warnings at least!)
       perform a type cast when you use the variable in your code. Thus
       "OtherPars" should take the form:

	   OtherPars => 'IV spl'

       and when you use it in the code you will write

	   (gsl_spline *) $COMP(spl)

       so putting this together as Andres Jordan has done (with the
       modification using "IV" by Judd Taylor) in the "gsl_interp.pd" in the
       distribution source you get:

	    pp_def('init_meat',
		   Pars => 'double x(n); double y(n);',
		   OtherPars => 'IV spl',
		   Code =>'
		gsl_spline_init( (gsl_spline *) $COMP(spl), $P(x),$P(y),$SIZE(n));'
	   );

       where I have removed a macro wrapper call, but that would obscure the
       discussion.

       The other minor difference as compared to the standard typemap handling
       in Perl, is that the user cannot specify non-standard typemap locations
       or typemap filenames using the "TYPEMAPS" option in MakeMaker... Thus
       you can only use a file called "typemap" and/or the "IV" trick above.

       Other useful PP keys in data operation definitions

       You have already heard about the "OtherPars" key. Currently, there are
       not many other keys for a data operation that will be useful in normal
       (whatever that is) PP programming. In fact, it would be interesting to
       hear about a case where you think you need more than what is provided
       at the moment.  Please speak up on one of the PDL mailing lists. Most
       other keys recognised by "pp_def" are only really useful for what we
       call slice operations (see also above).

       One thing that is strongly being planned is variable number of
       arguments, which will be a little tricky.

       An incomplete list of the available keys:

       Inplace
	   Setting this key marks the routine as working inplace - ie the
	   input and output piddles are the same. An example is
	   "$a->inplace->sqrt()" (or "sqrt(inplace($a))").

	   Inplace => 1
	       Use when the routine is a unary function, such as "sqrt".

	   Inplace => ['a']
	       If there are more than one input piddles, specify the name of
	       the one that can be changed inplace using an array reference.

	   Inplace => ['a','b']
	       If there are more than one output piddle, specify the name of
	       the input piddle and output piddle in a 2-element array
	       reference. This probably isn't needed, but left in for
	       completeness.

	   If bad values are being used, care must be taken to ensure the
	   propogation of the badflag when inplace is being used; consider
	   this excerpt from Basic/Bad/bad.pd:

	     pp_def('replacebad',HandleBad => 1,
	       Pars => 'a(); [o]b();',
	       OtherPars => 'double newval',
	       Inplace => 1,
	       CopyBadStatusCode =>
	       '/* propogate badflag if inplace AND it has changed */
		if ( a == b && $ISPDLSTATEBAD(a) )
		  PDL->propogate_badflag( b, 0 );

		/* always make sure the output is "good" */
		$SETPDLSTATEGOOD(b);
	       ',
	       ...

	   Since this routine removes all bad values, then the output piddle
	   had its bad flag cleared. If run inplace (so "a == b"), then we
	   have to tell all the children of "a" that the bad flag has been
	   cleared (to save time we make sure that we call
	   "PDL->propogate_badgflag" only if the input piddle had its bad flag
	   set).

	   NOTE: one idea is that the documentation for the routine could be
	   automatically flagged to indicate that it can be executed inplace,
	   ie something similar to how "HandleBad" sets "BadDoc" if it's not
	   supplied (it's not an ideal solution).

       Other PDL::PP functions to support concise package definition

       So far, we have described the "pp_def" and "pp_done" functions. PDL::PP
       exports a few other functions to aid you in writing concise PDL
       extension package definitions.

       Often when you interface library functions as in the above example you
       have to include additional C include files. Since the XS file is
       generated by PP we need some means to make PP insert the appropriate
       include directives in the right place into the generated XS file.  To
       this end there is the "pp_addhdr" function. This is also the function
       to use when you want to define some C functions for internal use by
       some of the XS functions (which are mostly functions defined by
       "pp_def").  By including these functions here you make sure that
       PDL::PP inserts your code before the point where the actual XS module
       section begins and will therefore be left untouched by xsubpp (cf.
       perlxs and perlxstut manpages).

       A typical call would be

	 pp_addhdr('
	 #include <unistd.h>	   /* we need defs of XXXX */
	 #include "libprotos.h"	   /* prototypes of library functions */
	 #include "mylocaldecs.h"  /* Local decs */

	 static void do_the real_work(PDL_Byte * in, PDL_Byte * out, int n)
	 {
	       /* do some calculations with the data */
	 }
	 ');

       This ensures that all the constants and prototypes you need will be
       properly included and that you can use the internal functions defined
       here in the "pp_def"s, e.g.:

	 pp_def('barfoo',
		Pars => ' a(n); [o] b(n)',
		GenericTypes => '[B]',
		Code => ' int ns = $SIZE(n);
			  do_the_real_work($P(a),$P(b),ns);
			',
	 );

       In many cases the actual PP code (meaning the arguments to "pp_def"
       calls) is only part of the package you are currently implementing.
       Often there is additional perl code and XS code you would normally have
       written into the pm and XS files which are now automatically generated
       by PP. So how to get this stuff into those dynamically generated files?
       Fortunately, there are a couple of functions, generally called
       "pp_addXXX" that assist you in doing this.

       Let's assume you have additional perl code that should go into the
       generated pm-file. This is easily achieved with the "pp_addpm" command:

	  pp_addpm(<<'EOD');

	  =head1 NAME

	  PDL::Lib::Mylib -- a PDL interface to the Mylib library

	  =head1 DESCRIPTION

	  This package implements an interface to the Mylib package with full
	  threading and indexing support (see L<PDL::Indexing>).

	  =cut

	  use PGPLOT;

	  =head2 use_myfunc
	       this function applies the myfunc operation to all the
	       elements of the input pdl regardless of dimensions
	       and returns the sum of the result
	  =cut

	  sub use_myfunc {
	       my $pdl = shift;

	       myfunc($pdl->clump(-1),($res=null));

	       return $res->sum;
	  }

	  EOD

       You have probably got the idea. In some cases you also want to export
       your additional functions. To avoid getting into trouble with PP which
       also messes around with the @EXPORT array you just tell PP to add your
       functions to the list of exported functions:

	 pp_add_exported('', 'use_myfunc gethynx');

       Note the initial empty string argument (reason for it?).

       The "pp_add_isa" command works like the the "pp_add_exported" function.
       The arguments to "pp_add_isa" are added the @ISA list, e.g.

	 pp_add_isa(' Some::Other::Class ');

       Sometimes you want to add extra XS code of your own (that is generally
       not involved with any threading/indexing issues but supplies some other
       functionality you want to access from the perl side) to the generated
       XS file, for example

	 pp_addxs('','

	 # Determine endianness of machine

	 int
	 isbigendian()
	    CODE:
	      unsigned short i;
	      PDL_Byte *b;

	      i = 42; b = (PDL_Byte*) (void*) &i;

	      if (*b == 42)
		 RETVAL = 0;
	      else if (*(b+1) == 42)
		 RETVAL = 1;
	      else
		 croak("Impossible - machine is neither big nor little endian!!\n");
	      OUTPUT:
		RETVAL
	 ');

       Especially "pp_add_exported" and "pp_addxs" should be used with care.
       PP uses PDL::Exporter, hence letting PP export your function means that
       they get added to the standard list of function exported by default
       (the list defined by the export tag ``:Func''). If you use "pp_addxs"
       you shouldn't try to do anything that involves threading or indexing
       directly. PP is much better at generating the appropriate code from
       your definitions.

       Finally, you may want to add some code to the BOOT section of the XS
       file (if you don't know what that is check perlxs). This is easily done
       with the "pp_add_boot" command:

	 pp_add_boot(<<EOB);
	       descrip = mylib_initialize(KEEP_OPEN);

	       if (descrip == NULL)
		  croak("Can't initialize library");

	       GlobalStruc->descrip = descrip;
	       GlobalStruc->maxfiles = 200;
	 EOB

       By default, PP.pm puts all subs defined using the pp_def function into
       the output .pm file's EXPORT list. This can create problems if you are
       creating a subclassed object where you don't want any methods exported.
       (i.e. the methods will only be called using the $object->method
       syntax).

       For these cases you can call pp_export_nothing() to clear out the
       export list. Example (At the end of the .pd file):

	 pp_export_nothing();
	 pp_done();

       By default, PP.pm puts the 'use Core;' line into the output .pm file.
       This imports Core's exported names into the current namespace, which
       can create problems if you are over-riding one of Core's methods in the
       current file.  You end up getting messages like "Warning: sub sumover
       redefined in file subclass.pm" when running the program.

       For these cases the pp_core_importList can be used to change what is
       imported from Core.pm.  For example:

	 pp_core_importList('()')

       This would result in

	 use Core();

       being generated in the output .pm file. This would result in no names
       being imported from Core.pm. Similarly, calling

	 pp_core_importList(' qw/ barf /')

       would result in

	 use Core qw/ barf/;

       being generated in the output .pm file. This would result in just
       'barf' being imported from Core.pm.

Slice operation
       The slice operation section of this manual is provided using dataflow
       and lazy evaluation: when you need it, ask Tjl to write it.  a delivery
       in a week from when I receive the email is 95% probable and two week
       delivery is 99% probable.

       And anyway, the slice operations require a much more intimate knowledge
       of PDL internals than the data operations. Furthermore, the complexity
       of the issues involved is considerably higher than that in the average
       data operation. If you would like to convince yourself of this fact
       take a look at the Basic/Slices/slices.pd file in the PDL distribution
       :-). Nevertheless, functions generated using the slice operations are
       at the heart of the index manipulation and dataflow capabilities of
       PDL.

       Also, there are a lot of dirty issues with virtual piddles and vaffines
       which we shall entirely skip here.

       Slices and bad values

       Slice operations need to be able to handle bad values (if support is
       compiled into PDL). The easiest thing to do is look at
       Basic/Slices/slices.pd to see how this works.

       Along with "BadCode", there are also the "BadBackCode" and
       "BadRedoDimsCode" keys for "pp_def". However, any "EquivCPOffsCode"
       should not need changing, since any changes are absorbed into the
       definition of the "$EQUIVCPOFFS()" macro (ie it is handled
       automatically by PDL::PP>.

       A few notes on writing a slicing routine...

       The following few paragraphs describe writing of a new slicing routine
       ('range'); any errors are CED's. (--CED 26-Aug-2002)

USEFUL ROUTINES
       The PDL "Core" structure, defined in Basic/Core/pdlcore.h.PL, contains
       pointers to a number of routines that may be useful to you.  The
       majority of these routines deal with manipulating piddles, but some are
       more general:

       PDL->qsort_B( PDL_Byte *xx, int a, int b )
	   Sort the array "xx" between the indices "a" and "b".	 There are
	   also versions for the other PDL datatypes, with postfix "_S", "_U",
	   "_L", "_F", and "_D".  Any module using this must ensure that
	   "PDL::Ufunc" is loaded.

       PDL->qsort_ind_B( PDL_Byte *xx, int *ix, int a, int b )
	   As for "PDL->qsort_B", but this time sorting the indices rather
	   than the data.

       The routine "med2d" in Lib/Image2D/image2d.pd shows how such routines
       are used.

MAKEFILES FOR PP FILES
       If you are going to generate a package from your PP file (typical file
       extensions are ".pd" or ".pp" for the files containing PP code) it is
       easiest and safest to leave generation of the appropriate commands to
       the Makefile. In the following we will outline the typical format of a
       perl Makefile to automatically build and install your package from a
       description in a PP file. Most of the rules to build the xs, pm and
       other required files from the PP file are already predefined in the
       PDL::Core::Dev package. We just have to tell MakeMaker to use it.

       In most cases you can define your Makefile like

	 # Makefile.PL for a package defined by PP code.

	 use PDL::Core::Dev;		# Pick up development utilities
	 use ExtUtils::MakeMaker;

	 $package = ["mylib.pd",Mylib,PDL::Lib::Mylib];
	 %hash = pdlpp_stdargs($package);
	 $hash{OBJECT} .= ' additional_Ccode$(OBJ_EXT) ';
	 $hash{clean}->{FILES} .= ' todelete_Ccode$(OBJ_EXT) ';
	 $hash{'VERSION_FROM'} = 'mylib.pd';
	 WriteMakefile(%hash);

	 sub MY::postamble { pdlpp_postamble($package); }

       Here, the list in $package is: first: PP source file name, then the
       prefix for the produced files and finally the whole package name.  You
       can modify the hash in whatever way you like but it would be reasonable
       to stay within some limits so that your package will continue to work
       with later versions of PDL.

       If you don't want to use prepackaged arguments, here is a generic
       Makefile.PL that you can adapt for your own needs:

	 # Makefile.PL for a package defined by PP code.

	 use PDL::Core::Dev;		# Pick up development utilities
	 use ExtUtils::MakeMaker;

	 WriteMakefile(
	  'NAME'       => 'PDL::Lib::Mylib',
	  'VERSION_FROM'       => 'mylib.pd',
	  'TYPEMAPS'	 => [&PDL_TYPEMAP()],
	  'OBJECT'	 => 'mylib$(OBJ_EXT) additional_Ccode$(OBJ_EXT)',
	  'PM'	       => { 'Mylib.pm'		  => '$(INST_LIBDIR)/Mylib.pm'},
	  'INC'		 => &PDL_INCLUDE(), # add include dirs as required by your lib
	  'LIBS'	 => [''],   # add link directives as necessary
	  'clean'	 => {'FILES'  =>
				 'Mylib.pm Mylib.xs Mylib$(OBJ_EXT)
				 additional_Ccode$(OBJ_EXT)'},
	 );

	 # Add genpp rule; this will invoke PDL::PP on our PP file
	 # the argument is an array reference where the array has three string elements:
	 #   arg1: name of the source file that contains the PP code
	 #   arg2: basename of the xs and pm files to be generated
	 #   arg3: name of the package that is to be generated
	 sub MY::postamble { pdlpp_postamble(["mylib.pd",Mylib,PDL::Lib::Mylib]); }

       To make life even easier PDL::Core::Dev defines the function
       "pdlpp_stdargs" that returns a hash with default values that can be
       passed (either directly or after appropriate modification) to a call to
       WriteMakefile.  Currently, "pdlpp_stdargs" returns a hash where the
       keys are filled in as follows:

	       (
		'NAME'	       => $mod,
		'TYPEMAPS'     => [&PDL_TYPEMAP()],
		'OBJECT'       => "$pref\$(OBJ_EXT)",
		PM     => {"$pref.pm" => "\$(INST_LIBDIR)/$pref.pm"},
		MAN3PODS => {"$src" => "\$(INST_MAN3DIR)/$mod.\$(MAN3EXT)"},
		'INC'	       => &PDL_INCLUDE(),
		'LIBS'	       => [''],
		'clean'	       => {'FILES'  => "$pref.xs $pref.pm $pref\$(OBJ_EXT)"},
	       )

       Here, $src is the name of the source file with PP code, $pref the
       prefix for the generated .pm and .xs files and $mod the name of the
       exntension module to generate.

INTERNALS
       The internals of the current version consist of a large table which
       gives the rules according to which things are translated and the subs
       which implement these rules.

       Later on, it would be good to make the table modifiable by the user so
       that different things may be tried.

       [Meta comment: here will hopefully be more in the future; currently,
       your best bet will be to read the source code :-( or ask on the list
       (try the latter first) ]

Appendix A: Some keys recognised by PDL::PP
       Unless otherwise specified, the arguments are strings. Keys marked with
       (bad) are only used if bad-value support is compiled into PDL.

       Pars
	   define the signature of your function

       OtherPars
	   arguments which are not pdls. Default: nothing.

       Code
	   the actual code that implements the functionality; several PP
	   macros and PP functions are recognised in the string value

       HandleBad (bad)
	   If set to 1, the routine is assumed to support bad values and the
	   code in the BadCode key is used if bad values are present; it also
	   sets things up so that the "$ISBAD()" etc macros can be used.  If
	   set to 0, cause the routine to print a warning if any of the input
	   piddles have their bad flag set.

       BadCode (bad)
	   Give the code to be used if bad values may be present in the input
	   piddles.  Only used if "HandleBad => 1".

       GenericTypes
	   An array reference. The array may contain any subset of the strings
	   `B', `S', `U', `L', `F' and `D', which specify which types your
	   operation will accept.  This is very useful (and important!) when
	   interfacing an external library.  Default: [qw/B S U L F D/]

       Inplace
	   Mark a function as being able to work inplace.

	    Inplace => 1	  if  Pars => 'a(); [o]b();'
	    Inplace => ['a']	  if  Pars => 'a(); b(); [o]c();'
	    Inplace => ['a','b']  if  Pars => 'a(); b(); [o]c(); [o]d();'

	   If bad values are being used, care must be taken to ensure the
	   propogation of the badflag when inplace is being used; for instance
	   see the code for "replacebad" in Basic/Bad/bad.pd.

       Doc Used to specify a documentation string in Pod format. See PDL::Doc
	   for information on PDL documentation conventions. Note: in the
	   special case where the PP 'Doc' string is one line this is
	   implicitly used for the quick reference AND the documentation!

	   If the Doc field is omitted PP will generate default documentation
	   (after all it knows about the Signature).

	   If you really want the function NOT to be documented in any way at
	   this point (e.g. for an internal routine, or because youu are doing
	   it elsewhere in the code) explictly specify "Doc=>undef".

       BadDoc (bad)
	   Contains the text returned by the "badinfo" command (in "perldl")
	   or the "-b" switch to the "pdldoc" shell script. In many cases, you
	   will not need to specify this, since the information can be
	   automatically created by PDL::PP. However, as befits computer-
	   generated text, it's rather stilted; it may be much better to do it
	   yourself!

Appendix B: PP macros and functions
       Macros

       Macros labelled by (bad) are only used if bad-value support is compiled
       into PDL.

       $variablename_from_sig()
	      access a pdl (by its name) that was specified in the signature

       $COMP(x)
	      access a value in the private data structure of this
	      transformation (mainly used to use an argument that is specified
	      in the "OtherPar" section)

       $SIZE(n)
	      replaced at runtime by the actual size of a named dimension (as
	      specified in the signature)

       $GENERIC()
	      replaced by the C type that is equal to the runtime type of the
	      operation

       $P(a)  a pointer access to the PDL named "a" in the signature. Useful
	      for interfacing to C functions

       $PP(a) a physical pointer access to pdl "a"; mainly for internal use

       $TXXX(Alternative,Alternative)
	      expansion alternatives according to runtime type of operation,
	      where XXX is some string that is matched by "/[BSULFD+]/".

       $PDL(a)
	      return a pointer to the pdl data structure (pdl *) of piddle "a"

       $ISBAD(a()) (bad)
	      returns true if the value stored in "a()" equals the bad value
	      for this piddle.	Requires "HandleBad" being set to 1.

       $ISGOOD(a()) (bad)
	      returns true if the value stored in "a()" does not equal the bad
	      value for this piddle.  Requires "HandleBad" being set to 1.

       $SETBAD(a()) (bad)
	      Sets "a()" to equal the bad value for this piddle.  Requires
	      "HandleBad" being set to 1.

       functions

       "loop(DIMS) %{ ... %}"
	  loop over named dimensions; limits are generated automatically by PP

       "threadloop %{ ... %}"
	  enclose following code in a threadloop

       "types(TYPES) %{ ... %}"
	  execute following code if type of operation is any of "TYPES"

SEE ALSO
       PDL

       For the concepts of threading and slicing check PDL::Indexing.

       PDL::Internals

       PDL::BadValues for information on bad values

       perlxs, perlxstut

CURRENTLY UNDOCUMENTED
       RedoDimsCode, $RESIZE()

BUGS
       PDL::PP is still, even in its rewritten form, too complicated.  It
       needs to be rethought a little as well as deconvoluted and modularized
       some more (e.g. all the NS things).

       After the rewrite, this can happen a little by little, though.

       Undocumented functions

       The following functions have been added since this manual was written
       and are as yet undocumented

       pp_export_nothing
       pp_core_importList
       pp_beginwrap
       pp_setversion
       pp_addbegin

AUTHOR
       Copyright(C) 1997 Tuomas J. Lukka (lukka@fas.harvard.edu), Karl
       Glaazebrook (kgb@aaocbn1.aao.GOV.AU) and Christian Soeller
       (c.soeller@auckland.ac.nz). All rights reserved. Although destined for
       release as a man page with the standard PDL distribution, it is not
       public domain. Permission is granted to freely distribute verbatim
       copies of this document provided that no modifications outside of
       formatting be made, and that this notice remain intact.	You are
       permitted and encouraged to use its code and derivatives thereof in
       your own source code for fun or for profit as you see fit.

perl v5.10.0			  2005-05-07				 PP(1)
[top]

List of man pages available for Peanut

Copyright (c) for man pages and the logo by the respective OS vendor.

For those who want to learn more, the polarhome community provides shell access and support.

[legal] [privacy] [GNU] [policy] [cookies] [netiquette] [sponsors] [FAQ]
Polarhome, production since 1999.
Member of Polarhome portal.
Based on Fawad Halim's script.
....................................................................
Vote for polarhome