Bio::Tools::dpAlign man page on Fedora

Man page or keyword search:  
man Server   31170 pages
apropos Keyword Search (all sections)
Output format
Fedora logo
[printable version]

Bio::Tools::dpAlign(3)User Contributed Perl DocumentatioBio::Tools::dpAlign(3)

NAME
       Bio::Tools::dpAlign - Perl extension to do pairwise dynamic programming
       sequence alignment

SYNOPSIS
	 use Bio::Tools::dpAlign;
	 use Bio::SeqIO;
	 use Bio::SimpleAlign;
	 use Bio::AlignIO;
	 use Bio::Matrix::IO;

	 $seq1 = Bio::SeqIO->new(-file => $ARGV[0], -format => 'fasta');
	 $seq2 = Bio::SeqIO->new(-file => $ARGV[1], -format => 'fasta');

	 # create a dpAlign object
	 # to do global alignment, specify DPALIGN_GLOBAL_MILLER_MYERS
	 # to do ends-free alignment, specify DPALIGN_ENDSFREE_MILLER_MYERS
	 $factory = new dpAlign(-match => 3,
			    -mismatch => -1,
			    -gap => 3,
			    -ext => 1,
			    -alg => Bio::Tools::dpAlign::DPALIGN_LOCAL_MILLER_MYERS);

	 # actually do the alignment
	 $out = $factory->pairwise_alignment($seq1->next_seq, $seq2->next_seq);
	 $alnout = Bio::AlignIO->new(-format => 'pfam', -fh => \*STDOUT);
	 $alnout->write_aln($out);

	 # To do protein alignment, set the sequence type to protein
	 # By default all protein alignments are using BLOSUM62 matrix
	 # the gap opening cost is 7 and gap extension is 1. These
	 # values are from ssearch. To use your own custom substitution
	 # matrix, you can create a Bio::Matrix::MatrixI object.

	 $parser = Bio::Matrix::IO->new(-format => 'scoring', -file => 'blosum50.mat');
	 $matrix = $parser->next_matrix;
	 $factory = Bio::Tools::dpAlign->new(-matrix => $matrix, -alg => Bio::Tools::dpAlign::DPALIGN_LOCAL_MILLERMYERS);
	 $seq1->alphabet('protein');
	 $seq2->alphabet('protein');
	 $out = $factory->pairwise_alignment($seq1->next_seq, $seq2->next_seq);
	 $alnout->write_aln($out);

	 # use the factory to make some output

	 $factory->align_and_show($seq1, $seq2, STDOUT);

	 # use Phil Green's algorithm to calculate the optimal local
	 # alignment score between two sequences quickly. It is very
	 # useful when you are searching a query sequence in a database
	 # of sequences. Since finding a alignment is more costly
	 # than just calculating scores, you can save time if you only
	 # align sequences that have a high alignment score.

	 # To use this feature, first you call the sequence_profile function
	 # to obtain the profile of the query sequence.
	 $profile = $factory->sequence_profile($query);

	 %scores = ();
	 # Then use a loop to run a database of sequences against the
	 # profile to obtain a table of alignment scores
	 $dbseq = Bio::SeqIO(-file => 'dbseq.fa', -format => 'fasta');
	 while (defined($seq = $dbseq->next_seq)) {
	     $scores{$seq->id} = $factory->pairwise_alignment_score($profile, $seq);
	 }

DESCRIPTION
       Dynamic Programming approach is considered to be the most sensitive way
       to align two biological sequences. There are currently three major
       types of dynamic programming algorithms: Global Alignment, Local
       Alignment and Ends-free Alignment.

       Global Alignment compares two sequences in their entirety.  By
       inserting gaps in the two sequences, it aligns two sequences to
       minimize the edit distance as defined by the gap cost function and the
       substitution matrix. Global Alignment is generally applied to two
       sequences that are very similar in length and content.

       Local Alignment instead attempts to find out the subsequences that has
       the minimal edit distance among all possible subsequences.  It is good
       for sequences that has a stretch of subsequences that are similar to
       each other.

       Ends-free Alignment is a special case of Global Alignment. There are no
       gap penalty imposed for the gaps that extended from the end points of
       two sequences. Therefore it will be a good application when you think
       one sequence is contained by the other or when you think two sequences
       overlap each other.

       Dynamic Programming was first introduced by Needleman-Wunsch (1970) to
       globally align two sequences. The idea of local alignment was later
       introduced by Smith-Waterman (1981). Gotoh (1982) improved both
       algorithms by introducing auxillary arrays that reduced the time
       complexity of the algorithms to O(m*n).	Miller-Myers (1988) exploits
       the divide-and-conquer idea introduced by Hirschberg (1975) to solve
       the affine gap cost dynamic programming using only linear space. At the
       time of this writing, it is accepted that Miller-Myers is the fastest
       single CPU implementation and using the least memory that is truly
       equivalent to original algorithm introduced by Needleman-Wunsch.
       According to Aaron Mackey, Phil Green's SWAT implemention introduced a
       heuristic that does not consider paths throught the matrix where the
       score would be less than the gap opening penalty, yielding a 1.5-2X
       speedup on most comparisons. to skip the calculation of some cells.
       However, his approach is only good for calculating the minimum edit
       distance and find out the corresponding subsequences (aka search
       phase). Bill Pearson's popular dynamic programming alignment program
       SSEARCH uses Phil Green's algorithm to find the subsequences and then
       Miller-Myers's algorithm to find the actual alignment. (aka alignment
       phase)

       The current implementation supports local alignment of either DNA
       sequences or protein sequences. It allows you to specify either the
       Miller-Myers Global Alignment (DPALIGN_GLOBAL_MILLER_MYERS) or Miller-
       Myers Local Alignment (DPALIGN_LOCAL_MILLER_MYERS). For DNA alignment,
       you can specify the scores for match, mismatch, gap opening cost and
       gap extension cost. For protein alignment, it is using BLOSUM62 by
       default. Currently the substitution matrix is not configurable.

       Note: If you supply LocatableSeq objects to pairwise_alignment,
       pairwise_alignment_score, align_and_show or sequence_profile and the
       sequence supplied contains gaps, these functions will treat these
       sequences as if they are without gaps.

DEPENDENCIES
       This package comes with the main bioperl distribution. You also need to
       install the lastest bioperl-ext package which contains the XS code that
       implements the algorithms. This package won't work if you haven't
       compiled the bioperl-ext package.

TO-DO
       1. Basic support for IUPAC code for DNA sequence is now implemented.  X
	  will mismatch any character. T will match U. For others, whenever
	  there is a possibility for match, it is considered a full match, for
	  example, W will match B.

       2. Allow custom substitution matrix for DNA. Note that for proteins,
	  you can now use your own subsitution matirx.

FEEDBACK
   Mailing Lists
       User feedback is an integral part of the evolution of this and other
       Bioperl modules.	 Send your comments and suggestions preferably to one
       of the Bioperl mailing lists.  Your participation is much appreciated.

	 bioperl-l@bioperl.org			- General discussion
	 http://bioperl.org/wiki/Mailing_lists	- About the mailing lists

   Support
       Please direct usage questions or support issues to the mailing list:

       bioperl-l@bioperl.org

       rather than to the module maintainer directly. Many experienced and
       reponsive experts will be able look at the problem and quickly address
       it. Please include a thorough description of the problem with code and
       data examples if at all possible.

   Reporting Bugs
       Report bugs to the Bioperl bug tracking system to help us keep track
       the bugs and their resolution. Bug reports can be submitted via the
       web:

	 http://bugzilla.open-bio.org/

AUTHOR
	       This implementation was written by Yee Man Chan (ymc@yahoo.com).
	       Copyright (c) 2003 Yee Man Chan. All rights reserved. This program
	       is free software; you can redistribute it and/or modify it under
	       the same terms as Perl itself. Special thanks to Aaron Mackey
	       and WIlliam Pearson for the helpful discussions. [The portion
	       of code inside pgreen subdirectory was borrowed from ssearch. It
	       should be distributed in the same terms as ssearch.]

   sequence_profile
	Title	: sequence_profile
	Usage	: $prof = $factory->sequence_profile($seq1)
	Function: Makes a dpAlign_SequenceProfile object from one sequence
	Returns : A dpAlign_SequenceProfile object
	Args	: The lone argument is a Bio::PrimarySeqI that we want to
		  build a profile for. Usually, this would be the Query sequence

   pairwise_alignment_score
	Title	: pairwise_alignment_score
	Usage	: $score = $factory->pairwise_alignment_score($prof,$seq2)
	Function: Makes a SimpleAlign object from two sequences
	Returns : An integer that is the score of the optimal alignment.
	Args	: The first argument is the sequence profile obtained from a
		  call to the sequence_profile function. The second argument
		  is a Bio::PrimarySeqI object to be aligned. The second argument
		  is usually a sequence in the database sequence. Note
		  that this function only uses Phil Green's algorithm and
		  therefore theoretically may not always give you the optimal
		  score.

   pairwise_alignment
	Title	: pairwise_alignment
	Usage	: $aln = $factory->pairwise_alignment($seq1,$seq2)
	Function: Makes a SimpleAlign object from two sequences
	Returns : A SimpleAlign object if there is an alignment with positive
		  score. Otherwise, return undef.
	Args	: The first and second arguments are both Bio::PrimarySeqI
		  objects that are to be aligned.

   align_and_show
	Title	: align_and_show
	Usage	: $factory->align_and_show($seq1,$seq2,STDOUT)

   match
	Title	  : match
	Usage	  : $match = $factory->match() #get
		  : $factory->match($value) #set
	Function  : the set get for the match score
	Example	  :
	Returns	  : match value
	Arguments : new value

   mismatch
	Title	  : mismatch
	Usage	  : $mismatch = $factory->mismatch() #get
		  : $factory->mismatch($value) #set
	Function  : the set get for the mismatch penalty
	Example	  :
	Returns	  : mismatch value
	Arguments : new value

   gap
	Title	  : gap
	Usage	  : $gap = $factory->gap() #get
		  : $factory->gap($value) #set
	Function  : the set get for the gap penalty
	Example	  :
	Returns	  : gap value
	Arguments : new value

   ext
	Title	  : ext
	Usage	  : $ext = $factory->ext() #get
		  : $factory->ext($value) #set
	Function  : the set get for the ext penalty
	Example	  :
	Returns	  : ext value
	Arguments : new value

   alg
	Title	  : alg
	Usage	  : $alg = $factory->alg() #get
		  : $factory->alg($value) #set
	Function  : the set get for the algorithm
	Example	  :
	Returns	  : alg value
	Arguments : new value

perl v5.14.1			  2011-07-22		Bio::Tools::dpAlign(3)
[top]

List of man pages available for Fedora

Copyright (c) for man pages and the logo by the respective OS vendor.

For those who want to learn more, the polarhome community provides shell access and support.

[legal] [privacy] [GNU] [policy] [cookies] [netiquette] [sponsors] [FAQ]
Tweet
Polarhome, production since 1999.
Member of Polarhome portal.
Based on Fawad Halim's script.
....................................................................
Vote for polarhome
Free Shell Accounts :: the biggest list on the net