Bio::DB::GFF::Adaptor::dbi::pg_fts man page on Fedora

Man page or keyword search:  
man Server   31170 pages
apropos Keyword Search (all sections)
Output format
Fedora logo
[printable version]

Bio::DB::GFF::Adaptor:User:Contributed PeBio::DB::GFF::Adaptor::dbi::pg_fts(3)

NAME
       Bio::DB::GFF::Adaptor::dbi::pg_fts -- Database adaptor for a specific
       postgres schema with a TSearch2 implementation

SYNOPSIS
	   #create new GFF database connection
	   my $db      = Bio::DB::GFF->new( -adaptor => 'dbi::pg_fts',
					    -dsn     => 'dbi:Pg:dbname=worm');

	   #add full text indexing 'stuff'
	   #assumes that TSearch2 is available to PostgreSQL
	   #this will take a VERY long time for a reasonably large database
	   $db->install_TSearch2();

	   ...some time later...
	   #we don't like full text searching...
	   $db->remove_TSearch2();

DESCRIPTION
       This adaptor is based on Bio::DB::GFF::Adaptor::dbi::pg but it
       implements the TSearch2 PostgreSQL contrib module for fast full text
       searching.  To use this module with your PostgreSQL GFF database, you
       need to make TSearch2 available in the database.

       To use this adaptor, follow these steps:

       Install TSearch2 contrib module for Pg
	   Can be as easy as `sudo yum install postgresql-contrib`, or you may
	   need to recompile PostgreSQL to include it.	See
	   http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/tsearch-V2-intro.html
	   <http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/tsearch-
	   V2-intro.html> for more details

       Load the TSearch2 functions to you database
	     % cat tsearch2.sql | psql <your database>

       Load your data using the pg adaptor:
	    % bp_pg_bulk_load_gff.pl -c -d yeast saccharomyces_cerevisiae.gff

	   or

	    % bp_load_gff.pl -c -d yeast -a dbi::pg saccharomyces_cerevisiae.gff

       Add GFF/TSearch2 specific modifications
	   Execute a perl script like this one:

	     #!/usr/bin/perl -w
	     use strict;

	     use Bio::DB::GFF;

	     my $db = Bio::DB::GFF->new(
		 -adaptor   => 'dbi::pg_fts',
		 -dsn	    => 'dbi:Pg:dbname=yeast',
		 -user	    => 'scott',
	       );

	     print "Installing TSearch2 columns...\n";

	     $db->install_TSearch2();

	     print "Done\n";

       Note that this last step will take a long time.	For a S. cerevisiae
       database with 15K rows, it took over an hour on my laptop, and with a
       C. elegans database (~10 million rows) it took well over a day.

       If at some point you add more data you your database, you need to run a
       similar script to the one above, only executing the update_TSearch2()
       method.	Finally, if you want to remove the TSearch2 columns from your
       database and go back to using the pg adaptor, you can execute a script
       like the one above, only executing the remove_TSearch2() method.

NOTES ABOUT TSearch2 SEARCHING
       You should know a few things about how searching with TSearch2 works in
       the GBrowse enviroment:

       1.  TSearch2 does not do wild cards, so you should encourage your users
	   not to use them.  If wild cards are used, the adaptor will fall
	   back on an ILIKE search, which will be much slower.

       2.  However, TSearch2 does do 'word stemming'.  That is, if you search
	   for 'copy', it will find 'copy', 'copies', and 'copied'.

       3.  TSearch2 does not do phrase searching; all of the terms in the
	   search string are ANDed together.

ACKNOWLEDGEMENTS
       Special thanks to Russell Smithies and Paul Smale at AgResearch in New
       Zealand for giving me their recipe for doing full text indexing in a
       GFF database.

BUGS
       Please report bugs to the BioPerl and/or GBrowse mailing lists
       (mailto:bioperl-l@lists.open-bio.org <mailto:bioperl-l@lists.open-
       bio.org> and mailto:gmod-gbrowse@lists.sourceforge.net <mailto:gmod-
       gbrowse@lists.sourceforge.net> respectively).

SEE ALSO
       Please see Bio::DB::GFF::Adaptor::dbi::pg for more information about
       tuning your PostgreSQL server for GFF data, and for general information
       about GFF database access, see Bio::DB::GFF.

AUTHOR
       Scott Cain, cain@cshl.edu

APPENDIX
   search_notes
	Title	: search_notes
	Usage	: @search_results = $db->search_notes("full text string",$limit)
	Function: Search the notes for a text string, using PostgreSQL TSearch2
	Returns : array of results
	Args	: full text search string, and an optional row limit
	Status	: public

       This is based on the mysql-specific method that makes use of the
       TSearch2 functionality in PosgreSQL's contrib directory. Given a search
       string, it performs a full-text search of the notes table and returns
       an array of results.  Each row of the returned array is a arrayref
       containing the following fields:

	 column 1   A Bio::DB::GFF::Featname object, for passing to segment()
	 column 2   The text of the note
	 column 3   A relevance score.

   make_features_by_name_where_part
	Title	: make_features_by_name_where_part
	Function: constructs a TSearch2-compliant WHERE clause for a name search
	Status	: protected

   install_TSearch2
	Title	: install_TSearch2
	Function: installs schema modifications for use with TSearch2
	Usage	: $db->install_TSearch2
	Status	: public

   update_TSearch2
	Title	: update_TSearch2
	Function: Updates TSearch2 columns
	Usage	: $db->update_TSearch2
	Status	: public

   remove_TSearch2
	Title	: remove_TSearch2
	Function: Removes TSearch2 columns
	Usage	: $db->remove_TSearch2
	Status	: public

perl v5.14.1			  2011-07Bio::DB::GFF::Adaptor::dbi::pg_fts(3)
[top]

List of man pages available for Fedora

Copyright (c) for man pages and the logo by the respective OS vendor.

For those who want to learn more, the polarhome community provides shell access and support.

[legal] [privacy] [GNU] [policy] [cookies] [netiquette] [sponsors] [FAQ]
Tweet
Polarhome, production since 1999.
Member of Polarhome portal.
Based on Fawad Halim's script.
....................................................................
Vote for polarhome
Free Shell Accounts :: the biggest list on the net