qdbm man page on DragonFly

qdbm man page on DragonFly
Man page or keyword search:
man Server 44335 pages
apropos Keyword Search (all sections)
Output format
QDBM(3)			    Quick Database Manager		       QDBM(3)

NAME
       QDBM - quick database manager

OVERVIEW
       QDBM is a library of routines for managing a database.  The database is
       a simple data file containing records, each is a pair of a  key	and  a
       value.  Every key and value is serial bytes with variable length.  Both
       binary data and character string can be used as	a  key	and  a	value.
       There  is  neither  concept of data tables nor data types.  Records are
       organized in hash table or B+ tree.

       As for database of hash table, each key must be unique within  a	 data‐
       base, so it is impossible to store two or more records with a key over‐
       laps.  The following access methods are provided to the database: stor‐
       ing  a  record  with  a	key  and  a value, deleting a record by a key,
       retrieving a record by a key.  Moreover, traversal access to every  key
       are  provided,  although	 the order is arbitrary.  These access methods
       are similar to ones of DBM (or its followers: NDBM  and	GDBM)  library
       defined	in  the UNIX standard.	QDBM is an alternative for DBM because
       of its higher performance.

       As for database of B+ tree, records whose keys are  duplicated  can  be
       stored.	 Access	 methods of storing, deleting, and retrieving are pro‐
       vided as with the database of hash table.  Records are stored in	 order
       by  a  comparing function assigned by a user.  It is possible to access
       each record with the cursor in ascending or descending order.   Accord‐
       ing  to	this  mechanism, forward matching search for strings and range
       search for integers are realized.  Moreover, transaction	 is  available
       in database of B+ tree.

EFFECTIVE IMPLEMENTATION OF HASH DATABASE
       QDBM  is	 developed  referring to GDBM for the purpose of the following
       three points: higher processing speed, smaller size of a database file,
       and  simpler  API.   They  have been achieved.  Moreover, the following
       three restrictions of traditional DBM: a process can  handle  only  one
       database,  the size of a key and a value is bounded, a database file is
       sparse, are cleared.

       QDBM uses hash algorithm to retrieve records.  If a  bucket  array  has
       sufficient  number  of  elements,  the  time complexity of retrieval is
       `O(1)'.	That is, time required for retrieving a	 record	 is  constant,
       regardless of the scale of a database.  It is also the same about stor‐
       ing and deleting.  Collision of hash  values  is	 managed  by  separate
       chaining.  Data structure of the chains is binary search tree.  Even if
       a bucket array has unusually scarce elements, the  time	complexity  of
       retrieval is `O(log n)'.

       QDBM  attains improvement in retrieval by loading RAM with the whole of
       a bucket array.	If a bucket array is on RAM, it is possible to	access
       a  region  of  a target record by about one path of file operations.  A
       bucket array saved in a file is not read into RAM with the `read'  call
       but  directly  mapped to RAM with the `mmap' call.  Therefore, prepara‐
       tion time on connecting to a database is very short, and	 two  or  more
       processes can share the same memory map.

       If  the	number	of elements of a bucket array is about half of records
       stored within a database, although it depends on characteristic of  the
       input,  the  probability	 of  collision	of  hash values is about 56.7%
       (36.8% if the same, 21.3% if twice, 11.5% if four times, 6.0% if	 eight
       times).	 In  such  case, it is possible to retrieve a record by two or
       less paths of file operations.  If it is made into a performance index,
       in  order  to  handle  a	 database containing one million of records, a
       bucket array with half a million of elements is needed.	 The  size  of
       each  element  is 4 bytes.  That is, if 2M bytes of RAM is available, a
       database containing one million records can be handled.

       QDBM provides  two  modes  to  connect  to  a  database:	 `reader'  and
       `writer'.   A  reader  can  perform  retrieving but neither storing nor
       deleting.  A writer can perform all access methods.  Exclusion  control
       between	processes  is  performed when connecting to a database by file
       locking.	 While a writer is connected to a  database,  neither  readers
       nor  writers  can be connected.	While a reader is connected to a data‐
       base, other readers can be connect, but writers can not.	 According  to
       this  mechanism,	 data consistency is guaranteed with simultaneous con‐
       nections in multitasking environment.

       Traditional DBM provides two modes of the storing operations:  `insert'
       and  `replace'.	 In  the  case	a key overlaps an existing record, the
       insert mode keeps the existing value, while the replace mode transposes
       it to the specified value.  In addition to the two modes, QDBM provides
       `concatenate' mode.  In the mode, the specified value  is  concatenated
       at  the	end  of the existing value and stored.	This feature is useful
       when adding a element to a value as an array.  Moreover,	 although  DBM
       has  a  method to fetch out a value from a database only by reading the
       whole of a region of a record, QDBM has a method to fetch out a part of
       a region of a value.  When a value is treated as an array, this feature
       is also useful.

       Generally speaking, while  succession  of  updating,  fragmentation  of
       available  regions  occurs,  and	 the size of a database grows rapidly.
       QDBM deal with this problem by coalescence of dispensable  regions  and
       reuse of them, and featuring of optimization of a database.  When over‐
       writing a record with a value whose size is greater than	 the  existing
       one,  it	 is  necessary to remove the region to another position of the
       file.  Because the time complexity of the operation depends on the size
       of  the	region	of  a record, extending values successively is ineffi‐
       cient.  However, QDBM deal with this problem by alignment.   If	incre‐
       ment can be put in padding, it is not necessary to remove the region.

       As  for many file systems, it is impossible to handle a file whose size
       is more than 2GB.  To deal with this problem, QDBM provides a directory
       database	 containing  multiple database files.  Due to this feature, it
       is possible to handle a database whose total size is up to 1TB in  the‐
       ory.   Moreover,	 because  database  files  can be deployed on multiple
       disks, the speed of updating operations can be improved as with	RAID-0
       (striping).   It	 is  also possible for the database files to deploy on
       multiple file servers using NFS and so on.

USEFUL IMPLEMENTATION OF B+ TREE DATABASE
       Although B+ tree database is slower than	 hash  database,  it  features
       ordering	 access	 to  each record.  The order can be assigned by users.
       Records of B+ tree are sorted and arranged in  logical  pages.	Sparse
       index organized in B tree that is multiway balanced tree are maintained
       for each page.  Thus, the time complexity of retrieval  and  so	on  is
       `O(log  n)'.   Cursor  is provided to access each record in order.  The
       cursor can jump to a position specified by a key and can	 step  forward
       or  backward  from the current position.	 Because each page is arranged
       as double linked list,  the  time  complexity  of  stepping  cursor  is
       `O(1)'.

       B+ tree database is implemented, based on above hash database.  Because
       each page of B+ tree is stored as each record of hash database, B+ tree
       database	 inherits  efficiency  of storage management of hash database.
       Because the header of each record is smaller and alignment of each page
       is  calculated  statistically, in most cases, the size of database file
       is cut by half compared to one of hash database.	 Although operation of
       many  pages  are required to update B+ tree, QDBM expedites the process
       by caching pages and reducing file operations.  In most cases,  because
       whole  of  the  sparse  index  is  cached  on memory, it is possible to
       retrieve a record by one or less path of file operations.

       B+ tree database features transaction mechanism.	  It  is  possible  to
       commit  a series of operations between the beginning and the end of the
       transaction in a lump, or to abort the transaction and perform rollback
       to  the state before the transaction.  Even if the process of an appli‐
       cation is crushed while the transaction, the database file is not  bro‐
       ken.

       In case that QDBM is built with ZLIB, LZO, or BZIP2 enabled, a lossless
       data-compression library, the content of each page of B+ tree  is  com‐
       pressed	and stored in a file.  Because each record in a page has simi‐
       lar patterns, high efficiency of compression is	expected  due  to  the
       Lempel-Ziv  algorithm  and  the	like.  In case handling text data, the
       size of a database is reduced to about 25%.  If the scale of a database
       is  large  and  disk I/O is the bottleneck, featuring compression makes
       the processing speed improved to a large extent.

SIMPLE BUT VARIOUS INTERFACES
       QDBM provides very simple APIs.	You can perform database I/O as	 usual
       file  I/O  with	`FILE' pointer defined in ANSI C.  In the basic API of
       QDBM, entity of a database is recorded as one file.   In	 the  extended
       API,  entity  of	 a database is recorded as several files in one direc‐
       tory.  Because the two APIs are very similar with each  other,  porting
       an application from one to the other is easy.

       APIs  which  are	 compatible  with NDBM and GDBM are also provided.  As
       there are a lot of applications using NDBM or GDBM, it is easy to  port
       them  onto QDBM.	 In most cases, it is completed only by replacement of
       header including (#include) and re-compiling.  However,	QDBM  can  not
       handle database files made by the original NDBM or GDBM.

       In  order  to  handle records on memory easily, the utility API is pro‐
       vided.  It implements memory allocating functions,  sorting  functions,
       extensible datum, array list, hash map, and so on.  Using them, you can
       handle records in C language cheaply as in  such	 script	 languages  as
       Perl or Ruby.

       B+  tree	 database  is used with the advanced API.  The advanced API is
       implemented using the basic API	and  the  utility  API.	  Because  the
       advanced	 API is also similar to the basic API and the extended API, it
       is easy to learn how to use it.

       In order to handle an inverted index which is used by full-text	search
       systems,	 the  inverted	API  is	 provided.  If it is easy to handle an
       inverted index of documents, an application can focus on text  process‐
       ing  and natural language processing.  Because this API does not depend
       on character codes  nor	languages,  it	is  possible  to  implement  a
       full-text  search  system  which	 can  respond to various requests from
       users.

       Along with APIs for C, QDBM provides APIs  for  C++,  Java,  Perl,  and
       Ruby.   APIs  for  C  are  composed  of seven kinds: the basic API, the
       extended API, the NDBM-compatible API,  the  GDBM-compatible  API,  the
       utility	API,  the  advanced  API,  and the inverted API.  Command line
       interfaces corresponding to each API are also provided.	They are  use‐
       ful for prototyping, testing, debugging, and so on.  The C++ API encap‐
       sulates database handling functions of the basic API, the extended API,
       and  the	 advanced  API	with class mechanism of C++.  The Java API has
       native methods calling  the  basic  API,	 the  extended	API,  and  the
       advanced	 API  with  Java  Native  Interface.  The Perl API has methods
       calling the basic API, the extended API, and the advanced API  with  XS
       language.   The Ruby API has method calling the basic API, the extended
       API, and the advanced API as modules of Ruby.   Moreover,  CGI  scripts
       for administration of databases and full-text search are provided.

WIDE PORTABILITY
       QDBM  is	 implemented  being  based on syntax of ANSI C (C89) and using
       only APIs defined in ANSI C or POSIX.  Thus, QDBM works	on  most  UNIX
       and  its	 compatible  OSs.  As for C API, checking operations have been
       done at least on Linux 2.2, Linux 2.4, FreeBSD 4.8, FreeBSD 5.0,	 SunOS
       5.7,  SunOS  5.8, SunOS 5.9, HP-UX 11.00, Cygwin 1.3.10, Mac OS X 10.2,
       and RISC OS 5.03.  Although a database file created by QDBM depends  on
       byte  order  of the processor, to do with it, utilities to dump data in
       format which is independent to byte orders are provided.

BUILDING
       For building a program using QDBM, the program should be linked with  a
       library	file  `libqdbm.a' or `libqdbm.so'.  For example, the following
       command is executed to build `sample' from `sample.c'.

	      gcc -I/usr/local/include	-o  sample  sample.c  -L/usr/local/lib
	      -lqdbm


AUTHOR
       QDBM  is	 written  by Mikio Hirabayashi.	 You can contact the author by
       e-mail to <mikio@fallabs.com>.  Any suggestion or bug report is welcome
       to the author.

COPYRIGHT
       Copyright(c) 2000-2003 Mikio Hirabayashi

       QDBM  is	 free software; you can redistribute it and/or modify it under
       the terms of the GNU Lesser General Public License as published by  the
       Free Software Foundation; either version 2 of the License, or any later
       version.

       QDBM is distributed in the hope that it will be useful, but WITHOUT ANY
       WARRANTY;  without even the implied warranty of MERCHANTABILITY or FIT‐
       NESS FOR A PARTICULAR PURPOSE.	See  the  GNU  Lesser  General	Public
       License for more details.

       You  should  have  received  a  copy  of	 the GNU Lesser General Public
       License along with QDBM; if not, write to the Free Software Foundation,
       Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA.

SEE ALSO
       depot(3),  curia(3),  relic(3), hovel(3), cabin(3), villa(3), odeum(3),
       ndbm(3), gdbm(3)

Man Page			  2004-04-22			       QDBM(3)
[top]

List of man pages available for DragonFly

Copyright (c) for man pages and the logo by the respective OS vendor.

For those who want to learn more, the polarhome community provides shell access and support.

[legal] [privacy] [GNU] [policy] [cookies] [netiquette] [sponsors] [FAQ]
Polarhome, production since 1999.
Member of Polarhome portal.
Based on Fawad Halim's script.
....................................................................
Vote for polarhome