LOCATEDB(5L)	    UNIX Programmer's Manual	     LOCATEDB(5L)

NAME
     locatedb - front-compressed file name database

DESCRIPTION
     This manual page documents the format of file name databases
     for the GNU version of locate.  The file name databases con-
     tain lists of files that were in particular directory trees
     when the databases were last updated.

     There can be multiple databases.  Users can select which
     databases locate searches using an environment variable or
     command line option; see locate(1L).  The system administra-
     tor can choose the file name of the default database, the
     frequency with which the databases are updated, and the
     directories for which they contain entries.  Normally, file
     name databases are updated by running the updatedb program
     periodically, typically nightly; see updatedb(1L).

     updatedb runs a program called frcode to compress the list
     of file names using front-compression, which reduces the
     database size by a factor of 4 to 5.  Front-compression
     (also known as incremental encoding) works as follows.

     The database entries are a sorted list (case-insensitively,
     for users' convenience).  Since the list is sorted, each
     entry is likely to share a prefix (initial string) with the
     previous entry.  Each database entry begins with an offset-
     differential count byte: the length of the prefix this entry
     shares with the preceding entry, minus the length of the
     prefix the preceding entry shared with its predecessor.
     (The counts can be negative.)  Following the count is a
     null-terminated ASCII remainder - the part of the name that
     follows the shared prefix.
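
     The arithmetic can be sketched in Python as follows.  This
     is an illustration only, not the frcode source; the function
     names are hypothetical, and the input is assumed to be
     sorted already.

```python
def shared_prefix_len(a: str, b: str) -> int:
    """Number of leading characters that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def differential_counts(names):
    """Return (count, remainder) pairs for an already sorted list of names.

    count is the shared-prefix length for this entry minus the
    shared-prefix length that was used for the preceding entry.
    """
    pairs = []
    prev_name, prev_shared = "", 0
    for name in names:
        shared = shared_prefix_len(prev_name, name)
        pairs.append((shared - prev_shared, name[shared:]))
        prev_name, prev_shared = name, shared
    return pairs
```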

     If the offset-differential count is larger than can be
     stored in a byte (+/-127), the byte has the value 0x80 and
     the count follows in a 2-byte word, with the high byte first
     (network byte order).
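
     A minimal Python sketch of this count encoding follows.  It
     assumes that counts are stored as two's-complement signed
     values, which the text above implies but does not state;
     encode_count is a hypothetical name.

```python
import struct

LONG_COUNT_MARKER = 0x80  # escape byte described in the text above


def encode_count(count: int) -> bytes:
    """Encode one offset-differential count.

    Assumption: signed two's-complement storage. Counts in -127..127
    take one byte; anything else is the marker byte followed by a
    2-byte signed word, high byte first (network byte order).
    """
    if -127 <= count <= 127:
        return struct.pack("b", count)
    # struct.pack raises struct.error if count does not fit in 16 bits.
    return bytes([LONG_COUNT_MARKER]) + struct.pack(">h", count)
```

     The value -128 is excluded from the one-byte range because
     its byte representation is 0x80, the marker itself.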

     Every database begins with a dummy entry for a file called
     `LOCATE02', which locate checks for to ensure that the data-
     base file has the correct format; it ignores the entry in
     doing the search.
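
     Since the dummy entry is an ordinary entry (a count byte of
     0, the name, and a terminating null), a format check can be
     sketched as a comparison of the leading bytes.  The function
     name below is hypothetical, and this is not necessarily how
     locate itself performs the check.

```python
# Count byte 0, then the dummy name, then the terminating null.
LOCATE02_HEADER = b"\x00LOCATE02\x00"


def has_locate02_header(data: bytes) -> bool:
    """Return True if data starts with the LOCATE02 dummy entry."""
    return data.startswith(LOCATE02_HEADER)
```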

     Databases cannot be concatenated together, even if the
     first (dummy) entry is trimmed from all but the first data-
     base.  This is because the offset-differential count in the
     first entry of the second and following databases will be
     wrong.
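
     The reason is easier to see in a reader.  The Python sketch
     below (hypothetical, not the locate source, and assuming
     signed two's-complement counts) carries a running prefix
     length from one entry to the next, so an entry is only
     meaningful relative to the entry before it.

```python
import struct


def decode_names(data: bytes):
    """Decode a new-format database held in memory into a list of names.

    The running prefix length and the previous name are the only
    state; both carry over from entry to entry.
    """
    names = []
    pos, prefix_len, prev = 0, 0, b""
    while pos < len(data):
        if data[pos] == 0x80:
            # Long count: 2-byte signed word, high byte first.
            (count,) = struct.unpack_from(">h", data, pos + 1)
            pos += 3
        else:
            (count,) = struct.unpack_from("b", data, pos)
            pos += 1
        prefix_len += count
        end = data.index(b"\x00", pos)  # remainder is null-terminated
        prev = prev[:prefix_len] + data[pos:end]
        names.append(prev)
        pos = end + 1
    return names[1:]  # drop the LOCATE02 dummy entry
```

     If an entry with count 0 and remainder `/var' is appended to
     a database whose last entry is /usr/tmp/zoo (prefix length
     5), this sketch yields /usr//var rather than /var, because
     the count is applied to the wrong running prefix length.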

     There is also an old database format, used by Unix locate
     and find programs and earlier releases of the GNU ones.

MirOS BSD #10-current  Printed 16.11.2010			1

     updatedb runs programs called bigram and code to produce
     old-format databases.  The old format differs from the above
     description in the following ways.  Instead of each entry
     starting with an offset-differential count byte and ending
     with a null, byte values from 0 through 28 indicate offset-
     differential counts from -14 through 14.  The byte value
     indicating that a long offset-differential count follows is
     0x1e (30), not 0x80.  The long counts are stored in host
     byte order, which is not necessarily network byte order, and
     host integer word size, which is usually 4 bytes.  They also
     represent a count 14 less than their value.  The database
     lines have no termination byte; the start of the next line
     is indicated by its first byte having a value <= 30.
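
     A Python sketch of reading one old-format count follows.
     It assumes a 4-byte host integer, which the text above calls
     usual but not guaranteed.  The function name is hypothetical,
     and byte value 29 is rejected only because this page does not
     describe it.

```python
import struct

OLD_LONG_MARKER = 30  # 0x1e: a long count follows
OLD_BIAS = 14         # stored value minus 14 gives the real count


def decode_old_count(data: bytes, pos: int):
    """Return (count, next_pos) for the old-format count at data[pos].

    Assumption: the long count is a 4-byte integer in host byte order.
    """
    b = data[pos]
    if b <= 28:
        # Byte values 0..28 stand for counts -14..14.
        return b - OLD_BIAS, pos + 1
    if b == OLD_LONG_MARKER:
        (value,) = struct.unpack_from("=i", data, pos + 1)
        return value - OLD_BIAS, pos + 1 + struct.calcsize("=i")
    raise ValueError("byte %d does not start an entry on this page's rules" % b)
```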

     In addition, instead of starting with a dummy entry, the old
     database format starts with a 256 byte table containing the
     128 most common bigrams in the file list.  A bigram is a
     pair of adjacent bytes.  Bytes in the database that have the
     high bit set are indexes (with the high bit cleared) into
     the bigram table.  The bigram and offset-differential count
     coding makes these databases 20-25% smaller than the new
     format, but makes them not 8-bit clean.  Any byte in a file
     name that is in the ranges used for the special codes is
     replaced in the database by a question mark, which not coin-
     cidentally is the shell wildcard to match a single charac-
     ter.
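
     Bigram expansion can be sketched in Python as below.  The
     sketch assumes that bigram number i occupies bytes 2*i and
     2*i+1 of the table, which follows from 128 two-byte bigrams
     filling 256 bytes; expand_bigrams is a hypothetical name.

```python
def expand_bigrams(remainder: bytes, table: bytes) -> bytes:
    """Expand bigram codes in the remainder of one old-format entry.

    A byte with the high bit set is an index (high bit cleared) into
    the 256-byte table of 128 two-byte bigrams; other bytes are
    copied unchanged.
    """
    if len(table) != 256:
        raise ValueError("bigram table must be exactly 256 bytes")
    out = bytearray()
    for b in remainder:
        if b & 0x80:
            i = b & 0x7F
            out += table[2 * i:2 * i + 2]
        else:
            out.append(b)
    return bytes(out)
```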

EXAMPLE
     Input to frcode:
     /usr/src
     /usr/src/cmd/aardvark.c
     /usr/src/cmd/armadillo.c
     /usr/tmp/zoo

     Length of the longest prefix of the preceding entry to share:
     0 /usr/src
     8 /cmd/aardvark.c
     14 rmadillo.c
     5 tmp/zoo

     Output from frcode, with trailing nulls changed to newlines
     and count bytes made printable:
     0 LOCATE02
     0 /usr/src
     8 /cmd/aardvark.c
     6 rmadillo.c
     -9 tmp/zoo

     (6 = 14 - 8, and -9 = 5 - 14)
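
     The example can be reproduced at the byte level with the
     following Python sketch.  It is not the frcode source.  It
     assumes sorted ASCII input and signed two's-complement
     counts, and all function names are hypothetical.

```python
import struct


def common_prefix_len(a: bytes, b: bytes) -> int:
    """Number of leading bytes that a and b have in common."""
    n = 0
    for x, y in zip(a, b):
        if x != y:
            break
        n += 1
    return n


def encode_count(count: int) -> bytes:
    """One signed byte, or 0x80 plus a 2-byte big-endian signed word."""
    if -127 <= count <= 127:
        return struct.pack("b", count)
    return b"\x80" + struct.pack(">h", count)


def frcode_sketch(names) -> bytes:
    """Front-compress an already sorted list of ASCII file names."""
    out = bytearray(b"\x00LOCATE02\x00")  # dummy first entry
    prev, prev_shared = b"LOCATE02", 0
    for name in names:
        raw = name.encode("ascii")
        shared = common_prefix_len(prev, raw)
        out += encode_count(shared - prev_shared)
        out += raw[shared:] + b"\x00"
        prev, prev_shared = raw, shared
    return bytes(out)
```

     For the four input names above, the count bytes produced
     after the dummy entry are 0x00, 0x08, 0x06 and 0xf7 (-9 as
     a signed byte).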

SEE ALSO
     find(1L), locate(1L), locatedb(5L), xargs(1L), Finding Files
     (on-line in Info, or printed)
