dictzip man page on DragonFly

DICTZIP(1)							    DICTZIP(1)

NAME
       dictzip, dictunzip - compress (or expand) files, allowing random access

SYNOPSIS
       dictzip [options] name
       dictunzip [options] name

DESCRIPTION
       dictzip compresses files using the gzip(1) algorithm (LZ77) in a manner
       which is completely compatible with the gzip file format.  An extension
       to the gzip file format (Extra Field, described in 2.3.1.1 of RFC 1952)
       allows extra data to be stored in the  header  of  a  compressed	 file.
       Programs	 like  gzip  and  zcat	will ignore this extra data.  However,
       dictd(8), the DICT protocol dictionary server, will make use of this
       data to perform pseudo-random access on the file.  Files in the dictzip
       format should end in ".dz" so that they may be distinguished from  com‐
       mon gzip files that do not contain the special header information.

       From RFC 1952, the extra field is specified as follows:

	      If the FLG.FEXTRA bit is set, an "extra field" is present in the
	      header, with total length XLEN bytes.  It consists of  a	series
	      of subfields, each of the form:

	      +---+---+---+---+==================================+
	      |SI1|SI2|	 LEN  |... LEN bytes of subfield data ...|
	      +---+---+---+---+==================================+

	      SI1  and	SI2 provide a subfield ID, typically two ASCII letters
	      with     some	mnemonic     value.	 Jean-Loup	Gailly
	      <gzip@prep.ai.mit.edu>  is  maintaining  a  registry of subfield
	      IDs; please send him any subfield ID you wish to use.   Subfield
	      IDs with SI2 = 0 are reserved for future use.

	      LEN  gives the length of the subfield data, excluding the 4 ini‐
	      tial bytes.

       The dictzip program uses 'R' for SI1, and 'A' for  SI2  (i.e.,  "Random
       Access").  After the LEN field, the data is arranged as follows:

       +---+---+---+---+---+---+===============================+
       |  VER  | CHLEN | CHCNT |  ... CHCNT words of data ...  |
       +---+---+---+---+---+---+===============================+

       As per RFC 1952, all data is stored least-significant byte first.  For
       VER 1 of the data, all values are 16 bits long (2 bytes), and are
       unsigned integers.
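
       The layout above can be read with a few lines of code.  The following
       Python sketch is illustrative only and is not taken from the dictzip
       sources; the function name parse_ra_subfield is hypothetical.  It
       assumes the fixed 10-byte gzip header of RFC 1952 (FLG at byte 3, XLEN
       at bytes 10-11) and VER 1 data.

```python
import struct

FEXTRA = 0x04  # FLG bit 2 in RFC 1952: an extra field is present


def parse_ra_subfield(header: bytes):
    """Return (ver, chlen, sizes) from the 'RA' subfield, or None if absent.

    Hypothetical helper: `header` must hold the start of a gzip file,
    including the whole extra field.
    """
    if len(header) < 10 or header[0:2] != b"\x1f\x8b":
        raise ValueError("not a gzip header")
    if not header[3] & FEXTRA:
        return None
    # XLEN follows the fixed 10-byte header, least-significant byte first.
    (xlen,) = struct.unpack_from("<H", header, 10)
    extra = header[12:12 + xlen]
    pos = 0
    while pos + 4 <= len(extra):
        # Each subfield: SI1, SI2, then a 2-byte LEN, then LEN bytes of data.
        si1, si2, length = struct.unpack_from("<BBH", extra, pos)
        data = extra[pos + 4:pos + 4 + length]
        if (si1, si2) == (ord("R"), ord("A")):
            ver, chlen, chcnt = struct.unpack_from("<HHH", data, 0)
            # CHCNT 16-bit words: compressed size of each chunk.
            sizes = list(struct.unpack_from("<%dH" % chcnt, data, 6))
            return ver, chlen, sizes
        pos += 4 + length
    return None
```

       Subfields other than "RA" are skipped, so a file whose extra field
       holds several subfields is still handled.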

       XLEN (which is specified earlier in the header) is a two-byte integer,
       so the extra field can be 0xffff bytes long, 2 bytes of which are used
       for the subfield ID (SI1 and SI2), and 2 bytes of which are used for
       the subfield length (LEN).  This leaves 0xfffb bytes (0x7ffd 2-byte
       entries	or  0x3ffe  4-byte entries).  Given that the zip output buffer
       must be 10% + 12 bytes larger than the input buffer, we can store 58969
       bytes  per  entry,  or  about 1.8GB if the 2-byte entries are used.  If
       this becomes a limiting factor, another format version can be  selected
       and defined for 4-byte entries.
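
       The arithmetic in the paragraph above can be checked with a short
       Python sketch.  The variable names are illustrative.  The figures
       follow this page as written: like the text, the count does not
       subtract the six bytes taken by VER, CHLEN, and CHCNT, and the chunk
       size of 58969 bytes is taken from the text rather than derived.

```python
XLEN_MAX = 0xFFFF        # largest value of the two-byte XLEN field
SUBFIELD_HEADER = 4      # SI1 + SI2 + two-byte LEN
CHUNK_BYTES = 58969      # uncompressed bytes per chunk, as quoted in the text

payload = XLEN_MAX - SUBFIELD_HEADER   # bytes left for subfield data
entries_2 = payload // 2               # number of 2-byte entries
entries_4 = payload // 4               # number of 4-byte entries
max_bytes = entries_2 * CHUNK_BYTES    # largest addressable file, in bytes
max_gib = max_bytes / 2**30

print(hex(payload), hex(entries_2), hex(entries_4))
print(max_bytes, round(max_gib, 2))
```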

       For compression, the file is divided up into "chunks" of data.  Each
       chunk is less than 64kB, and can be compressed into an area that is
       also less than 64kB long (taking incompressible data into account --
       usually the data is compressed into a block that is much smaller than
       the original).  The CHLEN field specifies the length of a "chunk" of
       data before compression.  The CHCNT field specifies how many chunks
       are present, and the CHCNT words of data specify how long each chunk
       is after compression (i.e., in the current compressed file).

       To perform random access on the data, the offset and length of the data
       are provided to library routines.  These routines determine the chunk
       in which the desired data begins, and decompress that chunk.
       Consecutive chunks are decompressed as necessary.
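
       The chunk lookup described above is plain integer arithmetic.  The
       Python sketch below is one way to write it, not the code of the
       library routines themselves; the function name locate and its return
       layout are hypothetical.  The compressed offset it returns is assumed
       to be relative to the first byte of compressed data after the gzip
       header.

```python
def locate(offset, size, chlen, chunk_sizes):
    """Find the chunks that cover `size` bytes starting at `offset`.

    offset, size -- position and length in the uncompressed data
    chlen        -- CHLEN, the uncompressed length of a chunk
    chunk_sizes  -- the CHCNT compressed chunk lengths from the header

    Returns (first, last, comp_start, comp_len, skip): the first and last
    chunk index, the offset and length of those chunks in the compressed
    data, and how many bytes to discard from the first decompressed chunk.
    """
    if offset < 0 or size <= 0:
        raise ValueError("offset must be >= 0 and size must be > 0")
    first = offset // chlen
    last = (offset + size - 1) // chlen
    if last >= len(chunk_sizes):
        raise ValueError("range extends past the last chunk")
    comp_start = sum(chunk_sizes[:first])
    comp_len = sum(chunk_sizes[first:last + 1])
    skip = offset - first * chlen
    return first, last, comp_start, comp_len, skip
```

       For example, with chlen = 1000 and compressed sizes 400, 350, 500,
       and 120, a request for 1000 bytes at offset 1500 touches chunks 1 and
       2, which occupy 850 compressed bytes starting at compressed offset
       400; the first 500 decompressed bytes are discarded.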

TRADEOFFS
       Speed  True  random file access is not realized, since any access, even
	      for a single byte, requires that a 64kB chunk be read and decom‐
	      pressed.	This is slower than accessing a flat text file, but is
	      much, much faster than performing serial access on a fully  com‐
	      pressed file.

       Space  For  the	textual	 dictionary databases we are working with, the
	      use of 64kB chunks and maximal LZ77 compression realizes a  file
	      which  is only about 4% larger than the same file compressed all
	      at once.
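
       The space overhead can be estimated on other data with a sketch such
       as the following.  It is an approximation under stated assumptions,
       not the dictzip implementation: this page does not say how chunks are
       separated, so the sketch assumes each chunk ends with a zlib
       Z_FULL_FLUSH, which resets the compressor so that the chunk can be
       inflated on its own.  The function names are hypothetical, and the
       default chunk length is the 58969-byte figure quoted earlier.

```python
import zlib


def whole_size(data: bytes, level: int = 9) -> int:
    """Size of `data` compressed as one raw deflate stream."""
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)
    return len(comp.compress(data) + comp.flush())


def chunked_size(data: bytes, chunk_len: int = 58969, level: int = 9) -> int:
    """Size of `data` compressed chunk by chunk, with a full flush after each.

    Assumption: a full flush per chunk models the cost of making every
    chunk independently decompressible.
    """
    comp = zlib.compressobj(level, zlib.DEFLATED, -15)
    total = 0
    for start in range(0, len(data), chunk_len):
        piece = comp.compress(data[start:start + chunk_len])
        piece += comp.flush(zlib.Z_FULL_FLUSH)
        total += len(piece)
    total += len(comp.flush())  # final block that terminates the stream
    return total


def overhead_percent(data: bytes) -> float:
    """Extra space used by chunked compression, in percent."""
    whole = whole_size(data)
    return 100.0 * (chunked_size(data) - whole) / whole
```

       The result depends heavily on the input; highly repetitive data shows
       a much larger relative overhead than ordinary dictionary text.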

OPTIONS
       -d or --decompress
	      Decompress.  This is the default if  the	executable  is	called
	      dictunzip.

       -c or --stdout
	      Write  output on standard output; keep original files unchanged.
	      This is only available when decompressing (because parts of  the
	      header must be updated after a write when compressing).

       -f or --force
	      Force  compression  or  decompression  even  if  the output file
	      already exists.

       -h or --help
	      Display help.

       -k or --keep
	      Do not delete the original file.

       -l or --list
	      For each compressed file, list the following fields:

		  type: dzip, gzip, or text (includes files in unknown formats)
		  crc: CRC checksum
		  date and time: from header
		  chunks: number of chunks in file
		  size: size of each uncompressed chunk
		  compr.: compressed size
		  uncompr.: uncompressed size
		  ratio: compression ratio (0.0% if unknown)
		  name: name of uncompressed file

	      Unlike gzip, the compression method is not detected.

       -L or --license
	      Display the dictzip license and quit.

       -t or --test
	      Check  the compressed file integrity.  This option is not imple‐
	      mented.  Instead, it will list the header information.

       -v or --verbose
	      Verbose. Display extra information during compression.

       -V or --version
	      Version. Display the version number and compilation options,
	      then quit.

       -s start or --start start
	      Specify the offset to start decompression, using decimal numbers.
	      The default is at the beginning of the file.

       -e size or --size size
	      Specify the size of the portion of the file to decompress, using
	      decimal numbers.	The default is the whole file.

       -S start or --Start start
	      Specify the offset to start decompression, using base64 numbers.
	      The default is at the beginning of the file.

       -E size or --Size size
	      Specify the size of the portion of the file to decompress, using
	      base64 numbers.  The default is the whole file.

       -p prefilter or --pre prefilter
	      Specify  a  shell command to execute as a filter before compres‐
	      sion or decompression of a chunk.	 The pre- and post-compression
	      filters  can be used to provide additional compression or output
	      formatting.  The filters may not increase the buffer  size  sig‐
	      nificantly.  The pre- and post-compression filters were designed
	      to provide the most general interface possible.

       -P postfilter or --post postfilter
	      Specify a shell command to execute as a filter after compression
	      or decompression.
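
       The -S and -E options take "base64 numbers", a format this page does
       not define.  The Python sketch below assumes the convention used in
       DICT index files: each character of the standard base64 alphabet is
       one 6-bit digit, most significant digit first.  The function names
       are hypothetical, and the encoding should be checked against the
       dictd documentation before relying on it.

```python
B64_ALPHABET = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/"
)


def b64_to_int(text: str) -> int:
    """Decode a base64 number (assumed: 6 bits per character, MSB first)."""
    if not text:
        raise ValueError("empty base64 number")
    value = 0
    for ch in text:
        digit = B64_ALPHABET.find(ch)
        if digit < 0:
            raise ValueError("invalid base64 digit: %r" % ch)
        value = value * 64 + digit
    return value


def int_to_b64(number: int) -> str:
    """Encode a non-negative integer in the same assumed format."""
    if number < 0:
        raise ValueError("number must be non-negative")
    digits = []
    while True:
        number, digit = divmod(number, 64)
        digits.append(B64_ALPHABET[digit])
        if number == 0:
            break
    return "".join(reversed(digits))
```

       Under this assumption, the decimal offset 155 is written "Cb"
       (2 * 64 + 27).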

CREDITS
       dictzip	was written by Rik Faith (faith@cs.unc.edu) and is distributed
       under the terms of the GNU General Public License.  If you need to dis‐
       tribute under other terms, write to the author.

       The main libraries used by this program (zlib, regex, libmaa) are dis‐
       tributed under different terms, so you may be able to use the libraries
       for  applications which are incompatible with the GPL -- please see the
       copyright notices and license information that come with the  libraries
       for  more  information, and consult with your attorney to resolve these
       issues.

SEE ALSO
       dict(1), dictd(8), gzip(1), gunzip(1), zcat(1)

				  22 Jun 1997			    DICTZIP(1)