OVDB(5) InterNetNews Documentation OVDB(5)
NAME
ovdb - Overview storage method for INN
DESCRIPTION
Ovdb is a storage method that uses the BerkeleyDB library to store
overview data. It requires version 2.6.x or later of the BerkeleyDB
library, but has mostly been tested with versions 3 and 4.
Ovdb makes use of the full transaction/logging/locking functionality of
the BerkeleyDB environment. BerkeleyDB may be downloaded from
<http://www.sleepycat.com> and is needed to build the ovdb backend.
UPGRADING
This is version 2 of ovdb. If you have a database created with a previous
version of ovdb (such as the one shipped with INN 2.3.0), your database
will need to be upgraded using ovdb_init(8). See the ovdb_init(8) man page
for upgrade instructions.
INSTALLATION
To build ovdb support into INN, specify the option "--with-berkeleydb"
when running the configure script. By default, configure will search
for a BerkeleyDB tree in several likely locations, and choose the
highest version (based on the name of the directory, e.g., BerkeleyDB.3.0)
that it finds. There will be a message in the configure output
indicating the chosen pathname.
You can override this pathname by adding a path to the option, e.g.,
"--with-berkeleydb=/usr/BerkeleyDB.3.1". This directory is expected to
have subdirectories include and lib, containing db.h and the library
itself, respectively.
The ovdb database will take up more disk space for a given spool than
the other overview methods. Plan on needing at least 1.1 KB for every
article in your spool (not counting crossposts). So, if you have 5
million articles, you'll need at least 5.5 GB of disk space for ovdb.
With BerkeleyDB 2.x, the db files are 'grow only'; the library will not
shrink them, even if data is removed, so it is a good idea to reserve
extra space above the estimate. You will also need additional space for
transaction logs: at least 100 MB. By default the transaction logs go
in the same directory as the database. To improve performance, they
can be placed on a different disk; see the DB_CONFIG section.
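The sizing rule above can be turned into a quick calculation. The following is a minimal sketch, not part of INN: the function name and the headroom argument are hypothetical, and only the 1.1 KB per article and 100 MB log figures come from this section.

```python
# Rough ovdb disk-space estimate based on the minimums quoted in this section.
KB_PER_ARTICLE = 1.1   # minimum overview space per article (from the text)
MIN_LOG_MB = 100       # minimum space for transaction logs (from the text)


def estimate_ovdb_space_gb(articles, headroom=0.0):
    """Return (database_gb, log_gb) in decimal units (1 GB = 1,000,000 KB).

    `articles` is the article count, not counting crossposts.
    `headroom` is a hypothetical safety margin, e.g. 0.25 for 25% extra,
    which is worth adding when the db files are 'grow only'.
    """
    database_gb = articles * KB_PER_ARTICLE / 1_000_000
    database_gb *= 1.0 + headroom
    log_gb = MIN_LOG_MB / 1000
    return database_gb, log_gb


if __name__ == "__main__":
    db_gb, log_gb = estimate_ovdb_space_gb(5_000_000)
    print(f"database: at least {db_gb:.1f} GB, logs: at least {log_gb:.1f} GB")
```

These are lower bounds only; actual usage depends on the spool and on the BerkeleyDB version.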
CONFIGURATION
To enable ovdb, set the ovmethod parameter in inn.conf to "ovdb". The
ovdb database is stored in the directory specified by the pathoverview
parameter in inn.conf. This is the "DB_HOME" directory. To start out,
this directory should be empty (other than an optional DB_CONFIG file;
see DB_CONFIG for details) and innd (or makehistory) will create the
files as necessary in that directory. Make sure the directory is owned
by the news user.
Other parameters for configuring ovdb are in the ovdb.conf(5)
configuration file. See also the sample ovdb.conf.
cachesize
Size of the memory pool cache, in kilobytes. The cache will have a
backing store file in the DB directory which will be at least as
big. In general, the bigger the cache, the better. Use "ovdb_stat
-m" to see cache hit percentages. For a change to this parameter to
take effect, shut down and restart INN (be sure to kill all of
the nnrpds when shutting down). Default is 8000, which is adequate
for small to medium sized servers. Large servers will probably
need at least 20000.
numdbfiles
Overview data is split between this many files. Currently, innd
will keep all of the files open, so don't set this too high or innd
may run out of file descriptors. nnrpd only opens one at a time,
regardless. This parameter may be set to one, or just a few, but only
do that if your OS supports large (>2 GB) files. Changing this parameter
has no effect on an already-established database. Default is 32.
txn_nosync
If txn_nosync is set to false, BerkeleyDB flushes the log after
every transaction. This minimizes the number of transactions that
may be lost in the event of a crash, but results in significantly
degraded performance. Default is true.
useshm
If useshm is set to true, BerkeleyDB will use shared memory instead
of mmap for its environment regions (cache, lock, etc). With some
platforms, this may improve performance. Default is false. This
parameter is ignored if you have BerkeleyDB 2.x.
shmkey
Sets the shared memory key used by BerkeleyDB when 'useshm' is
true. BerkeleyDB will create several (usually 5) shared memory
segments, using sequentially numbered keys starting with 'shmkey'.
Choose a key that does not conflict with any existing shared memory
segments on your system. Default is 6400. This parameter is only
used with BerkeleyDB 3.1 or newer.
pagesize
Sets the page size for the DB files (in bytes). Must be a power of
2. Best choices are 4096 or 8192. The default is 8192. Changing
this parameter has no effect on an already-established database.
minkey
Sets the minimum number of keys per page. See the BerkeleyDB
documentation for more info. Default is based on page size:
default_minkey = MAX(2, pagesize / 2048 - 1)
The lowest allowed minkey is 2. Setting minkey higher than the
default is not recommended, as it will cause the databases to have
a lot of overflow pages. Changing this parameter has no effect on
an already-established database.
maxlocks
Sets the BerkeleyDB "lk_max" parameter, which is the maximum number
of locks that can exist in the database at the same time. Default
is 4000.
nocompact
The nocompact parameter affects expireover's behavior. The expireover
function in ovdb can do its job in one of two ways: by simply
deleting expired records from the database, or by re-writing the
overview records into a different location leaving out the expired
records. The first method is faster, but it leaves 'holes' that
result in space that cannot immediately be reused. The second
method 'compacts' the records by rewriting them.
If this parameter is set to 0, expireover will compact all
newsgroups; if set to 1, expireover will not compact any newsgroups;
and if set to a value greater than one, expireover will only
compact groups that have fewer than that number of articles.
Experience has shown that compacting has minimal effect (other than
making expireover take longer) so the default is now 1. This
parameter will probably be removed in the future.
readserver
Normally, each nnrpd process directly accesses the BerkeleyDB
environment. The process of attaching to the database (and detaching
when finished) is fairly expensive, and can result in high loads in
situations when there are lots of reader connections of relatively
short duration.
When the readserver parameter is "true", the nnrpds will access
overview via a helper server (ovdb_server -- which is started by
ovdb_init). This can also result in cleaner shutdowns for the
database, improving stability and avoiding deadlocks and corrupted
databases. If you are experiencing any instability in ovdb, try
setting this parameter to true. Default is false.
numrsprocs
This parameter is only used when readserver is true. It sets the
number of ovdb_server processes. As each ovdb_server can process
only one transaction at a time, running more servers can improve
reader response times. Default is 5.
maxrsconn
This parameter is only used when readserver is true. It sets a
maximum number of readers that a given ovdb_server process will
serve at one time. This means the maximum number of readers for
all of the ovdb_server processes is (numrsprocs * maxrsconn).
Default is 0, which means an unlimited number of connections is
allowed.
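Two of the parameters above have values derived by simple arithmetic: the default minkey and the total reader limit when readserver is true. The sketch below illustrates both. The function names are hypothetical and are not part of ovdb, and integer division is assumed for the minkey formula (for power-of-2 page sizes this gives the same result as exact division).

```python
def default_minkey(pagesize):
    """Default minkey from the formula above: MAX(2, pagesize / 2048 - 1).

    Integer division is an assumption made for this sketch.
    """
    if pagesize <= 0 or pagesize & (pagesize - 1):
        raise ValueError("pagesize must be a power of 2")
    return max(2, pagesize // 2048 - 1)


def max_readers(numrsprocs, maxrsconn):
    """Total reader limit across all ovdb_server processes.

    Returns None when maxrsconn is 0, which means unlimited.
    """
    if maxrsconn == 0:
        return None
    return numrsprocs * maxrsconn


if __name__ == "__main__":
    for size in (4096, 8192):
        print(f"pagesize {size}: default minkey {default_minkey(size)}")
    print("reader limit with defaults:", max_readers(5, 0))
```

With the default page size of 8192 the formula yields 3, and with 4096 the lower bound of 2 applies.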
DB_CONFIG
A file called DB_CONFIG may be placed in the database directory to
customize where the various database files and transaction logs are
written. By default, all of the files are written in the "DB_HOME"
directory. One way to improve performance is to put the transaction logs on
a different disk. To do this, put:
DB_LOG_DIR /path/to/logs
in the DB_CONFIG file. If the pathname you give starts with a /, it is
treated as an absolute path; otherwise, it is relative to the "DB_HOME"
directory. Make sure that any directories you specify exist and have
proper ownership/mode before starting INN, because they won't be
created automatically. Also, don't change the DB_CONFIG file while
anything that uses ovdb is running.
Another thing that you can do with this file is to split the overview
database across multiple disks. In the DB_CONFIG file, you can list
directories that BerkeleyDB will search when it goes to open a
database.
For example, let's say that you have pathoverview set to /mnt/overview
and you have four additional file systems created on /mnt/ov?. You
would create a file "/mnt/overview/DB_CONFIG" containing the following
lines:
set_data_dir /mnt/overview
set_data_dir /mnt/ov1
set_data_dir /mnt/ov2
set_data_dir /mnt/ov3
set_data_dir /mnt/ov4
(For BerkeleyDB 2.x, replace "set_data_dir" with "DB_DATA_DIR".)
Distribute your ovNNNNN files among the four additional file systems
(say, 8 each). When called upon to open a database file, the db library
will look for it in each of the specified directories (in order). If said
file is not found, one will be created in the first of those
directories.
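The layout described above can be planned before any files are moved. The following is a minimal sketch under stated assumptions: the helper names are hypothetical, the five-digit zero-padded file names (ov00000, ov00001, ...) are an assumed reading of "ovNNNNN", and the code only computes a plan and the DB_CONFIG lines; it does not touch the database.

```python
def plan_layout(numdbfiles, data_dirs):
    """Assign each database file to a directory in equal contiguous runs."""
    per_dir = -(-numdbfiles // len(data_dirs))  # ceiling division
    plan = {}
    for i in range(numdbfiles):
        name = f"ov{i:05d}"  # assumed naming scheme, see the note above
        plan[name] = data_dirs[i // per_dir]
    return plan


def db_config_lines(home, data_dirs):
    """Build set_data_dir lines, listing DB_HOME first so that newly
    created files land there (BerkeleyDB 3.x or later syntax)."""
    return [f"set_data_dir {d}" for d in [home, *data_dirs]]


if __name__ == "__main__":
    dirs = ["/mnt/ov1", "/mnt/ov2", "/mnt/ov3", "/mnt/ov4"]
    for line in db_config_lines("/mnt/overview", dirs):
        print(line)
    for name, target in plan_layout(32, dirs).items():
        print(f"{name} -> {target}")
```

With the default numdbfiles of 32 and four directories, each directory receives 8 files, matching the example.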
Whenever you change DB_CONFIG or move database files around, make sure
all news processes that use the database are shut down first (including
nnrpds).
The DB_CONFIG functionality is part of BerkeleyDB itself, rather than
something provided by ovdb. See the BerkeleyDB documentation for
complete details for the version of BerkeleyDB that you're running.
RUNNING
When starting the news system, rc.news will invoke ovdb_init.
ovdb_init must be run before using the database. It performs the
following tasks:
· Creates the database environment, if necessary.
· If the database is idle, it performs a normal recovery. The
recovery will remove stale locks, recreate the memory pool cache, and
repair any damage caused by a system crash or improper shutdown.
· Starts the DB housekeeping processes (ovdb_monitor) if they're not
already running.
And when stopping INN, rc.news kills the ovdb_monitor processes after
the other INN processes have been shut down.
DIAGNOSTICS
Problems relating to ovdb are logged to news.err with "OVDB" in the
error message.
INN programs that use overview will fail to start up if the
ovdb_monitor processes aren't running. Be sure to run ovdb_init before
running anything that accesses overview.
Also, INN programs that use overview will fail to start up if the user
running them is not the "news" user.
If a program accessing the database crashes, or otherwise exits
uncleanly, it might leave a stale lock in the database. This lock
could cause other processes to deadlock on that stale lock. To fix
this, shut down all news processes (using "kill -9" if necessary) and
then restart. ovdb_init should perform a recovery operation which will
remove the locks and repair damage caused by killing the deadlocked
processes.
FILES
inn.conf
The ovmethod and pathoverview parameters are relevant to ovdb.
ovdb.conf
Optional configuration file for tuning. See CONFIGURATION above.
pathoverview
Directory where the database goes. BerkeleyDB calls it the
'DB_HOME' directory.
pathoverview/DB_CONFIG
Optional file to configure the layout of the database files.
pathrun/ovdb.sem
A file that gets locked by every process that is accessing the
database. This is used by ovdb_init to determine whether the
database is active or quiescent.
pathrun/ovdb_monitor.pid
Contains the process ID of ovdb_monitor.
TO DO
Implement a way to limit how many databases can be open at once (to
reduce file descriptor usage); maybe using something similar to the
cache code in ov3.c.
HISTORY
Written by Heath Kehoe <hakehoe@avalon.net> for InterNetNews
SEE ALSO
inn.conf(5), innd(8), nnrpd(8), ovdb_init(8), ovdb_monitor(8),
ovdb_stat(8)
BerkeleyDB documentation: in the docs directory of the BerkeleyDB
source distribution, or on the Sleepycat web page:
<http://www.sleepycat.com/>.
INN 2.4.2 2004-06-08 OVDB(5)