presto(7)
NAME
presto - The Prestoserve pseudodevice driver
SYNOPSIS
#include <sys/presto.h>
#include <sys/prestoioctl.h>
pseudo-device presto
DESCRIPTION
The Prestoserve pseudodevice driver presto caches synchronous writes in
nonvolatile memory, so that synchronous writes are performed at memory
speeds rather than at disk speeds. When a synchronous write hits a
block that is already in the Prestoserve cache, the earlier physical
disk write for that block is never performed; only the last write is
actually written to disk by Prestoserve. Therefore, 50% to 65% of all
the physical disk write operations are avoided, because every
sequential NFS write to a file also causes the inode and the indirect
block to be synchronously written.
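The arithmetic behind that range can be illustrated with a deliberately
simplified model. The model is an assumption made for this sketch and is
not taken from Prestoserve: each of n sequential NFS writes to one file
issues three synchronous writes (data block, inode, indirect block), and
the cache absorbs every inode and indirect-block write except the last
one of each. The function names are hypothetical.

```c
#include <assert.h>

/* Physical writes with no cache: data + inode + indirect block
 * for each of the n NFS writes. */
long writes_uncached(long n)
{
    assert(n > 0);
    return 3 * n;
}

/* Physical writes in the simplified cached model: every data block
 * once, plus the final inode and the final indirect block. */
long writes_cached(long n)
{
    assert(n > 0);
    return n + 2;
}

/* Percentage of physical writes avoided, rounded down. */
long percent_avoided(long n)
{
    long before = writes_uncached(n);
    return 100 * (before - writes_cached(n)) / before;
}
```

Under these assumptions, percent_avoided(4) is 50 and the value
approaches two thirds as n grows, which is roughly in line with the
range quoted above. Real workloads will differ.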
The presto driver is layered on other disk drivers, and it intercepts
the other drivers' I/O requests by replacing the entry points of the
original driver in the bdevsw and cdevsw tables. The presto driver
caches the intercepted synchronous write requests in the Prestoserve
cache's nonvolatile memory. When Prestoserve needs to perform actual
I/O, it calls the original driver's entry points to perform the I/O. A
modified form of Least Recently Used (LRU) replacement determines when
the Prestoserve cache data needs to be written to the intended disks.
An accelerated disk device (one that has the presto pseudodevice driver
layered on top of its driver) uses the same major and minor device
numbers that it used before it was accelerated.
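The entry-point replacement described above can be sketched in user
space. This is only an illustration: the table layout, the strategy
signature, and the names presto_attach, presto_strategy, and
disk_strategy are hypothetical and far simpler than a real bdevsw
entry. The sketch caches every synchronous write and passes everything
else to the saved entry point.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, much-simplified stand-in for one device switch slot. */
typedef int (*strategy_fn)(int major, int blkno, int is_sync_write);

struct devsw_entry {
    strategy_fn d_strategy;
};

#define NMAJOR 4
struct devsw_entry devsw[NMAJOR];    /* simulated switch table        */
strategy_fn saved_strategy[NMAJOR];  /* original drivers' entry points */
int cached_writes;                   /* requests absorbed by the cache */
int disk_requests;                   /* requests seen by the disk driver */

/* Stand-in for the real disk driver's strategy routine. */
int disk_strategy(int major, int blkno, int is_sync_write)
{
    (void)major; (void)blkno; (void)is_sync_write;
    disk_requests++;
    return 0;
}

/* Interposed entry point: cache synchronous writes, forward the rest
 * to the entry point that was saved for this major number. */
int presto_strategy(int major, int blkno, int is_sync_write)
{
    assert(major >= 0 && major < NMAJOR);
    assert(saved_strategy[major] != NULL);
    if (is_sync_write) {
        cached_writes++;
        return 0;
    }
    return saved_strategy[major](major, blkno, is_sync_write);
}

/* Layer the sketch driver on an existing driver: save the original
 * entry point and install the interposed one in the same slot. */
void presto_attach(int major)
{
    assert(major >= 0 && major < NMAJOR);
    assert(devsw[major].d_strategy != NULL);
    saved_strategy[major] = devsw[major].d_strategy;
    devsw[major].d_strategy = presto_strategy;
}
```

Because only the slot contents change, callers keep using the same
major number before and after presto_attach is called.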
The Prestoserve nonvolatile memory must be found at boot time in order
for Prestoserve to perform its write-caching function. In addition,
Prestoserve must pass diagnostic tests, and there must be sufficient
backup battery power to guarantee a reasonable amount of cache data
stability (measured in days or weeks) in the event of a power or
hardware failure.
If the Prestoserve nonvolatile memory is not found or if there is not
enough backup battery power, then the disks are not accelerated;
however, they can be opened and used as usual. In this case, the presto
driver simply passes all I/O requests directly through to the
appropriate device.
Operation
When Prestoserve is in the PRUP state, the presto driver caches in
nonvolatile memory all synchronous write requests that it receives for
enabled file systems, and it writes the Prestoserve cache data
asynchronously to the intended disks.
When Prestoserve is in the PRDOWN state, there is no data in the
Prestoserve cache, no data is put into the Prestoserve cache, and all
disk requests are passed directly to the real disk driver.
When Prestoserve is in the PRERROR state, the data in the Prestoserve
cache cannot be written to the intended disks because of a disk,
system, or hardware error.
When the system is shut down normally by using the reboot system call
from the shutdown, halt, or reboot command, the Prestoserve cache data
is written to the intended disks, and Prestoserve enters the PRDOWN
state. This allows you to fix any system or disk error or to upgrade or
change your system without losing the data in the Prestoserve cache or
corrupting your disks.
If your system was shut down without following normal shutdown
procedures, and you reboot the system, any data in the Prestoserve
cache is written to the intended disks, if possible. If the data is
successfully written to the intended disks (and if the nonvolatile
memory and backup battery passed the diagnostic tests), Prestoserve
enters the PRDOWN state. If an error occurs, Prestoserve enters the
PRERROR state.
Note
Data can exist in the Prestoserve cache after you reboot the system
only in the event of a previous power failure, disk device error, or
kernel crash that resulted from a software or hardware problem.
If an error from a disk device occurs or if the backup battery power is
insufficient, Prestoserve writes the cache data to the intended disks,
if possible, and enters the PRERROR state. When Prestoserve is in the
PRERROR state, new data that is written to a block not found in the
Prestoserve cache is passed directly to the real disk driver. If new
data is written to a block that is found in the cache, Prestoserve
replaces the existing block and attempts to write the block to the real
disk driver to determine if the error condition on that block still
exists. If the write is successful and if all the Prestoserve cache
data can be written to the intended disks, Prestoserve leaves the
PRERROR state.
The Prestoserve cache never discards data without being explicitly told
to do so by using a PRRESET ioctl command. This can be done by using
the presto -R command. This command should be used only when there is
a fatal disk error and when the data is not important.
Prestoserve ioctl Commands
The presto pseudodevice driver does not intercept ioctl commands; they
go directly to the actual disk driver. The following ioctl commands
can be performed on the Prestoserve control device /dev/pr0. Some ioctl
commands affect all of Prestoserve operation, while others affect only
a particular accelerated file system.
The argument to ioctl is a pointer to a presto_status structure, which
contains battery status information, Prestoserve state, current and
maximum nonvolatile memory sizes, and various Prestoserve statistics.
The argument to ioctl is a pointer to a presto_status structure, which
contains extended chargeable battery status information, Prestoserve
state, current and maximum nonvolatile memory sizes, and various
Prestoserve statistics.
For PRSETSTATE, the argument to ioctl is a pointer to an int, which can
be either PRUP to enable Prestoserve or PRDOWN to disable Prestoserve.
When a system reboots, Prestoserve is in the PRDOWN state and must be
explicitly enabled by an ioctl. You enable Prestoserve by using the
presto -u command. You can also automatically enable Prestoserve by
specifying the appropriate run-time variables in the /etc/rc.config
file and specifying file systems in the /etc/prestotab file. The
prestosetup command provides you with an interactive facility to set up
Prestoserve. When Prestoserve goes from the PRDOWN state to the PRUP
state, the Prestoserve I/O statistics are reset. When Prestoserve goes
from the PRUP state to the PRDOWN state, all the Prestoserve buffers
are written to the intended disks, and the buffers are invalidated.
For PRSETMEMSZ, the argument to ioctl is a pointer to an int, which is
the size in bytes of the Prestoserve nonvolatile memory to be used.
This size cannot be larger than the maximum size reported in the
presto_status structure.
For PRRESET, the argument to ioctl is ignored. Like the PRSETSTATE
ioctl, PRRESET sets the Prestoserve state to PRDOWN, but it also
reinitializes all of nonvolatile Prestoserve memory. If Prestoserve was
in the PRERROR state and some Prestoserve buffers could not be written
to the intended disks because of disk I/O errors, the data in the
buffers is lost. This is the only method you can use to force
Prestoserve to discard data that cannot be written to disk, and it can
be accomplished by using the presto -R command.
The argument to ioctl is ignored. All the data in the Prestoserve
buffers is written to the intended disks, but the buffers are not
invalidated. This command can be used by a daemon that flushes the
cache periodically to minimize the risk to data in the event of a
catastrophic failure. The cache data can be flushed to the intended
disks by using the presto -F command.
The argument to ioctl is a pointer to a struct uprtab. On input, the
upt_bmajordev and upt_unit fields specify the block device major number
and unit number of the device whose struct uprtab should be returned.
The upt_bmajordev and upt_unit fields are set to NODEV, which is
defined in the header file <sys/param.h>, if the requested device does
not exist or if it is not accelerated. The struct uprtab contains a
upt_enabled field that is a bit vector indexed by a partition number
and that indicates whether the partition has Prestoserve caching
enabled.
The argument to ioctl is a pointer to a struct uprtab. This ioctl
returns the struct uprtab for the accelerated device with the smallest
(block device major number, unit number) pair that is greater than the
upt_bmajordev and upt_unit fields of the struct uprtab argument. This
allows each accelerated device's struct uprtab to be retrieved
sequentially by specifying the previous device's (block device major
number, unit number) pair. To get the first accelerated device's
struct uprtab, set the upt_bmajordev and upt_unit fields to NODEV. Use
the same struct uprtab that was returned on the previous call for the
next call. When the upt_bmajordev and upt_unit fields of the struct
uprtab argument are greater than or equal to the last accelerated
device's major block device number, the struct uprtab that is returned
has the upt_bmajordev and upt_unit fields set to NODEV.
For PRENABLE, the argument to ioctl is a pointer to a dev_t. This
command enables Prestoserve caching on the specified file system.
For PRDISABLE, the argument to ioctl is a pointer to a dev_t. If all
cached data for the specified file system can be successfully written
to disk, Prestoserve caching is disabled for this file system.
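The sequential retrieval of struct uprtab entries and the upt_enabled
bit vector can be illustrated with a user-space model. Only the field
names upt_bmajordev, upt_unit, and upt_enabled come from the text. The
structure layout, the NODEV_SIM stand-in for NODEV, the device table,
and the function next_uprtab (which plays the role of the ioctl call)
are assumptions made so that the sketch compiles without the
Prestoserve headers.

```c
#include <assert.h>
#include <stddef.h>

#define NODEV_SIM (-1)   /* stand-in for NODEV from <sys/param.h> */

/* Hypothetical layout; the real struct uprtab may differ. */
struct uprtab_sim {
    int upt_bmajordev;     /* block device major number            */
    int upt_unit;          /* unit number                          */
    unsigned upt_enabled;  /* bit n set: partition n is accelerated */
};

/* Simulated accelerated devices, sorted by (major, unit). */
const struct uprtab_sim accelerated[] = {
    { 8, 0, 0x05 }, { 8, 1, 0x01 }, { 21, 0, 0x80 }
};
#define NACCEL (sizeof accelerated / sizeof accelerated[0])

/* Model of the sequential-retrieval call: overwrite *u with the entry
 * whose (major, unit) pair is the smallest one greater than the pair
 * in *u, or set both fields to NODEV_SIM when there is none. */
void next_uprtab(struct uprtab_sim *u)
{
    size_t i;

    assert(u != NULL);
    for (i = 0; i < NACCEL; i++) {
        const struct uprtab_sim *a = &accelerated[i];
        int greater = a->upt_bmajordev > u->upt_bmajordev ||
                      (a->upt_bmajordev == u->upt_bmajordev &&
                       a->upt_unit > u->upt_unit);
        if (greater) {     /* NODEV_SIM sorts below every real pair */
            *u = *a;
            return;
        }
    }
    u->upt_bmajordev = NODEV_SIM;
    u->upt_unit = NODEV_SIM;
    u->upt_enabled = 0;
}

/* Test one bit of the partition bit vector. */
int partition_enabled(const struct uprtab_sim *u, unsigned part)
{
    assert(u != NULL && part < 32);
    return (int)((u->upt_enabled >> part) & 1u);
}

/* Walk all accelerated devices, reusing the same structure for each
 * call as the text prescribes. */
int count_accelerated(void)
{
    struct uprtab_sim u = { NODEV_SIM, NODEV_SIM, 0 };
    int n = 0;

    for (next_uprtab(&u); u.upt_bmajordev != NODEV_SIM; next_uprtab(&u))
        n++;
    return n;
}
```

On a real system the loop body would issue the ioctl on /dev/pr0
instead of calling next_uprtab, and would check the return value and
errno after each call.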
Diagnostic and Error Messages
This message is displayed if you attempt to use Prestoserve on a system
that has not had its license registered. It is necessary to register a
valid license in order to use Prestoserve.
During a system reboot, dirty buffers were found for a block device
with the major number N. The dirty buffers could not be written to the
device because the device was not registered with Prestoserve.
Prestoserve will remain in the ERROR state until the device with major
number N registers itself with Prestoserve, at which time the dirty
buffers will be flushed back to the device.
This message is displayed at boot time and indicates that Prestoserve
recognized its control information portion of the cache. It is a normal
Prestoserve startup message.
This message is displayed at boot time and indicates that Prestoserve
did not recognize the cache as being in either a clean (containing no
data) or a dirty (containing data) state. The message is usually
displayed when the cache is used for the first time, after the cache
has been cleared by using a diagnostic command, or after backup battery
failure.
This message is displayed at boot time, and it indicates that the cache
tested as either read/write ok or readonly ok. The message is a normal
Prestoserve startup message.
The status for the primary battery or a secondary battery, if
applicable, is reported as either OK, LOW, or DISABLED. This message is
displayed at boot time and when there is a change in the state of the
backup battery power level.
This message is displayed at boot time if Prestoserve was not shut down
by using the normal system shutdown procedures. This message indicates
that dirty buffers were found after the system rebooted. The data is
written to the intended disks as soon as possible, usually when the
first I/O request occurs for any accelerated device.
This message indicates that Prestoserve has begun to write the data in
the dirty buffers to the intended disks.
This message indicates that the data in the dirty buffers has been
successfully written to the intended disks.
This message is displayed at boot time and indicates that the kernel is
now being run with a version of the Prestoserve software that is
different from the version used previously. Usually, this message is
displayed when you first boot the system after performing a software
upgrade.
This message indicates that the block size and fragment size in the
Prestoserve control information portion of the cache are different from
the information that was expected. This message should be displayed
only when you first boot the system after performing a software
upgrade.
This message indicates that only a portion of the Prestoserve cache was
being used when the system was shut down, but now Prestoserve is using
the entire cache. The presto -s command, which changes the size of the
Prestoserve cache, is described in presto(8).
This message indicates that a hardware or software problem exists
because the size of the Prestoserve cache at reboot is less than the
size of the cache when the system shut down.
This message is displayed at boot time and indicates that Prestoserve
was not shut down normally and that the cache contents were previously
in a different system (for example, either the cache was moved or the
system ID, which is usually the on-board Ethernet hardware address, has
changed). Prestoserve allows you to do one of the following
interactively: discard the data, write the data to disk, or halt the
machine.
When these messages are displayed, Prestoserve allows you to do one of
the following interactively: continue with the boot or halt the
machine.
This message indicates that dirty Prestoserve buffers were found after
the system rebooted, and the data in the dirty buffers could not be
written to the specified device because the device failed to open. You
should verify that the device is online and that the kernel
successfully found the device at boot time. Refer to errno(2) for a
complete description of the error.
This message indicates that the specified device failed to open. You
should check your disk configuration and make sure that the drive is
online. Refer to errno(2) for a complete description of the error.
This message indicates that an ioctl failed for the specified device.
Refer to errno(2) for a complete description of the error.
These messages indicate that the disk controller for the specified
device could not directly address the cache when Prestoserve was
enabled on the file system.
This message is displayed when Prestoserve is in the PRERROR state and
receives a request to write the data in a dirty buffer to the intended
disks.
Presto-ized in this kernel!
This message indicates that Prestoserve was not shut down cleanly, and
the system was previously running a kernel with an accelerated device
that the current kernel does not accelerate. You should boot a kernel
that accelerates all the devices that were previously accelerated.
This message is displayed at system startup if a Prestoserve cache
read/write error occurred, and it indicates that the cache could not be
accessed. It indicates a hardware or software error.
This message indicates that the Prestoserve cache failed the read/write
test at the specified address.
This message indicates that an I/O error occurred on the specified disk
during a Prestoserve write-back operation.
This message indicates that there is inadequate backup battery power.
Prestoserve attempts to write all Prestoserve cache data to the
intended disks and then enters the PRERROR state.
This message indicates that Prestoserve disabled itself because of
inadequate backup battery power or because a disk error occurred during
a write to disk.
This message indicates that a disk error or low backup battery power
condition has been corrected and that Prestoserve is enabled again.
ERRORS
The Prestoserve ioctl commands set errno to the specified values for
the following conditions:
Prestoserve is not registered for use on this system.
A caller whose uid is not root tried to use the PRSETSTATE, PRSETMEMSZ,
PRRESET, PRENABLE, or PRDISABLE command.
An attempt was made to use the PRSETSTATE, PRSETMEMSZ, or PRDISABLE
command, but a fatal disk error or a battery problem exists.
The memory size specified in the PRSETMEMSZ command exceeds the maximum
size of the cache reported in the presto_status structure.
Prestoserve was not successfully started at boot time, or the PRENABLE
or PRDISABLE command was used and the device specified by dev_t is not
a device initialized for use with Prestoserve, or an error occurred in
trying to open/ioctl the device.
An invalid argument was specified with the PRSETMEMSZ or PRSETSTATE
command, or an invalid command was used.
FILES
/dev/pr0
Generic Prestoserve control device
SEE ALSO
errno(2), ioctl(2), presto(8), dxpresto(8X), prestoctl_svc(8),
prestosetup(8), prestotab(4)
Guide to Prestoserve