cpr(1)									cpr(1)

NAME
     cpr - checkpoint and restart processes; info query; delete statefiles
     cview - graphical user interface for checkpoint and restart (CPR)

SYNOPSIS
     cpr -c pathname -p id[:type][,id[:type]...] [ -fgkuw ]
     cpr -i pathname ...
     cpr [ -jmw ] -r pathname ...
     cpr -D pathname ...

     cview [ -display XwindowDisplay ]

DESCRIPTION
     IRIX Checkpoint and Restart (CPR) offers a set of user-transparent
     software management tools, allowing system administrators, operators, and
     users with suitable privileges to suspend a job or a set of jobs in
     mid-execution and restart them later.  The jobs may be running on a
     single machine or on an array of network-connected machines.  CPR may
     be used to enhance system availability, provide load and resource control
     or balancing, and to facilitate simulation or modeling.

     The cview command provides an X Window System interface to CPR, and is
     composed of two decks:  the Checkpoint Control Panel and the Restart
     Control Panel.  As of the IRIX 6.5.16 release, new features are no
     longer being added to the cview command.  The cview command will be
     removed in the next major release of IRIX.

OPTIONS
     Use the -c, -i, -r, and -D options to create, query, restart, and delete
     checkpoints, respectively.

   Create Checkpoint
     -c	  Checkpoint a process or set of processes and create a statefile
	  directory in pathname, based on the process id specified after -p.

     -f	  Force overwrite of an existing pathname, so existing statefiles are
	  replaced with new ones according to the new checkpoint.

     -g	  Have checkpoint target processes continue running (go) after this
	  checkpoint is finished.  This overrides the default WILL policy, and
	  the WILL policy specified in a user's CPR attribute file.

     -k	  Kill checkpoint target processes after this checkpoint is finished.
	  This is the default WILL policy, but overrides a CONT setting in the
	  user's CPR attribute file (see below).

     -u	  Use this option only when issuing a checkpoint immediately before an
	  operating system upgrade.  This forces a save of all executable
	  files and DSO libraries used by the current processes, so that
	  target processes can be restarted in an upgraded environment.	 This
	  flag must be used again if restarted processes must be recursively
	  checkpointed in the new environment.

     -w	  Specify that CPR use the attribute file located in the current
	  working directory (instead of $HOME/.cpr).

     -p	  Specifies the process or set of processes to checkpoint.  Processes
	  may have any type in the following list:

	  PID	  for Unix process and POSIX pthread ID (the default type)
	  GID	  for Unix process group ID
	  SID	  for Unix process session ID; see termio(7)
	  ASH	  for IRIX Array Session ID; see array_services(5)
	  HID	  for process hierarchy (tree) rooted at that PID
	  SGP	  for IRIX sproc shared group; see sproc(2)
	  JID	  for IRIX job ID; see job_limits(5)

     If type is not given in a checkpoint request, id is interpreted as a
     PID, the default type.  Here are some examples:

      cpr -c ckpt01 -p 1111
      cpr -c ckpt02 -p 2222:GID

     The first example checkpoints a process with PID 1111 to the statefile
     directory ./ckpt01.  The second example checkpoints all processes with
     process group ID 2222 to the statefile directory ./ckpt02.

     Users may checkpoint an arbitrary set of processes into one statefile by
     specifying several comma-separated ids (each with an optional type) after
     the -p flag, as in this example:

      cpr -c ckpt03 -p 111:GID,222,333:SID

     This saves all processes with process group ID 111, process ID 222, and
     process session ID 333 into the statefile directory ./ckpt03.

     Only the superuser and the owner of a process or set of processes (the
     checkpoint owner) can checkpoint the targeted processes.
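
     The id list syntax above can be wrapped in a small shell function.  The
     following is a minimal sketch, not part of CPR:  checkpoint_set is a
     hypothetical helper name, the IDs in the example are made up, and only
     the -c, -p, and -g options documented above are used.

```shell
# checkpoint_set DIR SPEC [SPEC...]
# Hypothetical helper (not part of CPR).  Joins id[:type] specs with
# commas and checkpoints them into one statefile directory.
checkpoint_set() {
    if [ "$#" -lt 2 ]; then
        echo "usage: checkpoint_set dir id[:type] [id[:type]...]" >&2
        return 2
    fi
    dir=$1
    shift
    specs=
    for s in "$@"; do
        # Append each spec, inserting a comma only between entries.
        specs="${specs:+$specs,}$s"
    done
    # -g lets the target processes continue running after the checkpoint.
    cpr -c "$dir" -p "$specs" -g
}

# Example with made-up IDs: the process tree rooted at PID 4444 plus
# process group 555, saved into ./ckpt04.
#   checkpoint_set ckpt04 4444:HID 555:GID
```

     Dropping -g from the sketch leaves the decision to the WILL policy, which
     by default makes the target processes exit after the checkpoint.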

   Checkpoint Info
     -i statefile ...
	  Provides information about existing CPR statefile(s):	 the statefile
	  revision number, process name(s), credential information of the
	  process, current working directory, open file information, and the
	  time when the checkpoint was performed.

   Restart Checkpoint
     -r statefile ...
	  Restarts a process or set of processes from the statefile.  If a
	  restart involves more than one process, the restart of all
	  processes must succeed before any process starts running;
	  otherwise, all restarts are aborted.

     -j	  Make processes interactive and job controllable.  If a checkpoint is
	  issued against an interactive process or a group of processes rooted
	  at an interactive process, it can be restarted interactively with
	  the -j option.  It runs in the foreground, even if the original
	  process ran in the background.  Users may issue job control signals
	  to background the process if desired.  An interactive job is defined
	  as a process with a controlling terminal; see termio(7).  Only one
	  controlling terminal is restored even if the original process had
	  multiple controlling terminals.

     -m	  Migrate process memory at restart time.  This option migrates
	  process memory so it is restored to the location in the system
	  topology where the restart is executing (within a specific cpuset,
	  within the global cpuset, etc.).  Without this option, the default
	  restart behavior on NUMA systems is to restore process memory back
	  to where it was at the time of the checkpoint.  This option has no
	  effect on non-NUMA systems.  See the migration(3) man page for
	  scenarios which may prevent pages from migrating properly.

     -w	  Specify that CPR use the attribute file located in the current
	  working directory (instead of $HOME/.cpr).

     Note that statefiles remain unchanged after a restart unless users use
     the -D option to delete them.

     A restart may fail for a number of reasons, including:

     Resource Limitation:  This happens when the original PID is not available
     and the application may not use another PID; or when certain
     application-related files, binaries, or libraries are no longer available
     on the system and the REPLACE or SUBSTITUTE action was not set for them
     at checkpoint time; or when other system resources such as memory or
     disk space run out during the restart.

     File Contents Change:  If the CONTENTS action was used for FILE policies
     in the user's CPR attribute file, the restart could fail if file contents
     have changed between checkpoint and restart.  See the FILES section
     below for more information.

     Security and Data Integrity:  Restart fails if the restarting user lacks
     the proper permission to restart the statefile, or if the restart
     destroys or replaces data without proper permission.  The basic rule is
     that only the superuser and checkpoint owner can restart the processes.
     This implies that if the superuser checkpoints a process owned by a
     regular user, only the superuser has permission to restart it.

     Other Fatal Failures:  Restart fails if important parts of the original
     processes cannot be restored for any other reason.

   Delete Checkpoint

     -D statefile ...
	  Delete one or more statefiles.  After a successful restart,
	  statefiles might no longer be needed, and may be removed.  The
	  delete option removes all files associated with the statefile,
	  including saved open files, mapped files, pipe data, etc.  Only the
	  superuser and checkpoint owner may delete a statefile directory.
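
     The query, restart, and delete operations described above can be chained.
     The following is a minimal sketch under stated assumptions:
     restart_then_delete is a hypothetical helper name, and the sketch assumes
     that cpr returns a nonzero exit status when an operation fails, which
     this page does not document.

```shell
# restart_then_delete STATEFILE
# Hypothetical helper (not part of CPR).  Queries a statefile, restarts
# from it, and deletes it only if the restart reported success.
# Assumption: cpr exits nonzero on failure.
restart_then_delete() {
    sf=$1
    # Show what the statefile contains before acting on it.
    cpr -i "$sf" || return 1
    if cpr -r "$sf"; then
        # The statefile is left unchanged by a restart, so remove it here.
        cpr -D "$sf"
    else
        echo "restart of $sf failed; statefile kept" >&2
        return 1
    fi
}

# Example with the statefile directory from the earlier examples:
#   restart_then_delete ckpt01
```

     Keeping the statefile after a failed restart allows another attempt once
     the cause (for example, an unavailable PID) has been resolved.  Whether
     it is safe to delete the statefile as soon as the restart returns depends
     on the job.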

   Cview Window
     How to Checkpoint:	 Under the STEP I button, select a process or set of
     processes from the list.  To checkpoint a process group, a session group,
     an IRIX array session, a process hierarchy, or an sproc shared group,
     select a category from the Individual Process drop-down menu.  In the
     filename field below, enter the name of a directory for storing the
     statefile.	 Click the STEP II button if you want to change checkpoint
     options, such as whether to exit or continue the process, or control open
     file and mapped file dispositions.	 Click the STEP III OK button to
     initiate the checkpoint, or the Cancel Checkpoint button to discontinue.

     How to Restart:  Click the Restart Control Panel tab at the bottom of the
     cview window.  From the scrolling list of files and directories, select a
     statefile to restart.  Note that all files and directories are shown, not
     just statefile directories.  If a statefile is located somewhere besides
     your home directory, change directories using the icon finder at the top.
     Select any options you want, such as whether to retain the original
     process ID, whether to restore the original working directory, or whether
     to restore the original root directory.  Click the OK Go Restart button
     to initiate restart.

     Querying a Statefile:  From the scrolling list of files and directories,
     select a statefile to query.  At the bottom of the cview window, click
     the Tell Me More About This Statefile button.

     Deleting a Statefile:  From the scrolling list of files and directories,
     select a statefile to delete.  At the bottom of the cview window, click
     the Remove This Statefile button.

SIGNALS AND EVENT HANDLING
     Two signals, SIGCKPT and SIGRESTART, are designed to give application
     programs adequate warning to take special action upon checkpoint or at
     restart time.  The default action is to ignore both signals unless
     applications catch the signals; see signal(2).  By catching the signals,
     an application gets an opportunity to set up its signal handler and be
     prepared for checkpoint or restart.  An application can clean up files,
     flush buffers, close or reconnect socket connections, etc.

     Meanwhile, the main CPR process waits as long as necessary for the
     application to finish the signal handling, before cpr proceeds with
     further checkpoint activities after SIGCKPT.  At restart the first thing
     an application runs is the SIGRESTART signal handler, if the application
     is catching the signal.

     However, these two signals (SIGCKPT and SIGRESTART) are not recommended
     for direct use by applications wishing to be checkpointed.  Instead,
     applications should call atcheckpoint(3C) and/or atrestart(3C) to
     register event handlers for checkpoint and restart, and to activate
     signal handling.  This is especially important for applications that
     need to register multiple callback handlers for checkpoint or restart
     events.  Use of atcheckpoint(3C) and atrestart(3C) also ensures that
     registered signal handlers are invoked only when a checkpoint or restart
     of the application is in progress (as opposed to the user sending the
     signals directly via a function such as sigsend(2)).

     Warning: applications that catch the two CPR signals directly may undo
     all of the CPR signal handler registration provided by atcheckpoint(3C)
     and atrestart(3C), including handlers that some libraries reserve without
     the application programmer's knowledge.

FILES
     statefile	      Directory containing images of checkpointed processes
     $HOME/.cpr	      User-configurable options for checkpoint and restart
     /etc/cpr_proto   Attribute file prototype for creating $HOME/.cpr

     /usr/lib/X11/app-defaults/Cview  Application defaults file
     /usr/lib/images/Cview.icon	      Image for minimized window

     The $HOME/.cpr files control CPR behavior, and consist of one or more
     CKPT attribute definitions, each in the following form:

	  CKPT IDtype IDvalue {
	      policy:  instance:  action
	      ...
	  }

     The IDtype is the same as for the -p option; see above.  The IDvalue is
     the process or process set ID.  Both can be given as a star (*) to
     represent any IDtype or IDvalue.

     Here are the policy keywords and what they control:

      FILE	policies for handling open files
      WILL	actions on the original process after checkpoint
      CDIR	policy on the original working directory; see chdir(2)
      RDIR	policy on the original root directory; see chroot(2)
      FORK	policy on the process ID to be restored when recreating a
		process; policy on the job ID to be restored if recreating
		an IRIX job structure; this policy does not affect the
		amount of data restarted, it simply allows the user to
		avoid ID collision failures if collisions should occur
      PLACEMENT policy on process and memory placement

     FILE takes an instance, which is the filename.

     FORK can take instances PID or JID.  If no instance is specified, the
     specified action is applied to all instances.

     FILE offers the following action keywords:

      MERGE	 upon restart, reopen the file and seek to the previous offset
		 (default action)
      IGNORE	 upon restart, reopen the file as originally opened
      APPEND	 upon restart, reopen the file for appending
      REPLACE	 save file at checkpoint; replace the original file at
		 restart.  See NOTES section for special considerations
		 regarding file REPLACE actions with setuid/setgid programs.
      SUBSTITUTE save file at checkpoint; at restart, open the saved file as
		 an anonymous substitute, not touching the original file
      CONTENTS	 calculate checksum (currently MD5) on the file at checkpoint;
		 upon restart, detect whether the file has been modified
		 between beginning-of-file and file-size-at-checkpoint; if the
		 file has been modified in this area, the restart is refused;
		 otherwise seek to the previous offset and continue.

     WILL offers the following action keywords:

      EXIT	 the original process exits after checkpoint (default action)
      CONT	 the original process continues to run after checkpoint

     CDIR and RDIR offer the following action keywords:

      REPLACE	 restore original current working directory or root directory
		 (default action)
      IGNORE	 ignore original current working directory or root directory;
		 restart according to new process environment

     FORK offers the following action keywords:

      ORIGINAL	 attempt to recover the original process ID when recreating a
		 process; attempt to recover the original job ID if recreating
		 an IRIX job structure (default action)
      ANY	 it is acceptable to recreate the process with any process ID;
		 if recreating an IRIX job structure, it is acceptable to
		 recreate the job with any job ID

     PLACEMENT offers the following action keywords:

      FLEXIBLE	 upon restart, if process memory placement fails when adhering
		 to the checkpointed placement policies, attempt to place
		 process memory according to a basic memory placement
		 algorithm (TOPOLOGY_FREE) and print a message stating that
		 this action was taken; if a process cannot be restricted
		 to the CPU it was restricted to at the time of checkpoint,
		 allow the process to run on any CPU (see MP_RUNANYWHERE in
		 sysmp(2)) and print a message stating that this action
		 was taken (default action)

      STRICT	 upon restart, restore the memory of a process according to
		 the placement policies saved at the time of checkpoint (the
		 restart may fail if memory can no longer be placed according
		 to the checkpointed placement policies due to machine
		 configuration changes, lack of available memory, etc.); upon
		 restart, restore a process which was restricted to a specific
		 CPU at the time of checkpoint to the same CPU (the restart
		 may fail if the process can no longer be restricted to that
		 CPU due to machine configuration changes, etc.)
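
     The policy and action keywords above combine into an attribute file such
     as the one written by the following sketch.  write_cpr_attrs is a
     hypothetical helper, /tmp/job.log is a made-up filename, and the layout
     of entries without an instance field (WILL, CDIR, PLACEMENT) is an
     assumption based on the policy: instance: action form shown above;
     compare the result with /etc/cpr_proto before relying on it.

```shell
# write_cpr_attrs FILE
# Hypothetical helper (not part of CPR).  Writes a sample attribute
# definition to FILE so it can be reviewed before being installed.
write_cpr_attrs() {
    cat > "$1" <<'EOF'
CKPT HID * {
    FILE: /tmp/job.log: APPEND
    WILL: CONT
    CDIR: IGNORE
    FORK: PID: ANY
    PLACEMENT: FLEXIBLE
}
EOF
}

# Example: write the sample to a scratch file for inspection.
#   write_cpr_attrs ./cpr_sample
```

     As written, the definition applies to any process hierarchy (HID with a
     star IDvalue):  the made-up log file is reopened for appending, the
     original processes continue after checkpoint, the working directory is
     taken from the restarting environment, and any process ID is acceptable
     at restart.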

NOTES
     Due to the nature of UNIX checkpoint and restart, it is impossible to
     claim that everything a process owns or connects with can be restored.
     The bullet items below attempt to list what is supported, and what is
     known not to be supported.  For system objects not covered below, safety
     decisions must be made by application programmers and users.

     The following system objects are checkpoint-safe:

     o	  UNIX processes, process groups, terminal control sessions, IRIX
	  array sessions, process hierarchies, sproc(2) groups, POSIX pthreads
	  (pthread_create(3P)), arbitrary process sets, and IRIX jobs

     o	  all user memory areas, including user stack and data regions

     o	  system states, including process and user information, signal
	  disposition and signal mask, scheduling information, owner
	  credentials, accounting data, resource limits, current directory,
	  root directory, locked memory, and user semaphores

     o	  system calls, if applications handle return values and error numbers
	  correctly, although slow system calls may return partial results

     o	  undelivered and queued signals are saved at checkpoint and delivered
	  at restart

     o	  open files (including NFS-mounted files), mapped files, file locks,
	  and inherited file descriptors

     o	  special files /dev/tty, /dev/console, /dev/zero, /dev/null,
	  ccsync(7M)

     o	  open pipes, pipeline data, and STREAMS pipe read and write message
	  modes

     o	  System V shared memory

     o	  POSIX semaphores (psema(D3X))

     o	  semaphore and lock arenas (usinit(3P))

     o	  jobs started with CHALLENGEarray services, provided they have a
	  unique ASH number; see array_services(5)

     o	  applications using node-lock licenses; see the IRIX Checkpoint and
	  Restart Operation Guide for what to do about applications using
	  floating licenses

     o	  applications using the prctl() PR_ATTACHADDR option; see prctl(2)

     o	  applications using blockproc and unblockproc; see blockproc(2)

     o	  R10000 counters; see libperfex(3C) and perfex(1)

     o	  capabilities, Mandatory Access Control (MAC) labels, and Access
	  Control Lists (ACLs); see capabilities(4), DOMINANCE(5) and acl(4),
	  respectively

     The following system objects are not checkpoint-safe:

     o	  network socket connections; see socket(2)

     o	  X terminals and X11 client sessions

     o	  special devices such as tape drives and CD-ROM drives

     o	  files opened with setuid credentials that cannot be reestablished

     o	  System V semaphores and messages; see semop(2) and msgop(2)

     o	  memory mapped files using the /dev/mmem file; see mmap(2)

     o	  open directories

     The scope of process relationships saved at checkpoint time is directly
     related to the id and type options specified.  Likewise, at restart, only
     these process relationships can be restored to the state they were in at
     the time of the checkpoint.  Any other process relationships not
     encapsulated by the id and type options specified at checkpoint time will
     be inherited from the process performing the restart.  For example, if a
     user checkpoints a single process via type PID, and that process is not a
     session leader, at restart time the process will be restored within the
     same session as the process doing the restart.

     Some checkpoint-safe objects are installed as optional features.
     Examples include IRIX job limit data and Comprehensive System Accounting
     (CSA) data.  If a feature is enabled at the time of the checkpoint, and
     if the checkpoint id encapsulates feature data, CPR will store the
     pertinent feature data in the statefile and attempt to restore that data
     upon restart.  CPR attempts to restart the job so its state is exactly as
     it was at the time the job was checkpointed.  If an optional feature is
     no longer enabled at the time of the restart, the restart may fail as CPR
     cannot accurately recreate the state of the job prior to the checkpoint.

     For system security, when restarting a setuid or setgid process, REPLACE
     actions will be changed to SUBSTITUTE actions for files which have been
     modified or deleted, or to a MERGE action for all others.  If a
     SUBSTITUTE action is performed, a notice specifying the location of the
     substituted file will be displayed.  It is the user's responsibility to
     pick up any output file thus substituted.  Applications which reopen a
     substituted file by its original name may not operate as expected.

SEE ALSO
     atcheckpoint(3C), atrestart(3C), ckpt_create(3), ckpt_remove(3),
     ckpt_restart(3), ckpt_stat(3)
     IRIX Checkpoint and Restart Operation Guide

COPYRIGHT
     Portions of the IRIX Checkpoint and Restart code are derived from the RSA
     Data Security, Inc. MD5 Message-Digest Algorithm.
