rgmanager(8)		     Red Hat Cluster Suite		  rgmanager(8)

       rgmanager - Resource Group (Cluster Service) Manager Daemon

       rgmanager  handles  management  of  user-defined cluster services (also
       known as resource groups).  This includes  handling  of	user  requests
       including service start, service disable, service relocate, and service
       restart.	 The service manager daemon also handles restarting and	 relo‐
       cating services in the event of failures.

       The  service manager is spawned by an init script after the cluster in‐
       frastructure has been started and only functions when  the  cluster  is
       quorate and locks are working.

       During  initialization,	the  service manager runs scripts which ensure
       that all services are clear to be started.  After that,	it  determines
       which services need to be started and starts them.

       When a membership event is received, services are taken away from
       members which are no longer online.  Where fencing is available, the
       event should only occur after the member has been fenced.

       When  a	cluster	 member determines that it is no longer in the cluster
       quorum, the service manager stops all services and waits for a new quo‐
       rum to form.

       Rgmanager is configured via cluster.conf.  With the exception of
       logging, all of rgmanager's configuration resides within the <rm> tag.
       The general parameters for rgmanager are as follows:

       central_processing  - Enable central processing mode (requires cluster-
       wide shut down and restart of rgmanager).   This	 alternative  mode  of
       handling	 failures  externalizes	 most  of  rgmanager's features into a
       user-editable script.  This mode is disabled by default.

       status_poll_interval - This defines the amount  of  time,  in  seconds,
       rgmanager   waits  between  resource  tree  scans  for  status  checks.
       Decreasing this value may improve rgmanager's ability to	 detect	 fail‐
       ures  in services, but at a cost of decreased performance and increased
       system utilization.  The default is 10 seconds.

       status_child_max - Maximum number of status check  threads  (default  =
       5).  It is not recommended that this ever be changed.  This simply con‐
       trols how many instances of clustat queries may	be  outstanding	 on  a
       single node at any given time.

       transition_throttling - This is the amount of time the event processing
       thread stays alive after	 the  last  event  has	been  processed.   The
       default is 5 seconds.  It is not recommended that this ever be changed.

       log_level  -  DEPRECATED;  DO NOT USE.  Controls log level filtering to
       syslog.	Default	 is  5;	 valid	values	range  from  0-7.   See	 clus‐
       ter.conf(5) for the current method to configure logging.

       log_facility - DEPRECATED; DO NOT USE.  Controls the syslog facility
       used when sending messages to syslog.  Default is "daemon".  See
       cluster.conf(5) for the current method to configure logging.
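
       The documented defaults above can be summarized in a short Python
       sketch that reads the <rm> tag from a cluster.conf fragment and fills
       in any general parameter that is not set.  The sample XML, the name
       effective_rm_settings, and the assumption that <rm> is a direct child
       of the root element are illustrative only; this is not how rgmanager
       itself parses its configuration.

```python
import xml.etree.ElementTree as ET

# Defaults as stated in this page (seconds, thread count, seconds).
RM_DEFAULTS = {
    "status_poll_interval": 10,
    "status_child_max": 5,
    "transition_throttling": 5,
}

def effective_rm_settings(cluster_conf_xml):
    """Return the general <rm> parameters, using documented defaults for unset ones."""
    root = ET.fromstring(cluster_conf_xml)
    rm = root.find("rm")  # assumed to be a direct child of the root element
    attrs = rm.attrib if rm is not None else {}
    return {name: int(attrs.get(name, default)) for name, default in RM_DEFAULTS.items()}

# Hypothetical fragment: only status_poll_interval is overridden.
SAMPLE = '<cluster name="example" config_version="1"><rm status_poll_interval="30"/></cluster>'
settings = effective_rm_settings(SAMPLE)
```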

       Resource	 agents	 define resource classes rgmanager can manage.	Rgman‐
       ager follows the Open Cluster Framework Resource Agent API v1.0 (draft)
       standard, with the following two notable exceptions:

	* Rgmanager does not call monitor; it only calls status
	* Rgmanager looks for resource agents in /usr/share/cluster

       Rgmanager uses the metadata from resource agents to determine what
       parameters to look for in cluster.conf for each resource type.
       Viewing the resource agent metadata is the best way to understand all
       the various resource agent parameters.

       A service or resource group is a collection  of	resources  defined  in
       cluster.conf  for  rgmanager's  use.   Resource	groups are also called
       resource trees.

       A resource group is the atomic unit of failover in rgmanager.  That is,
       even though rgmanager calls out to various resource agents individually
       in order to start or stop various resources, everything in the resource
       group is always moved around together in the event of a relocation or
       recovery.

       Rgmanager supports only two startup policies, selected by the autostart
       parameter in the service definition:

       autostart - if set to 1 (the default), the service is  started  when  a
       quorum forms.  If set to 0, the service is not automatically started.

       Startup Policy Configuration:
	  <service name="service1" autostart="[0|1]" .../>

       Rgmanager  supports  three recovery policies for services; this is con‐
       figured by the recovery parameter in the service definition.

       restart - means to attempt to restart the resource group	 in  place  in
       the  event  of  one or more failures of individual resources.  This can
       further be augmented by the max_restarts and restart_expire_time param‐
       eters, which define a tolerance for the amount of service restarts over
       the given amount of time.

       relocate - means to move the resource group  to	another	 host  in  the
       cluster instead of restarting on the same host.

       disable - means to not try to recover the resource group.  Instead,
       just place it into the disabled state.

       Recovery Configuration:
	  <service name="service1" recovery="[restart|relocate|disable]" .../>
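
       The tolerance defined by max_restarts and restart_expire_time can be
       pictured with a small Python sketch.  It assumes that restarts older
       than restart_expire_time seconds stop counting and that at most
       max_restarts restarts are tolerated inside that window; the function
       name and the sample values are hypothetical, and what rgmanager does
       once the tolerance is exhausted is not modeled here.

```python
def restart_in_place_allowed(restart_times, now, max_restarts, restart_expire_time):
    """Return True if another in-place restart fits within the tolerance.

    restart_times: timestamps (seconds) of earlier restarts of this service.
    Restarts older than restart_expire_time seconds no longer count.
    """
    recent = [t for t in restart_times if now - t < restart_expire_time]
    return len(recent) < max_restarts

# Illustrative values: two restarts tolerated per 300 seconds.
history = [100, 250]
```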

       A failover domain is an ordered subset of members to which a service
       may be bound.  The following terms describe how the different
       configuration options affect the behavior of a failover domain:

       preferred  node or preferred member : The preferred node was the member
       designated to run a given service if the member is online. We can  emu‐
       late  this  behavior  by specifying an unordered, unrestricted failover
       domain of exactly one member.

       restricted domain : Services bound to the domain may only run on	 clus‐
       ter  members  which are also members of the failover domain. If no mem‐
       bers of the failover domain are available, the service is placed in the
       stopped state.

       unrestricted  domain  :	Services  bound	 to this domain may run on all
       cluster members, but will run on a member of the domain whenever one is
       available.  This	 means	that  if  a  service is running outside of the
       domain and a member of  the  domain  comes  online,  the	 service  will
       migrate to that member.

       ordered	domain : The order specified in the configuration dictates the
       order of preference of members within the domain.  The  highest-ranking
       member  of the domain will run the service whenever it is online.  This
       means that if member A has a higher rank than  member  B,  the  service
       will  migrate to A if it was running on B if A transitions from offline
       to online.

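       unordered domain : Members of the domain have no order of preference;
       any member may run the service.  Services will always migrate to
       members of their failover domain whenever possible; however, in an
       unordered domain, no member is preferred over another, so a service
       already running on a domain member stays where it is.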

       nofailback  :  Enabling this option for an ordered failover domain will
       prevent automated fail-back after a  more-preferred  node  rejoins  the
       cluster.	 Consequently,	nofailback requires an ordered domain in order
       to be meaningful.  When nofailback is used, the following two behaviors
       should be noted:
	* If a subset of cluster nodes forms a quorum, the node with the high‐
	est priority in the failover domain is selected to run a service bound
	to  the domain. After this point, a higher priority member joining the
	cluster will not trigger a relocation.
	* When a service is  running  outside  of  its	unrestricted  failover
	domain	and  a	cluster	 member boots which is a part of the service's
	failover domain, the service will relocate to that  member.  That  is,
	nofailback  does  not  prevent	transitions from outside of a failover
	domain to inside a failover domain. After this point, a higher	prior‐
	ity member joining the cluster will not trigger a relocation.
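
       The fail-back rules above can be condensed into a Python sketch that
       decides whether a node joining the cluster takes over a running
       service.  The function name and arguments are hypothetical, and the
       sketch assumes that a lower priority number marks a more-preferred
       member, which this page does not state.

```python
def relocates_on_join(owner, joiner, priorities, ordered, nofailback):
    """Decide whether a node joining the cluster pulls the service to itself.

    priorities maps failover domain members to their priority; a lower
    number is assumed to mean a more-preferred member.
    """
    if joiner not in priorities:
        return False  # joiner is not a domain member, nothing to prefer
    if owner not in priorities:
        return True   # outside -> inside the domain; nofailback does not block this
    if not ordered or nofailback:
        return False  # no ranking among members, or fail-back is suppressed
    return priorities[joiner] < priorities[owner]

# Hypothetical two-member domain; node3 is outside the domain.
prio = {"node1": 1, "node2": 2}
```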

       Ordering,  restriction, and nofailback are flags and may be combined in
       almost any way (e.g., ordered+restricted, unordered+unrestricted, etc.).
       These  combinations affect both where services start after initial quo‐
       rum formation and which cluster members will take over services in  the
       event that the service has failed.

       Failover Domain Configuration:
	    <failoverdomain name="NAME" ordered="[0|1]" restricted="[0|1]"
	      nofailback="[0|1]">
	      <failoverdomainnode name="node1" priority="[1..100]" />
	    </failoverdomain>
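
       How the ordered and restricted flags shape the choice of a node can
       be sketched in Python as follows.  This is a simplified model of the
       semantics described above, not rgmanager's implementation: the
       function name is hypothetical, and a lower priority number is assumed
       to mean a more-preferred member.

```python
def candidate_nodes(priorities, online, ordered, restricted):
    """List the nodes that may run a service, most preferred first.

    priorities maps failover domain members to their priority; a lower
    number is assumed to mean a more-preferred member.
    An empty result corresponds to the service being placed in the stopped state.
    """
    # Domain members that are currently online are always preferred.
    members = [n for n in online if n in priorities]
    if ordered:
        members.sort(key=lambda n: priorities[n])
    if members:
        return members
    if restricted:
        return []          # restricted: never run outside the domain
    return list(online)    # unrestricted: any cluster member will do

# Hypothetical two-member domain; node3 is outside the domain.
domain = {"node1": 1, "node2": 2}
```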

       The basic user-initiated service operations (via clusvcadm) work as
       follows.

       enable  -  start	 the  service,	optionally  on	a preferred target and
       optionally according to failover domain rules. In  absence  of  either,
       the  local  host	 where clusvcadm is run will start the service. If the
       original start fails, the service behaves as though a  relocate	opera‐
       tion  was requested (see below). If the operation succeeds, the service
       is placed in the started state.

       disable - stop the service and place it into the disabled state.  This
       is the only permissible operation when a service is in the failed state.

       relocate - move the service to another node.  Optionally, the
       administrator may specify a preferred node to receive the service, but
       the inability of the service to run on that host (e.g. if the service
       fails to start or the host is offline) does not prevent relocation, and
       another node is chosen.  Rgmanager attempts to start the service on
       every permissible node in the cluster.  If no permissible target node
       in the cluster successfully starts the service, the relocation fails
       and rgmanager attempts to restart the service on the original owner.
       If the original owner cannot restart the service, the service is
       placed in the stopped state.
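
       The relocation sequence just described can be written out as a small
       Python sketch.  The start_on callback, the function name, and the
       returned (state, node) pair are hypothetical helpers for illustration;
       the sketch models only the order in which nodes are tried.

```python
def relocate(owner, permissible, start_on, preferred=None):
    """Return (state, node) after a relocate request.

    permissible: nodes allowed to run the service (the owner may be included).
    start_on(node) -> bool is a hypothetical callback that tries to start the service.
    """
    targets = [n for n in permissible if n != owner]
    if preferred in targets:
        # Try the administrator's choice first; its failure does not abort relocation.
        targets.remove(preferred)
        targets.insert(0, preferred)
    for node in targets:
        if start_on(node):
            return ("started", node)
    # No target could start the service: fall back to the original owner.
    if start_on(owner):
        return ("started", owner)
    return ("stopped", None)
```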

       stop - stop the service and place it into the stopped state.

       migrate - migrate the virtual machine to another node.  The
       administrator must specify a target node.  Depending on the failure, a
       failed migration may leave the virtual machine in the failed state or
       in the started state on the original owner.

       freeze  -  freeze  the  service or virtual machine in place and prevent
       status checks from occurring.  Administrators may do this in  order  to
       perform	maintenance  on	 one  or more parts of a given service without
       having rgmanager interfere.  It is very important that the  administra‐
       tor  unfreezes  the  service  once maintenance is complete, as a frozen
       service will not fail over.  Freezing a service does NOT affect its
       operational  state.   For example, it does not 'pause' virtual machines
       or suspend them to disk.

       unfreeze - unfreeze (thaw) the service or virtual machine.   This  com‐
       mand makes rgmanager perform status checks on the service again.

       These are the most common service states.

       disabled	 -  The service will remain in the disabled state until either
       an administrator re-enables the service or  the	cluster	 loses	quorum
       (when  the  cluster  regains  quorum, the autostart parameter is evalu‐
       ated). An administrator may enable the service from this state.

       failed - The service is presumed dead.  A service is placed into this
       state whenever a resource's stop operation fails.  After a service is
       placed into this state, the administrator must verify that there are
       no  allocated resources (mounted file systems, etc.) prior to issuing a
       disable request. The only operation which can take place when a service
       has entered this state is a disable.

       stopped	- When in the stopped state, the service will be evaluated for
       starting after the next service or node transition.  This is considered
       a  temporary  state. An administrator may disable or enable the service
       from this state.

       recovering - The cluster is trying to recover the service. An  adminis‐
       trator may disable the service to prevent recovery if desired.

       started - If a service status check fails, rgmanager recovers the
       service according to the service recovery policy.  If the host running
       the service fails, rgmanager recovers the service following failover
       domain and exclusive service rules.  An administrator may relocate,
       stop, disable, and (with virtual machines) migrate the service from
       this state.
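
       The administrator operations permitted in each state, as listed
       above, can be collected into a lookup table.  The Python sketch below
       is only a summary of this section; the names are hypothetical, and
       freeze and unfreeze are left out because this page does not tie them
       to particular states.

```python
# Administrator operations permitted in each state, as listed above.
ALLOWED_OPERATIONS = {
    "disabled": {"enable"},
    "failed": {"disable"},
    "stopped": {"disable", "enable"},
    "recovering": {"disable"},
    "started": {"relocate", "stop", "disable", "migrate"},
}

def operation_allowed(state, operation, is_vm=False):
    """Return True if the operation may be requested while in the given state."""
    if operation == "migrate" and not is_vm:
        return False  # migrate applies to virtual machines only
    return operation in ALLOWED_OPERATIONS.get(state, set())
```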

       Apart from what is noted in the VM resource agent, rgmanager provides a
       few convenience features when dealing with virtual machines.
	* it will use live migration when transferring a virtual machine to  a
	more-preferred host in the cluster as a consequence of failover domain
	operation
	* it will search the other instances of rgmanager in  the  cluster  in
	the  case that a user accidentally moves a virtual machine using other
	management tools
	* unlike services, adding a virtual machine to rgmanager's  configura‐
	tion will not cause the virtual machine to be restarted
	* removing a virtual machine from rgmanager's configuration will leave
	the virtual machine running.

       -f     Run in the foreground (do not fork).

       -d     Enable debug-level logging.

       -q     Disable DBus signals  which  are	normally  sent	when  services
	      change state.

       -w     Disable internal process monitoring (for debugging).

       -N     Do  not perform stop-before-start.  Combined with the -Z flag to
	      clusvcadm, this can be used to allow rgmanager  to  be  upgraded
	      without stopping a given user service or set of services.

       -C     Explicitly  disable or enable CPG-based locking.	The default is
	      to enable this when RRP is turned on (which requires  a  cluster
	      outage).	This option MUST be the same on all hosts in the clus‐
	      ter and must only be enabled or disabled with all	 instances  of
	      rgmanager turned off.


       clusvcadm(8), cluster.conf(5), cpglockd(8)

				   Jul 2010			  rgmanager(8)
