PMCD(1)
NAME
pmcd - performance metrics collector daemon
SYNOPSIS
pmcd [-f] [-i ipaddress] [-l logfile] [-L bytes] [-n pmnsfile] [-p
port[,port ...]] [-q timeout] [-T traceflag] [-t timeout] [-x file]
DESCRIPTION
pmcd is the collector used by the Performance Co-Pilot (see PCPIntro(1))
to gather performance metrics on a system. As a rule, there must be an
instance of pmcd running on a system for any performance metrics to be
available to the PCP.
pmcd accepts connections from client applications running either on the
same machine or remotely and provides them with metrics and other related
information from the machine that pmcd is executing on. pmcd delegates
most of this request servicing to a collection of Performance Metrics
Domain Agents (or just agents), where each agent is responsible for a
particular group of metrics, known as the domain of the agent. For
example the environ agent is responsible for reporting information
relating to the environment of a Challenge system, such as the cabinet
temperature and voltage levels of the power supply.
The agents may be processes started by pmcd, independent processes or
Dynamic Shared Objects (DSOs, see dso(5)) attached to pmcd's address
space. The configuration section below describes how connections to
agents are specified.
The options to pmcd are as follows.
-f By default pmcd is started as a daemon. The -f option indicates
that it should run in the foreground. This is most useful when
trying to diagnose problems with misbehaving agents.
-i ipaddress
This option is usually only used on hosts with more than one network
interface. If no -i options are specified pmcd accepts connections
made to any of its host's IP (Internet Protocol) addresses. The -i
option is used to specify explicitly an IP address that connections
should be accepted on. ipaddress should be in the standard dotted
form (e.g. 100.23.45.6). The -i option may be used multiple times
to define a list of IP addresses. Connections made to any other IP
addresses the host has will be refused. This can be used to limit
connections to one network interface if the host is a network
gateway. It is also useful if the host takes over the IP address of
another host that has failed. In such a situation only the standard
IP addresses of the host should be given (not the ones inherited
from the failed host). This allows PCP applications to determine
that a host has failed, rather than connecting to the host that has
assumed the identity of the failed host.
-l logfile
By default a log file named pmcd.log is written in the directory
$PCP_LOG_DIR/pmcd. The -l option causes the log file to be written
to logfile instead of the default. If the log file cannot be
created or is not writable, output is written to the standard error
instead.
-L bytes
PDUs received by pmcd from monitoring clients are restricted to a
maximum size of 65536 bytes by default to defend against Denial of
Service attacks. The -L option may be used to change the maximum
incoming PDU size.
-n pmnsfile
Normally pmcd loads the default Performance Metrics Name Space
(PMNS) from $PCP_VAR_DIR/pmns/root, however if the -n option is
specified an alternative namespace is loaded from the file pmnsfile.
-q timeout
The pmcd to agent version exchange protocol (new in PCP 2.0 -
introduced to provide backward compatibility) uses this timeout to
specify how long pmcd should wait before assuming that no version
response is coming from an agent. If this timeout is reached, the
agent is assumed to be an agent which does not understand the PCP
2.0 protocol. The default timeout interval is five seconds, but the
-q option allows an alternative timeout interval (which must be
greater than zero) to be specified. The unit of time is seconds.
-t timeout
To prevent misbehaving agents from hanging the entire Performance
Metrics Collection System (PMCS), pmcd uses timeouts on PDU
exchanges with agents running as processes. By default the timeout
interval is five seconds. The -t option allows an alternative
timeout interval in seconds to be specified. If timeout is zero,
timeouts are turned off. It is almost impossible to use the
debugger interactively on an agent unless timeouts have been turned
off for its "parent" pmcd.
Once pmcd is running, the timeout may be dynamically modified by
storing an integer value (the timeout in seconds) into the metric
pmcd.control.timeout via pmstore(1).
-T traceflag
To assist with error diagnosis for agents and/or clients of pmcd
that are not behaving correctly, an internal event tracing mechanism
is supported within pmcd. The value of traceflag is interpreted as
a bit field with the following control functions:
1 enable client connection tracing
2 enable PDU tracing
256 unbuffered event tracing
By default, event tracing is buffered using a circular buffer that
is over-written as new events are recorded. The default buffer size
holds the last 20 events, although this number may be over-ridden by
using pmstore(1) to modify the metric pmcd.control.tracebufs.
Similarly once pmcd is running, the event tracing control may be
dynamically modified by storing 1 (enable) or 0 (disable) into the
metrics pmcd.control.traceconn, pmcd.control.tracepdu and
pmcd.control.tracenobuf. These metrics map to the bit fields
associated with the traceflag argument for the -T option.
When operating in buffered mode, the event trace buffer will be
dumped whenever an agent connection is terminated by pmcd, or when
any value is stored into the metric pmcd.control.dumptrace via
pmstore(1).
In unbuffered mode, every event will be reported when it occurs.
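The bit-field interpretation of traceflag can be illustrated with a
small Python sketch. It is not part of pmcd; the function names are
hypothetical and only the three bit values listed above are assumed.

```python
# Bit values taken from the -T description above; the text labels are
# illustrative only and are not produced by pmcd itself.
TRACE_BITS = {
    1: "client connection tracing",
    2: "PDU tracing",
    256: "unbuffered event tracing",
}


def decode_traceflag(traceflag):
    """Return the descriptions of the trace functions enabled by traceflag."""
    return [name for bit, name in TRACE_BITS.items() if traceflag & bit]


def encode_traceflag(conn=False, pdu=False, nobuf=False):
    """Build a traceflag value from three on/off choices."""
    return (1 if conn else 0) | (2 if pdu else 0) | (256 if nobuf else 0)


if __name__ == "__main__":
    print(decode_traceflag(259))
    print(encode_traceflag(conn=True, nobuf=True))
```

Under these assumptions, connection tracing combined with unbuffered
reporting corresponds to a traceflag of 257 (1 + 256).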
-x file
Before the pmcd logfile can be opened, pmcd may encounter a fatal
error which prevents it from starting. By default, the output
describing this error is sent to /dev/tty but it may be redirected to
file.
If a PDU exchange with an agent times out, the agent has violated the
requirement that it delivers metrics with little or no delay. This is
deemed a protocol failure and the agent is disconnected from pmcd. Any
subsequent requests for information from the agent will fail with a
status indicating that there is no agent to provide it.
It is possible to specify host-level access control to pmcd. This allows
one to prevent users from certain hosts from accessing the metrics
provided by pmcd and is described in more detail in the ACCESS CONTROL
CONFIGURATION section below.
CONFIGURATION
On startup pmcd looks for a configuration file named $PCP_PMCDCONF_PATH.
This file specifies which agents cover which performance metrics domains
and how pmcd should make contact with the agents. An optional section
specifying host-based access controls may follow the agent configuration
data.
Warning: pmcd is usually started as part of the boot sequence and runs
as root. The configuration file may contain shell commands to create
agents, which will be executed by root. To prevent security breaches the
configuration file should be writable only by root. The use of absolute
path names is also recommended.
The case of the reserved words in the configuration file is unimportant,
but elsewhere, the case is preserved.
Blank lines and comments are permitted (even encouraged) in the
configuration file. A comment begins with a ``#'' character and finishes
at the end of the line. A line may be continued by ensuring that the
last character on the line is a ``\'' (backslash). A comment on a
continued line ends at the end of the continued line. Spaces may be
included in lexical elements by enclosing the entire element in double
quotes (there must be whitespace before the opening and after the closing
quote). A double quote preceded by a backslash is always a literal
double quote. A ``#'' in double quotes or preceded by a backslash is
treated literally rather than as a comment delimiter. Lexical elements
and separators are described further in the following sections.
AGENT CONFIGURATION
Each line of the agent configuration section of the configuration file
contains details of how to connect pmcd to one of its agents and
specifies which metrics domain the agent deals with. An agent may be
attached as a DSO, or via a socket, or a pair of pipes.
Each line of the agent configuration section of the configuration file
must be either an agent specification, a comment, or a blank line.
Lexical elements are separated by whitespace characters, however a single
agent specification may not be broken across lines unless a \ (backslash)
is used to continue the line.
Each agent specification must start with a textual label (string)
followed by an integer in the range 1 to 254. The label is a tag used to
refer to the agent and the integer specifies the domain for which the
agent supplies data. This domain identifier corresponds to the domain
portion of the PMIDs handled by the agent. Each agent must have a unique
label and domain identifier.
For DSO agents a line of the form:
label domain-no dso entry-point path
should appear. Where,
label is a string identifying the agent
domain-no is an unsigned integer specifying the agent's domain in the
range 1 to 254
entry-point is the name of an initialization function which will be
called when the DSO is loaded
path designates the location of the DSO. This field is treated
differently on Irix and on Linux. The latter expects it to be
an absolute pathname, while the former uses some heuristics
to find an agent. If path begins with a / it is taken as an
absolute path specifying the DSO. If path is relative, pmcd
will expect to find the agent in a file with the name
mips_simabi.path, where simabi is either o32, n32 or 64.
pmcd is only able to load DSO agents that have the same
simabi (Subprogram Interface Model ABI, or calling
conventions) as it does (i.e. only one of the simabi
versions will be applicable). The simabi version of a
running pmcd may be determined by fetching pmcd.simabi.
Alternatively, the file(1) command may be used to determine
the simabi version from the pmcd executable.
For a relative path the environment variable PMCD_PATH
defines a colon (:) separated list of directories to search
when trying to locate the agent DSO. The default search
path is $PCP_SHARE_DIR/lib:/usr/pcp/lib.
For agents providing socket connections, a line of the form
label domain-no socket addr-family address [ command ]
should appear. Where,
label is a string identifying the agent
domain-no is an unsigned integer specifying the agent's domain in the
range 1 to 254
addr-family designates whether the socket is in the AF_INET or AF_UNIX
domain, and the corresponding values for this parameter are
inet and unix respectively.
address specifies the address of the socket within the previously
specified addr-family. For unix sockets, the address should
be the name of an agent's socket on the local host (a valid
address for the UNIX domain). For inet sockets, the
address may be either a port number or a port name which
may be used to connect to an agent on the local host.
There is no syntax for specifying an agent on a remote host
as pmcd deals only with agents on the same machine.
command is an optional parameter used to specify a command line to
start the agent when pmcd initializes. If command is not
present, pmcd assumes that the specified agent has already
been created. The command is considered to start from the
first non-white character after the socket address and
finish at the next newline that isn't preceded by a
backslash. After a fork(2) the command is passed
unmodified to execve(2) to instantiate the agent.
For agents interacting with the pmcd via stdin/stdout, a line of the
form:
label domain-no pipe protocol command
should appear. Where,
label is a string identifying the agent
domain-no is an unsigned integer specifying the agent's domain
protocol specifies whether a text-based (ASCII) or a binary protocol
should be used over the pipes. The two valid values for
this parameter are text and binary.
Note: To the best of our knowledge, nothing but the
demonstration PMDA news agent and the America's Cup San
Diego water temperature agent has ever used the ASCII PDU
interface to pmcd. The current PCP libraries (in
particular libpcp_pmda and libpcp_trace) make building a
real PMDA less effort than fighting with the ASCII PDUs in
a sh(1) script. Consequently, support for ASCII PDUs and
hence the keyword text in the pmcd configuration file is
discouraged.
command specifies a command line to start the agent when pmcd
initializes. Note that command is mandatory for pipe-based
agents. The command is considered to start from the first
non-white character after the protocol parameter and finish
at the next newline that isn't preceded by a backslash.
After a fork(2) the command is passed unmodified to
execve(2) to instantiate the agent.
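The three agent specification forms described above can be summarised
with a rough Python sketch. This is not pmcd's parser: it assumes that
inet, unix, text and binary are matched case-insensitively like the
other reserved words, it ignores double-quoted elements, comments and
backslash continuation, and the function name is hypothetical.

```python
def parse_agent_spec(line):
    """Split one agent specification into a dict.

    Simplified: no quoting, comment or line-continuation handling.
    """
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("expected at least: label domain-no type")
    label, domain_text, kind = fields[0], fields[1], fields[2].lower()
    domain = int(domain_text)
    if not 1 <= domain <= 254:
        raise ValueError("domain-no must be in the range 1 to 254")
    spec = {"label": label, "domain": domain, "type": kind}
    if kind == "dso":
        if len(fields) != 5:
            raise ValueError("dso agents need: entry-point path")
        spec["entry_point"], spec["path"] = fields[3], fields[4]
    elif kind == "socket":
        if len(fields) < 5:
            raise ValueError("socket agents need: addr-family address")
        family = fields[3].lower()
        if family not in ("inet", "unix"):
            raise ValueError("addr-family must be inet or unix")
        spec["addr_family"], spec["address"] = family, fields[4]
        # The optional command is everything after the socket address.
        if len(fields) > 5:
            spec["command"] = line.split(None, 5)[5].strip()
        else:
            spec["command"] = None
    elif kind == "pipe":
        if len(fields) < 5:
            raise ValueError("pipe agents need: protocol command")
        protocol = fields[3].lower()
        if protocol not in ("text", "binary"):
            raise ValueError("protocol must be text or binary")
        spec["protocol"] = protocol
        # The mandatory command is everything after the protocol.
        spec["command"] = line.split(None, 4)[4].strip()
    else:
        raise ValueError("unknown agent type: " + kind)
    return spec


if __name__ == "__main__":
    # Hypothetical labels, domain numbers and paths, for illustration only.
    print(parse_agent_spec("example 100 pipe binary /opt/example/agent -d 100"))
    print(parse_agent_spec("demo 101 socket inet 2187"))
```

The labels, domain numbers, paths and agent arguments in the usage
lines are invented and do not correspond to any shipped agent.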
ACCESS CONTROL CONFIGURATION
The access control section of the configuration file is optional, but if
present it must follow the agent configuration data. The case of
reserved words is ignored, but elsewhere case is preserved. Lexical
elements in the access control section are separated by whitespace or the
special delimiter characters: square brackets (``['' and ``]''), braces
(``{'' and ``}''), colon (``:''), semicolon (``;'') and comma (``,'').
The special characters are not treated as special in the agent
configuration section.
The access control section of the file must start with a line of the
form:
[access]
Leading and trailing whitespace may appear around and within the brackets
and the case of the access keyword is ignored. No other text may appear
on the line except a trailing comment.
Following this line, the remainder of the configuration file should
contain lines that allow or disallow operations from particular hosts or
groups of hosts.
There are two kinds of operations that occur via pmcd:
fetch allows retrieval of information from pmcd. This may be
information about a metric (e.g. its description,
instance domain or help text) or a value for a metric.
store allows pmcd to be used to store metric values in agents
that permit store operations.
Access to pmcd is granted at the host level, i.e. all users on a host are
granted the same level of access. Permission to perform the store
operation should not be given indiscriminately; it has the potential to
be abused by malicious users.
Hosts may be identified by name, IP address or a wildcarded IP address
with the single wildcard character ``*'' as the last-given component of
the IP address. Host names may not be wildcarded. The following are all
valid host identifiers:
boing
localhost
giggle.melbourne.sgi.com
129.127.112.2
129.127.114.*
129.*
*
The following are not valid host identifiers:
*.melbourne
129.127.*.*
129.*.114.9
129.127*
The first example is not allowed because only (numeric) IP addresses may
contain a wildcard. The second example is not valid because there is
more than one wildcard character. The third contains an embedded
wildcard, and the fourth has a wildcard character that is not the last
component of the IP address (the last component is 127*).
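The wildcard rules can be checked mechanically. The following Python
sketch is illustrative only (the function name is hypothetical and
pmcd's own checks may be stricter); it encodes just the rules stated
above and does not verify that host names resolve.

```python
def is_valid_host_identifier(ident):
    """Apply the wildcard rules from the text to one host identifier."""
    if not ident:
        return False
    if "*" not in ident:
        # A host name or a full IP address; not checked further here.
        return True
    parts = ident.split(".")
    # Exactly one wildcard, standing alone as the last-given component.
    if ident.count("*") != 1 or parts[-1] != "*":
        return False
    leading = parts[:-1]
    # Everything before the wildcard must be numeric IP components.
    return len(leading) <= 3 and all(
        p.isdigit() and int(p) <= 255 for p in leading
    )


if __name__ == "__main__":
    for ident in ["boing", "129.127.114.*", "129.*", "*",
                  "*.melbourne", "129.127.*.*", "129.*.114.9", "129.127*"]:
        print(ident, is_valid_host_identifier(ident))
```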
The name localhost is given special treatment to make the behavior of
host wildcarding consistent. Rather than being 127.0.0.1, it is mapped
to the primary IP address associated with the name of the host on which
pmcd is running. Beware of this when running pmcd on multi-homed hosts.
Access for hosts is allowed or disallowed by specifying statements of
the form:
allow hostlist : operations ;
disallow hostlist : operations ;
hostlist is a comma separated list of host identifiers.
operations is a comma separated list of the operation types described
above, all (which allows/disallows all operations), or all
except operations (which allows/disallows all operations
except those listed).
Where no specific allow or disallow statement applies to an operation for
some host, the default is to allow the operation from that host. In the
trivial case when there is no access control section in the configuration
file, all operations from all hosts are permitted.
If a new connection to pmcd is attempted from a host that is not
permitted to perform any operations, the connection will be closed
immediately after an error response PM_ERR_PERMISSION has been sent to
the client attempting the connection.
Statements with the same level of wildcarding specifying identical hosts
may not contradict each other. For example if a host named clank had an
IP address of 129.127.112.2, specifying the following two rules would be
erroneous:
allow clank : fetch, store;
disallow 129.127.112.2 : all except fetch;
because they both refer to the same host, but disagree as to whether the
fetch operation is permitted from that host.
Statements containing more specific host specifications override less
specific ones according to the level of wildcarding. For example a rule
of the form
allow clank : all;
overrides
disallow 129.127.112.* : all except fetch;
because the former contains a specific host name (equivalent to a fully
specified IP address), whereas the latter has a wildcard. In turn, the
latter would override
disallow * : all;
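The precedence among these three statements can be modelled with a
short Python sketch. It is a simplification, not pmcd's implementation:
it ranks whole statements by the number of non-wildcard address
components and ignores the per-operation detail. The NAMES table is a
hypothetical stand-in for host name resolution.

```python
# Hypothetical name-to-address table standing in for a resolver lookup.
NAMES = {"clank": "129.127.112.2"}


def specificity(ident):
    """Fixed address components: 4 for a name or full IP, 0 for '*'."""
    ident = NAMES.get(ident, ident)
    if "*" not in ident:
        return 4
    return len([p for p in ident.split(".") if p != "*"])


def matches(ident, client_ip):
    """True if client_ip is covered by the host identifier."""
    ident = NAMES.get(ident, ident)
    if "*" not in ident:
        return ident == client_ip
    prefix = ident.split(".")[:-1]
    return client_ip.split(".")[:len(prefix)] == prefix


def most_specific(statements, client_ip):
    """Return the matching statement with the least wildcarding, or None."""
    candidates = [s for s in statements if matches(s[1], client_ip)]
    return max(candidates, key=lambda s: specificity(s[1]), default=None)


if __name__ == "__main__":
    statements = [
        ("disallow", "*", "all"),
        ("disallow", "129.127.112.*", "all except fetch"),
        ("allow", "clank", "all"),
    ]
    for ip in ["129.127.112.2", "129.127.112.9", "10.1.2.3"]:
        print(ip, most_specific(statements, ip))
```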
It is possible to limit the number of connections from a host to pmcd.
This may be done by adding a clause of the form
maximum n connections
to the operations list of an allow statement. Such a clause may not be
used in a disallow statement. Here, n is the maximum number of
connections that will be accepted from hosts matching the host
identifier(s) used in the statement.
An access control statement with a list of host identifiers is equivalent
to a group of access control statements, with each specifying one of the
host identifiers in the list and all with the same access controls (both
permissions and connection limits). A wildcard should be used if you
want hosts to contribute to a shared connection limit.
When a new client requests a connection, and pmcd has determined that the
client has permission to connect, it searches the matching list of access
control statements for the most specific match containing a connection
limit. For brevity, this will be called the limiting statement. If
there is no limiting statement, the client is granted a connection. If
there is a limiting statement and the number of pmcd clients with IP
addresses that match the host identifier in the limiting statement is
less than the connection limit in the statement, the connection is
allowed. Otherwise the connection limit has been reached and the client
is refused a connection.
The wildcarding in host identifiers means that once pmcd actually accepts
a connection from a client, the connection may contribute to the current
connection count of more than one access control statement (the client's
host may match more than one access control statement). This may be
significant for subsequent connection requests.
Note that because most specific match semantics are used when checking
the connection limit, priority is given to clients with more specific
host identifiers. It is also possible to exceed connection limits in
some situations. Consider the following:
allow clank : all, maximum 5 connections;
allow * : all except store, maximum 2 connections;
This says that only 2 client connections at a time are permitted for all
hosts other than "clank", which is permitted 5. If a client from host
"boing" is the first to connect to pmcd, its connection is checked
against the second statement (that is the most specific match with a
connection limit). As there are no other clients, the connection is
accepted and contributes towards the limit for only the second statement
above. If the next client connects from "clank", its connection is
checked against the limit for the first statement. There are no other
connections from "clank", so the connection is accepted. Once this
connection is accepted, it counts towards both statements' limits because
"clank" matches the host identifier in both statements. Remember that
the decision to accept a new connection is made using only the most
specific matching access control statement with a connection limit. Now,
the connection limit for the second statement has been reached. Any
connections from hosts other than "clank" will be refused.
If instead, pmcd with no clients saw three successive connections arrive
from "boing", the first two would be accepted and the third refused.
After that, if a connection was requested from "clank" it would be
accepted. It matches the first statement, which is more specific than
the second, so the connection limit in the first is used to determine
that the client has the right to connect. Now there are 3 connections
contributing to the second statement's connection limit. Even though the
connection limit for the second statement has been exceeded, the earlier
connections from "boing" are maintained. The connection limit is only
checked at the time a client attempts a connection rather than being re-
evaluated every time a new client connects to pmcd.
This gentle scheme is designed to allow reasonable limits to be imposed
on a first come first served basis, with specific exceptions.
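Both scenarios above can be replayed with a small Python model of the
connection limit check. This is a sketch of the rules as described in
this section, not pmcd's code; the class name and the IP addresses
assigned to "clank" and "boing" are hypothetical, and permission
checking is assumed to have succeeded already.

```python
# Hypothetical addresses for the two hosts used in the example.
NAMES = {"clank": "129.127.112.2", "boing": "129.127.112.7"}


def specificity(ident):
    """Fixed address components: 4 for a name or full IP, 0 for '*'."""
    ident = NAMES.get(ident, ident)
    if "*" not in ident:
        return 4
    return len([p for p in ident.split(".") if p != "*"])


def matches(ident, client_ip):
    """True if client_ip is covered by the host identifier."""
    ident = NAMES.get(ident, ident)
    if "*" not in ident:
        return ident == client_ip
    prefix = ident.split(".")[:-1]
    return client_ip.split(".")[:len(prefix)] == prefix


class LimitModel:
    def __init__(self, limits):
        self.limits = limits   # list of (host identifier, maximum)
        self.clients = []      # IP addresses of accepted connections

    def count(self, ident):
        """Current connections that count towards one statement."""
        return sum(1 for ip in self.clients if matches(ident, ip))

    def request(self, host):
        """Accept or refuse a connection using the limiting statement."""
        ip = NAMES.get(host, host)
        matching = [l for l in self.limits if matches(l[0], ip)]
        if matching:
            ident, maximum = max(matching, key=lambda l: specificity(l[0]))
            if self.count(ident) >= maximum:
                return False
        self.clients.append(ip)
        return True


if __name__ == "__main__":
    limits = [("clank", 5), ("*", 2)]
    first = LimitModel(limits)
    print([first.request(h) for h in ["boing", "clank", "boing"]])
    second = LimitModel(limits)
    print([second.request(h) for h in ["boing", "boing", "boing", "clank"]])
    print(second.count("*"))
```

In the second run the count for the "*" statement ends at 3, one above
its limit of 2, because the limit is only consulted for the statement
that is the most specific match for the connecting client.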
As illustrated by the example above, a client's connection is honored
once it has been accepted. However, pmcd reconfiguration (see the next
section) re-evaluates all the connection counts and will cause client
connections to be dropped where connection limits have been exceeded.
RECONFIGURING PMCD
If the configuration file has been changed or if an agent is not
responding because it has terminated or the PMNS has been changed, pmcd
may be reconfigured by sending it a SIGHUP, as in
# killall -HUP pmcd
When pmcd receives a SIGHUP, it checks the configuration file for
changes. If the file has been modified, it is reparsed and the contents
become the new configuration. If there are errors in the configuration
file, the existing configuration is retained and the contents of the file
are ignored. Errors are reported in the pmcd log file.
It also checks the PMNS file for changes. If the PMNS file has been
modified, then it is reloaded. Use of tail(1) on the log file is
recommended while reconfiguring pmcd.
If the configuration for an agent has changed (any parameter except the
agent's label is different), the agent is restarted. Agents whose
configurations do not change are not restarted. Any existing agents not
present in the new configuration are terminated. Any deceased agents
that are still listed are restarted.
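The restart rules can be sketched as a comparison between the old and
new agent tables. The Python below is illustrative only. It assumes
agents are keyed by domain number (so that a change of label alone is
not a change), and the "start" action for newly listed agents is an
inference rather than something stated above. All names are
hypothetical.

```python
def plan_reconfiguration(old, new, alive):
    """Decide what happens to each agent after a SIGHUP.

    old and new map domain number -> dict of agent parameters,
    including "label"; alive is the set of domain numbers whose
    agents are still running.
    """
    def without_label(params):
        return {k: v for k, v in params.items() if k != "label"}

    actions = {}
    for domain, params in old.items():
        if domain not in new:
            actions[domain] = "terminate"
        elif without_label(params) != without_label(new[domain]):
            actions[domain] = "restart"      # configuration changed
        elif domain not in alive:
            actions[domain] = "restart"      # deceased but still listed
        else:
            actions[domain] = "keep"
    for domain in new:
        if domain not in old:
            actions[domain] = "start"        # assumption, see lead-in
    return actions


if __name__ == "__main__":
    old = {
        100: {"label": "a", "type": "pipe", "command": "/opt/a"},
        101: {"label": "b", "type": "pipe", "command": "/opt/b"},
        102: {"label": "c", "type": "pipe", "command": "/opt/c"},
        103: {"label": "d", "type": "pipe", "command": "/opt/d"},
    }
    new = {
        100: {"label": "renamed", "type": "pipe", "command": "/opt/a"},
        101: {"label": "b", "type": "pipe", "command": "/opt/b -v"},
        102: {"label": "c", "type": "pipe", "command": "/opt/c"},
        104: {"label": "e", "type": "pipe", "command": "/opt/e"},
    }
    print(plan_reconfiguration(old, new, alive={100, 101, 103}))
```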
Sometimes it is necessary to restart an agent that is still running, but
malfunctioning. Simply kill the agent, then send pmcd a SIGHUP, which
will cause the agent to be restarted.
STARTING AND STOPPING PMCD
Normally, pmcd is started automatically at boot time and stopped when the
system is being brought down (see rc2(1M) and rc0(1M)). Under certain
circumstances it is necessary to start or stop pmcd manually. To do this
one must become superuser and type
# $PCP_RC_DIR/pcp start
to start pmcd, or
# $PCP_RC_DIR/pcp stop
to stop pmcd. Starting pmcd when it is already running is the same as
stopping it and then starting it again.
Sometimes it may be necessary to restart pmcd during another phase of the
boot process. Time-consuming parts of the boot process are often put
into the background to allow the system to become available sooner (e.g.
mounting huge databases). If an agent run by pmcd requires such a task
to complete before it can run properly, it is necessary to restart or
reconfigure pmcd after the task completes. Consider, for example, the
case of mounting a database in the background while booting. If the PMDA
which provides the metrics about the database cannot function until the
database is mounted and available but pmcd is started before the database
is ready, the PMDA will fail (however pmcd will still service requests
for metrics from other domains). If the database is initialized by
running a shell script, adding a line to the end of the script to
reconfigure pmcd (by sending it a SIGHUP) will restart the PMDA (if it
exited because it couldn't connect to the database). If the PMDA didn't
exit in such a situation it would be necessary to restart pmcd because if
the PMDA was still running pmcd would not restart it.
Normally pmcd listens for client connections on one or more well-known
TCP/IP port numbers (historically 4321 and more recently the officially
registered port 44321; in the current release, pmcd listens on both these
ports as a transitional arrangement). Either the environment variable
PMCD_PORT or the -p command line option may be used to specify
alternative port number(s) when pmcd is started; in each case, the
specification is a comma-separated list of one or more numerical port
numbers. Should both methods be used or multiple -p options appear on
the command line, pmcd will listen on the union of the set of ports
specified via all -p options and the PMCD_PORT environment variable. If
non-default ports are used with pmcd care should be taken to ensure that
PMCD_PORT is also set in the environment of any client application that
will connect to pmcd.
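The way multiple port specifications combine can be shown with a brief
Python sketch. It only models the union rule described above; the
function name is hypothetical and the default ports used when nothing
is specified are not modelled.

```python
def listening_ports(p_options, pmcd_port_env=None):
    """Union of the ports from every -p value and the PMCD_PORT value.

    p_options is a list of comma-separated strings, one per -p option;
    pmcd_port_env is the PMCD_PORT string, or None if it is unset.
    """
    specs = list(p_options)
    if pmcd_port_env:
        specs.append(pmcd_port_env)
    ports = set()
    for spec in specs:
        for item in spec.split(","):
            ports.add(int(item))
    return sorted(ports)


if __name__ == "__main__":
    # Two -p options plus PMCD_PORT; duplicates collapse into one entry.
    print(listening_ports(["4321,5000", "5000"], "44321"))
```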
LICENSES
In previous PCP releases, pmcd would terminate immediately if there was
no valid Collector license on the localhost. This has now changed so
that on Irix pmcd will run on hosts without a Collector license, however
an unlicensed pmcd will only accept connections from authorized clients.
On Linux pmcd will run on any host without a license and will accept
connections from any client. Not all PCP tools are authorized clients.
See the PCP release notes for more details about licenses for PCP.
FILES
$PCP_PMCDCONF_PATH
default configuration file
$PCP_PMCDOPTIONS_PATH
command line options to pmcd when launched from $PCP_RC_DIR/pcp
All the command line option lines should start with a hyphen as
the first character. This file can also contain environment
variable settings of the form "VARIABLE=value".
./pmcd.log
(or $PCP_LOG_DIR/pmcd/pmcd.log when started automatically)
All messages and diagnostics are directed here
ENVIRONMENT
In addition to the PCP environment variables described in the PCP
ENVIRONMENT section below, the PMCD_PORT variable is also recognised as
the TCP/IP port for incoming connections (default 4321).
PCP ENVIRONMENT
Environment variables with the prefix PCP_ are used to parameterize the
file and directory names used by PCP. On each installation, the file
/etc/pcp.conf contains the local values for these variables. The
$PCP_CONF variable may be used to specify an alternative configuration
file, as described in pcp.conf(4).
SEE ALSO
PCPIntro(1), pmdbg(1), pmerr(1), pmgenmap(1), pminfo(1), pmkstat(1),
pmstore(1), pmval(1), pcp.conf(4), pcp.env(4) and dso(5).
DIAGNOSTICS
If pmcd is already running the message "Error: OpenRequestSocket bind:
Address already in use" will appear. This may also appear if pmcd was
shut down with an outstanding request from a client. In this case, a
request socket has been left in the TIME_WAIT state and until the system
closes it down (after some timeout period) it will not be possible to run
pmcd.
In addition to the standard PCP debugging flags, see pmdbg(1), pmcd
currently uses DBG_TRACE_APPL0 for tracing I/O and termination of agents,
DBG_TRACE_APPL1 for tracing host access control (see ACCESS CONTROL
CONFIGURATION above) and DBG_TRACE_APPL2 for tracing the configuration
file scanner and parser.
CAVEATS
pmcd does not kill its child agents; it only closes their pipes. If an
agent never checks for a closed pipe it may not terminate.
The configuration file parser will only read lines of less than 1200
characters. This is intended to prevent accidents with binary files.
The timeouts controlled by the -t option apply to IPC between pmcd and
the PMDAs it spawns. This is independent of settings of the environment
variables PMCD_CONNECT_TIMEOUT and PMCD_REQUEST_TIMEOUT (see PCPIntro(1))
which may be used respectively to control timeouts for client
applications trying to connect to pmcd and trying to receive information
from pmcd.