cpc_bind_curlwp man page on SmartOS

Man page or keyword search:  
man Server   16655 pages
apropos Keyword Search (all sections)
Output format
SmartOS logo
[printable version]


       cpc_bind_curlwp,	     cpc_bind_pctx,	 cpc_bind_cpu,	   cpc_unbind,
       cpc_request_preset, cpc_set_restart - bind  request  sets  to  hardware

       cc [ flag... ] file... -lcpc [ library... ]
       #include <libcpc.h>

       int cpc_bind_curlwp(cpc_t *cpc, cpc_set_t *set, uint_t flags);

       int cpc_bind_pctx(cpc_t *cpc, pctx_t *pctx, id_t id, cpc_set_t *set,
	    uint_t flags);

       int cpc_bind_cpu(cpc_t *cpc, processorid_t id, cpc_set_t *set,
	    uint_t flags);

       int cpc_unbind(cpc_t *cpc, cpc_set_t *set);

       int cpc_request_preset(cpc_t *cpc, int index, uint64_t preset);

       int cpc_set_restart(cpc_t *cpc, cpc_set_t *set);

       These  functions program the processor's hardware counters according to
       the requests contained in the set argument. If these functions are suc‐
       cessful, then upon return the physical counters will have been assigned
       to count events on behalf of each request in the set, and each  counter
       will be enabled as configured.

       The  cpc_bind_curlwp()  function	 binds	the set to the calling LWP. If
       successful, a performance counter context is associated	with  the  LWP
       that allows the system to virtualize the hardware counters to that spe‐
       cific LWP.

       By default, the system binds the set to the current LWP	only.  If  the
       CPC_BIND_LWP_INHERIT  flag  is  present in the flags argument, however,
       any subsequent LWPs created by the current LWP will inherit a  copy  of
       the request set. The newly created LWP will have its virtualized 64-bit
       counters initialized to the preset values specified  in	set,  and  the
       counters will be enabled and begin counting events on behalf of the new
       LWP. This automatic inheritance behavior can  be	 useful	 when  dealing
       with  multithreaded  programs to determine aggregate statistics for the
       program as a whole.

       If the CPC_BIND_LWP_INHERIT flag is specified and any of	 the  requests
       in the set have the CPC_OVF_NOTIFY_EMT flag set, the process will imme‐
       diately dispatch a SIGEMT signal to the freshly created LWP so that  it
       can  preset its counters appropriately on the new LWP. This initializa‐
       tion condition can be detected using cpc_set_sample(3CPC)  and  looking
       at  the counter value for any requests with CPC_OVF_NOTIFY_EMT set. The
       value of any such counters will be UINT64_MAX.

       The cpc_bind_pctx() function binds the set to the LWP specified by  the
       pctx-id	pair,  where pctx refers to a handle returned from libpctx and
       id is the ID of the desired LWP in the target process. If successful, a
       performance  counter  context  is associated with the specified LWP and
       the system virtualizes the hardware counters to that specific LWP.  The
       flags argument is reserved for future use and must always be 0.

       The cpc_bind_cpu() function binds the set to the specified CPU and mea‐
       sures events occurring on that CPU regardless of which LWP is  running.
       Only  one such binding can be active on the specified CPU at a time. As
       long as any application has bound a set to a CPU, per-LWP counters  are
       unavailable   and  any  attempt	to  use	 either	 cpc_bind_curlwp()  or
       cpc_bind_pctx() returns EAGAIN. The first invocation of	cpc_bind_cpu()
       invalidates  all	 currently bound per-LWP counter sets, and any attempt
       to sample an invalidated set returns EAGAIN. To	bind  to  a  CPU,  the
       library	binds  the  calling  LWP  to  the  measured  CPU  with proces‐
       sor_bind(2). The application must  not  change  its  processor  binding
       until  after  it has unbound the set with cpc_unbind(). The flags argu‐
       ment is reserved for future use and must always be 0.

       The cpc_request_preset() function updates the preset and current	 value
       stored  in  the indexed request within the currently bound set, thereby
       changing the starting value for the specified request for  the  calling
       LWP only, which takes effect at the next call to cpc_set_restart().

       When  a	performance  counter  counting on behalf of a request with the
       CPC_OVF_NOTIFY_EMT flag set overflows,  the  performance	 counters  are
       frozen  and the LWP to which the set is bound receives a SIGEMT signal.
       The cpc_set_restart() function can be called from a SIGEMT signal  han‐
       dler function to quickly restart the hardware counters. Counting begins
       from each request's original preset (see cpc_set_add_request(3CPC)), or
       from  the  preset  specified  in	 a prior call to cpc_request_preset().
       Applications performing performance counter overflow  profiling	should
       use  the	 cpc_set_restart()  function to quickly restart counting after
       receiving a SIGEMT overflow signal and recording any  relevant  program

       The cpc_unbind() function unbinds the set from the resource to which it
       is bound. All hardware resources associated  with  the  bound  set  are
       freed  and  if  the  set was bound to a CPU, the calling LWP is unbound
       from the corresponding CPU. See processor_bind(2).

       Upon successful completion these functions return 0. Otherwise,	-1  is
       returned and errno is set to indicate the error.

       Applications  wanting  to  get detailed error values should register an
       error handler with cpc_seterrhndlr(3CPC). Otherwise, the	 library  will
       output a specific error description to stderr.

       These functions will fail if:

		  For  cpc_bind_curlwp(),  the system has Pentium 4 processors
		  with HyperThreading and at least one physical processor  has
		  more than one hardware thread online. See NOTES.

		  For  cpc_bind_cpu(),	the  process does not have the cpc_cpu
		  privilege to access the CPU's counters.

		  For cpc_bind_curlwp(), cpc_bind_cpc(), and  cpc_bind_pctx(),
		  access to the requested hypervisor event was denied.

		  For  cpc_bind_curlwp()  and cpc_bind_pctx(), the performance
		  counters are not available for use by the application.

		  For cpc_bind_cpu(), another process  has  already  bound  to
		  this	CPU. Only one process is allowed to bind to a CPU at a
		  time and only one set can be bound to a CPU at a time.

		  The	set    does    not    contain	 any	requests    or
		  cpc_set_add_request() was not called.

		  The  value  given  for  an  attribute of a request is out of

		  The system could not	assign	a  physical  counter  to  each
		  request in the system.  See NOTES.

		  One  or  more	 requests in the set conflict and might not be
		  programmed simultaneously.

		  The set was not created with the same cpc handle.

		  For cpc_bind_cpu(), the specified processor does not exist.

		  For cpc_unbind(), the set is not bound.

		  For cpc_request_preset() and cpc_set_restart(), the  calling
		  LWP does not have a bound set.

		  For cpc_bind_cpu(), the specified processor is not online.

		  The	cpc_bind_curlwp()   function   was   called  with  the
		  CPC_OVF_NOTIFY_EMT flag, but the underlying processor is not
		  capable of detecting counter overflow.

		  For cpc_bind_pctx(), the specified LWP in the target process
		  does not exist.

       Example 1 Use hardware performance counters  to	measure	 events	 in  a

       The  following example demonstrates how a standalone application can be
       instrumented with the libcpc(3LIB) functions to	use  hardware  perfor‐
       mance counters to measure events in a process. The application performs
       20 iterations of a computation, measuring the counter values  for  each
       iteration. By default, the example makes use of two counters to measure
       external cache references and external cache hits.  These  options  are
       only  appropriate  for UltraSPARC processors. By setting the EVENT0 and
       EVENT1 environment variables to other strings (a list of which  can  be
       obtained	 from  the  -h option of the cpustat(1M) or cputrack(1) utili‐
       ties), other events can be counted.  The error() routine is assumed  to
       be  a  user-provided routine analogous to the familiar printf(3C) func‐
       tion from the C library that also performs an  exit(2)  after  printing
       the message.

	 #include <inttypes.h>
	 #include <stdlib.h>
	 #include <stdio.h>
	 #include <unistd.h>
	 #include <libcpc.h>
	 #include <errno.h>

	 main(int argc, char *argv[])
	 int iter;
	 char *event0 = NULL, *event1 = NULL;
	 cpc_t *cpc;
	 cpc_set_t *set;
	 cpc_buf_t *diff, *after, *before;
	 int ind0, ind1;
	 uint64_t val0, val1;

	 if ((cpc = cpc_open(CPC_VER_CURRENT)) == NULL)
		 error("perf counters unavailable: %s", strerror(errno));

	 if ((event0 = getenv("EVENT0")) == NULL)
	      event0 = "EC_ref";
	 if ((event1 = getenv("EVENT1")) == NULL)
	      event1 = "EC_hit";

	 if ((set = cpc_set_create(cpc)) == NULL)
		 error("could not create set: %s", strerror(errno));

	 if ((ind0 = cpc_set_add_request(cpc, set, event0, 0, CPC_COUNT_USER, 0,
		 NULL)) == -1)
		 error("could not add first request: %s", strerror(errno));

	 if ((ind1 = cpc_set_add_request(cpc, set, event1, 0, CPC_COUNT_USER, 0,
		 NULL)) == -1)
		 error("could not add first request: %s", strerror(errno));

	 if ((diff = cpc_buf_create(cpc, set)) == NULL)
		 error("could not create buffer: %s", strerror(errno));
	 if ((after = cpc_buf_create(cpc, set)) == NULL)
		 error("could not create buffer: %s", strerror(errno));
	 if ((before = cpc_buf_create(cpc, set)) == NULL)
		 error("could not create buffer: %s", strerror(errno));

	 if (cpc_bind_curlwp(cpc, set, 0) == -1)
		  error("cannot bind lwp%d: %s", _lwp_self(), strerror(errno));

	 for (iter = 1; iter <= 20; iter++) {

		 if (cpc_set_sample(cpc, set, before) == -1)

		  /* ==> Computation to be measured goes here <== */

		 if (cpc_set_sample(cpc, set, after) == -1)

		 cpc_buf_sub(cpc, diff, after, before);
		 cpc_buf_get(cpc, diff, ind0, &val0);
		 cpc_buf_get(cpc, diff, ind1, &val1);

		  (void) printf("%3d: %" PRId64 " %" PRId64 "\n", iter,
			 val0, val1);

	  if (iter != 21)
		 error("cannot sample set: %s",	 strerror(errno));


	 return (0);

       Example 2 Write a signal handler to catch overflow signals.

       The following example builds on Example 1 and demonstrates how to write
       the signal handler to catch overflow signals. A counter	is  preset  so
       that it is 1000 counts short of overflowing. After 1000 counts the sig‐
       nal handler is invoked.

       The signal handler:

	 cpc_t	   *cpc;
	 cpc_set_t *set;
	 cpc_buf_t *buf;
	 int	   index;

	 emt_handler(int sig, siginfo_t *sip, void *arg)
	      ucontext_t *uap = arg;
	      uint64_t val;

	      if (sig != SIGEMT || sip->si_code != EMT_CPCOVF) {
		  psignal(sig, "example");
		  psiginfo(sip, "example");

	      (void) printf("lwp%d - si_addr %p ucontext: %%pc %p %%sp %p\n",
		  _lwp_self(), (void *)sip->si_addr,
		  (void *)uap->uc_mcontext.gregs[PC],
		  (void *)uap->uc_mcontext.gregs[SP]);

	      if (cpc_set_sample(cpc, set, buf) != 0)
		  error("cannot sample: %s", strerror(errno));

	      cpc_buf_get(cpc, buf, index, &val);

	      (void) printf("0x%" PRIx64"\n", val);
	      (void) fflush(stdout);

	      * Update a request's preset and restart the counters. Counters which
	      * have not been preset with cpc_request_preset() will resume counting
	      * from their current value.
	      (cpc_request_preset(cpc, ind1, val1) != 0)
		 error("cannot set preset for request %d: %s", ind1,
		 if (cpc_set_restart(cpc, set) != 0)
		      error("cannot restart lwp%d: %s", _lwp_self(), strerror(errno));

       The setup code, which can be positioned after the code that  opens  the
       CPC library and creates a set:

	 #define PRESET (UINT64_MAX - 999ull)

	      struct sigaction act;
	      act.sa_sigaction = emt_handler;
	      bzero(&act.sa_mask, sizeof (act.sa_mask));
	      act.sa_flags = SA_RESTART|SA_SIGINFO;
	      if (sigaction(SIGEMT, &act, NULL) == -1)
		  error("sigaction: %s", strerror(errno));

	      if ((index = cpc_set_add_request(cpc, set, event, PRESET,
		 error("cannot add request to set: %s", strerror(errno));

	      if ((buf = cpc_buf_create(cpc, set)) == NULL)
		 error("cannot create buffer: %s", strerror(errno));

	      if (cpc_bind_curlwp(cpc, set, 0) == -1)
		  error("cannot bind lwp%d: %s", _lwp_self(), strerror(errno));

	      for (iter = 1; iter <= 20; iter++) {
		  /* ==> Computation to be measured goes here <== */

	      cpc_unbind(cpc, set);	 /* done */

       See attributes(5) for descriptions of the following attributes:

       │Interface Stability │ Evolving	      │
       │MT-Level	    │ Safe	      │

       cpustat(1M),  cputrack(1),  psrinfo(1M),	 processor_bind(2), cpc_seter‐
       rhndlr(3CPC), cpc_set_sample(3CPC), libcpc(3LIB), attributes(5)

       When a set is bound, the system assigns a physical hardware counter  to
       count  on  behalf  of each request in the set. If such an assignment is
       not possible for all requests in the set, the bind function returns  -1
       and  sets  errno	 to  EINVAL.  The  assignment  of requests to counters
       depends on the capabilities of the available counters. Some  processors
       (such  as  Pentium 4) have a complicated counter control mechanism that
       requires the reservation	 of  limited  hardware	resources  beyond  the
       actual  counters. It could occur that two requests for different events
       might be impossible to count at the same	 time  due  to	these  limited
       hardware	  resources.   See  the	 processor  manual  as	referenced  by
       cpc_cpuref(3CPC) for details about the underlying processor's capabili‐
       ties and limitations.

       Some processors can be configured to dispatch an interrupt when a phys‐
       ical counter overflows. The most obvious use for this  facility	is  to
       ensure  that  the  full	64-bit	counter	 values are maintained without
       repeated sampling. Certain hardware, such as the UltraSPARC  processor,
       does  not  record  which counter overflowed. A more subtle use for this
       facility is to preset the counter to a value  slightly  less  than  the
       maximum	value,	then  use the resulting interrupt to catch the counter
       overflow associated with that event. The overflow can then be  used  as
       an indication of the frequency of the occurrence of that event.

       The interrupt generated by the processor might not be particularly pre‐
       cise.  That is, the particular  instruction  that  caused  the  counter
       overflow	 might	be earlier in the instruction stream than is indicated
       by the program counter value in the ucontext.

       When a request is added to a set with the CPC_OVF_NOTIFY_EMT flag  set,
       then  as	 before, the control registers and counter are preset from the
       64-bit preset value given. When the flag is set,	 however,  the	kernel
       arranges	 to send the calling process a SIGEMT signal when the overflow
       occurs. The si_code member of the corresponding	siginfo	 structure  is
       set  to	EMT_CPCOVF  and	 the  si_addr member takes the program counter
       value at the time the overflow interrupt	 was  delivered.  Counting  is
       disabled until the set is bound again.

       If  the	CPC_CAP_OVERFLOW_PRECISE  bit  is set in the value returned by
       cpc_caps(3CPC), the processor is	 able  to  determine  precisely	 which
       counter	has overflowed after receiving the overflow interrupt. On such
       processors, the SIGEMT signal is sent only if a counter	overflows  and
       the  request  that  the	counter is counting has the CPC_OVF_NOTIFY_EMT
       flag set.  If the capability is not present on the processor, the  sys‐
       tem  sends  a  SIGEMT signal to the process if any of its requests have
       the CPC_OVF_NOTIFY_EMT flag set and any counter in its set overflows.

       Different processors have different counter  ranges  available,	though
       all processors supported by Solaris allow at least 31 bits to be speci‐
       fied as a counter preset value. Portable preset values lie in the range
       UINT64_MAX to UINT64_MAX-INT32_MAX.

       The  appropriate	 preset value will often need to be determined experi‐
       mentally.  Typically, this value will depend on the  event  being  mea‐
       sured  as  well as the desire to minimize the impact of the act of mea‐
       surement on the event being measured. Less frequent interrupts and sam‐
       ples lead to less perturbation of the system.

       If  the	processor  cannot  detect counter overflow, bind will fail and
       return ENOTSUP. Only user events can be measured using this  technique.
       See Example 2.

   Pentium 4
       Most  Pentium  4	 events require the specification of an event mask for
       counting.  The event mask is specified with the emask attribute.

       Pentium 4 processors with HyperThreading Technology have only  one  set
       of  hardware  counters per physical processor. To use cpc_bind_curlwp()
       or cpc_bind_pctx() to measure per-LWP events on a system with Pentium 4
       HT processors, a system administrator must first take processors in the
       system offline until each physical  processor  has  only	 one  hardware
       thread  online (See the -p option to psrinfo(1M)). If a second hardware
       thread is brought online, all per-LWP bound contexts  will  be  invali‐
       dated and any attempt to sample or bind a CPC set will return EAGAIN.

       Only  one  CPC  set at a time can be bound to a physical processor with
       cpc_bind_cpu(). Any call to cpc_bind_cpu() that attempts to bind a  set
       to  a  processor that shares a physical processor with a processor that
       already has a CPU-bound set returns an error.

       To measure the shared state on a Pentium 4 processor with  HyperThread‐
       ing,  the  count_sibling_usr  and count_sibling_sys attributes are pro‐
       vided for use with cpc_bind_cpu(). These attributes behave  exactly  as
       the CPC_COUNT_USER and CPC_COUNT_SYSTEM request flags, except that they
       act on the sibling hardware thread sharing the physical processor  with
       the CPU measured by cpc_bind_cpu(). Some CPC sets will fail to bind due
       to resource constraints. The most common type of resource constraint is
       an  ESCR	 conflict  among one or more requests in the set. For example,
       the branch_retired event cannot be  measured  on	 counters  12  and  13
       simultaneously because both counters require the CRU_ESCR2 ESCR to mea‐
       sure this event. To measure  branch_retired  events  simultaneously  on
       more  than  one	counter,  use  counters	 such  that  one  counter uses
       CRU_ESCR2 and the other counter uses CRU_ESCR3. See the processor docu‐
       mentation for details.

				 Mar 05, 2007		 CPC_BIND_CURLWP(3CPC)

List of man pages available for SmartOS

Copyright (c) for man pages and the logo by the respective OS vendor.

For those who want to learn more, the polarhome community provides shell access and support.

[legal] [privacy] [GNU] [policy] [cookies] [netiquette] [sponsors] [FAQ]
Polarhome, production since 1999.
Member of Polarhome portal.
Based on Fawad Halim's script.
Vote for polarhome
Free Shell Accounts :: the biggest list on the net