sched_stat man page on DigitalUNIX

sched_stat(8)							 sched_stat(8)

NAME
       sched_stat - Displays CPU usage and process-scheduling statistics for
       SMP and NUMA platforms

SYNOPSIS
       /usr/sbin/sched_stat [-l] [-s] [-f] [-u] [-R] [command [cmd_arg]...]

OPTIONS
       -f  Prints the count of calls that are not multiprocessor safe and
           are therefore funneled to the master CPU. For example:

               Funnelling counts

               unix master calls 11174   resulting blocks 2876

           The impact of funneled calls on the master CPU needs to be taken
           into account when evaluating statistics for the master CPU.

       -l  Prints scheduler load-balancing statistics. For example:
                              Scheduler Load Balancing

                                                    |           5-second
                   steal            idle   desired  |  current  interrupt      RT
           cpu      trys  steals  steals      load  |     load          %       %
             0 |     288       3   20609     0.000  |    0.000      0.454   0.156
             1 |     615       6   21359     0.000  |    0.000      0.002   0.203
             2 |     996       4   20135     0.000  |    0.001      0.000   0.237
             3 |    1302       4   16195     0.000  |    0.001      0.000   0.330
             6 |       5       0    3029     0.000  |    0.000      0.000   0.034

           .  .  .

           In the displayed table, each row contains the following per-CPU
           information:

           cpu            The number identifier of the CPU.

           steal trys     The number of attempts made to steal
                          processes/threads from other CPUs when the CPU
                          was not idle.

           steals         The number of processes/threads actually stolen
                          from other CPUs when the CPU was not idle.

           idle steals    The number of processes/threads stolen from other
                          CPUs when the CPU was idle.

           desired load   The number of time slices that should be used on
                          this CPU for running timeshare threads. This
                          value is calculated by comparing the current
                          load, interrupt %, and RT % statistics obtained
                          for this CPU with those obtained for other CPUs
                          in the same PAG.

                          When current load is less than desired load, the
                          scheduler attempts to migrate timeshare threads
                          to this CPU in order to better balance the
                          timeshare workload among CPUs in the same PAG.
                          See DESCRIPTION for information about PAGs.

           current load   Over the last five seconds, the average number of
                          time slices used to run timeshare threads on this
                          CPU.

           interrupt %    Over the last five seconds, the average
                          percentage of time slices that this CPU spent in
                          interrupt context.

           RT %           Over the last five seconds, the average
                          percentage of time slices that this CPU used to
                          run threads according to FIFO or round-robin
                          policy.

       -R  Prints information about CPU locality in two tables:

           Radtab   Shows the order of preference (in terms of memory
                    affinity) that exists between a CPU and different RADs.
                    Order of preference indicates, for a given home RAD,
                    the ranking of other RADs in terms of increasing
                    physical distance from that home RAD. If a process or
                    thread needs more memory or needs to be scheduled on a
                    RAD other than its home RAD, the kernel automatically
                    searches RADs for additional memory or CPU cycles in
                    the order of preference shown in this table.

           Hoptab   Shows the distance (number of hops) between different
                    RADs and, by association, between CPUs. The information
                    in this table is coarser grained than in the preceding
                    Radtab table and more relevant to NUMA programming
                    choices. For example, the expression RAD_DIST_LOCAL + 2
                    indicates RADs that are no more than two hops from a
                    thread's home RAD.

	      For example (a small, switchless mesh NUMA system):

           Radtab (rads in order of preference)

                            CPU #
           Preference          0     1     2     3
           ---------------------------------------
                    0          0     1     2     3
                    1          1     0     3     2
                    2          2     3     0     1
                    3          3     2     1     0

           Hoptab (hops indexed by rad)

                            CPU #
           To rad #            0     1     2     3
           ---------------------------------------
                  0            0     1     1     2
                  1            1     0     2     1
                  2            1     2     0     1
                  3            2     1     1     0

	      In these tables, the CPU identifiers are listed across  the  top
	      from  left  to  right  and the RAD identifiers are listed on the
	      left from top to bottom.  For example, if a process running on
	      CPU  2 needs additional memory, Radtab indicates that the kernel
	      will search for that memory first in RAD 2, then in RAD 3,  then
	      in  RAD  0,  and	last in RAD 1.	Hoptab shows the basis of this
	      preference in that RAD 2 is CPU 2's local RAD, RADs 0 and 3  are
	      one hop away, and RAD 1 is two hops away.
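           The way Radtab follows from Hoptab can be sketched in a few
           lines: ranking RADs by hop distance from a home RAD reproduces
           the search order. This is an illustrative sketch (not kernel
           code), and the tie-breaking between equally distant RADs is an
           assumption; the kernel may order one-hop RADs differently, as
           the CPU 2 column above shows.

```python
# Hop distances between RADs for the 4-RAD mesh example above:
# HOPTAB[home][other] = number of hops from RAD `home` to RAD `other`.
HOPTAB = [
    [0, 1, 1, 2],
    [1, 0, 2, 1],
    [1, 2, 0, 1],
    [2, 1, 1, 0],
]

def rad_preference(home_rad):
    """Rank all RADs by increasing hop distance from the home RAD.

    Ties (equally distant RADs) are broken here by RAD id; the kernel
    may order them differently, as the Radtab example shows.
    """
    rads = range(len(HOPTAB))
    return sorted(rads, key=lambda rad: (HOPTAB[home_rad][rad], rad))

# A process homed on RAD 2 finds its own RAD first and the two-hop
# RAD 1 last, matching the Radtab column for CPU 2.
order = rad_preference(2)
```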

           The -R option is useful only on NUMA platforms, such as GS1280
           and ES80 AlphaServer systems, in which memory latency times vary
           from one RAD to another. The information in these tables is less
           useful for GS80, GS160, and GS320 AlphaServer systems because
           both coarse- and finer-grained memory affinity are the same from
           any CPU in one RAD to any CPU in another RAD; however, the
           displays can tell you which CPUs are in which RAD.

           Make sure that you both maximize the size of your terminal
           emulator window and minimize the font size before using the -R
           option; otherwise, line wrapping will render the tables very
           difficult to read on systems that have many CPUs.

       -s  Prints scheduling-dispatch (processor-usage) statistics for each
           CPU. For example:
                          Scheduler Dispatch Statistics

           cpu 0       local    global        idle   remote |     total  percent
           hot         60827     12868    19158991        0 |  19232686     91.6
           warm           78        21     1542019        0 |   1542118      7.3
           cold          315     27289      184784     7855 |    220243      1.0
           total       61220     40178    20885794     7855 |  20995047
           percent       0.3       0.2        99.5      0.0

           cpu 1       local    global        idle   remote |     total  percent
           hot         33760     11788    16412544        0 |  16458092     89.5
           warm           66        24     1707014        0 |   1707104      9.3
           cold          201     26191      203513        0 |    229905      1.2

           .  .  .

           These statistics show the count and percentage of thread context
           switches (times that the kernel switches to a new thread) for
           the following categories:

           local    Threads scheduled from the CPU's Local Run Queue

           global   Threads scheduled from the Global Run Queue of the PAG
                    to which the CPU belongs

           idle     Threads scheduled from the Idle CPU Queue of the PAG to
                    which the CPU belongs

           remote   Threads stolen from Global or Local Run Queues in
                    another PAG

	      Note  that  these	 statistics  do not count CPU time slices that
	      were used to re-run the same thread.
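           The percent column is simply each subcategory's share of the
           CPU's total context switches. A minimal sketch, checked against
           the cpu 0 figures from the sample output above:

```python
def category_percent(category_total, cpu_total):
    """Return a subcategory's share of all context switches on a CPU,
    as a percentage rounded the way the report prints it."""
    return round(100.0 * category_total / cpu_total, 1)

# cpu 0 totals from the sample: hot 19232686, warm 1542118,
# cold 220243, overall total 20995047.
hot_pct = category_percent(19232686, 20995047)   # 91.6
warm_pct = category_percent(1542118, 20995047)   # 7.3
cold_pct = category_percent(220243, 20995047)    # 1.0
```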

	      Each SMP unit (or RAD on a NUMA system) has a Processor Affinity
	      Group (PAG). Each PAG contains the following queues:

              +  A Global Run Queue, from which processes or threads are
                 scheduled on the first available CPU

              +  One or more Local Run Queues, from which processes or
                 threads are scheduled on a specific CPU

              +  A queue that contains idle CPUs

	      A thread that is handed to an idle CPU goes directly to that CPU
	      without first being placed on the other queues.

	      If  there	 is insufficient work queued locally to keep the PAG's
	      CPUs busy, threads are stolen first from the Global and then the
	      Local Run Queues in a remote PAG.
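           The scheduling order described above can be sketched as a search
           sequence: a thread handed directly to an idle CPU comes first,
           then the CPU's Local Run Queue, then the PAG's Global Run Queue,
           and only then stealing from a remote PAG (Global before Local).
           This is an illustrative model of the text, not kernel code; the
           function and parameter names are assumptions.

```python
def next_thread_source(handoff, local_q, global_q, remote_global_q,
                       remote_local_q):
    """Return where a CPU would find its next thread, searching the
    queues in the order the description above gives them."""
    sources = [
        ("handoff", handoff),                # thread handed directly to an idle CPU
        ("local", local_q),                  # this CPU's Local Run Queue
        ("global", global_q),                # the PAG's Global Run Queue
        ("remote global", remote_global_q),  # steal from a remote PAG's Global Run Queue
        ("remote local", remote_local_q),    # then from its Local Run Queues
    ]
    for name, queue in sources:
        if queue:
            return name
    return "idle"
```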

           For each of these categories, statistics are grouped into hot,
           warm, and cold subcategories. The hot statistics show context
           switches to threads that last ran on the CPU only a very short
           time before. The warm statistics show context switches to
           threads that last ran on the CPU a somewhat longer time before.
           The cold statistics indicate context switches to threads that
           never ran on the CPU before. These statistics are a measure of
           how well cache affinity is being maintained; that is, how likely
           it is that the data used by threads when they last ran is still
           in the cache when the threads are rescheduled. You cannot
           evaluate this information without knowledge of the type of work
           being done on the system; maintenance of cache affinity can be
           very important on systems (or processor sets) that are dedicated
           to running certain applications (such as those doing
           high-performance technical computing) but is less critical for
           systems serving a variety of applications and users.

       -u  Prints processor-usage statistics for each CPU. For example:

                                 Processor Usage

           cpu    user   nice  system   idle  widle |   scalls      intr       csw   tbsyc
             0 |   0.0    0.0     0.7   99.2    0.1 |  3327337  50861486  41885424  317108
             1 |   0.0    0.0     0.4   99.5    0.1 |  3514438         0  36710149  268667
             2 |   0.0    0.0     0.4   99.5    0.1 |  3182064         0  37384120  257749
             3 |   0.0    0.0     0.4   99.5    0.1 |  3528519         0  36468319  249492
             6 |   0.0    0.0     0.1   99.9    0.0 |   668892     11664  11793053  352294
             7 |   0.0    0.0     0.1   99.9    0.0 |   772821         0   9341527  352319
             8 |   0.0    0.0     0.0  100.0    0.0 |   529050     11724   5717059  347267
             9 |   0.0    0.0     0.0  100.0    0.0 |   492386         0   6603681  351509

           .  .  .

           In this table, the columns contain the following per-CPU
           information:

           cpu      The number identifier of the CPU.

           user     The percentage of time slices spent running threads in
                    user context.

           nice     The percentage of time slices in which lower-priority
                    threads were scheduled. These are user-context threads
                    whose priority was explicitly lowered by using an
                    interface such as the nice command or the
                    class-scheduling software.

           system   The percentage of time slices spent running threads in
                    system context. This work includes servicing of
                    interrupts and system calls that are made on behalf of
                    user processes. An unusually high percentage in the
                    system category might indicate a system bottleneck.
                    Running uprofile and lockinfo provides more specific
                    information about where system time is being spent. See
                    uprofile(1) and lockinfo(8), respectively, for
                    information about these utilities.

           idle     The percentage of time slices in which no threads were
                    scheduled.

           widle    The percentage of time slices in which available
                    threads were blocked by pending I/O and the CPU was
                    idle. If this count is unusually high, it suggests that
                    a bottleneck in an I/O channel might be causing
                    suboptimal performance.

           scalls   The count of system calls that were serviced.

           intr     The count of interrupts that were serviced.

           csw      The count of thread context switches (thread scheduling
                    changes) that completed.

           tbsyc    The number of times that the translation buffer was
                    synchronized.

OPERANDS
       command    The command to be executed by sched_stat.

       cmd_arg    Any arguments to the preceding command.

       The command and cmd_arg operands are used to limit the length of
       time in which sched_stat gathers statistics. Typically, sleep is
       specified for command and some number of seconds is specified for
       cmd_arg.

       If you do not specify a command to define a time interval for
       statistics gathering, the statistics reflect what has occurred since
       the system was last booted.

DESCRIPTION
       The sched_stat utility helps you determine how well the system load  is
       distributed among CPUs, what kinds of jobs are getting (or not getting)
       sufficient cycles on each CPU, and how well  cache  affinity  is	 being
       maintained for these jobs.

       Answers to the following questions influence how a process and its
       threads are scheduled:

       Is the request to be serviced multiprocessor-safe?
	      If  not,	the  kernel funnels the request to the master CPU. The
	      master CPU must reside in the default processor set (which  con‐
	      tains all system CPUs if none were assigned to user-defined pro‐
	      cessor sets) and is typically CPU	 0;  however,  some  platforms
	      permit CPUs other than CPU 0 to be the master CPU.  Few requests
	      generated by software distributed with the operating system need
	      to  be  funneled to the master CPU and most of these are associ‐
	      ated with certain device drivers. However, if  the  system  runs
	      many  third-party	 drivers,  the number of requests that must be
	      funneled to the master CPU might be higher.

       What is the job's priority?

	      Job  priority  influences	 how frequently a thread is scheduled.
	      Realtime requests and interrupts have higher priority than time-
	      share jobs, which include the majority of user-mode threads. So,
	      if a significant number of CPU cycles are spent servicing	 real‐
	      time  requests  and interrupts, there are fewer cycles available
	      for time-share jobs.

	      Default priority for time-share jobs  can	 also  be  changed  by
	      using  the nice command, the runclass command, or through class-
	      scheduling software. On a busy system, cache  affinity  is  less
	      likely to be maintained for a thread from a time-share job whose
	      priority was lowered because  more  time	is  likely  to	elapse
	      between  rescheduling  operations	 for  each thread. Conversely,
	      cache affinity is more likely to be maintained for threads of  a
	      higher-priority time-share job because less time elapses between
	      rescheduling operations. Note that the scheduler always  priori‐
	      tizes  the  need for low response latency (as demanded by inter‐
	      rupts and real-time requests) higher than maintenance  of	 cache
	      affinity,	 regardless  of	 the priority assigned to a time-share
	      job.

       Are there user-defined restrictions that limit where a process may
       run?

	      If  so,  the kernel must schedule all threads of that process on
	      CPUs in the restricted set. In some cases, user-defined restric‐
	      tions  are  explicit  RAD or CPU bindings specified either in an
	      application or by a command (such as runon)  that	 was  used  to
	      launch the program or reassign one of its threads.

	      The  set	of CPUs where the kernel can schedule a thread is also
	      influenced by the presence of user-defined  processor  sets.  If
	      the  process  was	 not  explicitly started in or reassigned to a
	      user-defined processor set, the kernel must run it  and  all  of
	      its threads only on CPUs in the default processor set.

       Are any CPUs idle?

	      The scheduler is very aggressive in its attempts to  steal  jobs
	      from  other  CPUs	 to  run  on  an idle CPU. This means that the
	      scheduler will migrate processes or threads  across  RAD	bound‐
	      aries to give an idle CPU work to do unless one of the preceding
	      restrictions is in place	to  prevent  that.  For	 example,  the
	      scheduler	 does not cross processor set boundaries when stealing
	      work from another CPU, even when a  CPU  is  idle.  In  general,
	      keeping CPUs busy with work has higher priority than maintaining
	      memory or cache affinity during load-balancing operations.
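	      The boundary rules above reduce to a small predicate:
	      stealing may cross RAD (PAG) boundaries but never
	      processor-set boundaries. A minimal sketch of that rule (the
	      pset and rad fields are illustrative, not kernel structures):

```python
def may_steal(idle_cpu, victim_cpu):
    """Decide whether an idle CPU may steal work from another CPU.

    Per the description above: the scheduler crosses RAD boundaries to
    keep an idle CPU busy, but it never crosses processor-set
    boundaries, even when a CPU is idle.
    """
    if idle_cpu["pset"] != victim_cpu["pset"]:
        return False  # never cross processor-set boundaries
    return True       # crossing RAD boundaries is allowed

cpu_a = {"pset": 0, "rad": 0}
cpu_b = {"pset": 0, "rad": 1}   # different RAD, same processor set: OK
cpu_c = {"pset": 1, "rad": 1}   # different processor set: never stolen from
```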

       Explicit memory-allocation advice provided in application  code	influ‐
       ences  scheduling  only to the extent that the preceding factors do not
       override that advice. However, explicit memory-allocation  advice  does
       make  a	difference  (and thereby can improve performance) when CPUs in
       the processor set where the program is running are kept	busy  but  are
       not overloaded.

       To gather statistics with sched_stat, you typically follow these
       steps:

       1.  Start up a system workload and wait for it to reach a steady
           state.

       2.  Start sched_stat with sleep as the specified command and some
           number of seconds as the specified cmd_arg. This causes
           sched_stat to gather statistics for the length of time it takes
           the sleep command to execute.

       For example, the following command causes sched_stat to collect
       statistics for 60 seconds and then print a report:

              # /usr/sbin/sched_stat sleep 60
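       The sleep idiom is also easy to script. A sketch in Python that only
       builds the command line (sched_stat exists only on Digital UNIX/
       Tru64 systems, so the command is constructed but not run here; the
       function name is an assumption):

```python
def sched_stat_argv(seconds, options=()):
    """Build a sched_stat command line that samples for `seconds`
    seconds, using sleep as the timed command to bound the window."""
    return ["/usr/sbin/sched_stat", *options, "sleep", str(seconds)]

# Equivalent to the 60-second example above, restricted to -u output:
argv = sched_stat_argv(60, options=("-u",))
# On a Tru64 system this could then be run with, e.g.,
# subprocess.run(argv, check=True).
```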

       If  you	include	 options  on the command line, only statistics for the
       specified options are reported.

       If you specify the command without any options, all options except  for
       -R are assumed. (See the descriptions of the -f, -l, -s, and -u options
       in the OPTIONS section.)

       Running the sched_stat command has minimal impact on system
       performance.

NOTES
       The  sched_stat	utility	 is subject to change, without advance notice,
       from one release to another. The utility is intended mainly for use  by
       other  software	applications included in the operating system product,
       kernel developers, and  software	 support  representatives.  Therefore,
       sched_stat  should  be used only interactively; any customer scripts or
       programs written to depend on its output data or display	 format	 might
       be  broken  by  changes in future versions of the utility or by patches
       that might be applied to it.

EXIT STATUS
       0    Success.

       >0   An error occurred.

FILES
       The pseudo driver that is opened by the	sched_stat  utility  for  RAD-
       related statistics gathering.

SEE ALSO
       Commands: iostat(1), netstat(1), nice(1), renice(1), runclass(1),
       runon(1), uprofile(1), vmstat(1), advfsstat(8), collect(8),
       lockinfo(8), nfsstat(8), sys_check(8)

       Others: numa_intro(3), class_scheduling(4), processor_sets(4)

