numa_intro(3)							 numa_intro(3)

NAME
       numa_intro - Introduction to NUMA support

DESCRIPTION
       NUMA,  or Non-Uniform Memory Access, refers to a hardware architectural
       feature in modern multiprocessor platforms that attempts to address the
       increasing disparity between requirements for processor speed and band‐
       width and the bandwidth capabilities of memory systems,	including  the
       interconnect  between  processors and memory. NUMA systems address this
       problem by grouping resources--processors, I/O buses, and  memory--into
       building	 blocks	 that  balance an appropriate number of processors and
       I/O buses with a local memory system that delivers the necessary	 band‐
       width.  The  local building blocks are combined into a larger system by
       means of a system-level interconnect with a platform-specific topology.

       The local processor and I/O components on a particular  building	 block
       can  access  their  own “local” memory with the lowest possible latency
       for a particular system design. The local building block	 can  in  turn
       access  the  resources (processors, I/O, and memory) of remote building
       blocks at the cost of increased access  latency	and  decreased	global
       access  bandwidth.  The	term “Non-Uniform Memory Access” refers to the
       difference in latency between “local” and “remote” memory accesses that
       can occur on a NUMA platform.

       Overall system throughput and individual application  performance  are
       optimized on a NUMA platform by maximizing the ratio of local  resource
       accesses	 to  remote accesses. This is achieved by recognizing and pre‐
       serving the “affinity” that processes have for the various resources on
       the  system  building blocks.  For this reason, the building blocks are
       called “Resource Affinity Domains” or RADs.
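
       The effect of the local-to-remote ratio can be illustrated with a small
       model. The sketch below is not part of any Tru64 UNIX  library;  the
       function name and the latency figures used with it are assumptions made
       purely for illustration.

```c
#include <assert.h>

/*
 * Illustrative model only: the latency values passed in are made-up
 * inputs, not measurements from any NUMA platform.
 *
 * Returns the expected latency of one memory access when a fraction
 * local_ratio (0.0 to 1.0) of all accesses is served from local memory
 * and the remainder is served from a remote building block.
 */
double avg_access_latency_ns(double local_ratio, double local_ns,
                             double remote_ns)
{
    assert(local_ratio >= 0.0 && local_ratio <= 1.0);
    return local_ratio * local_ns + (1.0 - local_ratio) * remote_ns;
}
```

       With an assumed 100 ns local latency and 300 ns  remote  latency,  the
       model gives 200 ns per access when half of the accesses are local, and
       approaches 100 ns as the local share approaches 1.0.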

       RADs are supported only on a class of platforms known as Cache Coherent
       NUMA,  or  CC  NUMA,  where all memory is accessible and cache coherent
       with respect to all processors and I/O buses. The Tru64 UNIX  operating
       system includes enhancements to optimize system throughput and applica‐
       tion performance on CC NUMA platforms for legacy applications  as  well
       as  those that use NUMA-aware APIs. System enhancements to support NUMA
       are discussed in the following subsections.  Along with system  perfor‐
       mance  monitoring  and  tuning facilities, these enhancements allow the
       operating system to make a “best effort” to optimize the performance of
       any given collection of applications or application components on a CC
       NUMA platform.

   NUMA Enhancements to Basic UNIX Algorithms and Default Behaviors
       For NUMA, modifications to basic UNIX algorithms (scheduling, memory
       allocation, and so forth) and to default behaviors maximize local
       accesses transparently to applications. These modifications, which
       include the following, directly benefit legacy and non-NUMA-aware
       applications that were designed for uniprocessors or Uniform Memory
       Access Symmetric Multiprocessors but run on CC NUMA platforms:

       Topology-aware placement of data

              The operating system attempts to allocate memory for application
              (and kernel) data on the RAD closest to where the data will be
              accessed; or, for data that is globally accessed, the operating
              system may allocate memory across the available RADs. When there
              is insufficient free memory on optimal RADs, the memory
              allocations for data may “overflow” onto nearby RADs.

       Replication of read-only code and data

              The operating system will attempt to make a local copy of
              read-only text, such as shared library and program code. Kernel
              code and kernel read-only data are replicated on all RADs at
              boot time. If insufficient free local memory is available, the
              operating system may choose to utilize a remote copy rather than
              wait for free local memory.

       Memory affinity-aware scheduling

              The operating system scheduler takes “cache affinity” into
              account when choosing a processor to run a process thread on
              multiprocessor platforms. Cache affinity assumes that a process
              thread builds a “memory footprint” in a particular processor's
              cache. On CC NUMA platforms, the scheduler also takes into
              account the fact that processes will have memory allocated on
              particular RADs, and will attempt to keep processes running on
              processors that are in the same RAD as their memory footprints.

       Load balancing

	      To  minimize the requirement for remote memory allocation (over‐
	      flow), the scheduler will take into account memory  availability
	      on  a  RAD  as  well  as the processor load average for the RAD.
	      Although these two  factors  may	at  times  conflict  with  one
	      another,	the scheduler will attempt to balance the load so that
	      processes run where there are memory pages as well as  processor
	      cycles  available.  This	balancing  involves  both  the initial
	      selection of a RAD at process creation  and  migration  of  pro‐
	      cesses or individual pages in response to changing loads as pro‐
	      cesses come and go or their resource requirements or access pat‐
	      terns change.
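
       The “overflow” behavior described under topology-aware placement can be
       pictured with a toy model. The sketch below is an assumption-laden
       illustration, not the kernel's actual algorithm: the four-RAD ring
       topology, the hop table, and every model_ name are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

#define MODEL_NRADS 4   /* hypothetical system with four RADs */

/* Hypothetical hop counts between RADs arranged in a ring. */
static const int model_hops[MODEL_NRADS][MODEL_NRADS] = {
    {0, 1, 2, 1},
    {1, 0, 1, 2},
    {2, 1, 0, 1},
    {1, 2, 1, 0},
};

/*
 * Choose a RAD for an allocation of `pages` pages that will mostly be
 * accessed from RAD `preferred`. free_pages[r] is the free memory on
 * RAD r. The preferred RAD wins when it has room (zero hops); otherwise
 * the request overflows to the nearest RAD that has room. Ties go to
 * the lowest-numbered RAD. Returns -1 if no RAD can hold the request.
 */
int model_pick_rad(int preferred, size_t pages,
                   const size_t free_pages[MODEL_NRADS])
{
    int best = -1;

    assert(preferred >= 0 && preferred < MODEL_NRADS);
    for (int r = 0; r < MODEL_NRADS; r++) {
        if (free_pages[r] < pages)
            continue;                 /* not enough free memory here */
        if (best < 0 || model_hops[preferred][r] < model_hops[preferred][best])
            best = r;                 /* strictly nearer candidate wins */
    }
    return best;
}
```

       A real scheduler also weighs processor load, as described under load
       balancing; this model considers memory availability and distance only.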

   NUMA Enhancements to Application Programming Interfaces
       Application  programmers	 can  use  new or modified library routines to
       further increase local accesses on CC NUMA platforms. Using these APIs,
       programmers  can	 write	new applications or modify old ones to provide
       additional information to the operating system or to take explicit con‐
       trol over process, thread, memory object placement, or some combination
       of these.

       Following are tables that list the NUMA library routines that deal with
       RADs  and  RAD sets, processes and threads, memory management, CPUs and
       CPU sets, and NUMA Scheduling Groups. Routines  are  listed  alphabeti‐
       cally  in each table, and some routines are listed in more than one ta‐
       ble.

       For information about NUMA types, structures, and symbolic values,  see
       numa_types(4).	For  information  about	 NUMA  Scheduling  Groups, see
       numa_scheduling_groups(4).

       RADs and RAD Sets

       ───────────────────────────────────────────────────────────────────────
       Function		   Purpose		Library	  Reference Page
       ───────────────────────────────────────────────────────────────────────
       nloc()		   Returns  the	  RAD	libnuma	   nloc(3)
			   set	 that	is  a
			   specified distance
			   from a resource.

       rad_attach_pid()	   Attaches a process	libnuma	   rad_attach_pid(3)
			   to a RAD  (assigns
			   a   home  RAD  but
			   allows   execution
			   on other RADs).

       rad_bind_pid()	   Binds a process to	libnuma	   rad_attach_pid(3)
			   a RAD  (assigns  a
			   home	   RAD	  and
			   restricts   execu‐
			   tion	 to  the home
			   RAD).

       rad_foreach()	   Scans  a  RAD  set	libnuma	   rad_foreach(3)
			   for	 members  and
			   returns the	first
			   member found.

       rad_get_cur‐	   Returns the	call‐	libnuma	  rad_get_cur‐
       rent_home()	   er's home RAD.		  rent_home(3)

       rad_get_cpus()	   Returns the set of	libnuma	   rad_get_num(3)
			   CPUs that are in a
			   RAD.

       rad_get_freemem()   Returns a snapshot	libnuma	   rad_get_num(3)
			   of the free memory
			   pages that are  in
			   a RAD.

       rad_get_info()	   Returns   informa‐	libnuma	   rad_get_num(3)
			   tion about a	 RAD,
			   including	  its
			   state  (online  or
			   offline)  and  the
			   number of CPUs and
			   memory   pages  it
			   contains.

       rad_get_max()	   Returns the number	libnuma	   rad_get_num(3)
			   of	RADs  in  the
			   system.  **

       rad_get_num()       Returns the number   libnuma    rad_get_num(3)
                           of  RADs  in   the
                           caller's    parti‐
                           tion. **

       rad_get_physmem()   Returns the number	libnuma	   rad_get_num(3)
			   of  memory	pages
			   assigned to a RAD.

       rad_get_state()	   Reserved	  for	libnuma	   rad_get_num(3)
			   future use.	(Cur‐
			   rently,  RAD state
			   is always  set  to
			   RAD_ONLINE.)

       radaddset()	   Adds	 a  RAD	 to a	libnuma	   radsetops(3)
			   RAD set.

       radandset()	   Performs a logical	libnuma	   radsetops(3)
			   AND	operation  on
			   two	 RAD	sets,
			   storing the result
			   in a RAD set.

       radcopyset()	   Copies  the	 con‐	libnuma	   radsetops(3)
			   tents  of  one RAD
			   set to another RAD
			   set.

       radcountset()       Returns the number   libnuma    radsetops(3)
                           of  members  in  a
                           RAD set.

       raddelset()	   Removes a RAD from	libnuma	   radsetops(3)
			   a RAD set.

       raddiffset()	   Finds  the logical	libnuma	   radsetops(3)
			   difference between
			   two	  RAD	sets,
			   storing the result
			   in	another	  RAD
			   set.

       rademptyset()	   Initializes a  RAD	libnuma	   radsetops(3)
			   set	such  that no
			   RADs are included.

       radfillset()	   Initializes a  RAD	libnuma	   radsetops(3)
			   set	such  that it
			   includes all RADs.

       radisemptyset()	   Tests  whether   a	libnuma	   radsetops(3)
			   RAD set is empty.

       radismember()	   Tests   whether  a	libnuma	   radsetops(3)
			   RAD belongs	to  a
			   given RAD set.

       radorset()	   Performs a logical	libnuma	   radsetops(3)
			   OR  operation   on
			   two	  RAD	sets,
			   storing the result
			   in	another	  RAD
			   set.

       radsetcreate()	   Allocates  a	  RAD	libnuma	   radsetops(3)
			   set and sets it to
			   empty.

       radsetdestroy()	   Releases the	 mem‐	libnuma	   radsetops(3)
			   ory	allocated for
			   a RAD set.

       radxorset()	   Performs a logical	libnuma	   radsetops(3)
			   XOR	operation  on
			   two	 RAD	sets,
			   storing the result
			   in	another	  RAD
			   set.

       ───────────────────────────────────────────────────────────────────────

       **  On  a  partitioned system, the system and the partition are equiva‐
       lent.  In this case, the operating system returns information only  for
       the partition in which it is installed.
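
       The logical operations listed for RAD sets behave like ordinary set
       algebra. The following sketch models a RAD set as a bit mask to show
       what each operation computes. It is a stand-in written for
       illustration: the model_ names and the model_radset type are
       hypothetical and are not the libnuma interfaces, whose exact prototypes
       are given in radsetops(3).

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a RAD set: bit i set means RAD i is a member.
 * This is NOT the libnuma RAD set type. */
typedef uint64_t model_radset;

model_radset model_empty(void) { return 0; }

model_radset model_add(model_radset s, int rad)
{
    assert(rad >= 0 && rad < 64);
    return s | ((model_radset)1 << rad);
}

model_radset model_del(model_radset s, int rad)
{
    assert(rad >= 0 && rad < 64);
    return s & ~((model_radset)1 << rad);
}

int model_ismember(model_radset s, int rad)
{
    assert(rad >= 0 && rad < 64);
    return (int)((s >> rad) & 1);
}

/* Logical AND, OR, XOR, and difference of two sets. */
model_radset model_and(model_radset a, model_radset b)  { return a & b; }
model_radset model_or(model_radset a, model_radset b)   { return a | b; }
model_radset model_xor(model_radset a, model_radset b)  { return a ^ b; }
model_radset model_diff(model_radset a, model_radset b) { return a & ~b; }

/* Number of members: clear the lowest set bit until none remain. */
int model_count(model_radset s)
{
    int n = 0;
    while (s != 0) {
        s &= s - 1;
        n++;
    }
    return n;
}
```

       The table describes the library routines as storing their result in a
       RAD set supplied by the caller; the model returns the result by value
       only to keep the example short.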

       Processes and Threads

       ──────────────────────────────────────────────────────────────────────────────────
       Function		      Purpose		      Library	   Reference Page
       ──────────────────────────────────────────────────────────────────────────────────

       nfork()		      Creates	 a    child   libnuma	    nfork(3)
			      process  that  is	 an
			      exact   copy  of	its
			      parent  process.	See
			      also  the table entry
			      for rad_fork().

       nmadvise()	      Tells the system what   libnuma	    nmadvise(3)
			      behavior	 to  expect
			      from a  process  with
			      respect  to referenc‐
			      ing mapped files	and
			      shared	     memory
			      regions.

       nsg_attach_pid()	      Attaches a process to   libnuma	   nsg_attach_pid(3)
			      a	  NUMA	 scheduling
			      group.

       nsg_detach_pid()	      Detaches	 a  process   libnuma	   nsg_attach_pid(3)
			      from a NUMA  schedul‐
			      ing group.

       pthread_nsg_attach()   Attaches a thread	 to   libpthread   pthread_nsg_attach(3)
			      a	  NUMA	 scheduling
			      group.

       pthread_nsg_detach()   Detaches	 a   thread   libpthread   pthread_nsg_detach(3)
			      from a NUMA  schedul‐
			      ing group.

       pthread_rad_attach()   Attaches	a thread to   libpthread   pthread_rad_attach(3)
			      a RAD set.

       pthread_rad_bind()     Attaches a thread	 to   libpthread   pthread_rad_attach(3)
			      a	   RAD	  set	and
			      restricts its  execu‐
			      tion to the home RAD.

       pthread_rad_detach()   Detaches	 a   thread   libpthread   pthread_rad_detach(3)
			      from a RAD set.

       rad_attach_pid()	      Attaches a process to   libnuma	   rad_attach_pid(3)
			      a RAD (assigns a home
			      RAD but allows execu‐
			      tion on other RADs).

       rad_bind_pid()	      Binds  a process to a   libnuma	   rad_attach_pid(3)
			      RAD (assigns  a  home
			      RAD   and	  restricts
			      execution to the home
			      RAD).

       rad_fork()	      Creates	 a    child   libnuma	    rad_fork(3)
			      process on a RAD that
			      optionally  does	not
			      inherit	 the	RAD
			      assignment   of	its
			      parent. See also	the
			      table    entry	for
			      nfork().

       ──────────────────────────────────────────────────────────────────────────────────
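
       The difference between attaching and binding, as summarized in the
       table entries for rad_attach_pid() and rad_bind_pid(), can be stated as
       a simple predicate. The sketch below is a conceptual model only; the
       types and the function are hypothetical and do not call the libnuma
       routines.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the two placement modes described in the table. */
enum model_mode {
    MODEL_ATTACHED,   /* has a home RAD but may execute on other RADs */
    MODEL_BOUND       /* has a home RAD and may execute only there    */
};

struct model_placement {
    int home_rad;
    enum model_mode mode;
};

/* Returns true if a process with placement p may execute on RAD rad. */
bool model_may_run_on(struct model_placement p, int rad)
{
    assert(rad >= 0);
    if (p.mode == MODEL_BOUND)
        return rad == p.home_rad;
    return true;      /* attached: the home RAD is a preference only */
}
```

       In this model, an attached process still has a home RAD; the mode only
       decides whether execution elsewhere is permitted.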

       Memory Management

       ──────────────────────────────────────────────────────────────────────
       Function		   Purpose		   Library   Reference Page
       ──────────────────────────────────────────────────────────────────────
       memalloc_attr()	   Returns  the	  memory   libnuma   memal‐
			   allocation policy for	     loc_attr(3)
			   a RAD  set  specified
			   by	  its	 virtual
			   address.

       nacreate()	   Sets up an arena  for   libc	      amalloc(3)
			   memory allocation for
                           use  with  the  amal‐
                           loc()  function.   An
                           arena is used in mul‐
                           tithreaded   programs
			   when	 there is a need
			   for	 thread-specific
			   heap	 memory	 alloca‐
			   tion.

       nmadvise()	   Tells the system what   libnuma    nmadvise(3)
			   behavior   to  expect
			   from a  process  with
			   respect  to referenc‐
			   ing mapped files  and
			   shared	  memory
			   regions.

       nmmap()		   Maps an open file (or   libnuma    nmmap(3)
			   anonymous	 memory)
			   onto	  the	 address
			   space  for  a process
			   by using a  specified
			   memory     allocation
			   policy.

       nshmget()	   Returns  or	 creates   libnuma    nshmget(3)
			   the	ID  for a shared
			   memory region.

       ──────────────────────────────────────────────────────────────────────

       CPUs and CPU Sets

       ───────────────────────────────────────────────────────────────────────
       Function		   Purpose		    Library   Reference Page
       ───────────────────────────────────────────────────────────────────────
       cpu_foreach()	   Enumerates the members   libc      cpu_foreach(3)
			   of a CPU set.

       cpu_get_current()   Returns the identifier   libc      cpu_get_cur‐
			   of the current CPU  on	      rent(3)
			   which    the	  calling
			   process is running.

       cpu_get_info()	   Returns  CPU	 informa‐   libc      cpu_get_info(3)
			   tion for  the  system.
			   **

       cpu_get_max()	   Returns the number  of   libc      cpu_get_info(3)
			   CPU slots available in
			   the	caller's   parti‐
			   tion. **

       cpu_get_num()	   Returns  the number of   libc      cpu_get_info(3)
			   available CPUs.

       cpu_get_rad()	   Returns the RAD  iden‐   libnuma   cpu_get_rad(3)
			   tifier for a CPU.

       cpuaddset()	   Adds	 a  CPU	 to a CPU   libc       cpusetops(3)
			   set.

       cpuandset()	   Performs a logical AND   libc       cpusetops(3)
			   operation  on the con‐
			   tents of two CPU sets,
			   storing  the result in
			   a third CPU set.

       cpucopyset()	   Copies the contents of   libc       cpusetops(3)
			   one CPU set to another
			   CPU set.

       cpucountset()	   Returns the number  of   libc       cpusetops(3)
			   CPUs in a CPU set.

       cpudelset()	   Deletes  a  CPU from a   libnuma    cpusetops(3)
			   CPU set.

       cpudiffset()	   Finds the logical dif‐   libnuma    cpusetops(3)
			   ference   between  two
			   CPU sets, storing  the
			   result  in a third CPU
			   set.

       cpuemptyset()	   Initializes a CPU  set   libnuma    cpusetops(3)
			   such	 that it includes
			   no CPUs.

       cpufillset()	   Initializes	a CPU set   libnuma    cpusetops(3)
			   such	 that it includes
			   all CPUs.

       cpuisemptyset()	   Tests  whether  a  CPU   libnuma    cpusetops(3)
			   set is empty.

       cpuismember()	   Tests whether a CPU is   libnuma    cpusetops(3)
			   a member of a particu‐
			   lar CPU set.

       cpuorset()	   Performs  a logical OR   libnuma    cpusetops(3)
			   operation on the  con‐
			   tents of two CPU sets,
			   storing the result  in
			   a third CPU set.

       cpusetcreate()	   Allocates  a	 CPU  set   libnuma    cpusetops(3)
			   and sets it to empty.

       cpusetdestroy()	   Releases  the   memory   libnuma    cpusetops(3)
			   allocated   to  a  CPU
			   set.

       cpuxorset()	   Performs a logical XOR   libnuma    cpusetops(3)
			   operation  on the con‐
			   tents of two CPU sets,
			   storing  the result in
			   a third CPU set.

       ───────────────────────────────────────────────────────────────────────

       ** On a partitioned system, the system and the  partition  are  equiva‐
       lent.   In this case, the operating system returns information only for
       the partition in which it is installed.
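
       The table entry for cpu_get_rad() maps a CPU to its RAD, and the entry
       for rad_get_cpus() in the RAD table gives the inverse, the set of CPUs
       in a RAD. The sketch below models that relationship. The topology (16
       CPUs, four per RAD, numbered contiguously) and all model_ names are
       assumptions for illustration and say nothing about how a particular
       platform numbers its CPUs.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_CPUS_PER_RAD 4    /* hypothetical: four CPUs per RAD */
#define MODEL_NCPUS        16   /* hypothetical: sixteen CPUs total */

/* Map a CPU identifier to the RAD that contains it. */
int model_cpu_to_rad(int cpu)
{
    assert(cpu >= 0 && cpu < MODEL_NCPUS);
    return cpu / MODEL_CPUS_PER_RAD;
}

/* Build the set of CPUs in a RAD as a bit mask (bit i set = CPU i). */
uint32_t model_rad_cpus(int rad)
{
    uint32_t set = 0;

    for (int cpu = 0; cpu < MODEL_NCPUS; cpu++) {
        if (model_cpu_to_rad(cpu) == rad)
            set |= (uint32_t)1 << cpu;
    }
    return set;
}
```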

       NUMA Scheduling Groups

       ─────────────────────────────────────────────────────────────────────────────────
       Function		      Purpose		     Library	  Reference Page
       ─────────────────────────────────────────────────────────────────────────────────
       nsg_attach_pid()	      Attaches	a  process   libnuma	  nsg_attach_pid(3)
			      to a NUMA scheduling
			      group.

       nsg_destroy()	      Removes	 a    NUMA   libnuma	   nsg_destroy(3)
			      scheduling group and
			      deallocates      its
			      structures.

       nsg_detach_pid()	      Detaches	a  process   libnuma	  nsg_attach_pid(3)
			      from a NUMA schedul‐
			      ing group.

       pthread_nsg_attach()   Attaches a thread to   libpthread	  pthread_nsg_attach(3)
			      a	 NUMA	scheduling
			      group.

       pthread_nsg_detach()   Detaches	a   thread   libpthread	  pthread_nsg_detach(3)
			      from a NUMA schedul‐
			      ing group.

       nsg_get()	      Returns  the  status   libnuma	   nsg_get(3)
			      of a NUMA scheduling
			      group.

       nsg_get_nsgs()	      Returns  a  list	of   libnuma	   nsg_get_nsgs(3)
			      NUMA	scheduling
			      groups   that    are
			      active.

       nsg_get_pids()	      Returns  a  list	of   libnuma	   nsg_get_pids(3)
			      processes	  attached
			      to a NUMA scheduling
			      group.

       nsg_init()	      Looks up (and possi‐   libnuma	   nsg_init(3)
			      bly  creates) a NUMA
			      scheduling group.

       nsg_set()	      Sets group ID,  user   libnuma	   nsg_set(3)
			      ID,  and permissions
			      for a NUMA  schedul‐
			      ing group.

       pthread_nsg_get()      Returns  a  list	of   libpthread	  pthread_nsg_get(3)
			      threads attached	to
			      a	  NUMA	scheduling
			      group.

       ─────────────────────────────────────────────────────────────────────────────────

   NUMA Enhancements to System Utilities and Daemons
       A number of system commands display RAD-specific information or perform
       RAD-specific operations. The following list briefly describes the NUMA
       options supported by system utilities and daemons:

       ·  The runon -r command executes an application on a specific RAD.

       ·  The vmstat -r command displays virtual memory statistics for a
          specific RAD.

       ·  The netstat -R command displays network routing tables for each RAD.

       ·  The ps -o RAD command includes RAD binding in the information
          displayed about processes running on the system.

       ·  The hwmgr -view hier command displays the RAD location of CPUs and
          devices. In this case, in place of a RAD identifier, the command
          identifies the construct in hardware that corresponds to a RAD.
          When run on a GS80, GS160, or GS320 AlphaServer platform, the
          command shows the hierarchy of CPUs and devices within QBBs. When
          run on an ES80 or GS1280 AlphaServer platform, the command shows
          the hierarchy of CPUs and devices within PIDs (processing unit IDs).

       ·  The sched_stat -R command also displays the RAD location of system
          CPUs. In addition, this command shows the relative distance (number
          of hops) between CPUs.

       ·  The -t and -u options on the nfsd command allow customization of the
          number of TCP and UDP server threads, respectively, that are spawned
          per RAD. This feature allows the NFS server to automatically scale
          the number of TCP and UDP server threads according to the size of
          the system.

       ·  The -r option on the inetd command allows customization of the RAD
          locations on which to start Internet server child daemons. By
          default, one child daemon is started on each RAD.

       ·  The route -R command of the kdbx kernel debugger displays network
          route tables for all RADs.

SEE ALSO
       NUMA Overview

       The NUMA Overview is a web-only document that includes a complete  NUMA
       programming  example.  Starting	with Tru64 UNIX Version 5.1, this web-
       only document can be accessed through the  version-specific  web	 pages
       for Tru64 UNIX documentation. Links to documentation sets for different
       product versions are available at the following URL:

       http://www.Tru64UNIX.compaq.com/docs/pub_page/doc_list.html

								 numa_intro(3)