/*****************************************************************************/
/*
                                 Instance.c

The INSTANCE module contains functions used to set up, maintain and coordinate
action between multiple servers running on a single system, alone or in
combination with multiple servers running across a cluster.  An "instance" in
this context refers to an (almost) completely autonomous server process.  A
large portion of the required functionality concerns itself with
synchronization and communication between these instances (servers) using the
Distributed Lock Manager (DLM) and mutexes.

Multiple processes (instances) can share incoming requests by each assigning a
channel to the same BG: pseudo-device created with the appropriate listening
socket characteristics.  Each will receive a share of the incoming requests in
a round-robin distribution.  See NET.C for more information on this particular
aspect and its implementation.


VMS CLUSTERING COMPARISON
-------------------------
The approach WASD has used in providing multiple instance serving may be
compared to VMS clustering.

A cluster is often described as a loosely-coupled, distributed operating
environment where autonomous processors can join, process and leave (even fail)
independently, participating in a single management domain and communicating
with one another for the purposes of resource sharing and high availability.

Similarly WASD instances run in autonomous, detached processes (across one or
more systems in a cluster) using a common configuration and management
interface, aware of the presence and activity of other instances (via the DLM
and shared memory), sharing processing load and providing rolling restart and
automatic failover as required.


LOAD SHARING
------------
On a multi-CPU system there are performance advantages to having processing
available for scheduling on each CPU.  WASD employs AST (I/O) based processing
and was not originally designed to support VMS kernel threading.
Benchmarking has shown this to be quite fast and efficient even when compared
to a kernel-threaded server (OSU) across 2 CPUs.  The advantage of multiple
CPUs for a single multi-threaded server also diminishes where a site frequently
activates scripts for processing.  These of course (potentially) require a CPU
each for processing.  Where a system has many CPUs (and to a lesser extent with
only two and few script activations) WASD's single-process, AST-driven design
would scale more poorly.  Running multiple WASD instances addresses this.

Of course load sharing is not the only advantage to multiple instances ...


RESTART
-------
When multiple WASD instances are executing on a node and a restart is directed
only one process shuts down at a time.  The rest remain available for requests
until the one restarting is fully ready to again process them itself.


FAIL-THROUGH
------------
When multiple instances are executing on a node and one of these exits for some
reason (bugcheck, resource exhaustion, etc.) the other(s) will continue to
process requests.  Of course requests in-progress by the particular instance at
the time of instance failure are disconnected.  If the former process has
actually exited (in contrast to just the image) a new server process will
automatically be created after a few seconds.


ACTIVE/PASSIVE
--------------
Implemented in NetActive() and NetPassive(), and under the control of CLI and
Server Admin directives, instances can operate in either of two modes.

  ACTIVE mode; classic/historical WASD instance processing, with all
  instances sharing the request processing load.

  PASSIVE mode; where only the supervisor instance is processing requests,
  other instances are instantiated but quiescent.

One of the issues with multiple instances is use of the WATCH facility.  WATCH
necessarily can deal with only one instance at a time (tied as it is via a
network connection and the associated per-process socket).
It becomes a very hit-and-miss activity to try and capture particular events on
multi-instance sites.  The only solution, without (before) passive instances,
was to reduce the site to a single instance (requires a restart) and WATCH only
that.  Making instance processing passive is a (relatively) transparent action
that confines request processing to the one (supervisor) instance only.  This
allows WATCH to be used much more effectively.  When the activity is complete
just move the instances back to active mode.

Although described here in the INSTANCE.C module all the functionality is
implemented in the NET.C module.  To move into passive mode the mechanism is
simply to dequeue ($CANCEL) all the connection acceptance QIOs on all instances
but the supervisor.  Very simply, all the other instances no longer respond to
connection requests.  To move to active mode new accepts are queued, restoring
the instance(s) to processing.  Elegant and functional!  Instance failover is
still maintained by having a previously passive, non-supervisor instance
receiving the supervisor lock AST check and enable active mode on its sockets
as required.


LOCK RESOURCE NAMES
-------------------
With the IPv6 support of WASD v8.5 lock resource names needed to be changed
from the previously all-ASCII to a binary representation.  To continue using
locks to coordinate socket usage the previously hexadecimal representation for
the 32 bit IPv4 address and 16 bit port number needed to be expanded to
accommodate the 128 bit IPv6 address and 16 bit port.  For this to fit into a
31 character resource name the address/port data needed to be represented in
binary and other information (e.g. naming version) needed to be compressed
(also into a binary representation).

  per-cluster specific   WASD|v|g|f                 WASD..
  per-node specific      WASD|v|g|node|f            WASD.KLAATU.
  per-node socket        WASD|v|g|node|ap           WASD.KLAATU.................
  admin socket           WASD|v|g|node::WASD:port   WASD.KLAATU::WASD:80

where

  v   is the 4 bit WASD instance lock version
  g   is the 4 bit "environment" number (0..15)
  f   is the 8 bit lock function (0..31)
  ap  is the 32 bit or 128 bit address plus the 16 bit port

These locks can be located in the System Dump Analyzer using

  SDA> SHOW LOCK /name=WASD

The "per-cluster specific" are used to synchronize and communicate through
locking across all nodes in a cluster.

The "per-node specific" are used to synchronize and communicate through locking
access to various resources shared amongst servers executing on the one node.

The "per-node socket" is used to distribute information about the BG: device
names of sockets already created by instances on the node.  The device names
stored in the lock value blocks then allow subsequent instances to just assign
channels to share the listen-for requests.

The "admin socket" distributes the per-instance (process) administration socket
port across the node/cluster.  This administration socket is required to allow
a site administrator consistent access to a single instance (normally of course
the socket sharing between instances means that requests are distributed
between processes in a round-robin fashion).  As this contains only the port
number (in decimal) it assumes that there is a host name entry for each of the
instance node names.


MUTEX USAGE
-----------
Use of the DLM for short-duration locking of shared memory access is probably
an overly-expensive approach.  So for these activities a mutex in shared memory
is used.  Multiple such mutexes are supported to provide maximum granularity
when distributed across various activities.  See InstanceMutexLock() for
further detail.


VERSION HISTORY
---------------
08-OCT-2009  MGD  if HttpdServerStartup delay additional instance startup
16-AUG-2009  MGD  bugfix; InstanceSupervisor() InstanceProcessName() prcnam
11-JUL-2009  MGD  InstanceSupervisor() and InstanceProcessName()
                  move process naming from "HTTPd:" to "WASD:" with
                  backward-compatibility via WASD_PROCESS_NAME
05-NOV-2006  MGD  it would appear that at least IA64 returns a lock value
                  block length of 64 regardless of whether it is empty or not!
15-JUL-2006  MGD  instance active and passive modes
                  InstanceNodeSupervisorAst() calls NetActive(TRUE)
                  refinements to controlled restart
04-JUL-2006  MGD  use PercentOf() for more accurate percentages
25-MAY-2005  MGD  allow for VMS V8.2 64 byte lksb$b_valblk
17-NOV-2004  MGD  InstanceLockReportData() rework blocked-by/blocking into
                  general indication of non-GR queue (underline)
10-APR-2004  MGD  significant modifications to support IPv6,
                  lock names now contain non-ASCII, binary components,
                  remove never-used InstanceLockReportCli() and /LOCK
27-JUL-2003  MGD  bugfix; use _BBCCI() to clear the mutex in InstanceExit()!!
19-JUN-2003 MGD bugfix; use _BBCCI() to clear the mutex 31-MAR-2003 MGD bugfix; add &puser= to lock 'show process' report 30-MAY-2002 MGD restart when 'quiet' 20-MAY-2002 MGD move more 'locking' functions over to using a 'mutex' 10-APR-2002 MGD some refinement to single-instance locking 31-MAR-2002 MGD use a more light-weight 'mutex' instead of DLM lock around general global section access 28-DEC-2001 MGD refine 'instance' creation/destruction 22-SEP-2001 MGD initial */ /*****************************************************************************/ #ifdef WASD_VMS_V7 #undef _VMS__V6__SOURCE #define _VMS__V6__SOURCE #undef __VMS_VER #define __VMS_VER 70000000 #undef __CRTL_VER #define __CRTL_VER 70000000 #endif /* standard C header files */ #include #include #include /* VMS related header files */ #include #include #include #include #include #include #include #include #include /* application-related header files */ #include "wasd.h" #define WASD_MODULE "INSTANCE" /******************/ /* global storage */ /******************/ BOOL InstanceNodeSupervisor, InstanceWasdName = true; int InstanceEnvNumber = 1, /* default environment number */ InstanceLockNameMagic, InstanceNodeCurrent, InstanceNodeConfig, InstanceNodeJoiningCount, InstanceLockReportNameWidth, InstanceSocketCount, InstanceSupervisorPoll; char *InstanceGroupChars [] = { "","","2","3","4","5","6","7","8","9", "a","b","c","d","e","f" }, *InstanceHttpChars [] = { "d","d","e","f","g","h","i","j","k" }, *InstanceWasdChars [] = { "","1","2","3","4","5","6","7","8" }; #if sizeof(WasdChars)/sizeof(char*) < INSTANCE_MAX #error "InstanceProcessName() WasdChars[] needs adjustment" #endif INSTANCE_LOCK InstanceLockAdmin; INSTANCE_LOCK InstanceLockTable [INSTANCE_LOCK_COUNT+1]; INSTANCE_SOCKET_LOCK InstanceSocketTable [INSTANCE_LOCK_SOCKET_MAX]; BOOL InstanceMutexHeld [INSTANCE_MUTEX_COUNT+1]; unsigned long InstanceMutexCount [INSTANCE_MUTEX_COUNT+1], InstanceMutexWaitCount [INSTANCE_MUTEX_COUNT+1]; char 
*InstanceMutexDescr [INSTANCE_MUTEX_COUNT+1] = INSTANCE_MUTEX_DESCR; /********************/ /* external storage */ /********************/ extern BOOL CliInstanceNoCrePrc, ControlRestartQuiet, ControlRestartRequested, HttpdNetworkMode, HttpdServerStartup, HttpdTicking; extern int CliInstanceMax, EfnWait, EfnNoWait, ExitStatus, HttpdTickSecond, NetConnectProcessing, ServerPort; extern int ToLowerCase[], ToUpperCase[]; extern unsigned long CrePrcMask[], SysLckMask[], WorldMask[]; extern char ErrorSanityCheck[], ErrorXvalNotValid[]; extern ACCOUNTING_STRUCT *AccountingPtr; extern CONFIG_STRUCT Config; extern HTTPD_GBLSEC *HttpdGblSecPtr; extern HTTPD_PROCESS HttpdProcess; extern MSG_STRUCT Msgs; extern SYS_INFO SysInfo; extern WATCH_STRUCT Watch; /*****************************************************************************/ /* Initialize the per-node and per-cluster lock resource names, queuing a NL lock against each resource. This NL lock will then be converted to other modes as required. 
*/ InstanceLockInit () { static int LockCode [] = { INSTANCE_LOCK_CODES }; int cnt, status, NameLength; char *cptr, *sptr, *zptr; INSTANCE_LOCK *ilptr; /*********/ /* begin */ /*********/ if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "InstanceLockInit()"); if ((HTTPD_LOCK_VERSION & 0xf) > 15) ErrorExitVmsStatus (SS$_BUGCHECK, ErrorSanityCheck, FI_LI); if (InstanceEnvNumber > INSTANCE_ENV_NUMBER_MAX) { FaoToStdout ("%HTTPD-E-INSTANCE, environment range 1 to !UL\n", DEMO_INSTANCE_GROUP_NUMBER); exit (SS$_BADPARAM); } /* a byte comprising two 4 bit fields, version and environment number */ InstanceLockNameMagic = ((HTTPD_LOCK_VERSION & 0xf) << 4) | (InstanceEnvNumber & 0xf); InstanceSupervisorPoll = INSTANCE_SUPERVISOR_POLL; sys$setprv (1, &SysLckMask, 0, 0); for (cnt = 1; cnt <= INSTANCE_LOCK_COUNT; cnt++) { ilptr = &InstanceLockTable[cnt]; /* build the (binary) resource name for each non-socket lock */ zptr = (sptr = ilptr->Name) + sizeof(ilptr->Name)-1; for (cptr = HTTPD_NAME; *cptr && sptr < zptr; *sptr++ = *cptr++); if (sptr < zptr) *sptr++ = (char)InstanceLockNameMagic; if (cnt > INSTANCE_CLUSTER_LOCK_COUNT) { cptr = SysInfo.NodeName; while (*cptr && sptr < zptr) *sptr++ = *cptr++; } if (sptr < zptr) *sptr++ = (char)LockCode[cnt]; *sptr = '\0'; /* not at all necessary */ NameLength = sptr - ilptr->Name; ilptr->NameLength = NameLength; ilptr->NameDsc.dsc$w_length = NameLength; ilptr->NameDsc.dsc$a_pointer = &ilptr->Name; if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchDataDump (ilptr->Name, ilptr->NameLength); /* this is the basic place-holding, resource instantiating lock */ status = sys$enqw (EfnWait, LCK$K_NLMODE, &ilptr->Lksb, LCK$M_EXPEDITE | LCK$M_SYSTEM, &ilptr->NameDsc, 0, 0, 0, 0, 0, 2, 0); if (VMSok (status)) status = ilptr->Lksb.lksb$w_status; if (VMSnok (status)) ErrorExitVmsStatus (status, "sys$enqw()", FI_LI); } sys$setprv (0, &SysLckMask, 0, 0); } 
/*****************************************************************************/ /* Queue a conversion to a blocking EX mode lock on the node supervisor resource. Whichever process holds this lock for the image lifetime and has the dubious honour of performing tasks related to the per-node instances (e.g. creating processes to provide the configured instances of the server). */ InstanceServerInit () { int status; /*********/ /* begin */ /*********/ if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "InstanceServerInit()"); sys$setprv (1, &SysLckMask, 0, 0); /* convert to a CR lock on the cluster membership resource */ InstanceLockTable[INSTANCE_CLUSTER].InUse = true; status = sys$enqw (EfnWait, LCK$K_CRMODE, &InstanceLockTable[INSTANCE_CLUSTER].Lksb, LCK$M_CONVERT | LCK$M_SYSTEM, 0, 0, 0, 0, 0, 0, 2, 0); if (VMSok (status)) status = InstanceLockTable[INSTANCE_CLUSTER].Lksb.lksb$w_status; if (VMSnok (status)) ErrorExitVmsStatus (status, "sys$enqw()", FI_LI); /* convert to a CR lock on the node membership resource */ InstanceLockTable[INSTANCE_NODE].InUse = true; status = sys$enqw (EfnWait, LCK$K_CRMODE, &InstanceLockTable[INSTANCE_NODE].Lksb, LCK$M_CONVERT | LCK$M_SYSTEM, 0, 0, 0, 0, 0, 0, 2, 0); if (VMSok (status)) status = InstanceLockTable[INSTANCE_NODE].Lksb.lksb$w_status; if (VMSnok (status)) ErrorExitVmsStatus (status, "sys$enqw()", FI_LI); /* ask to be notified whenever a node-instance joins */ InstanceLockNotifySet (INSTANCE_NODE_JOINING, &InstanceNodeJoiningAst); /* notify others (as well as ourself) that we're joining */ InstanceLockNotifyNow (INSTANCE_NODE_JOINING, NULL); /* queue up for our turn to be the instance node supervisor */ InstanceNodeSupervisor = false; InstanceLockTable[INSTANCE_NODE_SUPERVISOR].InUse = true; /* note: this is NOT a sys$enqw(), it's asynchronous */ status = sys$enq (EfnNoWait, LCK$K_EXMODE, &InstanceLockTable[INSTANCE_NODE_SUPERVISOR].Lksb, LCK$M_CONVERT | LCK$M_SYSTEM, 0, 0, 
&InstanceNodeSupervisorAst, 0, 0, 0, 2, 0); if (VMSnok (status)) ErrorExitVmsStatus (status, "sys$enq()", FI_LI); sys$setprv (0, &SysLckMask, 0, 0); } /*****************************************************************************/ /* Check if there is already a configured single instance executing on this node. If this instance is configured to be a single instance then check how many other instances (potentially) are executing. If more than one exit with an error message - you can't have one instance wandering around thinking it's the only one. If this instance is configured to be one of multiple and there is already one instance thinking it's the only one around then exit for the complementary reason. */ InstanceSingleInit () { int status; /*********/ /* begin */ /*********/ if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "InstanceSingleInit()"); InstanceMutexLock (INSTANCE_MUTEX_HTTPD); sys$setprv (1, &SysLckMask, 0, 0); /* convert to an EX lock on the single instance resource */ InstanceLockTable[INSTANCE_NODE_SINGLE].InUse = true; status = sys$enqw (EfnWait, LCK$K_EXMODE, &InstanceLockTable[INSTANCE_NODE_SINGLE].Lksb, LCK$M_NOQUEUE | LCK$M_CONVERT | LCK$M_SYSTEM, 0, 0, 0, 0, 0, 0, 2, 0); if (status == SS$_NOTQUEUED) { FaoToStdout ( "%HTTPD-E-INSTANCE, single instance already executing - exiting\n"); /* cancel any startup messages provided for the monitor */ HttpdGblSecPtr->StatusMessage[0] = '\0'; if (HttpdProcess.Mode == JPI$K_INTERACTIVE) exit (SS$_ABORT | STS$M_INHIB_MSG); else { InstanceExit (); sys$delprc (0, 0); } } else if (VMSnok (status)) ErrorExitVmsStatus (status, "sys$enqw()", FI_LI); if (InstanceNodeConfig > 1) { /* multiple instances; successfully queued, convert back to NL mode */ status = sys$enqw (EfnWait, LCK$K_NLMODE, &InstanceLockTable[INSTANCE_NODE_SINGLE].Lksb, LCK$M_CONVERT | LCK$M_SYSTEM, 0, 0, 0, 0, 0, 0, 2, 0); if (VMSok (status)) status = 
InstanceLockTable[INSTANCE_NODE_SINGLE].Lksb.lksb$w_status; if (VMSnok (status)) ErrorExitVmsStatus (status, "sys$enqw()", FI_LI); InstanceLockTable[INSTANCE_NODE_SINGLE].InUse = false; } /* else single instance; else leave it at EX mode */ sys$setprv (0, &SysLckMask, 0, 0); InstanceMutexUnLock (INSTANCE_MUTEX_HTTPD); } /*****************************************************************************/ /* Establish how many per-node instances are allowed on this system. Must be called after the server configuration is loaded. If the number of instances specified is negative this sets the number of instances to be the system CPU count minus that number. At least one instance will always be set. */ InstanceFinalInit () { int status, NodeCount, StartupMax; /*********/ /* begin */ /*********/ if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "InstanceFinalInit()"); InstanceMutexLock (INSTANCE_MUTEX_HTTPD); StartupMax = HttpdGblSecPtr->InstanceStartupMax; InstanceMutexUnLock (INSTANCE_MUTEX_HTTPD); if (StartupMax == INSTANCE_PER_CPU) FaoToStdout ("%HTTPD-W-INSTANCE, startup set to CPU\n"); else if (StartupMax) FaoToStdout ("%HTTPD-W-INSTANCE, startup set to !SL\n", StartupMax); if (CliInstanceMax < 0) InstanceNodeConfig = SysInfo.AvailCpuCnt + CliInstanceMax; else if (CliInstanceMax > 0) InstanceNodeConfig = CliInstanceMax; else if (StartupMax == INSTANCE_PER_CPU) InstanceNodeConfig = SysInfo.AvailCpuCnt; else if (StartupMax < 0) InstanceNodeConfig = SysInfo.AvailCpuCnt + StartupMax; else if (StartupMax > 0) InstanceNodeConfig = StartupMax; else if (CliInstanceMax == INSTANCE_PER_CPU) InstanceNodeConfig = SysInfo.AvailCpuCnt; else if (Config.cfServer.InstanceMax == INSTANCE_PER_CPU) InstanceNodeConfig = SysInfo.AvailCpuCnt; else if (Config.cfServer.InstanceMax < 0) InstanceNodeConfig = SysInfo.AvailCpuCnt + Config.cfServer.InstanceMax; else if (Config.cfServer.InstanceMax > 0) InstanceNodeConfig = Config.cfServer.InstanceMax; 
   else
      InstanceNodeConfig = 1;

   /* minimum one, maximum eight (somewhat arbitrary but let's be sensible) */
   if (InstanceNodeConfig < 1)
      InstanceNodeConfig = 1;
   else
   if (InstanceNodeConfig > INSTANCE_MAX)
      InstanceNodeConfig = INSTANCE_MAX;

   /* let's check that it's OK to go ahead with this configuration */
   InstanceSingleInit ();

   NodeCount = InstanceLockList (INSTANCE_NODE, NULL, NULL);

   if (NodeCount > 1 && InstanceLockTable[INSTANCE_NODE_SINGLE].InUse)
   {
      FaoToStdout (
"%HTTPD-W-INSTANCE, multiple instances already executing - exiting\n");
      /* cancel any startup messages provided for the monitor */
      HttpdGblSecPtr->StatusMessage[0] = '\0';
      InstanceExit ();
      sys$delprc (0, 0);
   }

   if (!CliInstanceNoCrePrc && NodeCount > InstanceNodeConfig)
   {
      FaoToStdout ("%HTTPD-W-INSTANCE, sufficient processes - exiting\n");
      /* cancel any startup messages provided for the monitor */
      HttpdGblSecPtr->StatusMessage[0] = '\0';
      InstanceExit ();
      sys$delprc (0, 0);
   }

   if (NodeCount == 1)
   {
      /* first-in sets the pace */
      InstanceMutexLock (INSTANCE_MUTEX_HTTPD);
      HttpdGblSecPtr->InstancePassive = Config.cfServer.InstancePassive;
      InstanceMutexUnLock (INSTANCE_MUTEX_HTTPD);
   }

   if (InstanceNodeConfig == 1)
      FaoToStdout ("%HTTPD-I-INSTANCE, 1 process\n");
   else
      FaoToStdout ("%HTTPD-I-INSTANCE, !UL processes\n", InstanceNodeConfig);
}

/*****************************************************************************/
/*
A node has just notified that it's in the process of joining the group of node
instance(s).  Determine the new number of instances on this node so that this
information may be used when locking, displaying administration reports, etc.
*/

InstanceNodeJoiningAst ()

{
   /*********/
   /* begin */
   /*********/

   if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE))
      WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE,
                 "InstanceNodeJoiningAst()");

   /* note that the instance composition may have changed */
   InstanceNodeJoiningCount++;

   InstanceNodeCurrent = InstanceLockList (INSTANCE_NODE, NULL, NULL);

   /* kick off ticking to initiate any supervisory activities */
   if (!HttpdTicking) HttpdTick (0);
}

/*****************************************************************************/
/*
We've just become the node supervisor!!  Either this is the first instance on
the node or some other server process (or image) has exited and this process
was the next in the conversion queue.
*/

InstanceNodeSupervisorAst (int AstParam)

{
   /*********/
   /* begin */
   /*********/

   if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE))
      WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE,
                 "InstanceNodeSupervisorAst()");

   InstanceNodeSupervisor = true;

   FaoToStdout ("%HTTPD-I-INSTANCE, supervisor\n");

   /* the node supervisor gets to provide its PID to HTTPDMON, etc. */
   HttpdGblSecPtr->HttpdProcessId = HttpdProcess.Pid;

   /* ensure supervisor is accepting connections! */
   NetActive (true);

   /* kick off ticking to initiate any supervisory activities */
   if (!HttpdTicking) HttpdTick (0);
}

/*****************************************************************************/
/*
When the server is processing requests this function is called by HttpdTick()
every second.  Only one process per node is allowed to perform the activities
in this function.  At least one node must perform these activities.  Returns
true to keep the server supervisor ticking, false to say no longer necessary.

If a control restart has been requested then only the supervisor node is
allowed to restart at any one time (of course all get a turn after one has
exited because of the queued supervisor lock being delivered).  This is what
enables the rolling restart.

Using InstanceLockList() get the current number of locks queued against the
node lock and from that determine how many instances (processes) are currently
executing.  If fewer than the required number of instances then create a new
server process.
*/

BOOL InstanceSupervisor ()

{
   static BOOL  NeedsInstance;
   static int  PollHttpdTickSecond,
               RestartHttpdTickSecond,
               RestartQuietCount,
               ShutdownCount = 30;
   static char  PrcNam [16];
   static $DESCRIPTOR (PrcNamDsc, PrcNam);

   static unsigned long  JpiPid;
   static VMS_ITEM_LIST3  JpiItems [] =
   {
      { sizeof(JpiPid), JPI$_PID, &JpiPid, 0 },
      { 0,0,0,0 }
   };

   int  idx, status,
        LockCount,
        InstanceNodeReady,
        StartupMax;
   unsigned short  Length;
   IO_SB  IOsb;

   /*********/
   /* begin */
   /*********/

   if (!InstanceNodeSupervisor) return (false);

   if (WATCH_MODULE(WATCH_MOD_INSTANCE))
      WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "InstanceSupervisor()");

   InstanceNodeCurrent = InstanceLockList (INSTANCE_NODE, NULL, NULL);

   if (ControlRestartRequested)
   {
      if (InstanceNodeCurrent > 1)
      {
         /* there is more than one instance executing */
         InstanceNodeReady = InstanceLockList (INSTANCE_NODE_READY, NULL, NULL);
         if (InstanceNodeReady == InstanceNodeCurrent)
         {
            /* all of those are processing so restart immediately */
            if (ShutdownCount > 0) ShutdownCount = 0;
         }
         else
         {
            /*
               There are fewer ready to process than are executing so wait
               a maximum of thirty seconds for more to start processing.
            */
            ShutdownCount = 30;
         }
      }
      else
      if (InstanceNodeConfig == 1)
      {
         /* if started with only one instance then restart immediately */
         if (ShutdownCount > 0) ShutdownCount = 0;
      }
      else
      if (ShutdownCount > 5)
      {
         /*
            A single instance (this one) is currently executing.
            Wait five seconds to see if more become current and if none
            does restart.
*/ ShutdownCount = 5; } if (ShutdownCount <= 0) { if (!NetConnectProcessing) { /* no outstanding requests */ FaoToStdout ("%HTTPD-I-CONTROL, server restart\n"); exit (SS$_NORMAL); } if (ShutdownCount == 0) { /* stop receiving incoming connections */ NetShutdownServerSocket (); } if (ShutdownCount < -300) { /* five minutes is a *long* wait for a request to finish! */ FaoToStdout ("%HTTPD-W-CONTROL, server restart timeout\n"); exit (SS$_NORMAL); } } ShutdownCount--; /* don't want to do any of the normal supervisor duties if restarting! */ return (true); } if (ControlRestartQuiet) { if (NetConnectProcessing) RestartQuietCount = 0; else if (RestartQuietCount++ > 1) { FaoToStdout ("%HTTPD-I-CONTROL, server restart when quiet\n"); exit (SS$_NORMAL); } return (true); } /* only every so-many seconds do we do a supervisor poll */ if (HttpdTickSecond < PollHttpdTickSecond) { if (!NeedsInstance) return (false); /* return true to keep it ticking only when a new instance is needed */ return (true); } PollHttpdTickSecond = HttpdTickSecond + InstanceSupervisorPoll; if (!CliInstanceNoCrePrc && InstanceNodeCurrent < InstanceNodeConfig) { if (!NeedsInstance || HttpdServerStartup) { /* keep it ticking until the next supervisor poll */ NeedsInstance = true; return (true); } for (idx = InstanceNodeConfig > 1 ? 1 : 0; idx < INSTANCE_MAX; idx++) { status = FaoToBuffer (PrcNam, sizeof(PrcNam), &Length, "!AZ!AZ!AZ:!UL", InstanceGroupChars[InstanceEnvNumber], InstanceWasdName ? "WASD" : "HTTP", InstanceWasdName ? 
InstanceWasdChars[idx] : InstanceHttpChars[idx], ServerPort); if (VMSnok (status) || status == SS$_BUFFEROVF) ErrorExitVmsStatus (status, NULL, FI_LI); if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchDataFormatted ("!&Z\n", PrcNam); PrcNamDsc.dsc$w_length = Length; status = sys$getjpiw (EfnWait, 0, &PrcNamDsc, &JpiItems, &IOsb, 0, 0); if (VMSok (status)) status = IOsb.Status; if (status == SS$_NONEXPR) { /* found a process name that should exist and doesn't */ FaoToStdout ("%HTTPD-I-INSTANCE, !20%D, creating \"!AZ\"\n", 0, PrcNam); if (HttpdNetworkMode) if (VMSnok (status = sys$setprv (1, &CrePrcMask, 0, 0))) ErrorExitVmsStatus (status, "sys$setprv()", FI_LI); HttpdDetachServerProcess (); if (HttpdNetworkMode) if (VMSnok (status = sys$setprv (0, &CrePrcMask, 0, 0))) ErrorExitVmsStatus (status, "sys$setprv()", FI_LI); return (true); } if (VMSnok (status)) ErrorNoticed (NULL, status, NULL, FI_LI); } /* instances fully populated, at least according to process names */ } NeedsInstance = false; return (false); } /*****************************************************************************/ /* Proactively dequeue these locks. I would have thought image exit would have done this "quickly enough", but it appears as if there are still sufficient locks when a move of supervisor role occurs to defeat the logic in the restart and create process code! Perhaps this only occurs with the '$DELPRC(0,0)' and it takes a while for the DLM to catch up? 
*/ InstanceExit () { int idx, status; /*********/ /* begin */ /*********/ if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "InstanceExit()"); /* unlock any instance-locked mutexes */ for (idx = 1; idx <= INSTANCE_MUTEX_COUNT; idx++) { if (!InstanceMutexHeld[idx]) continue; _BBCCI (0, &HttpdGblSecPtr->Mutex[idx]); } status = sys$deq (InstanceLockTable[INSTANCE_NODE].Lksb.lksb$l_lkid, 0, 0, 0); if (VMSnok (status)) ErrorExitVmsStatus (status, "sys$deq()", FI_LI); status = sys$deq (InstanceLockTable[INSTANCE_NODE_READY].Lksb.lksb$l_lkid, 0, 0, 0); if (VMSnok (status)) ErrorExitVmsStatus (status, "sys$deq()", FI_LI); } /*****************************************************************************/ /* Set the server process name. If multiple instances have been configured for step through the process names available breaking at the first successful. This becomes the "instance" name of this particular process on the node. */ InstanceProcessName () { static $DESCRIPTOR (LogNameDsc, "WASD_PROCESS_NAME"); static $DESCRIPTOR (LnmFileDevDsc, "LNM$FILE_DEV"); static char NameBuffer [16]; static VMS_ITEM_LIST3 NameLnmItem [] = { { sizeof(NameBuffer), LNM$_STRING, NameBuffer, 0 }, { 0,0,0,0 } }; int idx, status; unsigned short Length; $DESCRIPTOR (PrcNamDsc, HttpdProcess.PrcNam); /*********/ /* begin */ /*********/ if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "InstanceProcessName() !UL", InstanceNodeConfig); status = sys$trnlnm (0, &LnmFileDevDsc, &LogNameDsc, 0, &NameLnmItem); if (VMSok(status)) if (NameBuffer[0] == '0' || TOUP(NameBuffer[0]) == 'F') InstanceWasdName = false; for (idx = InstanceNodeConfig > 1 ? 1 : 0; idx < INSTANCE_MAX; idx++) { status = FaoToBuffer (HttpdProcess.PrcNam, sizeof(HttpdProcess.PrcNam), &Length, "!AZ!AZ!AZ:!UL", InstanceGroupChars[InstanceEnvNumber], InstanceWasdName ? "WASD" : "HTTP", InstanceWasdName ? 
                            InstanceWasdChars[idx] : InstanceHttpChars[idx],
                            ServerPort);
      if (VMSnok (status) || status == SS$_BUFFEROVF)
         ErrorExitVmsStatus (status, NULL, FI_LI);
      if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE))
         WatchDataFormatted ("!&Z\n", HttpdProcess.PrcNam);

      PrcNamDsc.dsc$w_length = HttpdProcess.PrcNamLength = Length;

      if (VMSok (status = sys$setprn (&PrcNamDsc))) break;
   }
   if (VMSnok (status)) ErrorExitVmsStatus (status, "sys$setprn()", FI_LI);

   FaoToStdout ("%HTTPD-I-INSTANCE, process name !AZ\n", HttpdProcess.PrcNam);
}

/*****************************************************************************/
/*
Ready to process requests, just do a lock conversion to CR to indicate this.
*/

InstanceReady ()

{
   int  status;

   /*********/
   /* begin */
   /*********/

   if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE))
      WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "InstanceReady()");

   sys$setprv (1, &SysLckMask, 0, 0);

   InstanceLockTable[INSTANCE_NODE_READY].InUse = true;
   status = sys$enqw (EfnWait, LCK$K_CRMODE,
                      &InstanceLockTable[INSTANCE_NODE_READY].Lksb,
                      LCK$M_CONVERT | LCK$M_SYSTEM,
                      0, 0, 0, 0, 0, 0, 2, 0);
   if (VMSok (status))
      status = InstanceLockTable[INSTANCE_NODE_READY].Lksb.lksb$w_status;
   if (VMSnok (status)) ErrorExitVmsStatus (status, "sys$enqw()", FI_LI);

   sys$setprv (0, &SysLckMask, 0, 0);
}

/*****************************************************************************/
/*
The "administration socket" is used to connect exclusively to a single instance
(normally connects are distributed between instances).  This function
distributes the IP port (in decimal) across the cluster, for retrieval via
InstanceSocketForAdmin().  It creates a lock resource with a name based on the
process name and stores in its lock value block the number (in ASCII as always)
of its "internal", per-instance (process) administration port.
*/ int InstanceSocketAdmin (short IpPort) { int enqfl, status, NameLength; char *cptr, *sptr, *zptr; IO_SB IOsb; INSTANCE_LOCK *ilptr; /*********/ /* begin */ /*********/ if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "InstanceSocketAdmin() !UL", IpPort); ilptr = &InstanceLockAdmin; /* build the (binary) resource name the admin lock */ zptr = (sptr = ilptr->Name) + sizeof(ilptr->Name)-1; for (cptr = HTTPD_NAME; *cptr && sptr < zptr; *sptr++ = *cptr++); if (sptr < zptr) *sptr++ = (char)InstanceLockNameMagic; for (cptr = SysInfo.NodeName; *cptr && sptr < zptr; *sptr++ = *cptr++); if (sptr < zptr) *sptr++ = ':'; if (sptr < zptr) *sptr++ = ':'; for (cptr = HttpdProcess.PrcNam; *cptr && sptr < zptr; *sptr++ = *cptr++); NameLength = sptr - ilptr->Name; ilptr->NameDsc.dsc$w_length = NameLength; ilptr->NameDsc.dsc$a_pointer = ilptr->Name; FaoToBuffer (&ilptr->Lksb.lksb$b_valblk, SysInfo.LockValueBlockSize, NULL, "!UL", (unsigned short)IpPort); /* queue at EX then convert to NL causing lock value block to be written */ sys$setprv (1, &SysLckMask, 0, 0); if (ilptr->InUse) ErrorExitVmsStatus (SS$_BUGCHECK, ErrorSanityCheck, FI_LI); ilptr->InUse = true; status = sys$enqw (EfnWait, LCK$K_EXMODE, &ilptr->Lksb, LCK$M_NOQUEUE | LCK$M_SYSTEM, &ilptr->NameDsc, 0, 0, 0, 0, 2, 0); if (VMSok (status)) status = ilptr->Lksb.lksb$w_status; /* this just shouldn't happen */ if (VMSnok (status)) ErrorExitVmsStatus (status, "sys$enqw()", FI_LI); enqfl = LCK$M_VALBLK | LCK$M_CONVERT | LCK$M_SYSTEM; if (SysInfo.LockValueBlockSize == LOCK_VALUE_BLOCK_64) enqfl |= LCK$M_XVALBLK; status = sys$enqw (EfnWait, LCK$K_NLMODE, &ilptr->Lksb, enqfl, 0, 0, 0, 0, 0, 0, 2, 0); if (VMSok (status)) status = ilptr->Lksb.lksb$w_status; if (VMSnok (status)) ErrorExitVmsStatus (status, "sys$enqw()", FI_LI); sys$setprv (0, &SysLckMask, 0, 0); if (status == SS$_XVALNOTVALID) { /* hmmm, change in cluster composition? whatever! 
go back to 16 bytes */ SysInfo.LockValueBlockSize = LOCK_VALUE_BLOCK_16; ErrorNoticed (NULL, SS$_XVALNOTVALID, ErrorXvalNotValid, FI_LI); } return (status); } /*****************************************************************************/ /* Given a process name in the format 'node::WASD:port' (e.g. "DELTA::WASD:80") generate the same lock name as InstanceSocketAdmin() and queue a NL lock, then get the lock value block using sys$getlki(). Retrieve the decimal port into the supplied pointed-to storage. Return a VMS status code. */ int InstanceSocketForAdmin ( char *ProcessName, short *IpPortPtr ) { static unsigned long Lki_XVALNOTVALID; static char LockName [31+1]; static struct lksb LockSb; static VMS_ITEM_LIST3 LkiItems [] = { /* careful, values are dynamically assigned in code below! */ { 0, 0, 0, 0 }, /* reserved for LKI$_[X]VALBLK item */ { 0, 0, 0, 0 }, /* reserved for LKI$_XVALNOTVALID item */ {0,0,0,0} }; static $DESCRIPTOR (LockNameDsc, LockName); int retval, status; char *cptr, *sptr, *zptr; IO_SB IOsb; /*********/ /* begin */ /*********/ if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "InstanceSocketForAdmin() !&Z", ProcessName); if (SysInfo.LockValueBlockSize == LOCK_VALUE_BLOCK_64) { LkiItems[0].buf_len = LOCK_VALUE_BLOCK_64; LkiItems[0].buf_addr = &LockSb.lksb$b_valblk; LkiItems[0].item = LKI$_XVALBLK; LkiItems[1].buf_len = sizeof(Lki_XVALNOTVALID); LkiItems[1].buf_addr = &Lki_XVALNOTVALID; LkiItems[1].item = LKI$_XVALNOTVALID; } else { LkiItems[0].buf_len = LOCK_VALUE_BLOCK_16; LkiItems[0].buf_addr = &LockSb.lksb$b_valblk; LkiItems[0].item = LKI$_VALBLK; /* in this case this terminates the item list */ LkiItems[1].buf_len = 0; LkiItems[1].buf_addr = 0; LkiItems[1].item = 0; Lki_XVALNOTVALID = 0; } /* build the (binary) resource name of the admin lock */ zptr = (sptr = LockName) + sizeof(LockName)-1; for (cptr = HTTPD_NAME; *cptr && sptr < zptr; *sptr++ = *cptr++); if (sptr < zptr) *sptr++ =
(char)InstanceLockNameMagic; for (cptr = ProcessName; *cptr && sptr < zptr; *sptr++ = *cptr++); LockNameDsc.dsc$w_length = sptr - LockName; sys$setprv (1, &SysLckMask, 0, 0); status = sys$enqw (EfnWait, LCK$K_NLMODE, &LockSb, LCK$M_SYSTEM, &LockNameDsc, 0, 0, 0, 0, 0, 2, 0); if (VMSok (status)) status = LockSb.lksb$w_status; if (VMSok (status)) { status = sys$getlkiw (EfnWait, &LockSb.lksb$l_lkid, &LkiItems, &IOsb, 0, 0, 0); if (VMSok (status)) status = IOsb.Status; } sys$deq (LockSb.lksb$l_lkid, 0, 0, 0); sys$setprv (0, &SysLckMask, 0, 0); if (VMSnok (status)) return (status); if (Lki_XVALNOTVALID) { /* hmmm, change in cluster composition? whatever! go back to 16 bytes */ SysInfo.LockValueBlockSize = LOCK_VALUE_BLOCK_16; ErrorNoticed (NULL, SS$_XVALNOTVALID, ErrorXvalNotValid, FI_LI); } if (IpPortPtr) *IpPortPtr = atoi(&LockSb.lksb$b_valblk); return (SS$_NORMAL); } /*****************************************************************************/ /* This function controls the creation of bound sockets, and distribution of the BG: device names, amongst per-node instances of the server. This function is called one or two times to do its job, which is to create a per-node lock resource name containing a BINARY representation of a service IP address and port. Binary is necessary to be able to contain the 16 byte address + 2 byte port of IPv6. The resource name becomes the 5 character resource name prefix (see above), the (up to) 6 character node name, then finally the 18 byte socket address, a total of 29 characters (out of a possible 31). The first call checks if this instance already has a channel to the requested socket (address/port combination). If it does (stored in a local table) it returns the BG device name with a leading underscore. If not it checks whether another instance has already bound the socket. The first (and possibly second) call has 'BgDevName' as NULL and creates the resource name, enqueues a CR lock then converts it to NL which causes the lock value block to be returned.
This can be checked for a string with the BG: device name (e.g. "_BG206:") of any previously created listening socket for the address and port. If such a string is found then a pointer to it is returned and it can be used to assign another channel to it. If the lock value block is empty a NULL is returned, the calling routine then creates and binds a socket, then calls this function again. This time 'BgDevName' is non-NULL and points to a string containing the device name (e.g. "_BG206:"). This is copied to the lock value block, an EX mode lock enqueued then converted back to NL to write the lock value, making it available for use by other processes. This function assumes some other overall lock prevents other processes from using this function while it is called two times (i.e. the service creation process is locked). */ char* InstanceSocket ( IPADDRESS *ipaptr, short IpPort, char *BgDevName ) { static char DeviceName [LOCK_VALUE_BLOCK_64]; int cnt, enqfl, status, SocketNameLength; char *cptr, *sptr, *zptr; char SocketName [31+1]; INSTANCE_SOCKET_LOCK *islptr; $DESCRIPTOR (NameDsc, ""); /*********/ /* begin */ /*********/ if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "InstanceSocket()"); if (BgDevName) { /*************************************/ /* socket created, store device name */ /*************************************/ islptr = &InstanceSocketTable[InstanceSocketCount]; /* store the BG device name in the lock value block */ sptr = islptr->Lksb.lksb$b_valblk; zptr = sptr + sizeof(islptr->Lksb.lksb$b_valblk)-1; for (cptr = BgDevName; *cptr && sptr < zptr; *sptr++ = *cptr++); *sptr = '\0'; enqfl = LCK$M_VALBLK | LCK$M_CONVERT | LCK$M_SYSTEM; if (SysInfo.LockValueBlockSize == LOCK_VALUE_BLOCK_64) enqfl |= LCK$M_XVALBLK; sys$setprv (1, &SysLckMask, 0, 0); /* convert NL to EX then back to NL, lock value block is written */ status = sys$enqw (EfnWait, LCK$K_EXMODE, &islptr->Lksb, LCK$M_CONVERT |
LCK$M_SYSTEM, 0, 0, 0, 0, 0, 0, 2, 0); if (VMSok (status)) status = islptr->Lksb.lksb$w_status; if (VMSnok (status)) ErrorExitVmsStatus (status, "sys$enqw()", FI_LI); status = sys$enqw (EfnWait, LCK$K_NLMODE, &islptr->Lksb, enqfl, 0, 0, 0, 0, 0, 0, 2, 0); if (VMSok (status)) status = islptr->Lksb.lksb$w_status; if (VMSnok (status)) ErrorExitVmsStatus (status, "sys$enqw()", FI_LI); sys$setprv (0, &SysLckMask, 0, 0); if (status == SS$_XVALNOTVALID) { /* hmmm, change in cluster composition? whatever! back to 16 bytes */ SysInfo.LockValueBlockSize = LOCK_VALUE_BLOCK_16; ErrorNoticed (NULL, SS$_XVALNOTVALID, ErrorXvalNotValid, FI_LI); } InstanceSocketCount++; return (NULL); } /***************************************/ /* build the socket lock resource name */ /***************************************/ zptr = (sptr = SocketName) + sizeof(SocketName)-1; for (cptr = HTTPD_NAME; *cptr && sptr < zptr; *sptr++ = *cptr++); if (sptr < zptr) *sptr++ = (char)InstanceLockNameMagic; cptr = SysInfo.NodeName; while (*cptr && sptr < zptr) *sptr++ = *cptr++; if (sptr < zptr) { if (IPADDRESS_IS_V4(ipaptr)) *sptr++ = (char)INSTANCE_NODE_SOCKIP4; else *sptr++ = (char)INSTANCE_NODE_SOCKIP6; } cnt = IPADDRESS_SIZE(ipaptr); cptr = IPADDRESS_ADR46(ipaptr); while (cnt-- && sptr < zptr) *sptr++ = *cptr++; cnt = sizeof(short); cptr = (char*)&IpPort; while (cnt-- && sptr < zptr) *sptr++ = *cptr++; if (sptr >= zptr) ErrorExitVmsStatus (0, ErrorSanityCheck, FI_LI); *sptr = '\0'; /* not really necessary */ SocketNameLength = sptr - SocketName; if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchDataDump (SocketName, SocketNameLength); /**************************************************************/ /* check if this instance already has a channel to the socket */ /**************************************************************/ for (cnt = 0; cnt < InstanceSocketCount; cnt++) { islptr = &InstanceSocketTable[cnt]; if (MATCH0 (islptr->Name, SocketName, SocketNameLength)) break; } if (cnt >= 
InstanceSocketCount) islptr = NULL; if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "!&?YES\rNO\r", islptr); if (islptr) { /* yes it has! */ zptr = (sptr = DeviceName) + sizeof(DeviceName)-1; cptr = &islptr->Lksb.lksb$b_valblk; if (*cptr == '_') cptr++; *sptr++ = '_'; while (*cptr && sptr < zptr) *sptr++ = *cptr++; *sptr = '\0'; /* return with a leading underscore */ return (DeviceName); } /**************************************************/ /* check if another instance has bound the socket */ /**************************************************/ islptr = &InstanceSocketTable[InstanceSocketCount]; memcpy (islptr->Name, SocketName, SocketNameLength+1); NameDsc.dsc$w_length = SocketNameLength; NameDsc.dsc$a_pointer = islptr->Name; sys$setprv (1, &SysLckMask, 0, 0); /* this is the basic place-holding, resource instantiating lock */ status = sys$enqw (EfnWait, LCK$K_NLMODE, &islptr->Lksb, LCK$M_EXPEDITE | LCK$M_SYSTEM, &NameDsc, 0, 0, 0, 0, 0, 2, 0); if (VMSok (status)) status = islptr->Lksb.lksb$w_status; if (VMSnok (status)) ErrorExitVmsStatus (status, "sys$enqw()", FI_LI); enqfl = LCK$M_VALBLK | LCK$M_CONVERT | LCK$M_SYSTEM; if (SysInfo.LockValueBlockSize == LOCK_VALUE_BLOCK_64) enqfl |= LCK$M_XVALBLK; /* convert NL to CR then back to NL, the lock value block is returned */ status = sys$enqw (EfnWait, LCK$K_CRMODE, &islptr->Lksb, enqfl, 0, 0, 0, 0, 0, 0, 2, 0); if (VMSok (status)) status = islptr->Lksb.lksb$w_status; if (VMSnok (status)) ErrorExitVmsStatus (status, "sys$enqw()", FI_LI); if (status == SS$_XVALNOTVALID) { /* hmmm, change in cluster composition? whatever! 
go back to 16 bytes */ SysInfo.LockValueBlockSize = LOCK_VALUE_BLOCK_16; ErrorNoticed (NULL, SS$_XVALNOTVALID, ErrorXvalNotValid, FI_LI); } /* back to NL mode */ status = sys$enqw (EfnWait, LCK$K_NLMODE, &islptr->Lksb, LCK$M_CONVERT | LCK$M_SYSTEM, 0, 0, 0, 0, 0, 0, 2, 0); if (VMSok (status)) status = islptr->Lksb.lksb$w_status; if (VMSnok (status)) ErrorExitVmsStatus (status, "sys$enqw()", FI_LI); sys$setprv (0, &SysLckMask, 0, 0); if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "!&Z", islptr->Lksb.lksb$b_valblk); if (islptr->Lksb.lksb$b_valblk[0]) { /* yes it has! lock value block contains a BG: device name string */ InstanceSocketCount++; /* return without a leading underscore */ return (islptr->Lksb.lksb$b_valblk); } /* no BG: device name string, socket will need to be created */ return (NULL); } /*****************************************************************************/ /* Lock the server control functionality (e.g. /DO=) against any concurrent usage. Write the PID of the initiating process into the value block of the CONTROL lock. This can be used for log and audit purposes on other nodes, etc. 
*/ int InstanceLockControl () { static int LockIndex = INSTANCE_CLUSTER_CONTROL; static unsigned long JpiPid; static VMS_ITEM_LIST3 JpiItems [] = { { sizeof(JpiPid), JPI$_PID, &JpiPid, 0 }, { 0,0,0,0 } }; int enqfl, status; IO_SB IOsb; /*********/ /* begin */ /*********/ if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "InstanceLockControl() !UL !&Z", LockIndex, InstanceLockTable[LockIndex].Name); status = sys$getjpiw (EfnWait, 0, 0, &JpiItems, &IOsb, 0, 0); if (VMSok (status)) status = IOsb.Status; if (VMSnok (status)) return (status); /* store the PID in the lock value block */ status = FaoToBuffer (&InstanceLockTable[LockIndex].Lksb.lksb$b_valblk, SysInfo.LockValueBlockSize, NULL, "!8XL", JpiPid); if (VMSnok (status)) return (status); enqfl = LCK$M_VALBLK | LCK$M_CONVERT | LCK$M_SYSTEM; if (SysInfo.LockValueBlockSize == LOCK_VALUE_BLOCK_64) enqfl |= LCK$M_XVALBLK; /* check before enabling SYSLCK so an early return cannot leave it enabled */ if (InstanceLockTable[LockIndex].InUse) return (SS$_BUGCHECK); sys$setprv (1, &SysLckMask, 0, 0); /* convert to EX then to PW causing lock value block to be written */ status = sys$enqw (EfnWait, LCK$K_EXMODE, &InstanceLockTable[LockIndex].Lksb, LCK$M_NOQUEUE | LCK$M_CONVERT | LCK$M_SYSTEM, 0, 0, 0, 0, 0, 2, 0); if (VMSok (status)) status = InstanceLockTable[LockIndex].Lksb.lksb$w_status; if (VMSok (status)) status = sys$enqw (EfnWait, LCK$K_PWMODE, &InstanceLockTable[LockIndex].Lksb, enqfl, 0, 0, 0, 0, 0, 0, 2, 0); if (VMSok (status)) status = InstanceLockTable[LockIndex].Lksb.lksb$w_status; if (VMSok (status)) InstanceLockTable[LockIndex].InUse = true; sys$setprv (0, &SysLckMask, 0, 0); if (status == SS$_XVALNOTVALID) { /* hmmm, change in cluster composition? whatever! go back to 16 bytes */ SysInfo.LockValueBlockSize = LOCK_VALUE_BLOCK_16; ErrorNoticed (NULL, SS$_XVALNOTVALID, ErrorXvalNotValid, FI_LI); } return (status); } /*****************************************************************************/ /* Unlock the server control functionality.
*/ InstanceUnLockControl () { int status; /*********/ /* begin */ /*********/ if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "InstanceUnLockControl()"); if (VMSnok (status = InstanceUnLock (INSTANCE_CLUSTER_CONTROL))) ErrorExitVmsStatus (status, "InstanceUnLock()", FI_LI); } /*****************************************************************************/ /* Take out an EX lock on the specified resource. Wait until it is granted. InstanceLock(), InstanceLockNoWait() and InstanceUnLock() attempt to improve performance by avoiding the use of the DLM where possible. The DLM does not need to be used when it's a node-only lock (not for a cluster-wide resource) and when there is only the one instance executing on a node. When this is the case the serialization is performed by AST delivery level and the '.InUse' flags. */ int InstanceLock (int LockIndex) { int status; /*********/ /* begin */ /*********/ if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "InstanceLock() !&B !UL !31&H", LockIndex <= INSTANCE_CLUSTER_LOCK_COUNT || InstanceNodeConfig > 1, LockIndex, InstanceLockTable[LockIndex].Name); if (InstanceLockTable[LockIndex].InUse) return (SS$_BUGCHECK); if (LockIndex > INSTANCE_CLUSTER_LOCK_COUNT && InstanceNodeConfig <= 1) { /* a node-only lock is being requested and not multiple instances */ InstanceLockTable[LockIndex].InUse = true; return (SS$_NORMAL); } sys$setprv (1, &SysLckMask, 0, 0); status = sys$enqw (EfnWait, LCK$K_EXMODE, &InstanceLockTable[LockIndex].Lksb, LCK$M_CONVERT | LCK$M_SYSTEM, 0, 0, 0, 0, 0, 0, 2, 0); sys$setprv (0, &SysLckMask, 0, 0); if (VMSok (status)) status = InstanceLockTable[LockIndex].Lksb.lksb$w_status; if (VMSok (status)) InstanceLockTable[LockIndex].InUse = true; return (status); } /*****************************************************************************/ /* Take out an EX lock on the specified resource.
If it cannot be immediately granted then do not queue, immediately return with an indicative status. */ int InstanceLockNoWait (int LockIndex) { int status; /*********/ /* begin */ /*********/ if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "InstanceLockNoWait() !&B !UL !31&H", LockIndex <= INSTANCE_CLUSTER_LOCK_COUNT || InstanceNodeConfig > 1, LockIndex, InstanceLockTable[LockIndex].Name); if (InstanceLockTable[LockIndex].InUse) return (SS$_NOTQUEUED); if (LockIndex > INSTANCE_CLUSTER_LOCK_COUNT && InstanceNodeConfig <= 1) { /* a node-only lock is being requested and not multiple instances */ InstanceLockTable[LockIndex].InUse = true; return (SS$_NORMAL); } sys$setprv (1, &SysLckMask, 0, 0); status = sys$enqw (EfnWait, LCK$K_EXMODE, &InstanceLockTable[LockIndex].Lksb, LCK$M_NOQUEUE | LCK$M_CONVERT | LCK$M_SYSTEM, 0, 0, 0, 0, 0, 0, 2, 0); sys$setprv (0, &SysLckMask, 0, 0); if (VMSok (status)) status = InstanceLockTable[LockIndex].Lksb.lksb$w_status; if (VMSok (status)) InstanceLockTable[LockIndex].InUse = true; return (status); } /*****************************************************************************/ /* Return the specified lock to NL mode. 
*/ int InstanceUnLock (int LockIndex) { int status; /*********/ /* begin */ /*********/ if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "InstanceUnLock() !&B !UL !31&H", LockIndex <= INSTANCE_CLUSTER_LOCK_COUNT || InstanceNodeConfig > 1, LockIndex, InstanceLockTable[LockIndex].Name); if (!InstanceLockTable[LockIndex].InUse) return (SS$_BUGCHECK); if (LockIndex > INSTANCE_CLUSTER_LOCK_COUNT && InstanceNodeConfig <= 1) { /* a node-only lock is being requested and not multiple instances */ InstanceLockTable[LockIndex].InUse = false; return (SS$_NORMAL); } sys$setprv (1, &SysLckMask, 0, 0); status = sys$enqw (EfnWait, LCK$K_NLMODE, &InstanceLockTable[LockIndex].Lksb, LCK$M_CONVERT | LCK$M_SYSTEM, 0, 0, 0, 0, 0, 0, 2, 0); if (VMSok (status)) status = InstanceLockTable[LockIndex].Lksb.lksb$w_status; if (VMSok (status)) InstanceLockTable[LockIndex].InUse = false; sys$setprv (0, &SysLckMask, 0, 0); return (status); } /*****************************************************************************/ /* Increment the longword in the shared global section pointed to by the supplied parameter. Lock global section structure if multiple per-node instances possible. This function avoids the overhead of InstanceMutexLock()/UnLock() with its required set privilege calls, etc., for the very common action of accounting structure longword increment. See InstanceMutexLock() for a description of mutex operation.
*/ InstanceGblSecIncrLong (long *longptr) { int TickSecond, WaitCount, WaitHttpdTickSecond; unsigned long BinTime [2]; /*********/ /* begin */ /*********/ if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "InstanceGblSecIncrLong()"); /* if multiple per-node instances not possible */ if (InstanceNodeConfig <= 1) { *longptr = *longptr + 1; return; } if (InstanceMutexHeld[INSTANCE_MUTEX_HTTPD]) ErrorExitVmsStatus (SS$_BUGCHECK, ErrorSanityCheck, FI_LI); WaitCount = 0; InstanceMutexCount[INSTANCE_MUTEX_HTTPD]++; for (;;) { InstanceMutexHeld[INSTANCE_MUTEX_HTTPD] = !_BBSSI (0, &HttpdGblSecPtr->Mutex[INSTANCE_MUTEX_HTTPD]); if (InstanceMutexHeld[INSTANCE_MUTEX_HTTPD]) { *longptr = *longptr + 1; _BBCCI (0, &HttpdGblSecPtr->Mutex[INSTANCE_MUTEX_HTTPD]); InstanceMutexHeld[INSTANCE_MUTEX_HTTPD] = 0; return; } if (!WaitCount++) { InstanceMutexWaitCount[INSTANCE_MUTEX_HTTPD]++; WaitHttpdTickSecond = HttpdTickSecond + INSTANCE_MUTEX_WAIT; } if (SysInfo.AvailCpuCnt == 1) sys$resched (); sys$gettim (&BinTime); TickSecond = decc$fix_time (&BinTime); if (TickSecond > WaitHttpdTickSecond) break; } /* something's drastically amiss, clear the mutex peremptorily */ _BBCCI (0, &HttpdGblSecPtr->Mutex[INSTANCE_MUTEX_HTTPD]); ErrorExitVmsStatus (SS$_BUGCHECK, ErrorSanityCheck, FI_LI); } /*****************************************************************************/ /* Same as InstanceGblSecIncrLong() except it decrements the longword if non-zero. See InstanceMutexLock() for a description of mutex operation.
*/ InstanceGblSecDecrLong (long *longptr) { int TickSecond, WaitCount, WaitHttpdTickSecond; unsigned long BinTime [2]; /*********/ /* begin */ /*********/ if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "InstanceGblSecDecrLong()"); /* if multiple per-node instances not possible */ if (InstanceNodeConfig <= 1) { if (*longptr) *longptr = *longptr - 1; return; } if (InstanceMutexHeld[INSTANCE_MUTEX_HTTPD]) ErrorExitVmsStatus (SS$_BUGCHECK, ErrorSanityCheck, FI_LI); WaitCount = 0; InstanceMutexCount[INSTANCE_MUTEX_HTTPD]++; for (;;) { InstanceMutexHeld[INSTANCE_MUTEX_HTTPD] = !_BBSSI (0, &HttpdGblSecPtr->Mutex[INSTANCE_MUTEX_HTTPD]); if (InstanceMutexHeld[INSTANCE_MUTEX_HTTPD]) { if (*longptr) *longptr = *longptr - 1; _BBCCI (0, &HttpdGblSecPtr->Mutex[INSTANCE_MUTEX_HTTPD]); InstanceMutexHeld[INSTANCE_MUTEX_HTTPD] = 0; return; } if (!WaitCount++) { InstanceMutexWaitCount[INSTANCE_MUTEX_HTTPD]++; WaitHttpdTickSecond = HttpdTickSecond + INSTANCE_MUTEX_WAIT; } if (SysInfo.AvailCpuCnt == 1) sys$resched (); sys$gettim (&BinTime); TickSecond = decc$fix_time (&BinTime); if (TickSecond > WaitHttpdTickSecond) break; } /* something's drastically amiss, clear the mutex peremptorily */ _BBCCI (0, &HttpdGblSecPtr->Mutex[INSTANCE_MUTEX_HTTPD]); ErrorExitVmsStatus (SS$_BUGCHECK, ErrorSanityCheck, FI_LI); } /*****************************************************************************/ /* Take out a mutex (lock) on the global section. There is a small chance that an instance will crash or be stopped while the mutex is held. InstanceExit() should reset the mutex if currently held. A sanity check causes the instance to exit if the mutex is held for more than the defined period. Of course as with all indeterminate shared access there are small critical code sections and chances of race conditions here.
Worst-case is a mutex being taken out and not released because of process STOPing, though there should not be infinite loops or waits, the sanity check should cause an exit. There is also a small chance that the instance may have released the mutex but still have the flag set that it holds it. This might result in the mutex being "released" (zeroed) while some other instance legitimately holds it. All-in-all such uncoordinated access to the global section might result in minor data corruption (accounting accumulators), but nothing disastrous. On a multi-CPU system this algorithm might cause the waiting instance to "spin" a little (i.e. uselessly consume CPU cycles). It assumes the blocking instance will be scheduled and processing on some other CPU :^) If this "hangs" at AST delivery level then 'HttpdTickSecond' will have stopped ticking. Generate our own ticks here. I'm assuming that this mutex approach is more light-weight than using the DLM. */ InstanceMutexLock (int MutexNumber) { int TickSecond, WaitCount, WaitHttpdTickSecond; unsigned long BinTime [2]; /*********/ /* begin */ /*********/ if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "InstanceMutexLock() !UL", MutexNumber); if (InstanceNodeConfig <= 1) return; if (MutexNumber <= 0 || MutexNumber > INSTANCE_MUTEX_COUNT || InstanceMutexHeld[MutexNumber]) { char String [256]; sprintf (String, "%s (mutex %d)", ErrorSanityCheck, MutexNumber); ErrorExitVmsStatus (SS$_BUGCHECK, String, FI_LI); } WaitCount = 0; InstanceMutexCount[MutexNumber]++; for (;;) { InstanceMutexHeld[MutexNumber] = !_BBSSI (0, &HttpdGblSecPtr->Mutex[MutexNumber]); if (InstanceMutexHeld[MutexNumber]) return; if (!WaitCount++) { InstanceMutexWaitCount[MutexNumber]++; WaitHttpdTickSecond = HttpdTickSecond + INSTANCE_MUTEX_WAIT; } if (SysInfo.AvailCpuCnt == 1) sys$resched (); sys$gettim (&BinTime); TickSecond = decc$fix_time (&BinTime); if (TickSecond > WaitHttpdTickSecond) break; } /* something's
drastically amiss, clear the mutex peremptorily */ _BBCCI (0, &HttpdGblSecPtr->Mutex[MutexNumber]); ErrorExitVmsStatus (SS$_BUGCHECK, ErrorSanityCheck, FI_LI); } /*****************************************************************************/ /* Reset the mutex taken out on the global section. See InstanceMutexLock() for a description of mutex operation. */ InstanceMutexUnLock (int MutexNumber) { int status; char String [256]; /*********/ /* begin */ /*********/ if (WATCH_MOD && WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "InstanceMutexUnLock() !UL", MutexNumber); if (InstanceNodeConfig <= 1) return; if (MutexNumber >= 1 && MutexNumber <= INSTANCE_MUTEX_COUNT && InstanceMutexHeld[MutexNumber]) { _BBCCI (0, &HttpdGblSecPtr->Mutex[MutexNumber]); InstanceMutexHeld[MutexNumber] = 0; return; } /* something's drastically amiss, clear the mutex peremptorily */ if (InstanceMutexHeld[MutexNumber]) _BBCCI (0, &HttpdGblSecPtr->Mutex[MutexNumber]); sprintf (String, "%s (mutex %d)", ErrorSanityCheck, MutexNumber); ErrorExitVmsStatus (SS$_BUGCHECK, String, FI_LI); } /*****************************************************************************/ /* This function establishes a DLM based mechanism for registering interest in receiving notifications of "events" across all node and/or cluster instances (depending on the resource name involved) of servers. When called it "registers interest" in the associated resource name and when InstanceLockNotifyNow() is used the callback AST is activated and the lock status value block used to transfer data to that AST. Enqueues a CR (concurrent read) lock on the specified resource.
This allows a "blocking" AST to be delivered (back to this function, the two states are differentiated by setting the most significant bit of 'LockIndex' for the AST call), indicating another instance somewhere (using InstanceLockNotifyNow()) is wishing to initiate a distributed action, by enqueuing an EX (exclusive) lock for the same resource. Release the CR lock then immediately enqueue another CR so that the lock value block subsequently written to by the initiating EX mode lock is read via the specified AST function. (Note the 'AstFunction' parameter is only accessed during non-AST processing and so is not a *real* issue - except for purists ;^) The '..' lock status block and AST function are used here because it is being set up to last the life of the server. */ int InstanceLockNotifySet ( int LockIndex, CALL_BACK AstFunction ) { int enqfl, status; char *cptr; /*********/ /* begin */ /*********/ if (WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "InstanceLockNotifySet() !&F !&X !&Z !&A", &InstanceLockNotifySet, LockIndex, InstanceLockTable[LockIndex&0x7fffffff].Name, LockIndex&0x80000000 ? 
0 : AstFunction); sys$setprv (1, &SysLckMask, 0, 0); if (LockIndex & 0x80000000) { enqfl = LCK$M_VALBLK | LCK$M_QUECVT | LCK$M_CONVERT | LCK$M_SYSTEM; if (SysInfo.LockValueBlockSize == LOCK_VALUE_BLOCK_64) enqfl |= LCK$M_XVALBLK; /* mask out the bit that indicates it's an AST */ LockIndex &= 0x7fffffff; if (!InstanceLockTable[LockIndex].InUse) ErrorExitVmsStatus (SS$_BUGCHECK, ErrorSanityCheck, FI_LI); /* convert (wait) current CR mode lock to NL unblocking the queue */ status = sys$enqw (EfnWait, LCK$K_NLMODE, &InstanceLockTable[LockIndex].Lksb, LCK$M_CONVERT | LCK$M_SYSTEM, 0, 0, 0, LockIndex|0x80000000, 0, 0, 2, 0); if (VMSok (status)) status = InstanceLockTable[LockIndex].Lksb.lksb$w_status; if (VMSnok (status)) ErrorExitVmsStatus (status, "sys$enqw()", FI_LI); /* convert (nowait) back to CR to block queue against next EX mode */ status = sys$enq (EfnNoWait, LCK$K_CRMODE, &InstanceLockTable[LockIndex].Lksb, enqfl, 0, 0, &InstanceLockNotifySetAst, LockIndex|0x80000000, &InstanceLockNotifySet, 0, 2, 0); if (VMSnok (status)) ErrorExitVmsStatus (status, "sys$enq()", FI_LI); if (status == SS$_XVALNOTVALID) { /* hmmm, change in cluster composition? whatever! 
back to 16 bytes */ SysInfo.LockValueBlockSize = LOCK_VALUE_BLOCK_16; ErrorNoticed (NULL, SS$_XVALNOTVALID, ErrorXvalNotValid, FI_LI); } } else if (AstFunction) { /* initial call */ if (InstanceLockTable[LockIndex].InUse) ErrorExitVmsStatus (SS$_BUGCHECK, ErrorSanityCheck, FI_LI); InstanceLockTable[LockIndex].InUse = true; InstanceLockTable[LockIndex].AstFunction = AstFunction; /* convert (wait) to CR to block the queue against EX mode */ status = sys$enqw (EfnWait, LCK$K_CRMODE, &InstanceLockTable[LockIndex].Lksb, LCK$M_CONVERT | LCK$M_SYSTEM, 0, 0, 0, LockIndex|0x80000000, &InstanceLockNotifySet, 0, 2, 0); if (VMSok (status)) status = InstanceLockTable[LockIndex].Lksb.lksb$w_status; if (VMSnok (status)) ErrorExitVmsStatus (status, "sys$enqw()", FI_LI); } else ErrorExitVmsStatus (SS$_BUGCHECK, ErrorSanityCheck, FI_LI); sys$setprv (0, &SysLckMask, 0, 0); return (status); } /*****************************************************************************/ /* This function abstracts away the actual lock status block containing the data being delivered by calling the AST with a pointer to the one in use. 
*/ InstanceLockNotifySetAst (int LockIndex) { /*********/ /* begin */ /*********/ if (WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "InstanceLockNotifySetAst() !&F !&X !&Z", &InstanceLockNotifySetAst, LockIndex, InstanceLockTable[LockIndex&0x7fffffff].Name); /* mask out the bit that indicates it's an AST */ LockIndex &= 0x7fffffff; /* invoke the AST function, address of the lock status block parameter */ (*InstanceLockTable[LockIndex].AstFunction) (&InstanceLockTable[LockIndex].Lksb); } /*****************************************************************************/ /* After InstanceLockNotifySet() has "registered interest" in a particular resource name this function may be used to notify and deliver either, 15 (16) bytes for pre-V8.2 VMS, or 63 (64) bytes for non-VAX V8.2 and later VMS, of data in the lock status block to the callback AST function specified when InstanceLockNotifySet() was originally called. The 15/63 bytes of data can be anything including a null-terminated string, only the first 15/63 bytes are used of any parameter supplied. As it's generally assumed to be a string the 16/64th byte is always set to a null character (for when the string has been truncated). This function is explicitly called to initiate the notify, queuing an EXMODE lock containing a lock value block, and is also called by itself as an AST to dequeue the EXMODE lock causing the lock value block to be written to all participating in the resource. The two states are differentiated by setting the most significant bit of 'LockIndex' for the AST call. There is a third behaviour performed. If 'LockIndex' is zero the value of the lock ID is returned as a boolean. If zero then the enqueuing has concluded. If non-zero (i.e. a lock ID) then the enqueuing is not complete. Used by polling from ControlCommand(). 
This function uses its own internal, static lock status block, and is used infrequently enough that the full enqueue/dequeue does not pose any performance issue. The "queue" then "convert" is required due to the conversion queue (and the locks being converted in InstanceLockNotifySet()) having priority over the waiting queue (VMS Programming Concepts diagram). */ int InstanceLockNotifyNow ( int LockIndex, char *ValuePtr ) { static char ValueBlock [LOCK_VALUE_BLOCK_64]; static struct lksb NotifyLksb; int deqfl, status; char *cptr; /*********/ /* begin */ /*********/ if (WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "InstanceLockNotifyNow() !&F !&X !&Z !&Z", &InstanceLockNotifyNow, LockIndex, InstanceLockTable[LockIndex&0x7fffffff].Name, LockIndex&0x80000000 ? ValueBlock : ValuePtr); /* just polling the progress of the lock enqueue */ if (!LockIndex) return (NotifyLksb.lksb$l_lkid); sys$setprv (1, &SysLckMask, 0, 0); if (LockIndex & 0x80000000) { if (SysInfo.LockValueBlockSize == LOCK_VALUE_BLOCK_64) deqfl = LCK$M_XVALBLK; else deqfl = 0; /* dequeue the EX mode lock writing the value block */ status = sys$deq (NotifyLksb.lksb$l_lkid, ValueBlock, 0, deqfl); if (VMSnok (status)) ErrorExitVmsStatus (status, "sys$deq()", FI_LI); memset (&NotifyLksb, 0, sizeof(struct lksb)); memset (ValueBlock, 0, SysInfo.LockValueBlockSize); } else { if (NotifyLksb.lksb$l_lkid) ErrorExitVmsStatus (SS$_BUGCHECK, ErrorSanityCheck, FI_LI); /* if supplied store the value in the value block */ if (ValuePtr) { strncpy (ValueBlock, ValuePtr, SysInfo.LockValueBlockSize); ValueBlock[SysInfo.LockValueBlockSize-1] = '\0'; } /* queue (wait) a CR mode lock */ status = sys$enqw (EfnWait, LCK$K_CRMODE, &NotifyLksb, LCK$M_SYSTEM, &InstanceLockTable[LockIndex].NameDsc, 0, 0, LockIndex|0x80000000, 0, 0, 2, 0); /* the completion status is in the lock status block used for this enqueue */ if (VMSok (status)) status = NotifyLksb.lksb$w_status; if (VMSnok (status)) ErrorExitVmsStatus (status, "sys$enqw()", FI_LI); /* convert
(nowait) to EX mode */ status = sys$enq (EfnNoWait, LCK$K_EXMODE, &NotifyLksb, LCK$M_CONVERT | LCK$M_SYSTEM, 0, 0, &InstanceLockNotifyNow, LockIndex|0x80000000, 0, 0, 2, 0); if (VMSnok (status)) ErrorExitVmsStatus (status, "sys$enq()", FI_LI); } sys$setprv (0, &SysLckMask, 0, 0); return (status); } /*****************************************************************************/ /* NL locks indicate any other utility, etc., (e.g. HTTPDMON) that may have an interest in the server locks. Non-NL locks indicate active server interest in the resource. This function gets all locks associated with the specified lock resource and then goes through them noting each non-NL lock. From these it can set the count of the number of servers (number of CR locks) and/or create a list of the processes with these non-NL locks. Via 'ListPtrPtr' it returns a pointer to a dynamically allocated string containing the list of processes. THIS MUST BE FREED. */ int InstanceLockList ( int LockIndex, char *Separator, char **ListPtrPtr ) { static unsigned short JpiNodeNameLen, JpiPrcNamLen; static char JpiNodeName [7], JpiPrcNam [16]; static struct { unsigned short tot_len, /* bits 0..15 */ lck_len; /* bits 16..30 */ } LkiLocksLength; static VMS_ITEM_LIST3 JpiItems [] = { { sizeof(JpiNodeName)-1, JPI$_NODENAME, &JpiNodeName, &JpiNodeNameLen }, { sizeof(JpiPrcNam)-1, JPI$_PRCNAM, &JpiPrcNam, &JpiPrcNamLen }, { 0,0,0,0 } }; static VMS_ITEM_LIST3 LkiItems [] = { /* careful, values are dynamically assigned in code below! 
*/ { 0, LKI$_LOCKS, 0, &LkiLocksLength }, {0,0,0,0} }; int cnt, status, ListBytes, LockCount, NonNlLockCount; char *cptr, *sptr; IO_SB IOsb; LKIDEF *lkiptr; LKIDEF LkiLocks [INSTANCE_REPORT_LOCK_MAX]; /*********/ /* begin */ /*********/ if (WATCH_MODULE(WATCH_MOD_INSTANCE)) { WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "InstanceLockList()"); WatchDataDump (InstanceLockTable[LockIndex].Name, InstanceLockTable[LockIndex].NameLength); } NonNlLockCount = 0; if (ListPtrPtr) *ListPtrPtr = NULL; LkiItems[0].buf_addr = LkiLocks; LkiItems[0].buf_len = sizeof(LkiLocks); sys$setprv (1, &SysLckMask, 0, 0); status = sys$getlkiw (EfnWait, &InstanceLockTable[LockIndex].Lksb.lksb$l_lkid, &LkiItems, &IOsb, 0, 0, 0); sys$setprv (0, &SysLckMask, 0, 0); if (VMSok (status)) status = IOsb.Status; if (VMSnok (status)) { ErrorNoticed (NULL, status, NULL, FI_LI); return (-1); } if (LkiLocksLength.tot_len) { if (LkiLocksLength.tot_len & 0x8000) { ErrorNoticed (NULL, SS$_BADPARAM, NULL, FI_LI); return (NULL); } LockCount = LkiLocksLength.tot_len / LkiLocksLength.lck_len; } else LockCount = 0; cnt = LockCount; for (lkiptr = &LkiLocks; cnt--; lkiptr++) { /* only interested in non-NL locks */ if (lkiptr->lki$b_grmode == LCK$K_NLMODE) continue; NonNlLockCount++; } /* if not interested in generating a list */ if (!(Separator && ListPtrPtr && NonNlLockCount)) { if (WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "!UL", NonNlLockCount); return (NonNlLockCount); } ListBytes = (sizeof(JpiNodeName) + sizeof(JpiPrcNam) + strlen(Separator) + 2) * LockCount; cptr = sptr = VmGet (ListBytes); sptr[0] = '\0'; /* use WORLD to allow access to other processes */ sys$setprv (1, &WorldMask, 0, 0); cnt = LockCount; for (lkiptr = &LkiLocks; cnt--; lkiptr++) { /* only interested in CR locks */ if (lkiptr->lki$b_grmode == LCK$K_NLMODE) continue; status = sys$getjpiw (EfnWait, &lkiptr->lki$l_pid, 0, &JpiItems, &IOsb, 0, 0); if (VMSok (status)) status = IOsb.Status; if (VMSnok (status)) { 
ErrorNoticed (NULL, status, NULL, FI_LI); continue; } JpiNodeName[JpiNodeNameLen] = '\0'; JpiPrcNam[JpiPrcNamLen] = '\0'; if (cptr[0]) strcpy (sptr, Separator); while (*sptr) sptr++; strcpy (sptr, JpiNodeName); while (*sptr) sptr++; strcpy (sptr, "::"); while (*sptr) sptr++; strcpy (sptr, JpiPrcNam); while (*sptr) sptr++; } if (WATCH_MODULE(WATCH_MOD_INSTANCE)) WatchDataFormatted ("!UL !&Z\n", NonNlLockCount, cptr); sys$setprv (0, &WorldMask, 0, 0); *ListPtrPtr = cptr; return (NonNlLockCount); } /*****************************************************************************/ /* Using the lock IDs in the general and socket lock tables produce a report that lists all of the related locks, showing process PIDs, cluster nodes, etc. This is mainly intended as a debugging, development and trouble-shooting tool. */ InstanceLockReport ( REQUEST_STRUCT *rqptr, REQUEST_AST NextTaskFunction ) { static char BeginPage [] = "

\n\ \n\
\n\ \n\ \n\
\n\
!#*   MSTLKID  MSTCSID  RQ GR QU LKID     \
CSID     PRCNAM          PID      VALBLK(!UL)\n";

   static char  MutexFao [] = "\n!18AZ  !11&L / !&L (!UL%)";

   static char  EndPage [] =
"
\n\
\n\
\n\
\n";

   int  idx, status,
        ResNameLength;
   unsigned long  *vecptr;
   unsigned long  FaoVector [32];

   /*********/
   /* begin */
   /*********/

   if (WATCHING(rqptr) && WATCH_MODULE(WATCH_MOD_INSTANCE))
      WatchThis (rqptr, FI_LI, WATCH_MOD_INSTANCE, "InstanceLockReport()");

   InstanceLockReportNameWidth = 0;
   /* the socket locks index from zero!! */
   for (idx = 0; idx < InstanceSocketCount; idx++)
   {
      ResNameLength =
         strlen(InstanceParseLockName(InstanceSocketTable[idx].Name));
      if (ResNameLength > InstanceLockReportNameWidth)
         InstanceLockReportNameWidth = ResNameLength;
   }

   AdminPageTitle (rqptr, "Lock Report", BeginPage,
                   InstanceLockReportNameWidth, SysInfo.LockValueBlockSize);

   /* use WORLD to allow access to other process' PID process names */
   sys$setprv (1, &WorldMask, 0, 0);

   sys$setprv (1, &SysLckMask, 0, 0);

   /* the general locks index from one!! */
   for (idx = 1; idx <= INSTANCE_LOCK_COUNT; idx++)
      InstanceLockReportData (rqptr, &InstanceLockTable[idx].Lksb.lksb$l_lkid);

   /* the socket locks index from zero!! */
   for (idx = 0; idx < InstanceSocketCount; idx++)
      InstanceLockReportData (rqptr,
                              &InstanceSocketTable[idx].Lksb.lksb$l_lkid);

   if (InstanceLockAdmin.Lksb.lksb$l_lkid)
      InstanceLockReportData (rqptr, &InstanceLockAdmin.Lksb.lksb$l_lkid);

   sys$setprv (0, &SysLckMask, 0, 0);

   sys$setprv (0, &WorldMask, 0, 0);

   InstanceMutexLock (INSTANCE_MUTEX_HTTPD);

   for (idx = 1; idx <= INSTANCE_MUTEX_COUNT; idx++)
   {
      vecptr = FaoVector;
      *vecptr++ = InstanceMutexDescr[idx];
      *vecptr++ = HttpdGblSecPtr->MutexCount[idx];
      *vecptr++ = HttpdGblSecPtr->MutexWaitCount[idx];
      *vecptr++ = PercentOf (HttpdGblSecPtr->MutexWaitCount[idx],
                             HttpdGblSecPtr->MutexCount[idx]);
      status = FaolToNet (rqptr, MutexFao, &FaoVector);
      if (VMSnok (status)) ErrorNoticed (rqptr, status, NULL, FI_LI);
   }

   InstanceMutexUnLock (INSTANCE_MUTEX_HTTPD);

   status = FaolToNet (rqptr, EndPage, &FaoVector);
   if (VMSnok (status)) ErrorNoticed (rqptr, status, NULL, FI_LI);

   rqptr->rqResponse.PreExpired = PRE_EXPIRE_ADMIN;
   ResponseHeader200 (rqptr, "text/html", &rqptr->NetWriteBufferDsc);

   SysDclAst (NextTaskFunction, rqptr);
}

/*****************************************************************************/
/*
Report on a single lock.
*/

InstanceLockReportData
(
REQUEST_STRUCT *rqptr,
unsigned long *LockIdPtr
)
{
   static char  LockDataFao [] =
"!#AZ !8XL !8AZ !AZ !AZ !&@ !8XL !8AZ \
!15AZ !8XL !AZ\n";

   static char  *LockMode [] = { "NL","CR","CW","PR","PW","EX" },
                /* lock state seems to range from -8 (RSPRESEND) to +1 (GR) */
                *LockState [] = { "??","??","-8","-7","-6","-5","-4",
                                  "-3","-2","WT","CV","GR","??","??" };

   static unsigned long  JpiPrcNamLen,
                         LkiResNamLen,
                         LkiValBlkLen,
                         Lki_XVALNOTVALID;
   static char  NodeName [16],
                JpiPrcNam [16],
                JpiUserName [13],
                LkiResNam [31+1],
                LkiValBlk [LOCK_VALUE_BLOCK_64+1];

   static struct
   {
      unsigned short  tot_len,  /* bits 0..15 */
                      lck_len;  /* bits 16..30 */
   } *lksptr,
     LkiLocksLen;

   static VMS_ITEM_LIST3  LkiItems [] =
   {
      /* careful, values are dynamically assigned in code below! */
      { 0, 0, 0, 0 },  /* reserved for LKI$_LOCKS item */
      { 0, 0, 0, 0 },  /* reserved for LKI$_RESNAM item */
      { 0, 0, 0, 0 },  /* reserved for LKI$_[X]VALBLK item */
      { 0, 0, 0, 0 },  /* reserved for LKI$_XVALNOTVALID item */
      {0,0,0,0}
   };

   static VMS_ITEM_LIST3  JpiItems [] =
   {
      { sizeof(JpiPrcNam)-1, JPI$_PRCNAM, &JpiPrcNam, &JpiPrcNamLen },
      { sizeof(JpiUserName), JPI$_USERNAME, &JpiUserName, 0 },
      { 0,0,0,0 }
   };

   static VMS_ITEM_LIST3  SyiItems [] =
   {
      { sizeof(NodeName)-1, SYI$_NODENAME, &NodeName, 0 },
      {0,0,0,0}
   };

   int  cnt, idx, status,
        LockCount,
        LockTotal;
   unsigned long  *vecptr;
   unsigned long  FaoVector [32];
   char  *cptr;
   char  CsidNodeName [16],
         MstCsidNodeName [16],
         PidPrcNam [16],
         String [256];
   IO_SB  IOsb;
   LKIDEF  *lkiptr;
   LKIDEF  LkiLocks [INSTANCE_REPORT_LOCK_MAX];

   /*********/
   /* begin */
   /*********/

   if (WATCHING(rqptr) && WATCH_MODULE(WATCH_MOD_INSTANCE))
      WatchThis (rqptr, FI_LI, WATCH_MOD_INSTANCE,
                 "InstanceLockReportData() !8XL", *LockIdPtr);

   memset (LkiValBlk, 0, sizeof(LkiValBlk));

   LkiItems[0].buf_len = sizeof(LkiLocks);
   LkiItems[0].buf_addr = &LkiLocks;
   LkiItems[0].item = LKI$_LOCKS;
   LkiItems[0].ret_len = &LkiLocksLen;

   LkiItems[1].buf_len = sizeof(LkiResNam);
   LkiItems[1].buf_addr = &LkiResNam;
   LkiItems[1].item = LKI$_RESNAM;
   LkiItems[1].ret_len = &LkiResNamLen;

   if (SysInfo.LockValueBlockSize == LOCK_VALUE_BLOCK_64)
   {
      LkiItems[2].buf_len = LOCK_VALUE_BLOCK_64;
      LkiItems[2].buf_addr = &LkiValBlk;
      LkiItems[2].item = LKI$_XVALBLK;
      LkiItems[2].ret_len = &LkiValBlkLen;

      LkiItems[3].buf_len = sizeof(Lki_XVALNOTVALID);
      LkiItems[3].buf_addr = &Lki_XVALNOTVALID;
      LkiItems[3].item = LKI$_XVALNOTVALID;
   }
   else
   {
      LkiItems[2].buf_len = LOCK_VALUE_BLOCK_16;
      LkiItems[2].buf_addr = &LkiValBlk;
      LkiItems[2].item = LKI$_VALBLK;
      LkiItems[2].ret_len = &LkiValBlkLen;

      /* in this case this terminates the item list */
      LkiItems[3].buf_len = 0;
      LkiItems[3].buf_addr = 0;
      LkiItems[3].item = 0;
      Lki_XVALNOTVALID = 0;
   }

   status = sys$getlkiw (EfnWait, LockIdPtr, &LkiItems, &IOsb, 0, 0, 0);
   if (VMSok (status)) status = IOsb.Status;
   if (VMSnok (status))
   {
      ErrorNoticed (rqptr, status, NULL, FI_LI);
      return (status);
   }

   if (Lki_XVALNOTVALID)
   {
      /* hmmm, change in cluster composition? whatever! go back to 16 bytes */
      SysInfo.LockValueBlockSize = LOCK_VALUE_BLOCK_16;
      ErrorNoticed (NULL, SS$_XVALNOTVALID, ErrorXvalNotValid, FI_LI);
      return (SS$_XVALNOTVALID);
   }

   LkiResNam[LkiResNamLen] = '\0';
   LkiValBlk[LkiValBlkLen] = '\0';

   lkiptr = LkiItems[0].buf_addr;
   lksptr = LkiItems[0].ret_len;

   if (lksptr->tot_len & 0x8000)
   {
      ErrorNoticed (rqptr, SS$_BADPARAM, NULL, FI_LI);
      return (SS$_BADPARAM);
   }

   if (lksptr->lck_len)
      LockCount = lksptr->tot_len / lksptr->lck_len;
   else
      LockCount = 0;

   if (!LockCount)
   {
      ErrorNoticed (NULL, status, ErrorSanityCheck, FI_LI);
      return (status);
   }

   for (cnt = 0; cnt < LockCount; cnt++, lkiptr++)
   {
      memset (NodeName, 0, sizeof(NodeName));
      status = sys$getsyiw (EfnWait, &lkiptr->lki$l_mstcsid, 0,
                            &SyiItems, &IOsb, 0, 0);
      if (VMSok (status)) status = IOsb.Status;
      if (VMSnok (status))
      {
         ErrorNoticed (rqptr, status, NULL, FI_LI);
         continue;
      }
      strcpy (MstCsidNodeName, NodeName);

      memset (NodeName, 0, sizeof(NodeName));
      status = sys$getsyiw (EfnWait, &lkiptr->lki$l_csid, 0,
                            &SyiItems, &IOsb, 0, 0);
      if (VMSok (status)) status = IOsb.Status;
      if (VMSnok (status))
      {
         ErrorNoticed (rqptr, status, NULL, FI_LI);
         continue;
      }
      strcpy (CsidNodeName, NodeName);

      memset (JpiPrcNam, 0, sizeof(JpiPrcNam));
      status = sys$getjpiw (EfnWait, &lkiptr->lki$l_pid, 0,
                            &JpiItems, &IOsb, 0, 0);
      if (VMSok (status)) status = IOsb.Status;
      if (VMSnok (status))
      {
         ErrorNoticed (rqptr, status, NULL, FI_LI);
         continue;
      }
      JpiPrcNam[15] = JpiUserName[12] = '\0';
      for (cptr = JpiUserName; *cptr && *cptr != ' '; cptr++);
      *cptr = '\0';

      vecptr = FaoVector;
      *vecptr++ = InstanceLockReportNameWidth;
      if (cnt)
         *vecptr++ = "";
      else
         *vecptr++ = InstanceParseLockName(LkiResNam);
      *vecptr++ = lkiptr->lki$l_mstlkid;
      *vecptr++ = MstCsidNodeName;
      *vecptr++ = LockMode[lkiptr->lki$b_rqmode];
      *vecptr++ = LockMode[lkiptr->lki$b_grmode];
      if (lkiptr->lki$b_queue == 1)
         *vecptr++ = "!AZ";
      else
         *vecptr++ = "!AZ";
      *vecptr++ = LockState[lkiptr->lki$b_queue+10];
      *vecptr++ = lkiptr->lki$l_lkid;
      *vecptr++ = CsidNodeName;
      *vecptr++ = JpiPrcNam;
      *vecptr++ = ADMIN_REPORT_SHOW_PROCESS;
      *vecptr++ = lkiptr->lki$l_pid;
      *vecptr++ = JpiUserName;
      *vecptr++ = lkiptr->lki$l_pid;
      *vecptr++ = LkiValBlk;

      status = FaolToNet (rqptr, LockDataFao, &FaoVector);
      if (VMSnok (status) || status == SS$_BUFFEROVF)
         ErrorNoticed (rqptr, status, NULL, FI_LI);
   }

   return (status);
}

/*****************************************************************************/
/*
Parse the binary lock resource name into readable format suitable for display.
Return a pointer to a static buffer containing that description.
*/

char* InstanceParseLockName (char *LockName)

{
   static char  String [128];
   static char  *LockUses [] = { INSTANCE_LOCK_USES };
   static char  ErrorOverflow [] = "***OVERFLOW***";

   int  cnt;
   unsigned int  Ip4Address;
   unsigned short  IpPort;
   unsigned char  Ip6Address [16];
   char  *cptr, *sptr, *zptr;

   /*********/
   /* begin */
   /*********/

   if (WATCH_MODULE(WATCH_MOD_INSTANCE))
      WatchThis (NULL, FI_LI, WATCH_MOD_INSTANCE, "InstanceParseLockName()");

   zptr = (sptr = String) + sizeof(String)-16;

   cptr = LockName;
   cnt = sizeof(HTTPD_NAME)-1;
   while (cnt-- && sptr < zptr) *sptr++ = *cptr++;

   /* version and environment number */
   sptr += sprintf (sptr, "|%d|%d", (*cptr & 0xf0) >> 4, *cptr & 0x0f);
   if (sptr >= zptr) return (ErrorOverflow);
   cptr++;

   if (*cptr > INSTANCE_LOCK_PRINTABLE)
   {
      /* node name */
      if (sptr < zptr) *sptr++ = '|';
      while (*cptr > INSTANCE_LOCK_PRINTABLE && sptr < zptr)
         *sptr++ = *cptr++;
   }

   if (*cptr <= INSTANCE_LOCK_COUNT)
   {
      sptr += sprintf (sptr, "|%s", LockUses[*cptr]);
      if (sptr >= zptr) return (ErrorOverflow);
      *sptr = '\0';
      return (String);
   }

   if (*cptr == INSTANCE_NODE_SOCKIP4)
   {
      cptr++;
      Ip4Address = *(UINTPTR)cptr;
      cptr += sizeof(unsigned int);
      IpPort = *(USHORTPTR)cptr;
      sptr += sprintf (sptr, "|%s,%d",
                       TcpIpAddressToString(Ip4Address,4), IpPort);
   }
   else
   if (*cptr == INSTANCE_NODE_SOCKIP6)
   {
      cptr++;
      memcpy (&Ip6Address, cptr, sizeof(Ip6Address));
      cptr += sizeof(Ip6Address);
      IpPort = *(USHORTPTR)cptr;
      sptr += sprintf (sptr, "|%s,%d",
                       TcpIpAddressToString(&Ip6Address,6), IpPort);
   }
   else
      *sptr++ = '?';

   if (sptr >= zptr) return (ErrorOverflow);
   *sptr = '\0';
   return (String);
}

/*****************************************************************************/
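/*
EXAMPLE SKETCH (documentation only, not part of the server build)

InstanceParseLockName() above decodes a packed version/environment byte and,
for socket locks, a binary address followed by a port.  The fragment below is a
minimal, portable sketch of those two steps.  It is kept inside this comment so
it is never compiled with the module.  Everything in it is an assumption made
for illustration: the names SketchFormatHeader() and SketchFormatSocket() are
hypothetical, the 4-byte address plus 2-byte port layout only mirrors the IPv4
branch as read from the code above, and the dotted-quad formatting stands in
for TcpIpAddressToString(), which is not shown here.  It uses snprintf() so the
bounds check no longer relies on a reserved margin, and memcpy() so the reads
do not depend on the alignment of the name buffer.

```c
#include <stdio.h>
#include <string.h>

// Format the packed byte as "|version|environment".
// Returns the number of characters written, or -1 if 'buf' is too small.
static int SketchFormatHeader (unsigned char packed, char *buf, size_t size)
{
   // high nibble is the version, low nibble is the environment number
   int  len = snprintf (buf, size, "|%d|%d",
                        (packed & 0xf0) >> 4, packed & 0x0f);
   return (len < 0 || (size_t)len >= size) ? -1 : len;
}

// Format an assumed 4-byte address followed by a 2-byte port (both read
// exactly as stored) as "|a.b.c.d,port".
// Returns the number of characters written, or -1 if 'buf' is too small.
static int SketchFormatSocket (const unsigned char *cptr,
                               char *buf, size_t size)
{
   unsigned char  addr [4];
   unsigned short  port;
   int  len;

   // copy the fields out rather than casting a possibly unaligned pointer
   memcpy (addr, cptr, sizeof(addr));
   memcpy (&port, cptr + sizeof(addr), sizeof(port));

   len = snprintf (buf, size, "|%u.%u.%u.%u,%u",
                   addr[0], addr[1], addr[2], addr[3], (unsigned)port);
   return (len < 0 || (size_t)len >= size) ? -1 : len;
}
```

A caller would append each fragment at its current output position and treat a
-1 return in the same way the function above treats reaching 'zptr'.
*/

/*****************************************************************************/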