(--------------------------------------------------------------------) (Configuration Considerations\cr_config)

WASD has a global configuration, which applies characteristics to the entire running server, as well as per-service (virtual server) and conditional configuration, which applies characteristics or behaviours to specific requests. All configuration is provided via files located by logical names.

((Configuration Files\BOLD)) (3\20\10) (Name\Scope\Description) (WASD_CONFIG_AUTH\loadable\request authorization control) (WASD_CONFIG_GLOBAL\global\global server configuration) (WASD_CONFIG_MAP\loadable\request processing control) (WASD_CONFIG_MSG\global\provides server messages) (WASD_CONFIG_SERVICE\global\specifies services (virtual servers))

Simple editing of these files changes the configuration. Comment lines may be included by prefixing them with the hash ((#)) character. Comment lines prefixed with an exclamation mark and then a hash ((!#)) are displayed in Server Admin reports and are WATCHable during rule processing. Configuration file directives are not case-sensitive. Any changes to a global configuration file can only be enabled by restarting the HTTPd process using the following command on the server system. $ HTTPD /DO=RESTART

Changes to request mapping or authorization configuration files also can be dynamically reloaded into the running server using the administration command-line interface. $ HTTPD /DO=MAP=LOAD $ HTTPD /DO=AUTH=LOAD

Changes to configuration files can be validated at the command-line before reload or restart. This detects and reports any syntactical and fatal configuration errors but of course cannot check the (intent) of the rules. $ HTTPD /DO=AUTH=CHECK $ HTTPD /DO=CONFIG=CHECK $ HTTPD /DO=GLOBAL=CHECK $ HTTPD /DO=MAP=CHECK $ HTTPD /DO=MSG=CHECK $ HTTPD /DO=SERVICE=CHECK

The (config) check sequentially processes each of the (authorization), (global), (mapping), (message) and (service) configuration files.

If additional server startup qualifiers are required to enable specific configuration features then these must also be provided when checking. For example: $ HTTPD /DO=AUTH=CHECK /SYSUAF /PROFILE

A server's currently loaded configuration can be interrogated from the Server Administration menu (see (HREF=[-.features]features.sdml!../features/#cr_server_admin)(doc_features.sdml)). (Include File Directive\hd_config_directives_include)

WASD uses multiple configuration files for a server and its site, each one providing for a different functional aspect: configuration, virtual services, path mapping, authorization, etc. Generally these configuration files are (flat), with all required directives included in a single file. This provides a simple and straightforward approach suitable for most sites and allows for the provision of Server Administration page online configuration of several aspects.

It is also possible to build site configurations by including the contents of referenced files. This may provide a structure and flexibility not possible using the flat-file approach. All WASD configuration files allow the use of an [IncludeFile] directive. This takes a VMS file specification parameter. The file's contents are then loaded and processed as if part of the parent configuration file. These included files are allowed to be nested to a depth of two (i.e. the configuration file can include a file which may then include another file).
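The include mechanism can be pictured with a small Python sketch. This is not WASD's implementation: the in-memory `files` dictionary, the `expand` function and the choice to raise an error when the nesting limit is exceeded are all assumptions made for illustration.

```python
MAX_INCLUDE_DEPTH = 2  # per the text: a file may include a file that includes one more

def expand(name, files, depth=0):
    """Return the lines of `name` with [IncludeFile] directives replaced by file contents."""
    out = []
    for line in files[name].splitlines():
        stripped = line.strip()
        # directives are not case-sensitive
        if stripped.lower().startswith("[includefile]"):
            if depth >= MAX_INCLUDE_DEPTH:
                # assumption: how the real server reacts to deeper nesting is not stated
                raise ValueError(f"include nesting too deep in {name}")
            target = stripped[len("[includefile]"):].strip()
            out.extend(expand(target, files, depth + 1))
        else:
            out.append(line)
    return out

# hypothetical file names and contents
files = {
    "MAP.CONF": "[[*]]\n[IncludeFile] COMMON.CONF",
    "COMMON.CONF": "pass /a/* /web/a/*\n[IncludeFile] EXTRA.CONF",
    "EXTRA.CONF": "pass /* /web/*",
}
expanded = expand("MAP.CONF", files)
```

The included lines appear in place of each directive, so the parent file sees one flat sequence of rules.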

The following is an example used to build up the mapping rules for four virtual services supported on the one server. # WASD_CONFIG_MAP [[alpha.site.com]] [IncludeFile] WASD_ROOT:[LOCAL]MAP_ALPHA_80.CONF [[alpha.site.com:443]] [IncludeFile] WASD_ROOT:[LOCAL]MAP_ALPHA_443.CONF [[beta.site.com]] [IncludeFile] WASD_ROOT:[LOCAL]MAP_BETA_80.CONF [[beta.site.com:443]] [IncludeFile] WASD_ROOT:[LOCAL]MAP_BETA_443.CONF [[*]] [IncludeFile] WASD_ROOT:[LOCAL]MAP_COMMON.CONF Such configurations cannot be managed using the Server Administration facility (see (HREF=[-.features]features.sdml!../features/#cr_server_admin)(doc_features.sdml)). Files containing [IncludeFile] directives are noted during server startup and if a Server Administration page configuration interface is accessed where this would be a problem an explanatory message and warning are provided. A configuration (can still be saved) but the resulting configuration will be a flat-file representation of the server configuration, not the original hierarchical one. (....................................................................) (Site Organisation\hd_site_organisation)

(It is recommended that the server distribution tree and any document and other web-specific data areas be kept separate and distinct. \BOLD)

The former in WASD_ROOT:[000000], the latter perhaps in something like WEB:[000000]. This logical device could be provided with the following DCL introduced into the site or server startup procedures: $ DEFINE /SYSTEM /TRANSLATION=CONCEALED WEB DKA0:[WEB.]

See (hd_map_vms) for further information on the use of logical names in locating and defining the content and structure of a site.

Note that logical device names like this need not appear in the structure of the Web site. The root of the Web-accessible path can be concealed using a final mapping rule similar to the following pass /* /web/* which simply defaults (anything else) to that physical area. Of course if that (anything else) needs to exist then it must be located in that physical area.

Mapping rules are the tools used to build a logical structure to a site from the physical area, perhaps multiple areas, used to house the associated files. The logical organisation of served data is largely hierarchical, organised under the Web-server path root, and is achieved via two mechanisms. (NUMBERED) The natural tree structure provided by a hierarchical file system. The logical hierarchy possible using rules within the mapping file to place disparate physical areas into a single logical structure.

Physically distinct areas are used for good physical reasons (e.g. the area can best be hosted on a task-local disk), for historical reasons (e.g. the area existed before any Web environment existed) or for reasons of convenience (e.g. let's put this where access controls already allow the maintainers to manage it).

(There are no good reasons for having site-specific documents integrated into the package directory structure!\BOLD)

All site-served files should be located in an autonomous, dedicated area or areas. The only reason to place script files into WASD_ROOT:[CGI-BIN] or WASD_ROOT:[(architecture)_BIN] is that the script is traditionally accessible via a /cgi-bin/ path or that the site is a small and/or low usage environment where this directory is conveniently available for the few extra scripts being made available.

For any significant site (size that as best suits your perception), or for when a specific software system or systems is being built or exists and it is being (Web-ified), design that software system as you would any other. That is, place the documentation in one directory area, executables and support procedures in their own, management files in another, data in yet another area, etc. Then make those portions that are required to be accessible via the Web interface accessible via the logical associations afforded through the use of the server's mapping rules ((cr_mapping_rule)). Of course existing areas that are to be now made available via the Web can be mapped in the same way. This includes the active components - executable scripts. There is no reason (apart from historical) why the /cgi-bin/ path should be used to activate scripts associated with a dedicated software system. Use a specific and unique path for scripts associated with each such system.

When making a directory structure available via the Web care must be taken that only the portions required to be accessed can be. Other areas should or must not be accessible. The server process can only access files that are world-accessible, that it is specifically granted access to via VMS protection mechanisms (e.g. ACLs), or that the individual SYSUAF-authorized accessor can access and which have specifically been made available via server authorization rules. Use the recommendations in (hd_securing_package) as guidelines when designing your own site's protections and permissions. (Document Root\hd_data_area_root)

A particular area of the file system may be specified as the (root) of a particular (virtual) site's documents. This is done using the WASD_CONFIG_MAP SET (map=root=()) mapping rule. After this rule is applied all subsequent rules have the specified string prefixed to mapped strings before file-system resolution.

For example, the following WASD_CONFIG_MAP rule set [[the.virtual.site:*]] pass /*/-/* /wasd_root/runtime/*/* /wasd_root/* /wasd_root/* set * map=root=/dka0/the_site exec /cgi-bin/* /cgi-bin/* pass /* /* fail * when applied to the following request URLs results in the described mappings being applied. http://the.virtual.site/doc/example.txt access to the document represented by file DKA0:[THE_SITE.DOC]EXAMPLE.TXT

With the request for a directory icon using http://the.virtual.site/-/httpd/file.gif access to the image represented by file WASD_ROOT:[RUNTIME.HTTPD]FILE.GIF

And a request for a script using http://the.virtual.site/cgi-bin/example.php activation of the script represented by the file DKA0:[THE_SITE.CGI-BIN]EXAMPLE.PHP
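The three mappings above follow one pattern: the document root is prefixed to the mapped path and the result is read as device, directories and file name. A minimal Python sketch of that reading follows. The function names are hypothetical and the conversion is a simplification for illustration, not the server's actual path-to-file-specification logic.

```python
def url_path_to_vms(path):
    """Convert a URL-style path such as /dka0/the_site/doc/example.txt
    to a VMS file specification (simplified illustration)."""
    parts = [p for p in path.split("/") if p]
    if len(parts) < 2:
        raise ValueError("need at least a device and a file name")
    device, *dirs, name = parts
    # a file directly under the device lives in the top-level [000000] directory
    directory = ".".join(dirs) if dirs else "000000"
    return f"{device}:[{directory}]{name}".upper()

def resolve(mapped_path, root=""):
    """Prefix the document root (if one has been set) before file-system resolution."""
    return url_path_to_vms(root + mapped_path)

doc = resolve("/doc/example.txt", root="/dka0/the_site")
icon = resolve("/wasd_root/runtime/httpd/file.gif")  # mapped before the root was set
script = resolve("/cgi-bin/example.php", root="/dka0/the_site")
```

The icon path carries no root prefix because its rule precedes the SET map=root in the rule set, which is why rule order matters.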

Care must be taken in getting the sequence of mapping rules correct for access to non-site resources before actually setting the document root which then ties every other resource to that root. (....................................................................) (Virtual Services\hd_virtual_services)

A single WASD server process is capable of concurrently supporting the same host name on different port numbers and a number of different host names (DNS aliased or multi-homed) using the same port number. This capability is generally known as a (virtual server). There is no design limitation on the number of these services that WASD will concurrently support. Virtual services offer versatile and powerful multi-site capabilities using the one system and server. Service determination is based on the contents of the request's (Host:) header field. If none is present it defaults to the base service for the interface's IP address and port. (WASD_CONFIG_SERVICE\hd_virtual_httpd_service)

If the logical name WASD_CONFIG_SERVICE is defined the deprecated WASD_CONFIG_GLOBAL [Service] directive is not used (see below).

See (cr_service_directives) for further detail. (WASD_CONFIG_GLOBAL [Service] (Deprecated)\hd_virtual_service)

Using the [Service] WASD_CONFIG_GLOBAL configuration parameter or the /SERVICE qualifier the server creates an HTTP service for each one specified. If the host name is omitted it defaults to the local host name. If the port is omitted it defaults to 80. The first port specified in the service list becomes the (administration) port of the server, using the local host name, appearing in administration reports, menus, etc. This port is also that specified when sending control commands via the /DO= qualifier.

This rather contrived example shows a server configured to provide four services over two host names. [Service] alpha.example.com alpha.example.com:8080 beta.example.com beta.example.com:8000

Note that both the WASD_CONFIG_SERVICE configuration file (see (cr_service_directives)) and the /SERVICE= command-line qualifier override this directive. ([[virtual-server]]\hd_virtual_rules)

The essential profile of a site is established by its mapped resources and any authorization controls, the WASD_CONFIG_MAP and WASD_CONFIG_AUTH configuration files respectively, and these two files support directives that allow configuration rules to be applied to all virtual services (i.e. a default), to a host name (all ports), or to a single specified service (host name and specific port).

To restrict rules to a specified server (virtual or real) add a line containing the server host name, and optionally a port number, between double-square brackets. All following rules will be applied only to that service. If a port number is not present it applies to all ports for that service name, otherwise only to the service using that port. To resume applying rules to all services use a single asterisk instead of a host name. In this way default (all service) and server-specific rules may be interleaved to build a composite environment, server-specific yet with defaults. Note that service-specific and service-common rules may be mixed in any order allowing common rules to be shared. This descriptive example shows a file with one rule per line. # just an example (this rule applies to all services so does this and this one) [[alpha.example.com]] (this one however applies only to ALPHA, but to all ports as indeed does this) [[beta.example.com:8000]] (now we switch to the BETA service, but only port 8000 another one only applying to BETA and a third) [[*]] (now we have a couple default rules that again apply to all servers)
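The scoping described above can be sketched in Python as a filter over the lines of a rule file. The function name `rules_for` and the placeholder rule names are hypothetical. The sketch only models the double-square-bracket selection and does not reproduce the server's parser.

```python
def rules_for(config_lines, host, port):
    """Return the rules that would apply to the service host:port."""
    applies = True  # rules before any [[...]] line apply to all services
    selected = []
    for raw in config_lines:
        line = raw.strip()
        if not line or line.startswith(("#", "!#")):
            continue  # blank or comment line
        if line.startswith("[[") and line.endswith("]]"):
            spec = line[2:-2].strip().lower()
            if spec == "*":
                applies = True  # resume applying rules to all services
            else:
                name, _, spec_port = spec.partition(":")
                # no port (or a wildcard port) means all ports for that host name
                applies = name == host.lower() and spec_port in ("", "*", str(port))
            continue
        if applies:
            selected.append(line)
    return selected

# placeholder rules standing in for real mapping or authorization rules
config = """
# just an example
rule-for-all
[[alpha.example.com]]
rule-alpha-any-port
[[beta.example.com:8000]]
rule-beta-8000
[[*]]
rule-default
""".splitlines()

alpha_rules = rules_for(config, "alpha.example.com", 8080)
beta_rules = rules_for(config, "beta.example.com", 8000)
```

A request to beta.example.com on any port other than 8000 would receive only the two all-service rules.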

Both the mapping and authorization modules report if rules are provided for services that are not configured for the particular server process (i.e. not in the server's [Service] or /SERVICE parameter list). This provides feedback to the site administrator about any configuration problems that exist, but may also appear if a set of rules is shared between multiple processes on a system or cluster where processes deliver differing services. In this latter case the reports can be considered informational, but should be checked initially and then occasionally for misconfiguration. There is a difference when specifying virtual services during service creation and when using them to apply mapping, etc. When creating a service the scheme (or protocol, e.g. (http:), (https:)) needs to be specified so the server can apply the correct protocol to connections accepted at that service. Once a service is created however, it becomes defined by the host-name and port supplied when created. Only one scheme (protocol) can be supported on any one host-name/port instance and so it becomes unnecessary to provide it with mapping rules, etc. The server will complain in instances where it is redundant. (Unknown Virtual Server\hd_virtual_unknown)

If a service is not configured for the particular host address and port of a request one of two actions will be taken. (NUMBERED) If the configuration directive [ServiceNotFoundURL] is set the request will be redirected to the specified URL. This should contain a specific host name, as well as a message page. For the default page use: [ServiceNotFoundURL] //server.host.name/httpd/-/servicenotfound.html If the above directive is not set the request is mapped using the default rules (e.g. [[*]]). It is possible to specify a rule set containing a default rule for each virtual server. The unmatched request is then handled by a fallback rule, as illustrated in the following. pass /*/-/admin/* pass /*/-/* /wasd_root/runtime/*/* exec /cgi-bin/* /cgi-bin/* [[virtual1.host.name]] /* /web/virtual1/* / /web/virtual1/ [[virtual2.host.name]] /* /web/virtual2/* / /web/virtual2/ [[virtual3.host.name]] /* /web/virtual3/* / /web/virtual3/ [[*]] /* /web/servicenotfound.html

This applies to dotted-decimal addresses as well as alpha-numeric. Therefore if there is a requirement to connect via a numeric IP address such a service must have been configured.

Note also that the converse is possible. That is, it's possible to configure a service that the server cannot ever possibly respond to because it does not have an interface using the IP address represented by the service host. (....................................................................) (GZIP Encoding\hd_gzip_encoding)

WASD can apply GZIP compression (gzip, deflate) to any suitable response body and can accept similarly compressed request bodies. It dynamically maps required functions from a ZLIB shareable image. Originally developed against the ZLIB v1.2.(n) port by Jean-François Piéronne, the VMS-PORTS (GNV) LIBZ package is also supported.

WASD dynamically maps the associated shareable image by successively accessing the (optionally defined) WASD_LIBZ_SHR32 logical name, then GNV$LIBZSHR32, then LIBZ_SHR32, before reporting GZIP unavailable.

The shareable image must be INSTALLed (without any particular privileges) before it can be activated by the privileged WASD HTTPd image (the WASD startup will automatically do this if necessary). The server process log and the Server Administration page, Statistics Report panel named Environment contain the version activated or a VMS status message if an error was encountered. (Response Encoding\hd_gzip_response_encoding)

The WASD_CONFIG_GLOBAL directive [GzipResponse] controls whether this feature is enabled for the gzip content-encoding of suitable response bodies. This directive requires at least one parameter, the compression level in the range 1..9. Smaller values provide faster but poorer compression while larger values provide better compression at the cost of more CPU cycles and latency. This corresponds to the GZIP utility's -1..-9 CLI switches. Two optional parameters allow ZLIB's 'memLevel' and 'windowBits' to be adjusted by ZLIB aficionados (level[,memory,window]). A small amount of experimentation by this author indicates minor changes in memory usage and compression ratio by fiddling with these.
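The three ZLIB parameters can be explored outside the server with Python's standard `zlib` module, which exposes the same level, memLevel and windowBits controls. The sketch below only illustrates what the parameters mean. The helper name `gzip_body`, its default values and the sample data are assumptions, and nothing here shows how WASD itself passes the values to ZLIB.

```python
import zlib

def gzip_body(data, level=6, mem_level=8, window_bits=15):
    """Compress `data` into a gzip stream with explicit ZLIB tuning parameters."""
    # adding 16 to the window bits asks zlib for a gzip container
    # rather than a raw zlib stream
    compressor = zlib.compressobj(level, zlib.DEFLATED, 16 + window_bits, mem_level)
    return compressor.compress(data) + compressor.flush()

# highly redundant sample text, so compression is very effective
body = b"The quick brown fox jumps over the lazy dog.\n" * 2000

fast = gzip_body(body, level=1)                  # like the GZIP utility's -1
small = gzip_body(body, level=9)                 # like the GZIP utility's -9
frugal = gzip_body(body, level=6, mem_level=1, window_bits=9)  # least memory

print(len(body), len(fast), len(small), len(frugal))
```

Comparing the printed sizes for a site's own typical content is a quick way to judge whether a higher level or a larger window is worth the extra CPU and memory.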

Be aware that GZIP encoding is (memory intensive\BOLD). From 132kB to 265kB has been observed per compressing request (WATCH provides this in a summary line). These values apply across a wide range of transfer sizes (from kilobytes to tens of megabytes). It also is (CPU intensive\BOLD) and adds response latency, though that might well be offset by significant reductions in transfer time on the Internet or other slower, non-intranet infrastructures. Text content compression has been observed from 30% to 10% of the original file size (even down to 1% in the case of the extremely redundant content of [EXAMPLE]64K.TXT). VMS executables (for want of another binary test case) at around 40%. In other words, GZIP encoding may not be suitable or efficient for every site or every request!

Once enabled WASD will GZIP the responses for all suitable contents provided the client accepts the encoding and the response is not one of the following: (UNNUMBERED) less than 1400 bytes (no point in the overhead) already content-encoded script output a compressed image (e.g. GIF, JPEG, PNG, etc) a video stream (presumably already compressed, e.g. MPEG) a compressed audio stream a PDF file a Shockwave Flash file an obviously compressed application stream (e.g. GZIP, ZIP, JAR)
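The exclusion list above amounts to a simple eligibility test. The following Python sketch expresses it as a single function. Only the 1400 byte threshold comes from the text. The function name and the specific content-type strings used to recognise "already compressed" data are assumptions chosen for illustration, and the server's own tests may differ.

```python
MIN_GZIP_BYTES = 1400  # threshold quoted in the text

# Assumption: these content-types stand in for the compressed image, video,
# audio, PDF, Flash and archive categories in the list.
SKIP_PREFIXES = ("image/gif", "image/jpeg", "image/png", "video/", "audio/mpeg")
SKIP_TYPES = {
    "application/pdf",
    "application/x-shockwave-flash",
    "application/gzip",
    "application/zip",
    "application/java-archive",
}

def should_gzip(content_type, content_length, content_encoding=None,
                client_accepts_gzip=True):
    """Decide whether a response body is a candidate for GZIP encoding."""
    if not client_accepts_gzip:
        return False          # client did not offer to accept the encoding
    if content_encoding:
        return False          # already content-encoded
    if content_length is not None and content_length < MIN_GZIP_BYTES:
        return False          # too small to be worth the overhead
    ctype = content_type.split(";")[0].strip().lower()
    if ctype in SKIP_TYPES or ctype.startswith(SKIP_PREFIXES):
        return False          # presumably compressed already
    return True
```

For example, `should_gzip("text/html; charset=utf-8", 20000)` is true while `should_gzip("image/png", 20000)` is false.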

Additional control may be exercised with the following path SETings: (UNNUMBERED) (response=GZIP=all), matching paths will always have GZIP encoding performed (the above constraints still apply) (response=GZIP=none), matching paths will never have GZIP encoding (response=GZIP=()), responses with content-lengths greater than the specified number of kilobytes will be GZIP content-encoded (if the content-length cannot be determined it will NOT be encoded and the above constraints still apply)

Using path settings GZIP compression may be disabled for specified file types (apart from those already suppressed as described above). set **.myzip response=gzip=none

A script using the (Script-Control: X-content-encoding-gzip=0) CGI response header can similarly suppress GZIP compression of its output if required. See (Scripting Overview) for further detail. (Flush Period\hd_gzip_request_flush)

By default GZIP encoding flushes the internal buffer only when full. Most commonly this is not an issue because of high rates of output. However with slow output sources, such as from some classes of script, this can result in considerable latency before a client sees an initial response, and then between transmission of further output. By default output is initially flushed after 5 seconds and thereafter at a maximum interval of 15 seconds. The WASD_CONFIG_GLOBAL directive [GzipFlushSeconds] allows this period to be adjusted. (Request Encoding\hd_gzip_request_encoding)

Decoding of GZIP content-encoded request bodies is enabled using the WASD_CONFIG_GLOBAL directive [GzipAccept]. Enabling this using a value of 15 (or 1) results in the server advertising its acceptance of GZIPed requests using the "Accept-Encoding: gzip, deflate" response header. Requests containing bodies GZIP compressed will have these decoded as they are read from the client and before further processing, such as the upload of files into server accessible file-system space. This decoding is optional and not the default with DCL and DECnet script processing. That is, a request body will be passed to the script still encoded unless specific mapping directs otherwise. Decoding by the server into the original data prior to transferring to the script can be enabled for all or selected scripts using the following path settings: (UNNUMBERED) (script=body=decode), script gets the decoded stream (script=body=NOdecode), script gets the raw, encoded stream (default)

Note that scripts need to be specially aware of both GZIP encoded bodies and those already decoded by the server. In the first case the stream must be read to the specified content-length and then decoded. In the second case, a content-length cannot be provided by the server (without decoding the entire stream ahead of time it cannot predict the final size). Where the server is to decode the request body before transferring it to the script it changes the CGI variable CONTENT_LENGTH to a single question-mark ("?"). Scripts may use this to detect the server's intention and then must ignore any transfer-encoding and/or content-encoding header information and read the request body until end-of-file is received.
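A script can handle both cases with a small amount of logic. The Python sketch below shows one way to do it. The function name `read_request_body` is hypothetical, the CGI variables are passed in as a dictionary, and the body is read from any file-like object. In a real script these would come from the script's CGI environment and its request body input stream.

```python
import io

def read_request_body(environ, stream):
    """Read a request body according to the CONTENT_LENGTH convention described above."""
    length = environ.get("CONTENT_LENGTH", "")
    if length == "?":
        # The server is decoding the body for us: the final size is unknown,
        # so ignore encoding headers and read until end-of-file.
        return stream.read()
    if not length:
        return b""  # no request body
    # Otherwise read exactly the stated number of bytes. If the body is still
    # GZIP encoded the script must decode it itself after reading.
    return stream.read(int(length))

# simulated requests for demonstration
decoded_by_server = read_request_body({"CONTENT_LENGTH": "?"},
                                      io.BytesIO(b"hello world"))
fixed_length = read_request_body({"CONTENT_LENGTH": "5"},
                                 io.BytesIO(b"hello world"))
```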

GZIP decoding (decompression) is understandably much less memory and CPU intensive. Experimentation indicates it does not contribute significantly to latency either. (....................................................................) (Request Throttling\hd_request_throttling)

Request (throttling) is a term adopted to describe controlling the number of requests that can be processing against any specified path at any one time. Requests in excess of this value are First-In-First-Out (FIFO) queued, up to an optional limit, waiting for a currently processing request to conclude allowing the next queued request to resume processing. This is primarily intended to limit concurrent resource-intensive script execution but could be applied to any resource path. Here's one dictionary description.

( (throttle n 1:\BOLD) a valve that regulates the supply of fuel to the engine [syn: accelerator, throttle valve] (2:\BOLD) a pedal that controls the throttle valve; (he stepped on the gas) [syn: accelerator, accelerator pedal, gas pedal, gas, gun] (v 1:\BOLD) place limits on; (restrict the use of this parking lot) [syn: restrict, restrain, trammel, limit, bound, confine] (2:\BOLD) squeeze the throat of; (he tried to strangle his opponent) [syn: strangle, strangulate] (3:\BOLD) reduce the air supply; of carburetors [syn: choke] )

This is applied to a path (or paths) using the WASD_CONFIG_MAP mapping SET THROTTLE= rule ((hd_map_rule_set)). The general format is set (path) throttle=(n1)[(/u1)][(,n2,n3,n4,t/o1,t/o2)] set (path) throttle=(from)[(/per-user)][(,to,resume,busy,t/o-queue,t/o-busy)] where (UNNUMBERED) (n1) sets the number of concurrent requests before queuing begins (the number of processing requests becomes static and the number of queued requests increases) (u1) is separated from the (n1) value by a forward-slash and limits the concurrent requests any one authenticated user can process. Even though the (n1) value may allow processing if (u1) would be exceeded the request is queued. (n2) is the concurrent requests before FIFO queuing begins, meaning each new request is put onto the queue but at the same time the first-in request is taken off the queue for processing (the number of queued requests becomes static and the number of processing requests increases) (n3) puts a limit on FIFO queuing (the number of queued requests again increases and the number of processing requests becomes static) (n4) is an absolute limit for concurrent requests against the path (a 503 (server too busy) status is immediately generated) (t/o1) is the maximum period for queued requests before they are processed (if not constrained by (n3)) (t/o2) is the maximum period for queued requests before a 503 (server too busy) response is returned, it begins immediately or following the expiry of any (t/o1)

One way to read a throttle rule is (begin to (throttle) (queue) requests (from) the n1 value up (to) the n2 value, after which the queue is FIFOed up to the n3 value when it (resume)s queuing-only, up until the (busy) n4 value).

Each integer represents the number of concurrent requests against the throttle rule path. Parameters not required may be specified as zero or omitted in a comma-separated list. The schema of the rule requires that each successive parameter be larger than that preceding it. This basic consistency check is performed when the rule is loaded.
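The parameter layout and the ordering check can be illustrated with a short Python parser. It is a sketch only: the function name `parse_throttle`, the dictionary keys and the decision to skip zero or omitted values when comparing neighbours are assumptions, and the server's own check may be stricter or differ in detail.

```python
def parse_throttle(value):
    """Parse the value of throttle=n1[/u1][,n2,n3,n4,t/o1,t/o2] into a dict."""
    fields = value.split(",")
    if len(fields) > 6:
        raise ValueError("too many throttle parameters")
    fields += [""] * (6 - len(fields))      # omitted parameters become empty
    first, _, per_user = fields[0].partition("/")
    rule = {
        "n1": int(first),
        "u1": int(per_user) if per_user else 0,
        "n2": int(fields[1]) if fields[1] else 0,
        "n3": int(fields[2]) if fields[2] else 0,
        "n4": int(fields[3]) if fields[3] else 0,
        "timeout_queue": fields[4] or None,  # hh:mm:ss kept as text
        "timeout_busy": fields[5] or None,
    }
    # Consistency check: each limit in use must be larger than the one before it
    # (assumption: zero or omitted limits are simply skipped).
    previous = 0
    for key in ("n1", "n2", "n3", "n4"):
        if rule[key]:
            if rule[key] <= previous:
                raise ValueError(f"{key} must be larger than the preceding value")
            previous = rule[key]
    return rule

rule = parse_throttle("10,20,30,40,00:02:00")
per_user_rule = parse_throttle("10/1,,,,,00:03:00")
```

A value such as `20,10` fails the check because the second limit is not larger than the first.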

For any rule the possible maximum number of requests that can be processed at any one time may be simply calculated through the addition of the (n1) value to the difference of the (n3) and (n2) values (i.e. max = n1 + (n3 - n2)). The maximum concurrently queued is the difference of the (n4) value and the maximum concurrently processed.
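The arithmetic can be checked with a few lines of Python. The function name is hypothetical and the calculation assumes all four limits are specified.

```python
def throttle_capacity(n1, n2, n3, n4):
    """Return (maximum concurrently processed, maximum concurrently queued)."""
    max_processing = n1 + (n3 - n2)
    max_queued = n4 - max_processing
    return max_processing, max_queued

# throttle=10,20,30,40
processing, queued = throttle_capacity(10, 20, 30, 40)
print(processing, queued)
```

For throttle=10,20,30,40 this gives 20 processed and 20 queued at most.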

A comprehensive throttle statistics report is available from the Server Administration facility. (Per-User Throttle\hd_request_throttle_per_user)

If the concurrent processing value ((n1)) has a second, slash-delimited integer, this serves to limit the number of authenticated user-associated requests that can be concurrently processing.

When a request is available for processing the associated remote user name is checked for activity against the queue. The (u1) (or per-user throttle value) is a limit on that user name's concurrent processing. If it would exceed the specified value the request is queued until the number of requests processing drops below the (u1) value. All other values in the throttle rule are applied as for non-per-user throttling. The user name used for comparison purposes is the authenticated remote user (same as the CGI variable value REMOTE_USER). This can be for any realm. Of course the same string can be used to represent different users within different authentication realms and so care should be exercised that per-user throttling does not span realms otherwise unexpected (and incorrect) throttling may occur for distinct users.
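The per-user test can be pictured as one extra condition on top of the (n1) limit. The Python sketch below is a simplified model: the function name and the `Counter` of processing requests per user are assumptions, and the FIFO and busy limits ((n2) to (n4)) are deliberately left out.

```python
from collections import Counter

def may_process(user, processing, n1, u1):
    """Return True if a request from `user` may begin processing now.

    `processing` maps authenticated user name -> requests currently processing.
    """
    if sum(processing.values()) >= n1:
        return False   # overall concurrency limit reached, request is queued
    if u1 and processing[user] >= u1:
        return False   # this user is already at the per-user limit, request is queued
    return True

# throttle=10/1 with two users each processing one request
processing = Counter({"alice": 1, "bob": 1})
alice_again = may_process("alice", processing, n1=10, u1=1)
carol_first = may_process("carol", processing, n1=10, u1=1)
```

Here a second request from "alice" waits even though only two of the ten processing slots are in use, while a first request from "carol" proceeds.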

If an unauthenticated request is matched against the throttle rule (i.e. there is no authorization rule matching the request path) the client has a 500 (server error) response returned. Obviously per-user throttling must have a remote user name to throttle against and this is a configuration issue. (Examples\hd_request_throttle_examples) (NUMBERED) (throttle=10\BOLD)

Requests up to 10 are concurrently processed. When 10 is reached further requests are queued to server capacity. (throttle=10,20\BOLD)

Concurrent requests to 10 are processed immediately. From 11 to 20 requests are queued. After 20 all requests are queued but also result in a request FIFOing off the queue to be processed (queue length is static, number being processed increases to server capacity). (throttle=15,30,40\BOLD)

Concurrent requests up to 15 are immediately processed. Requests 16 through to 30 are queued, while 31 to 40 requests result in the new requests being queued and waiting requests being FIFOed into processing. Concurrent requests from 41 onwards are again queued, in this scenario to server capacity. (throttle=10,20,30,40\BOLD)

Concurrent requests up to 10 are immediately processed. Requests 11 through to 20 will be queued. Concurrent requests from 21 to 30 are queued too, but at the same time waiting requests are FIFOed from the queue (resulting in 10 (n1) + 10 (n3-n2) = 20 being processed). From 31 onwards requests are just queued. Up to 40 concurrent requests may be against the path before all new requests are immediately returned with a 503 "busy" status. With this scenario no more than 20 can be concurrently processed with 20 concurrently queued. (throttle=10,,,30\BOLD)

Concurrent requests up to 10 are processed. When 10 is reached requests are queued up to request 30. When request 31 arrives it is immediately given a 503 "busy" status. (throttle=10,20,30,40,00:02:00\BOLD)

This is basically the same as scenario 4) but with a resume-on-timeout of two minutes. If there are currently 15 (or 22 or 28) requests (n1 exceeded, n3 still within limit) the queued requests will begin processing on timeout. Should there be 32 processing (n3 has reached limit) the request will continue to sit in the queue. The timeout would not be reset. (throttle=15,30,40,,,00:03:00\BOLD)

This is basically the same as scenario 3) but with a busy-on-timeout of three minutes. When the timeout expires the request is immediately dequeued with a 503 "busy" status. (throttle=10/1\BOLD)

Concurrent requests up to 10 are processed. The requests must be of authenticated users. Each authenticated user is allowed to execute at most one concurrent request against this path. When 10 is reached, or if fewer than 10 requests are executing but the requesting user already has one in progress, further requests are queued to server capacity. (throttle=10/1,,,,,00:03:00\BOLD)

This is basically the same as scenario 8) but with a busy-on-timeout of three minutes. When the timeout expires any requests still queued against the user name are immediately dequeued with a 503 "busy" status. (Mapping Reload\hd_request_throttle_reload)

Throttling is applied using mapping rules. The set of these rules may be changed within an executing server using map reload functionality. This means the number of, and/or contents of, throttle rules may change during server execution. The throttle functionality needs to be independent of the mapping functionality (requests are processed independently of mapping rules once the rules have been applied). After a mapping reload the contents of the throttle data structures may be at variance with the constraints currently executing requests began processing under.

This should have little deleterious effect. The worst case is mis-applied constraints on the execution limits of changed request paths, and slightly confusing data in the Throttle Report. This quickly passes as requests being processed under the previous throttle constraints conclude and an entirely new collection of requests created using the constraints of the currently loaded rules are processed. (....................................................................) (Client Concurrency\hd_client_concurrency)

The (client_connect_gt:) mapping conditional ((cr_conditional)) attempts to allow some measurement of the number of requests a particular client currently has being processed. Using this decision criterion appropriate request mapping for controlling the additional requests can be undertaken. It is not intended to provide fine-grained control over activities, rather just to prevent a single client using an unreasonable proportion of the resources.

For example, if the number of requests from one particular client looks like it has got out of control (at the client end) then it becomes possible to queue (throttle) or reject further requests. In WASD_CONFIG_MAP if (client_connect_gt:15) set * throttle=15 if (client_connect_gt:15) pass * "503 Exceeding your concurrency limit!"

While not completely foolproof it does offer some measure of control over gross client concurrency abuse or error. (....................................................................) (Content-Type Configuration\hd_config_content_type)

HTTP uses an implementation of the MIME (Multi-purpose Internet Mail Extensions) specification for identifying the type of data returned in a response. A MIME content-type consists of a plain text string describing the data as a (type) and slash-separated (subtype), as illustrated in the following examples: text/html text/plain image/gif image/jpeg application/octet-stream The content-type is returned to the client as part of the HTTP response, the client then using this information to correctly process and present the data contained in that response. (Adding Content-Types\hd_config_content_addtype)

In common with most HTTP servers WASD uses a file's suffix (extension, type, e.g. (.HTML), (.TXT), (.GIF)) to identify the data type within the file. The [AddType] directive is used during configuration to bind a file type to a MIME content-type. To make the server recognise and return specific content-types these directives map file types to content-types.

With the VMS file system there is no effective file characteristic or algorithm for identifying a file's content without an exhaustive examination of the data contained therein, a very expensive process (and probably still inconclusive in many cases), hence the reliance on the file type. When adding a totally new content-type to the configuration be sure also to bind an icon to that type using the [AddIcon] directive (see below). If this is not done the default icon specified by [AddDefaultIcon] is displayed. If that is not defined then a directory listing shows (HTML="[?]") (HTML/OFF) ([?]) (HTML/ON) in place of an icon.

Mappings using [AddType] look like these. [AddType] .html text/html Web Markup Language .txt text/plain plain text .gif image/gif image (GIF) .hlb text/x-script /Conan VMS Help library .decw$book text/x-script /HyperReader Bookreader book * internal/x-unknown application/octet-stream (MIME.TYPES\hd_config_content_mimetypes)

To allow the server to share content-type definitions with other MIME-aware applications, and for WASD scripts to be able to perform their own mapping on a shared understanding of MIME content it is possible to move the file suffix to content-type mapping from a collection of [AddType]s in WASD_CONFIG_GLOBAL to an external file. This file is usually named MIME.TYPES and is specified in WASD_CONFIG_GLOBAL using the [AddMimeTypesFile] directive.

Mappings using MIME.TYPES look like these. # MIME type Extension application/msword doc application/octet-stream bin dms lha lzh exe class application/oda oda application/pdf pdf application/postscript ai eps ps application/rtf rtf

A leading content-type is mapped to single or multiple file suffixes. A general MIME.TYPES file commonly has content-types listed with no corresponding file suffix. These are ignored by WASD. Where a file suffix is repeated during configuration the latter version completely supersedes the former (with the Server Administration page showing an italicised and struck-through content-type to help identify duplicates).
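
The three behaviours just described (one content-type to many suffixes, entries without a suffix ignored, a repeated suffix superseding the earlier one) can be sketched as a small parser. This is a minimal illustration rather than the server's own code: the function name is hypothetical, the last two sample lines are invented to exercise the rules, and treating suffixes as case-insensitive is an assumption.

```python
def parse_mime_types(text):
    """Return a dict mapping lower-case file suffix to content-type."""
    suffix_to_type = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # blank lines and comments carry no mapping
        content_type, *suffixes = line.split()
        # A content-type with no suffixes contributes nothing.
        for suffix in suffixes:
            # A repeated suffix replaces the earlier mapping.
            suffix_to_type[suffix.lower()] = content_type
    return suffix_to_type


SAMPLE = """\
# MIME type                 Extension
application/msword          doc
application/octet-stream    bin dms lha lzh exe class
application/oda             oda
application/pdf             pdf
application/postscript      ai eps ps
application/rtf             rtf
application/x-example
text/plain                  doc
"""

mapping = parse_mime_types(SAMPLE)
print(mapping["ps"])
print(mapping["doc"])
```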

To allow the configuration information used by the server to generate directory listings with additional detail, WASD-specific extensions to the standard MIME.TYPES format are provided. These are (hidden) in comment structures so as not to interfere with non-WASD application use. All begin with a hash then an exclamation character ((#!)) then another reserved character indicating the purpose of the extension. Existing comments are unaffected provided the second character is anything but an exclamation mark! (UNNUMBERED) (#! file description\BOLD) A space reserved character indicates following free-form text, used as the file type description displayed on the far right of directory listings. (#!/cgi-bin/script\BOLD) A forward-slash introduces an auto-script specification. An auto-script is automatically activated by the server to process and display a corresponding file's contents. These are sometimes referred to as (presentation) scripts. (#![(alt)] /path/to/icon.gif\BOLD) A left-square-bracket is used for icon specifications. These are actually mapped against the following content-type, not file suffix, and so only need to be specified once for each content-type in the file. This behaves in a similar fashion to [AddIcon], only the components are reversed. (#!!\BOLD) The two exclamation marks can be used to indicate a MIME type intended for WASD only. These can be ignored by non-WASD applications. (#!+\BOLD) An exclamation mark then a plus symbol indicates an FTP transfer mode directive. One of three characters may follow the plus. An (A) indicates that this file type should be FTP transferred in ASCII mode. An (I) or a (B) indicates that this file type should be FTP transferred in Image (binary) mode. (#!%\BOLD) A percent symbol is ignored by WASD. This is reserved for local (non-WASD) viewers.

These directives are placed (following\BOLD) the MIME-type entry they apply to. An example of the contents of a MIME.TYPES file with various WASD extensions. # MIME type Extension application/msword doc #! MS Word document #![DOC] /httpd/-/doc.gif application/octet-stream bin dms lha lzh exe class #! binary content #![BIN] /httpd/-/binary.gif application/oda oda application/pdf pdf application/postscript ai eps ps #! Adobe PostScript #![PS.] /httpd/-/postscript.gif #!+A application/rtf rtf #! Rich Text Format #![RTF] /httpd/-/rtf.gif application/x-script bks decw$bookshelf #! DEC Bookshelf #!/cgi-bin/hypershelf application/x-script bkb decw$book #![BKR] /httpd/-/script.gif #! DEC Book #!/cgi-bin/hyperreader
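
As a rough illustration of how such directives attach to the entry they follow, here is a Python sketch. It is not the server's parser: the function and dictionary key names are hypothetical, the one-directive-per-line layout of the sample is assumed from the example above, and the (#!!) and (#!%) forms are skipped because their exact syntax is not given here.

```python
def parse_extensions(text):
    """Collect WASD '#!' directives against the preceding MIME-type entry."""
    entries = []
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue
        if not line.startswith("#"):
            content_type, *suffixes = line.split()
            current = {"type": content_type, "suffixes": suffixes}
            entries.append(current)
            continue
        if not line.startswith("#!") or len(line) < 3 or current is None:
            continue  # ordinary comment, or nothing to attach to
        kind, rest = line[2], line[3:].strip()
        if kind == " ":
            current["description"] = rest
        elif kind == "/":
            current["auto_script"] = "/" + rest
        elif kind == "[":
            alt, _, icon = rest.partition("]")
            current["icon"] = (alt, icon.strip())
        elif kind == "+" and rest[:1].upper() in ("A", "I", "B"):
            current["ftp_mode"] = "ascii" if rest[0].upper() == "A" else "image"
    return entries


SAMPLE = """\
# MIME type Extension
application/msword doc
#! MS Word document
#![DOC] /httpd/-/doc.gif
application/postscript ai eps ps
#! Adobe PostScript
#![PS.] /httpd/-/postscript.gif
#!+A
application/x-script bks decw$bookshelf
#! DEC Bookshelf
#!/cgi-bin/hypershelf
"""

entries = parse_extensions(SAMPLE)
for entry in entries:
    print(entry)
```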

Other reserved characters have been specified for development purposes but are not (perhaps currently) employed by the HTTP server. (UNNUMBERED) (#!(<) html marked-up text\BOLD) A less-than symbol indicates HTML marked-up text. (#!# blah blah blah\BOLD) (##! rhubarb rhubarb\BOLD) Two combinations of hash and exclamation characters provide for WASD-specific comments. (Unknown Content-Types\hd_config_content_unknown)

If a file type is not recognised (i.e. no [AddType] or [AddMimeTypesFile] mapping corresponding to the file type) then by default WASD identifies its data as (application/octet-stream) (i.e. essentially binary data). Most browsers respond to this content-type with a download dialog, allowing the data to be saved as a file. Most commonly these unknown types manifest themselves when authors use (interesting) file names to indicate their purpose. Here are some examples the author has encountered: README.VMS README.1ST READ-ME.FIRST BUILD.INSTRUCTIONS MANUAL.PT1 (.PT2, )

If the site administrator would prefer another default content-type, perhaps (text/plain) so that any unidentified files default to plain text, then this may be configured by specifying that content-type as the (description) of the catch-all file type entry. Examples (use one of): [AddType] * internal/x-unknown * internal/x-unknown application/octet-stream * internal/x-unknown text/plain * internal/x-unknown something/else-entirely It is the author's opinion that unidentified file types should remain as binary downloads, not (text) documents, which they are probably more often not, but it's there if wanted. (Explicitly Specifying Content-Type\hd_explicit_type)

When accessing files it is possible to explicitly specify the identifying content-type to be returned to the browser in the HTTP response header. Of course this does not change the actual content of the file, just the header content-type! This is primarily provided to allow access to plain-text documents that have obscure, non-(standard) or non-configured file extensions.

It could also be used for other purposes, (forcing) the browser to accept a particular file as a particular content-type. This can be useful if the extension is not configured (as mentioned above) or in the case where the file contains data of a known content-type but with an extension conflicting with an already configured extension specifying data of a different content-type.

Enter the file path into the browser's URL specification field ("Location:", "Address:"). Then, for plain-text, append the following query string: ?httpd=content&type=text/plain

For another content-type substitute it appropriately. For example, to retrieve a text file in binary (why I can't imagine :-) use ?httpd=content&type=application/octet-stream
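
Where such URLs are generated by a program rather than typed, the query string can be appended as in this small Python sketch. The helper name and the example path are hypothetical; only the `httpd=content` and `type=` parameters come from the description above.

```python
from urllib.parse import urlencode


def with_content_type(path, content_type):
    """Append the explicit content-type query string to a file path."""
    # safe="/" keeps the slash in the content-type readable, as in the text.
    query = urlencode({"httpd": "content", "type": content_type}, safe="/")
    return f"{path}?{query}"


print(with_content_type("/web/doc/file.unknown", "text/plain"))
```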

This is an example: (....................................................................) (HTML=

  file.unknown

  file.unknown?httpd=content&type=text/plain
) (onlinedemo.sdml) (....................................................................)

It is possible to "force" the content-type for all files in a particular directory. Enter the path to the directory and then add ?httpd=index&type=text/plain

(or whatever type is desired). Links to files in the listing will contain the appropriate (?httpd=content&type=...) appended as a query string.

This is an example: (....................................................................) (HTML=

  *.*

  *.*?httpd=index&type=text/plain
) (onlinedemo.sdml) (....................................................................) (Language Variants\hd_language_variants)

Language-specific variants of a document may be configured to be served automatically and transparently. This is organized as a basic file name with each language-specific variant indicated by an additional (tag), one of the ISO language abbreviations used by the (Accept-Language:) request header field, e.g. (en) for English, (fr) for French, (de) for German, (ru) for Russian, etc.

Two variants of the basic file specification are possible; file name (the default) and file type. Hence if the basic file name is EXAMPLE.HTML then specifically German, English, French and Russian language versions in the directory would be either EXAMPLE.HTML EXAMPLE_DE.HTML EXAMPLE_EN.HTML EXAMPLE_FR.HTML EXAMPLE_RU.HTML or EXAMPLE.HTML EXAMPLE.HTML_DE EXAMPLE.HTML_EN EXAMPLE.HTML_FR EXAMPLE.HTML_RU

A path must be explicitly SET using the (accept=lang) mapping rule as containing language variants. As searching for variants is a relatively expensive operation the rule(s) applying this functionality should be carefully crafted. The (accept=lang) rule accepts an optional default language representing the contents of the basic, untagged files. This provides an opportunity to more efficiently handle requests with a language first preference matching that of the default. In this case no variant search is undertaken, the basic file is simply served. The following example sets a path to contain files with a default language of French and possibly containing other language variants. set /web/doc/* accept=lang=(default=fr)

In this case the behaviour would be as follows. With the default language set to (fr) a request's (Accept-Language:) field is initially processed to check if the first preference is for (fr). If it is then there is no need for further accept language processing and the basic file is returned as the response. If not then the directory is searched for other files matching the EXAMPLE_*.HTML specification. All files matching this wildcard have the (*) portion (e.g. (EN), (FR), (DE), (RU)) added to a list of variants. When the search is complete this list is compared to the request's (Accept-Language:) list. The first one to be matched has the contents of the corresponding file returned. If none are matched the default version would be returned.
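
The selection steps just described can be sketched in Python. This is an approximation under stated assumptions, not the server's algorithm: the function names are hypothetical, the directory is represented as a simple list of names, request preferences are tried in the order listed, and quality values in the header are ignored.

```python
def parse_accept_language(header):
    """Return language tags from an Accept-Language value, in listed order."""
    tags = []
    for item in header.split(","):
        tag = item.split(";")[0].strip().lower()
        if tag:
            tags.append(tag)
    return tags


def select_variant(base_name, file_type, directory_files, accept_language, default_lang):
    """Pick the file to serve using the file-name form of variant tagging."""
    basic = f"{base_name}.{file_type}"
    preferences = parse_accept_language(accept_language)
    # First preference matches the default: serve the basic file, no search.
    if preferences and preferences[0] == default_lang:
        return basic
    # Otherwise collect the variant tags present in the directory.
    prefix, suffix = f"{base_name}_", f".{file_type}"
    variants = {
        name[len(prefix):-len(suffix)].lower(): name
        for name in directory_files
        if name.startswith(prefix) and name.endswith(suffix)
    }
    for lang in preferences:
        if lang in variants:
            return variants[lang]
    return basic  # nothing matched: fall back to the default version


FILES = ["EXAMPLE.HTML", "EXAMPLE_DE.HTML", "EXAMPLE_EN.HTML",
         "EXAMPLE_FR.HTML", "EXAMPLE_RU.HTML"]

for header in ("fr,de,en", "ru,en", ""):
    print(repr(header), select_variant("EXAMPLE", "HTML", FILES, header, "fr"))
```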

This example of the behaviour is based on the contents of the directory described above. A request that specifies Accept-Language: fr,de,en will have EXAMPLE.HTML returned (without having searched for any other variants). For a request specifying Accept-Language: ru,en then the EXAMPLE_RU.HTML file is returned, and if no (Accept-Language:) is supplied with the request EXAMPLE.HTML would be returned. One or other file is always returned, with the default, non-language file always the fallback source of data. If it does not exist and no other language variant is selected the request returns a 404 file-not-found error. (Content-Type\hd_language_type)

When using the (accept=lang=(variant=type)) form of the rule (i.e. the variant is placed on the file type rather than the default file name) each possible file extension must also have its content-type made known to the server. Using the example above the variants would need to be configured in a similar way to the following. [AddType] .HTML "text/html; charset=ISO-8859-1" Web Markup Language .HTML_DE "text/html; charset=ISO-8859-1" HTML (German) .HTML_EN "text/html; charset=ISO-8859-1" HTML (English) .HTML_FR "text/html; charset=ISO-8859-1" HTML (French) .HTML_RU "text/html; charset=koi8-r" HTML (Russian) (Non-Text Content\hd_language_non_text)

Normally only files with a content-type of (text/..) are subject to variant searching. If the rule path includes a file type then those files matching the rule are also variant-searched. In this way images, audio files, etc., may also have language-specific versions supplied transparently. The following illustrates this usage. set /web/doc/*.jpg accept=lang=(default=fr) set /web/doc/*.wav accept=lang=(default=fr) (....................................................................) (Character Set Conversion\hd_charset_conv)

The default character set sent in the response header for text documents (plain and HTML) is set using the [CharsetDefault] directive and/or the SET charset mapping rule. English language sites should specify ISO-8859-1, other Latin alphabet sites, ISO-8859-2, 3, etc. Cyrillic sites might wish to specify ISO-8859-5 or KOI8-R, and so on.

Document and CGI script output may be dynamically converted from one character set to another using the standard VMS NCS conversion library. The [CharsetConvert] directive provides the server with character set aliases (those that are for all requirements the same) and which NCS conversion function may be used to convert one character set into another. document-charset accept-charset[,accept-charset..] [NCS-function-name[=factor]]

When this directive is configured the server compares each text response's character set (if any) to each of the directive's (document charset) strings. If it matches it then compares each of the (accepted charset) strings (if multiple) to the request (Accept-Charset:) list of accepted character sets.

At least one (doc-charset) and one (accept-charset) must be present. If only these two are present (i.e. no (NCS-conversion-function)) it indicates that the two character sets are aliases (i.e. the same set of characters, different name) and no conversion is necessary.

If an (NCS-conversion-function) is supplied it indicates that the document (doc-charset) can be converted to the request (Accept-Charset:) preference of the (accept-charset) using the NCS conversion function name specified.

A (factor) parameter can be appended to the conversion function. Some conversion functions require more than one output byte to represent one input byte for some characters. The 'factor' is an integer between 1 and 4 indicating how much more buffer space may be required for the converted string. It works by allocating that many times more output buffer space than is occupied by the input buffer. If not specified it defaults to 1, or an output buffer the same size as the input buffer.
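
To make the entry syntax and the effect of the factor concrete, here is a minimal Python sketch. It only parses an entry and computes a buffer size, it does not perform any NCS conversion, and the function and key names are hypothetical.

```python
def parse_charset_convert(line):
    """Split one [CharsetConvert] entry into its components."""
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("need a document charset and at least one accept charset")
    entry = {
        "doc_charset": fields[0],
        "accept_charsets": fields[1].split(","),
        "function": None,  # None means the charsets are aliases
        "factor": 1,       # default: output buffer same size as input
    }
    if len(fields) > 2:
        name, _, factor = fields[2].partition("=")
        entry["function"] = name
        if factor:
            entry["factor"] = int(factor)
            if not 1 <= entry["factor"] <= 4:
                raise ValueError("factor must be between 1 and 4")
    return entry


def output_buffer_size(input_size, factor):
    # The output buffer is `factor` times the size of the input buffer.
    return input_size * factor


alias = parse_charset_convert("windows-1251 windows-1251,cp-1251")
convert = parse_charset_convert("koi8-r utf-8 koi8r_to_utf8=2")
print(alias)
print(convert, output_buffer_size(1024, convert["factor"]))
```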

Multiple comma-separated (accept-charset)s may be included as the second component for either of the above behaviours, with each being matched individually. Wildcard (*) and (%) may be used in the (doc-charset) and (accept-charset) strings. [CharsetConvert] windows-1251 windows-1251,cp-1251 windows-1251 koi8-r windows1251_to_koi8r koi8-r koi8-r,koi8 koi8-r windows-1251,cp-1251 koi8r_to_windows1251 koi8-r utf-8 koi8r_to_utf8=2 (....................................................................) (Error Reporting\hd_error_reporting)

By default the server provides its own internal error reporting facility. These reports may be configured as (basic) or (detailed) on a per-path basis, as well as determining the basic (look-and-feel). For more demanding requirements the [ErrorReportPath] configuration directive allows a redirection path to be specified for error reporting, permitting the site administrator to tailor both the nature and format of the information provided. A Server Side Include document, CGI script or even standard HTML file(s) may be specified. Generally an SSI document would be recommended for the simplicity yet versatility. (Basic and Detailed\hd_error_reporting_basic)

Internally generated error reports are the most efficient. These can be delivered with two levels of error information. The default is more detailed. (HTML=

ERROR 404  -  The requested resource could not be found.
Document not found  ...  /wasd_root/index.html
(document, bookmark, or reference requires revision)
Additional information:  1xx2xx3xx4xx5xxHelp

WASD/10.0.0 Server at www.example.com Port 80
) (HTML/OFF) (ERROR 404\BOLD) - The requested resource could not be found. Document not found ... /wasd_root/index.html ((document, bookmark, or reference requires revision)) Additional information: 1(xx), 2(xx), 3(xx), 4(xx), 5(xx), Help ----------------------------------------------------------------- (WASD/10.0.0 Server at www.example.com Port 80) (HTML/ON)

There is also the more basic. (HTML=

ERROR 404  -  The requested resource could not be found.
Additional information:  1xx2xx3xx4xx5xxHelp

WASD/10.0.0 Server at www.example.com Port 80
) (HTML/OFF) (ERROR 404\BOLD) - The requested resource could not be found. Additional information: 1(xx), 2(xx), 3(xx), 4(xx), 5(xx), Help ----------------------------------------------------------------- (WASD/10.0.0 Server at www.example.com Port 80) (HTML/ON)

These can be set per-server using the [ReportBasicOnly] configuration directive, or on a per-path basis in the WASD_CONFIG_MAP configuration file. The basic report is intended for environments where traditionally a minimum of information might be provided to the user community, both to reduce site configuration information leakage and because a general user population may only need or want the information that a document was either found or not found. The detailed report often provides far more specific information as to the nature of the event and so may be more appropriate to a more technical group of users. Either way it is relatively simple to provide one as the default and the other for specific audiences. Note that the detailed report also includes in page () information the code module and line references for reported errors.

To default to a basic report for all but selected resource paths introduce the following to the top of the WASD_CONFIG_MAP configuration file. # default is basic reports set /* report=basic set /internal-documents/* report=detailed set /other/path/* report=detailed

To provide the converse, default to a detailed report for all but selected paths use the following. # default is detailed reports set /web/* report=basic (Other Customization\hd_error_reporting_other)

The additional reference information included in the report may be disabled using the appropriate WASD_CONFIG_MSG [status] message item. Emptying this message results in an error report similar to the following. (HTML=

ERROR 404  -  The requested resource could not be found.

WASD/10.0.0 Server at www.example.com Port 80
) (HTML/OFF) (ERROR 404\BOLD) - The requested resource could not be found. ----------------------------------------------------------------- (WASD/10.0.0 Server at www.example.com Port 80) (HTML/ON)

The server signature may be disabled using the WASD_CONFIG_GLOBAL [ServerSignature] configuration directive. This results in a minimal error report. (HTML=

ERROR 404  -  The requested resource could not be found.
)

A simple approach to providing a site-specific (look-and-feel) to server reports is to customize the [ServerReportBodyTag] WASD_CONFIG_GLOBAL configuration directive. Using this directive report page background colour, background image, text and link colours, etc., may be specified for all reports. It is also possible to more significantly change the report format and contents (within some constraints), without resorting to the site-specific mechanisms referred to below, by changing the contents of the appropriate WASD_CONFIG_MSG [status] item. This should be undertaken with care. (HTML/OFF) (ERROR 404\BOLD) - The requested resource could not be found. (HTML/ON) (Site Specific\hd_error_reporting_site)

Customized error reports can be generated for all or selected HTTP status codes associated with errors reported by the server using the WASD_CONFIG_GLOBAL [ErrorReportPath] and WASD_CONFIG_SERVICE [ServiceErrorReportPath] configuration directives. To explicitly handle all error reports specify the path to the error reporting mechanism (see description below) as in the following example. [ErrorReportPath] /httpd/-/reporterror.shtml

To handle only selected error reports add the HTTP status codes following the report path. In this example only 403 and 404 errors are explicitly handled, the rest remain server-generated. This is particularly useful for static error documents. [ErrorReportPath] /httpd/-/reporterror.shtml 403 404

To exclude selected error reports (and handle all others by default) add the HTTP status codes preceded by a hyphen following the report path. In this example 401 and 500 errors are server-generated. [ErrorReportPath] /httpd/-/reporterror.shtml -401 -500
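
The three forms of the directive (no codes, listed codes, hyphen-prefixed codes) amount to a simple membership test, sketched here in Python. The function names are hypothetical, and giving listed codes priority when both kinds appear on one line is an assumption, since that combination is not described above.

```python
def parse_error_report_path(value):
    """Split an [ErrorReportPath] value into path, included and excluded codes."""
    path, *codes = value.split()
    included = {int(c) for c in codes if not c.startswith("-")}
    excluded = {int(c[1:]) for c in codes if c.startswith("-")}
    return path, included, excluded


def uses_report_path(status, included, excluded):
    """True when the status is handled by the report path, not the server."""
    if included:
        return status in included
    return status not in excluded


for value in ("/httpd/-/reporterror.shtml",
              "/httpd/-/reporterror.shtml 403 404",
              "/httpd/-/reporterror.shtml -401 -500"):
    path, included, excluded = parse_error_report_path(value)
    handled = [s for s in (401, 403, 404, 500) if uses_report_path(s, included, excluded)]
    print(value, "->", handled)
```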

Site-specific error reporting works by internal redirection. When an error is reported the original request is concluded and the request reconstructed using the error report path before internally being reprocessed. For SSI and CGI script handlers error information becomes available via a specially-built query string, and from that as CGI variables in the error report context. One implication is the original request path and query string are no longer available. All error information must be obtained from the error information in the new query string.

It is suggested with any use of this facility the reporting document(s) be located somewhere local, probably WASD_ROOT:[RUNTIME.HTTPD], and then enabled by placing the appropriate path into the [ErrorReportPath] configuration directive. [ErrorReportPath] /httpd/-/reporterror.shtml

Note that virtual services can subsequently have this path mapped to other documents (or even scripts) so that some or all services may have custom error reports. For instance the following arrangement provides each host (service) with a customized error report. # WASD_CONFIG_GLOBAL [ErrorReportPath] /errorreport.shtml # WASD_CONFIG_MAP [[alpha.example.com]] pass /errorreport.shtml /httpd/-/alphareport.shtml [[beta.example.com]] pass /errorreport.shtml /httpd/-/betareport.shtml [[gamma.example.com]] pass /errorreport.shtml /httpd/-/gammareport.shtml (Using Static HTML Documents\hd_error_reporting_html)

Static HTML documents are a good choice for site-specific error messages. They are very low overhead and are easily customizable. One per possible response error status code is required. Including a (!UL) in the error report path introduces the response status code into the file path, providing a report path that includes a three digit number representing the HTTP status code. A file for each possible or configured code must then be provided, in this example for 403 (authorization failure), 404 (resource not found) and 502 (bad gateway/script). [ErrorReportPath] /httpd/-/reporterror!UL.html 403 404 502
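
The substitution can be pictured with a few lines of Python. This only imitates the effect of (!UL) on the configured path; the helper name is hypothetical.

```python
def expand_report_path(template, status):
    """Replace the first !UL in the report path with the numeric status code."""
    return template.replace("!UL", str(status), 1)


for code in (403, 404, 502):
    print(expand_report_path("/httpd/-/reporterror!UL.html", code))
```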

This mapping will generate paths such as the following, and requires the three files shown to be present in order to respond to those errors. /httpd/-/reporterror403.html /httpd/-/reporterror404.html /httpd/-/reporterror502.html (Using an SSI Document\hd_error_reporting_ssi)

SSI documents provide the versatility of dynamic report generation but they do take time and CPU for processing, and this may be a significant consideration on busy sites.

Three example SSI error report documents are provided. (HTML=

  1. WASD_ROOT:[EXAMPLE]REPORTERROR1.SHTML
    Provides a report identical with those internally generated in versions prior to v7.0.

  2. WASD_ROOT:[EXAMPLE]REPORTERROR2.SHTML
    This is a minor variation, showing how the format may be easily customized.

  3. WASD_ROOT:[EXAMPLE]REPORTERROR3.SHTML
    This version has a radically different format and content, with much less specific error information (which some administrators may consider advantageous). When generated these reports look something like this.

  4. WASD_ROOT:[EXAMPLE]REPORTERROR4.SHTML
    This example uses the report format provided with WASD v7.0 and later, and looks something like this.

  5. WASD_ROOT:[EXAMPLE]REPORTERROR5.SHTML
    This is another variation, showing how the format may be easily customized. When generated this report looks something like this.
) (HTML/OFF) See WASD_ROOT:[EXAMPLE]REPORTERROR*.SHTML The first providing a report identical with those internally generated, the second a small variation on this, and the third considerably different and with much less specific error information (which some administrators may consider advantageous). (HTML/ON)

The following SSI variables are available specifically for generating error reports. The () statement near the top of the file may be uncommented to view all SSI and CGI variables available.

((Error Variables\BOLD))

(2\20) (Variable\Description) (ERROR_LINE\ The HTTPd source code line from where the error was generated.) (ERROR_MODULE\ The HTTPd source code module corresponding to the line described above.) (ERROR_REPORT\ A single HTML string providing a detailed error message.) (ERROR_REPORT2\ A single HTML comment providing more detailed VMS error information if available) (ERROR_REPORT3\ A server-generated HTML string providing a brief explanation of the error if available) (ERROR_STATUS_CLASS\ Essentially the single hundreds digit from the status code (e.g. 4).) (ERROR_STATUS_CODE\ The HTTP response status code representing the error (e.g. 404).) (ERROR_STATUS_EXPLANATION\ The HTTP response status code descriptive meaning (e.g. (The requested resource could not be found.))) (ERROR_STATUS_TEXT\ The HTTP response status code abbreviated meaning (e.g. (Not Found)).) (ERROR_STATUS_TYPE\ (basic) or (detailed).) (FORM_ERROR_\ A series of CGI variables providing the sources for the above SSI variables, as well as other general environment information.) (Using a Script\hd_error_reporting_script)

It is also possible to report using a script. The same error information is available via corresponding CGI variables. The source code (HTML= WASD_ROOT:[SRC.MISC]REPORTERROR.C ) (HTML/OFF) WASD_ROOT:[SRC.MISC]REPORTERROR.C (HTML/ON) provides such an implementation example. (....................................................................) (OPCOM Logging\hd_opcom)

Significant server events may be optionally displayed via a selected operator's console and recorded in the operator log. Various categories of these events may be selectively enabled via WASD_CONFIG_GLOBAL directives ((cr_config_directives)). (UNNUMBERED) Server Administration page directives authentication/authorization (e.g. failures) CLI HTTPd control directives HTTPd events (e.g. startup, exit, SSL private key password requests) proxy file cache maintenance

Some significant server events are always logged to OPCOM if any one of the above categories is enabled. (....................................................................) (Access Logging\hd_access_logging)

WASD provides a versatile access log, allowing data to be collected in Web-standard (common) and (combined) formats, as well as allowing customization of the log record format. It is also possible to specify a log period. If this is done log files are automatically changed according to the period specified.

Where multiple access log files are generated with per-instance, per-period and/or per-service logging (see below) these can be merged into single files for administrative or archival purposes using the CALOGS utility.

The Quick-and-Dirty LOG STATisticS utility can be used to provide elementary ad hoc log analysis from the command-line or CGI interface.

Exclude requests from specified hosts using the [LogExcludeHosts] configuration parameter. (Log Format\hd_log_format)

The configuration parameter [LogFormat] and the server qualifier /FORMAT specify one of three pre-defined formats, or a user-definable format. Most log analysis tools can process the three pre-defined formats. There is a small performance impost when using the user-defined format, as the log entry must be specially formatted for each request. (UNNUMBERED) (COMMON -\BOLD) This is the most common, base logging format for Web servers. COMMON is the default log format. (COMMON_SERVER -\BOLD) This is an optional format used, for one, by the NCSA server. It is basically the common format, with the server host name appended to the line (used for multi-homed servers, see (hd_virtual_services)). (COMBINED -\BOLD) This is an optional format used, for one again, by the NCSA server. It too is basically the common format, with the HTTP referer and user agent appended. (User-Defined\hd_log_user_defined)

The user-defined format allows customised log formats to be specified using a selection of commonly required data. The specification must begin with a character that is used as a substitute when a particular field is empty (use "\0" for no substitute, as in the "windows log format" example below).

Two different "escape" characters introduce the following parameters:

((A (!) followed by\BOLD))

(2\10) (Characters\Description) (AR\authentication realm (if any)) (AU\authenticated user name (if any)) (BB\bytes in body (excludes response header)) (BQ\quadword bytes in response (includes header)) (BY\bytes in response (includes header)) (CA\client address) (CN\client host name (or address if DNS lookup disabled)) (CP\client port) (EM\request elapsed time in milliseconds) (ES\request elapsed time in fractional seconds) (ID\session track ID) (ME\request method) (PA\request path (not to be confused with (RQ))) (PR\request URL (includes protocol scheme)) (QS\request query string (if any)) (RF\referer (if any)) (RQ\complete request string (see below)) (RS\response status code) (SN\server host name) (SC\script name (if any)) (SM\request scheme (http: or https:)) (SP\server port) (TC\request time (common log format)) (TG\request time (GMT)) (TV\request time (VMS format)) (UA\user agent) ((A () followed by\BOLD))
(2\10) (Character\Description) (0\a null character (used to define the empty field character)) (!\insert an (!)) (\insert a ()) (n\insert a newline) (q\insert a quote (so that in DCL the quotes won't need escaping!)) (t\insert a TAB)
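
How a user-defined specification expands into a log line can be sketched in Python. This is an illustration under assumptions, not the server's formatter: the function name and the sample field values are hypothetical, a backslash is taken to be the second escape character (as the examples in this section suggest), unknown two-letter codes are treated as empty fields, and a null escape inside the body is assumed to insert nothing.

```python
# Assumed meanings of the backslash escapes from the table above.
ESCAPES = {"0": "", "!": "!", "\\": "\\", "n": "\n", "q": '"', "t": "\t"}


def format_log_entry(spec, fields):
    """Expand a user-defined log format against a dict of field values."""
    if not spec:
        return ""
    # The leading character (or \0 for none) stands in for empty fields.
    if spec.startswith("\\0"):
        empty, body = "", spec[2:]
    else:
        empty, body = spec[0], spec[1:]
    out = []
    i = 0
    while i < len(body):
        ch = body[i]
        if ch == "!" and len(body) - i >= 3:
            value = fields.get(body[i + 1:i + 3])
            out.append(empty if value in (None, "") else str(value))
            i += 3
        elif ch == "\\" and len(body) - i >= 2:
            out.append(ESCAPES.get(body[i + 1], body[i + 1]))
            i += 2
        else:
            out.append(ch)  # any other character is copied as-is
            i += 1
    return "".join(out)


common = r'-!CN - !AU [!TC] \q!RQ\q !RS !BY'
request = {
    "CN": "client.example.com",
    "AU": "",
    "TC": "13/Oct/1997:10:15:00 +0000",
    "RQ": "GET /index.html HTTP/1.0",
    "RS": 200,
    "BY": 1234,
}
print(format_log_entry(common, request))
```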

Any other character is directly inserted into the log entry. ((PA) and (RQ)) The (PA) and (RQ) have distinct roles. In general the (RQ) (request) directive will always be used as this is the full request string; script component (if any), path string and query string component (if any). The (PA) directive is merely the path string after any script and query string components have been removed. (Examples\log_format_examples) (NUMBERED) The equivalent of the common log format is: -!CN - !AU [!TC] \q!RQ\q !RS !BY The combined log format could be specified as: -!CN - !AU [!TC] \q!RQ\q !RS !BY \q!RF\q \q!UA\q The (O'Reilly WebSite) (windows log format) would be created by: \0!TC\t!CA\t!SN\t!AR\t!AU\t!ME\t!PA\t!RQ\t!EM\t!UA\t!RS\t!BB\t The common log format with appended request duration in seconds could be provided using: -!CN - !AU [!TC] \q!RQ\q !RS !BY !ES (....................................................................) (Log Per-Period\hd_log_period)

The access log file may have a period specified against it, causing a new log file to be generated automatically for each period. This allows logs to be systematically named, ordered and kept to a manageable size. This is also known as log rotation. The period specified can be one of (UNNUMBERED) HOURLY DAILY weekly as MONDAY TUESDAY WEDNESDAY THURSDAY FRIDAY SATURDAY SUNDAY MONTHLY

The log file changes on the first request after the new period begins.

When using a periodic log file, the file name specified by WASD_CONFIG_LOG or the configuration parameter [LogFile] is partially ignored, only partially because the directory component of it is used to locate the generated file name. The periodic log file name generated comprises (UNNUMBERED) server host name server port year (YYYY) month (MM) day (DD) hour (HH, only present when HOURLY period is configured) as in the following example WASD_LOGS:WASD_80_19971013_ACCESS.LOG
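The composition of a periodic log file name can be sketched in Python as follows. This is only an illustration of the naming described in this section, not the server's own code. The function names are hypothetical, and placing the hour (HH) directly after the day for the HOURLY period is an assumption. The date chosen for each period follows the rules given in this section (request date for daily, the most recent occurrence of the named day for weekly, the first of the month for monthly).

```python
from datetime import datetime, timedelta

WEEKDAYS = ["MONDAY", "TUESDAY", "WEDNESDAY", "THURSDAY",
            "FRIDAY", "SATURDAY", "SUNDAY"]


def period_stamp(when: datetime, period: str) -> str:
    """Return the date portion of the log name for the configured period."""
    period = period.upper()
    if period == "HOURLY":
        # Assumption: HH is appended directly after YYYYMMDD.
        return when.strftime("%Y%m%d%H")
    if period == "DAILY":
        return when.strftime("%Y%m%d")
    if period == "MONTHLY":
        return when.replace(day=1).strftime("%Y%m%d")
    if period in WEEKDAYS:
        # Step back to the previous (or current) named weekday.
        days_back = (when.weekday() - WEEKDAYS.index(period)) % 7
        return (when - timedelta(days=days_back)).strftime("%Y%m%d")
    raise ValueError(f"unknown period: {period}")


def periodic_log_name(directory: str, host: str, port: int,
                      when: datetime, period: str) -> str:
    """Compose directory, host, port and period date into a log file name."""
    return f"{directory}{host}_{port}_{period_stamp(when, period)}_ACCESS.LOG"


if __name__ == "__main__":
    # A request on Wednesday 15 October 1997 with a weekly MONDAY period.
    print(periodic_log_name("WASD_LOGS:", "WASD", 80,
                            datetime(1997, 10, 15, 9, 30), "MONDAY"))
```

With a weekly period of MONDAY, a request on Wednesday 15 October 1997 is logged to the file dated 19971013, which matches the example name above.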

For the daily period the date represents the request date. For the weekly period it is the date of the previous (or current) day specified. That is, if the request occurs on a Wednesday for a weekly period specified by Monday, the log date shows that of the preceding Monday. For the monthly period it uses the first day of the month. (....................................................................) (Log Per-Service\hd_log_per_service)

By default a single access log file is created for each HTTP server process. Using the [LogPerService] configuration directive a log file for each service provided by the HTTPd is generated ((hd_virtual_services)). The [LogNaming] format can be any of "NAME" (default) which names the log file using the first period-delimited component of the IP host name, "HOST" (which uses as much of the IP host name as can be accommodated within the maximum 39 character filename limitation under ODS-2), or "ADDRESS" which uses the full IP host address in the name. Both HOST and ADDRESS have hyphens substituted for periods in the string. If these are specified then by default the service port follows the host name component. This may be suppressed using the [LogPerServiceHostOnly] directive, allowing a minimum extra 3 characters in the name, and combining entries for all ports associated with the host name (for example, a standard HTTP service on port 80 and an SSL service on port 443 would have entries in the one file). (....................................................................) (Log Per-Instance\hd_log_per_instance)

To reduce physical disk activity, and thereby significantly improve performance, the RMS characteristics of the logging stream are set to buffer records for as long as possible and only write to disk when buffer space is exhausted (a periodic flush ensures records from times of low activity are written to disk). However when multiple server processes (either in the case of multiple instances on a single node, single instance on each of multiple clustered nodes, or a combination of the two) have the same log files open for write then this buffering and deferred write-to-disk is disabled by RMS, which insists that all records must be flushed to disk for correct serialization and coherency.

This introduces measurable latency and a potentially significant bottleneck to high-demand processing. Note that it only becomes a real issue under load. Sites with a low load should not experience any impact.

Sites that may be affected by this issue can revert to the original buffered log stream by enabling the [LogPerInstance] configuration directive. This ensures that each log stream has only one writer by creating a unique log file for each instance process executing on the node and/or cluster. It does this by appending the node and process name to the file type. This would change the log name from something like WASD_LOGS:131-185-250-202_80_ACCESS.LOG to, in the case of a two-instance single node, WASD_LOGS:131-185-250-202_80_ACCESS.LOG_KLAATU_HTTPD-80 WASD_LOGS:131-185-250-202_80_ACCESS.LOG_KLAATU_HTTPE-80

(Of course the number-of and naming-of log files is beginning to become a little intimidating at this stage!\BOLD) To assist with managing this seeming plethora of access log files is the calogs utility, which allows multiple log files to be merged whilst keeping the records in timestamp order. (....................................................................) (Log Naming\hd_log_naming)

When per-period or per-service logging is enabled the access log file has a specific name generated. Part of this name is the host's name or IP address. By default the host name is used, however if the host IP address is specified the literal address is used, hyphens being substituted for the periods. Accepted values for the [LogNaming] configuration directive are: (UNNUMBERED) ADDRESS HOST NAME (default)
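As a rough illustration, the host component produced by each [LogNaming] value might be derived as in the following Python sketch. The function name is hypothetical and this is not the server's implementation. Upper-casing the host name is an assumption drawn from the example file names in this section, and the truncation needed to fit the ODS-2 file name limit is not modelled.

```python
def log_host_component(naming: str, host_name: str, host_address: str) -> str:
    """Return the host part of a generated log file name."""
    naming = naming.upper()
    if naming == "NAME":
        # Default: first period-delimited component of the host name.
        return host_name.split(".")[0].upper()
    if naming == "HOST":
        # Host name with hyphens substituted for periods.
        return host_name.replace(".", "-").upper()
    if naming == "ADDRESS":
        # Literal IP address with hyphens substituted for periods.
        return host_address.replace(".", "-")
    raise ValueError(f"unknown [LogNaming] value: {naming}")


if __name__ == "__main__":
    for naming in ("ADDRESS", "HOST", "NAME"):
        part = log_host_component(naming, "www.example.com", "131.185.250.202")
        print(f"WASD_LOGS:{part}_80_ACCESS.LOG")
```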

Examples of generated per-service (non-per-period) log names: WASD_LOGS:131-185-250-202_80_ACCESS.LOG WASD_LOGS:WWW-EXAMPLE-COM_80_ACCESS.LOG WASD_LOGS:WASD_80_ACCESS.LOG

Examples of generated per-period (with/without per-service) log names: WASD_LOGS:131-185-250-202_80_19971013_ACCESS.LOG WASD_LOGS:WWW-EXAMPLE-COM_80_19971013_ACCESS.LOG WASD_LOGS:WWW_80_19971013_ACCESS.LOG

Examples of generated per-instance (per-service and per-period) log names: WASD_LOGS:131-185-250-202_80_ACCESS.LOG_KLAATU_HTTPD-80 WASD_LOGS:WWW-EXAMPLE-COM_80_ACCESS.LOG_KLAATU_HTTPD-80 WASD_LOGS:WASD_80_ACCESS.LOG_KLAATU_HTTPD-80 WASD_LOGS:131-185-250-202_80_19971013_ACCESS.LOG_KLAATU_HTTPD-80 WASD_LOGS:WWW-EXAMPLE-COM_80_19971013_ACCESS.LOG_KLAATU_HTTPD-80 WASD_LOGS:WWW_80_19971013_ACCESS.LOG_KLAATU_HTTPD-80 (....................................................................) (Access Tracking\hd_log_tracking)

The term (access tracking) describes the ability to follow a single user's accesses through a particular site or group of related sites. This is accomplished by setting a unique cookie in a user's browser. This cookie is then sent with all requests to that site. The site detects the cookie's unique identifier, or token, and includes it in the access log, allowing the user's route through the site or sites to be reviewed. Note that a browser must have cookies enabled for this mechanism to operate.

WASD access tracking is controlled using the [Track...] directives. The tracking cookie uses an opaque, nineteen character string as the token (e.g. (ORoKJAOef8sAAAkuACc)). This token is spatially and temporally completely unique, generated the first time a user's browser accesses the site. This token is by default added to the server access log in the common format (remote-ID) location. It can also be placed into custom logs. From this identifier in the logs a session's progress may be easily tracked. (Note that the token contains nothing related to the user's actual identity!\BOLD) It is merely a unique identifier that tags a single browser's access trail through a site.

The [Track] directive enables access tracking on a per-server basis. By default all non-proxy services will then have tracking enabled. Individual services may then be disabled (or enabled in the case of proxy services) using the per-service (;notrack) and (;track) parameters.

By default a session track token expires when the user closes the browser. To encourage the browser to keep this token between uses enable multi-session tracking using the [TrackMultiSession] directive. Note that browsers may dispose of any cookie at any time resources become scarce, and that users can also remove them.

Session tracking can be extended from the default of the local server (virtual if applicable) to a group of servers within a local domain. This means the same, initial identifier appears in the logs of all WASD servers in a related group of hosts. Of course tracking must be enabled on all servers. The host grouping is specified using the [TrackDomain] directive (this follows the general rules governing cookie domain behaviour - see RFC2109). Most host groupings require (a minimum of three dots\BOLD) in the specification. For example (note the leading dot) .site.org.domain

which would match the following servers, (curly.site.org.domain), (larry.site.org.domain), (moe.site.org.domain), etc. Sites in top-level domains (e.g. (edu), (com), (org)) need only specify a minimum of two periods. (Access Alert\hd_log_alert)

It is possible to mark a path as being of specific interest. When this is accessed by a request the server puts a message into the server process log and, perhaps of greater immediate utility, the increase in alert hits is detected by HTTPDMON and this (optionally) provides an audible alert allowing immediate attention. This is enabled on a per-path basis using the SET mapping rule. Variations on the basic rule allow some control over when the alert is generated.

(SIMPLE) ALERT - at the conclusion of the request ALERT=MAP - immediately after mapping (early) ALERT=AUTH - when (any) authorization has been performed ALERT=END - at the conclusion of the request (default) ALERT=(integer) - see below NOALERT - suppress alert for this path
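For the ALERT=(integer) form, the comparison against the final response status can be sketched as follows. This is only an illustration in Python, not the server's implementation. The function name is hypothetical, and it assumes that a value ending in 99 (such as 499 or 599) stands for the whole status category while any other value must match exactly.

```python
def alert_status_matches(alert: int, status: int) -> bool:
    """Decide whether a final HTTP status should raise the alert."""
    if alert % 100 == 99:
        # Assumption: 499 means any 4xx status, 599 any 5xx status.
        return status // 100 == alert // 100
    # Otherwise the status must equal the configured integer.
    return status == alert


if __name__ == "__main__":
    print(alert_status_matches(404, 404))  # exact match
    print(alert_status_matches(599, 503))  # category match
```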

The special case ALERT=(integer) allows a path to be alerted if the final response HTTP status is the same as the integer specified (e.g. 501, 404) or within the category specified (599, 499). (--------------------------------------------------------------------) (Security Considerations\cr_config_securing)

This section does not pretend to be a complete guide to keeping the (bad guys) out. It does provide a short guide to making a site more-or-less liberal in the way the server supplies information about the site and itself. The reader is also strongly encouraged to consult a number of hard copy and Web based resources on this topic.

The WASD package had its genesis in making the VMS operating system and associated resources, in a development environment, available via Web technology. For this reason configurations can be made fairly liberal, providing information of use in a technical environment, but that may be superfluous or less-than-desirable in other, possibly commercial environments. For instance, directory listings can contain VMS file system META information, error reports can be generated with similar references along with reporting source code module and line information.

The example configuration files contain a fairly restrictive set of directives. When relaxing these recommendations keep in mind that the more information available about the underlying structure of the site the more potential for subversion. Do not enable functionality that contributes nothing to the fundamental usefulness of the site, or that has the real potential to compromise any given site. This section refers to configuration directives discussed in more detail in later chapters.

It is established wisdom that the only secure computing system is one with no users and no access, that system security is inversely proportional to system usability, and that making something idiot-proof results in only idiots using it. So there are some trade-offs but (don't think it can't happen to you!) A systematic investigation of installed WASD packages by well-known IT professional Jean-loup Gailly during September 2002 revealed a couple of significant implementation flaws which compounded by notable instances of sloppy management practices on two public sites resulted in site compromise (one was mine). (....................................................................) (UNNUMBERED) (HTML=

  • WASD_ROOT:[DOC.MISC]WASD_ADVISORY_020925.TXT) (HTML/OFF) WASD_ROOT:[DOC.MISC]WASD_ADVISORY_020925.TXT (HTML/ON) (HTML=

  • http://online.securityfocus.com/archive/1/293229) (HTML/OFF) http://online.securityfocus.com/archive/1/293229 (HTML/ON) (....................................................................)

    This research has resulted in these server flaws being closed and package security considerations being extensively reviewed. As a result WASD v8.1 was much more resistant to such penetration than previous releases (and slightly less easy to use, but that's one of those trade-offs). My assessment would be that if Gailly did not find it then it wasn't there to find!

    Of course any given site's security is a function of the underlying package's security profile, with the site's implementation of that, AND other considerations such as local authorization and script implementations. Pay particular and ongoing attention to site security and integrity. (....................................................................) (Recommended Package Security\hd_securing_package)

    The following table provides recommended file protection settings for package top-level directories. Subdirectories share their parents' settings. The package tree is owned by the SYSTEM account. Directories with world READ access have no ACLs. Other directories, not accessible to the world but sometimes having other degrees of access to one or more accounts, always have rights identifiers (see below) and associated ACLs to control directory access, and to propagate required access to files created beneath them. The server selectively enables SYSPRV to provide access to some of these areas (e.g. for log creation).

    Some pre-v8.1 directories are not included in this table. These are not significant in versions from 8.1 onwards and may be deleted. They can continue to exist however and the security procedures described below ensure that they comply with the general post-8.1 security model. The file access permissions indicated below are for directory contents. The directory files themselves have settings appropriate for content access.

    ((Package Access\BOLD))

  • (4\15\10\10) (Directory\AccessWorld\AccessOther\Description) ([AXP-BIN]\none\script:RE\Alpha executable script files) ([AXP]\none\none\Alpha build and utility area) ([CGI-BIN]\none\script:RE\architecture-neutral script files) ([DOC]\read\(world)\package documentation) ([EXAMPLE]\read\(world)\package examples) ([EXERCISE]\read\(world)\package test files) ([HTTP$NOBODY]\none\script:RWED\scripting account default home area) ([HTTP$SERVER]\none\server:RWED\server account default home area) ([IA64-BIN]\none\script:RE\Itanium executable script files) ([IA64]\none\none\Itanium build and utility area) ([INSTALL]\read\(world)\installation, update and security procedures) ([LOCAL]\none\none\site configuration files) ([LOG]\none\none\site access logs) ([LOG_SERVER]\none\server:RWED\server process (SYS$OUTPUT) logs) ([RUNTIME]\read\(world)\graphics, help files, etc.) ([SCRATCH]\none\script:RWED\working file space for scripts) ([SCRIPT]\none\none\example architecture-neutral scripts) ([SRC]\none\(world)\package source files) ([STARTUP]\none\server:RE\package startup procedures) ([VAX-BIN]\none\script:RE\VAX executable script files) ([VAX]\none\none\VAX build and utility area)

    It is recommended site-specific directories have settings applied appropriate to their function in comparison to similar package directories. See below for tools to assist in this.

    Three rights identifiers provide selective access control to the directory tree. Identifiers were used to allow maximum flexibility for a site in allowing required accounts access to either execute the server or execute scripts. Non-default account names only need to be granted one of these identifiers to be provided with that role's access. Installation, update and/or security utilities create and maintain these identifiers appropriately.

    ((Rights Identifiers\BOLD))

    (2\15) (Identifier\Description) (WASD_HTTP_SERVER\Indicates the default server account.) (WASD_HTTP_NOBODY\Indicates the default scripting account.) (WASD_IGNORE_THIS\Looked for by the SECHAN utility to avoid it changing security on site-specific files.)

    These rights identifiers are applied to directories and files to provide the required level of access. The following example shows the security setting of the top-level CGI-BIN.DIR and one of its content files. $ DIRECTORY /SECURITY CGI-BIN.DIR Directory WASD_ROOT:[000000] CGI-BIN.DIR;1 [SYSTEM] (RWED,RWED,,) (IDENTIFIER=WASD_HTTP_SERVER,ACCESS=EXECUTE) (IDENTIFIER=WASD_HTTP_NOBODY,ACCESS=EXECUTE) (IDENTIFIER=*,ACCESS=NONE) (IDENTIFIER=WASD_HTTP_NOBODY,OPTIONS=DEFAULT,ACCESS=READ+EXECUTE) (IDENTIFIER=*,OPTIONS=DEFAULT,ACCESS=NONE) (DEFAULT_PROTECTION,SYSTEM:RWED,OWNER:RWED,GROUP:,WORLD:) Total of 1 file. $ DIRECTORY /SECURITY [CGI-BIN]CGI_SYMBOLS.COM Directory WASD_ROOT:[CGI-BIN] CGI_SYMBOLS.COM;1 [SYSTEM] (RWED,RWED,,) (IDENTIFIER=WASD_HTTP_NOBODY,ACCESS=READ+EXECUTE) (IDENTIFIER=*,ACCESS=NONE) Total of 1 file. (....................................................................) (Maintaining Package Security\hd_securing_maintain)

    As noted above, WASD version 8.1 and later is much more conservative in what it makes generally available from the package tree, and a site administrator now has to take extraordinary measures to open up certain sections, making it a much more difficult and deliberate action. The package installation, update and security procedures and their associated utilities should always be used to ensure that the installed package continues to conform to the security baseline.

    Package security may be (refreshed) or reapplied at any time, and this should be done periodically to ensure that an installed package has not inadvertently been opened to access where it shouldn't have. Of course this is not a guarantee that any given site is secure. Site security is a function of many factors; package vulnerabilities, site configuration, deployed scripts, cracker determination and expertise, etc., etc. What refreshing the security baseline does is provide a known secure (and WASD-community scrutinized) starting point. It should be used as part of a well considered site security maintenance program. (SECURE.COM\hd_config_install_secure)

    The following DCL procedure resets the package security baseline. $ @WASD_ROOT:[INSTALL]SECURE.COM

    It guides the administrator through a number of stages (UNNUMBERED) introductory notes server account scripting account package tree security settings

    of which each one may be declined. After all of these steps it searches for and executes if found the DCL procedure WASD_ROOT:[INSTALL]SECURE.COM. The intent of this file is to allow a site to automatically update any site-specific security settings (and of course modify any set by the main procedure). (SECHAN Utility\hd_config_sechan)

    The SECHAN utility (pronounced (session)) is used by SECURE.COM and the associated procedures to make file system security settings. It is also available for direct use by the site administrator.

    One of the more useful functions of SECHAN is applied using the /IGNORE qualifier. (UNNUMBERED) (/IGNORE - \BOLD) It adds an ACE containing the rights identifier WASD_IGNORE_THIS to the target file(s) which results in security settings not being applied in the future. When applying settings the SECHAN utility first checks whether a file has this ACE and if so ignores the file. This is an effective method for isolating site-specific settings from changes by this utility. $ SECHAN /IGNORE WASD_ROOT:[CGI-BIN]MY_SCRIPT.COM $ SECHAN /IGNORE WASD_ROOT:[LOCAL]*.DAT $ SECHAN /IGNORE WEB:[DATA...]*.* $ SECHAN /IGNORE WEB:[000000]DATA.DIR

    This ACE can be removed from a file (leaving other entries of any ACL intact) using the /NOIGNORE qualifier. This makes the file(s) subject to the SECHAN utility again. $ SECHAN /NOIGNORE WASD_ROOT:[CGI-BIN]MY_SCRIPT.COM $ SECHAN /NOIGNORE WASD_ROOT:[LOCAL]*.DAT (/ALL - \BOLD) This overrides the default behaviour of ignoring files that have been tagged using the /IGNORE qualifier. It causes the setting to be applied to ALL files.

    Other functionality may prove useful when applied to local parts of the package or web structure. (UNNUMBERED) (/PACKAGE - \BOLD) Used alone this qualifier results in the entire WASD_ROOT:[000000...] tree being traversed and the default package security settings applied to all package files. Top-level directories that the utility does not recognise as belonging to the package are ignored. $ SECHAN /PACKAGE $ SECHAN /PACKAGE /ALL (/ASIF=() - \BOLD) Set the supplied file specification as if it was the specified, top-level WASD directory. This allows a site-specific directory to have the same security settings applied as the specified WASD package directory. $ SECHAN /ASIF=LOCAL WEB:[DATA...]*.* $ SECHAN /ASIF=LOCAL WEB:[000000]DATA.DIR $ SECHAN /ASIF=CGI-BIN WEB:[SCRIPTS]*.* $ SECHAN /ASIF=CGI-BIN WEB:[000000]SCRIPTS.DIR $ SECHAN /ASIF=DOC WEB:[HTML...]*.* $ SECHAN /ASIF=DOC WEB:[000000]HTML.DIR (/NOSCRIPT - \BOLD) Modifies the default behaviour of the /PACKAGE qualifier. This changes the default rights identifiers applied to ACEs on files in the [CGI-BIN] and [AXP-BIN]/[VAX-BIN] directories to disallow scripting until manually changed by site administration. $ SECHAN /PACKAGE /NOSCRIPT

    This section provides only a basic description. More detail may be found in the prologue to the source code. (Independent Package and Local Resources\hd_securing_separate)

    Not only does it make it easier to manage site content but it is also good security practice to keep server package and site content completely separate ((hd_site_organisation)).

    This can also be applied to scripts, both source and build areas. Keep your business logic out of the package source tree and potentially prying eyes. The script executables themselves (can) be placed into the package scripting directories but should be built independently from these and copied using locally maintained DCL procedures from build into scripting areas (the WASD_ROOT:[INSTALL]SECURE.COM procedures described above may be useful here). (Configuration\hd_securing_config)

    Various configuration and mapping directives can be used to make the site environment more or less liberal in the information it implicitly can provide. (Directory Listings\hd_securing_config_dir)

    Published guidelines for securing a Web site generally advise against automatic directory listing generation. Where a home page is not available this may leak information on other directory contents, provide parent and child directory access, etc. Compounding this is the WASD facility to (force) a listing by providing a directory URL with file wildcards (not to decry the usefulness in some environments). (UNNUMBERED) ([DirAccess] - \BOLD) Make (disabled) to completely remove the ability to generate directory listings under any circumstances. Setting to (selective) means a directory listing is (only\BOLD) available if the directory contains a file named .WWW_BROWSABLE. When made (enabled) a directory listing may be produced any time the directory contains no home (welcome) page. ([DirWildcard] - \BOLD) Make (disabled) so that requests cannot (force\BOLD) a directory listing by supplying a URL containing a wildcard file part (when enabled this is provided regardless of whether a home page exists or not). ([DirMetaInfo] - \BOLD) Make (disabled) to prevent directory listing pages from containing, as HTML META tags, information about the directory, most significantly the VMS file specification for the URL path!

    The mapping rule (SET DIR=(keyword)) can be used to change this on a per-path basis ((hd_map_rule_set)).
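The way the [DirAccess] and [DirWildcard] settings combine can be modelled roughly as in the Python sketch below. It is a reading of the descriptions above, not the server's logic, and the function and parameter names are hypothetical. In particular it assumes that (selective) still requires the .WWW_BROWSABLE file when a wildcard is used to force a listing, which the text does not state.

```python
def listing_allowed(dir_access: str, dir_wildcard: bool,
                    has_home_page: bool, has_browsable_file: bool,
                    wildcard_request: bool) -> bool:
    """Model whether a directory listing would be generated."""
    mode = dir_access.lower()
    if mode == "disabled":
        # No listings under any circumstances.
        return False
    if mode == "selective" and not has_browsable_file:
        # Listing only where a .WWW_BROWSABLE file is present.
        return False
    if wildcard_request:
        # A forced listing ignores the home page, if wildcards are permitted.
        return dir_wildcard
    # Otherwise a listing is produced only when there is no home page.
    return not has_home_page


if __name__ == "__main__":
    # Conservative settings: selective access, wildcards disabled.
    print(listing_allowed("selective", False, has_home_page=False,
                          has_browsable_file=True, wildcard_request=False))
```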

    (Conservative recommendation: \BOLD) Set ([DirAccess] selective) allowing listing for directories containing a file named (.WWW_BROWSABLE), disable [DirMetaInfo] and [DirWildcard]. (Server Reports\hd_securing_config_reports)

    Reports are pages generated by the server, usually to indicate an error or other non-success condition, but sometimes to indicate success (e.g. after a successful file upload). Reports provide either basic or detailed information about the situation. Sometimes the detailed information includes VMS file system details, system status codes etc. To limit this information to a minimum indication adjust the following directives. (UNNUMBERED) ([ReportBasicOnly] - \BOLD) Make (enabled) to limit the quantity of information to the minimum required to advise of the situation. Such reports give only the HTTP status code and brief explanation of the code's meaning. Note that this can also be done on a per-path basis using mapping rules. ([ReportMetaInfo] - \BOLD) Make (disabled) to exclude information on the server software, source code module and line number initiating the report. META information may also contain VMS file or system specific information. ([ServerSignature] - \BOLD) Make (disabled) to prevent the inclusion of server software, host and port information as a footer to a report.

    The mapping rule (SET REPORT=(keyword)) can be used to change some of these on a per-path basis ((hd_map_rule_set)).

    (Conservative recommendation: \BOLD) Provide minimal error information by enabling [ReportBasicOnly] and disabling [ReportMetaInfo]. Enable [ServerSignature] to provide a slightly more friendly report (server software can easily be obtained from the response header anyway). (Scripting\config_securing_config_script)

    If a static site is all that's required this source of compromise can simply be avoided. (UNNUMBERED) ([Scripting] - \BOLD) Setting this to (disabled) prevents all scripting entirely. This includes DCL CGI and CGIplus, DECnet-based OSU and CGI, and SSI DCL ((<!--#dcl -->), (<!--#exec -->), etc.).

    (Conservative recommendation: \BOLD) Only deploy scripts your site will actually be using. Remove all the files associated with any other scripts. Do not allow obsolete script environments to remain active. Be proactive.

    Also see (hd_securing_scripting). (Server Side Includes\hd_securing_reports_ssi)

    SSI documents are pages containing special markup directives interpreted by the server and replaced with dynamic content. This can include detail about the server, the file or files making up the document, and can even include DCL commands and procedure activation for supplying content into the page. All this by anyone who can author on the site. (UNNUMBERED) ([SSI] - \BOLD) Setting this to (disabled) prevents all Server Side Include processing completely. ([SSIexec] - \BOLD) Setting this to (disabled) disallows pages from invoking DCL to supply content for the page. WASD provides a number of levels of this and the reader is referred elsewhere in this and other documents for further information of what can and cannot be done, and by whom, in these processes.

    The mapping rule (SET SSI=(keyword)) can be used to change some of this on a per-path basis ((hd_map_rule_set)).

    (Conservative recommendation: \BOLD) Disable [SSIexec]. (Scripting\hd_securing_scripting)

    Scripting has been a notorious source of server compromise, particularly within Unix environments where script process shell command-line issues require special attention. The WASD CGI scripting interface does not pass any arguments on the command line, and is careful not to allow substitution when constructing the CGI environment. Nevertheless, script behaviours cannot be guaranteed and care should be exercised in their deployment (ask me!)

    It is strongly recommended to execute scripts in an account distinct from that executing the server. This should also mean that the accounts are not members of the same group nor should it be a member of any other group. This minimises the risk of both unintentional and malicious interference with server operation through either Inter-Process Communication (IPC) or scripts manipulating files used by the server. The PERSONA facility can be used to further differentiate script activities. See (Scripting Overview) for further detail.

    The default WASD installation creates two such accounts, with distinct UICs, usernames and home directory space. Nothing should be assumed or read into the scripting account username - it's just a username.

    ((Default Accounts\BOLD))

    (2\15) (Username\Description) (HTTP$SERVER\Server Account) (HTTP$NOBODY\Scripting Account)

    During startup the server checks for the existence of the default scripting account and automatically configures itself to use this for scripting. If it is not present it falls back to using the server account. Other account names can be used if the startup procedures are modified accordingly. The default scripting username may be overridden using the /SCRIPT=AS=() qualifier (also see the (Scripting Overview)). (Authorization\hd_securing_auth)

    Authorization issues imply controlling access to various resources and actions and therefore require careful planning and implementation if compromise is to be avoided. WASD has a quite capable and versatile authorization and authentication environment, with a significant number of considerations.

    WASD authorization cannot be enabled without the administrator configuring at least three resources, and so cannot easily be (accidentally) activated. One of these is the addition of a startup qualifier controlling where authentication information may be sourced. Another is the server configuration file. The third is the mapping of paths against authorization configuration.

    For sites that may be particularly sensitive about inadvertent access to some resources it is possible to use the authorization configuration file as a type of (cross-check) on the mapping configuration file. The server /AUTHORIZATION=ALL startup qualifier forces all access to be authorized (even if some are marked (none)). This means that if something (escapes) via the mapping file it will very likely be (caught) by an absence in the authorization file. (Miscellaneous Issues\hd_securing_misc)

    Although it is of limited usefulness because server identity may be deduced from behaviour and other indicators the exact server and version may be obscured by using the otherwise undocumented /SOFTWARE= qualifier to change the server identification string to (basically) whatever the administrator desires. This identification is included as part of all HTTP response headers.

    Historically and by default server configuration and authorization sources are contained within the server package tree. There is no reason why they cannot be located anywhere the site prefers. Generally all that is required is a change to logical name definition and server startup. (Package Tree\hd_securing_tree)

    Version 8.1 and later is much more conservative in what it makes available of the package tree via the server. The package installation, update and security procedures and their associated utilities should always be used to ensure that the installed package continues to conform to the security baseline. See (hd_securing_maintain).

    Furthermore, with many sites there may be little need to access all, or any, of the WASD package tree. A combination of mapping and/or authorization rules can relatively simply block or control access to it. These examples can be easily tailored to suit a site's specific requirements.

    This example shows blocking all access to the /wasd_root/ tree, except for documentation, source code, examples and exercise (performance results) areas. # WASD_CONFIG_MAP pass /wasd_root/doc/* pass /wasd_root/src/* pass /wasd_root/example/* pass /wasd_root/exercise/* fail /wasd_root/*

    The next example forbids all access to the package tree unless authorized (the authorization detail would vary according to the site). It also allows modify access for the Server Administration page and to the /wasd_root/local/ area. # WASD_CONFIG_MAP pass /wasd_root/* # WASD_CONFIG_AUTH [WASD_WEB_ADMIN=id] /httpd/-/admin/* r+w /wasd_root/local/* r+w /wasd_root/* r (Be careful!) There are often multiple paths to a single resource. For instance, it is of little significance blocking access to say /wasd_root/doc/ if it's also possible to access it via /doc/.

    The following example shows how this might occur. # WASD_CONFIG_MAP fail /wasd_root/doc/* pass /* /wasd_root/*
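The effect of the two rules above can be demonstrated with a small Python model of rule processing. This is a simplified sketch, not the server's mapping engine. It assumes rules are evaluated top to bottom, that the first matching pass or fail rule ends processing, that every template ends in a single trailing wildcard, and that a path matching no rule is refused. The function name and rule representation are hypothetical.

```python
def map_path(rules, path):
    """Return the mapped path, or None if the request is refused."""
    for action, template, *result in rules:
        prefix = template[:-1]  # drop the trailing '*'
        if not path.startswith(prefix):
            continue
        if action == "fail":
            return None
        if result:
            # Substitute the wildcard portion into the result template.
            return result[0][:-1] + path[len(prefix):]
        return path
    return None  # assumption: no matching rule means no access


RULES = [
    ("fail", "/wasd_root/doc/*"),
    ("pass", "/*", "/wasd_root/*"),
]

if __name__ == "__main__":
    print(map_path(RULES, "/wasd_root/doc/index.html"))  # blocked by the fail rule
    print(map_path(RULES, "/doc/index.html"))            # still reaches the doc area
```

In this model the request for /doc/index.html never matches the fail rule, yet it is mapped to the same /wasd_root/doc/ resource the fail rule was meant to protect.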

    Authorization rules can be used to effectively block access to any VMS file specification (it cannot be done during mapping because the translation from path to file system is not performed until mapping is complete). # WASD_CONFIG_AUTH if (path-translated:WASD_ROOT:[DOC]*) * none

    or to selectively allow access # WASD_CONFIG_AUTH [[WASD_VMS_RW=id]] if (path-translated:WASD_ROOT:[DOC]*) * read (....................................................................) (Site Attacks\hd_config_attacks)

    This is not a treatise on Web security and the author is not a security specialist. This is some general advice based on observation. There is little one can do at the server itself to reduce a concerted attack against a site. Common objectives of such attacks include the following (not an exhaustive list). (Platform Vulnerabilities\hd_config_attacks_platform)

    Where a general attack is directed against a specific platform (a combination of operating system and Web server software). Often these are due to widespread infection of systems, meaning many attacks are being launched from a large number of systems (often without the system owners' knowledge or cooperation).

    WASD, and OpenVMS in particular, are generally unaffected by such attacks because they are not Microsoft or Unix based. The impact is then reduced to nuisance-value traffic as the site is probed by the (sometimes very large number of) source systems. (Site Vulnerabilities\hd_config_attacks_site)

    Where a specific attack is made against a site in an attempt to exploit a known vulnerability associated with that platform or environment.

    These are perhaps the most worrying, although the (security-by-obscurity) element works in favour of WASD and OpenVMS in this case. Neither is as common as other platforms and therefore neither receives as much attention. (Denial of Service\hd_config_attacks_dos)

    (DOS) Usually comprises flooding a site with requests in an effort to consume all available network or server resources, making it unavailable for legitimate use.

    These can be insidious, flooding network equipment as well as systems. Attempts at control are best undertaken at the periphery of the network (routers) although concerted attacks can succeed against the best prepared network. (Password Cracking\hd_config_attacks_crack)

    Where a systematic attempt to break into one or more accounts is undertaken. These are often repeated, dictionary-based password-guessing attacks.

    WASD's authentication functionality notes successive password validation failures and after a reasonable number disables all access via the username for a constantly extended period. Passwords stop being checked and so a dictionary-based attack cannot succeed. Password validation failures can be recorded via OPCOM. (Authorization Holes\hd_config_attacks_auth)

    Knowing of or searching for resources that should be controlled by authorization but are not.

    WASD's /AUTHORIZATION=ALL functionality may assist here ((hd_securing_auth)). (Strategies\hd_config_attacks_strategies)

    There are a few strategies for reducing the load on a server experiencing a generalized attack or probing. These can also be used to (discourage) the source from considering the site an easy target. Unfortunately most require request acceptance and at least some processing before taking action. The general idea is to identify either the source site or some characteristic of the request that indicates it could not possibly be legitimate. Most platform-specific attacks have such a signature. For instance attacks against Microsoft platforms often involve probes for backdoors into non-server executables. These can be identified by the path containing strings such as (/winnt/), (/system32/), (/cmd.exe) or variations on them. This style will be used in examples below. (UNNUMBERED) If the source IP address is known then the [Reject] (and/or [Accept]) configuration directives can be used to reject the request connection very early in the processing. The source agent receives a message about access being rejected. [Reject] 131.185.250.* the.host.name

    Mapping rules in combination with conditionals may be used to redirect the request. This redirection could be to another, non-existent site, in the hope that the source agent will use the supplied URL and thus divert some activity away from the local site.

        if (remote-host:the.host.name)
           redirect * http://the.host.name/*
        endif
        redirect **/winnt/** http://does.not.exist/

    Mapping rule redirection can also be used to just (drop) the connection without any further interaction or processing. The source agent receives no response, just a broken connection.

        if (remote-addr:131.185.250.*)
           pass * "000 just drop it!"
        endif
        pass **/system32/** "000 just drop it!"

    The (hiss) facility returns a stream of random alpha-numeric characters (a sort of (white-noise)). No response header is provided. Such a response might at best cause the source agent some distress (perhaps disabling it) or at least dissuade it from continuing with more probes (as the target is obviously not a Web server ;-)

        if (remote-addr:131.185.250.*) map * /hiss/*
        script /hiss/* /hiss/*

        map **/cmd.exe** /hiss/*/cmd.exe*
        script /hiss/* /hiss/*

    (--------------------------------------------------------------------) (String Matching\cr_string_matching)

    Matching of strings is a pervasive and important function within the server. Two types are supported; wildcard and regular expression. Wildcard matching is generally much less expensive (in CPU cycles and time) than regular expression matching and so should always be used unless the match explicitly requires otherwise. WASD attempts to improve the efficiency of both by performing a preliminary pass to make simple matches and eliminate obvious mismatches using a very low-cost comparison. This either matches or doesn't, or encounters a pattern matching meta-character which causes it to undertake full pattern matching.

    To assist with the refinement of string matching patterns the Server Administration facility has a report item named (Match). This report allows the input of target and match strings and allows direct access to the server's wildcard and regular expression matching routines. Successful matches show the matching elements and a substitution field ((hd_string_matching_subs)) allows resultant strings to be assessed.

    To determine what string match processing is occurring during request processing in the running server use the (match) item available from the Server Administration WATCH Report. (Wildcard Patterns\hd_string_matching_wildcard)

    Wildcard patterns are simple, low-cost mechanisms for matching a string to a template. They are designed to be used in path and authorization mapping to compare a request path to the root (left-hand side) or a template expression.

    ((Wildcard Operators\BOLD))

    (2\20) (Expression\Purpose) (*\Match zero or more characters (non-greedy)) (**\Match zero or more characters (greedy)) (%\Match any one character)

    Wildcard matching uses the '*' and '%' symbols to match any zero or more, or any one character respectively. The '*' wildcard can either be greedy or non-greedy depending on the context (and for historical reasons). It can also be forced to be greedy by using two consecutive ('**'). By default it is not greedy when matching request paths for mapping or authentication, and is greedy at other times (matching strings within conditional testing, etc.) (Greedy and Non-Greedy\hd_string_matching_non_greedy)

    Non-greedy matching attempts to match an asterisk wildcard up until the first character that is not the same as the character immediately following the wildcard. It matches a minimum number of characters before failing. Greedy matching attempts to match all characters up until the first string that does not match what follows the asterisk.
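    The difference between the two behaviours can be modelled in a few lines of Python. This is only a sketch of the behaviour as described in this section, not WASD's implementation: the function names are hypothetical, and it assumes that a non-greedy asterisk commits to the first place the following literal text is found, while a greedy double asterisk is allowed to try later positions.

```python
def _chunk_matches(chunk, text):
    """True if text starts with chunk ('%' in chunk matches any one character)."""
    if len(text) < len(chunk):
        return False
    return all(c == "%" or c.lower() == t.lower() for c, t in zip(chunk, text))


def wild_match(pattern, text):
    """Simplified model of '*' (non-greedy), '**' (greedy) and '%' matching."""
    if not pattern:
        return not text
    if pattern.startswith("**"):
        rest = pattern[2:]
        # Greedy: consume as much as possible, backing off until the rest fits.
        return any(wild_match(rest, text[i:]) for i in range(len(text), -1, -1))
    if pattern[0] == "*":
        rest = pattern[1:]
        if not rest:
            return True  # a trailing wildcard matches whatever remains
        chunk = rest.split("*", 1)[0]  # literal text up to the next wildcard
        for i in range(len(text) + 1):
            if _chunk_matches(chunk, text[i:]):
                # Non-greedy: commit to the first occurrence, never retry.
                return wild_match(rest[len(chunk):], text[i + len(chunk):])
        return False
    if text and (pattern[0] == "%" or pattern[0].lower() == text[0].lower()):
        return wild_match(pattern[1:], text[1:])
    return False


target = "non-greedy character matching compared to greedy character matching"
print(wild_match("*non-greedy character*matching", target))
print(wild_match("*non-greedy character**matching", target))
```

    In this model the single-asterisk pattern stops at the first (matching) and then finds unmatched text left over, whereas the double-asterisk pattern keeps looking and succeeds on the final (matching).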

    To illustrate; using the following string

        non-greedy character matching compared to greedy character matching

    the following non-greedy pattern

        *non-greedy character*matching

    does not match but the following greedy pattern

        *non-greedy character**matching

    does match. The non-greedy one failed as soon as it encountered the space following the first (matching) string, while the greedy pattern continued to match, eventually encountering a string matching the string following the greedy wildcard. (Regular Expressions\hd_string_matching_regex)

    Regular expression matching is case insensitive (in line with other WASD behaviour) and uses the POSIX EGREP pattern syntax and capabilities. It offers significant but relatively expensive functionality. One of those expenses is expression compilation. WASD attempts to minimise this by pre-compiling expressions during server startup whenever feasible. Regular expression matching must be enabled using the [RegEx] WASD_CONFIG_GLOBAL directive; regular expressions are then differentiated from wildcard patterns by a leading (^) character.
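    The way a leading (^) selects regular expression matching, and the benefit of compiling once, can be sketched in Python. This is an illustration under stated assumptions rather than a description of the server's code: Python's re module stands in for POSIX EGREP (the syntaxes differ in places), the function names are hypothetical, and a memoising cache stands in for pre-compilation at startup.

```python
import re
from functools import lru_cache


def _wildcard_regex(pattern):
    """Translate '*' and '%' wildcards into an equivalent Python regex."""
    body = "".join(
        ".*" if c == "*" else "." if c == "%" else re.escape(c) for c in pattern
    )
    return re.compile(body, re.IGNORECASE | re.DOTALL)


@lru_cache(maxsize=None)
def compile_pattern(pattern):
    """Compile once and reuse; a leading '^' marks a regular expression."""
    if pattern.startswith("^"):
        # The first '^' is only the marker and is not part of the expression.
        return "regex", re.compile(pattern[1:], re.IGNORECASE)
    return "wildcard", _wildcard_regex(pattern)


def is_match(pattern, target):
    kind, compiled = compile_pattern(pattern)
    if kind == "regex":
        return compiled.search(target) is not None   # unanchored unless ^ or $ used
    return compiled.fullmatch(target) is not None    # wildcard covers whole string


agent = "Mozilla/5.0 (X11; rv:1.9) Gecko/2010"
print(is_match("^^Mozilla.*Gecko", agent))   # marker '^' plus start-of-string '^'
print(is_match("Mozilla*Gecko*", agent))     # wildcard form
```

    Note how a regular expression that must match from the start of the string ends up with two consecutive (^) characters, one as the marker and one as the anchor.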

    A detailed tutorial on regular expression capabilities and usage is well beyond the scope of this document. Many such hard-copy and on-line documents are available. (HTML=

    http://en.wikipedia.org/wiki/Regular_expression
    ) (HTML/OFF) http://en.wikipedia.org/wiki/Regular_expression (HTML/ON)

    This summary is only to serve as a quick mnemonic. WASD regular expressions support the following set of operators.

    ((Operator Overview\BOLD))

    (2\20) (Description\Usage) (Match-self Operator\Ordinary characters.) (Match-any-character Operator\.) (Concatenation Operator\Juxtaposition.) (Repetition Operators\* + ? {}) (Alternation Operator\|) (List Operators\[...] [^...]) (Grouping Operators\(...)) (Back-reference Operator\Backslash followed by a digit) (Anchoring Operators\^ $) (Backslash Operator\Escape meta-character; i.e. ^ . $ [ )

    The following operators are used to match one, or in conjunction with the repetition operators more than one, character of the target string. These single and leading characters are reserved meta-characters and must be escaped using a leading backslash if required as a literal character in the matching pattern. (Note\BOLD) that this does not apply to the (range) hyphen; to include a hyphen in a range ensure the character is the first or last in the range.

    ((Matching Operators\BOLD))

    (2\20) (Expression\Purpose) (^\Match the beginning of the line) (.\Match any character) ($\Match the end of the line) (|\Alternation (or)) ([abc]\Match only a, b or c) ([^abc]\Match anything except a, b and c) ([a-z0-9]\Match any character in the range a to z or 0 to 9)

    Repetition operators control the extent, or number, of whatever the matching operators match. These are also reserved meta-characters and must be escaped using a leading backslash if required as a literal character.

    ((Repetition Operators\BOLD))

    (2\20) (Expression\Function) (*\Match 0 or more times) (+\Match 1 or more times) (?\Match 1 or zero times) ({n}\Match exactly n times) ({n,}\Match at least n times) ({n,m}\Match at least n but not more than m times) (Examples\hd_string_matching_examples)

    The following provides a series of examples as they might occur in use for server configuration.

    (NUMBERED) Equivalent functionality using wildcard and regular expression patterns. Note that (Mozilla) must be at the start of the string, with the regular expression using the start-of-string anchor resulting in two consecutive (^)s, one indicating to WASD a regular expression, the other being part of the expression itself.

        if (user-agent:Mozilla*Gecko*)
        if (user-agent:^^Mozilla.*Gecko)

    This shows path matching using equivalent wildcard and regular expression matching. Note the requirement to use the regular expression (grouping) parentheses to provide the substitution elements, something provided implicitly with wildcard matching.

        map /*/-/* /wasd_root/runtime/*/*
        map ^/(.+)/-/(.+) /wasd_root/runtime/*/*

    This rather contrived regular expression example has no equivalent capability available with wildcard matching. It forbids the use of any path that contains any character other than alpha-numerics, the hyphen, underscore, period and forward-slash.

        pass ^[^-_./a-z0-9]+ "403 Forbidden character in path!"

    (Expression Substitution\hd_string_matching_subs)

    Expression substitution is available during path mapping ((cr_mapping_rule)). Both wildcard (implicitly) and regular expressions (using (grouping) operators) note the offsets of matched portions of the strings. These are then used for wildcard and (specified) wildcard substitution where result strings provide for this (e.g. mapping 'pass' and 'redirect' rules). A maximum of nine such wildcard substitutions is supported (one other, the zeroth, is the full match). (Wildcard Substitution\hd_string_matching_subs_wild)

    With wildcard matching each asterisk wildcard contained in the pattern ((template) string) has matching characters in the (target) string noted and stored. Note that for the percentage (single character) wildcard no such storage is provided. These characters are available for substitution using corresponding wildcards present in the (result) string. For instance, the target string

        this is an example target string

    would be matched by the pattern string

        * is an example target *

    as containing two matching wildcard strings

        this
        string

    which could be substituted using the result string

        * is an example result *

    producing the resultant string

        this is an example result string

    (Regular Expression Substitution\hd_string_matching_subs_regex)

    With regular expression matching the groups of matching characters must be explicitly specified using the (grouping) parenthesis operator. Hence with regular expression matching it is possible to match many characters from the target string without retaining them for later substitution. Only if a match is designated as a substitution source do the matching characters become available for substitution via any result string. As an example, the two target strings

        this is an example target string
        this is a contrived target string

    would both be matched by the regular expression

        ^^([a-z]*) is [a-z ]* target ([a-z]*)$

    which, though it contains three regular expressions in the pattern, has grouping parentheses around only two, and so makes only those two matching strings available for substitution

        this
        string

    which could be substituted using the result string

        * is the final result *

    producing the resultant string

        this is the final result string

    (Specified Substitution\hd_string_matching_subs_spec)

    By default the strings matched by wildcard or grouping operators are substituted in the same order in which they are matched. This order may be changed by specifying which wildcard string should be substituted where. Not all matched (and stored) strings need to be substituted. Some may be omitted and the contents effectively ignored.
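    Both the default ordering and a specified ordering can be modelled with a short Python sketch. It is illustrative only and makes several assumptions: the function names are hypothetical, each pattern asterisk is treated as a capturing group, the double asterisk is not handled, and a result wildcard immediately followed by an apostrophe and a digit (the specified substitution syntax described in the next paragraph) selects that stored element without advancing the default sequence.

```python
import re


def wildcard_to_regex(pattern):
    """Translate a wildcard pattern so that each '*' stores what it matched."""
    parts = []
    for ch in pattern:
        if ch == "*":
            parts.append("(.*)")       # stored for later substitution
        elif ch == "%":
            parts.append(".")          # matched but not stored
        else:
            parts.append(re.escape(ch))
    return re.compile("".join(parts), re.IGNORECASE | re.DOTALL)


def substitute(result, match):
    """Fill result wildcards from the stored elements of a successful match."""
    out = []
    next_group = 1                     # default order: first '*' gets element 1
    i = 0
    while i < len(result):
        if result[i] != "*":
            out.append(result[i])
            i += 1
            continue
        specified = (
            result[i + 1:i + 2] == "'"
            and i + 2 < len(result)
            and result[i + 2] in "0123456789"
        )
        if specified:
            index = int(result[i + 2])  # element 0 is the full match
            i += 3
        else:
            index = next_group
            next_group += 1
            i += 1
        # An element with no matching string substitutes an empty string.
        out.append((match.group(index) or "") if index <= match.re.groups else "")
    return "".join(out)


def apply_rule(pattern, result, target):
    match = wildcard_to_regex(pattern).fullmatch(target)
    return None if match is None else substitute(result, match)


target = "this is an example target string"
pattern = "* is an example target *"
print(apply_rule(pattern, "* is an example result *", target))
print(apply_rule(pattern, "*'2 is an example result", target))
```

    The first call substitutes the stored strings in the order matched; the second uses only the second stored string and discards the first.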

    The specified substitution syntax is a result wildcard followed by a single-apostrophe (') and a single digit from zero to nine (0 to 9). The zeroth element is the full matching string. Element one is the first matching part of the expression, on through to the last. Specifying an element that had no matching string substitutes an empty string (i.e. nothing is added). Using the same target string as in the earlier wildcard example

        this is an example target string

    and matched by the wildcard pattern string

        * is an example target *

    when substituted by the result string

        *'2 is an example result

    would produce the resultant string

        string is an example result

    with the string represented by the first wildcard effectively being discarded.

    (--------------------------------------------------------------------) (Conditional Configuration\cr_conditional)

    Request processing (WASD_CONFIG_MAP) and authorization (WASD_CONFIG_AUTH) rules may be conditionally applied depending on request, server or other characteristics. These include (SIMPLE)
        server host name, port
        client IP address and host name
        browser-accepted content-types, character sets, languages, encodings
        browser identification string
        scheme ((http:) or (https:), i.e. is it a secure request?)
        HTTP method (GET, POST, etc.)
        request path, query string, cookie data, referring page
        virtual host:port specified in request header
        system information (hardware, Alpha/VAX, node name, VMS version, etc.)
        local time
        random number generation

    Conditionals may be nested up to a maximum depth of eight, are not case sensitive and generally match via string comparison, although some tests are performed as boolean operations, by converting the conditional parameter to a number before comparison, and IP address parameters will accept a network mask as well as a string pattern. (String Matching\hd_cond_pattern_matching)

    The basis of much conditional decision making is string pattern matching. Both wildcard and regular expression based pattern matching is available ((cr_string_matching)). Wildcard matching in conditional tests is (greedy). Regular expression matching, in common with usage throughout WASD, is differentiated from wildcard patterns using a leading (^) character. (Conditional Syntax\hd_conditional_syntax)

    Conditional expressions and processing flow structures may be used in the following formats. Conditional and rule text may be indented to clarify structure.

        (if ((condition))\BOLD) then apply rest of line

        (if ((condition))\BOLD)
           then apply one or more rules
           up until the corresponding
        (endif\BOLD)

        (if ((condition))\BOLD)
           then apply one or more rules
        (else\BOLD)
           apply one or more other rules
           up until the corresponding
        (endif\BOLD)

        (if ((condition))\BOLD)
           then apply one or more rules
        (elif ((condition))\BOLD)
           apply one or more other rules
           in a sort of case statement
        (else\BOLD)
           a possible default rule or rules
           up until the delimiting
        (endif\BOLD)

    Logical operators are also supported, in conjunction with precedence ordering parentheses, allowing moderately complex compound expressions to be applied in conditionals. (SIMPLE)
        (!\BOLD) logical negation
        (&&\BOLD) logical AND
        (||\BOLD) logical OR

    There are two more conditional structures that allow previous decisions to be reused. These are (unif) and (ifif). The first unconditionally includes rules regardless of the current state of execution. The second resumes execution only if the previous (if) or (elif) expression was true. The (else) statement may also be used after an (unif) to continue only if the previous expression was false. The purpose of these constructs is to allow a single decision statement to include both conditional and unconditional rules.

        (if ((condition))\BOLD)
           then apply one or more rules
        (unif\BOLD)
           apply this block of rules unconditionally
        (ifif\BOLD)
           applied only if the original if expression was evaluated as true
        (unif\BOLD)
           apply another block of rules unconditionally
        (else\BOLD)
           and this block of rules only if the original was false
        (endif\BOLD)

    (CAUTION) Conditional syntax is checked at rule load time (either server startup or reload). Basic errors such as unknown keywords and unbalanced parentheses or structure statements will be detected and reported to the corresponding Admin Menu report and to the server process log. Unless these reports are checked after modifying rule sets, syntax errors may result in unexpected mappings or access. Although the server cannot determine the correct intent of an otherwise syntactically correct conditional, if it encounters an unexpected but detectable condition during processing it aborts the request, supplying an appropriate error message. (Conditional Keywords\hd_conditional_list)

    The following keywords provide a match between the corresponding request or other value and a string immediately following the delimiting colon. White space or other reserved characters may not be included unless preceded by a backslash. The actual value being used in the conditional matching may be observed using the mapping item of the WATCH facility.

    ((Conditional Keywords\BOLD))

    (2\13) (Keyword\Description) (accept:\ Browser-accepted content types as listed in the (Accept:) request header field. Same string as provided in CGI variable HTTP_ACCEPT.) (accept-charset:\ Browser-accepted character sets as listed in the (Accept-Charset:) request header field. CGI variable HTTP_ACCEPT_CHARSET.) (accept-encoding:\ Browser-accepted content encoding as listed in the (Accept-Encoding:) request header field. CGI variable HTTP_ACCEPT_ENCODING.) (accept-language:\ Browser language preferences as listed in the (Accept-Language:) request header field. CGI variable HTTP_ACCEPT_LANGUAGE.) (authorization:\ The raw authorization string from the request header, if any supplied. This could be simply used to test whether it has been supplied or not.) (callout:\ Simple boolean value. If a script callout is in progress (see (Scripting Overview, CGI Callouts)) it is true, otherwise false.) (client_connect_gt:\ An integer representing the current network connections (those currently being processed plus those currently being (kept alive)) for the particular client represented by the current request. If greater than this value returns true, otherwise false. See (hd_client_concurrency). ) (cluster_member:\ If the supplied node name is (perhaps currently) a member of the cluster (if any) the server may be executing on.) (command_line:\ The command line qualifiers and parameters used when the server image was activated.) (cookie:\ Raw cookie data as the text string provided in (Cookie:) request header field. CGI variable HTTP_COOKIE.) (decnet:\ Whether DECnet is active on the system and which version is available. This value will be 0 if not active, 4 if PhaseIV or 5 if PhaseV.) (directory:\ Tests whether the specified directory exists or not. Parameter can be a URI available for mapping by the server or a VMS file-system specification. If no parameter is supplied the request path is mapped to a file-system specification.
As this conditional accesses the file-system it can be (relatively expensive in terms of server latency).) (document_root:\ The DOCUMENT_ROOT CGI variable SET using the (map=root=()) mapping rule.) (file:\ Tests whether the specified file exists or not. Parameter can be a URI available for mapping by the server or a VMS file-system specification. If no parameter is supplied the request path is mapped to a file-system specification. The specification can be a directory. As this conditional accesses the file-system it can be (relatively expensive in terms of server latency).) (forwarded:\ Proxy/gateway host(s) request forwarded by, as specified in request header field (Forwarded:). CGI variable HTTP_FORWARDED.) (host:\ The host (and optionally port) specified in request header (Host:) field. This is used by all modern browsers to provide virtual host information to the server. CGI variable HTTP_HOST.) (instance:\ Used to check whether a particular, clustered instance of WASD is available. See (hd_conditional_robin_instance).) (jpi_username:\ The account username the server is executing as.) (mapped_path:\ The path resulting from mapping (phase 2 if script path involved) from which the path-translated is derived.) (multihome:\ Somewhat specialised conditional that becomes non-null when a client used a different IP address to connect to the service than the one the service is bound to. Is set to the IP address the client used and may be matched using wildcard matching or as a network mask.) (note:\ Ad hoc information (string) provided by the server administrator using the /DO=NOTE= facility (and online equivalent) that can be used to quickly and easily modify rule processing on a per-system or per-cluster basis.) (notepad:\ Information (strings) stored using the SET (notepad=) mapping rule. See (hd_conditional_notepad).) 
(ods:\ Specified as 2 or 5 (Extended File System), or as SRI file name encoding (MultiNet NFS and others) PWK encoding (PATHWORKS 4/5), ADS encoding (Advanced Server / PATHWORKS 6), SMB encoding (Samba - same as ADS). ) (pass:\ A numeric value, 1 or 2, representing the first or second pass (if a script component was parsed) through the path mapping rules. Will be zero at other times. When the server is (reverse-mapping) a file specification will be -1.) (path-info:\ Path specified in the request line. CGI variable PATH_INFO.) (path-translated:\ VMS translation of path-info. Available after rule mapping (i.e. during authorization rule processing).) (query-string:\ Query string specified in request line. Same information as provided in CGI variable QUERY_STRING.) (rand:\ Value from a random number generator. See (hd_conditional_rand).) (redirected:\ If a request has been internally redirected ((hd_map_rule_redirect)) this conditional will be non-zero. Can be used as a boolean or with a digit specified.) (referer:\ URL of refering page as provided in (Referer:) request header field. CGI variable HTTP_REFERER.) (regex:\ Simple boolean value. If configuration directive [RegEx] is enabled (and hence regular expression string matching, (cr_string_matching)) this will be true.) (remote-addr:\ Client IP address. Same as provided as CGI variable REMOTE_ADDR. As with all IP addresses used for conditional testing this may be wildcard string match or network mask expressed as (address)/(mask-length) (see (hd_conditional_host_addresses)).) (remote-host:\ Client host name if name resolution enabled, otherwise the IP address (same as (remote-addr)). CGI variable REMOTE_HOST.) (request:\ Detect the presence of specific or unknown request fields. See (hd_conditional_request).) (request-method:\ HTTP method ((GET), (POST), etc.) specified in the request line. CGI variable REQUEST_METHOD.) (request-scheme:\ Request protocol as (http:) or (https:). CGI variable REQUEST_SCHEME.) 
(request-uri:\ The unescaped request path plus any query-string. CGI variable REQUEST_URI. ) (restart:\ A numeric value, zero to maximum, representing the number of times path mapping has been SET (map=restart). Can be used as a boolean or with a digit specified.) (robin:\ Used to check whether a particular, clustered instance of WASD is available and distribute requests to it using a round-robin algorithm. See (hd_conditional_robin_instance).) (script-name:\ After the first pass of rule mapping (script component resolution), or during authorization processing, any script component of the request URI.) (server-addr:\ The service IP address. CGI variable SERVER_ADDR. This may be wildcard string match or network mask expressed as (address)/(mask-length).) (server_connect_gt:\ An integer representing the current server network connections (those currently being processed plus those currently being (kept alive)). If greater than this value returns true, otherwise false. ) (server_process_gt:\ An integer representing the current server requests in-progress. If greater than this value returns true, otherwise false. ) (server-name:\ The (possibly virtual) server name. This may or may not exactly match any string provided via the (host) keyword. CGI variable SERVER_NAME.) (server-port:\ The (possibly virtual) server port number. CGI variable SERVER_PORT.) (server-protocol:\ (1.1), (1.0), (0.9) representing the HTTP protocol used by the request.) (server-software:\ The server identification string, including the version. For example (HTTPd-WASD/8.0.0 OpenVMS/AXP SSL). CGI variable SERVER_SOFTWARE.) (service:\ This is the composite server name plus port as (server-name):(port). To match an unknown service use (?).) (ssl:\ Simple boolean value. If request is via Secure Sockets Layer then this will be true.) (syi_arch_name:\ System information; CPU architecture of the server system, (Alpha), (Itanium) or (VAX). 
) (syi_hw_name:\ System information; hardware identification string, for example (AlphaStation 400 4/233).) (syi_nodename:\ System information; the node name, for example (KLAATU).) (syi_version:\ System information; VMS version string, for example (V7.3).) (tcpip:\ A string derived from the UCX$IPC_SHR shareable image. It looks something like this (Compaq TCPIP$IPC_SHR V5.1-15 (11-JAN-2001 02:28:33.95)) and comprises the agent (Compaq, MultiNet, TCPware, unknown), the name of the image, the version and finally the link date.) (time:\ Compare to current system time. See (hd_conditional_time).) (trnlnm:\ Translate a logical name. See (hd_conditional_trnlnm).) (user-agent:\ Browser identification string as provided in (User-Agent:) request header field. CGI variable HTTP_USER_AGENT. ) (webdav:\ Simple boolean value. If the request has been identified as WebDAV then this is true. Takes an optional parameter, (MSagent), which is true if a Microsoft WebDAV agent has been detected.) (websocket:\ Simple boolean value. If the request is a WebSocket protocol upgrade then this will be true.) (x-forwarded-for:\ Proxied client name or address as provided in (X-Forwarded-For:) request header field. CGI variable HTTP_X_FORWARDED_FOR.) (Notepad: Keyword\hd_conditional_notepad)

    The (request notepad) is a string storage area that can be used to store and retrieve ad hoc information during path mapping and subsequent authorization processing. The notepad contents can be changed using the SET (notepad=()) or appended to using SET (notepad=+()) ((hd_map_rule_set)). These contents then can be subsequently detected using the (notepad:) conditional keyword (or the obsolescent 'NO' mapping conditional) and used to control subsequent mapping or authorization processing. Notepad information persists across internal redirection processing ((hd_map_rule_redirect)) and so may be used when the regenerated request is mapped and authorized. To prevent such information from unexpectedly interfering with internally redirected requests a (notepad=()) can be used to empty the storage area. (Rand: Keyword\hd_conditional_rand)

    At the commencement of each pass a new pseudo-random number is generated (and therefore remains constant during that pass). The (rand:) conditional is intended to allow some sort of distribution to be built into a set of rules, where each pass (request) generates a different one. The random conditional accepts two parameters, a (modulus) number, which is applied as the modulus of the base number, and a (comparison) number, which is compared to the modulus result.
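    The arithmetic behind the (rand:) test can be sketched in Python. This is a model of the description only, with hypothetical function names: it assumes the base number is any non-negative integer drawn once per pass, that a modulus of less than two is treated as two, and that an omitted comparison number is treated as zero.

```python
import random
from collections import Counter


def new_pass_base():
    """One pseudo-random base number, drawn once and reused for a whole pass."""
    return random.getrandbits(31)


def rand_conditional(base, modulus=0, comparison=0):
    """Model of rand:modulus:comparison evaluated against this pass's base."""
    if modulus < 2:
        modulus = 2        # no usable distribution factor, so treat as 50/50
    return base % modulus == comparison


def choose_block(base):
    """Mirror of an if / elif / else chain testing rand:3:0 and rand:3:1."""
    if rand_conditional(base, 3, 0):
        return "first"
    if rand_conditional(base, 3, 1):
        return "second"
    return "third"


# Over many passes the three blocks should be selected roughly equally often.
print(Counter(choose_block(new_pass_base()) for _ in range(30000)))
```

    Because the base number is fixed for the duration of a pass, exactly one branch of such a chain is taken no matter how many (rand:) tests the rules contain.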

    Hence the following conditional rules

        if (rand:3:0)
           (do this)
        elif (rand:3:1)
           (do this)
        else
           (do this)
        endif

    would pseudo-randomly generate base numbers of 0, 1, 2 and perform the appropriate conditional block. Over a sufficient number of usages this should produce a relatively even distribution of numbers. If the modulus is specified as less than two (i.e. no distribution factor at all) it defaults to 2 (i.e. a distribution of 50%). Hence the following example should be the equivalent of a coin toss.

        if (rand:)
           (heads)
        else
           (tails)
        endif

    (Request: Keyword\hd_conditional_request)

    Looks through each of the lines of the request header for the specified request field and/or value. This may be used to detect the presence of specific or unknown (to the server) request fields. When detecting just the presence of a field, the name alone can be provided

        if (request:"Keep-Alive:*")

    matching any value, or specific values can also be matched for

        if (request:"User-Agent:*Opera*")

    Note that all request fields known to the server have a specific associated conditional keyword (i.e. (user-agent:) for the above example). To determine whether any request fields unknown to the server have been supplied use the (request:) keyword as in the following example.

        if (request:?)
           map * /cgi-bin/unknown_request_notify.com*
        endif

    (Instance: and Robin: Keywords\hd_conditional_robin_instance)

    Both of these conditionals are designed to allow the redistribution of requests between clustered WASD services. They are WASD-aware and so allow a slightly more tailored distribution than perhaps an IP package round-robin implementation might. Each tests for the current operation of WASD on a particular node (using the DLM) before allowing the selection of that node as a target. This can allow some systems to be shutting down or starting up, or have WASD shutdown for any reason, without requiring any extraordinary procedures to allow for the change in processing environment. (Instance:\hd_conditional_instance)

    The instance: directive allows testing for a particular cluster member having a WASD instance currently running. This can allow requests to be redirected or reverse-proxied to a particular system with the knowledge that it should be processed (of course there is a small window of uncertainty as events such as system shutdown and startup occur asynchronously). The behaviour of the conditional block is entirely determinate, based on which node names have a WASD instance and the order of evaluation. Compare this to a similar construct using the robin: directive, as described below.

    This conditional is deployed in two phases. In the first, it contains a comma-separated list of node names (that are expected to have instances of WASD instantiated). In the second, it contains a single node name, allowing that node to be tested. For example.

        if (instance:NODE1,NODE2,NODE3)
           if (instance:NODE1) redirect /* http://node1.domain.name/*?
           if (instance:NODE2) redirect /* http://node2.domain.name/*?
           if (instance:NODE3) redirect /* http://node3.domain.name/*?
           pass * "500 Some sort of logic error!!"
        endif
        pass * "503 No instance currently available!"

    If none of the node names specified in the first phase is currently running a WASD instance the rule returns false, otherwise true. If true, the above example has the conditional block processed, with each of the node names successively tested. If NODE1 has a WASD instance executing it returns true and the associated redirect is performed. The same applies for NODE2 and NODE3. At least one of these would be expected to test true, otherwise the outer conditional established during phase one would have been expected to return false.

    (Robin:\hd_conditional_robin)

    The robin: conditional allows rules to be applied sequentially against specified members of a cluster that currently have instances of WASD running. This is intended to allow a form of load sharing and/or redundancy (not balancing, as no evaluation of the selected target's current workload is performed, see below). As with the instance: directive above, there is a small window of potential uncertainty as events such as system shutdown and startup occur asynchronously and may impact availability between the phase one test and ultimate request distribution.

    This conditional is again used in two phases. The first contains a comma-separated list of node names (that are expected to have instances of WASD instantiated). The second contains a single node name, allowing the selected node (from phase one) to have a rule applied. For example.

        if (robin:VAX1,ALPHA1,ALPHA2,IA64A)
           if (robin:VAX1) redirect /* http://vax1.domain.name/*?
           if (robin:ALPHA1) redirect /* http://alpha1.domain.name/*?
           if (robin:ALPHA2) redirect /* http://alpha2.domain.name/*?
           if (robin:IA64A) redirect /* http://ia64a.domain.name/*?
           pass * "500 Some sort of logic error!!"
        endif
        pass * "503 No round-robin node currently available!"

    In this case round-robining will be made through four node names. Of course these do not have to represent all the systems in the cluster currently available or having WASD instantiated. The first time the 'robin:' rule containing multiple names is called, VAX1 will be selected; the second time ALPHA1, the third ALPHA2, and the fourth IA64A. With the fifth call selection returns to VAX1, with the sixth to ALPHA1, etc. In addition, the selected node name is verified to have an instance of WASD currently running (using the DLM and WASD's instance awareness). If it does not, round-robining is applied again until one is found (if none is available the phase one conditional returns false). This is most significant as it ensures that the selected node should be able to respond to a redirected or (reverse-)proxied request. This is the selection set-up phase.

    Then there is the selection application phase. Inside the set-up conditional other conditionals apply the selection made in the first phase (through simple nodename string comparison). The rule, in the above example a redirect, is applied if that was the node selected.
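The two phases can be pictured with the following Python sketch. It is an illustration under stated assumptions rather than WASD's implementation: the RobinSelector class, the apply_selection function and the is_running callback (standing in for the DLM-based instance check) are all hypothetical names, and the redirect targets are taken from the example above.

```python
class RobinSelector:
    """Hypothetical model of the selection set-up phase (not WASD code)."""

    def __init__(self, nodes):
        self.nodes = list(nodes)   # node names in the order listed in the rule
        self.next_index = 0        # where the next selection starts from

    def select(self, is_running):
        """Return the next listed node with a running instance, else None."""
        for _ in range(len(self.nodes)):
            node = self.nodes[self.next_index]
            self.next_index = (self.next_index + 1) % len(self.nodes)
            if is_running(node):
                return node
        return None                # phase one would return false


def apply_selection(selected, rules):
    """Application phase: simple string comparison against each node name."""
    for node, target in rules:
        if selected == node:
            return target
    return None


# Assume, for demonstration, that ALPHA1 has no instance running.
up = {"VAX1", "ALPHA2", "IA64A"}
rules = [
    ("VAX1", "http://vax1.domain.name/"),
    ("ALPHA1", "http://alpha1.domain.name/"),
    ("ALPHA2", "http://alpha2.domain.name/"),
    ("IA64A", "http://ia64a.domain.name/"),
]
selector = RobinSelector(["VAX1", "ALPHA1", "ALPHA2", "IA64A"])
for _ in range(4):
    selected = selector.select(up.__contains__)
    print(selected, apply_selection(selected, rules))
```

With ALPHA1 treated as unavailable, successive selections skip it and move on to the next listed node, which mirrors the re-application of round-robining described above.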

    During selection set-up unequal weighting can be applied to the round-robin algorithm by including particular node names more than once.

        if (robin:VAX1,ALPHA,VAX2,ALPHA)

    In the above example, the node ALPHA will be selected twice as often as either VAX1 or VAX2 (and, because of the ordering, interleaved with the VAX selections).

    (Time: Keyword\hd_conditional_time)

    The (time:) conditional allows server behaviour to change according to the time of day, week, or even year. It compares the supplied parameter to the current system time in one of three ways.

    1. The supplied parameter is in the form (1200-1759), which should be read as (twelve noon to five fifty-nine PM) (i.e. as a time range in minutes, generalized as (hhmm-hhmm)), where the first is the start time and the second the end time. If the current time is within that range (inclusive) the conditional returns true, otherwise false. If the range doesn't look correct false is always returned.

        if (time:0000-0000)
           (it's midnight)
        elif (time:0001-1159)
           (it's AM)
        elif (time:1200-1200)
           (it's noon)
        else
           (it's PM)
        endif

    2. If the supplied parameter is a single digit it is compared to the VMS day of the week (1-Monday, 2-Tuesday, through 7-Sunday).

        if (time:6 time:7)
           (it's the weekend)
        else
           (it's the working week)
        endif

    3. If the supplied string is not in either of the formats described above it is treated as a string match with a VMS comparison time (i.e. (yyyy-mm-dd hh-mm-ss.hh)).

        if (time:%%%%-05-*)
           (it's the month of May)
        endif

    (Trnlnm: Keyword\hd_conditional_trnlnm)

    The (trnlnm:) conditional dynamically translates a logical name and uses the value. One mandatory and up to two optional parameters may be supplied.

        trnlnm:logical-name[;name-table][:string-to-match]

    The (logical-name) must be supplied; without it false is always returned. If just the (logical-name) is supplied the conditional returns true if the name exists or false if it does not. The default (name-table) is LNM$FILE_DEV. When the optional (name-table) is supplied the lookup is confined to that table. If the optional (string-to-match) is supplied it is matched against the value of the logical and the result returned.

    (Host Addresses\hd_conditional_host_addresses)

    Host names or addresses can be an alphanumeric string (if DNS lookup is enabled), or a dotted-decimal network address, a slash, then a dotted-decimal mask. For example (131.185.250.0/255.255.255.192). This mask has 26 network bits, leaving 6 bits for the host part. It operates by bitwise-ANDing the client host address with the mask, bitwise-ANDing the network address supplied with the mask, then comparing the two results for equality. Using the above example the host 131.185.250.50 would be accepted (50 AND 192 gives 0, which equals the masked network address), but 131.185.250.250 would be rejected (250 AND 192 gives 192). Equivalent notation for this rule would be (131.185.250.0/26).

    (Examples\hd_conditional_examples)

    The following provides a collection of examples of conditional mapping and authorization rules illustrating the use of wildcard matching, network mask matching and the various formats in which the rules may be blocked.

    1. This first example shows an EXEC mapping rule being applied to a path if the request query string contains the string (example).

        if (query-string:*example*) exec /* /cgi-bin/example/*

    2. In this example a block of mapping statements is processed if the virtual service of the request matches that in the conditional, otherwise the block is skipped. Note the indentation to help clarify the structure.

        if (service:the.host.name:80)
           pass /web/* /dka0/the_host_name_web/*
           pass /graphics/* /dka100/graphics/*
           pass * "404 Resource not found."
        endif

    3. In this example a series of tests allows a form of case processing where the first to match will be processed and terminate the matching process. In this case if a match does not occur rule processing continues after the (endif).

        if (service:the.host.name:80)
           pass /web/* /dka0/the_host_name_web/*
        elif (service:next.host.name:80)
           pass /web/* /dka0/next_host_name_web/*
        elif (service:another.host.name:80)
           pass /web/* /dka0/another_host_name_web/*
        endif
        pass /graphics/* /dka100/graphics/*
        pass * "404 Resource not found."

    4. In this (somewhat contrived) example a nested test is used to check (virtual) server name and that the request is being handled via Secure Sockets Layer (SSL) for security. If it is not, an informative message is supplied. The (else) and the quotes are not really required but are included here for illustration.

        if (server-name:the.host.name)
           if (scheme:"https")
              pass /secure/* /dka0/the_host_name_web/secure/*
           else
              pass * /dka0/the_host_name_web/secure/only-via-SSL.html
           endif
        endif

    5. This would be another way to accomplish a similar objective to example 4. This uses a (negation) operator to exclude access to successive mappings if not requesting via SSL.

        if (server-name:the.host.name)
           if (!SSL:)
              pass * /web/secure/only-via-SSL.html
           endif
           pass /secure/* /web/secure/*
           pass /other/* /web/other/*
           pass /web/* /web/web/*
           pass * "404 Resource not found."
        endif

    6. This example shows the use of a compound conditional using the AND and OR operators. It also illustrates the use of a network mask. It will exclude all access to the specified path unless the request is originating from within a specified network (perhaps an intranet) or via SSL.

        if (path:/sensitive/* && !(remote-addr:131.185.250.0/24 || SSL:))
           pass * 404 "Access denied (SSL only)."
        endif

    7. This example illustrates restricting authentication to SSL.

        [[*]]
        ["Your VMS password"=VMS]
        if (!request-scheme:https)
           * r+w,#0
        endif

    8. Logical name translation may be used to dynamically alter the flow of rule interpretation.

        if (trnlnm:HTTPD_EXAMPLE)
           pass /* /example/*
        else
           pass /* /*
        endif

    9. A site administrator's /DO=NOTE= entry can be used to modify rule processing. In this example the contingency of a broken back-end processor has been prepared for, and clients are directed to a document advising of the temporary problem once the administrator enters

        $ HTTPD /DO=NOTE=PROBLEM /ALL

    at the command-line (or via the online equivalent). Note that in this example external clients are provided with the problem advice document while internal clients may still access the back-end for troubleshooting purposes.

        if (note:PROBLEM && !remote-addr:131.185.0.0/16)
           pass /* /problem_with_backend.html
        else
           pass /* /backend/*
        endif

    Of course there are a multitude of possibilities based on this idea! The noted data persists across server startups but does not persist across system startups! (--------------------------------------------------------------------)