SCANMAIL(8)							   SCANMAIL(8)

NAME
       scanmail, testscan -  spam filters

SYNOPSIS
       upas/scanmail  [	 options  ] [ qer-args ] root mail sender system rcpt-
       list

       upas/testscan [ -avd ] [ -p patfile ] [ filename ]

DESCRIPTION
       Scanmail accepts a mail message supplied on standard input,  applies  a
       file  of	 patterns to a portion of it, and dispatches the message based
       on the results.	It exactly replaces the generic queuing command qer(8)
       that is executed from the rc(1) script /mail/lib/qmail in the mail pro‐
       cessing pipeline.  Associated with each pattern is an action  in	 order
       of decreasing priority:

       dump	 the  message  is  deleted  and	 a  log	 entry	is  written to
		 /sys/log/smtpd

       hold	 the message is placed in a queue for human inspection

       log	 a line containing the matching	 portion  of  the  message  is
		 written to a log

       If no pattern matches or only patterns with an action of log match, the
       message is accepted and	scanmail  queues  the  message	for  delivery.
       Scanmail	 meshes	 with  the  blocking facilities of smtpd(6) to provide
       several layers of filtering on  gateway	systems.   In  all  cases  the
       sender  is  notified  that the message has been successfully delivered,
       leaving the sender  unaware  that  the  message	has  been  potentially
       delayed or deleted.
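
       The priority rule above can be sketched as follows.  This is an
       illustration only: the function name and return values are
       hypothetical and are not taken from the scanmail source.

```python
def disposition(matched_actions):
    """Decide the fate of a message from the actions of all matching patterns.

    matched_actions is a set of action names ("dump", "hold", "log").
    The return values are labels invented for this sketch.
    """
    if "dump" in matched_actions:
        return "delete"    # highest priority: message deleted, entry logged
    if "hold" in matched_actions:
        return "hold"      # queued for human inspection
    # No match, or only log matches: the message is queued for delivery.
    return "deliver"
```

       For instance, disposition({"hold", "log"}) yields "hold", while a
       message matching only log patterns is still delivered.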

       Scanmail accepts the arguments of qer(8) as well as the following:

       -c     Save  a  copy of each message in a randomly-named file in direc‐
	      tory /mail/copy.

       -d     Write debugging information to standard error.

       -h     Queue held messages by sending domain name.  The -q option  must
	      specify  a root directory; messages are queued in subdirectories
	      of this directory.  If the -h option is not specified,  messages
	      are  accumulated in a subdirectory of /mail/queue.hold named for
	      the contents of /dev/user, usually none.

       -n     Messages are never held for inspection, but are delivered.  Also
	      known as vacation mode.

       -p filename
	      Read the patterns from filename rather than /mail/lib/patterns.

       -q holdroot
	      Queue  deliverable messages in subdirectories of holdroot.  This
	      option is the same as the	 -q  option  of	 qer(8)	 and  must  be
	      present if the -h option is given.

       -s     Save  deleted messages.	Messages are stored, one per randomly-
	      named file, in subdirectories of /mail/queue.dump named with the
	      date.

       -t     Test  mode.   The	 pattern matcher is applied but the message is
	      discarded and the result is not logged.

       -v     Print the highest priority match.	 This is useful	 with  the  -t
	      option  for testing the pattern matcher without actually sending
	      a message.

       Testscan is the command line version of scanmail.  If filename is miss‐
       ing,  it	 applies  the  pattern	set  to the message on standard input.
       Unlike scanmail, which  finds  the  highest  priority  match,  testscan
       prints  all  matches  in	 the portion of the message under test.	 It is
       useful for testing a pattern set	 or  implementing  a  personal	filter
       using the pipeto file in a user's mail directory.  Testscan accepts the
       following options:

       -a     Print matches in the complete input message

       -d     Enable debug mode

       -v     Print the message after conversion to canonical form (q.v.).

       -p filename
	      Read the patterns from filename rather than /mail/lib/patterns.

   Canonicalization
       Before pattern matching, both programs convert a portion of the message
       header  and  the	 beginning  of	the  message to a canonical form.  The
       amounts of the header and message body processed are set by compile-time
       parameters  in the source files.	 The canonicalization process converts
       letters to lower-case and replaces consecutive spaces, tabs and newline
       characters  with	 a single space.  HTML commands are deleted except for
       the parameters following A HREF, IMG SRC, and  IMG  BORDER  directives.
       Additionally, the following MIME escape sequences are replaced by their
       ASCII equivalents:

		  Escape Seq   ASCII
		  ----------   -----
		       =2e	 .
		       =2f	 /
		       =20    <space>
		       =3d	 =
       and the sequence =<newline> is elided.  Scanmail assembles the  sender,
       destination  domain  and	 recipient  fields  of the command line into a
       string that is subjected to the same canonical  processing.   Following
       canonicalization,  the command line and the two long strings containing
       the header and the message body are passed to the matching  engine  for
       analysis.
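
       A rough approximation of the canonical form, written in Python for
       illustration.  The order of the steps is an assumption, HTML
       stripping is omitted, and none of the names come from the scanmail
       source.

```python
import re

# MIME escape sequences from the table above and their ASCII equivalents.
MIME_ESCAPES = {"=2e": ".", "=2f": "/", "=20": " ", "=3d": "="}

def canonicalize(text):
    """Approximate the canonical form described above (HTML handling omitted)."""
    text = text.replace("=\n", "")    # elide the sequence =<newline>
    text = text.lower()               # convert letters to lower case
    # Replace only the four listed escapes; other =XX sequences are left alone.
    text = re.sub(r"=(?:2e|2f|20|3d)", lambda m: MIME_ESCAPES[m.group(0)], text)
    # Collapse runs of spaces, tabs and newlines into a single space.
    return re.sub(r"[ \t\n]+", " ", text)
```

       Under these assumptions, the input "Visit  HTTP://Example=2Ecom" is
       reduced to "visit http://example.com", so a single lower-case string
       pattern covers many spellings of the same text.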

   Pattern Syntax
       The  matching  engine  compiles	the pattern set and matches it to each
       canonicalized input string.  Patterns are specified  one	 per  line  as
       follows:

	    {*}action: pattern-spec {~~override...~~override}

       On  all lines, a # introduces a comment; there is no way to escape this
       character.

       Lines beginning with * contain a pattern-spec that is a string;  other‐
       wise,  the  pattern-spec is a regular expression in the style of reg‐
       exp(6).  Regular expression matching is many times less efficient  than
       string  matching,  so  it is wiser to enumerate several similar strings
       than to combine them into a regular expression.	The action is  a  key‐
       word  terminated	 by  a	:  and	separated from the pattern by optional
       white-space.  It must be one of the following:

       dump	 if the pattern matches, the message is deleted.   If  the  -s
		 command line option is set, the message is saved.

       hold	 if  the pattern matches, the message is queued in a subdirec‐
		 tory  of  /mail/queue.hold  for  manual  inspection.	 After
		 inspection,  the  queue can be swept manually using runq (see
		 qer(8)) to deliver messages that were inadvertently matched.

       header	 this is the same as the hold action, except  the  pattern  is
		 only  applied	to  the	 message header.  This optimization is
		 useful	 for  patterns	that  match  header  fields  that  are
		 unlikely to be present in the body of the message.

       line	 the  sender and a section of the message around the match are
		 written to the file /sys/log/lines.  The  message  is	always
		 delivered.

       loff	 patterns  of  this type are applied only to the canonicalized
		 command line.	When a match occurs, all  patterns  with  line
		 actions  are  disabled.  This is useful for limiting the size
		 of the log file by excluding  repetitive  messages,  such  as
		 those from mailing lists.

       Patterns	 are  accumulated  into	 pattern sets sharing the same action.
       The matching engine applies the dump pattern set first, then the header
       and  hold pattern sets, and finally the line pattern set.  Each pattern
       set is applied three times: to the canonicalized command line,  to  the
       message	header, and finally to the message body.  The ordering of pat‐
       terns in the pattern file is insignificant.
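
       The order of application might be pictured with the sketch below.
       It assumes plain string patterns only, ignores overrides, and
       assumes each set is tried on all three inputs before the next set
       is considered; all names are hypothetical.

```python
# Order in which the pattern sets are applied, per the paragraph above.
SET_ORDER = ["dump", "header", "hold", "line"]

def line_logging_enabled(pattern_sets, cmdline):
    """A loff pattern matching the command line disables all line patterns."""
    return not any(p in cmdline for p in pattern_sets.get("loff", []))

def first_match(pattern_sets, cmdline, header, body, line_enabled=True):
    """Return (action, region, pattern) for the first match found, else None.

    pattern_sets maps an action name to a list of canonicalized strings.
    """
    regions = [("cmdline", cmdline), ("header", header), ("body", body)]
    for action in SET_ORDER:
        if action == "line" and not line_enabled:
            continue
        for name, text in regions:
            if action == "header" and name != "header":
                continue    # header patterns see only the message header
            for pat in pattern_sets.get(action, []):
                if pat in text:
                    return action, name, pat
    return None
```

       Because the sets are grouped by action before matching, moving a
       line within the pattern file does not change the result.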

       The pattern-spec is a string of characters terminated by a  newline,  #
       or  override  indicator,	 ~~.  Trailing white-space is deleted but pat‐
       terns containing leading or trailing white-space	 can  be  enclosed  in
       double-quote characters.  A pattern containing a double-quote  must  be
       enclosed  in  double-quote  characters,  and each double-quote inside it
       must be preceded by a backslash.  For
       example, the pattern

	    "this is not \"spam\""

       matches the string this is not "spam".  The pattern-spec is followed by
       zero or more override strings.  When the specific pattern matches, each
       override	 is  applied  and if one matches, it cancels the effect of the
       pattern.	 Overrides must be strings; regular expressions are  not  sup‐
       ported.	 Each  override	 is  introduced by the string ~~ and continues
       until a subsequent ~~, # or newline, white-space included.  A ~~	 imme‐
       diately followed by a newline indicates a line continuation and further
       overrides continue on the following line.  Leading white-space  on  the
       continuation line is ignored.  For example,

	       *hold:	sex.com~~essex.com~~sussex.com~~sysex.com~~
			lasex.com~~cse.psu.edu!owner-9fans

       matches all input containing the string sex.com  except  for  messages
       that also contain any of the strings in the override list.  Often it is
       desirable to override a pattern based on the name of  the  sender  or
       recipient.  For this reason, each override pattern is applied to the
       header
       and  the command line as well as the section of the canonicalized input
       containing the matching data.  Thus a pattern matching the command line
       or  the	header searches both the command line and the header for over‐
       rides while a match in the body searches the body, header  and  command
       line for overrides.
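
       The override rules can be sketched as below.  The sketch assumes
       that comments, quoting and line continuations have already been
       dealt with; the function names are hypothetical.

```python
def split_overrides(text):
    """Split 'pattern-spec~~ov1~~ov2' into (pattern_spec, [overrides]).

    text is the part of a pattern line after 'action:'.  Trailing white
    space is dropped from the pattern-spec only; overrides keep theirs.
    """
    spec, *overrides = text.split("~~")
    return spec.rstrip(), [o for o in overrides if o]

def is_overridden(region, overrides, cmdline, header, body):
    """Report whether any override cancels a match found in region.

    region is 'cmdline', 'header' or 'body'.  The command line and the
    header are always searched; the body only when the match was there.
    """
    searched = [cmdline, header]
    if region == "body":
        searched.append(body)
    return any(o in text for o in overrides for text in searched)
```

       With the example above, a message from the 9fans list whose body
       mentions sex.com is not held, because the override
       cse.psu.edu!owner-9fans is found on the command line.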

       The structure of the pattern file and the matching algorithm define the
       strategy for detecting and filtering  unwanted  messages.   Ideally,  a
       hold  pattern  selects a message for inspection and if it is determined
       to be undesirable, a specific dump pattern is added to  delete  further
       instances  of  the  message.  Additionally, it is often useful to block
       the sender by updating the smtpd control file.

       In this regime, patterns with a dump action  generally  match  phrases
       that are likely to be unique.  Patterns that hold a message for inspec‐
       tion match phrases commonly found in undesirable material and occasion‐
       ally  in  legitimate messages.  Patterns that log matches are less spe‐
       cific yet.  In all cases the ability to override a pattern by  matching
       another  string  allows  repetitive messages that trigger the pattern,
       such as mailing lists, to pass the filter after the first one  is  pro‐
       cessed  manually.  The -s option allows deleted messages to be salvaged
       by either manual or semi-automatic review, supporting the specification
       of  more	 aggressive  patterns.	 Finally,  the	utility of the pattern
       matcher is not confined to filtering spam; it  is  a  generally	useful
       administrative  tool  for  deleting inadvertently harmful messages, for
       example, mail loops, stuck senders or viruses.  It is also  useful  for
       collecting or counting messages matching certain criteria.

FILES
       /mail/lib/patterns
	      default pattern file

       /sys/log/smtpd
	      log of deleted messages

       /mail/log/lines
	      file where log matches are logged

       /mail/queue/*
	      directories where legitimate messages are queued for delivery

       /mail/queue.hold
	      directory where held messages are queued for inspection

       /mail/queue.dump/*
	      directory	 where	dumped messages are stored when the -s command
	      line option is specified.

       /mail/copy/*
	      directory where copies of all incoming messages are stored.

SOURCE
       /sys/src/cmd/upas/scanmail

SEE ALSO
       mail(1), qer(8), smtpd(6)

BUGS
       Testscan does not report a match when the body of  a  message  contains
       exactly one line.

								   SCANMAIL(8)