Mail::SpamAssassin::Plugin::TextCat man page on Mageia

Mail::SpamAssassin::Plugin::TextCat(3)  User Contributed Perl Documentation  Mail::SpamAssassin::Plugin::TextCat(3)

NAME
       Mail::SpamAssassin::Plugin::TextCat - TextCat language guesser

SYNOPSIS
	 loadplugin	Mail::SpamAssassin::Plugin::TextCat

DESCRIPTION
       This plugin will try to guess the language used in the message body
       text.

       Use the "ok_languages" directive to set which languages are considered
       okay for incoming mail. If none of the guessed languages is okay, the
       "UNWANTED_LANGUAGE_BODY" rule is triggered.

       The plugin always adds the results to an "X-Language" name-value pair
       in the message metadata. This may be useful as Bayes tokens and can
       also be used in rules for scoring. The results can also be added to
       marked-up messages using "add_header" with the _LANGUAGES_ tag. See
       Mail::SpamAssassin::Conf for details.
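
       A minimal sketch of the "add_header" usage mentioned above. The header
       name "Languages" is an illustrative choice, not something this page
       prescribes; SpamAssassin prefixes added headers with "X-Spam-", so this
       line would be expected to produce an "X-Spam-Languages" header:

       ```
       # Add the guessed languages to every scanned message (spam and ham).
       add_header all Languages _LANGUAGES_
       ```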

       Note: the language cannot always be recognized with sufficient
       confidence.  In that case, no action is taken.

USER OPTIONS
       ok_languages xx [ yy zz ... ]	  (default: all)
	   This option is used to specify which languages are considered okay
	   for incoming mail.  SpamAssassin will try to detect the language
	   used in the message body text.

	   Note that the language cannot always be recognized with sufficient
	   confidence. In that case, no action is taken.

	   The rule "UNWANTED_LANGUAGE_BODY" is triggered if none of the
	   languages detected are in the "ok" list. Note that this is the only
	   effect of the "ok" list. It does not act as a whitelist against any
	   other form of spam scanning.

	   In your configuration, you must use the two- or three-letter
	   language specifier in lowercase, not the English name of the
	   language.  You may also specify "all" if a desired language is not
	   listed, or if you want to allow any language.  The default setting
	   is "all".

	   Examples:

	     ok_languages all	      (allow all languages)
	     ok_languages en	      (only allow English)
	     ok_languages en ja zh    (allow English, Japanese, and Chinese)

	   Note: if there are multiple ok_languages lines, only the last one
	   is used.
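
	   As a sketch of how these pieces might fit together in a local
	   configuration file: the language choices and the score value
	   below are assumptions picked for illustration, not recommended
	   settings.

	   ```
	   loadplugin Mail::SpamAssassin::Plugin::TextCat

	   # Only the last ok_languages line takes effect, so this first
	   # line is overridden by the one that follows.
	   ok_languages en
	   ok_languages en de fr

	   # Hypothetical score for mail in any other detected language.
	   score UNWANTED_LANGUAGE_BODY 2.5
	   ```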

	   Select the languages to allow from the list below:

	   af	- Afrikaans
	   am	- Amharic
	   ar	- Arabic
	   be	- Byelorussian
	   bg	- Bulgarian
	   bs	- Bosnian
	   ca	- Catalan
	   cs	- Czech
	   cy	- Welsh
	   da	- Danish
	   de	- German
	   el	- Greek
	   en	- English
	   eo	- Esperanto
	   es	- Spanish
	   et	- Estonian
	   eu	- Basque
	   fa	- Persian
	   fi	- Finnish
	   fr	- French
	   fy	- Frisian
	   ga	- Irish Gaelic
	   gd	- Scottish Gaelic
	   he	- Hebrew
	   hi	- Hindi
	   hr	- Croatian
	   hu	- Hungarian
	   hy	- Armenian
	   id	- Indonesian
	   is	- Icelandic
	   it	- Italian
	   ja	- Japanese
	   ka	- Georgian
	   ko	- Korean
	   la	- Latin
	   lt	- Lithuanian
	   lv	- Latvian
	   mr	- Marathi
	   ms	- Malay
	   ne	- Nepali
	   nl	- Dutch
	   no	- Norwegian
	   pl	- Polish
	   pt	- Portuguese
	   qu	- Quechua
	   rm	- Rhaeto-Romance
	   ro	- Romanian
	   ru	- Russian
	   sa	- Sanskrit
	   sco	- Scots
	   sk	- Slovak
	   sl	- Slovenian
	   sq	- Albanian
	   sr	- Serbian
	   sv	- Swedish
	   sw	- Swahili
	   ta	- Tamil
	   th	- Thai
	   tl	- Tagalog
	   tr	- Turkish
	   uk	- Ukrainian
	   vi	- Vietnamese
	   yi	- Yiddish
	   zh	- Chinese (both Traditional and Simplified)
	   zh.big5   - Chinese (Traditional only)
	   zh.gb2312 - Chinese (Simplified only)

       inactive_languages xx [ yy zz ... ]	    (default: see below)
	   This option is used to specify which languages will not be
	   considered when trying to guess the language.  For performance
	   reasons, supported languages that have fewer than about 5 million
	   speakers are disabled by default.  Note that listing a language in
	   "ok_languages" automatically enables it for that user.

	   The default setting is:

	   bs cy eo et eu fy ga gd is la lt lv rm sa sco sl yi

	   That list is Bosnian, Welsh, Esperanto, Estonian, Basque, Frisian,
	   Irish Gaelic, Scottish Gaelic, Icelandic, Latin, Lithuanian,
	   Latvian, Rhaeto-Romance, Sanskrit, Scots, Slovenian, and Yiddish.
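
	   For example, a site that receives Lithuanian and Latvian mail
	   could, as a sketch, restate the default list without "lt" and
	   "lv" so that both are considered during guessing:

	   ```
	   # Default inactive list minus lt (Lithuanian) and lv (Latvian).
	   inactive_languages bs cy eo et eu fy ga gd is la rm sa sco sl yi
	   ```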

       textcat_max_languages N (default: 3)
	   The maximum number of languages that may be returned as a guess.
	   If more languages than this qualify, the classification is
	   considered unknown.

       textcat_optimal_ngrams N (default: 0)
	   Ngrams that occur in the input this many times or fewer are
	   removed before comparison.  This can be used to speed up the
	   program for longer inputs.  For shorter inputs, this should be set
	   to 0.

       textcat_max_ngrams N (default: 400)
	   The maximum number of ngrams that should be compared with each of
	   the language models (note that each of those models is used
	   completely).

       textcat_acceptable_score N (default: 1.02)
	   Include in the returned list of languages any language whose score
	   is within a factor of "textcat_acceptable_score" of the
	   best-scoring language.
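
       The four tuning options can be set together. The sketch below simply
       restates the documented defaults; any other values would be
       site-specific assumptions to validate against real mail.

       ```
       textcat_max_languages    3
       textcat_optimal_ngrams   0
       textcat_max_ngrams       400
       textcat_acceptable_score 1.02
       ```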

perl v5.18.1			  2011-0	  Mail::SpamAssassin::Plugin::TextCat(3)