CHECKLINK(1)	      User Contributed Perl Documentation	  CHECKLINK(1)

NAME
       checklink - check the validity of links in an HTML or XHTML document

SYNOPSIS
       checklink  [ options ] uri ...

DESCRIPTION
       This manual page briefly documents the checklink command, also known as
       the W3C Link Checker.

       checklink is a program that reads an HTML or XHTML document, extracts a
       list of anchors and links, and checks that no anchor is defined twice
       and that all the links are dereferenceable, including the fragments. It
       warns about HTTP redirects, including directory redirects, and can
       recursively check part of a web site.

       The program can be used either as a command line tool or as a CGI
       script.

OPTIONS
       This program follows the usual GNU command line syntax, with long
       options starting with two dashes (`--'). A summary of options is
       included below.

       -?, -h, --help
	    Show summary of options.

       -V, --version
	    Output version information.

       -s, --summary
	    Show result summary only.

       -b, --broken
	    Show only the broken links, not the redirects.

       -e, --directory
	    Hide directory redirects - e.g. <http://www.w3.org/TR> ->
	    <http://www.w3.org/TR/>.

       -r, --recursive
	    Check the documents linked from the first one.

       -D, --depth n
	    Check the documents linked from the first one to depth n (implies
	    --recursive).

       -l, --location uri
	    Scope of the documents checked (implies --recursive).  Can be
	    specified multiple times in order to specify multiple recursion
	    bases.  If the URI of a candidate document is downwards relative
	    to any of the bases, it is considered to be within the scope.  If
	    not specified, the default is the base URI of the initial
	    document, for example for
	    <http://www.w3.org/TR/html4/Overview.html> it would be
	    <http://www.w3.org/TR/html4/>.
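
	    The scope rule above can be sketched as follows.  This is only an
	    illustration under assumptions, not the checker's actual code: the
	    function names default_base and in_scope are hypothetical, and
	    "downwards relative" is approximated here by a plain prefix test.

```python
from urllib.parse import urlsplit, urlunsplit


def default_base(uri):
    """Drop the last path segment to get the default recursion base."""
    parts = urlsplit(uri)
    # Keep everything up to and including the final "/" of the path.
    path = parts.path[: parts.path.rfind("/") + 1] or "/"
    return urlunsplit((parts.scheme, parts.netloc, path, "", ""))


def in_scope(candidate, bases):
    """Assumed rule: a candidate is in scope if it starts with any base."""
    return any(candidate.startswith(base) for base in bases)


base = default_base("http://www.w3.org/TR/html4/Overview.html")
print(base)
print(in_scope("http://www.w3.org/TR/html4/struct/links.html", [base]))
print(in_scope("http://www.w3.org/TR/xhtml1/", [base]))
```

	    With the example document from above, the first candidate lies
	    below the base and the second does not.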

       -X, --exclude regexp
	    Do not check links whose full, canonical URIs match regexp.	 Note
	    that this option limits recursion the same way as --exclude-docs
	    with the same regular expression would.

       --exclude-docs regexp
	    In recursive mode, do not check links in documents whose full,
	    canonical URIs match regexp.  This option may be specified
	    multiple times.

       --suppress-redirect URI->URI
	    Do not report a redirect from the first to the second URI.	The
	    "->" is literal text.  This option may be specified multiple
	    times.  Whitespace may be used instead of "->" to separate the
	    URIs.

       --suppress-redirect-prefix URI->URI
	    Do not report a redirect from a child of the first URI to the same
	    child of the second URI.  The "->" is literal text.  This option
	    may be specified multiple times.  Whitespace may be used instead
	    of "->" to separate the URIs.
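
	    A minimal sketch of how such a URI pair might be parsed and how
	    the "same child" rule could be tested.  The helper names
	    parse_pair and is_suppressed_prefix and the example.org URIs are
	    hypothetical; the real implementation may differ.

```python
def parse_pair(value):
    """Split 'URI->URI' or 'URI URI' into a (first, second) tuple."""
    if "->" in value:
        first, second = value.split("->", 1)
    else:
        first, second = value.split(None, 1)
    return first.strip(), second.strip()


def is_suppressed_prefix(src, dst, first, second):
    """True if src is a child of first and dst is the same child of second."""
    if not src.startswith(first):
        return False
    child = src[len(first):]
    return dst == second + child


first, second = parse_pair("http://example.org/old/->http://example.org/new/")
print(is_suppressed_prefix("http://example.org/old/a.html",
                           "http://example.org/new/a.html", first, second))
```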

       --suppress-temp-redirects
	    Do not report warnings about temporary redirects.

       --suppress-broken CODE:URI
	    Do not report a broken link with the given CODE.  CODE is the HTTP
	    response, or -1 for robots exclusion.  The ":" is literal text.
	    This option may be specified multiple times.  Whitespace may be
	    used instead of ":" to separate the CODE and the URI.
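
	    The CODE:URI format can be illustrated with a small parser.  This
	    is a sketch under assumptions: the function name
	    parse_suppress_broken and the example.org URIs are hypothetical,
	    and the pattern only reflects the syntax described above.

```python
import re


def parse_suppress_broken(value):
    """Split 'CODE:URI' or 'CODE URI' into (int code, uri)."""
    # CODE is an integer (possibly -1); the separator is ":" or whitespace.
    match = re.match(r"(-?\d+)(?::|\s+)(\S+)$", value.strip())
    if match is None:
        raise ValueError("expected CODE:URI, got %r" % value)
    return int(match.group(1)), match.group(2)


print(parse_suppress_broken("404:http://example.org/gone.html"))
print(parse_suppress_broken("-1 http://example.org/private/"))
```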

       --suppress-fragment URI
	    Do not report the given broken fragment URI.  A fragment URI
	    contains "#".  This option may be specified multiple times.

       -L, --languages accept-language
	    The "Accept-Language" HTTP header to send.	In command line mode,
	    this header is not sent by default.	 The special value "auto"
	    causes a value to be detected from the "LANG" environment
	    variable, and sent if found.  In CGI mode, the default is to send
	    the value received from the client as is.

       -c, --cookies cookie-file
	    Use cookies, load/save them in cookie-file.	 The special value
	    "tmp" causes non-persistent use of cookies, i.e. they are used but
	    only stored in memory for the duration of this link checker run.

       -R, --no-referer
	    Do not send the "Referer" HTTP header.

       -q, --quiet
	    No output if no errors are found.  Implies --summary.

       -v, --verbose
	    Verbose mode.

       -i, --indicator
	    Show progress while parsing as percentage of lines processed.  No
	    indicator is shown for documents containing no linefeeds.

       -u, --user username
	    Specify a username for authentication.

       -p, --password password
	    Specify a password for authentication.

       --hide-same-realm
	    Hide 401 responses that are in the same realm as the document
	    checked.

       -S, --sleep secs
	    Sleep the specified number of seconds between requests to each
	    server.  Defaults to 1 second, which is also the minimum allowed.

       -t, --timeout secs
	    Timeout for requests, in seconds.  The default is 30.

       -C, --connection-cache number
	    Maximum number of cached connections.  Using this option overrides
	    the "Connection_Cache_Size" configuration file parameter, see its
	    documentation below for the default value and more information.

       -d, --domain domain
	    Perl regular expression describing the domain to which the
	    authentication information (if present) will be sent.  The default
	    value can be specified in the configuration file.  See the
	    "Trusted" entry in the configuration file description below for
	    more information.

       --masquerade "real-prefix surrogate-prefix"
	    Perform a simple string substitution: URIs which begin with the
	    string "real-prefix" are rewritten using the "surrogate-prefix"
	    before being dereferenced.	Useful for making a local directory
	    masquerade as a remote one. For example:

	      --masquerade "http://example.com/x/y/z/ file:///my/local/dir/"

	    If the document being checked contains a link to
	    http://example.com/x/y/z/foo.html, then the local file system will
	    be checked for file:///my/local/dir/foo.html.

	    --masquerade takes a single argument consisting of two URIs,
	    separated by whitespace.  The quote marks are not part of the
	    argument, but one usual way of providing a value with embedded
	    whitespace is to enclose it in quotes.
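
	    The substitution described above amounts to a prefix rewrite.  A
	    minimal sketch, assuming a hypothetical helper named masquerade
	    that takes the URI and the single two-part argument:

```python
def masquerade(uri, spec):
    """Rewrite uri if it begins with the real prefix given in spec."""
    # spec is one string holding two URIs separated by whitespace.
    real_prefix, surrogate_prefix = spec.split()
    if uri.startswith(real_prefix):
        return surrogate_prefix + uri[len(real_prefix):]
    return uri


spec = "http://example.com/x/y/z/ file:///my/local/dir/"
print(masquerade("http://example.com/x/y/z/foo.html", spec))
print(masquerade("http://example.com/other/foo.html", spec))
```

	    URIs that do not begin with the real prefix are left unchanged.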

       -H, --html
	    HTML output.

FILES
       /etc/w3c/checklink.conf
	    The main configuration file.  You can use the W3C_CHECKLINK_CFG
	    environment variable to override the default location.

	    "Trusted" specifies a regular expression for matching trusted
	    domains (i.e. domains where HTTP basic authentication, if any, will
	    be sent).  The regular expression will be matched case
	    insensitively against host names.  The default behavior (when
	    unset, that is) is to send the authentication information only to
	    the host which requests it; usually you don't want to change this.
	    For example, the following configures only the w3.org domain as
	    trusted:

		Trusted = \.w3\.org$
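
	    To see which host names such an expression matches, it can be
	    tried out in isolation.  The sketch below uses Python's re module
	    rather than Perl, which behaves the same way for this simple
	    pattern; the function name is_trusted is hypothetical.

```python
import re

# Same pattern as the Trusted example, matched case insensitively.
TRUSTED = re.compile(r"\.w3\.org$", re.IGNORECASE)


def is_trusted(host):
    """True if the host name matches the trusted-domain pattern."""
    return TRUSTED.search(host) is not None


for host in ("validator.w3.org", "WWW.W3.ORG", "w3.org", "w3.org.example.com"):
    print(host, is_trusted(host))
```

	    Note that because the pattern starts with a literal dot, the bare
	    host name w3.org does not match; only its subdomains do.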

	    "Allow_Private_IPs" is a boolean flag indicating whether checking
	    links on non-public IP addresses is allowed.  The default is true
	    in command line mode and false when run as a CGI script.  For
	    example, to disallow checking non-public IP addresses, regardless
	    of the mode, use:

	       Allow_Private_IPs = 0

	    "Forbidden_Protocols" is a comma separated list of additional
	    protocols/URI schemes that the link checker is not allowed to use.
	    The "javascript" and "mailto" schemes are always forbidden, and so
	    is the "file" scheme when running as a CGI script.

	       Forbidden_Protocols = javascript,mailto

	    "Markup_Validator_URI" and "CSS_Validator_URI" are formatted URIs
	    to the respective validators.  The %s in these will be replaced
	    with the full "URI encoded" URI to the document being checked, and
	    shown in the link checker results view in the online/CGI version.
	    The defaults are:

	       Markup_Validator_URI =
		 http://validator.w3.org/check?uri=%s
	       CSS_Validator_URI =
		 http://jigsaw.w3.org/css-validator/validator?uri=%s
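
	    The %s replacement can be sketched as below.  The helper name
	    validator_link is hypothetical, and the exact set of characters
	    the link checker escapes is an assumption; here every reserved
	    character is percent-encoded.

```python
from urllib.parse import quote


def validator_link(template, doc_uri):
    """Insert the URI-encoded document URI in place of %s."""
    return template.replace("%s", quote(doc_uri, safe=""))


print(validator_link("http://validator.w3.org/check?uri=%s",
                     "http://www.w3.org/TR/html4/"))
```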

	    "Doc_URI" is a URI used for linking to the documentation, and CSS
	    and JavaScript files in the dynamically generated content of the
	    link checker.  The default is:

	       Doc_URI = http://validator.w3.org/docs/checklink.html

	    "Connection_Cache_Size" is an integer denoting the maximum number
	    of connections the link checker will keep open at any given time.
	    The default is:

	       Connection_Cache_Size = 2

ENVIRONMENT
       checklink uses the libwww-perl library which has a number of
       environment variables affecting its behaviour.  See "SEE ALSO" for some
       pointers.

       W3C_CHECKLINK_CFG
	    If set, overrides the path to the configuration file.

SEE ALSO
       The documentation for this program is available on the web at
       <http://validator.w3.org/docs/checklink.html>.

       LWP, Net::FTP, Net::NNTP, Net::IP, perlre.

AUTHOR
       This program was originally written by Hugo Haas <hugo@w3.org>, based
       on Renaud Bruyeron's checklink.pl.  It has been enhanced by Ville
       Skyttae and many other volunteers since.	 Use the
       <www-validator@w3.org> mailing list for feedback, and see
       <http://validator.w3.org/docs/checklink.html#csb> for more information.

       This manual page was originally written by Frederic Schuetz
       <schutz@mathgen.ch> for the Debian GNU/Linux system (but may be used by
       others).

COPYRIGHT
       This program is licensed under the W3C Software License,
       <http://www.w3.org/Consortium/Legal/copyright-software>.

perl v5.20.2			  2011-03-27			  CHECKLINK(1)