extract_url(1)			 User Commands			extract_url(1)

NAME
       extract_url -- extract URLs from email messages

SYNOPSIS
       extract_url [options] file

DESCRIPTION
       This is a Perl script that extracts URLs from correctly-encoded MIME
       email messages. This can be used either as a pre-parser for urlview, or
       to replace urlview entirely.

       Urlview is a great program, but it has some deficiencies. In
       particular, it isn't very configurable, it cannot handle URLs that
       have been broken across several lines in format=flowed delsp=yes email
       messages, it cannot handle quoted-printable email messages, and it
       does not eliminate duplicate URLs. This Perl script handles all of
       that. It also sanitizes URLs so that they cannot break out of the
       command shell.

       This is designed primarily for use with the mutt emailer. The idea is
       that if you want to access a URL in an email, you pipe the email to a
       URL extractor (like this one) which then lets you select a URL to view
       in some third program (such as Firefox). An alternative design is to
       access URLs from within mutt's pager by defining macros and tagging the
       URLs in the display to indicate which macro to use. A script you can
       use to do that is tagurl.pl.

OPTIONS
       -h, --help
	   Display this help and exit.

       -m, --man
	   Display the full man page documentation.

       -l, --list
	   Prevent use of Ncurses, and simply output a list of extracted URLs.

       -t, --text
	   Prevent MIME handling; treat the input as plain text.

       -V, --version
	   Output version information and exit.
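
       For instance, to simply print the URLs found in a saved message
       (bypassing the Curses::UI menu), or to scan a plain-text file that is
       not a MIME email, the options above might be combined like this
       (message.txt and notes.txt are just example filenames):

	   extract_url.pl -l message.txt
	   extract_url.pl -t -l notes.txt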

DEPENDENCIES
       Mandatory dependencies are MIME::Parser and HTML::Parser.  These
       usually come with Perl.

       Optional dependencies are URI::Find (recognizes more exotic URL
       variations in plain text, i.e. outside HTML tags), Curses::UI (allows
       extract_url.pl to fully replace urlview), and Getopt::Long (if
       present, extract_url.pl recognizes the long options --version and
       --list).
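
       One way to check from the shell whether a particular module is
       available (shown here for URI::Find; substitute any module name) is to
       ask Perl to load it and do nothing else:

	   perl -MURI::Find -e 1

       If the module is missing, Perl prints a "Can't locate ..." error; a
       missing module can usually be installed from CPAN (for example with
       "cpan URI::Find") or from your operating system's package collection.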

EXAMPLES
       This Perl script expects a valid email to be either piped in via STDIN
       or in a file listed as the script's only argument. Its STDOUT can be a
       pipe into urlview (it will detect this). Here's how you can use it:

	   cat message.txt | extract_url.pl
	   cat message.txt | extract_url.pl | urlview
	   extract_url.pl message.txt
	   extract_url.pl message.txt | urlview

       For use with mutt 1.4.x, here's a macro you can use:

	   macro index,pager \cb "\
	   <enter-command> \
	   unset pipe_decode<enter>\
	       <pipe-message>extract_url.pl<enter>" \
	   "get URLs"

       For use with mutt 1.5.x, here's a more complicated macro you can use:

	   macro index,pager \cb "\
	   <enter-command> set my_pdsave=\$pipe_decode<enter>\
	   <enter-command> unset pipe_decode<enter>\
	   <pipe-message>extract_url.pl<enter>\
	   <enter-command> set pipe_decode=\$my_pdsave<enter>" \
	   "get URLs"

       Here's a suggestion for how to handle encrypted email:

	   macro index,pager ,b "\
	   <enter-command> set my_pdsave=\$pipe_decode<enter>\
	   <enter-command> unset pipe_decode<enter>\
	   <pipe-message>extract_url.pl<enter>\
	   <enter-command> set pipe_decode=\$my_pdsave<enter>" \
	   "get URLs"

	   macro index,pager ,B "\
	   <enter-command> set my_pdsave=\$pipe_decode<enter>\
	   <enter-command> set pipe_decode<enter>\
	   <pipe-message>extract_url.pl<enter>\
	   <enter-command> set pipe_decode=\$my_pdsave<enter>" \
	   "decrypt message, then get URLs"

	   message-hook .  'macro index,pager \cb ,b "URL viewer"'
	   message-hook ~G 'macro index,pager \cb ,B "URL viewer"'
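
       The two message-hook lines tie these macros together: the first hook
       (whose pattern "." matches every message) binds \cb to the
       non-decrypting ,b macro, and the second rebinds \cb to the decrypting
       ,B macro for messages matching mutt's ~G pattern (encrypted messages),
       so a single key does the right thing in both cases.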

CONFIGURATION
       If you're using it with Curses::UI (i.e. as a standalone URL
       selector), this Perl script will try to figure out what command to use
       based on the contents of your ~/.urlview file. However, it also has
       its own configuration file (~/.extract_urlview) that will be used
       instead, if it exists. There are eight kinds of lines you can have in
       this file:

       COMMAND ...
	       This line specifies the command that will be used to view URLs.
	       This command CAN contain a %s, which will be replaced by the
	       URL inside single-quotes. If it does not contain a %s, the URL
	       will simply be appended to the command. If this line is not
	       present, the command is taken from the environment variable
	       $BROWSER. If $BROWSER is not set, the command is assumed to be
	       "open", which is the correct command for Mac OS X systems.

       SHORTCUT
	       This line specifies that if an email contains only 1 URL, that
	       URL will be opened without prompting. The default (without this
	       line) is to always prompt.

       NOREVIEW
	       Normally, if a URL is too long to display on screen in the
	       menu, the user will be prompted with the full URL before
	       opening it, just to make sure it's correct. This line turns
	       that behavior off.

       PERSISTENT
	       By default, when a URL has been selected and viewed from the
	       menu, extract_url.pl will exit. If you would like it to be
	       ready to view another URL without re-parsing the email (i.e.
	       much like standard urlview behavior), add this line to the
	       config file.

       IGNORE_EMPTY_TAGS
	       By default, the script collects all the URLs it can find.
	       Sometimes, though, HTML messages contain links that don't
	       correspond to any text (and aren't normally rendered or
	       accessible). This tells the script to ignore these links.

       HTML_TAGS ...
	       This line specifies which HTML tags will be examined for URLs.
	       By default, the script is very generous, looking in a, applet,
	       area, blockquote, embed, form, frame, iframe, input, ins,
	       isindex, head, layer, link, object, q, script, and xmp tags for
	       links. If you would like it to examine just a subset of these
	       (e.g. you only want "a" tags to be examined), merely list the
	       subset you want. The list is expected to be a comma-separated
	       list. If there are multiple of these lines in the config file,
	       the script will look for the minimum set of specified tags.

       ALTSELECT ...
	       This line specifies a key for an alternate URL-viewing
	       behavior. By default, extract_url.pl quits after the URL
	       viewer has been launched for the selected URL; pressing this
	       key instead launches the URL viewer without quitting. If
	       PERSISTENT is specified in the config file, the behavior is
	       reversed: normal selection of a URL launches the URL viewer
	       without exiting extract_url.pl, while this key launches the
	       viewer and then exits. This setting defaults to k.

       DEFAULT_VIEW {url|context}
	       This line specifies whether the program starts by showing the
	       list of URLs or the URL contexts. By default, extract_url.pl
	       shows a list of URLs.

       Here is an example config file:

	   SHORTCUT
	   COMMAND mozilla-firefox -remote "openURL(%s,new-window)"
	   HTML_TAGS a,iframe,link
	   ALTSELECT Q
	   DEFAULT_VIEW context
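
       And here is a second, purely illustrative config file showing some of
       the other options described above (the browser command is only an
       example):

	   PERSISTENT
	   NOREVIEW
	   IGNORE_EMPTY_TAGS
	   COMMAND firefox %s
	   DEFAULT_VIEW url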

STANDARDS
       None.

AVAILABILITY
       http://www.memoryhole.net/~kyle/extract_url/

SEE ALSO
       mutt(1), urlview(1), urlscan(1)

CAVEATS
       Before a URL is passed to the shell, any potentially dangerous shell
       characters in it (namely the single quote and the dollar sign) are
       replaced with their percent-encoded equivalents. This should eliminate
       the possibility of a bad URL breaking the shell.
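
       As an illustration only (this is not the script's actual code), the
       substitution described above can be sketched in a few lines of Perl:

	   # Illustrative sketch, not taken from extract_url.pl itself:
	   # percent-encode the characters treated as shell-dangerous.
	   my $url = "http://example.com/page?name=don't&price=\$5";
	   $url =~ s/'/%27/g;    # single quote -> %27
	   $url =~ s/\$/%24/g;   # dollar sign  -> %24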

       If Curses::UI is in use and a URL is too long to fit on your terminal,
       extract_url.pl will (by default) ask you to review the full URL when
       you select it, so that you can see the whole thing before it is
       opened.

AUTHOR
       This program was written by Kyle Wheeler <kyle@memoryhole.net>.

       Released under the BSD-2-Clause (Simplified) license. For more
       information about the license, visit
       <http://spdx.org/licenses/BSD-2-Clause>.

perl v5.20.3			  2016-02-19			extract_url(1)