Encode::JP man page on MirBSD

ext::Encode::JP::JP(3p)  Perl Programmers Reference Guide  ext::Encode::JP::JP(3p)

NAME
     Encode::JP - Japanese Encodings

SYNOPSIS
	 use Encode qw/encode decode/;
	 $euc_jp = encode("euc-jp", $utf8);   # loads Encode::JP implicitly
	 $utf8	 = decode("euc-jp", $euc_jp); # ditto

ABSTRACT
     This module implements Japanese charset encodings.  The
     supported encodings are listed below.

       Canonical   Alias	     Description
       --------------------------------------------------------------------
       euc-jp	   /\beuc.*jp$/i     EUC (Extended Unix Code)
		   /\bjp.*euc/i
		   /\bujis$/i
       shiftjis	   /\bshift.*jis$/i  Shift JIS (aka MS Kanji)
		   /\bsjis$/i
       7bit-jis	   /\bjis$/i	     7bit JIS
       iso-2022-jp		     ISO-2022-JP		  [RFC1468]
				     = 7bit JIS with all Halfwidth Kana
				       converted to Fullwidth
       iso-2022-jp-1		     ISO-2022-JP-1		  [RFC2237]
				     = ISO-2022-JP with JIS X 0212-1990
				       support.	 See below
       MacJapanese		     Shift JIS + Apple vendor mappings
       cp932	   /\bwindows-31j$/i Code Page 932
				     = Shift JIS + MS/IBM vendor mappings
       jis0201-raw		     JIS0201, raw format
       jis0208-raw		     JIS0208, raw format
       jis0212-raw		     JIS0212, raw format
       --------------------------------------------------------------------
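     The alias patterns in the table can be checked with a short
     script.  This is a minimal sketch, assuming an Encode
     installation that ships Encode::JP; the sample names in @names
     are illustrative and were chosen to match one pattern each.

```perl
use strict;
use warnings;
use Encode ();

# Illustrative names a caller might pass (not taken from the man page).
my @names = ('ujis', 'EUC-JP', 'sjis', 'jis', 'windows-31j');

my %canonical;
for my $name (@names) {
    # resolve_alias returns the canonical name, or a false value
    # when the name is not recognised.
    $canonical{$name} = Encode::resolve_alias($name);
    my $shown = defined $canonical{$name} ? $canonical{$name} : '(unknown)';
    printf "%-12s => %s\n", $name, $shown;
}
```

     Resolving a name this way before calling encode() or decode()
     makes it easy to reject unsupported names early.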

DESCRIPTION
     To find out how to use this module in detail, see Encode.

Note on ISO-2022-JP(-1)?
     ISO-2022-JP-1 (RFC2237) is a superset of ISO-2022-JP
     (RFC1468) which adds support for JIS X 0212-1990.	That
     means you can use the same code to decode to utf8 but not
     vice versa.

       $utf8 = decode('iso-2022-jp-1', $stream);

     and

       $utf8 = decode('iso-2022-jp',   $stream);

     yield the same result but

       $with_0212 = encode('iso-2022-jp-1', $utf8);

     is now different from

       $without_0212 = encode('iso-2022-jp', $utf8 );

     In the latter case, characters that map to JIS X 0212 are
     first converted to U+3013 (0xA2AE in EUC-JP; a white square
     also known as 'Tofu' or 'geta mark') and then fed to the
     encoding engine.  U+FFFD is not used, in order to preserve
     the text layout as much as possible.
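     The asymmetry between decoding and encoding can be observed
     directly.  The sketch below assumes that U+9DD7 is covered by
     JIS X 0212 but not by JIS X 0208, and uses U+3042 (hiragana
     "a") as a character that JIS X 0208 does cover; both sample
     characters are choices made for this illustration.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Assumed to be in JIS X 0212 only (not in JIS X 0208).
my $text = "\x{9DD7}";

my $with_0212    = encode('iso-2022-jp-1', $text);
my $without_0212 = encode('iso-2022-jp',   $text);

# A stream limited to JIS X 0208 decodes the same with either name.
my $stream = encode('iso-2022-jp', "\x{3042}");
my $same   = decode('iso-2022-jp-1', $stream) eq decode('iso-2022-jp', $stream);

print "encoded forms differ\n" if $with_0212 ne $without_0212;
print "decoded forms match\n"  if $same;
```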

BUGS
     The ASCII region (0x00-0x7f) is preserved for all encodings,
     even though this conflicts with mappings by the Unicode
     Consortium.  See

     <http://www.debian.or.jp/~kubota/unicode-symbols.html.en>

     to find out why it is implemented that way.
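     A quick way to see the preserved ASCII region is to round-trip
     the two characters where JIS X 0201 Roman differs from ASCII,
     backslash (0x5C) and tilde (0x7E).  This is a sketch only; the
     three encodings tested are an arbitrary sample from the table
     above.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Backslash and tilde: the positions that JIS X 0201 Roman assigns
# to the yen sign and the overline.
my $ascii = "\\~";

my %unchanged;
for my $enc ('euc-jp', 'shiftjis', 'cp932') {
    my $bytes = encode($enc, $ascii);
    # Preserved means the bytes equal the input and decode back to it.
    $unchanged{$enc} =
        ($bytes eq $ascii && decode($enc, $bytes) eq $ascii) ? 1 : 0;
    print "$enc: ", ($unchanged{$enc} ? "ASCII preserved" : "ASCII remapped"), "\n";
}
```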

SEE ALSO
     Encode

perl v5.8.8		   2005-02-05				2
