eucJP(5)

NAME
eucJP - A character encoding system (codeset) for Japanese

The Japanese EUC (Extended UNIX Code), or eucJP, codeset consists of
the following character sets:
    CS0 (ASCII or JIS Roman)
    CS1 (JIS X0208)
    CS2 (JIS Katakana)
    CS3 (JIS X0212)
CS0 is the primary character set. CS1, CS2, and CS3 are supplementary
character sets. The MSB (Most Significant Bit) of the byte that
represents a character in CS0 is off (0), whereas the MSB of each byte
that represents a character in CS1, CS2, or CS3 is on (1).
Japanese EUC Encoding
The representation of ASCII/JIS Roman and JIS X0208 characters in the
Japanese EUC codeset is similar to how those characters are represented
in the DEC Kanji codeset (refer to deckanji(5)). The two additional
character sets, JIS Katakana and JIS X0212, are encoded in the Japanese
EUC codeset by making use of the SS2 (Single Shift 2) and SS3 (Single
Shift 3) control characters.
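
The byte layout described above can be sketched in Python. This is a
minimal illustration, not the operating system's implementation. It
assumes the usual EUC-JP byte values, which this page does not state:
SS2 is 0x8E, SS3 is 0x8F, and every non-CS0 data byte lies in the range
0xA1-0xFE. The function name split_eucjp is hypothetical.

```python
SS2 = 0x8E  # Single Shift 2: the next byte is a CS2 (JIS Katakana) character
SS3 = 0x8F  # Single Shift 3: the next two bytes are a CS3 (JIS X0212) character


def split_eucjp(data: bytes):
    """Split an eucJP byte string into (character_set, raw_bytes) pairs."""
    chars = []
    i = 0
    while i < len(data):
        lead = data[i]
        if lead < 0x80:                 # MSB off: one-byte CS0 character
            name, length = "CS0", 1
        elif lead == SS2:               # SS2 + one byte
            name, length = "CS2", 2
        elif lead == SS3:               # SS3 + two bytes
            name, length = "CS3", 3
        elif 0xA1 <= lead <= 0xFE:      # two bytes, both with MSB on
            name, length = "CS1", 2
        else:
            raise ValueError(f"unexpected lead byte 0x{lead:02X} at offset {i}")
        chunk = data[i:i + length]
        # Every byte after the lead byte must be a data byte with the MSB on.
        if len(chunk) != length or any(not 0xA1 <= b <= 0xFE for b in chunk[1:]):
            raise ValueError(f"malformed {name} character at offset {i}")
        chars.append((name, chunk))
        i += length
    return chars


if __name__ == "__main__":
    sample = b"A" + bytes([0xB0, 0xA1, 0x8E, 0xB1, 0x8F, 0xB0, 0xA1])
    for name, raw in split_eucjp(sample):
        print(name, raw.hex().upper())
```

The sketch checks only that trailing bytes have the MSB on. It does not
verify that a code point is actually assigned in the corresponding JIS
standard.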
The Japanese EUC codeset provides the following two areas for represen‐
tation of user-defined characters (UDC):
Area Usage    Row Range    Number of Characters    Code Range
JIS X0208     85-94        940                     F5A1-FEFE
JIS X0212     78-94        1598                    SS3 [EEA1-FEFE]
The representation of UDCs on these two code planes is identical to
that for standard characters that occupy the same planes. Code ranges
distinguish between UDCs and standard JIS X0208 and JIS X0212 charac‐
ters that occupy the same plane.
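
The figures in the UDC table follow from the row numbers. As a hedged
sketch, assuming the usual EUC mapping in which a row or cell number n
is encoded as the byte 0xA0 + n and each row holds 94 cells, the counts
and code ranges can be recomputed as follows. The helper names udc_area
and is_udc are hypothetical.

```python
CELLS_PER_ROW = 94  # assumption: each row holds cells 1-94 (bytes 0xA1-0xFE)


def udc_area(first_row: int, last_row: int):
    """Return (character_count, first_code, last_code) for a range of rows."""
    count = (last_row - first_row + 1) * CELLS_PER_ROW
    first_code = ((0xA0 + first_row) << 8) | 0xA1
    last_code = ((0xA0 + last_row) << 8) | 0xFE
    return count, first_code, last_code


def is_udc(char: bytes) -> bool:
    """Report whether one encoded character falls in a UDC code range."""
    if len(char) == 2:                        # JIS X0208 plane
        return 0xF5 <= char[0] <= 0xFE and 0xA1 <= char[1] <= 0xFE
    if len(char) == 3 and char[0] == 0x8F:    # JIS X0212 plane, after SS3
        return 0xEE <= char[1] <= 0xFE and 0xA1 <= char[2] <= 0xFE
    return False


if __name__ == "__main__":
    for rows in ((85, 94), (78, 94)):
        count, first, last = udc_area(*rows)
        print(rows, count, f"{first:04X}-{last:04X}")
```

Ten rows of 94 cells give 940 characters starting at F5A1, and
seventeen rows give 1598 characters starting at EEA1, which matches the
table.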
Currently, the operating system does not support JIS X0212 (JIS Supple‐
The following codeset converter pairs are available for converting
Japanese characters between eucJP and other encoding formats. Refer to
iconv_intro(5) for an introduction to codeset conversion. For more
information about the other codeset for which eucJP is the input or
output, see the reference page specified in the list item.

deck‐
    Converting from and to the DEC Kanji codeset: deckanji(5).

    Converting from and to the ISO 2022 Japanese codeset:
    iso2022jp(5).

ISO-2022-JPext_eucJP, eucJP_ISO-2022-JPext
    Converting from and to the ISO 2022 Japanese Extended codeset:
    iso2022jp(5).

JIS7_eucJP, eucJP_JIS7
    Converting from and to the JIS7 codeset: jiskanji(5).

    Converting from and to the Shift JIS codeset: SJIS(5).

    Shift JIS encoding is identical to the encoding used in the
    Microsoft PC code page for Japanese. You can therefore use these
    converters to convert Japanese text from and to Japanese
    code-page format. See code_page(5) for more information about
    how the operating system supports PC code pages.

sdeckanji_eucJP,
    Converting from and to the Super DEC Kanji codeset:
    sdeckanji(5).

UTF-16_eucJP, eucJP_UTF-16
    Converting from and to UTF-16 format: Unicode(5).

UCS-4_eucJP,
    Converting from and to UCS-4 format: Unicode(5).

UTF-8_eucJP,
    Converting from and to UTF-8 format: Unicode(5).
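
To illustrate what such a conversion does to the bytes, the following
sketch uses Python's built-in codecs rather than the operating
system's converters. The mapping from the codeset labels to Python
codec names is an assumption made for illustration. The Python codecs
may also treat user-defined characters and unmapped code points
differently from the converters listed above.

```python
# Assumed mapping from codeset labels to Python standard codec names.
PYTHON_CODECS = {
    "eucJP": "euc_jp",
    "SJIS": "shift_jis",
    "UTF-8": "utf_8",
    "UTF-16": "utf_16",
}


def convert(data: bytes, source: str, target: str) -> bytes:
    """Decode data from the source codeset and re-encode it in the target."""
    text = data.decode(PYTHON_CODECS[source])
    return text.encode(PYTHON_CODECS[target])


if __name__ == "__main__":
    euc = b"\xc6\xfc\xcb\xdc\xb8\xec"   # three JIS X0208 (CS1) characters
    print(convert(euc, "eucJP", "SJIS").hex())
    print(convert(euc, "eucJP", "UTF-8").decode("utf_8"))
```

Each conversion goes through Unicode text as an intermediate form, so a
character with no mapping in the target codeset raises
UnicodeEncodeError rather than being silently replaced.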
Japanese EUC Fonts
For display devices, the operating system supports Japanese EUC charac‐
ters by converting Japanese EUC code to DEC Kanji code and then using
the fonts for DEC Kanji. Because the CS3 character set is not supported
by the DEC Kanji codeset, CS3 characters cannot be displayed.
The operating system does not provide PostScript fonts for Japanese
EUC. Some printers support Japanese with printer-resident fonts, and
print filters perform codeset conversion, if required, for the encoding
used in the file input to the print job. For some other printers, you
can set up a print filter to convert Japanese bitmap fonts to
PostScript. Refer to i18n_printing(5) for introductory information
about your printing options.
Others: ascii(5), code_page(5), i18n_intro(5), i18n_printing(5),
iconv_intro(5), deckanji(5), iso2022jp(5), Japanese(5), jiskanji(5),
l10n_intro(5), sdeckanji(5), shiftjis(5), Unicode(5)