.\" Hey Emacs! This file is -*- nroff -*- source.
.\"
.\" Copyright (C) Markus Kuhn, 1995, 2001
.\"
.\" This is free documentation; you can redistribute it and/or
.\" modify it under the terms of the GNU General Public License as
.\" published by the Free Software Foundation; either version 2 of
.\" the License, or (at your option) any later version.
.\"
.\" The GNU General Public License's references to "object code"
.\" and "executables" are to be interpreted as the output of any
.\" document formatting or typesetting system, including
.\" intermediate and printed output.
.\"
.\" This manual is distributed in the hope that it will be useful,
.\" but WITHOUT ANY WARRANTY; without even the implied warranty of
.\" MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
.\" GNU General Public License for more details.
.\"
.\" You should have received a copy of the GNU General Public
.\" License along with this manual; if not, write to the Free
.\" Software Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111,
.\" USA.
.\"
.\" 1995-11-26  Markus Kuhn <mskuhn@cip.informatik.uni-erlangen.de>
.\"      First version written
.\" 2001-05-11  Markus Kuhn <mgk25@cl.cam.ac.uk>
.\"      Update
.\"
.\" Japanese Version Copyright (c) 1997 HANATAKA Shinya
.\"         all rights reserved.
.\" Translated Thu Jun  3 20:36:31 JST 1997
.\"         by HANATAKA Shinya <hanataka@abyss.rim.or.jp>
.\" Updated & Modified Sat Jun 23 07:30:09 JST 2001
.\"         by Yuichi SATO <ysato@h4.dion.ne.jp>
.\"
.\"WORD:
.\"WORD:        diacritical mark        発音区別符号
.\"WORD:        International Phonetic Alphabet         国際音声字母
.\"WORD:
.\"
.TH UNICODE 7 2001-05-11 "GNU" "Linux Programmer's Manual"
.SH NAME
Unicode \- the Universal Character Set
.SH DESCRIPTION
The international standard
.B ISO 10646
defines the
.BR "Universal Character Set (UCS)" .
UCS contains all characters of all other character set standards.
It also guarantees
.BR "round-trip compatibility" ,
that is, conversion tables can be built such that no information is lost
when a string is converted from any other encoding to UCS and back.
.PP
UCS contains the characters required to represent practically all
known languages.
This includes not only the Latin, Greek, Cyrillic,
Hebrew, Arabic, Armenian, and Georgian scripts, but also Chinese,
Japanese, and Korean Han ideographs as well as scripts such as
Hiragana, Katakana, Hangul, Devanagari, Bengali, Gurmukhi, Gujarati,
Oriya, Tamil, Telugu, Kannada, Malayalam, Thai, Lao, Khmer, Bopomofo,
Tibetan, Runic, Ethiopic, Canadian Syllabics, Cherokee, Mongolian,
Ogham, Myanmar, Sinhala, Thaana, Yi, and others.
For scripts not yet covered, research on how best to encode them for
computer usage is still going on, and they will be added eventually.
This might eventually include not only hieroglyphs and the scripts of
various historic Indo-European languages, but even some selected
artistic scripts such as Tengwar, Cirth, and Klingon.
UCS also covers a large number of
graphical, typographical, mathematical, and scientific symbols,
including those provided by TeX, PostScript, APL, MS-DOS, MS-Windows,
Macintosh, OCR fonts, and many word processing and publishing
systems, and more are being added.
.PP
The UCS standard (ISO 10646) describes a
.I "31-bit character set architecture"
consisting of 128 24-bit
.IR groups ,
each divided into 256 16-bit
.I planes
made up of 256 8-bit
.I rows
with 256
.I column
positions, one for each character.
Part 1 of the standard
.RB ( "ISO 10646-1" )
defines the first 65534 code positions (0x0000 to 0xfffd), which form
the
.IR "Basic Multilingual Plane (BMP)" ,
that is, plane 0 in group 0.
Part 2 of the standard
.RB ( "ISO 10646-2" )
adds characters to group 0 outside the BMP in several
.I "supplementary planes"
in the range 0x10000 to 0x10ffff.
There are no plans to add characters
beyond 0x10ffff to the standard, so of the entire code space
only a small fraction of group 0 will actually be used in the
foreseeable future.
The BMP contains all characters found in the
other commonly used character sets.
The supplementary planes added by
ISO 10646-2 cover only more exotic characters for special scientific,
dictionary printing, publishing industry, higher-level protocol, and
enthusiast needs.
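.PP
As a rough illustration of the group, plane, row, and column layout
described above, the following C sketch splits a UCS-4 code value into
its four coordinates.
The names ucs_position, ucs_split, and ucs_in_bmp are hypothetical
helpers invented for this example; they are not part of any library.

```c
#include <stdint.h>

/* Hypothetical helper type: the four coordinates of a UCS-4 code value. */
struct ucs_position {
    unsigned group;   /* bits 24..30: one of 128 groups */
    unsigned plane;   /* bits 16..23: one of 256 planes in the group */
    unsigned row;     /* bits 8..15: one of 256 rows in the plane */
    unsigned column;  /* bits 0..7: one of 256 positions in the row */
};

/* Split a 31-bit code value into group, plane, row, and column. */
struct ucs_position ucs_split(uint32_t code)
{
    struct ucs_position p;

    p.group  = (code >> 24) & 0x7f;
    p.plane  = (code >> 16) & 0xff;
    p.row    = (code >> 8) & 0xff;
    p.column = code & 0xff;
    return p;
}

/* Nonzero if the code value lies in plane 0 of group 0 (the BMP). */
int ucs_in_bmp(uint32_t code)
{
    return code <= 0xffff;
}
```

For the highest code value the standard plans to use, 0x10ffff, this
gives group 0, plane 0x10, row 0xff, and column 0xff.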
.PP
The representation of each UCS character as a 2-byte word is referred
to as the
.B UCS-2
form (only for BMP characters), whereas
.B UCS-4
is the representation of each character by a 4-byte word.
In addition, there exist two encoding forms:
.B UTF-8
for backward compatibility with ASCII processing software and
.B UTF-16
for the backward-compatible handling of non-BMP characters up to
0x10ffff by UCS-2 software.
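.PP
This page does not spell out how UTF-16 reaches beyond the BMP.
The sketch below follows the surrogate-pair scheme from the UTF-16
definition: a code value from 0x10000 to 0x10ffff is written as two
16-bit units taken from the ranges 0xd800 to 0xdbff and 0xdc00 to 0xdfff.
The function name utf16_encode is an assumption made for this example.

```c
#include <stdint.h>

/*
 * Encode one UCS code value as UTF-16 (hypothetical helper).
 * Stores one or two 16-bit units in out and returns how many were
 * stored, or 0 if the value cannot be represented in UTF-16.
 */
int utf16_encode(uint32_t code, uint16_t out[2])
{
    if (code >= 0xd800 && code <= 0xdfff)
        return 0;               /* reserved for surrogates, not a character */
    if (code < 0x10000) {
        out[0] = (uint16_t)code; /* BMP: identical to UCS-2 */
        return 1;
    }
    if (code > 0x10ffff)
        return 0;               /* beyond the range UTF-16 can address */

    code -= 0x10000;            /* 20 bits remain */
    out[0] = (uint16_t)(0xd800 | (code >> 10));   /* high surrogate */
    out[1] = (uint16_t)(0xdc00 | (code & 0x3ff)); /* low surrogate */
    return 2;
}
```

A BMP character therefore occupies the same single unit as in UCS-2,
which is why UCS-2 software can pass UTF-16 text through unchanged.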
.PP
The UCS characters 0x0000 to 0x007f are identical to those of the
classic
.B US-ASCII
character set and the characters in the range 0x0000 to 0x00ff
are identical to those in
.BR "ISO 8859-1 Latin-1" .
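.PP
One consequence is that converting ISO 8859-1 text to UCS needs no
table at all: each byte value is already the UCS code value.
A minimal sketch, with the hypothetical helper names latin1_to_ucs4
and ucs_is_ascii:

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Convert n bytes of ISO 8859-1 text to UCS-4 (hypothetical helper).
 * out must have room for n elements.
 */
void latin1_to_ucs4(const unsigned char *in, size_t n, uint32_t *out)
{
    for (size_t i = 0; i < n; i++)
        out[i] = in[i];   /* byte value and UCS code value coincide */
}

/* Nonzero if the UCS code value is also a US-ASCII character. */
int ucs_is_ascii(uint32_t code)
{
    return code <= 0x7f;
}
```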
.SS Combining Characters
Some code points in
.B UCS
have been assigned to
.IR "combining characters" .
These are similar to the nonspacing accent keys on a typewriter.
A combining character just adds an accent to the previous character.
The most important accented characters have codes of their own in UCS;
however, the combining character mechanism allows us to add accents
and other diacritical marks to any character.
The combining characters
always follow the character which they modify.
For example, the German
character Umlaut-A ("Latin capital letter A with diaeresis") can
either be represented by the precomposed UCS code 0x00c4, or
alternatively as the combination of a normal "Latin capital letter A"
followed by a "combining diaeresis": 0x0041 0x0308.
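.PP
The two spellings of Umlaut-A are different wide strings even though
they display alike.
The sketch below stores both and counts base characters by skipping
combining marks.
It recognizes only the Combining Diacritical Marks block (0x0300 to
0x036f), which is a simplification; UCS defines combining characters
elsewhere as well.
All names in it are hypothetical.

```c
#include <stddef.h>
#include <wchar.h>

/* "Latin capital letter A with diaeresis", precomposed. */
const wchar_t a_umlaut_precomposed[] = { 0x00c4, 0 };

/* The same letter as base character plus combining diaeresis. */
const wchar_t a_umlaut_decomposed[] = { 0x0041, 0x0308, 0 };

/* Nonzero for code values in the block 0x0300..0x036f only. */
int is_combining_diacritic(wchar_t code)
{
    return code >= 0x0300 && code <= 0x036f;
}

/* Count the code values in s that are not combining diacritics. */
size_t count_base_chars(const wchar_t *s)
{
    size_t n = 0;

    for (; *s != 0; s++)
        if (!is_combining_diacritic(*s))
            n++;
    return n;
}
```

Both strings contain one base character, but wcslen(3) reports lengths
of 1 and 2, and wcscmp(3) reports them as unequal.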
.PP
Combining characters are essential, for instance, for encoding the Thai
script, for mathematical typesetting, and for users of the International
Phonetic Alphabet.
.SS Implementation Levels
As not all systems are expected to support advanced mechanisms like
combining characters, ISO 10646-1 specifies the following three
.I implementation levels
of UCS:
.TP 0.9i
Level 1
Combining characters and
.B Hangul Jamo
(a variant encoding of the Korean script, where a Hangul syllable
glyph is coded as a triplet or pair of vowel/consonant codes) are not
supported.
.TP
Level 2
In addition to level 1, combining characters are now allowed for some
languages where they are essential (e.g., Thai, Lao, Hebrew,
Arabic, Devanagari, Malayalam).
.TP
Level 3
All
.B UCS
characters are supported.
.PP
The
.B Unicode 3.0 Standard
published by the
.B Unicode Consortium
contains exactly the
.B UCS Basic Multilingual Plane
at implementation level 3, as described in ISO 10646-1:2000.
.B Unicode 3.1
added the supplementary planes of ISO 10646-2.
The Unicode standard and
technical reports published by the Unicode Consortium provide much
additional information on the semantics and recommended usages of
various characters.
They provide guidelines and algorithms for
editing, sorting, comparing, normalizing, converting, and displaying
Unicode strings.
.SS Unicode Under Linux
Under GNU/Linux, the C type
.B wchar_t
is a signed 32-bit integer type.
Its values are always interpreted
by the C library as
.B UCS
code values (in all locales), a convention that is signaled by the GNU
C library to applications by defining the constant
.B __STDC_ISO_10646__
as specified in the ISO C99 standard.
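.PP
A program that relies on this convention can test for the constant at
compile time.
The following sketch wraps the test in two hypothetical helper
functions, wchar_is_ucs and wchar_bits; on other systems the results
may differ from those described here for GNU/Linux.

```c
#include <limits.h>
#include <wchar.h>

/*
 * Returns 1 if the C library declares that wchar_t values are
 * ISO 10646 (UCS) code values in all locales, 0 otherwise.
 */
int wchar_is_ucs(void)
{
#ifdef __STDC_ISO_10646__
    return 1;
#else
    return 0;
#endif
}

/* Width of wchar_t in bits on this platform. */
int wchar_bits(void)
{
    return (int)(sizeof(wchar_t) * CHAR_BIT);
}
```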
.PP
UCS/Unicode can be used just like ASCII in input/output streams,
terminal communication, plaintext files, filenames, and environment
variables in the ASCII compatible
.B UTF-8
multibyte encoding.
To signal the use of UTF-8 as the character
encoding to all applications, a suitable
.I locale
has to be selected via environment variables (e.g.,
"LANG=en_GB.UTF-8").
.PP
The
.B nl_langinfo(CODESET)
function returns the name of the selected encoding.
Library functions such as
.BR wctomb (3)
and
.BR mbsrtowcs (3)
can be used to transform the internal
.I wchar_t
characters and strings into the system character encoding and back,
and
.BR wcwidth (3)
reports how many positions (0\(en2) the cursor is advanced by the
output of a character.
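.PP
A minimal sketch of how a program might use two of these calls is
given below.
It assumes the program selects its locale from the environment with
setlocale(3); the wrapper names current_codeset and encode_wide_char
are hypothetical, and the encoding name reported depends on the
locales installed on the system.

```c
#include <langinfo.h>
#include <limits.h>
#include <locale.h>
#include <stdlib.h>

/*
 * Select the locale named by the environment (LANG, LC_ALL, ...) and
 * return the name of its character encoding, for example "UTF-8".
 * If setlocale() fails, the previous locale stays in effect.
 */
const char *current_codeset(void)
{
    setlocale(LC_ALL, "");
    return nl_langinfo(CODESET);
}

/*
 * Convert one wide character to the multibyte encoding of the current
 * locale. buf must have room for MB_LEN_MAX bytes.
 * Returns the number of bytes stored, or -1 if wc has no
 * representation in that encoding.
 */
int encode_wide_char(wchar_t wc, char *buf)
{
    wctomb(NULL, 0);          /* reset any shift state */
    return wctomb(buf, wc);
}
```

In a UTF-8 locale, encode_wide_char would be expected to store more
than one byte for code values above 0x007f, while ASCII characters
always occupy a single byte.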
.PP
Under Linux, in general only the BMP at implementation level 1 should
be used at the moment.
Up to two combining characters per base
character for certain scripts (in particular Thai) are also supported
by some UTF-8 terminal emulators and ISO 10646 fonts (level 2), but in
general precomposed characters should be preferred where available
(Unicode calls this
.BR "Normalization Form C" ).
.SS Private Area
In the
.BR BMP ,
the range 0xe000 to 0xf8ff will never be assigned to any characters by
the standard and is reserved for private usage.
For the Linux
community, this private area has been subdivided further into the
range 0xe000 to 0xefff, which can be used individually by any end user,
and the Linux zone in the range 0xf000 to 0xf8ff, where extensions are
coordinated among all Linux users.
The registry of the characters
assigned to the Linux zone is currently maintained by H. Peter Anvin
<Peter.Anvin@linux.org>.
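.PP
The subdivision above amounts to two range checks.
A small sketch follows; the enumeration private_zone and the function
classify_private are hypothetical names chosen for this example.

```c
#include <stdint.h>

/* Hypothetical labels for the parts of the BMP private area. */
enum private_zone {
    NOT_PRIVATE,     /* outside 0xe000..0xf8ff */
    END_USER_ZONE,   /* 0xe000..0xefff: free for individual use */
    LINUX_ZONE       /* 0xf000..0xf8ff: coordinated among Linux users */
};

/* Classify a UCS code value against the private area ranges. */
enum private_zone classify_private(uint32_t code)
{
    if (code >= 0xe000 && code <= 0xefff)
        return END_USER_ZONE;
    if (code >= 0xf000 && code <= 0xf8ff)
        return LINUX_ZONE;
    return NOT_PRIVATE;
}
```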
.SS Literature
.TP 0.2i
*
Information technology \(em Universal Multiple-Octet Coded Character
Set (UCS) \(em Part 1: Architecture and Basic Multilingual Plane.
International Standard ISO/IEC 10646-1, International Organization
for Standardization, Geneva, 2000.
.sp
This is the official specification of
.BR UCS .
Available as a PDF file on CD-ROM from http://www.iso.ch/.
.TP
*
The Unicode Standard, Version 3.0.
The Unicode Consortium, Addison-Wesley,
Reading, MA, 2000, ISBN 0-201-61633-5.
.TP
*
S. Harbison, G. Steele. C: A Reference Manual. Fourth edition,
Prentice Hall, Englewood Cliffs, 1995, ISBN 0-13-326224-3.
.sp
A good reference book about the C programming language.
The fourth
edition covers the 1994 Amendment 1 to the ISO C90 standard, which
adds a large number of new C library functions for handling wide and
multibyte character encodings, but it does not yet cover ISO C99,
which improved wide and multibyte character support even further.
.TP
*
Unicode Technical Reports.
.RS
http://www.unicode.org/unicode/reports/
.RE
.TP
*
Markus Kuhn: UTF-8 and Unicode FAQ for Unix/Linux.
.RS
http://www.cl.cam.ac.uk/~mgk25/unicode.html
.sp
Provides subscription information for the
.I linux-utf8
mailing list, which is the best place to look for advice on using
Unicode under Linux.
.RE
.TP
*
Bruno Haible: Unicode HOWTO.
.RS
ftp://ftp.ilog.fr/pub/Users/haible/utf8/Unicode-HOWTO.html
.RE
.SH BUGS
When this man page was last revised, the GNU C Library support for
.B UTF-8
locales was mature and XFree86 support was in an advanced state, but
work on making applications (most notably editors) suitable for use in
.B UTF-8
locales was still in progress.
Current general
.B UCS
support under Linux usually provides for CJK double-width characters
and sometimes even simple overstriking combining characters, but
usually does not include support for scripts with right-to-left
writing direction or ligature substitution requirements such as
Hebrew, Arabic, or the Indic scripts.
These scripts are currently only
supported in certain GUI applications (HTML viewers, word processors)
with sophisticated text rendering engines.
.\" .SH AUTHOR
.\" Markus Kuhn <mgk25@cl.cam.ac.uk>
.SH SEE ALSO
.BR setlocale (3),
.BR charsets (7),
.BR utf-8 (7)