
Ken Lunde

Beijing

Cambridge

Köln

India

Mexico

Paris

Sebastopol

Taipei

Tokyo

Copyright
Chinese Edition 2002 Copyright by O'Reilly & Associates, Inc.
Taiwan Branch
Published by O'Reilly & Associates, Inc.
All rights reserved, including the right of reproduction in whole or in part in any form.

Trademarks
All brand names and product names used in this book are trademarks,
registered trademarks, or trade names of their respective holders.







1.

Template

Writing System

ASCII
Unicode

ASCII Unicode

QWERTY

* Multiple-byte Text3719

1 :

ASCII
ASCII ASCII
ASCII
Notation
Byte Order

X Syllabary
108 1,600

E U C

EUC
EUC

EUC Extended Unix Code


4

E U C

* Orthography

1-1

1-1:

encoding

a
enkōdingu

text

tekisuto

support

sapōto

E U C

1-2
1-2:

no

waa

to

ga

shite-iru
shimasu

ha wa

EUC


ideographs pictographs logographs*





1-3

1-3:

nado

hōhō

nihongo

eigo

konkō

1-4

1-4:

Quốc ngữ, chữ Nôm, chữ Hán

1-5 2

1-5:

30% 60%
10%

200

90% 1,000

11

Character set
52


1,945 *

ASCII

ASCII ISO 8859-1:1998


ASCII 94
42 1,945
JIS X 0208:1997
6,879 1978, 1983, 1990, 1997

JIS X 0212-1990 6,067


encoding

implementation

encoding method ISO-2022-KR
EUC-KR Johab UHC

data Bit
on off 1 0
Byte 128

256

63 1978

16 65,536
256×256 (Figure 1-1)

Figure 1-1: 256×256 matrix; both axes run from 0 to 255

65,536 Text Stream
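The bit and byte counts above (128, 256, and 65,536 values) follow directly from powers of two. A minimal sketch of that arithmetic; the helper name `code_points` is hypothetical and used only for illustration:

```python
def code_points(bits: int) -> int:
    """Return how many distinct values fit in the given number of bits."""
    return 2 ** bits


for bits in (7, 8, 16):
    print(f"{bits:>2} bits -> {code_points(bits):,} values")

# A 16-bit code space can also be pictured as a 256 x 256 matrix.
matrix_cells = code_points(8) * code_points(8)
```

Seven bits cover the 128 ASCII positions, one full byte covers 256, and two bytes together cover the 256×256 matrix.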



ISO-2022-JP
8,836 (94×94) *

ASCII

row cell

* (Code space) 256×256 4D
256×256


1-2

03

0E

0E

03

1-2:

ISO-2022-JP, Shift-JIS, EUC-JP; ISO-2022-JP

shifting character escape sequence
Shift-JIS EUC-JP
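The difference between an encoding that relies on escape sequences (ISO-2022-JP) and encodings that do not (Shift-JIS, EUC-JP) can be observed with Python's built-in codecs. This is only a sketch: the sample string and the variable names are assumptions chosen for illustration, not taken from the text.

```python
text = "日本語"  # sample string (an assumption); any JIS X 0208 text would do

iso = text.encode("iso2022_jp")
sjis = text.encode("shift_jis")
euc = text.encode("euc_jp")

for name, data in (("ISO-2022-JP", iso), ("Shift-JIS", sjis), ("EUC-JP", euc)):
    print(f"{name:<12} {data.hex(' ').upper()}")

# ISO-2022-JP brackets the two-byte run with escape sequences.
KANJI_IN = b"\x1b$B"  # ESC $ B: shift into JIS X 0208-1983
ASCII_IN = b"\x1b(B"  # ESC ( B: shift back to ASCII
has_escapes = iso.startswith(KANJI_IN) and iso.endswith(ASCII_IN)

# EUC-JP carries the same two-byte codes with the high bit set on each byte.
payload = iso[len(KANJI_IN):-len(ASCII_IN)]
euc_from_jis = bytes(b | 0x80 for b in payload)
```

The ISO-2022-JP output is longer because of the two escape sequences, while the Shift-JIS and EUC-JP outputs contain no escape character at all.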

Input Method

1.

2.


k a

70%

key-value


kan ji
1-6
1-6:


K A N

J I

1-7
1-7:


K A N J I



90 M em-square* 7

X
GB GB/T
GB GB 2312-80
GB Guo Biao

* M em-square M
design space


GB/T GB Traditional Chinese


T TraditionGB/TTTui

GBK GB 2312-80 K Kuozhan
JIS, JISC, JSA

JIS: Japanese Industrial Standard*
JISC: Japanese Industrial Standards Committee
JSA: Japanese Standards Association

On March 1, 1987, the JIS C designations were changed to JIS X, as listed in Table 1-8.

Table 1-8: JIS C to JIS X designation changes

JIS C         JIS X
JIS C 6220    JIS X 0201
JIS C 6228    JIS X 0202
JIS C 6225    JIS X 0207
JIS C 6226    JIS X 0208
JIS C 6233    JIS X 6002
JIS C 6235    JIS X 6003
JIS C 6236    JIS X 6004
JIS C 6232    JIS X 9051
JIS C 6234    JIS X 9052

* JIS


KS
KS KS

KS XX
*
On August 20, 1997, the KS C designations were changed to KS X, as listed in Table 1-9.

Table 1-9: KS C to KS X designation changes

KS C          KS X
KS C 5601     KS X 1001
KS C 5657     KS X 1002
KS C 5636     KS X 1003
KS C 5620     KS X 1004
KS C 5700     KS X 1005-1
KS C 5861     KS X 2901
KS C 5715     KS X 5002

VISCII VSCII TCVN


VISCII, VSCII (Vietnamese Standard Code for Information
Interchange); VISCII: RFC 1456
VSCII: TCVN 5712:1993 (VN2); VSCII
ISO IR 180 3 65 VISCII VSCII
S VISCII VSCII

TCVN: Tiêu Chuẩn Việt Nam; GB, JIS, KS, TCVN

Internationalization I18N I
N 18
Localization L10N
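Abbreviations such as I18N and L10N keep the first and last letters of a word and replace everything in between with a count of the omitted letters. A minimal sketch of that rule; the function name `numeronym` is hypothetical:

```python
def numeronym(word: str) -> str:
    """Abbreviate a word as first letter + count of inner letters + last letter."""
    if len(word) < 3:
        # Nothing to count between the first and last letters.
        return word.upper()
    return f"{word[0]}{len(word) - 2}{word[-1]}".upper()


for word in ("internationalization", "localization", "Japanization"):
    print(f"{word} -> {numeronym(word)}")
```

"Internationalization" has 20 letters, so 18 of them sit between the I and the N.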

* KS B D A
CJKV6N, C10N, G11N, K11N, M17N, S32S, V12N


Japanization J10N L10N

I18N L10N

locale model multilingual


model

Row-Cell*

1 94 1 94
01-01
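Row-Cell values run from 1 to 94 on each axis, which gives 94×94 = 8,836 positions. The sketch below assumes the JIS X 0208 character set and shows how a Row-Cell pair might be turned into encoded bytes; the function names are hypothetical, and the Shift-JIS conversion is left out because it needs a more involved formula.

```python
def row_cell_to_jis(row: int, cell: int) -> bytes:
    """Map a Row-Cell pair (1..94 each) to a two-byte JIS (7-bit) code."""
    if not (1 <= row <= 94 and 1 <= cell <= 94):
        raise ValueError("row and cell must both be in the range 1..94")
    # Adding 0x20 moves each value into the printable range 0x21..0x7E.
    return bytes((row + 0x20, cell + 0x20))


def row_cell_to_euc(row: int, cell: int) -> bytes:
    """Map a Row-Cell pair to EUC by setting the high bit of each JIS byte."""
    return bytes(b | 0x80 for b in row_cell_to_jis(row, cell))


total_positions = 94 * 94
first_jis = row_cell_to_jis(1, 1)
first_euc = row_cell_to_euc(1, 1)
print(total_positions, first_jis.hex().upper(), first_euc.hex().upper())
```

Position 01-01 therefore becomes 0x2121 in the 7-bit form and 0xA1A1 in EUC.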

Character
Glyph
f i fi fi
ligature

S G

S F
S H

S I

Typeface

* /


Unicode
DTP Center Biblos

h klijmno
Enfour Media 18


1,945 JSA
JIS X 9051-1984* JIS X 9052-1983 JIS X 0208-1983
JSA JSA

FDPC
FDPC, MITI, JSA
Heisei
Heisei Mincho W3 (W3), Heisei Kaku Gothic W5 (W5), W3
3 W3 9 W9 48 Heisei Maru Gothic

32×32 GB 2312-80 (Table 1-10)

Table 1-10: GB 2312-80 32×32 bitmap standards

Standard        Pages     Bitmap size
GB 6345.1-86    1-27      32×32
GB 6345.2-86    28-31     32×32
GB 12034-89     32-55     32×32
GB 12035-89     56-79     32×32
GB 12036-89     80-103    32×32

* JIS C 6232-1984
JIS C 6234-1983


GB 6345.1-86
pixel
bitmap pattern
GB 16794.1-1997 48
1-10 GB GBK 48×48
GB 5007.1-85 24×24 GB 5007.2-85
24×24 GB 16794.1-1997
24×24
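Because a bitmap pattern stores one bit per pixel, the storage needed for a single glyph follows from its dimensions. A rough sketch for the 24×24, 32×32, and 48×48 sizes named in these standards; the helper `bitmap_bytes` is hypothetical, and real font files may add padding or headers that are not modeled here:

```python
def bitmap_bytes(width: int, height: int) -> int:
    """Bytes needed for a 1-bit-per-pixel glyph, rounded up to whole bytes."""
    return (width * height + 7) // 8


for size in (24, 32, 48):
    print(f"{size}x{size}: {bitmap_bytes(size, size)} bytes per glyph")
```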
1-11 ISO
Unicode glyph image

1-11: ISO Unicode

ISO

Unicode a
1

1
2

a The Unicode Standard, Version 2.0 (Addison-Wesley, 1996)
b ISO 10646-1:1993
c ISO 9541-1:1991

Typeface Style Font


point size outline font

outline font instance
serif sans serif script
1-12
Table 1-12: CJKV typeface styles

Western       Chinese     Japanese                 Korean
Serif         Song        Mincho                   Myeongjo
Sans Serif    Hei         Gothic                   Gothic
Script        Kai         Kaisho, Gyosho, Sosho    Haeseo, Haengseo, Choseo
              Fangsong    Kyokasho

1-12

Half-Width, Full-Width*

1-13

1-13:

12345

1-14

1-14:
ASCII

ISO-2022-JP

Shift-JIS

EUC-JP

ISO 10646-1:1993
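The distinction between half-width and full-width forms of the same digits can be examined with Python's `unicodedata` module and its Japanese codecs. This is a sketch under assumptions: the helper `to_fullwidth` is hypothetical, and it relies on the Unicode full-width forms U+FF01 through U+FF5E sitting at a fixed offset of 0xFEE0 from ASCII 0x21 through 0x7E.

```python
import unicodedata

FULLWIDTH_OFFSET = 0xFEE0  # U+FF01..U+FF5E mirror ASCII 0x21..0x7E


def to_fullwidth(text: str) -> str:
    """Replace printable ASCII characters with their full-width counterparts."""
    out = []
    for ch in text:
        if ch == " ":
            out.append("\u3000")  # ideographic (full-width) space
        elif "!" <= ch <= "~":
            out.append(chr(ord(ch) + FULLWIDTH_OFFSET))
        else:
            out.append(ch)
    return "".join(out)


half = "12345"
full = to_fullwidth(half)

for ch in (half[0], full[0]):
    print(
        f"U+{ord(ch):04X}",
        f"width={unicodedata.east_asian_width(ch)}",
        f"Shift-JIS={ch.encode('shift_jis').hex().upper()}",
        f"EUC-JP={ch.encode('euc_jp').hex().upper()}",
    )
```

The half-width digit occupies one byte in both encodings, while its full-width counterpart occupies two.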

* / /


26 Roman
LatinISO

Roman italic

notation
1-15

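Table 1-15: Decimal 100 in different notations

Notation                 Digits           Value
Binary (base 2)          0 and 1          01100100
Octal (base 8)           0-7              144
Decimal (base 10)        0-9              100
Hexadecimal (base 16)    0-9 and A-F      64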
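The four notations for decimal 100 can be reproduced with Python's `format` function. A small sketch; the dictionary name is an assumption used only for illustration:

```python
value = 100

representations = {
    "binary": format(value, "08b"),   # eight binary digits, zero padded
    "octal": format(value, "o"),
    "decimal": format(value, "d"),
    "hexadecimal": format(value, "X"),
}

for name, digits in representations.items():
    print(f"{name:<12} {digits}")

# Parsing each string in its own base leads back to the same number.
round_trip = [int(representations["binary"], 2),
              int(representations["octal"], 8),
              int(representations["hexadecimal"], 16)]
```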

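Octet

The 16-bit value 0110010001011111 splits into two octets, 01100100 and 01011111,
which are decimal 100 (0x64) and 95 (0x5F).
Taken together, the 16 bits equal 25,695 (0x645F),
that is, (100 × 256) + 95.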


100 95 1-16

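Table 1-16: Decimal 100 and 95 in different notations

Notation       First octet    Second octet    Both (16 bits)
Binary         01100100       01011111        0110010001011111
Octal          144            137             62137
Decimal        100            95              25695
Hexadecimal    64             5F              645F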
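Joining two octets into one 16-bit value amounts to multiplying the first octet by 256 (a shift of eight bits) and adding the second. A minimal sketch with the octets 100 and 95; the variable names are assumptions:

```python
high = 0b01100100  # first octet: decimal 100, 0x64
low = 0b01011111   # second octet: decimal 95, 0x5F

# Shifting left by eight bits is the same as multiplying by 256.
combined = (high << 8) | low

print(format(combined, "016b"))  # binary
print(format(combined, "o"))     # octal
print(combined)                  # decimal
print(format(combined, "X"))     # hexadecimal
```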

256

Little-Endian, Big-Endian
* char

Little-endian: VAX, Intel (MS-DOS, Windows)

Big-endian: Motorola, Sun (MacOS, Unix)

1-17
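Table 1-17: Little-endian and big-endian byte order

Notation       High byte    Low byte    Little-endian       Big-endian
Binary         01100100     01011111    0101111101100100    0110010001011111
Hexadecimal    64           5F          5F64                645F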

The four bytes 0x64, 0x5F, 0x7E, 0xA1 read as 0xA17E5F64 in little-endian order
and as 0x645F7EA1 in big-endian order.
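The two readings of the same four bytes can be checked with `int.from_bytes`, which takes the byte order as an explicit argument. A short sketch; `sys.byteorder` reports the order used by the machine running it:

```python
import sys

data = bytes([0x64, 0x5F, 0x7E, 0xA1])

# The same four bytes, interpreted under each byte order.
little = int.from_bytes(data, "little")
big = int.from_bytes(data, "big")

print(f"little-endian: 0x{little:08X}")
print(f"big-endian:    0x{big:08X}")
print(f"this machine:  {sys.byteorder}-endian")
```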
* Gulliver's Travels


endian
Unicode 0x0020
0x2000
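The 0x0020 and 0x2000 pair illustrates what happens when 16-bit Unicode text is read with the wrong byte order. The sketch below uses Python's `utf-16-be` and `utf-16-le` codecs as stand-ins for big-endian and little-endian 16-bit text; that mapping is an assumption made for illustration.

```python
# The Unicode space character, written out in big-endian order.
space_be = " ".encode("utf-16-be")

# Reading those same two bytes as little-endian swaps them.
misread = space_be.decode("utf-16-le")

print(space_be.hex().upper())     # bytes as stored
print(f"U+{ord(misread):04X}")    # code point seen by the wrong-endian reader
```

The reader with the wrong byte order sees U+2000 instead of U+0020, so even plain spaces come out as different characters.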

ANSI C multiple-byte
character wide character

1-18

1-18:

ASCII

ISO-2022

EUC
GBK

Big5

Big5+

Shift-JIS

Johab

UHC

UTF-8

1-19
1-19:

UCS-2
UCS-4
UTF-16
Unicode 2.0

UTF-16
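The contrast between variable-length multiple-byte encodings and fixed-width wide-character encodings shows up in how many bytes each character takes. This sketch uses Python's `utf-32-be` codec as a stand-in for UCS-4, and the sample characters are assumptions chosen for illustration.

```python
samples = ["A", "日", "\U00020000"]  # ASCII, a BMP ideograph, a character beyond the BMP

sizes = {}
for ch in samples:
    sizes[ch] = (
        len(ch.encode("utf-8")),      # variable length
        len(ch.encode("utf-16-be")),  # two bytes, or four via a surrogate pair
        len(ch.encode("utf-32-be")),  # always four bytes
    )
    print(f"U+{ord(ch):04X}  UTF-8={sizes[ch][0]}  UTF-16={sizes[ch][1]}  UTF-32={sizes[ch][2]}")

# UTF-16 reaches beyond 16 bits by pairing two 16-bit units.
surrogate_pair = "\U00020000".encode("utf-16-be")
```

UTF-8 grows with the code point, UTF-32 never changes, and UTF-16 stays at two bytes until a character falls outside the first 65,536 code points.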
