Character encoding is essential for both machine-to-machine and human-machine communication. Over the years, encodings have been improved and extended, and several standards are in use today.
Parameter | ASCII | ISO Latin 1 | ANSI | UTF-8 |
---|---|---|---|---|
Bits per character | 7 | 8 | 8 | 8-32 |
Number of characters | 95 | 190 | xx | xx |
Range | 0–127 | 0–255 | 0–255 | 0–1114111 |
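The ranges in the table can be probed with Python's built-in codecs. This is only a minimal sketch: `try_encode` is a hypothetical helper introduced here, and "ANSI" is assumed to mean Windows-1252 (Python codec `cp1252`), which the text above does not confirm.

```python
from typing import Optional


def try_encode(text: str, codec: str) -> Optional[bytes]:
    """Return the encoded bytes, or None if the codec cannot represent the text.

    Hypothetical helper for illustration only.
    """
    try:
        return text.encode(codec)
    except UnicodeEncodeError:
        return None


# "ANSI" is assumed to be Windows-1252 here (an assumption, see the lead-in).
CODECS = ["ascii", "latin-1", "cp1252", "utf-8"]

# U+0041 (A), U+00E9 (e with acute accent), U+20AC (euro sign)
for ch in ["A", "\u00e9", "\u20ac"]:
    for codec in CODECS:
        encoded = try_encode(ch, codec)
        shown = encoded.hex(" ") if encoded is not None else "not representable"
        print(f"U+{ord(ch):04X} in {codec}: {shown}")
```

Under these assumptions, the accented letter fits in one octet in the 8-bit encodings but not in ASCII, and the euro sign is representable in Windows-1252 and UTF-8 but not in ASCII or ISO Latin 1.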
The ASCII (American Standard Code for Information Interchange) encoding
Xxxxx
Xxxxx
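The UTF-8 (8-bit Unicode Transformation Format) encoding is commonly used in web pages and in XML data. The format is backwards compatible with ASCII: every valid ASCII text is also valid UTF-8 with the same meaning. At the same time it enables the use of characters for, in principle, all languages.
The encoding uses a variable number of 8-bit blocks, or octets, to represent a character. From one to four octets can be used. The ASCII characters (0–127) are preserved as single octets. The additional ISO Latin 1 characters (128–255) keep their numeric values as Unicode code points, but UTF-8 encodes each of them with two octets, so ISO Latin 1 text containing such characters is not byte-compatible with UTF-8.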
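The variable-length scheme can be sketched in a few lines of Python. The function `utf8_encode` is a hypothetical helper written for illustration (in practice one would simply call `str.encode("utf-8")`); the bit patterns in the comments follow the standard UTF-8 layout.

```python
def utf8_encode(code_point: int) -> bytes:
    """Encode one Unicode code point into 1 to 4 octets (illustrative helper)."""
    if not 0 <= code_point <= 0x10FFFF or 0xD800 <= code_point <= 0xDFFF:
        raise ValueError("not an encodable Unicode code point")
    if code_point < 0x80:
        # Up to 7 bits: 0xxxxxxx (identical to ASCII)
        return bytes([code_point])
    if code_point < 0x800:
        # Up to 11 bits: 110xxxxx 10xxxxxx
        return bytes([0xC0 | (code_point >> 6),
                      0x80 | (code_point & 0x3F)])
    if code_point < 0x10000:
        # Up to 16 bits: 1110xxxx 10xxxxxx 10xxxxxx
        return bytes([0xE0 | (code_point >> 12),
                      0x80 | ((code_point >> 6) & 0x3F),
                      0x80 | (code_point & 0x3F)])
    # Up to 21 bits: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    return bytes([0xF0 | (code_point >> 18),
                  0x80 | ((code_point >> 12) & 0x3F),
                  0x80 | ((code_point >> 6) & 0x3F),
                  0x80 | (code_point & 0x3F)])


# U+0041 (A), U+00E9 (e acute), U+20AC (euro sign), U+1F600 (an emoji)
for ch in "A\u00e9\u20ac\U0001F600":
    encoded = utf8_encode(ord(ch))
    # Cross-check the sketch against Python's built-in encoder.
    assert encoded == ch.encode("utf-8")
    print(f"U+{ord(ch):04X} -> {len(encoded)} octet(s): {encoded.hex(' ')}")
```

The four sample characters need one, two, three, and four octets respectively. Note that U+00E9, which is a single octet in ISO Latin 1, takes two octets here.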
There is also a variant named UTF-16 that uses 16-bit blocks, one or two per character, but it is much less common than UTF-8 for web pages and data exchange.
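As a rough comparison, the following sketch counts how many blocks each format needs for a few sample characters. `utf16_units` and `utf8_octets` are hypothetical helpers; they rely only on Python's built-in `utf-16-le` and `utf-8` codecs.

```python
def utf16_units(text: str) -> int:
    """Number of 16-bit blocks UTF-16 needs for the text (illustrative helper)."""
    # "utf-16-le" writes no byte order mark, so every two bytes are one block.
    return len(text.encode("utf-16-le")) // 2


def utf8_octets(text: str) -> int:
    """Number of 8-bit blocks UTF-8 needs for the text (illustrative helper)."""
    return len(text.encode("utf-8"))


# U+0041 (A), U+20AC (euro sign), U+1F600 (an emoji)
for ch in "A\u20ac\U0001F600":
    print(f"U+{ord(ch):04X}: {utf8_octets(ch)} octet(s) in UTF-8, "
          f"{utf16_units(ch)} 16-bit block(s) in UTF-16")
```

Characters above U+FFFF need two 16-bit blocks in UTF-16, so it is also a variable-length encoding, and plain ASCII text takes twice as many bytes as in UTF-8.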