UTF-8  
 
A variable-width encoding scheme that represents any Unicode character as a sequence of one to four 8-bit bytes.

The encoding scheme, while not always the most compact, includes a variety of features to improve compatibility with older schemes. The first 128 Unicode code points, for example, correspond to ASCII and are each encoded as a single byte with the same value, so ASCII-encoded text is byte-for-byte identical in UTF-8. UTF-8 also has features to promote error trapping and resynchronization. Resynchronization is needed because any Unicode code point above the first 128 spans multiple bytes in UTF-8 encoded text. Lead bytes and continuation bytes use distinct bit patterns, so a decoder that starts reading in the middle of a character can skip forward to the next character boundary.
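The ASCII compatibility and resynchronization properties can be illustrated with a short Python sketch. This is only an illustration, not a full decoder: the helper names is_continuation and resync are hypothetical, and the sketch assumes the input is otherwise well-formed UTF-8. It relies on the fact that continuation bytes always have the bit pattern 10xxxxxx.

```python
def is_continuation(byte: int) -> bool:
    """Return True if byte has the continuation pattern 10xxxxxx."""
    return (byte & 0xC0) == 0x80


def resync(data: bytes, pos: int) -> int:
    """Return the index of the first character boundary at or after pos.

    Hypothetical helper: skips continuation bytes until a lead byte
    (or the end of the data) is reached.
    """
    while pos < len(data) and is_continuation(data[pos]):
        pos += 1
    return pos


# 'A' (U+0041) takes 1 byte, U+00E9 takes 2 bytes, U+20AC takes 3 bytes.
text = "A\u00e9\u20ac"
encoded = text.encode("utf-8")

# Pure ASCII text produces the same bytes in both encodings.
ascii_same = "hello".encode("utf-8") == "hello".encode("ascii")

# Index 2 falls in the middle of the 2-byte character; resync moves
# forward to index 3, where the 3-byte character begins.
boundary = resync(encoded, 2)
tail = encoded[boundary:].decode("utf-8")
```

Here the decoder loses only the character it landed inside; everything from the next lead byte onward decodes normally.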

UTF-8 is used on the web and is described in RFC 3629.

 
  ASCII     VT-220    
 


Content: © Copyright 2000-2007 Creativyst, Inc. (all rights reserved)

 

Record date: 2006.10.06-1634