Sure, if you're completely unconcerned with backwards compatibility. Unfortunately, the real world doesn't work out that nicely. For instance, compose a mail message in a 32-bit encoding scheme and watch it almost invariably get mangled by the mail relays and processing hubs between you and the recipient; now encode the same message in UTF8…
Incidentally, do you even know how UTF8 works? The number doesn't indicate the potential encoding range at all; it is the size in bits of the basic unit the encoding is built from. UTF8 is every bit as capable of representing the full Unicode space as any of the others. It does this by being a variable-width encoding, using from one to four bytes to encode a single character.
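To make the one-to-four-byte point concrete, here is a minimal sketch using Python's built-in `utf-8` codec. The sample characters are my own picks for illustration, not anything from the thread.

```python
# Sample characters chosen for illustration: one from each UTF-8 length class.
samples = ["A", "é", "€", "😀"]  # U+0041, U+00E9, U+20AC, U+1F600

# Number of bytes each character occupies once encoded as UTF-8.
utf8_lengths = {ch: len(ch.encode("utf-8")) for ch in samples}

for ch, n in utf8_lengths.items():
    print(f"U+{ord(ch):04X} takes {n} byte(s): {ch.encode('utf-8').hex()}")

# The highest code point in the Unicode space still fits in four bytes.
highest_len = len(chr(0x10FFFF).encode("utf-8"))
print(f"U+10FFFF takes {highest_len} byte(s)")
```

The length grows with the code point, and the top of the Unicode range is still reachable, which is the point being made above.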
Each scheme is designed to address different requirements. UTF8 is intended for when English is the dominant language, in which case it is more space efficient, or when full compatibility with the 7-bit ASCII standard is a must.
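A rough sketch of both claims, again with Python's built-in codecs. The sample string is an arbitrary choice of mine, and I use the little-endian codec variants only so that no byte-order mark gets counted in the sizes.

```python
# Arbitrary English sample text (an assumption for illustration).
text = "Hello, world!"

# Encoded size in bytes under each scheme; the "-le" variants omit the BOM.
sizes = {enc: len(text.encode(enc)) for enc in ("utf-8", "utf-16-le", "utf-32-le")}
for enc, n in sizes.items():
    print(f"{enc:10s} {n} bytes")

# For pure ASCII text, the UTF-8 bytes are identical to the ASCII bytes.
ascii_compatible = text.encode("ascii") == text.encode("utf-8")
print("Byte-for-byte identical to ASCII:", ascii_compatible)
```

For text like this, UTF8 comes out at one byte per character, and software that only understands ASCII sees exactly the bytes it expects. Text dominated by other scripts will shift the size comparison, which is why the choice depends on the requirements.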
Incidentally, can you provide some specific examples of how UTF32 can't represent Asian languages completely? I haven't come across anything yet that isn't a result of the various standards groups arguing over the best way to encode them…the technical implementation is perfectly capable, even using UTF8.
Incidentally, don't you mean UCS-4, which is also a Unicode standard?