|
UTF-32 is a method of encoding Unicode characters that uses exactly
32 bits for each character. It can be regarded as the simplest Unicode encoding, as all other Unicode Transformation
Formats are variable-length encodings. However, a notable drawback of UTF-32 is that it requires four
times the storage space of traditional single-byte (8-bit) encodings. It is therefore rarely used for external storage and is
mostly used internally, where character handling is required to be as efficient as possible.
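
The fixed width described above can be illustrated with a short Python sketch. It uses the standard "utf-32-be" codec (big-endian, no byte order mark); the helper names utf32_code_units and nth_character are hypothetical and chosen only for this illustration.

```python
def utf32_code_units(text: str) -> list[int]:
    """Return one 32-bit code unit per character (big-endian, no BOM)."""
    raw = text.encode("utf-32-be")
    return [int.from_bytes(raw[i:i + 4], "big") for i in range(0, len(raw), 4)]


def nth_character(raw: bytes, index: int) -> str:
    """Fixed width means character n always starts at byte offset 4 * n."""
    return raw[4 * index:4 * index + 4].decode("utf-32-be")


# Four characters: Latin "a", "e" with acute, the euro sign, and an emoji.
sample = "a\u00e9\u20ac\U0001F600"
raw = sample.encode("utf-32-be")

# Encoded size in bytes for each transformation format.
sizes = {enc: len(sample.encode(enc)) for enc in ("utf-8", "utf-16-be", "utf-32-be")}
```

Because every character occupies four bytes, nth_character needs no scan of the preceding text, which is the efficiency argument made above. The sizes dictionary shows the storage cost: the same four characters take more bytes in UTF-32 than in the variable-length formats.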
UTF-32 was originally a subset of the UCS-4 standard, but the Principles and
Procedures document of JTC1/SC2/WG2 states that all future assignments of characters will be constrained to the BMP or the first 14
supplementary planes, and it has removed former provisions for private-use code positions in groups 60 to 7F and in planes E0 to
FF.
Accordingly, UCS-4 and UTF-32 can now be regarded as identical, except that the UTF-32 standard has additional Unicode semantics that
must be observed.
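
As a rough illustration of the plane structure mentioned above, the following Python sketch computes the plane of a 32-bit value and checks whether it is usable as a UTF-32 code unit. The function names are hypothetical. The upper limit 0x10FFFF and the exclusion of the surrogate range D800 to DFFF are assumptions drawn from the Unicode Standard rather than from this text, and they cover only part of what "additional Unicode semantics" may entail.

```python
MAX_CODE_POINT = 0x10FFFF        # assumed upper limit, per the Unicode Standard
SURROGATES = range(0xD800, 0xE000)  # assumed: reserved for UTF-16, not characters


def plane_of(value: int) -> int:
    """Each plane holds 0x10000 code points; plane 0 is the BMP."""
    return value >> 16


def is_valid_utf32_unit(value: int) -> bool:
    """True if value lies in the Unicode code space and is not a surrogate."""
    return 0 <= value <= MAX_CODE_POINT and value not in SURROGATES
```

For example, plane_of(0x0041) is 0 (the BMP) and plane_of(0x1F600) is 1 (the first supplementary plane), while a value such as 0x110000 fits in 32 bits but is rejected by is_valid_utf32_unit.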
|