Unicode

Unicode is the worldwide standard for encoding and representing text in most of the world's writing systems, maintained by the Unicode Consortium. Unicode currently defines over 137,000 characters, covering modern and historical scripts as well as symbols and emoji. The character repertoire of the Unicode Standard is synchronized with ISO/IEC 10646, and the two are code-for-code identical.

Features

  • Unicode is implemented using several different character encodings. The Unicode Standard itself defines UTF-8, UTF-16 and UTF-32 (functionally equivalent to UCS-4); other encodings in use include UTF-7 and the obsolete UCS-2.
  • The Unicode Consortium is responsible for maintaining and publishing the Unicode standard.
  • The first 256 code points of Unicode match the ISO-8859-1 (Latin-1) character set, and the first 128 of those match standard ASCII.
  • Wikipedia has further info about Unicode and the various Unicode encodings.
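The encodings and the ASCII / ISO-8859-1 compatibility listed above can be illustrated with a short Python sketch. The sample string and the helper name `hex_bytes` are chosen here for illustration only; the codec names are the ones Python's standard library provides.

```python
def hex_bytes(data: bytes) -> str:
    """Format bytes as space-separated lowercase hex pairs."""
    return " ".join(f"{b:02x}" for b in data)

# Illustrative sample: one ASCII letter, one Latin-1 letter, one other BMP symbol.
text = "A\u00e9\u20ac"  # "A", "é", "€"

# Encode the same text with three different Unicode encodings.
encoded = {
    name: hex_bytes(text.encode(name))
    for name in ("utf-8", "utf-16-be", "utf-32-be")
}
for name, hexed in encoded.items():
    print(f"{name:10} {hexed}")

# Byte values 0-255 decoded as ISO-8859-1 give code points 0-255,
# and byte values 0-127 decoded as ASCII give code points 0-127.
latin1_matches = all(bytes([i]).decode("iso-8859-1") == chr(i) for i in range(256))
ascii_matches = all(bytes([i]).decode("ascii") == chr(i) for i in range(128))
print(latin1_matches, ascii_matches)
```

The same three characters take 6 bytes in UTF-8, 6 in UTF-16 and 12 in UTF-32, which is why the choice of encoding matters for storage and transmission.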

Visual tricks can be played with Unicode, such as upside-down text effects.

Sample

f0 9f 99 88 f0 9f 99 89 f0 9f 99 89

The bytes above represent three monkey emoji 🙈🙉🙉 encoded in UTF-8.
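As a quick check, the sample bytes can be decoded back into characters. This is a minimal Python sketch using only the standard library; the variable names are illustrative.

```python
import unicodedata

# The hex bytes from the sample above.
sample = bytes.fromhex("f0 9f 99 88 f0 9f 99 89 f0 9f 99 89")

# Each emoji occupies four bytes in UTF-8, so 12 bytes decode to 3 characters.
decoded = sample.decode("utf-8")
code_points = [f"U+{ord(ch):04X}" for ch in decoded]

for ch, cp in zip(decoded, code_points):
    # unicodedata.name looks up the official character name, if known.
    print(cp, unicodedata.name(ch, "<unnamed>"), ch)
```

Re-encoding `decoded` with `decoded.encode("utf-8")` yields the original 12 bytes again.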