UTF-8 Encoder | Boxentriq

UTF-8 encoding online tool. UTF-8 (8-bit Unicode Transformation Format) is a variable-length character encoding that can encode every valid Unicode character. Each character is encoded using one to four bytes. Standard 7-bit ASCII characters are always encoded as a single byte, which makes UTF-8 backwards compatible with ASCII. UTF-8 is the most common Unicode encoding and is used by a majority of applications and websites.
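The variable length can be seen with a short Python sketch. The sample characters and the helper name `utf8_length` are illustrative choices, not part of the tool.

```python
# Sample characters picked (as an assumption) to cover each possible length.
samples = ["A", "é", "€", "🙈"]


def utf8_length(ch: str) -> int:
    """Return how many bytes UTF-8 needs to encode a single character."""
    return len(ch.encode("utf-8"))


for ch in samples:
    print(f"U+{ord(ch):04X} takes {utf8_length(ch)} byte(s)")

# Plain ASCII text produces identical bytes under ASCII and UTF-8.
assert "hello".encode("utf-8") == "hello".encode("ascii")
```

The last line illustrates the backwards compatibility: for pure ASCII input, both encodings yield the same bytes.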

Need to translate in the other direction? Use the UTF-8 Decoder instead.

UTF-8 Encoding Tool

This tool converts between Unicode and hexadecimal format using UTF-8 encoding.
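A comparable conversion can be sketched in a few lines of Python. The function name `to_utf8_hex` is hypothetical, and this is only an approximation of what such a tool might do, not its actual implementation.

```python
def to_utf8_hex(text: str) -> str:
    """Encode text as UTF-8 and return the bytes as space-separated hex pairs."""
    data = text.encode("utf-8")
    return " ".join(f"{byte:02x}" for byte in data)


print(to_utf8_hex("héllo"))  # prints: 68 c3 a9 6c 6c 6f
```

Note that the accented letter becomes two bytes (c3 a9), while the ASCII letters remain one byte each.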


Features

UTF-8 can start with the Byte Order Mark (BOM) EF BB BF, but it is not required or even recommended by the Unicode standard.
Prefix code: the first byte of each encoded character always indicates how many bytes in total are used to represent that character. This helps reduce decoding errors.
Self-synchronization: since the bytes are divided into leading bytes and continuation bytes, which have different value ranges, it is always possible to detect the beginning of a character. This helps reduce decoding errors.
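The prefix-code and self-synchronization properties listed above can be sketched in Python. The function names `sequence_length`, `is_continuation`, and `resync` are hypothetical helpers for illustration; a real decoder performs further validation (for example of overlong forms and surrogates).

```python
def sequence_length(lead: int) -> int:
    """Return the total byte count of a sequence, given its leading byte."""
    if lead < 0x80:               # 0xxxxxxx: single-byte ASCII
        return 1
    if 0xC2 <= lead <= 0xDF:      # 110xxxxx: two bytes (C0 and C1 are never valid)
        return 2
    if 0xE0 <= lead <= 0xEF:      # 1110xxxx: three bytes
        return 3
    if 0xF0 <= lead <= 0xF4:      # 11110xxx: four bytes (up to U+10FFFF)
        return 4
    raise ValueError(f"0x{lead:02x} is not a valid leading byte")


def is_continuation(byte: int) -> bool:
    """Continuation bytes always have the bit pattern 10xxxxxx."""
    return (byte & 0xC0) == 0x80


def resync(data: bytes, pos: int) -> int:
    """Skip forward from pos to the next byte that can start a character."""
    while pos < len(data) and is_continuation(data[pos]):
        pos += 1
    return pos


data = "a€🙈".encode("utf-8")      # 61 | e2 82 ac | f0 9f 99 88
start = resync(data, 2)           # position 2 is in the middle of the euro sign
print(start, sequence_length(data[start]))
```

Because leading and continuation bytes occupy disjoint ranges, a reader dropped into the middle of a stream needs to skip at most three bytes to find the next character boundary.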

UTF-8 encoding regularly appears in CTFs and logic puzzles. Encoded text can sometimes be recognized by a byte order mark (BOM) at the beginning. UTF-8 can start with EF BB BF, although this is neither required nor recommended by the Unicode standard. UTF-16 can start with FE FF (big-endian) or FF FE (little-endian) to indicate which form of UTF-16 is used. UTF-32 can start with 00 00 FE FF (big-endian) or FF FE 00 00 (little-endian).
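When examining unknown data in a puzzle, a BOM check can be a quick first step. The sketch below uses the BOM constants from Python's standard `codecs` module; the function name `sniff_bom` and the returned labels are illustrative assumptions. A missing BOM proves nothing, and FF FE 00 00 is inherently ambiguous (it could also be a UTF-16 little-endian BOM followed by a NUL character), so treat the result as a hint only.

```python
import codecs

# The 4-byte marks are checked first because the UTF-32 little-endian BOM
# (FF FE 00 00) begins with the UTF-16 little-endian BOM (FF FE).
BOMS = [
    (codecs.BOM_UTF32_BE, "utf-32-be"),   # 00 00 FE FF
    (codecs.BOM_UTF32_LE, "utf-32-le"),   # FF FE 00 00
    (codecs.BOM_UTF8, "utf-8"),           # EF BB BF
    (codecs.BOM_UTF16_BE, "utf-16-be"),   # FE FF
    (codecs.BOM_UTF16_LE, "utf-16-le"),   # FF FE
]


def sniff_bom(data: bytes):
    """Return a guessed encoding name based on a leading BOM, or None."""
    for bom, name in BOMS:
        if data.startswith(bom):
            return name
    return None


print(sniff_bom(bytes.fromhex("ef bb bf 68 69")))
```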

Visual tricks can be played with Unicode, such as upside-down text effects.

Sample

f0 9f 99 88 f0 9f 99 89 f0 9f 99 89

The bytes above represent three monkeys 🙈🙉🙉 encoded using UTF-8.
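To check the sample by hand, the hex string can be decoded with Python's built-in `bytes.fromhex` and `bytes.decode`. The variable names are illustrative.

```python
sample = "f0 9f 99 88 f0 9f 99 89 f0 9f 99 89"

# bytes.fromhex ignores the spaces between the hex pairs.
text = bytes.fromhex(sample).decode("utf-8")

# Each group of four bytes decodes to a single code point.
code_points = [f"U+{ord(ch):04X}" for ch in text]
print(text, code_points)
```

The twelve bytes decode to three characters: U+1F648 followed by U+1F649 twice.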