Compare UTF-8 and UTF-16 encoding side by side
UTF-8
0xxxxxxx                               1 byte (ASCII)
110xxxxx 10xxxxxx                      2 bytes
1110xxxx 10xxxxxx 10xxxxxx             3 bytes
11110xxx 10xxxxxx 10xxxxxx 10xxxxxx    4 bytes
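
The UTF-8 byte layouts above can be sketched in a few lines of Python. This is a minimal illustration rather than a production encoder: the function name `utf8_encode` is hypothetical, and the range boundaries (0x80, 0x800, 0x10000) are derived from how many payload bits (the x positions) each layout can hold.

```python
def utf8_encode(cp: int) -> bytes:
    """Encode one Unicode code point into UTF-8 bytes (illustrative sketch)."""
    # Surrogates (U+D800..U+DFFF) and values above U+10FFFF are not encodable.
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not a Unicode scalar value: {cp:#x}")
    if cp < 0x80:
        # 0xxxxxxx: 7 payload bits
        return bytes([cp])
    if cp < 0x800:
        # 110xxxxx 10xxxxxx: 5 + 6 = 11 payload bits
        return bytes([0xC0 | (cp >> 6), 0x80 | (cp & 0x3F)])
    if cp < 0x10000:
        # 1110xxxx 10xxxxxx 10xxxxxx: 4 + 6 + 6 = 16 payload bits
        return bytes([
            0xE0 | (cp >> 12),
            0x80 | ((cp >> 6) & 0x3F),
            0x80 | (cp & 0x3F),
        ])
    # 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx: 3 + 6 + 6 + 6 = 21 payload bits
    return bytes([
        0xF0 | (cp >> 18),
        0x80 | ((cp >> 12) & 0x3F),
        0x80 | ((cp >> 6) & 0x3F),
        0x80 | (cp & 0x3F),
    ])
```

For example, `utf8_encode(0x20AC)` (the euro sign) falls in the 3-byte case and returns `b'\xe2\x82\xac'`, which matches `"\u20ac".encode("utf-8")`.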
UTF-16
Direct code unit (one 16-bit unit)          2 bytes (BMP code points, up to U+FFFF)
High + low surrogate (two 16-bit units)     4 bytes (code points above U+FFFF)
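
The two UTF-16 cases above can be sketched the same way. This is an illustrative sketch under stated assumptions: the function name `utf16_encode` is hypothetical, and it returns 16-bit code units as integers instead of bytes so that byte order (big or little endian) stays out of the picture.

```python
from typing import List


def utf16_encode(cp: int) -> List[int]:
    """Encode one Unicode code point into UTF-16 code units (illustrative sketch)."""
    # Surrogates (U+D800..U+DFFF) and values above U+10FFFF are not encodable.
    if cp < 0 or cp > 0x10FFFF or 0xD800 <= cp <= 0xDFFF:
        raise ValueError(f"not a Unicode scalar value: {cp:#x}")
    if cp < 0x10000:
        # BMP code point: stored directly as a single 16-bit code unit.
        return [cp]
    # Supplementary code point: subtract 0x10000 to get a 20-bit value,
    # then split it into two 10-bit halves.
    v = cp - 0x10000
    high = 0xD800 | (v >> 10)    # high surrogate carries the top 10 bits
    low = 0xDC00 | (v & 0x3FF)   # low surrogate carries the bottom 10 bits
    return [high, low]
```

For example, U+1F600 yields the surrogate pair 0xD83D, 0xDE00 (4 bytes), while U+20AC stays a single code unit, 0x20AC (2 bytes).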