This line is written using Unicode combining characters. Unicode is a system for encoding, broadly speaking, every element of every writing system that exists today or existed in the past (dead languages), and even a few invented ones (for example, the Shavian alphabet). Unicode contains more than one hundred thousand characters [1], and each has its own number, written as U+<hexadecimal number>. For example, the Latin letter A [2] is U+0041.
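As a small illustration of the U+<hexadecimal number> notation, here is a sketch in Python. The helper name `code_point_label` is made up for this example; `ord` and `chr` are the standard built-ins.

```python
def code_point_label(ch: str) -> str:
    """Format a single character as a conventional U+XXXX label."""
    # ord() returns the character's number; pad the hex form to at least 4 digits.
    return f"U+{ord(ch):04X}"


print(code_point_label("A"))  # U+0041
print(chr(0x0041))            # A
```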
Many writing systems use diacritical marks: small signs added to a letter that produce either a new letter (e → ё) or some additional meaning for the same letter (for example, stress).
There are a great many such marks: human imagination has come up with dozens of rings, hooks, dots, strokes and so on. Fitting every combination of each mark with each possible letter into the standard would be very difficult. That is why combining characters were invented.
It works like this. Take the Latin letter A. We want to write it with a ring on top to get Å. To do this, we take the base letter A and put the combining character ˚ (U+030A, "COMBINING RING ABOVE") [3] right after it. In reality these characters are stored separately, one after the other, but the browser (or text editor) knows how to combine them, so it displays them together.
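The fact that the base letter and the mark are stored separately can be checked with a short Python sketch. It only uses the standard `unicodedata` module; the variable names are illustrative.

```python
import unicodedata

base = "A"
ring = "\u030A"          # the combining mark discussed above
combined = base + ring   # displayed as one letter with a ring on top

# The string still consists of two separate code points.
print(len(combined))                          # 2
print([f"U+{ord(c):04X}" for c in combined])  # ['U+0041', 'U+030A']
print(unicodedata.name(ring))                 # COMBINING RING ABOVE
```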
Combining characters can be attached not only above but also below: A͢, and even directly over the character: A̸. They can even go everywhere at once: Å̸͢.
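Where a mark attaches is recorded in the Unicode data as a "canonical combining class", which Python exposes through `unicodedata.combining`. The sketch below assumes the marks in the examples above are U+030A, U+0362 and U+0338; if the text used different marks, the numbers will differ.

```python
import unicodedata

# Assumed marks: ring above, double arrow below, long solidus overlay.
MARKS = ["\u030A", "\u0362", "\u0338"]

for mark in MARKS:
    # A non-zero combining class means the character attaches to a base character.
    print(f"U+{ord(mark):04X}", unicodedata.name(mark), unicodedata.combining(mark))

# All three at once on a single base letter.
everywhere = "A" + "".join(MARKS)
print(everywhere, len(everywhere))
```

In the Unicode data, class 230 means "above" and class 1 means "overlay"; other values cover the various "below" and "attached" positions.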
What happens if you put many combining characters after a single base character? You get a column of marks, like this: Å̊̊̊̊̊̊̊̊. That is exactly what Zalgo text is.
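A minimal sketch of a Zalgo generator in Python, to make the idea concrete. The function names `zalgo` and `strip_marks`, the choice of marks (U+0300 to U+0314, all "above" marks) and the default of 8 marks per character are assumptions made for this example, not a description of how any particular generator works.

```python
import random
import unicodedata

# A small pool of combining marks that attach above the base character.
MARKS = [chr(cp) for cp in range(0x0300, 0x0315)]


def zalgo(text: str, marks_per_char: int = 8, seed=None) -> str:
    """Append a stack of random combining marks to every non-space character."""
    rng = random.Random(seed)
    out = []
    for ch in text:
        out.append(ch)
        if not ch.isspace():
            out.extend(rng.choice(MARKS) for _ in range(marks_per_char))
    return "".join(out)


def strip_marks(text: str) -> str:
    """Undo the effect by dropping every character with a non-zero combining class."""
    return "".join(ch for ch in text if unicodedata.combining(ch) == 0)


noisy = zalgo("Zalgo", seed=1)
print(noisy)
print(len(noisy))          # 45: 5 letters, each followed by 8 marks
print(strip_marks(noisy))  # Zalgo
```

Note that `strip_marks` as written also removes legitimate diacritics stored as combining characters, so it is only a rough cleanup.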
The text above is written as simply as possible and contains a number of factual inaccuracies. Notes for professionals follow.
- [1] Strictly speaking, these are not characters but code points, each of which is assigned to some abstract character or given another role. There are 1,114,112 (decimal) possible code points in total, or 17 × 2^16 (from U+0000 to U+10FFFF), and most of them have not been assigned yet.
- [2] Strictly speaking, A should be called an abstract character that is assigned the code point U+0041. An abstract character carries only the idea of the letter A and does not define any specific rendering (glyph). The specific images are contained in fonts.
- [3] This is not the only way to get the abstract character Å. Two Unicode code points are assigned directly to it: LATIN CAPITAL LETTER A WITH RING ABOVE (U+00C5) and ANGSTROM SIGN (U+212B). This is far from an exception: many abstract characters both have their own code points (one or more) and can be "assembled" from combining characters. All abstract characters that have their own code points are called assigned characters.
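The numbers and the three spellings of Å from the notes can be checked with the standard `unicodedata` module. This is only a sketch; the labels in the dictionary are illustrative.

```python
import unicodedata

# Code points run from U+0000 to U+10FFFF inclusive.
TOTAL_CODE_POINTS = 0x10FFFF + 1
print(TOTAL_CODE_POINTS, 17 * 2**16)  # 1114112 1114112

forms = {
    "A + U+030A": "A\u030A",  # base letter plus combining mark
    "U+00C5": "\u00C5",       # LATIN CAPITAL LETTER A WITH RING ABOVE
    "U+212B": "\u212B",       # ANGSTROM SIGN
}

for label, s in forms.items():
    # NFC normalization maps all three spellings to the same single code point.
    nfc = unicodedata.normalize("NFC", s)
    print(label, len(s), f"U+{ord(nfc):04X}")
```

All three strings compare as different with `==` until they are normalized, which is why text comparison code often normalizes its input first.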
A similar question in English: How does Zalgo text work?