Replacing characters at the edges of each word

Question

I do not understand why the asterisks are not replaced with em tags. I asked it to find a word that starts with * and ends with *.

echo preg_replace('#\b(?<=\*)(.+)(?=\*)\b#', '<em>$1</em>', 'This text has *two* *italic* bits');

Why are only the last letters of the word displayed, not the entire word?

 echo preg_replace('#\*([a-z])+\*#', '<em>$1</em>', 'This *text* is *italic*');

Because your pattern is greedy: it consumes everything from the first asterisk to the last.
Use preg_replace('#\*(\p{L}+)\*#u', '<em>$1</em>', 'This *text* is *italic*'). Your capturing group captures only one letter at a time; the quantifier must be inside the parentheses.
@DivMan \p{L} \p{Letter}: characters considered letters; \p{M} \p{Mark}: characters that do not stand alone but only combine with other base characters (diacritics, enclosing frames, etc.); \p{Z} \p{Separator}: characters that act as separators but have no visual representation of their own (various spaces, etc.); \p{S} \p{Symbol}: various decorative elements and signs; \p{N} \p{Number}: digits; \p{P} \p{Punctuation}: punctuation marks; \p{C} \p{Other}: other characters (rarely used when working with ordinary text)

Accepted Answer · 2017-11-01T22:06:27

In \b(?<=\*)(.+)(?=\*)\b the asterisks sit inside lookaround assertions, so they are checked but not consumed and do not become part of the replaced string. In addition, (.+) is a greedy pattern: it matches everything from the first asterisk to the last. The lazy (.+?) is preferable, since it matches the text from the first asterisk only up to the next one to the right.
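To make the two effects visible, here is a minimal sketch that runs the question's pattern as written, then a lazy variant, then a variant that consumes the asterisks. The variable names are illustrative only and are not part of the original question.

```php
<?php
$text = 'This text has *two* *italic* bits';

// Original pattern: lookarounds only assert the asterisks, and (.+) is greedy.
$greedy = preg_replace('#\b(?<=\*)(.+)(?=\*)\b#', '<em>$1</em>', $text);

// Lazy quantifier: each word is matched separately, but the asterisks
// are still outside the match, so they stay in the output.
$lazy = preg_replace('#\b(?<=\*)(.+?)(?=\*)\b#', '<em>$1</em>', $text);

// Consuming the asterisks in the pattern removes them from the result.
$consumed = preg_replace('#\*(.+?)\*#', '<em>$1</em>', $text);

echo $greedy, "\n";   // This text has *<em>two* *italic</em>* bits
echo $lazy, "\n";     // This text has *<em>two</em>* *<em>italic</em>* bits
echo $consumed, "\n"; // This text has <em>two</em> <em>italic</em> bits
```

Only the last variant both limits each match to one pair of asterisks and drops the asterisks themselves.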

In #\*([a-z])+\*# the capturing group captures only one letter at a time, so $1 ends up holding just the last letter; the quantifier must be inside the group.

Use

 preg_replace('#\*([a-z]+)\*#', '<em>$1</em>', 'This *text* is *italic*');

See the online demo.
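As a quick side-by-side check, the sketch below runs both forms on the same input. It assumes the character class is meant to be the lowercase range a-z; the variable names are illustrative.

```php
<?php
$text = 'This *text* is *italic*';

// Quantifier outside the group: the group is overwritten on every
// repetition, so $1 holds only the last letter that was matched.
$outside = preg_replace('#\*([a-z])+\*#', '<em>$1</em>', $text);

// Quantifier inside the group: $1 holds the whole run of letters.
$inside = preg_replace('#\*([a-z]+)\*#', '<em>$1</em>', $text);

echo $outside, "\n"; // This <em>t</em> is <em>c</em>
echo $inside, "\n";  // This <em>text</em> is <em>italic</em>
```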

If you need to add support for all Unicode letters:

 preg_replace('#\*(\p{L}+)\*#u', '<em>$1</em>', 'This *text* is *italic*');

where \p{L} matches any Unicode letter, and the u modifier makes PCRE treat the pattern and the subject as UTF-8.
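The difference from the ASCII-only class shows up as soon as the text contains non-Latin letters. The sample string below is an assumption made for illustration, and the file is assumed to be saved as UTF-8.

```php
<?php
// Sample input mixing Cyrillic and Latin words (illustrative only).
$text = 'Это *курсив* and *text*';

// ASCII-only class: the Cyrillic word is left untouched.
$ascii = preg_replace('#\*([a-z]+)\*#', '<em>$1</em>', $text);

// \p{L} with the u modifier: letters of any script are matched.
$unicode = preg_replace('#\*(\p{L}+)\*#u', '<em>$1</em>', $text);

echo $ascii, "\n";   // Это *курсив* and <em>text</em>
echo $unicode, "\n"; // Это <em>курсив</em> and <em>text</em>
```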

If the task is to match an asterisk followed by a letter, then any text that ends with a letter, followed by a closing asterisk, you can use

 preg_replace('#\*(\p{L}(?:.*?\p{L})?)\*#u', '<em>$1</em>', 'This *text* is *italic*');

or

 preg_replace('#\*(\p{L}(?:[^*]*\p{L})?)\*#u', '<em>$1</em>', 'This *text* is *italic*');

Here \p{L}(?:[^*]*\p{L})? matches a letter, optionally followed by zero or more characters other than * and then another letter. That is, *Да, нет* will be matched, while * \\ * will not.
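A small sketch of that behaviour, comparing the pattern with the looser \*([^*]+)\* mentioned in the comments. The variable names are illustrative; note that inside a single-quoted PHP literal, \\ produces one backslash, which does not change the outcome because the span still contains no letters.

```php
<?php
$strict = '#\*(\p{L}(?:[^*]*\p{L})?)\*#u';
$loose  = '#\*([^*]+)\*#u';

// Starts and ends with a letter: matched, inner punctuation is kept.
$phrase = preg_replace($strict, '<em>$1</em>', '*Да, нет*');

// A single letter also matches, because the inner group is optional.
$single = preg_replace($strict, '<em>$1</em>', '*a*');

// No letters next to the asterisks: the strict pattern leaves it alone...
$strictNoLetters = preg_replace($strict, '<em>$1</em>', '* \\ *');

// ...while the looser pattern wraps it anyway.
$looseNoLetters = preg_replace($loose, '<em>$1</em>', '* \\ *');

echo $phrase, "\n";          // <em>Да, нет</em>
echo $single, "\n";          // <em>a</em>
echo $strictNoLetters, "\n"; // * \ *
echo $looseNoLetters, "\n";  // <em> \ </em>
```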

I had not finished... Your \*([^*]+)\* will also match * \\ *
I made it so the text changes when there are single asterisks. On the first line it works as it should, but on the second it works incorrectly: the whole line there should be in italics. Why didn't it work?
@Wiktor Stribiżew I did not understand. What did you want to convey?
My example && Your example: the result is the same for both, but mine is still faster :)
