PHP + regex typo fix

Question

$a='я пошел гулять на улицу АЗина';

I need to fix the typo in this string ("АЗина" should become "Азина") in PHP with a one-line regular expression. Elegant solutions without regex are welcome too.

Offtopic: "with one line of a regular expression" ... yeah, a single regex is not going to be enough here ...

Answer 1 · 2012-01-20T17:44:24

If you are working in UTF-8:

    $a = iconv('utf-8', 'windows-1251', 'я пошел гулять на улицу АЗина');
    $b = $a[0];
    for ($i = 1; $i < strlen($a); $i++)
        $b .= preg_match('/[A-ZА-Я]{2}/', $a[$i-1] . $a[$i]) ? strtolower($a[$i]) : $a[$i];
    echo iconv('windows-1251', 'utf-8', $b);

If the document is in CP1251:

    $a = 'я пошел гулять на улицу АЗина';
    $b = '';
    for ($i = 0; $i < strlen($a); $i++)
        $b .= ($i == 0 ? $a[0] : (preg_match('/[A-ZА-Я]{2}/', $a[$i-1] . $a[$i]) ? strtolower($a[$i]) : $a[$i]));
    echo $b;

The conversion itself is one line )

I think it would be easier to use the mb_ functions and the /u modifier for UTF-8, but thanks anyway.
Well, yes, with mbstring.func_overload enabled, option 2 will work in any encoding.
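
For reference, the mb_ and /u idea from the comment above might look roughly like the sketch below. It assumes UTF-8 input and the mbstring extension; `fixDoubleCaps` is a made-up helper name, not something from the thread. Like the loops in this answer, it lowercases every uppercase letter that directly follows another uppercase letter, so it will mangle abbreviations as well.

```php
<?php
// Sketch only: assumes UTF-8 input and the mbstring extension.
// fixDoubleCaps is a hypothetical name used for illustration.
function fixDoubleCaps(string $text): string
{
    // (?<=\p{Lu}) requires an uppercase letter right before the match,
    // \p{Lu} is the uppercase letter that gets lowercased.
    // The /u modifier makes PCRE treat the subject as UTF-8.
    return preg_replace_callback(
        '/(?<=\p{Lu})\p{Lu}/u',
        function (array $m): string {
            return mb_strtolower($m[0], 'UTF-8');
        },
        $text
    ) ?? $text; // fall back to the input if PCRE reports an error
}

echo fixDoubleCaps('я пошел гулять на улицу АЗина'), "\n";
// prints: я пошел гулять на улицу Азина
```

The replacement part is effectively a single statement, which is about as close to a one-liner as this gets.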

Answer 2 · 2012-01-20T17:44:41

IMHO:

Automatically replacing such a typo is not a good idea, because how are you going to handle abbreviations?

Therefore, I suggest something like this (JS):

    submit = function () {
        if (/[А-ЯA-Z]{2}/.test(str) && confirm('Possible typo in line X [OK - I will fix it, Cancel - not a typo]'))
            return;
        // submit the form
    }

UPD:

It can be done like this:

    <?php
    $str_in = 'Я пошел гулять на улицу АЗина';
    $str_out = preg_replace_callback(
        '/([А-Я])([А-Я])/',
        function ($match) { return $match[1] . strtolower($match[2]); },
        $str_in
    );
    echo $str_out;
    ?>

The only catch is the problem with encodings ... How to get around it, I can't say )

There is no problem with detection: [А-ЯA-Z][А-ЯA-Z][а-яa-z]
There are many names like "IEiU" and so on, so there are problems )
Згтещ Switcher has somehow built up such a database, and the OP will manage too if he really needs it ) As a last resort, check the first and last letters: if both are uppercase, it is an abbreviation.
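
The "first and last letter" check from the comment above could be sketched like this. It assumes UTF-8 input and the mbstring extension; `fixDoubleCapsKeepAbbr` is a hypothetical name. It is only a heuristic: a word typed entirely in caps by mistake would also be treated as an abbreviation and left alone.

```php
<?php
// Sketch only: assumes UTF-8 input and the mbstring extension.
// fixDoubleCapsKeepAbbr is a hypothetical name used for illustration.
function fixDoubleCapsKeepAbbr(string $text): string
{
    // Visit every run of letters (a "word") separately.
    return preg_replace_callback(
        '/\p{L}+/u',
        function (array $m): string {
            $word = $m[0];
            // First and last letters both uppercase: treat as an abbreviation.
            if (preg_match('/^\p{Lu}.*\p{Lu}$/us', $word)) {
                return $word;
            }
            // Otherwise lowercase any uppercase letter that follows another one.
            return preg_replace_callback(
                '/(?<=\p{Lu})\p{Lu}/u',
                function (array $c): string {
                    return mb_strtolower($c[0], 'UTF-8');
                },
                $word
            ) ?? $word;
        },
        $text
    ) ?? $text;
}

echo fixDoubleCapsKeepAbbr('Я учусь в МГУ на улице АЗина'), "\n";
// prints: Я учусь в МГУ на улице Азина
```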

AseN · Answer 3 · 2012-01-20T15:30:05

There is also a solution without regex: read the character code of each character and compare the one just read with the previous one; if the character just read has a code greater than or equal to the previous one, add 64 to its code. Something like that, I think ...

That's it in brief ... If this solution is of interest, I can go into the details =)

Thank you, but that is a lot of work across many lines; I want a one-liner.
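
A regex-free, character-by-character pass in the spirit of this answer might look like the sketch below. Note the swap: instead of raw code arithmetic (in ASCII and CP1251 the distance between an uppercase letter and its lowercase form is 32), it uses the mb_ functions so it works on UTF-8. It assumes PHP 7.4+ for `mb_str_split`; `fixDoubleCapsNoRegex` is a hypothetical name.

```php
<?php
// Sketch only: assumes UTF-8 input, the mbstring extension and PHP 7.4+.
// fixDoubleCapsNoRegex is a hypothetical name used for illustration.
function fixDoubleCapsNoRegex(string $text): string
{
    $result = '';
    $prevUpper = false; // was the previous ORIGINAL character uppercase?

    foreach (mb_str_split($text, 1, 'UTF-8') as $ch) {
        $lower = mb_strtolower($ch, 'UTF-8');
        // A character counts as uppercase if lowercasing changes it.
        $isUpper = ($lower !== $ch);
        // Lowercase an uppercase letter that follows another uppercase letter.
        $result .= ($prevUpper && $isUpper) ? $lower : $ch;
        $prevUpper = $isUpper;
    }

    return $result;
}

echo fixDoubleCapsNoRegex('я пошел гулять на улицу АЗина'), "\n";
// prints: я пошел гулять на улицу Азина
```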

Answer 4 · 2012-01-20T17:40:53

Here is what first came to mind:

    $a = 'я пошел гулять на улицу АЗина';
    $words = explode(" ", $a);
    for ($i = 0; $i < count($words); $i++) {
        if (preg_match("/[А-ЯA-Z]/", $words[$i][0])) {
            $words[$i] = ucfirst(strtolower($words[$i]));
        }
    }
    $a = implode(" ", $words);

It doesn't fit in one line, though ))

There will be a problem if the sentence contains commas or periods.
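
One way around the punctuation problem, as a sketch: match the words with a regex instead of splitting on spaces, keep the first letter and lowercase the rest. This assumes UTF-8 input and the mbstring extension; `normalizeCapitalizedWords` is a hypothetical name. Like the code in this answer, it lowercases the tail of every capitalized word, so abbreviations are not preserved.

```php
<?php
// Sketch only: assumes UTF-8 input and the mbstring extension.
// normalizeCapitalizedWords is a hypothetical name used for illustration.
function normalizeCapitalizedWords(string $text): string
{
    // (?<!\p{L}) : the match must start at the beginning of a word,
    // (\p{Lu})   : an uppercase first letter, kept as is,
    // (\p{L}+)   : the rest of the word, which gets lowercased.
    // Punctuation is never part of a match, so it stays where it was.
    return preg_replace_callback(
        '/(?<!\p{L})(\p{Lu})(\p{L}+)/u',
        function (array $m): string {
            return $m[1] . mb_strtolower($m[2], 'UTF-8');
        },
        $text
    ) ?? $text;
}

echo normalizeCapitalizedWords('пошел на улицу (АЗина), потом домой.'), "\n";
// prints: пошел на улицу (Азина), потом домой.
```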
