Convert text to lower case if it contains too many capital letters

Question

I need to take arbitrary text from a variable, written in a mix of lower-case and upper-case letters in an unknown proportion, and if upper-case letters make up at least X percent of the text (X is a manually specified parameter), convert the entire text to lower case.
This is needed to catch people who like to WRITE ADS IN CAPS LOCK. The percentage check is there for the sake of well-meaning users who occasionally emphasize a few words in caps.

I would like this packaged as a function.

An interesting problem ... A possible algorithm: 1. remove everything from the text except letters;
4. punish the offenders by changing the case .... Regular expressions are indispensable here.
Convert the text to lower case, compare it with the original using the Levenshtein algorithm, take the number of differing characters, and divide it by the length of the string: that gives the percentage.
The task really turned out to be quite interesting; in my opinion, ruSO is exactly the right place for it.
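The comparison idea from the comments can be sketched roughly as follows. Since lowercasing normally keeps every character in its position, a plain position-by-position comparison gives the same count as an edit distance would, so this sketch does not call `levenshtein()` at all. The function names `upperCasePercent` and `lowercaseIfShouting` are made up for illustration, and the code assumes UTF-8 input and the mbstring extension.

```php
<?php
// Hypothetical helper: percentage of characters that change when the text is lowercased.
function upperCasePercent(string $text): float
{
    // Split into UTF-8 characters; fall back to an empty list on invalid input.
    $original = preg_split('//u', $text, -1, PREG_SPLIT_NO_EMPTY) ?: [];
    $lowered  = preg_split('//u', mb_strtolower($text, 'UTF-8'), -1, PREG_SPLIT_NO_EMPTY) ?: [];

    $length = count($original);
    if ($length === 0) {
        return 0.0;
    }

    // Count positions where the lowercased character differs from the original.
    $changed = 0;
    $limit = min($length, count($lowered));
    for ($i = 0; $i < $limit; $i++) {
        if ($original[$i] !== $lowered[$i]) {
            $changed++;
        }
    }

    // Divide by the full string length, as the comment suggests.
    return 100.0 * $changed / $length;
}

// Hypothetical wrapper: lowercase the whole text once the share reaches $thresholdPercent.
function lowercaseIfShouting(string $text, float $thresholdPercent): string
{
    return upperCasePercent($text) >= $thresholdPercent
        ? mb_strtolower($text, 'UTF-8')
        : $text;
}
```

For example, `lowercaseIfShouting('ПРОДАМ ГАРАЖ', 50.0)` returns `продам гараж`, while `lowercaseIfShouting('Ёж под Ёлкой', 50.0)` returns the text unchanged. Dividing by the full length means spaces and punctuation dilute the percentage; dividing by the number of letters only would be a stricter variant.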

Answer 1 · 2016-01-15T16:18:01

This code supports UTF-8 and Russian characters. Only letters are counted; spaces, punctuation and other characters are ignored. You could also take the full length of the string, mb_strlen($text), and subtract $txtL and $txtU from it to get the number of non-letter characters and compute their ratio as well; that might also be useful ...
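A minimal sketch of that non-letter count, assuming the same character classes as the answer's code. The name `letterStats` and the shape of the returned array are hypothetical, chosen only for illustration.

```php
<?php
// Hypothetical helper: counts lowercase letters, uppercase letters and everything else.
function letterStats(string $text): array
{
    // Strip everything except lowercase (resp. uppercase) Latin and Cyrillic letters.
    $lower = mb_strlen((string) preg_replace('/[^а-яёa-z]+/u', '', $text), 'UTF-8');
    $upper = mb_strlen((string) preg_replace('/[^А-ЯЁA-Z]+/u', '', $text), 'UTF-8');
    $total = mb_strlen($text, 'UTF-8');

    // Whatever is left over: spaces, digits, punctuation, other scripts.
    $other = $total - $lower - $upper;

    return [
        'lower'      => $lower,
        'upper'      => $upper,
        'other'      => $other,
        'otherShare' => $total > 0 ? $other / $total : 0.0,
    ];
}
```

For `'Ёж под Ёлкой.'` this gives 8 lowercase letters, 2 uppercase letters and 3 other characters (two spaces and the full stop).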

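    <?php
    $text = "Необходимо это для отлова любителей СТРОЧИТЬ ОБЪЯВЛЕНИЯ КАПСЛОКОМ. Проверка процентного соотношения делается ";
    $text = magicLower($text);
    print $text;

    function magicLower($text)
    {
        // Count lowercase and uppercase letters (Latin and Cyrillic) separately.
        $txtL = mb_strlen(preg_replace("/[^а-яёa-z]+/u", "", $text));
        $txtU = mb_strlen(preg_replace("/[^А-ЯЁA-Z]+/u", "", $text));
        // Avoid division by zero when the text has no lowercase letters.
        if (!$txtL) $txtL = 0.01;
        print "LowerCase: $txtL, Upper case: $txtU rate:" . ($txtU / $txtL) . "\n";
        // Leave the text alone if capitals are under 10% of the lowercase count.
        if ($txtU / $txtL < 0.1) return $text;
        // Lowercase every run of capitals and whitespace that follows a capital.
        return preg_replace_callback(
            "/(?<=[A-ZА-ЯЁ])([A-ZА-ЯЁ\s]+)/u",
            function ($match) {
                return mb_convert_case($match[0], MB_CASE_LOWER, "UTF-8");
            },
            $text
        );
    }
    ?>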

The magicLower function lowercases only runs of consecutive capital letters (possibly separated by spaces), keeping the first letter of each run as it is. Single capital letters are not touched, so normal sentences keep their capitalization. The result of this code:

    LowerCase: 134, Upper case: 58 rate:0.43283582089552
    Необходимо это для отлова любителей Строчить объявления капслоком. Проверка процентного соотношения делается
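To see the replacement step on its own, here is a small sketch that isolates the regular expression from magicLower, without the ratio check. The name `lowerCapsRuns` is hypothetical, and the sample strings are chosen only for illustration.

```php
<?php
// Hypothetical helper isolating the replacement step of magicLower (no threshold check).
function lowerCapsRuns(string $text): string
{
    $result = preg_replace_callback(
        // The lookbehind requires a capital before the match, so the first capital of a run survives.
        '/(?<=[A-ZА-ЯЁ])[A-ZА-ЯЁ\s]+/u',
        function (array $match): string {
            return mb_convert_case($match[0], MB_CASE_LOWER, 'UTF-8');
        },
        $text
    );

    // preg_replace_callback returns null on error (for example, invalid UTF-8).
    return $result ?? $text;
}

echo lowerCapsRuns('ПРОДАМ ГАРАЖ'), "\n"; // Продам гараж
echo lowerCapsRuns('Ёж под Ёлкой'), "\n"; // Ёж под Ёлкой
echo lowerCapsRuns('КУПЛЮ Дом'), "\n";    // Куплю дом
```

The last line shows a side effect of including whitespace in the run: a capital letter that directly follows a caps run after a space is lowercased as well.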

Your code does not work for the string АБВ and works incorrectly for the string Ёж под Ёлкой.
