I have a question about the following case. Say I am parsing two sites, and they have the same category, but it is written in different ways.
For example:

Site 1: Ножи туристические
Site 2: Туристические ножи

This is the same category, just written differently. All categories are written to an array, but before writing a new one I need to check whether the array already contains it in some other spelling.

How should I handle this?

  • Split into tokens, normalize the case, compare. - Akina
  • What do you mean by tokens? - RAPOS
  • In this particular case, a word should probably be treated as a token. - Akina
  • Use morphological analysis. Open-source libraries for it can be found online. - Alexander Muksimov
  • If the categories from the example are converted to lower case, split into an array of words, sorted, and joined back, both give ножи туристические, which will not match ножи охотничьи. - Akina
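
Akina's suggestion could be sketched roughly as follows. The name category_key is made up for illustration and does not come from the thread; the sketch assumes UTF-8 input and the mbstring extension, and it treats each whitespace-separated word as one token.

```php
<?php
// Hypothetical helper (not from the thread): builds an order-insensitive,
// case-insensitive key for a category name.
function category_key(string $name): string
{
    // Multibyte-safe lower-casing, so Cyrillic letters are handled too.
    $lower = mb_strtolower($name, 'UTF-8');
    // Split on runs of whitespace; empty pieces are dropped.
    $words = preg_split('/\s+/u', $lower, -1, PREG_SPLIT_NO_EMPTY) ?: [];
    // Byte-wise ordering is enough here: only a stable order is needed.
    sort($words, SORT_STRING);
    return implode(' ', $words);
}

var_dump(category_key('Ножи туристические') === category_key('Туристические ножи')); // bool(true)
var_dump(category_key('Ножи туристические') === category_key('Ножи охотничьи'));     // bool(false)
```

Two names are treated as the same category exactly when their keys are equal. Word forms (singular and plural, cases) are not handled here; that is where the morphology suggestion would come in.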

1 answer

You can even do it without sorting the array, just by reversing its elements:

    $str_1 = 'Ножи туристические';
    $str_2 = 'Туристические ножи';

    var_dump( similar_str($str_1, $str_2) ); // bool(true)

    function similar_str(...$data) {
        // Lower-case both strings (multibyte-safe).
        [$a, $b] = array_map('mb_strtolower', $data);
        // Reverse the word order of the second string before comparing.
        $b = join(' ', array_reverse(explode(' ', $b)));
        return $a == $b;
    }

UPD: As it turned out, the array_reverse() variant is only reliable for two-word names, so I added sorting, as suggested in the comments:

    $str_1 = 'Ножи какие-то туристические';
    $str_2 = 'Туристические ножи какие-то';

    var_dump( similar_str($str_1, $str_2) ); // bool(true)

    function similar_str(...$data) {
        // Turn each string into a sorted array of lower-cased words.
        [$a, $b] = array_map(function ($i) {
            $words = explode(' ', mb_strtolower($i));
            sort($words);
            return $words;
        }, $data);
        // Arrays are equal when they hold the same words in the same order.
        return $a == $b;
    }
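
The question also asks how to check the array before writing a category into it. One possible way, as a sketch only: use the sorted, lower-cased words as the array key, so the check becomes a plain isset(). The function add_category and the layout of $categories (normalized key => first spelling seen) are assumptions made for illustration, not part of the answer above.

```php
<?php
// Hypothetical helper: stores $name unless an equivalent spelling is already there.
// $categories maps a normalized key to the first spelling that was seen.
function add_category(array &$categories, string $name): bool
{
    $words = explode(' ', mb_strtolower($name));
    sort($words);
    $key = implode(' ', $words);

    if (isset($categories[$key])) {
        return false; // same category already stored under another spelling
    }
    $categories[$key] = $name;
    return true;
}

$categories = [];
foreach (['Ножи туристические', 'Туристические ножи', 'Ножи охотничьи'] as $name) {
    add_category($categories, $name);
}

// Only 'Ножи туристические' and 'Ножи охотничьи' remain.
print_r(array_values($categories));
```

Keying the array by the normalized form avoids comparing each new category against every stored one.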