Hello. We have to work with Russian texts that contain unwanted stray characters from the Latin alphabet.

[Screenshot: Latin characters interspersed in Russian text in Sublime Text]

Since the stray characters appear in different places every time and follow no rule, searching for words in the text does not work.

The solution is to replace the Latin letters ABEKMHOPCTXaeopcyx with the corresponding Cyrillic letters АВЕКМНОРСТХаеорсух: A → А, B → В, and so on. What is the most efficient way to make all 18 substitutions in Sublime Text?

This question is a special case of a more general one: how can many substitutions be made with a single regular expression? Thanks.

  • Sublime does not seem to support ?{}, which means the substitution code will not work. If you are using Linux, you can use the tr command, which does exactly this job. - KoVadim
  • Well, because it's the same thing... What am I supposed to do now, copy-paste the answer with minimal changes? - Qwertiy

3 answers

A plugin for ST2 that does what is needed:

    # -*- coding: utf-8 -*-
    import sublime, sublime_plugin

    # Sublime derives the command name from the class name:
    # MultipleReplace -> "multiple_replace"
    class MultipleReplace(sublime_plugin.TextCommand):
        def run(self, edit):
            target = "ABEKMHOPCTXaeopcyx"     # en
            replacer = u"АВЕКМНОРСТХаеорсух"  # ru, same order as target
            region = sublime.Region(0, self.view.size())
            fullText = unicode(self.view.substr(region))
            count = 0
            for ch in range(len(target)):
                # Count the occurrences first, then replace them all
                count += fullText.count(target[ch])
                fullText = fullText.replace(target[ch], replacer[ch])
            self.view.replace(edit, region, fullText)
            sublime.status_message("Hidden chars replaced: " + str(count))

A plugin for ST3 that does what is needed:

    # -*- coding: utf-8 -*-
    import sublime, sublime_plugin

    # Sublime derives the command name from the class name:
    # MultipleReplace -> "multiple_replace"
    class MultipleReplace(sublime_plugin.TextCommand):
        def run(self, edit):
            target = "ABEKMHOPCTXaeopcyx"     # en
            replacer = u"АВЕКМНОРСТХаеорсух"  # ru, same order as target
            region = sublime.Region(0, self.view.size())
            fullText = self.view.substr(region)
            count = 0
            for ch in range(len(target)):
                # Count the occurrences first, then replace them all
                count += fullText.count(target[ch])
                fullText = fullText.replace(target[ch], replacer[ch])
            self.view.replace(edit, region, fullText)
            sublime.status_message("Hidden chars replaced: " + str(count))
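The character mapping itself can be tried outside Sublime Text. The following is only a sketch in plain Python 3, not part of the plugin: the names LATIN, CYRILLIC and latin_to_cyrillic are made up for illustration, and the letter pairs are the ones listed in the question. It swaps the loop of replace calls for a single str.translate pass.

```python
# Hypothetical helper, independent of the Sublime Text API.
# The letter pairs are taken from the question.
LATIN = "ABEKMHOPCTXaeopcyx"
CYRILLIC = "АВЕКМНОРСТХаеорсух"

# Translation table: each Latin look-alike maps to its Cyrillic counterpart.
LATIN_TO_CYRILLIC = str.maketrans(LATIN, CYRILLIC)


def latin_to_cyrillic(text):
    """Return (converted_text, number_of_replaced_characters)."""
    count = sum(text.count(ch) for ch in LATIN)
    return text.translate(LATIN_TO_CYRILLIC), count


if __name__ == "__main__":
    # Latin "M", "o" and "a" are mixed into an otherwise Cyrillic word.
    converted, count = latin_to_cyrillic("Mo\u0441\u043a\u0432a")
    print(converted, count)
```

Like the plugin, this replaces every listed Latin letter blindly, so genuinely English words in the text would be converted too.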

This evening I will describe in more detail how to install plugins in ST.


Here you can find how to install the plugin:
https://habrahabr.ru/post/136529/

  • Tools → New Plugin... → copy-pasted your code → saved it as latin_to_cyrillic.py. Preferences → Key Bindings - User, added the line { "keys": ["ctrl+j"], "command": "latin_to_cyrillic" }. I tried various hotkeys, and latin_to_cyrillic is written in Latin letters everywhere. However, it does not work, and the console shows nothing. Where am I going wrong? Thanks. - Sasha Chernykh
  • It only started working for me once I bound it to ctrl+6; I tried many combinations. - ReinRaus
  • Unfortunately, none of the combinations worked :(. I pasted in your updated version of the plugin, and it still does not work. Thank you. - Sasha Chernykh
  • Apparently the problem is not in the hotkeys. I bound your plugin to ctrl+l, a shortcut that worked for me with another plugin. Still nothing happens when I try to run it. Where else could I be going wrong? Thanks. - Sasha Chernykh
  • You should probably rename the class to LatinToCyrillic; I renamed it and it worked. Before that, I saved the file as UTF-8 without a BOM. No idea whether that is needed. Try just renaming first. - ReinRaus
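The rename helps because Sublime Text derives a command's name from the plugin class name: CamelCase becomes snake_case, and a trailing "Command" is dropped. The sketch below only approximates that rule for illustration (command_name_for is a hypothetical helper, not a Sublime Text API, and the real conversion may treat runs of capital letters differently):

```python
import re


def command_name_for(class_name):
    """Approximate the command name Sublime Text derives from a class name."""
    # A trailing "Command" is not part of the command name.
    if class_name.endswith("Command"):
        class_name = class_name[: -len("Command")]
    # Insert "_" before every capital letter except the first, then lowercase.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", class_name).lower()


if __name__ == "__main__":
    print(command_name_for("MultipleReplace"))  # name the key binding must use
    print(command_name_for("LatinToCyrillic"))
```

So a key binding with "command": "latin_to_cyrillic" finds nothing while the class is still called MultipleReplace; either rename the class or bind the key to multiple_replace.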

Just for fun

All of this can be done with a regular expression in one pass, but I still recommend the plugin. A plugin is the right way to solve the problem.

So:

  • Append a "magic" line to the end of the text, on a new line (in it, each Latin character is followed by its Cyrillic counterpart)

     AАBВEЕKКMМHНOОPРCСTТXХaаeеoоpрcсyуxх 
  • Search for the regular expression

     (.)(?=[\s\S]*\n[^\n]*\1(.)(?:[^\n]{2})*\n?(?![\s\S])) 
  • make the search case sensitive
  • replace with

     $2 
  • remove the "magic" line from the text

You can try out the regular expression here:
https://regex101.com/r/iW2yE3/1
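The same trick can be reproduced with Python's re module, which may help to see what the lookahead does. This is a sketch under assumptions: replace_with_magic_line and the constant names are invented for illustration, the magic line is built from the pairs listed in the question, and Sublime's $2 becomes \2 in Python.

```python
import re

LATIN = "ABEKMHOPCTXaeopcyx"
CYRILLIC = "АВЕКМНОРСТХаеорсух"

# The "magic" line: every Latin letter immediately followed by its Cyrillic twin.
MAGIC = "".join(lat + cyr for lat, cyr in zip(LATIN, CYRILLIC))

# Group 1 is the current character. The lookahead finds it at an even offset
# in the last line of the text and captures its neighbour as group 2.
PATTERN = re.compile(r"(.)(?=[\s\S]*\n[^\n]*\1(.)(?:[^\n]{2})*\n?(?![\s\S]))")


def replace_with_magic_line(text):
    """Append the magic line, substitute in one pass, then strip the line again."""
    padded = text + "\n" + MAGIC
    replaced = PATTERN.sub(r"\2", padded)  # matching is case sensitive by default
    return replaced[: -(len(MAGIC) + 1)]


if __name__ == "__main__":
    print(replace_with_magic_line("Mo\u0441\u043a\u0432a"))
```

Every candidate character makes the lookahead scan to the end of the text, so the running time grows roughly quadratically with file size.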


  • Cool! But terribly inefficient. I avoided multi-line mode precisely for efficiency reasons. - Qwertiy
  • @Qwertiy, and your version has a very inefficient link :) - ReinRaus
  • And what if the file is several megabytes?))) - Qwertiy
  • A plugin is the right way to solve the problem :) - ReinRaus
  • @Qwertiy, on 89 ... - Sasha Chernykh

Note

  1. The native search (Ctrl + F) in Sublime Text 3 can look for several phrases at once when regular expression mode is enabled. Query syntax: ($first phrase|$second phrase) .


However, it is not possible to make multiple replacements with a single query in the native search.

  2. The solution described in this answer makes multiple replacements in one file at a time, not in several. I cannot say I searched very thoroughly, but I did not find any program at all (I was looking for my OS, Windows) that could make multiple replacements across many files.

  3. Tested in November 2016 on Sublime Text Build 3126. I mention this because, for example, the settings described in a June 2016 article about RegReplace are already out of date.
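For the many-files case, a small standalone script is one possible workaround. The sketch below is an assumption-laden illustration rather than a tested tool: convert_files is a hypothetical name, the files are assumed to be UTF-8, and the letter pairs are the ones from the question.

```python
LATIN = "ABEKMHOPCTXaeopcyx"
CYRILLIC = "АВЕКМНОРСТХаеорсух"
TABLE = str.maketrans(LATIN, CYRILLIC)


def convert_files(paths, encoding="utf-8"):
    """Rewrite each file in place; return {path: replaced_count} for changed files."""
    changed = {}
    for path in paths:
        # newline="" keeps the original line endings untouched.
        with open(path, encoding=encoding, newline="") as handle:
            original = handle.read()
        converted = original.translate(TABLE)
        if converted != original:
            with open(path, "w", encoding=encoding, newline="") as handle:
                handle.write(converted)
            # translate() maps one character to one, so positions line up.
            changed[str(path)] = sum(a != b for a, b in zip(original, converted))
    return changed
```

It could be called, for example, as convert_files(pathlib.Path("texts").rglob("*.txt")), where the texts folder is hypothetical. The files are overwritten, so work on a copy first.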



RegReplace

A plugin for performing multiple replacements in Sublime Text 3 (see its documentation). It handles not only simple cases like the one in this question but also replacements with regular expressions, which are processed with Python's re module.



Writing replacement rules in RegReplace

Preferences → Package Settings → RegReplace → Rules - User → paste the following code into the file that opens.

    {
        "replacements": {
            "sasha_felicity_A": { "find": "A", "replace": "А" },
            "sasha_felicity_B": { "find": "B", "replace": "В" },
            "sasha_felicity_E": { "find": "E", "replace": "Е" },
            "sasha_felicity_K": { "find": "K", "replace": "К" },
            "sasha_felicity_M": { "find": "M", "replace": "М" },
            "sasha_felicity_H": { "find": "H", "replace": "Н" },
            "sasha_felicity_O": { "find": "O", "replace": "О" },
            "sasha_felicity_P": { "find": "P", "replace": "Р" },
            "sasha_felicity_C": { "find": "C", "replace": "С" },
            "sasha_felicity_T": { "find": "T", "replace": "Т" },
            "sasha_felicity_a": { "find": "a", "replace": "а" },
            "sasha_felicity_e": { "find": "e", "replace": "е" },
            "sasha_felicity_o": { "find": "o", "replace": "о" },
            "sasha_felicity_p": { "find": "p", "replace": "р" },
            "sasha_felicity_c": { "find": "c", "replace": "с" },
            "sasha_felicity_y": { "find": "y", "replace": "у" },
            "sasha_felicity_x": { "find": "x", "replace": "х" }
        }
    }

The find values are the Latin characters; the replace values are the Cyrillic ones. Instead of sasha_felicity_$letter you can give the rules any other names, as long as they do not clash with the names of other rules.

If you do not specify the IGNORECASE flag, search and replace are case sensitive. The Latin H in the text will be replaced by the Cyrillic Н , but the Latin h will not be replaced by the Cyrillic н , which is what the question requires. If you do need case-insensitive matching, the rule syntax is:

 "sasha_felicity_H": { "find": "(?i)H", "replace": "Н", }, 

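Since RegReplace hands its patterns to Python's re module, the effect of the inline (?i) flag can be checked in plain Python. A minimal sketch (the sample string is made up; the Cyrillic Н is written as an escape so it cannot be confused with the Latin H):

```python
import re

CYRILLIC_EN = "\u041d"  # Cyrillic capital letter Н, looks like Latin H

sample = "Hh"

# Without the flag only the uppercase Latin H is replaced.
case_sensitive = re.sub(r"H", CYRILLIC_EN, sample)

# With the inline (?i) flag both H and h match and both become Н.
case_insensitive = re.sub(r"(?i)H", CYRILLIC_EN, sample)

if __name__ == "__main__":
    print(case_sensitive, case_insensitive)
```

Note that with (?i) the lowercase h also turns into the uppercase Н, because the replacement string is fixed.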

Running a command with RegReplace rules

To work with commands in Sublime Text, I use the Suricate framework. It saves you from creating countless sublime-commands , sublime-keymap , Context.sublime-menu and Main.sublime-menu files: instead, a single file holds the settings that expose commands in the Command Palette, the context menu, the menu bar and hotkeys, for all commands of all plugins and for the default settings.

Add the following lines to User/Default.suricate-profile :

    "sublime_latin_to_cyrillic": {
        "call": "sublime.reg_replace",
        "caption": "RegReplace: Latin To Cyrillic",
        "args": {
            "replacements": [
                "sasha_felicity_A", "sasha_felicity_B", "sasha_felicity_E",
                "sasha_felicity_K", "sasha_felicity_M", "sasha_felicity_H",
                "sasha_felicity_O", "sasha_felicity_P", "sasha_felicity_C",
                "sasha_felicity_T", "sasha_felicity_a", "sasha_felicity_e",
                "sasha_felicity_o", "sasha_felicity_p", "sasha_felicity_c",
                "sasha_felicity_y", "sasha_felicity_x"
            ]
        }
    },

Options:

  • sublime_latin_to_cyrillic - the name of the entry. Any name will do, as long as its meaning is clear.
  • call - the Sublime Text command name, in the format sublime.$command name .
  • caption - the name of the Command Palette item that runs the command specified in call . Instead of RegReplace: Latin To Cyrillic you can use any text, as long as its meaning is clear.
  • args - the arguments of the reg_replace command. Here they are the rules defined in the previous section.

I recommend using Suricate, but the settings can also be made without installing additional plugins. Preferences → Package Settings → RegReplace → Commands - User → paste the following code into the file that opens:

    [
        {
            "caption": "Reg Replace: Test RegReplace",
            "command": "reg_replace",
            "args": {
                "replacements": [
                    "sasha_felicity_A", "sasha_felicity_B", "sasha_felicity_E",
                    "sasha_felicity_K", "sasha_felicity_M", "sasha_felicity_H",
                    "sasha_felicity_O", "sasha_felicity_P", "sasha_felicity_C",
                    "sasha_felicity_T", "sasha_felicity_a", "sasha_felicity_e",
                    "sasha_felicity_o", "sasha_felicity_p", "sasha_felicity_c",
                    "sasha_felicity_y", "sasha_felicity_x"
                ]
            }
        }
    ]

The meaning of the parameters is clear from the previous section. Take care with the JSON syntax of the configuration files: place brackets, quotes and commas correctly.
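One way to avoid typing mistakes is to generate both files from the list of letter pairs. This is only a sketch: build_rules and build_command are hypothetical helpers, the rule-name prefix and the key names are copied from the listings in this answer, and the script uses all 18 pairs from the question, including the uppercase X.

```python
import json

LATIN = "ABEKMHOPCTXaeopcyx"
CYRILLIC = "АВЕКМНОРСТХаеорсух"
PREFIX = "sasha_felicity_"  # rule-name prefix used in this answer


def build_rules():
    """Content for the RegReplace 'Rules - User' file."""
    return {
        "replacements": {
            PREFIX + lat: {"find": lat, "replace": cyr}
            for lat, cyr in zip(LATIN, CYRILLIC)
        }
    }


def build_command():
    """Content for the RegReplace 'Commands - User' file."""
    return [
        {
            "caption": "Reg Replace: Test RegReplace",
            "command": "reg_replace",
            "args": {"replacements": [PREFIX + lat for lat in LATIN]},
        }
    ]


if __name__ == "__main__":
    # ensure_ascii=False keeps the Cyrillic letters readable in the output.
    print(json.dumps(build_rules(), ensure_ascii=False, indent=4))
    print(json.dumps(build_command(), ensure_ascii=False, indent=4))
```

The printed JSON can then be pasted into the two settings files.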



Usage

Once everything is configured, open the file in which you want to make the replacements → Ctrl + Shift + P → Suricate: RegReplace: Latin To Cyrillic → Enter → the result should look like this:

[Screenshot of the result]