How do I fill a string with random lowercase Russian letters?

    8 answers

    Create an array of characters holding the Russian alphabet. Then use a random number generator to produce an index into this array and output the character at that index.

    It's the middle of the night here, and this is the first thing that came to mind )

    • Could you give an example, please? - skies
    • Well, you can fill an array with lowercase Russian letters, can't you? ))) Then read this: ci-plus-plus-snachala.ru/?p=15 - Stas0n
    • I only recently began learning the language. - skies
    • Congratulations ) So what's the problem? 1) Create a char array alphabet of 33 characters. 2) Fill it in by hand. 3) Generate a random number rand from 0 to 32 (one of the array's indices). 4) alphabet[rand] is your random character. - Stas0n
    • An array of chars? Oh, and I have Linux and UTF-8. What am I supposed to do? - alexlz
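    #include <cstdlib>
    #include <iostream>
    #include <sstream>
    #include <string>
    using namespace std;

    // Each letter is a separate string, so a letter may take several bytes (e.g. in UTF-8).
    const char * ar[] = {"а", "б", "в", "г", "д", "е", "ё", "ж", "з", "и", "й",
                         "к", "л", "м", "н", "о", "п", "р", "с", "т", "у", "ф",
                         "х", "ц", "ч", "ш", "щ", "ъ", "ы", "ь", "э", "ю", "я"};

    int main(int argc, char* argv[])
    {
        if (argc < 2) {  // the number of letters is taken from the command line
            cerr << "usage: " << argv[0] << " N" << endl;
            return 1;
        }
        int n = 0;
        string s = "";
        stringstream ss (argv[1]);
        ss >> n;
        for (int i = 0; i < n; i++)
            s += ar[rand() % (sizeof ar / sizeof (char *))];  // random index 0..32
        cout << s << endl;
        return 0;
    }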

    UPD: Corrected on the recommendation of @GLmonster.

    • Now this is clear code ) - Stas0n
    • And what is that junk code on line 5? - gammaker
    • Where exactly? - Stas0n
    • I am saying that creating 33 one-character strings in dynamic memory is nonsense. It is better to replace them with: char ar[] = "абвгдеёжзийклмнопрстуфхцчшщъыьэюя"; - gammaker
    • About the two bytes: where did you get that from? On my system WCHAR_MAX = 2147483647l, and that does not fit into two bytes. - alexlz

    There is an ANSI character table; an example of the table can be seen here.

    Each character has its own code from 0 to 255. The Russian letters (uppercase and lowercase) occupy codes 192 to 255: uppercase letters from 192 to 223 and lowercase letters from 224 to 255.

    We create an array of type char, fill it with the Russian characters and then print characters on the screen using a random index into our array :)

    You can also make an array of type int and fill it with character codes, then cast to (char) on output; the result will be the same.

    For random output you need to include time.h (to seed the generator). A random number in a given range can be obtained with the formula rand() % (MAXIMUM - MINIMUM + 1) + MINIMUM.

    The result:

     #include <iostream>
     #include <locale.h>  // setlocale
     #include <stdlib.h>  // rand, srand
     #include <time.h>    // time

     // macro that yields a random number in a given range:
     // rand() % (MAXIMUM - MINIMUM + 1) + MINIMUM
     #define _rand(min, max) ( rand() % ((max) - (min) + 1) + (min) )

     int main(int argc, char** argv)
     {
         setlocale(LC_ALL, "russian");
         srand((unsigned) time(NULL));

         char chars[32]; // our alphabet

         // fill the array with codes 224..255
         for(int i=224, n=0; i<=255; ++i, ++n)
         {
             chars[n] = (char) i; // (char) converts the code to a character
         }

         // print the assembled array
         std::cout << "Assembled array:\n";
         for(int i=0; i<32; ++i)
         {
             std::cout << chars[i] << "\t";
         }

         // print 100 characters of the array in random order
         std::cout << "\nIn random order:\n";
         for(int i=0; i<100; ++i)
         {
             std::cout << chars[_rand(0, 31)] << "\t";
         }

         std::cout << std::endl;
         return 0;
     }
    • I would also have added using namespace std ) - Stas0n
    • After adding #include <stdlib.h> and running it, I get: Assembled array: In random order: ным. By the way, what is an "ANSI character table", and how does the American standards institute know Russian letters? Did our intelligence agents pull some strings? - alexlz
    • The console uses an ASCII-based code of 1-byte characters (0-255): the first half (128 codes) is the American standard, and the second half is national ... school497.ru/download/u/02/img/asc1.gif To make Russian letters output correctly, we hack in setlocale(LC_ALL, "rus"); - ProkletyiPirat
    • Should setlocale be hacked in at the root, or can it go higher up (regardless of stump height)? In @belka's answer the character codes are given in one particular encoding (cp1251?). And I have UTF-8. - alexlz
    • Hack it in before first use (right after main() { ). It also requires the locale.h header. The locale only affects output ... setlocale itself attaches the national part to ASCII ... but the best option on Windows is SetConsoleCP(1251); + SetConsoleOutputCP(1251); from the Windows.h header. - ProkletyiPirat

    @skies, you have been given a bunch of tips in the answers and comments (some better, some worse).

    I think that if you need a simple program, it will be platform-specific: you know the Russian encoding used on your system (a single-byte one, UTF-8 or Unicode) and write for it. You can specify in the question which OS you want the program for and get specific advice.

    In any case, you need an array of lowercase Russian letters. For a simple program, this is most likely an array of pointers to single-letter strings (strings in the C sense, not C++). For UTF-8 (as in @alexlz's answer) each string takes 3 bytes, including the terminating '\0'. With this solution you use the same algorithm for outputting or inserting Russian letters regardless of the encoding.

    In a program that is as independent as possible of the system (Windows, Mac, *nix; localization configured or not) and of the compiler, it is probably most convenient to keep an int array of Unicode code points inside the program. Before output you will have to determine (dynamically) the current encoding for Russian and convert from Unicode into it (this is needed in order to work when localization is not configured).

    This approach is not generally accepted, because it is quite complicated. Unfortunately, it is not always possible to do it correctly. For example, an Ubuntu user works in UTF-8, but a particular gnome-terminal window may be set to, say, the cp1251 encoding (while LANG remains ru_RU.utf8).
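
A minimal sketch of the "Unicode inside, convert before output" idea, assuming the target encoding is UTF-8 and that the alphabet is passed in as code points. The names `to_utf8` and `random_utf8` are hypothetical, and `<random>` is used here instead of `rand()`; other target encodings would need a different conversion step (for example via iconv).

```cpp
#include <cstddef>
#include <random>
#include <string>

// Encode one Unicode code point as UTF-8.
// Only code points up to U+FFFF are handled, which covers all Cyrillic
// letters; surrogates and larger values are not checked in this sketch.
std::string to_utf8(char32_t cp) {
    std::string out;
    if (cp < 0x80) {                       // 1 byte: plain ASCII
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {               // 2 bytes: includes U+0430..U+0451
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {                               // 3 bytes
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Build a UTF-8 string of n random letters taken from `alphabet`.
// Precondition: `alphabet` must not be empty.
std::string random_utf8(const std::u32string& alphabet, std::size_t n,
                        std::mt19937& rng) {
    std::uniform_int_distribution<std::size_t> pick(0, alphabet.size() - 1);
    std::string s;
    for (std::size_t i = 0; i < n; ++i)
        s += to_utf8(alphabet[pick(rng)]);
    return s;
}
```

A caller would seed a `std::mt19937`, pass an alphabet such as `U"\u0430\u0431\u0432"` (а, б, в), and write the returned bytes to a UTF-8 terminal.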

    • What does "(dynamically) determine the current encoding" mean? - alexlz
    • Read the LANG variable, for example. - gecube
    • Is it like this?

          #include <stdio.h>
          #include <iconv.h>
          #include <string.h>
          #include <stdlib.h>

          int main ()
          {
              int c;
              char s[3];
              char *loc = getenv("LANG");
              char *dot = loc ? strchr(loc, '.') : NULL;
              if (!dot) return 1;  /* LANG is unset or names no encoding */
              iconv_t cd = iconv_open(dot + 1, "ucs-2");
              if (cd == (iconv_t) -1) return 1;  /* conversion not supported */
              for (c = L'а'; c <= L'я'; c++) {
                  size_t ins = 2, outs = 2;
                  char *in = (char *) &c, *out = s;
                  iconv(cd, &in, &ins, &out, &outs);
                  s[2 - outs] = 0;  /* terminate after the bytes actually written */
                  printf("%s", s);
              }
              iconv_close(cd);
              return 0;
          }

      By the way, little/big endian issues can come up here. - alexlz
    • Dynamically determining the encoding: in the simple case, from LANG. It depends on the OS. On Windows, for example, using LANG is not customary. In general, we look at (the priorities are up to the developer) setlocale(LC_ALL, ""), LANG, the error message returned by popen("mkdir /", "r"), the type of OS ... and make a decision (sometimes an unexpected one, such as ISO 8859-5 if we are on HP-UX). - avp
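
The "simple case, from LANG" mentioned in these comments could be sketched as a small parser for POSIX-style locale names. The function name `codeset_from_locale_name` is hypothetical, and the sketch assumes the usual `language_TERRITORY.codeset@modifier` layout. As the answer above warns, the result is only a hint, since a terminal window may be set to a different encoding.

```cpp
#include <string>

// Extract the codeset part of a POSIX-style locale name such as
// "ru_RU.UTF-8" or "ru_RU.KOI8-R@modifier".
// Returns an empty string when the name carries no codeset (e.g. "C").
std::string codeset_from_locale_name(const std::string& name) {
    const std::string::size_type dot = name.find('.');
    if (dot == std::string::npos)
        return "";
    const std::string::size_type at = name.find('@', dot);
    if (at == std::string::npos)
        return name.substr(dot + 1);           // everything after the dot
    return name.substr(dot + 1, at - dot - 1); // stop before "@modifier"
}
```

A caller would pass the value of `std::getenv("LANG")` after checking it for a null pointer. On POSIX systems, calling `setlocale(LC_ALL, "")` and then `nl_langinfo(CODESET)` from `<langinfo.h>` is another way to ask for the same information.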

    Here is cross-platform code:

     #include <iostream>
     #include <cstdlib>
     #include <ctime>
     #include <locale>
     #include <string>
     using namespace std;

     // name of the Russian locale on each platform
     #ifdef _WIN32
     #define RUSLOCAL "rus"
     #elif defined(__unix)
     #define RUSLOCAL "ru_RU.utf-8"
     #endif

     wchar_t randomletter()
     {
         // seed the generator once, on the first call
         static bool seedset = false;
         if (!seedset)
         {
             srand(static_cast<unsigned> (time(NULL)));
             seedset = true;
         }
         // all 33 lowercase letters
         static wchar_t alph[] = L"абвгдеёжзийклмнопрстуфхцчшщьыъэюя";
         // "- 1" excludes the terminating L'\0' from the range of indices
         return alph [rand() % (sizeof(alph) / sizeof (alph[0]) - 1)];
     }

     int main(int argc, char* argv[])
     {
         locale::global (locale( RUSLOCAL ));
         for (int i = 0; i<10; i++)
         {
             wcout << randomletter() << L" ";
         }
         wcout << endl;
     #ifdef _WIN32
         system ("PAUSE");
     #endif
         return 0;
     }

    Tested on Linux with g++ and on Windows with VC++ 2010. If precompiled headers are used on Windows, you must also include stdafx.h.

    • @mikillskegg, unfortunately, with MinGW g++ (g++.exe (GCC) 3.4.5 (mingw-vista special r3)) on Win 7 it does not even compile: rusl.c:21:29: converting to execution character set: Illegal byte sequence rusl.c: In function `wchar_t randomletter()': rusl.c:22: warning: division by zero in `rand() % 0' rusl.c: In function `int main(int, char**)': rusl.c:29: error: rusl.c:31: error: `wcout' is not in this scope. But on Ubuntu it seems to be OK. - avp
    • Maybe that is because the gcc version is old. I tried with cygwin (gcc 4.3). If the source text of the program is saved in UTF-8, it compiles. But on startup the program crashes because of an invalid locale. All in all, it is a dark mess. - skegg
    • Yes, truly cross-platform code with Russian letters is total darkness. To do it properly on Windows, you also have to check whether the output goes to the console or to a file (pipe), and for the console whether it uses cp866 or cp1251. It can be done, but the wall of #ifdefs turns out to be huge. - avp
    • I installed the latest version of MinGW. It built without problems, but it also crashes on startup. I dug around: it has no Russian locale at all. Such fun. Conclusion: to hell with this Windows. - skegg
    • But in VS everything worked fine. - skegg

    No arrays are needed for random letters: 'а' + rand() % 33

    • Yet another one... First, 8-bit or not 8-bit. Second, I do not remember, for example, whether the lowercase Russian letters are contiguous in DKOI-8 (an 8-bit encoding based on EBCDIC). - alexlz
    • If it is 2 bytes per character rather than 1, then L'а' + rand() % 33. And anyway, why support every encoding out there? Besides, it was not you who asked the question, and it contains no such requirements. - gammaker
    • So that's how it is. Why all encodings, right? And (if 2 bytes) why all architectures (little-endian or big-endian) as well? The question was of course asked not by me but by @skies, but am I not allowed to comment now? And the question did not specify any conditions. - alexlz
    • @GLmonster "Any Unicode" is nice. Which one? The folks at Microsoft picked one in the last millennium and then had to smoothly shovel everything over to another ... And about big/little endian: after arithmetic operations you can end up storing the value backwards. (By the way, it is worth recalling the Internet functions htonl, ntohl, htons, ntohs.) - alexlz
    • @GLmonster, because of the letter ё this will not work correctly in any encoding known to me (even if you replace % 33 with % 34). Everywhere, the code of ё lies outside the range 'а' to 'я'. - avp
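
To make the ё problem from this comment thread concrete, here is a minimal sketch that assumes Unicode code points in a `wchar_t`. In Unicode, 'а' to 'я' are the 32 consecutive code points U+0430 to U+044F, while ё is U+0451, so `L'а' + 32` would give U+0450 (ѐ), which is not a Russian letter. The function name `letter_by_index` is hypothetical.

```cpp
#include <cassert>

// Map an index 0..32 to the Unicode code point of a lowercase Russian letter.
// Indices 0..31 are plain arithmetic from 'а' (U+0430) up to 'я' (U+044F);
// index 32 is special-cased to ё (U+0451), which is not adjacent to the rest.
// The resulting order is not alphabetical, which does not matter for a random pick.
wchar_t letter_by_index(unsigned i) {
    assert(i < 33);
    return i < 32 ? static_cast<wchar_t>(0x0430 + i)
                  : static_cast<wchar_t>(0x0451);
}
```

With this helper, `letter_by_index(rand() % 33)` picks among all 33 letters without an explicit array, at the cost of hard-coding one encoding.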

    In pseudocode, the algorithm for filling an array with random lowercase Russian letters is as follows:

    1. Generate a random number RANDcode in the range from MINcode to MAXcode. Here MINcode is the minimum code of a lowercase Russian letter and MAXcode is the maximum code of a lowercase Russian letter.

    2. Store RANDcode in the next free element of the array.

    3. If the array is not yet full, go to step 1.

    To generate a random number in the range from MINcode to MAXcode, you can use the iRand function (see the article "Generating a Random Number in the Required Range"). It only remains to determine the values of MINcode and MAXcode; they depend on the encoding. For example, if the KOI8-R encoding is used, then MINcode = 0xC0 and MAXcode = 0xDF; if the encoding is CP1251, then MINcode = 0xE0 and MAXcode = 0xFF.
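
The three steps above can be sketched as follows. The iRand function from the cited article is not reproduced in this thread, so this sketch substitutes `std::uniform_int_distribution` from `<random>`; the function name `random_codes` is hypothetical. Note that both example ranges leave out ё, whose code is 0xA3 in KOI8-R and 0xB8 in CP1251.

```cpp
#include <cstddef>
#include <random>
#include <vector>

// Fill a buffer with `count` random single-byte codes from the closed
// range [min_code, max_code], e.g. 0xE0..0xFF for CP1251 lowercase letters.
// Precondition: min_code <= max_code.
std::vector<unsigned char> random_codes(std::size_t count,
                                        unsigned char min_code,
                                        unsigned char max_code,
                                        std::mt19937& rng) {
    // The distribution is over int because character types are not
    // permitted as the template argument of uniform_int_distribution.
    std::uniform_int_distribution<int> dist(min_code, max_code);
    std::vector<unsigned char> out;
    out.reserve(count);
    for (std::size_t i = 0; i < count; ++i)               // step 3: repeat until full
        out.push_back(static_cast<unsigned char>(dist(rng)));  // steps 1 and 2
    return out;
}
```

The bytes only become readable letters when they are written to a console or file that uses the matching single-byte encoding.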

    In general, if you do not want to bother with the encoding, first prepare an array ALLRUSSMALLLETTERS that lists all the lowercase Russian letters, and generate a random number as an index into this array (from 0 to sizeof(ALLRUSSMALLLETTERS) - 1). Take the character at that random index of ALLRUSSMALLLETTERS and insert it into the array being filled.

      1. Define the required character set (an array) by hand.
      2. Get its size.
      3. Write a function that returns a random character (a random one from the character set).

      A version with locales and so on is not worth it; that is a task in itself. Cross-platform compatibility is preserved in this case. If the console suddenly prints gibberish, we kick the console or the user. For example, before running the program from cmd on Windows, you can set the locale. If there is no such locale, let the user think about it. Something like that; plenty of code variants have already been given.

      • I am most interested in the proposal to kick the user. What if he is out of reach, or within reach but too big? - alexlz
      • Kick the user and you lose money. (c) - skegg
      • And what if you are that user yourself? - avp
      • Then you lose time, and therefore money, nerves and the joy of life in general. - skegg
      • Guys, the expression "kick the user" should not be taken literally. It means requiring the user to meet certain conditions: for example, if he wants to see Russian text, he should take care of the locale. In this discussion, as I understand it, the task is at the learning level, so the user, who is a programmer and a student, can easily switch the console to the required encoding. What I mean is that worrying about the universality of the program can lead into such a jungle that the very purpose of the program gets forgotten. - stells2