Suppose there are 400,000 words in Russian and the average word length is 5.28 characters. The meaning of the text does not matter; only the maximum number of combinations is of interest.

How can one estimate the number of possible texts that are 1000 characters long?

  • The available data do not determine the answer, because an answer of 34 ^ 1000 does not contradict anything. If you want the number of REAL texts, the problem is much harder and requires a formal grammar of the language. - pavel

1 answer

A simple solution in C: fill the text with words, multiplying the number of variants at each step by the number of words in the vocabulary. The result is a huge number that does not fit in any built-in type, so I keep only the exponent: 400000 ^ 190. If you add spaces between words, the exponent is 30 less.

    #include <stdio.h>

    const int wordsCount = 400000;
    const double wordLength = 5.28;
    const int textLength = 1000;

    int main(void) {
        // number of characters in the text so far
        double symbolCount = 0;
        // exponent of the number of variants
        unsigned int power = 1;
        // fill the text with words
        while (symbolCount < textLength) {
            symbolCount += wordLength; // + 1 for spaces
            power++;
        }
        // the loop overfills the text past the limit, so one word can be subtracted
        power--;
        // print the result
        printf("%d ^ %u\n", wordsCount, power);
        return 0;
    }
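As a cross-check on the loop, the exponent is just a ceiling division of the text length by the average word length. Below is a minimal Python sketch of that shortcut; the helper name `word_slots` and its `separator` parameter are made up for illustration, and the constants are the ones from the question.

```python
import math

WORDS_COUNT = 400000   # vocabulary size from the question
WORD_LENGTH = 5.28     # average word length from the question
TEXT_LENGTH = 1000     # text length from the question


def word_slots(text_length, word_length, separator=0):
    """Number of average-length words needed to reach text_length characters.

    separator is the number of extra characters after each word
    (1 for a single space, 0 for no spaces).
    """
    return math.ceil(text_length / (word_length + separator))


no_spaces = word_slots(TEXT_LENGTH, WORD_LENGTH)
with_spaces = word_slots(TEXT_LENGTH, WORD_LENGTH, separator=1)

print(f"{WORDS_COUNT} ^ {no_spaces}")
print(f"{WORDS_COUNT} ^ {with_spaces}")
```

Rounding up means the last word runs slightly past the 1000-character limit; rounding down instead would give one word fewer in each case.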

And this is how I computed it in Python:

    wordsCount = 400000
    wordLength = 5.28
    textLength = 1000
    count = 1
    symbols = 0
    while (symbols < textLength):
        count *= wordsCount
        symbols += wordLength + 1
    count

But the result does not inspire confidence: it seems to be short on zeros.
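The size of that result can be sanity-checked by counting its digits and trailing zeros directly, since Python integers are exact at any size. This is only a sketch: the exponent 160 is assumed to be what the loop with spaces produces (1000 / 6.28, rounded up), and the variable names are illustrative.

```python
WORDS_COUNT = 400000
EXPONENT = 160  # assumed: number of words when each is followed by a space

# Exact value; Python integers have arbitrary precision.
count = WORDS_COUNT ** EXPONENT
digits = str(count)

total_digits = len(digits)
trailing_zeros = len(digits) - len(digits.rstrip("0"))

print(total_digits)
print(trailing_zeros)
```

Since 400000 = 4 × 10^5, the value is 4 ^ 160 followed by 5 × 160 = 800 zeros, so the zeros are all there; the leading part is simply the 97 digits of 4 ^ 160.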

  • Isn't it 190 ^ 400000? - user208916
  • No! Basics of combinatorics. A simple example: two friends randomly share three candies. How many possible outcomes are there? 8 = 2 ^ 3: the number of options raised to the power of the number of cases. - AivanF.
  • Got it. Thanks. - user208916
  • If the goal is just to get a number, you can use a calculator :) - 400000 ^ 190 = 2.462625E1064 ... In any case, this is a very rough estimate, because the words cannot be treated as equally likely, regardless of meaning: if longer words turn up, there will be fewer than 190 of them, and if shorter ones, more. So even the order of magnitude, 1064, can only be taken as an estimate ... - Harry
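Two points from the comments can be checked numerically. The sketch below first brute-forces the candy example (three candies and two friends, as implied by the 2 ^ 3 in the comment) and then uses logarithms to get the order of magnitude of powers that are too large for a float. The helper names `count_outcomes` and `scientific` are hypothetical, and the list of powers is just the estimates mentioned in this thread.

```python
import math
from itertools import product


def count_outcomes(options, cases):
    """Brute-force count of ways to pick one of `options` for each of `cases`."""
    return sum(1 for _ in product(range(options), repeat=cases))


def scientific(base, exponent):
    """Return (mantissa, power_of_ten) so that
    base ** exponent is roughly mantissa * 10 ** power_of_ten."""
    log = exponent * math.log10(base)
    power_of_ten = math.floor(log)
    return 10 ** (log - power_of_ten), power_of_ten


# Each candy independently goes to one of two friends: 2 ^ 3, not 3 ^ 2.
print(count_outcomes(2, 3), 2 ** 3)

# Word-level estimates (without and with spaces) and the
# character-level bound 34 ^ 1000 from the first comment.
for base, exponent in [(400000, 190), (400000, 160), (34, 1000)]:
    mantissa, power_of_ten = scientific(base, exponent)
    print(f"{base} ^ {exponent} ~ {mantissa:.4f}E{power_of_ten}")
```

The base matters a great deal here: dropping one zero and computing 40000 ^ 190 gives about 2.46E874, which is 190 orders of magnitude smaller than 400000 ^ 190.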