Guys, I have a problem with an assignment from the ulearn online course.

"This time we will process text. Our final goal is a text continuation algorithm that, given several words, guesses the most likely continuation based on knowledge gained from analyzing a large body of text. The whole task is divided into 3 stages: text preparation, analysis of bigram frequencies, and the actual continuation of the text based on the frequency information. In this task, you need to split the text into sentences, and the sentences into words. Do the task in the SentencesParserTask class."

The task file itself is below; I am duplicating the code here.
I tried to do the task using LINQ and left comments on what each step does.
The checker gives this error:

Error on: ab, c 0th element (subitems delimited by pipe symbol) should be [b|c], but was [b,|c] 

Thanks in advance for your help!)

    using System.Collections.Generic;
    using System.Linq;
    using System.Text;

    namespace TextAnalysis
    {
        static class SentencesParserTask
        {
            public static readonly string[] StopWords =
            {
                "the", "and", "to", "a", "of", "in", "on", "at", "that",
                "as", "but", "with", "out", "for", "up", "one", "from", "into"
            };

            /*
            Split the text file into sentences and words.
            Assume that words can consist only of letters (use the char.IsLetter method)
            or the apostrophe character ', and that sentences are separated by one of
            the following characters: .!?;:()
            Remove from the file the words contained in the StopWords array
            (frequent insignificant words are called stop words in text analysis).
            The method must return a list of sentences, where each sentence
            is a list of the remaining words in lower case.
            */
            //StopWords.Select(b=>b)
            public static List<List<string>> ParseSentences(string text)
            {
                var sb = new StringBuilder(text);
                var lstStr = sb.ToString()
                    // split the string on these characters, which turns it into sentences
                    .Split('.', '!', '?', ';', ':', '(', ')')
                    // split each sentence on spaces and remove the StopWords set from it
                    .Select(a => a.Split(' ').Except(StopWords)
                        // and check whether the character is a letter, whether the letter
                        // is lower case, or whether it is the apostrophe character
                        .Where(b => b.Any(c => char.IsLetter(c) && char.IsLower(c) || c == '`'))
                        .Select(b => b).ToList()).ToList();
                return new List<List<string>>(lstStr);
            }
        }
    }
  • Guys, can you also explain how to format code properly when posting a question? I tried to follow the site's instructions and something went wrong. - Philip Barinov
  • Do you want to do this with LINQ? - Bulson
  • Well, not exactly. I am just starting to learn LINQ and would like to work out how this particular example can be done with LINQ) Of course, if you can tell me how to organize the solution better in terms of both refactoring and performance, I will be very grateful) - Philip Barinov
  • Wow, that is a tall order, with performance and refactoring! As I understand it, you can write it, but you want to write it not the simple way but with a bit of a twist, brainfuck-style. Then I will wait too, maybe something will come of it :) - Bulson
  • And have you already written this part: "... analysis of the frequency of bigrams ..."? - Bulson
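
On the error itself: the result [b,|c] is what you get when a sentence is split only on spaces, because the comma then stays attached to "b". A few other things in the posted code look likely to fail later checks. Enumerable.Except is a set operation, so besides the stop words it also drops repeated words within a sentence. The condition compares against a backtick rather than the apostrophe '. And the words are never converted to lower case. Here is a minimal LINQ sketch of one way to handle this, not the course's reference solution. The names SentencesParserSketch and ParseWords are made up for illustration, and skipping empty sentences is my assumption, since the task text quoted above does not say what to do with them.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical class name: a sketch, not the course's reference solution.
public static class SentencesParserSketch
{
    // Sentence delimiters listed in the task statement.
    private static readonly char[] SentenceSeparators = { '.', '!', '?', ';', ':', '(', ')' };

    // Same stop words as in the question, stored in a set for fast lookup.
    private static readonly HashSet<string> StopWords = new HashSet<string>
    {
        "the", "and", "to", "a", "of", "in", "on", "at", "that",
        "as", "but", "with", "out", "for", "up", "one", "from", "into"
    };

    public static List<List<string>> ParseSentences(string text)
    {
        return text
            .Split(SentenceSeparators)
            .Select(sentence => ParseWords(sentence))
            // Assumption: sentences left with no words are skipped.
            .Where(words => words.Count > 0)
            .ToList();
    }

    // Hypothetical helper: turns one sentence into its lower-case words.
    private static List<string> ParseWords(string sentence)
    {
        // Lower-case letters and keep apostrophes; every other character
        // (commas, digits, tabs, ...) becomes a space, i.e. a word boundary.
        var cleaned = new string(sentence
            .Select(c => (char.IsLetter(c) || c == '\'') ? char.ToLowerInvariant(c) : ' ')
            .ToArray());

        return cleaned
            .Split(new[] { ' ' }, StringSplitOptions.RemoveEmptyEntries)
            // Where keeps repeated words, unlike Except, which returns distinct elements.
            .Where(word => !StopWords.Contains(word))
            .ToList();
    }
}
```

With this sketch, ParseSentences("a b, c") gives a single sentence with the words b and c. Replacing the non-word characters with spaces before splitting is what removes the comma, and filtering with Where instead of Except keeps words that occur more than once.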
