There is a DB from one table in which 3 fields: ID, English word, transfer.

For one of the tasks, I read only verbs with the help of such an expression (in my case all verbs start with a part of "To")

words = wordsfromDB.Where(w => w.EnglishWord.StartsWith("To")).ToList(); 

Now I want to implement the ability to read only phrasal verbs, that is, the value "To give" is not considered, and the value "To give up" is considered.

That is, it must read all entries that begin with "To" and which have at least 3 words.

Help make the right condition.

  • Instead of w.EnglishWord.StartsWith("To") specify a regular expression. for example, Regex.Match("to\s+\w+\s+up").Success - Stack
  • @Stack, regular expressions cannot be translated to SQL, this is the limitation of the provider linq2sql - Grundy
  • It does not matter to me now, I download all the words in the List, and then add the words that suit me to another List. Regularity is good, but I would like to do LINQ. - Vladimir
  • @Vladimir, who did you answer? it was recommended to use the regulars in the Where method, for example, if you call it after ToList - Grundy
  • one
    or check that three words, for example, w.EnglishWord.Split(' ', StringSplitOptions.RemoveEmptyEntries).Length > 2 and this is also in Where after ToList - Grundy

2 answers 2

Do not use regular expressions unless absolutely necessary. Both in pure SQL, and in "LINQ to SQL", for such queries there is a LIKE operator, which also allows selection by text patterns:

 words = wordsfromDB.Where(w => SqlMethods.Like(w.EnglishWord, "To % up")).ToList(); 

It is executed an order of magnitude faster than crutches that are not typical for SQL with regular expressions.

In general, according to the task - look at the full-text search. There are both extended text queries and grammar-dictionary kernels, and not only for English.

  • if I am not mistaken, then up is just for an example and in the expression instead of it, probably, it’s also worth % to put SqlMethods.Like(w.EnglishWord, "To % %") since it doesn’t matter which three words. - Grundy
  • @Grundy, maybe. But only the author knows the exact formulation of the problem. I just took a regular expression from the first comment to the question, but transformed it into a text pattern for use in LIKE / PATINDEX. More for example, as an illustration of the fact that here you can do with standard SQL tools. - Sergey Rufanov
  • Just the post itself states: all entries that begin with "To" and that have at least 3 words. - Grundy
  • I then corrected it myself and wrote above via LINQ: all the entries that begin with "To", that is, with a space, and at least 3 words. And only phrasal verbs fall into my sample. - Vladimir
  • I have this option: words = wordsfromDB.Where (w => w.EnglishWord.Split (). Length> 2) .ToList (). Where (w => w.EnglishWord.StartsWith ("To")).. ToList (); But this option does not work: (using I added) words = wordsfromDB.Where (w => SqlMethods.Like (w.EnglishWord, "To%%")). ToList (); The error is given such: It is not possible to use the Like method on the client. It is intended only for conversion to SQL - Vladimir

If the database grows, instead of loading data into the application for further processing, you can perform processing on the server.
SQL Server has a CLR, and you can define a UDF (user defined function) in C #.
UDF is running on the server. Description and examples - here .


UPDATE:

If separate sentences are stored in the database, and LIKE is not suitable for selecting sentences, then you probably need to revise the data structure and store individual words with the sentence identifier.

  • That is, the sample will be carried out in the database itself, and not all words will be immediately loaded into my application, but only those that fit the condition. Right ? - Vladimir
  • Yes, all processing (and based on regex) will be in sql server, and only filtered result will be loaded into the application. - Stack
  • There is only savings in traffic between the application and the server. The rest - just shifting responsibility "from a sore head to a healthy one." If the client application and the DBMS server are located close to the network, or generally on the same server / computer, then the benefit will be zero. And if the DBMS server uses more than one client application - the landing in performance will be distributed to all. - Sergey Rufanov
  • @SergeyRufanov if the database is small and is needed only for data sampling, then it can be copied entirely (earlier, if I am not mistaken, it was called the client-server architecture), but such a solution has many drawbacks, including the violation of srp (single responsibility). - Stack
  • one
    @Stack, your solution actually comes down to iterating through all the rows in a table and applying a CLR function to each of them - this is slow. Analyze the code of your CLR-functions, and optimize their logic for the use of indexes and the natural order of the elements, unfortunately, SQL Server is not able to, and hardly ever learn. If you want to use your solution on the client, and "do not violate the SRP" (in brackets, because in fact we can not judge now the responsibility of a small piece of code given by the author) - stupidly copy CLR-UDF into the project, removing the "SqlFunction" annotation above the method. - Sergey Rufanov