There is a list of words (between 1,000 and 3,000 of them) and there are 30 categories. Is it possible to automatically determine the appropriate category for each word by its meaning (by semantics)?

For example: words = mom, dad, brother, bread, pie, meat, milk, ...; categories = family, food, ...

Result:
family = mom, dad, brother
food = bread, pie, meat, milk

  • It is possible; the keyword to search for is "ontology". - Alexander Muksimov
  • In what form is your list of words? Are there any semantic hints, or is it just a comma-separated array? - Kromster
  • Rather, just an array. I am hoping there is a ready-made library that returns the likely categories for a word; if any of the returned categories is in my list, that category is assigned to the word. I hope that explains it clearly. - Anton
  • habrahabr.ru/post/277413 for a start - Alexander Muksimov

4 answers

You can parse one site, or several, where words are already divided into categories.

For example: http://rus.lang-study.com/category/slovar/

If you wish, you can also lemmatize the words, so that nouns in any grammatical form can be categorized.

    For a computer, words are just words; by itself, it cannot determine the meaning of a word or its lexical or logical group.

    Of course, this can be done manually. Create a list of groups and add to each group the words that belong to it. Then, when a word is given as input and it belongs to one of the groups, output that group.
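
The manual approach above boils down to a lookup table. Here is a minimal C++17 sketch, assuming the groups are filled in by hand; the names `Groups`, `build_index` and `categorize` are made up for this illustration and do not come from any library.

```cpp
#include <map>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

// Hand-built groups: category name -> words that belong to it.
using Groups = std::map<std::string, std::vector<std::string>>;

// Invert the groups into word -> category for direct lookup.
// If a word is listed under several categories, the alphabetically first
// category wins, because std::map iterates in key order and emplace
// does not overwrite an existing entry.
std::unordered_map<std::string, std::string> build_index(const Groups& groups)
{
    std::unordered_map<std::string, std::string> index;
    for (const auto& [category, words] : groups) {
        for (const auto& word : words) {
            index.emplace(word, category);
        }
    }
    return index;
}

// Returns the category of `word`, or std::nullopt if no group contains it.
std::optional<std::string> categorize(
    const std::unordered_map<std::string, std::string>& index,
    const std::string& word)
{
    auto it = index.find(word);
    if (it == index.end()) {
        return std::nullopt;
    }
    return it->second;
}
```

With the groups family = {mom, dad, brother} and food = {bread, pie, meat, milk}, `categorize(index, "dad")` yields "family", and a word that is in no group yields an empty optional. The lookup is exact, so inflected forms would have to be normalized before they are looked up.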

      You can write an algorithm like this.

      Take each group and the word that names it. Search for that word in Yandex and take, say, the first 40 texts. Go through each text and count how many times each word from the list of 3000 occurs in it. This gives a table:

      [group name]
      word1 occurs 0 times
      word2 occurs 3 times
      word3 occurs 15 times
      word4 occurs 0 times

      Next, go through all 3000 words and, for each one, find the group in which it has the highest count; the word is assigned to that group.
      The algorithm is simple, of course; you can improve it if you wish.
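
The counting step described above could look roughly like this in C++17. This is only a sketch under assumptions: the texts for each group are taken as already downloaded (fetching search results is left out), the tokenizer handles ASCII only, so Russian text would need Unicode-aware splitting, and the names `tokenize` and `assign_categories` are invented for the example.

```cpp
#include <cctype>
#include <map>
#include <set>
#include <string>
#include <vector>

// Split text into lowercase ASCII tokens; any character that is not
// a letter or a digit acts as a separator.
std::vector<std::string> tokenize(const std::string& text)
{
    std::vector<std::string> tokens;
    std::string current;
    for (unsigned char ch : text) {
        if (std::isalnum(ch)) {
            current += static_cast<char>(std::tolower(ch));
        } else if (!current.empty()) {
            tokens.push_back(current);
            current.clear();
        }
    }
    if (!current.empty()) {
        tokens.push_back(current);
    }
    return tokens;
}

// texts_by_category: group name -> texts already collected for that group.
// Returns word -> group in which the word occurs most often.
// Words that occur in no text are left out of the result; on a tie the
// alphabetically first group wins.
std::map<std::string, std::string> assign_categories(
    const std::map<std::string, std::vector<std::string>>& texts_by_category,
    const std::set<std::string>& words)
{
    std::map<std::string, std::string> best_category;
    std::map<std::string, int> best_count;
    for (const auto& [category, texts] : texts_by_category) {
        // The per-group table: word -> number of occurrences.
        std::map<std::string, int> counts;
        for (const auto& text : texts) {
            for (const auto& token : tokenize(text)) {
                if (words.count(token) > 0) {
                    ++counts[token];
                }
            }
        }
        // Remember this group for every word it beats the best count of.
        for (const auto& [word, count] : counts) {
            if (count > best_count[word]) {
                best_count[word] = count;
                best_category[word] = category;
            }
        }
    }
    return best_category;
}
```

Raw counts favor groups with longer texts, so one possible improvement is to divide each count by the total number of tokens collected for the group before comparing.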

      • The idea works, but (at first glance) it is costly in time and resources. As I commented earlier, I am counting on there being a ready-made library that returns the expected categories for a word. That would simplify the task many times over. Perhaps there is already a ready-made solution from Google or Yandex; I am still searching for one. - Anton
      • @Anton By the way, such an algorithm can be written quickly, though it is very simple. If you find a ready-made solution, it may well be better. On the other hand, there is no guarantee that a ready-made solution will suit you, and even if it does, it is unclear how much time adapting it will take. - Dmitry Polyanin

      Here is a primitive option for your specific example:

      #include <iostream>
      #include <map>
      #include <string>

      enum Category { family, food };

      // Print every word stored under category N, separated by commas.
      void print_result(Category N, const std::multimap<Category, std::string>& g)
      {
          auto P = g.equal_range(N);
          while (P.first != P.second) {
              std::cout << P.first->second << ", ";
              ++P.first;
          }
          std::cout << std::endl;
      }

      int main()
      {
          std::multimap<Category, std::string> group{
              {family, "mom"}, {family, "dad"}, {family, "brother"},
              {food, "bread"}, {food, "pie"}, {food, "meat"}, {food, "milk"}
          };
          std::cout << "family" << " = ";
          print_result(family, group);
          std::cout << "food" << " = ";
          print_result(food, group);
          return 0;
      }
      • In your code, the category of each word is already predefined. The task is to determine the appropriate category from a separate list, knowing only the words. - Anton
      • Well, fill the multimap from another list; in any case, you yourself have to specify what belongs where. I did note that this is a primitive example for clarity. Of course, you can do everything with classes, read the data from a list or a stream, and gradually add new terms to the list of categories... You can also follow MaxU's advice; I just wrote an illustrative example. - AR Hovsepyan
      • Almost every problem can be solved in many ways, and this one is no exception. But you have to choose the solution yourself, because you know exactly why you are doing this; pick the one that fits your goal and your knowledge. Examples are just food for thought. - AR Hovsepyan