Recently I needed to pick out a particular sequence of characters from the text on a website. The guides I found online describe how to fetch a page's HTML code, but I know that the data I need is in the visible part of the page, so I only need the text, without JavaScript, CSS, or HTML tags; for me all of that is just excess baggage. Hence the question: how can I parse (read) only the text of a site and find a given sequence of characters in it? If there is no way to do this, I will read the whole HTML source.

  • Parsing plain text directly is a poor idea: you end up with lots of whitespace, stray characters, and other uninteresting junk. Usually the text (an article, a quote, or other material) sits in a single tag (say <div class="content">Text</div>), and in that case it is enough to get the contents of that tag... So maybe you should work with the HTML after all? Or perhaps I am misunderstanding something, in which case it would help to see what you have to work with (the page code)... - EvgeniyZ
  • A page (both its visible part and its invisible part) consists of HTML markup, and the text you need is somewhere inside that markup. So to get this text, you first have to parse the HTML. Of course, if you know in advance what kind of text it should be, you could treat the whole markup as plain text and search it, but I would not recommend that. - tym32167
  • I need the text of a post on VKontakte (P.S. and not only that: if I find a way to parse by a tag with a class, that would be great, for example searching the code for <div class="current">CLASS</div>). I'll say right away that everything would be very simple if I could use third-party libraries (for example, vk api), but my goal is to keep the code small in both size and weight. The markup of a post in any group or page looks like this: <div class="wall_post_text">Text, text, text, <a href="/away.php?to=LINK&amp;post=-number&amp;cc_key=" target="_blank">LINK</a> <br> New line of text </div> - Maxim Maximov
  • With VK, colleague, you should work exclusively through the API, since it has a very good one. All the text can be fetched easily, say via get, or an individual post via getById. You will end up with less code than parsing HTML! It is enough to send a correct request (third-party libraries are not required!). - EvgeniyZ
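
The tag-by-class approach discussed in these comments can be sketched roughly as follows. This is only an illustration under several assumptions: the thread names no language, so Python and its standard `html.parser` module are used; `ClassTextExtractor` and `extract_text` are hypothetical names; and the sample string is the post markup quoted above with English placeholders, not a real VK page.

```python
from html.parser import HTMLParser


class ClassTextExtractor(HTMLParser):
    """Collects the text inside every <div> that carries a given class."""

    def __init__(self, class_name):
        super().__init__(convert_charrefs=True)
        self.class_name = class_name
        self.depth = 0      # <div> nesting depth inside a matching div (0 = outside)
        self.parts = []     # text fragments of the div currently being read
        self.results = []   # one cleaned-up string per matching div

    def handle_starttag(self, tag, attrs):
        if self.depth:
            if tag == "div":
                self.depth += 1          # nested div: remember to skip its close
            elif tag == "br":
                self.parts.append("\n")  # keep line breaks from <br>
        elif tag == "div":
            classes = (dict(attrs).get("class") or "").split()
            if self.class_name in classes:
                self.depth = 1

    def handle_endtag(self, tag):
        if self.depth and tag == "div":
            self.depth -= 1
            if self.depth == 0:
                # Collapse runs of whitespace within each line, drop empty lines.
                text = "".join(self.parts)
                lines = [" ".join(line.split()) for line in text.split("\n")]
                self.results.append("\n".join(l for l in lines if l))
                self.parts = []

    def handle_data(self, data):
        if self.depth:
            self.parts.append(data)


def extract_text(html, class_name):
    """Return the text of every <div class="class_name"> found in html."""
    parser = ClassTextExtractor(class_name)
    parser.feed(html)
    parser.close()
    return parser.results


# Sample markup adapted from the comment above (placeholders, not real data).
sample = (
    '<div class="wall_post_text">Text, text, text, '
    '<a href="/away.php?to=LINK&amp;post=-number&amp;cc_key=" '
    'target="_blank">LINK</a> <br> New line of text </div>'
)

posts = extract_text(sample, "wall_post_text")
print(posts[0])
print("LINK" in posts[0])  # searching for a character sequence in the clean text
```

Once the text is isolated this way, finding a sequence of characters is an ordinary substring search. The sketch does not download anything. The page source would have to be fetched first, and the markup a site actually serves may differ from the fragment quoted here, which is one reason the API route suggested above may be more robust for VK.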
