Hello! I have already read many similar topics, but most of them asked about specific cases. I very much agree with these words. Well, to be honest, nothing at all is clear to me. I understand that information is extracted from HTML either through LINQ or through CSS selectors. I am not familiar with the former, and I know CSS only superficially. Still, the CSS option feels more intuitive to me, so I would like to receive answers in the form of CSS selectors.
A question right away: can any piece of information be extracted in both ways? Or are there cases where only one of the methods works? Or cases where it is impossible altogether?
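To make the comparison concrete, here is a minimal sketch of the same lookup done both ways. It assumes AngleSharp 0.10 or later, where the parser method is called ParseDocument (older releases call it Parse) and receives the markup text itself. The sample markup and the variable names are invented for illustration and are not taken from the real page.

```csharp
using System;
using System.Linq;
using AngleSharp.Html.Parser;

// Hypothetical sample markup; the real page is not reproduced here.
const string html = @"
<div class=""df_panel""><h6 itemprop=""name"">Sample Name</h6></div>
<div class=""other""><h6>Something else</h6></div>";

var doc = new HtmlParser().ParseDocument(html);

// 1) CSS selector: h6 elements that are direct children of div.df_panel.
var viaCss = doc.QuerySelectorAll("div.df_panel > h6")
                .Select(e => e.TextContent.Trim())
                .ToList();

// 2) LINQ: the same condition written by hand over the list of all elements.
var viaLinq = doc.All
                 .Where(e => e.LocalName == "h6"
                          && e.ParentElement != null
                          && e.ParentElement.LocalName == "div"
                          && e.ParentElement.ClassList.Contains("df_panel"))
                 .Select(e => e.TextContent.Trim())
                 .ToList();

Console.WriteLine(string.Join(", ", viaCss));
Console.WriteLine(string.Join(", ", viaLinq));
```

In this sketch both queries return the same element. CSS selectors match elements only, so anything that is not an element (a bare text node, for instance) needs the DOM properties or LINQ instead.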
Now to the task itself. I want to parse contact data from the intercom site. For example, take this page. To start, I parse the whole page:
var parser = new HtmlParser(); var doc = parser.Parse("link"); How, for example, do I extract the name? Looking at the source, I see that the name is inside the block <div class="df_panel"> . This seems to be the only block with that class name, so the search can be narrowed down to it:
var div = doc.QuerySelector("div.df_panel"); This is where the questions immediately begin. I figured out myself that if the div block has a class name, the query is written as shown. If instead it is, for example, <div id="test"> , the query is written differently (it took me a long time to work this out from a bunch of examples on different forums):
var div = doc.QuerySelector("div[id='test']"); So where is this documented? As I understand it, some kind of regular expressions are used here. Maybe they are similar to those in other parsers; for example, it is written here that AngleSharp is very similar to Fizzler. But what if this is a one-off task for me and I have never dealt with any other parsers? How am I supposed to know what to write?
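For reference, the strings passed to QuerySelector are CSS selectors, the same syntax used in stylesheets and by document.querySelector in browsers, rather than regular expressions. The sketch below shows the three basic forms side by side. It assumes AngleSharp 0.10 or later (ParseDocument), and the markup, including the data-kind attribute, is made up for the example. Note that quotes inside the selector must either be single quotes or be escaped, otherwise the C# string literal ends early.

```csharp
using System;
using AngleSharp.Html.Parser;

// Hypothetical markup used only to demonstrate the selector forms.
const string html = @"
<div class=""df_panel"">by class</div>
<div id=""test"">by id</div>
<div data-kind=""x"">by attribute</div>";

var doc = new HtmlParser().ParseDocument(html);

var byClass  = doc.QuerySelector("div.df_panel");    // .name         matches class="name"
var byId     = doc.QuerySelector("#test");           // #name         matches id="name"
var byIdAttr = doc.QuerySelector("div[id='test']");  // [attr='value'] works for any attribute, id included
var byAttr   = doc.QuerySelector("div[data-kind]");  // [attr]        matches when the attribute is present

Console.WriteLine(byClass?.TextContent);
Console.WriteLine(byId?.TextContent);
Console.WriteLine(byIdAttr?.TextContent);
Console.WriteLine(byAttr?.TextContent);
```

Both "#test" and "div[id='test']" select the same element here; the shorter "#" form is the usual way to address an id.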
OK, I got distracted. I now have the nearest div, which narrows the search range. (Another digression: what if there were no such div at all? Is it possible to obtain specific data when there are no unique identifiers with which to gradually narrow down the search area?) In the end, I see that the name is written in the heading tag <h6 itemprop="name">DESIRED NAME</h6> . How do I get this value? Would it be possible to pull out the name if it were written without a heading tag at all?
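The three cases asked about here can be sketched as follows, again assuming AngleSharp 0.10 or later. The markup is a made-up stand-in for the real page: only div.df_panel and h6 itemprop="name" come from the post, while the bare text line and all variable names are hypothetical.

```csharp
using System;
using System.Linq;
using AngleSharp.Dom;
using AngleSharp.Html.Parser;

// Hypothetical stand-in for the real page.
const string html = @"
<div class=""df_panel"">
  <h6 itemprop=""name"">Sample Name</h6>
  Bare text without its own tag
</div>";

var doc = new HtmlParser().ParseDocument(html);

// Case 1: the heading inside the panel, addressed by tag and attribute.
var name = doc.QuerySelector("div.df_panel h6[itemprop='name']")?.TextContent.Trim();

// Case 2: no class or id to rely on, so the element is addressed by position.
var byPosition = doc.QuerySelector("body > div:first-of-type > h6")?.TextContent.Trim();

// Case 3: text with no tag of its own. Selectors cannot match text nodes,
// so the text nodes among the children of the enclosing element are read directly.
var panel = doc.QuerySelector("div.df_panel");
var bareText = panel?.ChildNodes
    .Where(n => n.NodeType == NodeType.Text)
    .Select(n => n.TextContent.Trim())
    .FirstOrDefault(t => t.Length > 0);

Console.WriteLine(name);
Console.WriteLine(byPosition);
Console.WriteLine(bareText);
```

Position-based selectors such as the one in case 2 tend to break when the page layout changes, so a class, id, or attribute is usually the safer anchor when one exists.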
I will stop at these questions for now. I would be grateful for any explanation. Ideally, I would like answers to the more general questions first (for example, about what I assume are regular expressions, with a reference or good examples); then maybe I can figure out the rest myself.