I have the following table:

<table cellspacing="0" cellpadding="0" cols="4" alt="Таблица с разным большая и красивая N1" title="Таблица с разным большая и красивая N1" border="0" style="border-collapse:collapse;WIDTH:135.73mm;min-width: 135.73mm;" class="a76">
  <tbody>
    <tr height="0">
      <td style="WIDTH:36.78mm;min-width: 36.78mm"></td>
      <td style="WIDTH:36.78mm;min-width: 36.78mm"></td>
      <td style="WIDTH:36.78mm;min-width: 36.78mm"></td>
      <td style="WIDTH:25.40mm;min-width: 25.40mm"></td>
    </tr>
    <tr valign="top">
      <td style="HEIGHT:11.24mm;" class="a40c"> <div class="a40">n</div> </td>
      <td class="a44c"> <div class="a44">DCRF</div> </td>
      <td class="a48c"> <div class="a48">DCRI</div> </td>
      <td class="a52c"> <div class="a52">DCRO</div> </td>
    </tr>
    <tr valign="top">
      <td style="HEIGHT:11.24mm;" class="a57cr"> <div class="a57">5122</div> </td>
      <td class="a61cl"> <div class="a61">Алла</div> </td>
      <td class="a65cl"> <div class="a65">должна</div> </td>
      <td class="a69cl"> <div class="a69">ехать</div> </td>
    </tr>
  </tbody>
</table>

I need to extract the text from its cells, using the first tr that has content as the column names for the values in the row below it.

Here is what I wrote:

 var query = (from table in webGet.DocumentNode.SelectNodes("//table").Cast<HtmlNode>()
              from row in table.SelectNodes("tr").Cast<HtmlNode>()
              from cell in row.SelectNodes("th|td").Cast<HtmlNode>()
              where table.Attributes["alt"] != null
              select new { Table = table.Attributes["alt"].Value, Row = row.InnerText, CellText = cell.InnerText });

 foreach (var cell in query)
 {
     Console.WriteLine("{0}.{1}: {2}", cell.Table, cell.Row, cell.CellText);
 }

The result is:

 Таблица с разным большая и красивая N1.:
 Таблица с разным большая и красивая N1.:
 Таблица с разным большая и красивая N1.:
 Таблица с разным большая и красивая N1.:
 Таблица с разным большая и красивая N1.nDCRFDCRIDCRO: n
 Таблица с разным большая и красивая N1.nDCRFDCRIDCRO: DCRF
 Таблица с разным большая и красивая N1.nDCRFDCRIDCRO: DCRI
 Таблица с разным большая и красивая N1.nDCRFDCRIDCRO: DCRO
 Таблица с разным большая и красивая N1.5122АллаДолжнаехать: 5122
 Таблица с разным большая и красивая N1.5122АллаДолжнаехать: Алла
 Таблица с разным большая и красивая N1.5122АллаДолжнаехать: Должна
 Таблица с разным большая и красивая N1.5122АллаДолжнаехать: ехать

But I need this instead:

 Таблица с разным большая и красивая N1.n: 5122
 Таблица с разным большая и красивая N1.DCRF: Алла
 Таблица с разным большая и красивая N1.DCRI: Должна
 Таблица с разным большая и красивая N1.DCRO: ехать

    1 answer

    You can try this option:

     var tableNode = webGet.DocumentNode.SelectSingleNode("//table");
     var result = tableNode
         .SelectNodes(".//tr")
         .Where(x => x.GetAttributeValue("valign", null) == "top")
         .Select(x => x.SelectNodes(".//td").Select(s => s.InnerText.Trim()))
         .Cast<IEnumerable<dynamic>>()
         .Aggregate((first, second) => first.Zip(second, (f, s) => new { Table = tableNode.GetAttributeValue("alt", null), First = f, Second = s }));

     foreach (var item in result)
         Console.WriteLine($"{item.Table}.{item.First}: {item.Second}");

    Let me explain:

    • webGet.DocumentNode.SelectSingleNode("//table") - takes the first table in the HTML. Since you provided only the table's markup, I assume it is the only table on the page; if it is not, select the one you need instead.
    • .SelectNodes(".//tr").Where(x => x.GetAttributeValue("valign", null) == "top") - takes all tr elements whose valign attribute equals top. This skips the empty sizing row.
    • .Select(x => x.SelectNodes(".//td").Select(s => s.InnerText.Trim())) - for each row, takes all td elements and projects them to a sequence of strings (only the trimmed inner text).
    • .Cast<IEnumerable<dynamic>>() - casts each row to a sequence of dynamic values. Aggregate must return the same type as the elements it combines, so this is what lets the next step return anonymous types.
    • .Aggregate((first, second) => first.Zip(second, (f, s) => new { Table = tableNode.GetAttributeValue("alt", null), First = f, Second = s })); - at this stage we have 2 collections (the header row and the data row). Aggregate takes the first collection and each subsequent one and "sews" them together element by element with Zip. The output is a sequence of anonymous types carrying the required data.
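    The Zip step above can be tried in isolation, without HTML parsing. The following is only a minimal sketch on plain string arrays: the class name RowPairing and the method Format are hypothetical and are not part of the answer's code or of HtmlAgilityPack. It returns formatted strings directly, so no dynamic or anonymous types are needed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class RowPairing
{
    // Pairs each header cell with the data cell in the same column
    // and formats the pair as "table.header: value".
    // Zip stops at the shorter sequence, so extra cells are ignored.
    public static List<string> Format(
        string table,
        IEnumerable<string> header,
        IEnumerable<string> data)
    {
        return header
            .Zip(data, (h, d) => $"{table}.{h}: {d}")
            .ToList();
    }
}
```

    Calling RowPairing.Format with the table's alt text, the header cells (n, DCRF, DCRI, DCRO) and the data cells would produce one line per column in the format shown in the question.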

    The result will be something like this:

    (screenshot: contents of the result collection)

    All that remains is to print it in the desired format (the foreach loop):

     Таблица с разным большая и красивая N1.n: 5122
     Таблица с разным большая и красивая N1.DCRF: Алла
     Таблица с разным большая и красивая N1.DCRI: должна
     Таблица с разным большая и красивая N1.DCRO: ехать

    P.S. I'm not sure what the result will be with more than 2 tr rows (most likely an anonymous type nested inside an anonymous type). The code was tested only on the HTML provided!
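    One possible way around the nesting, sketched here under assumptions rather than taken from the tested code: treat the first row as the header and zip it with every following row separately, instead of folding all rows together with Aggregate. The class TableFlattener and its Flatten method are hypothetical names, and the rows are assumed to have already been extracted as string arrays.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, for illustration only.
public static class TableFlattener
{
    // rows[0] is assumed to be the header row; every later row is data.
    // Each data row is zipped with the header on its own, so any number
    // of data rows yields flat "table.header: value" lines.
    public static List<string> Flatten(string table, IReadOnlyList<string[]> rows)
    {
        if (rows.Count == 0)
            return new List<string>();

        string[] header = rows[0];
        return rows
            .Skip(1)
            .SelectMany(row => header.Zip(row, (h, c) => $"{table}.{h}: {c}"))
            .ToList();
    }
}
```

    With one header row and two data rows of four cells each, this gives eight lines, all of the same flat shape.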

    • Actually, the whole page is laid out as one table that contains more tables, so, as you said, I end up with an anonymous type nested in an anonymous type. - Dobrotiu
    • @Dobrotiu You need to work out exactly how to identify the data you need (it may be a unique ID, a class, or a strict position in a sequence). Once you have that, rewriting this code will be quite simple. In general, it looks to me like the page is generated, so you may be able to get the data not by parsing HTML but through requests to the site. I advise you to look in that direction. - EvgeniyZ