Here is the HTML snippet:

<div>
  <div>
    <a></a>
    <a></a>
    <div><a><span></span>Text1</a></div>
  </div>
  <div>Text2</div>
</div>

Using this code:

var htmlNodes = htmlDoc.DocumentNode.SelectNodes("*");
foreach (var node in htmlNodes)
{
    text += node.InnerText;
}

I get this string:

"\r\n \r\n \r\n \r\n \r\n Text1\r\n Text2"

Is there a way to extract only the text, without the whitespace?

"Text1 Text2"

SelectNodes("//text()[normalize-space(.) != '']") - Alexander Petrov

Descendants().OfType<HtmlTextNode>().Where(n => !string.IsNullOrWhiteSpace(n.InnerText)) - Alexander Petrov

htmlDoc.DocumentNode.SelectSingleNode("//text()[normalize-space(.)]").InnerText; The DocumentNode is exactly that HTML snippet from above. - Vipz
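
The two suggestions from the comments can be put together roughly as follows. This is only a sketch, assuming the HtmlAgilityPack NuGet package is referenced. The `html` string, the `Program` class, and the variable names `viaXPath` and `viaLinq` are made up for illustration. Joining the trimmed pieces with a single space is also an assumption, chosen to match the desired "Text1 Text2" result.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

class Program
{
    static void Main()
    {
        // The snippet from the question, loaded from a string for illustration.
        var html = "<div> <div> <a></a> <a></a> " +
                   "<div><a><span></span>Text1</a></div> </div> " +
                   "<div>Text2</div> </div>";

        var htmlDoc = new HtmlDocument();
        htmlDoc.LoadHtml(html);

        // Variant 1: XPath that selects only text nodes that are not
        // whitespace-only. SelectNodes returns null when nothing matches,
        // so guard against that before joining.
        var textNodes = htmlDoc.DocumentNode
            .SelectNodes("//text()[normalize-space(.) != '']");
        var viaXPath = textNodes == null
            ? string.Empty
            : string.Join(" ", textNodes.Select(n => n.InnerText.Trim()));

        // Variant 2: LINQ over all descendants, keeping only text nodes
        // whose content is not blank.
        var viaLinq = string.Join(" ",
            htmlDoc.DocumentNode
                .Descendants()
                .OfType<HtmlTextNode>()
                .Where(n => !string.IsNullOrWhiteSpace(n.InnerText))
                .Select(n => n.InnerText.Trim()));

        Console.WriteLine(viaXPath);
        Console.WriteLine(viaLinq);
    }
}
```

Note that `SelectSingleNode` returns only the first matching text node, so it fits when a single piece of text is expected. For several pieces, as in the question, `SelectNodes` or `Descendants()` is the closer match. If the text may contain entities such as `&amp;`, `HtmlEntity.DeEntitize` can be applied to each piece.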