Can I save all the tags of the selected node? When selecting, all I have is $elem->nodeValue, which returns only the text of all the tags. Is it possible to get the full contents together with the markup? If not, how can it be done some other way? Thanks.
3 answers
If you have a DOMNode (you are using DOMDocument, right?), the saveHTML method will help:
$html = $doc->saveHTML($myNode);
Note, however, that support for the $myNode argument was added in PHP 5.3.6.
The XML counterpart is saveXML.
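A minimal sketch of the approach described above, assuming a small hypothetical HTML fragment and a hypothetical helper named innerHTML (neither comes from the question). It shows the markup of the node itself, the markup of only its children, and the XML serialization of the same node. Exact whitespace in the output may vary with the libxml version, so no expected output is shown.

```php
<?php
// Hypothetical sample markup, used here only for illustration.
$source = '<div id="parent"><img src="image"><p>Hello <b>world</b></p></div>';

$doc = new DOMDocument();
$doc->loadHTML($source);

// Pick the node whose markup we want (the first <div> in the document).
$div = $doc->getElementsByTagName('div')->item(0);

// Outer markup: the node itself plus everything inside it.
$outer = $doc->saveHTML($div);

// Inner markup: serialize each child node and join the results.
// innerHTML is a hypothetical helper name, not part of the DOM extension.
function innerHTML(DOMNode $node): string
{
    $html = '';
    foreach ($node->childNodes as $child) {
        $html .= $node->ownerDocument->saveHTML($child);
    }
    return $html;
}
$inner = innerHTML($div);

// XML serialization of the same node via saveXML.
$xml = $doc->saveXML($div);

echo $outer, "\n", $inner, "\n", $xml, "\n";
```

Unlike $elem->nodeValue, both $outer and $inner keep the tags and attributes; $inner simply leaves out the wrapping div.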
- You understood me, thank you. I knew about this method but did not think of it: I was stuck on the idea that it should only be used at the very end of the whole process. Thanks, there is no limit to my foolishness. - dima buhayov
|
There are two excellent tools that I use:
- Goutte / Guzzle Client
To begin with, create a new instance of the Goutte\Client or GuzzleHttp\Client class and perform the steps in order:
Goutte
$client = new Goutte\Client();
// Send the request and get the content as HTML
$content = $client->request('GET', 'http://stackoverflow.com/')->html();
// Take the document and create a singleton instance for working with phpQuery
$dom = phpQuery::newDocument('<!DOCTYPE html>' . $content);
// Loop over all elements of the document
foreach ($dom->find('*') as $element) {
    // Wrap the element in a new phpQuery object
    $element = pq($element);
    /**
     * Here you can already read attributes and text via
     * $element->attr('src'), $element->text(), etc.
     */
}
Guzzle
$client = new GuzzleHttp\Client();
$content = $client->get('http://stackoverflow.com/')->getBody()->getContents();
$dom = phpQuery::newDocument($content);
The rest is the same as in the Goutte example above.
Selectors in phpQuery work in the same way as in native jQuery.
- Please give me a simple answer. Can this tool return not just the text of a node but all the markup of the selected node? You only wrote how to use it, and that is all. - dima buhayov
- This is all very well, but both the first and the second option boil down to selecting text. The first gets an attribute value and the text of the element. What I need is the element itself with its attributes and text: for example, I select a parent div, and its contents should come back as "<img src='image' />". That is what I want. The second option simply loads the document in a different way. With the DOM this can be done more simply, without any libraries: $html = new DOMDocument(); $html->loadHTML(file_get_contents('http://stackoverflow.com/')); But thanks. - dima buhayov
- @dimabuhayov, yes, you can. - Roman Kozin
|
Try this library: http://simplehtmldom.sourceforge.net/ It is built on jQuery-style selectors, with which you can select HTML.
- An answer should not consist of a link alone; please provide at least one basic parsing example (code) using simplehtmldom. - Alex
- I don't need libraries; I already know them all. You are giving advice on how to make parsing simpler by offering "phpQuery", "Simple ...". You are not reading the question. - dima buhayov
|