PHP Simple HTML DOM Parser: how do I make the script follow the links to the full news articles?

Question

Please help. I get an array of links that lead to the full news articles, but I don't understand how to make the parser follow them. Could someone knowledgeable please explain?

    function parser_simple_html($url, $i)
    {
        if ($i < 1) {
            $html = str_get_html(get_result($url));
            $blog = $html->find("#dle-content", 0);
            foreach ($blog->find(".post") as $root) {
                $film = $root->find(".post-title", 0)->find('a', 0)->href;
                // $film - got the array of links
                print $film;
                // how do I iterate over these links??
            }

            // find the next page
            $page1 = $blog->find('.navigation', 0)->find('a', 10)->next_sibling()->href;
            $page2 = $blog->find('.navigation', 0)->find('a', 9)->next_sibling()->href;
            if ($page1 == true) {
                $page = $blog->find('.navigation', 0)->find('a', 10)->next_sibling()->href;
            } else {
                $page = $blog->find('.navigation', 0)->find('a', 9)->next_sibling()->href;
            }
            // end of the next-page search

            // go to the next page
            if ((isset($page)) && !empty($page)) {
                $i++;
                parser_simple_html('' . $page . '', $i);
            }
        }
    }

    $i = 0;
    parser_simple_html('http://дле-сайт.ру/page/1/', $i);

Updated version of the program:

    // ----------------- Helper functions.

    // Moved into a function because the way the file is fetched may change.
    function getHtmlDocument($url)
    {
        return file_get_html($url);
    }

    function getLinksFromDocument($htmlDoc)
    {
        // Code that will return all the links in an array.
        // Change your code so that it returns only the links.
        $ssil = '';
        $html = getHtmlDocument('http://dle-site.ru/page/1/');
        $blog = $html->find("#dle-content", 0);
        foreach ($blog->find(".post") as $root) {
            $ssil .= $root->find(".post-title", 0)->find('a', 0)->href . ' ';
        }
        $s = $ssil;
        $ssilka = explode(" ", $s);
        return [$ssilka];
    }

    function getArticleInfo($htmlDoc)
    {
        $tittle = $articlesInfo->find("#dle-content", 0)->find(".post", 0)->find(".post-title", 0)->find('a', 0)->plaintext;
        return [
            "title" => $tittle,   // fill this in yourself
            "content" => $tittle  // fill this in yourself too
        ];
    }

    // ----------------- The program itself.

    // Get the document that contains the list of links
    $htmlDocument = getHtmlDocument('http://dle-site.ru/page/1/');

    // Parse the document to get just the list of links
    $linkList = getLinksFromDocument($htmlDocument);

    // Empty array for the article information
    $articlesInfo = [];

    // For each link, get the document.
    foreach ($linkList as $link) {
        $articleDocument = getHtmlDocument($link);
        // Parse these fetched documents.
        $articlesInfo[$link] = getArticleInfo($link);
    }

    // Here the $articlesInfo variable holds all the information about all the articles.

Start here: php.net/manual/ru/control-structures.foreach.php
@DmitryZasypkin I want the parser to follow each link in the array and take the article description, or at least the article title, from the post-title class.
@E_p Thanks, I have read it, and not only that article. But alas, I have no mentor who would tell me what I am doing wrong and why. I read a book on PHP, and the examples given in the book all seemed clear, but as soon as I try something that is not from the examples, everything goes upside down. That is why I am asking for someone to explain what is what.
@KostyaKorostelev Write a function that processes these links and apply it to the resulting array.

E_p · Accepted Answer · 2017-02-28T22:27:28

If you have little programming experience, you should start not by copying code but by describing what needs to happen.

The algorithm of your program is very simple:

1. Get the document that contains the list of links.
2. Parse that document to get just the list of links.
3. For each link, get a document.
4. Parse those documents.

You can put these steps into the code as comments: a skeleton program that does nothing yet. Then look at it and think. Steps 1 and 3 are the same operation, so you can write one shared function for both.

Now you can start writing code.

    <?php
    // Assumed path to the Simple HTML DOM library, which provides file_get_html().
    require_once 'simple_html_dom.php';

    // ----------------- Helper functions.

    // Moved into a function because the way the file is fetched may change.
    function getHtmlDocument($url)
    {
        return file_get_html($url);
    }

    function getLinksFromDocument($htmlDoc)
    {
        // Code that will return all the links in an array.
        // Change your code so that it returns only the links.
        return [];
    }

    function getArticleInfo($htmlDoc)
    {
        return [
            "title" => "",   // fill this in yourself
            "content" => ""  // fill this in yourself too
        ];
    }

    // ----------------- The program itself.

    // Get the document that contains the list of links
    $htmlDocument = getHtmlDocument('http://дле-сайт.ру/page/1/');

    // Parse the document to get just the list of links
    $linkList = getLinksFromDocument($htmlDocument);

    // Empty array for the article information
    $articlesInfo = [];

    // For each link, get the document.
    foreach ($linkList as $link) {
        $articleDocument = getHtmlDocument($link);
        // Parse the fetched document (pass the document, not the link).
        $articlesInfo[$link] = getArticleInfo($articleDocument);
    }

    // Here the $articlesInfo variable holds all the information about all the articles.
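As a rough sketch of how the two placeholder functions might be filled in: the version below swaps Simple HTML DOM for PHP's bundled DOMDocument and DOMXPath so that it runs without extra files, and it works on inline sample markup instead of downloading anything. The sample HTML, the example.com URLs, the `post-content` class, and the helper names `makeXPath` and `hasClass` are assumptions for illustration; only `#dle-content`, `.post`, and `.post-title` come from the question.

```php
<?php
// Hypothetical listing page that imitates the structure described in the question.
$listingHtml = <<<'HTML'
<div id="dle-content">
  <div class="post"><h2 class="post-title"><a href="http://example.com/news/1.html">First</a></h2></div>
  <div class="post"><h2 class="post-title"><a href="http://example.com/news/2.html">Second</a></h2></div>
</div>
HTML;

// Hypothetical article page; the "post-content" class is an assumption.
$articleHtml = <<<'HTML'
<div id="dle-content">
  <div class="post">
    <h1 class="post-title"><a href="http://example.com/news/1.html">First</a></h1>
    <div class="post-content">Full text of the first article.</div>
  </div>
</div>
HTML;

// Build an XPath helper from an HTML string using PHP's DOM extension.
function makeXPath(string $html): DOMXPath
{
    // Real pages are rarely valid HTML, so parser warnings are collected silently.
    libxml_use_internal_errors(true);
    $doc = new DOMDocument();
    $doc->loadHTML($html);
    libxml_clear_errors();
    return new DOMXPath($doc);
}

// XPath condition that matches an element carrying the given CSS class.
function hasClass(string $class): string
{
    return "contains(concat(' ', normalize-space(@class), ' '), ' $class ')";
}

// Step 2: return only the links, as a flat array of strings.
function getLinksFromDocument(DOMXPath $xpath): array
{
    $query = "//*[@id='dle-content']//*[" . hasClass('post') . "]"
           . "//*[" . hasClass('post-title') . "]//a[@href]";
    $links = [];
    foreach ($xpath->query($query) as $a) {
        $links[] = $a->getAttribute('href');
    }
    return $links;
}

// Step 4: pull the title and the text out of one article document.
function getArticleInfo(DOMXPath $xpath): array
{
    $title   = $xpath->query("//*[" . hasClass('post-title') . "]")->item(0);
    $content = $xpath->query("//*[" . hasClass('post-content') . "]")->item(0);
    return [
        'title'   => $title ? trim($title->textContent) : '',
        'content' => $content ? trim($content->textContent) : '',
    ];
}

$links = getLinksFromDocument(makeXPath($listingHtml));
$info  = getArticleInfo(makeXPath($articleHtml));
print_r($links);
print_r($info);
```

With Simple HTML DOM the same idea would be written with `find()` calls like the ones in the question. Either way, the selectors for the article page have to be checked against the markup of the real site.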

Thank you very much for your help! You laid everything out very clearly.
Good afternoon. I have been racking my brains for an hour and still don't understand how the function getArticleInfo($htmlDoc) works. What should I add to it, and how will it work inside the foreach?
Sorry for my ignorance, but I probably haven't quite understood how to work with the documents: how do I output the array that results from the foreach?
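On the last two comments, a minimal sketch of how the loop fills `$articlesInfo` and how the result can be read back. To stay runnable without a network, `getHtmlDocument` here is a stand-in that returns a canned string and `getArticleInfo` merely wraps that string; both bodies and the example.com links are assumptions, not the real parsing code. The point is that the function receives the downloaded document, returns one array per call, and the loop stores that array under the link it came from.

```php
<?php
// Stand-in for the real download: returns canned text instead of fetching the URL.
function getHtmlDocument(string $url): string
{
    return "Document for $url";
}

// Stand-in parser: a real version would extract the title and text from the HTML.
function getArticleInfo(string $htmlDoc): array
{
    return ['title' => strtoupper($htmlDoc), 'content' => $htmlDoc];
}

$linkList = ['http://example.com/news/1.html', 'http://example.com/news/2.html'];
$articlesInfo = [];

// Each pass calls getArticleInfo() once and stores its returned array under the link.
foreach ($linkList as $link) {
    $articleDocument = getHtmlDocument($link);
    $articlesInfo[$link] = getArticleInfo($articleDocument);
}

// Reading the result: the key is the link, the value is that article's array.
foreach ($articlesInfo as $link => $info) {
    echo $link, ' => ', $info['title'], PHP_EOL;
}
```

For a quick look at the whole structure while debugging, `print_r($articlesInfo);` prints the nested array in one call.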
