There are such concepts as Web Mining and Data Mining. With them in mind, a question: we have parsed an entire site and have a list of all its pages. Some URLs are sections such as /news/.. or /catalog/.., and below them are the pages. How can we analyze (output) the nodes separately from everything else? I'm interested in the algorithm for detecting and outputting them. Thanks.
1 answer
We are most likely reinventing the wheel here, but it is interesting :)
Every element has a parent and may itself be a parent. Nodes, in this case, have children, while pages do not. Split the URL of each page on slashes and determine the unique parent of each segment:
$url1 = '/news/2012-08-31.html'; // parent: 'news'
$url2 = '/archive/news/2012/08/31/glavnaya_novost.html';
// parent: '31', but specifically the '31' whose own parent is '08', and not just any '08' either... well, you get the idea)
// Result of splitting $url2:
// array('archive', 'news', '2012', '08', '31', 'glavnaya_novost.html');
Create an array $parents that holds one record for each unique parent, for example:
$parents[99] = array('url' => '31', 'parentId' => 19, 'childs' => 0);
Each record stores the element's name and a reference to its parent. The root has a null parent. Now walk through the elements of each URL from left to right, from the root to the end, creating each parent if it does not exist yet and incrementing the child count of the parent whenever a new element is added under it.
Output nodes: all elements with childs > 0.
Output pages: everything with childs = 0.
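The steps above could be sketched roughly as follows. This is only an illustration under some assumptions, not a definitive implementation: the function names buildParents and splitNodesAndPages and the $idByPath lookup are hypothetical, the input is assumed to be site-relative paths without query strings, and top-level segments are treated as roots with a null parentId. Records are looked up by their full path, so that the '31' under '/archive/news/2012/08' is never confused with another '31' elsewhere.

```php
<?php
// Hypothetical helper: builds the $parents table described in the answer.
// Each record: 'url' (segment name), 'parentId' (null for top-level
// segments), 'childs' (number of direct children).
function buildParents(array $urls): array
{
    $parents = [];   // id => record
    $idByPath = [];  // full path => id, keeps same-named segments apart

    foreach ($urls as $url) {
        // Split on slashes and drop the empty pieces around them.
        $segments = array_filter(explode('/', $url), fn($s) => $s !== '');

        $parentId = null;
        $currentPath = '';
        foreach ($segments as $segment) {
            $currentPath .= '/' . $segment;
            if (!isset($idByPath[$currentPath])) {
                // First time this element is seen: create it and
                // raise the child count of its parent.
                $id = count($parents) + 1;
                $idByPath[$currentPath] = $id;
                $parents[$id] = [
                    'url' => $segment,
                    'parentId' => $parentId,
                    'childs' => 0,
                ];
                if ($parentId !== null) {
                    $parents[$parentId]['childs']++;
                }
            }
            $parentId = $idByPath[$currentPath];
        }
    }
    return $parents;
}

// Hypothetical helper: nodes have childs > 0, pages have childs = 0.
function splitNodesAndPages(array $parents): array
{
    $nodes = [];
    $pages = [];
    foreach ($parents as $id => $record) {
        if ($record['childs'] > 0) {
            $nodes[$id] = $record;
        } else {
            $pages[$id] = $record;
        }
    }
    return [$nodes, $pages];
}

$urls = [
    '/news/2012-08-31.html',
    '/archive/news/2012/08/31/glavnaya_novost.html',
];
[$nodes, $pages] = splitNodesAndPages(buildParents($urls));

echo implode(', ', array_column($nodes, 'url')), PHP_EOL;
// news, archive, news, 2012, 08, 31
echo implode(', ', array_column($pages, 'url')), PHP_EOL;
// 2012-08-31.html, glavnaya_novost.html
```

Note that a section that appears in the list with no pages under it (say, an empty '/catalog/') would end up among the pages with this rule, since it has no children.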