SEO analysis

Question

Building on the concepts of Web Mining and Data Mining, here is the question: we have parsed the entire site and obtained a list of all its pages. Some URLs are sections (nodes) such as /news/.. or /catalog/.., and the rest are ordinary pages. We want to analyze (output) the nodes separately from everything else. I am interested in an algorithm for detecting and outputting them. Thanks.

An interesting write-up: Web Mining: basic concepts. - stanislav

Answer 1 · 2012-08-31T11:55:39

Most likely we are reinventing the wheel, but it is interesting :)

Every element has a parent, and sometimes it is a parent itself. Nodes have children; pages do not. Split the URL of each page on slashes and determine the unique parent of each segment:

    $url1 = '/news/2012-08-31.html';
    // parent: 'news'
    $url2 = '/archive/news/2012/08/31/glavnaya_novost.html';
    // parent: '31', but specifically the '31' whose own parent is '08', and not just any '08', but .. well, you get the idea)
    // Result of the split: array('archive', 'news', '2012', '08', '31', 'glavnaya_novost.html');

Create an array $parents that holds one record for each unique parent, for example:

    $parents[99] = array('url' => '31', 'parentId' => 19, 'childs' => 0);

Each record stores the segment name and a reference to its parent; for the root, the parent is null. Now walk the segments of each URL from left to right, from the root to the last segment, creating parents that do not exist yet and incrementing the child count of the parent under which each new element is created.

Nodes to output: all elements where childs > 0.
Pages to output: everything where childs = 0.
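The steps above can be put together roughly as follows. This is only a sketch, not a finished implementation: the function names buildTree, splitNodesAndPages and fullPath, as well as the $index lookup array, are made up for illustration, and the input is assumed to be a list of URL paths without a host or query string.

```php
<?php
// Hypothetical helper: builds the $parents array from a list of URL paths.
function buildTree(array $urls): array
{
    $parents = []; // id => ['url' => segment, 'parentId' => int|null, 'childs' => int]
    $index = [];   // "parentId/segment" => id, so each unique parent is stored once
    $nextId = 1;

    foreach ($urls as $url) {
        // Split on slashes and drop empty segments (leading or trailing slash).
        $segments = array_values(array_filter(explode('/', $url), 'strlen'));
        $parentId = null; // top-level segments have the root (null) as parent

        foreach ($segments as $segment) {
            $key = ($parentId ?? 'root') . '/' . $segment;
            if (!isset($index[$key])) {
                $id = $nextId++;
                $index[$key] = $id;
                $parents[$id] = ['url' => $segment, 'parentId' => $parentId, 'childs' => 0];
                if ($parentId !== null) {
                    $parents[$parentId]['childs']++; // a new child appeared under this parent
                }
            }
            $parentId = $index[$key]; // go one level deeper
        }
    }
    return $parents;
}

// Hypothetical helper: restores the full path of an element by following parentId links.
function fullPath(array $parents, int $id): string
{
    $parts = [];
    for ($cur = $id; $cur !== null; $cur = $parents[$cur]['parentId']) {
        array_unshift($parts, $parents[$cur]['url']);
    }
    return '/' . implode('/', $parts);
}

// Hypothetical helper: nodes are elements with childs > 0, pages have childs = 0.
function splitNodesAndPages(array $parents): array
{
    $result = ['nodes' => [], 'pages' => []];
    foreach ($parents as $id => $item) {
        $group = $item['childs'] > 0 ? 'nodes' : 'pages';
        $result[$group][] = fullPath($parents, $id);
    }
    return $result;
}

// Usage with the two URLs from the answer plus one more news page.
$tree = buildTree([
    '/news/2012-08-31.html',
    '/archive/news/2012/08/31/glavnaya_novost.html',
    '/news/2012-08-30.html',
]);
print_r(splitNodesAndPages($tree));
```

Keep in mind that with this approach a section is recognized as a node only if at least one URL below it is in the list: a lone /catalog/ with no crawled children would end up among the pages.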
