Greetings
I need to build an adjacency matrix from data collected by a crawler bot. The data contains a list of visited pages, their inlinks (links pointing to the page itself), their outlinks (links pointing to another page), and extra pages (for example, entries that are not http links).
I need an algorithm that reads this information, skips the extra pages, and writes a 1 wherever there is an inlink / outlink between two pages and a 0 otherwise, thereby building a 500x500 binary matrix.
Here is a possible pseudocode:
for each visited page vp
    for each outlink of vp
        if link relative
            resolve link
        if link to visited page
            write 1
        else if link dangling
            ignore it
        else
            write 0

Links to resources and another pseudocode:
https://moodle.bbk.ac.uk/pluginfile.php/666888/mod_resource/content/2/sewn_2016_labsheet_3_pseudocode.pdf
http://www.dcs.bbk.ac.uk/~martin/sewn/ls3/sewn_2016_labsheet_3_full_crawl.xlsx
Is it possible to implement this task in Python, or is it better to use other tools such as R or MATLAB?
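
As a starting point, here is a rough Python sketch of the pseudocode, using only the standard library. It is not a finished solution and rests on several assumptions: the crawl data has already been loaded into a dict mapping each visited page URL to a list of its outlinks (the name `crawl` and this layout are hypothetical, not the format of the spreadsheet), "extra pages" means links that are not http/https after resolving, and "dangling" means a link to a page outside the visited set. Every cell starts at 0, so "write 0" is just the default.

```python
from urllib.parse import urljoin, urldefrag


def build_adjacency_matrix(crawl):
    """Build a binary adjacency matrix from a crawl.

    crawl: dict mapping each visited page URL to a list of its outlinks
           (hypothetical input format, assumed for this sketch).
    Returns (pages, matrix) where matrix[i][j] == 1 if pages[i] links to pages[j].
    """
    pages = list(crawl)                                # fixes the row/column order
    index = {url: i for i, url in enumerate(pages)}    # URL -> row/column number
    n = len(pages)
    matrix = [[0] * n for _ in range(n)]               # all cells default to 0

    for vp, outlinks in crawl.items():
        for link in outlinks:
            # Resolve relative links against the page they appear on,
            # and drop any "#fragment" part.
            absolute, _fragment = urldefrag(urljoin(vp, link))
            # Assumption: "extra" links are those that are not http/https.
            if not absolute.startswith(("http://", "https://")):
                continue
            target = index.get(absolute)
            # Assumption: a link to a page that was not visited is ignored.
            if target is None:
                continue
            matrix[index[vp]][target] = 1
    return pages, matrix


# Tiny made-up example with 3 pages instead of 500.
crawl = {
    "http://example.com/a": ["b", "http://example.com/c", "mailto:x@example.com"],
    "http://example.com/b": ["http://example.com/a", "http://elsewhere.org/"],
    "http://example.com/c": [],
}
pages, matrix = build_adjacency_matrix(crawl)
for row in matrix:
    print(row)
```

With 500 visited pages the same function would produce a 500x500 list of lists. Reading a column of the matrix gives the inlinks of a page, and reading a row gives its outlinks.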