Please help me figure this out. I am trying to process the device information page of a Cisco phone, which is served as HTML. The data I need is in table[2]:

    <HTML>
    <HEAD>
    <META http-equiv="Content-Type" content="text/html; charset=utf-8"><TITLE>Cisco Systems, Inc.</TITLE>
    </HEAD>
    <BODY bgcolor="#FFFFFF" link="#FFFFFF" vlink="#FFFFFF" alink="#FFFFFF" text="#003031" >
    <TABLE BORDER="1" WIDTH="100%" HEIGHT="100%" CELLSPACING="0" CELLPADDING="0" bordercolor="#003031">
    <TR>
    <td WIDTH="200" HEIGHT="100" ALIGN=center><A HREF="http://www.cisco.com"><IMG SRC="/FS/Logo.png"></A></TD>
    <td HEIGHT="50" bgcolor="#003031"><p ALIGN=center style="margin-top: 0px;"><B><font color="#FFFFFF" size="6">Device information</FONT></B><p ALIGN=center><B><font color="#FFFFFF" size="4">Cisco IP Phone</FONT></FONT></B></TD>
    </TR>
    <TR>
    <td WIDTH="200" ALIGN=center VALIGN=top bgcolor="#003031">
    <TABLE BORDER="0" CELLSPACING="10" CELLPADDING="0">
    <TR><TD><a href="/CGI/Java/Serviceability?adapter=device.statistics.device">Device information</A></TD></TR>
    <TR><TD><B><font color='#FFFFFF'>Streaming statistics</FONT></B></TD></TR>
    <TR><TD>&nbsp;&nbsp;&nbsp;<a href="/CGI/Java/Serviceability?adapter=device.statistics.streaming.0">Stream 1 </A></TD></TR>
    </TABLE>
    </TD>
    <td VALIGN=top>
    <DIV ALIGN=center>
    <TABLE BORDER="0" CELLSPACING="10" CELLPADDING="0">
    <TR><TD><B> Service mode</B></TD><td width=20></TD><TD><B>Enterprise</B></TD></TR>
    ...
    <TR><TD><B> Service domain</B></TD><td width=20></TD><TD><B></B></TD></TR>
    <TR><TD><B> App load ID</B></TD><td width=20></TD><TD><B>rootfs8845_65.12&#x2D;1&#x2D;1&#x2D;12</B></TD></TR>
    </TABLE>
    </DIV>
    </TD>
    </TR>
    </TABLE>
    </BODY></HTML>

When I try to pull the device parameters out of the table rows like this:

    param = soup.findChildren("table"[2])
    print(param)

The output is a flat list with no grouping by row; all the parameters come out one after another:

 ....<b> Service mode</b>, <b>Enterprise</b>, <b> Service domain</b>, <b></b>, <b> Service state</b>, <b>Idle</b>, .... 

However, if I search for the "DIV" tag instead:

    param = soup.findChildren("div")
    print(param)

The output contains the data I need, but as a single block that I cannot loop over to build a dictionary in which each row becomes a key-value pair (for example, "Service mode": "Enterprise").

How do I process the table correctly to build a dictionary of the parameters in table[2]?

  • "table"[2] is not what you expect :) - gil9red
  • Yes, what I get is an unordered set of data from the two columns of this table, and on top of that, some rows may have data in only one column. - Luarvick
  • Because "table"[2] evaluates to "b", so soup.findChildren("table"[2]) actually becomes soup.findChildren("b") - gil9red
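
To illustrate the comment above: Python evaluates the index on the string literal before the call happens, so the method never sees the word "table". A minimal sketch (the variable name `tag_name` is only for illustration):

```python
# Indexing a string returns the single character at that position,
# so "table"[2] is the third character of the word, not the third table.
tag_name = "table"[2]
print(tag_name)
```

This is why the output in the question is a list of every <b> tag on the page rather than the rows of one table.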

1 answer

Try this:

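    from bs4 import BeautifulSoup

    # ... build `soup` from the page, e.g. soup = BeautifulSoup(html, 'html.parser')

    # The tables are nested inside one another, so take the third <table>
    # in document order rather than relying on a sibling-based selector
    table = soup.select('table')[2]

    name_by_value = dict()
    for tr in table.select('tr'):
        # Each row has three cells: name, spacer, value
        tds = tr.select('td')
        name, value = tds[0].text.strip(), tds[2].text.strip()
        name_by_value[name] = value

    print(name_by_value)
    # {'Service mode': 'Enterprise', 'Service domain': '', 'App load ID': 'rootfs8845_65.12-1-1-12'}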

P.S.

I noticed that the required cell values are found exclusively in <b>, which means the selector can be narrowed a little:

    ...
    for tr in table.select('tr'):
        tds = [x.text.strip() for x in tr.select('td > b')]
        name, value = tds[0], tds[1]
        name_by_value[name] = value
    ...

And if there are always two elements, then use unpacking:

    ...
    for tr in table.select('tr'):
        name, value = [x.text.strip() for x in tr.select('td > b')]
        name_by_value[name] = value
    ...
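
If some rows really do contain only one <b> cell, as the comment under the question suggests, the unpacking variant would raise a ValueError on those rows. Below is a minimal sketch of a more tolerant pairing step. It works on plain lists of cell texts, i.e. what `[x.text.strip() for x in tr.select('td > b')]` yields for each row; the helper name `pairs_to_dict` and the sample `rows` data are hypothetical and not taken from the phone's page.

```python
def pairs_to_dict(rows):
    """Build {name: value} from per-row lists of cell texts.

    Rows with no cells are skipped; a row with a single cell
    gets an empty string as its value.
    """
    result = {}
    for cells in rows:
        if not cells:
            continue
        name = cells[0]
        value = cells[1] if len(cells) > 1 else ''
        result[name] = value
    return result


# Hypothetical sample: one full row, one row with only a name, one empty row
rows = [['Service mode', 'Enterprise'], ['Service domain'], []]
print(pairs_to_dict(rows))
```

Whether a missing value should become an empty string or the row should be dropped depends on how the dictionary is used later; the empty string here simply mirrors what an empty <b></b> cell produces.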
  • Yes, this is exactly what was needed, thank you very much for the solution! - Luarvick