There is an HTML table. How to import HTML into Excel using a COM object? (Powershell script)
- and Excel and Microsoft.Office.Interop.Excel installed? - Stack
- Excel is installed. and what is Microsoft.Office.Interop.Excel and why? - user195474
- Microsoft.Office.Interop. * - .NET wrappers for working with COM objects Excel, Word, etc. - Stack
- PowerShell is based on .NET - Stack
- Everything set - user195474
|
1 answer
This is possible either through Regex (which is more convenient and faster) or through the Com object. Through COM, you need to read the entire table and each row, parse it into an array, and unload it into csv. If your table looks like this:
<H3>Заголовок</H3> <TABLE ID="table6" BORDER=1 CELLPADDING=2> <TR><TH><B>Condition</B></TH><TH><B>\LogicalDisk(*)\Disk Transfers/sec</B></TH><TH><B>Min</B></TH><TH><B>Avg</B></TH><TH><B>Max</B></TH><TH><B>Hourly Trend</B></TH><TH><B>Std Deviation</B></TH><TH><B>10% of Outliers Removed</B></TH><TH><B>20% of Outliers Removed</B></TH><TH><B>30% of Outliers Removed</B></TH></TR> <TR><TD>No Thresholds</TD><TD>MACHINENAME/C:</TD><TD>1</TD><TD>7</TD><TD>310</TD><TD>0</TD><TD>11</TD><TD>5</TD><TD>5</TD><TD>5</TD></TR> <TR><TD>No Thresholds</TD><TD>MACHINENAME/D:</TD><TD>0</TD><TD>0</TD><TD>0</TD><TD>0</TD><TD>0</TD><TD>0</TD><TD>0</TD><TD>0</TD></TR> <TR><TD>No Thresholds</TD><TD>MACHINENAME/E:</TD><TD>0</TD><TD>24</TD><TD>164</TD><TD>-1</TD><TD>11</TD><TD>22</TD><TD>21</TD><TD>20</TD></TR> <TR><TD>No Thresholds</TD><TD>MACHINENAME/HarddiskVolume5</TD><TD>0</TD><TD>0</TD><TD>2</TD><TD>0</TD><TD>0</TD><TD>0</TD><TD>0</TD><TD>0</TD></TR> </TABLE> That code for COM will look like this:
# С помощью IE создаем HTML файл. $oIE = New-Object -ComObject InternetExplorer.Application $oIE.Navigate("file.html") $oHtmlDoc = $oIE.Document # Читаем таблицу по её ID. $oTable = $oHtmlDoc.getElementByID("table6") # Выгружаем строки в массив. $oTbody = $oTable.childNodes | Where-Object { $_.tagName -eq "tbody" } $cTrs = $oTbody.childNodes | Where-Object { $_.tagName -eq "tr" } # Создаем массив из заголовков таблицы $cThs = $cTrs[0].childNodes | Where-Object { $_.tagName -eq "th" } $cHeaders = @() foreach ($oTh in $cThs) { $cHeaders += ` ($oTh.childNodes | Where-Object { $_.tagName -eq "b" }).innerHTML } # Конвертим строки в массив PS объектов и экспортируем в CSV $cCsv = @() foreach ($oTr in $cTrs) { $cTds = $oTr.childNodes | Where-Object { $_.tagName -eq "td" } # Пропускаем первую строку (заголовки). if ([String]::IsNullOrEmpty($cTds)) { continue } $oRow = New-Object PSObject for ($i = 0; $i -lt $cHeaders.Count; $i++) { $oRow | Add-Member -MemberType NoteProperty -Name $cHeaders[$i] ` -Value $cTds[$i].innerHTML } $cCsv += $oRow } # Закрываем IE $oIE.Quit() # Экспорт в CSV. $cCsv | Export-Csv -Path "file.csv" -NoTypeInformation |