I have an HTML page:

    ...
    <tr>
      <td class="basic-text" width="250">Type Allocation Holder</td>
      <td class="basic-text">Siemens</td>
    </tr>
    <tr>
      <td class="basic-text" width="250">Mobile Equipment Type</td>
      <td class="basic-text">Siemens S40</td>
    </tr>
    ...

I need to get Siemens and S40 (or at least Siemens S40) from it using regular expressions.

Thank you in advance.

    4 answers

    As far as I remember, you can parse the page itself in Delphi: Delphi can work with the DOM. That will probably work better than regular expressions.

    • Specifically, TXMLDocument can do this. - Nofate
    • About TXMLDocument: judging by the name, it only works with XML documents. The problem is that the HTML being processed may not be well-formed XML, though I may be wrong. I once loaded pages through TWebBrowser and then parsed them through its interface. - Ale_x

    Take a look at this resource; it might help.

    Besides, I have already answered a similar (albeit only remotely similar) question HERE.

      This pattern finds the text of the last <td> element in each <tr>:

       <td[^>]*>([^<]*)</td>\s*</tr> 

      But this approach is fragile. It is better and more correct to use an HTML parser, because the markup offers nothing to anchor the pattern on (if the td elements holding phone names had a specific class, it would be a different matter).
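
      A minimal sketch of how the pattern above might be applied with TRegExpr. It assumes the RegExpr unit is available (it ships with Free Pascal; for Delphi it is a separate library). The program name and the Html constant are made up for illustration; Html simply holds the fragment from the question.

```pascal
program ExtractLastCells; // hypothetical name, for illustration only

{$IFDEF FPC}{$mode delphi}{$H+}{$ENDIF}
{$IFDEF MSWINDOWS}{$APPTYPE CONSOLE}{$ENDIF}

uses
  RegExpr; // provides TRegExpr

const
  // Sample input: the two table rows from the question.
  Html =
    '<tr> <td class="basic-text" width="250">Type Allocation Holder</td> ' +
    '<td class="basic-text">Siemens</td> </tr> ' +
    '<tr> <td class="basic-text" width="250">Mobile Equipment Type</td> ' +
    '<td class="basic-text">Siemens S40</td> </tr>';

var
  Re: TRegExpr;
begin
  Re := TRegExpr.Create;
  try
    // Expression is a property, so the pattern is assigned, not passed as an argument.
    Re.Expression := '<td[^>]*>([^<]*)</td>\s*</tr>';
    // Exec finds the first match; ExecNext continues from the end of the previous one.
    if Re.Exec(Html) then
      repeat
        WriteLn(Re.Match[1]); // group 1: text of the last <td> in the row
      until not Re.ExecNext;
  finally
    Re.Free;
  end;
end.
```

      On the fragment from the question this should print Siemens and then Siemens S40. The match depends on every cell containing plain text with no nested tags, which is one reason the pattern is fragile.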

         var FindMaster: TRegExpr;
         .................
         begin
           FindMaster := TRegExpr.Create;
           // Expression is a property, so it is assigned rather than called
           FindMaster.Expression := 'Siemens S40';
           ...............................
         end;
        • And if we already know that we are looking for "Siemens S40", then why search at all? - Nofate
        • @Nofate But it does exactly what the question asks for. - Alex78191