I want to capture the shortest match, but so that it ends not at the first closing div, but at the one after it.

Here, for example, is a regex that selects up to the very first match. I need it to go one further.

 (?<=item_description)[\w\W]*?(?=</div>)

 <div class="item_description"> Описание объявы <script type="text/javascript"> </script> </div>

1 answer

Yet another person trying to parse HTML with regular expressions.

Just don't do this. HTML's grammar is not regular (it is type 2 in the Chomsky hierarchy, while regular expressions only cover type 3), so you simply need to take any DOM or XML parser. That's all.

Full stop!
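To make the point about nesting concrete, here is a minimal sketch. The markup is made up for illustration (it is not from the question), and the greedy variant is my own assumption about what one might try next. Neither pattern returns the contents of the outer div, because a regex has no way to count how many divs are currently open.

```javascript
// Hypothetical sample markup: a div nested inside the target div,
// followed by an unrelated sibling div.
const nestedHtml =
  '<div class="item_description">outer <div>inner</div> tail</div><div>next</div>';

// The lazy pattern from the question stops at the first closing tag it sees.
const lazy = /(?<=item_description)[\w\W]*?(?=<\/div>)/;
// A greedy variant runs to the last closing tag in the whole input instead.
const greedy = /(?<=item_description)[\w\W]*(?=<\/div>)/;

const lazyMatch = nestedHtml.match(lazy)[0];
const greedyMatch = nestedHtml.match(greedy)[0];

console.log(lazyMatch);   // cut off inside the nested div
console.log(greedyMatch); // spills over into the sibling div
```

A DOM parser tracks the open elements for you, which is exactly the part a regular expression cannot do.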

P.S. OP, nothing personal. It's just a really bad idea and a common misconception, and threads about it keep popping up (there were two today), which I'm already fed up with. Here is, say, a JS one-liner that easily pulls the text "Описание объявы" (the ad description) out of the first div in this HTML.

 var text = document.getElementsByClassName("item_description")[0].childNodes[0].nodeValue; 

That's easier than trying to use something like this:

 <\/?([A-Za-z][^\s>\/]*)(?:=\s*(?:"[^"]*"|'[^']*'|[^\s>]+)|[^>])*(?:>|$) 

to pull the tags out of, for example, this construct:

 <script> a<b; if(div>0) alert("</div>"); </script> 
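As a rough illustration of how this bites the regex from the question: below, the script body from the example above is dropped into the question's sample markup (that combination is my assumption, not something the OP posted). It needs an engine with lookbehind support, for example a recent Node.js.

```javascript
// The question's sample div, with the tricky script body placed inside it
// (a hypothetical combination, for illustration only).
const htmlWithScript =
  '<div class="item_description"> Описание объявы ' +
  '<script type="text/javascript"> a<b; if(div>0) alert("</div>"); </script> </div>';

// The pattern from the question, written as a JS regex literal.
const pattern = /(?<=item_description)[\w\W]*?(?=<\/div>)/;
const scriptMatch = htmlWithScript.match(pattern)[0];

console.log(scriptMatch);
// The match ends inside the string literal in the script,
// not at the real closing tag of the div.
console.log(scriptMatch.endsWith('alert("')); // true
```

The pattern cannot tell a closing tag in the markup from the same characters inside script text, while a parser treats the script contents as opaque text.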
  • Dude, didn't it occur to you that it might not be JS, but some C++? Yes, it's terrible that the question isn't tagged with the language (and doesn't mention it), but you still have to reckon with that... - user31688
  • Yes, of course I thought about it, but there's no tag, so I decided to go with the most popular one. Although by now probably every language has something for parsing HTML. - igumnov
  • He definitely isn't using JS, because JS doesn't have this: (?<= - Qwertiy
  • What does it matter which language? - etki
  • @Qwertiy, this [ (?<= ] is part of regular expressions, and JS does have those. - user31688