Here is the task:

Print to the console every element that matches the specified tag. Each element goes on its own line, in the order in which it appears in the file. The amount of whitespace (spaces, \n, \r) does not affect the result. The file contains no CDATA sections; every opening tag has a separate closing tag, and there are no self-closing tags. A tag may contain nested tags.

Here are the tag templates from the task:

<tag>text1</tag> <tag text2>text1</tag> <tag text2>text1</tag> 

text1, text2 may be empty

Given this input:

 <span>string1 <span>string2</span> string11</span> 

The output should be:

    <span>string1 <span>string2</span> string11</span>
    <span>string2</span>

What regexp is needed for this? Here is my test code:

    import java.util.regex.Matcher;
    import java.util.regex.Pattern;

    public class Solution {
        public static void main(String[] args) {
            String testStr = "<span>string1 <span>string2</span> string11</span>";
            Pattern p = Pattern.compile("(\\<(/?[^\\>]+)\\>)");
            Matcher m = p.matcher(testStr);
            while (m.find()) {
                System.out.println(testStr.substring(m.start(), m.end()));
            }
        }
    }

And here is the answer I was given:

 (?=((?:(?1)|<(?!/span>)|[^<]+)+</span>)) 

But I cannot get it to work in my code:

 Pattern p = Pattern.compile("(?=((?:(?1)|<(?!/span>)|[^<]+)+</span>))"); 

It throws this:

Exception in thread "main" java.util.regex.PatternSyntaxException: Dangling meta character '?' near index 0
?=(<(span)>(?:(?1)|<(?!/\2>)|[^<]+)+</\2>)

How can I adapt this regular expression for my task?

  • Show the code that throws this exception. - LEQADA
  • Pattern p = Pattern.compile("(?=((?:(?1)|<(?!/span>)|[^<]+)+</span>))"); - Vetos
  • @LEQADA: the code is there. - Nick Volynkin
  • I have added it to the question. - Vetos
  • That regex uses PHP syntax. Java does not yet support expressions of this kind (note the (?1)). - Temka also

1 answer

In your regular expression:

 (?=((?:(?1)|<(?!/span>)|[^<]+)+</span>)) 

The (?1) construct is a recursive reference to capture group 1. That is PCRE syntax (as used in PHP); Java's regular expressions do not support recursive patterns, so the expression is not valid in Java, and that is what causes the error.


Your task is best solved with an HTML parser, for example jsoup:

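    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;

    String html = "<span>string1 <span>string2</span> string11</span>";
    String tag = "span";
    Document document = Jsoup.parse(html);
    // Print the outer HTML of every element matching the tag name
    document.select(tag).forEach(element -> System.out.println(element.outerHtml()));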

This code prints:

    <span>string1 <span>string2</span> string11</span>
    <span>string2</span>

This matches the output the task requires.
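If adding a library is not an option, the nesting that the recursive pattern was meant to handle can be tracked with an explicit stack instead. The following is only a sketch using the JDK alone. The class name TagExtractor and the method extract are made up for illustration, and it assumes the input is well formed as the task states and that attribute values never contain a ">" character.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class TagExtractor {

    /** Returns every <tag>...</tag> element, ordered by the position of its opening tag. */
    public static List<String> extract(String input, String tag) {
        // Non-recursive pattern: matches "<tag>", "<tag attrs>" and "</tag>",
        // but not a longer name such as "<tagline>".
        Pattern token = Pattern.compile("<(/?)" + Pattern.quote(tag) + "(?:\\s[^>]*)?>");
        Matcher m = token.matcher(input);

        List<String> result = new ArrayList<>();
        Deque<int[]> open = new ArrayDeque<>(); // each entry: {start offset, slot in result}

        while (m.find()) {
            if (m.group(1).isEmpty()) {
                // Opening tag: reserve a slot now so the output follows document order.
                open.push(new int[] {m.start(), result.size()});
                result.add(null);
            } else if (!open.isEmpty()) {
                // Closing tag: it closes the most recently opened element.
                int[] top = open.pop();
                result.set(top[1], input.substring(top[0], m.end()));
            }
        }
        // Assumption: input is well formed. Any unclosed element leaves a null slot; drop it.
        result.removeIf(s -> s == null);
        return result;
    }

    public static void main(String[] args) {
        String testStr = "<span>string1 <span>string2</span> string11</span>";
        extract(testStr, "span").forEach(System.out::println);
    }
}
```

For the test string from the question this yields the outer span first and the inner span second. Unlike a real parser, it does not understand comments, quoting, or malformed markup, so jsoup remains the safer choice for anything beyond the constraints given in the task.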

  • Thanks, I got it. I did not find any other answer. - Vetos