There is a string like this:

№;Задача;T;O;P ;2016-01-18 1. ;task1;03:00;.; ;2016-01-18 2. ;task2;03:00;.; ;2016-01-18 3. ;task3;03:00;.; ;2016-01-19 7. ;33333;03:00;.; ;2016-01-19 8. ;d;03:00;.; ;2016-01-19 9. ;00;03:00;.; ;2016-01-20 21. ;task1;03:00;.; ;2016-01-20 22. ;task2;03:00;.; ;2016-01-21 25. ;testtime;03:00;.; ;2016-01-21 26. ;fgghgfh;23:45;.;, new t. 

How can I remove the duplicate dates from it and keep only the first occurrence of each, so that the result looks like this:

 №;Задача;T;O;P ;2016-01-18 1. ;task1;03:00;.; 2. ;task2;03:00;.; 3. ;task3;03:00;.; ;2016-01-19 7. ;33333;03:00;.; 8. ;d;03:00;.; 9. ;00;03:00;.; ;2016-01-20 21. ;task1;03:00;.; 22. ;task2;03:00;.; ;2016-01-21 25. ;testtime;03:00;.; 26. ;fgghgfh;23:45;.;, new t. 

Your second answer worked for the example above. Could you please show how to do the same for this example, so that the repeated dates are removed:

  <tr><th></th><th>2016-01-18</th><th></th><th></th><th></th></tr><tr><td>1. </td><td>task1&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td></td><td>.</td><td> </td></tr> <tr><th></th><th>2016-01-18</th><th></th><th></th><th></th></tr><tr><td>2. </td><td>task2&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td></td><td>.</td><td> </td></tr> <tr><th></th><th>2016-01-18</th><th></th><th></th><th></th></tr><tr><td>6. </td><td>task4&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td>23:33</td><td>.</td><td>, text</td></tr> <tr><th></th><th>2016-01-19</th><th></th><th></th><th></th></tr><tr><td>18. </td><td>trtt&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td>23:08</td><td>.</td><td> </td></tr> <tr><th></th><th>2016-01-19</th><th></th><th></th><th></th></tr><tr><td>19. </td><td>klkkl&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td>23:44</td><td>.</td><td>, new t.</td></tr> <tr><th></th><th>2016-01-19</th><th></th><th></th><th></th></tr><tr><td>20. </td><td>hhh 565&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td>23:59</td><td>.</td><td> </td></tr> <tr><th></th><th>2016-01-20</th><th></th><th></th><th></th></tr><tr><td>21. </td><td>task1&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td></td><td>.</td><td> </td></tr> <tr><th></th><th>2016-01-20</th><th></th><th></th><th></th></tr><tr><td>22. </td><td>task2&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td></td><td>.</td><td> </td></tr> <tr><th></th><th>2016-01-20</th><th></th><th></th><th></th></tr><tr><td>23. </td><td>task3 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td></td><td>.</td><td>, extra t.</td></tr> <tr><th></th><th>2016-01-21</th><th></th><th></th><th></th></tr><tr><td>25. </td><td>testtime&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td></td><td>.</td><td> </td></tr> <tr><th></th><th>2016-01-21</th><th></th><th></th><th></th></tr><tr><td>26. </td><td>sdas&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td></td><td>.</td><td>, qqwweerrtt</td></tr> <tr><th></th><th>2016-01-21</th><th></th><th></th><th></th></tr><tr><td>27. 
</td><td>in 3:00&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td></td><td>.</td> <td>, 12345</td></tr> 

and turn it into this:

 <tr><th></th><th>2016-01-18</th><th></th><th></th><th></th></tr><tr><td>1. </td><td>task1&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td></td><td>.</td><td> </td></tr> <tr><td>2. </td><td>task2&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td></td><td>.</td><td> </td></tr> <tr><td>6. </td><td>task4&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td>23:33</td><td>.</td><td>, text</td></tr> <tr><td>18. </td><td>trtt&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td>23:08</td><td>.</td><td> </td></tr> <tr><th></th><th>2016-01-19</th><th></th><th></th><th></th></tr><tr><td>19. </td><td>klkkl&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td>23:44</td><td>.</td><td>, new t.</td></tr> <tr><td>20. </td><td>hhh 565&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td>23:59</td><td>.</td><td> </td></tr> <tr><th></th><th>2016-01-20</th><th></th><th></th><th></th></tr><tr><td>21. </td><td>task1&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td></td><td>.</td><td> </td></tr> <tr><td>22. </td><td>task2&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td></td><td>.</td><td> </td></tr> <tr><td>23. </td><td>task3 &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td></td><td>.</td><td>, extra t.</td></tr> <tr><td>25. </td><td>testtime&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td></td><td>.</td><td> </td></tr> <tr><th></th><th>2016-01-21</th><th></th><th></th><th></th></tr><tr><td>26. </td><td>sdas&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td></td><td>.</td><td>, qqwweerrtt</td></tr> <tr><td>27. </td><td>in 3:00&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;</td><td></td><td>.</td><td>, 12345</td></tr> 

If it's not too much trouble, could you explain how your code works? I didn't quite understand it, but it worked perfectly for my first example. As far as I can tell, the problem is in the first line: if I change it, an error occurs.

    1 answer

    The solution grew out of a bug described in a related question. Because of a bug in older versions of Java, a variable-length lookbehind has to be avoided, because the regular expression

     (?<=(;\d{4}-\d\d-\d\d\n).{0,30})\1 

    causes an error.

    This regular expression solves the problem:

     /(?<=(;\d{4}-\d\d-\d\d\n))((?:[^;]++|;)*?)\1/g 

    It uses a fixed-length lookbehind so that the bug is not triggered.
    Live example on regex101

    It is easy to customize; here is its simpler form:

     /(?<=(;\d{4}-\d\d-\d\d\n))(.*?)\1/gs 

    The same, but in Java:

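     // Group 1: a date line (inside the lookbehind); group 2: the text up to the next identical date line.
     String regex = "(?<=(;\\d{4}-\\d\\d-\\d\\d\\n))((?:[^;]++|;)*?)\\1";
     // Keep only group 2, which drops the repeated date line (Pattern is java.util.regex.Pattern).
     text = Pattern.compile( regex ).matcher( text ).replaceAll( "$2" );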
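For experimenting locally, the same two statements can be wrapped in a small self-contained program. This is only a sketch: the class name `DuplicateDates`, the method name `removeDuplicateDates`, and the shortened sample input are made up for illustration, and the input is assumed to contain a newline right after every date, as the `\n` in the regular expression requires.

```java
import java.util.regex.Pattern;

class DuplicateDates {
    // Lookbehind captures a date line such as ";2016-01-18\n" as group 1.
    // Group 2 lazily collects everything up to the next identical date line (\1).
    private static final Pattern DUPLICATE_DATE = Pattern.compile(
            "(?<=(;\\d{4}-\\d\\d-\\d\\d\\n))((?:[^;]++|;)*?)\\1");

    // Replaces "content + repeated date" with just the content.
    static String removeDuplicateDates(String text) {
        return DUPLICATE_DATE.matcher(text).replaceAll("$2");
    }

    public static void main(String[] args) {
        // Shortened sample, assuming each date is followed by a line break.
        String text = ";2016-01-18\n1. ;task1;03:00;.;\n"
                + ";2016-01-18\n2. ;task2;03:00;.;\n"
                + ";2016-01-19\n7. ;33333;03:00;.;\n";
        System.out.print(removeDuplicateDates(text));
    }
}
```

Because the lookbehind may look back into text that a previous match already consumed, three or more identical dates in a row are also reduced to the first one.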

    Live example on IDEone
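For the HTML table from the question, one possible alternative that does not rely on a lookbehind is to walk over the header rows and keep a row only the first time its date appears. This is a different technique from the regular expression above, shown only as a sketch: the class name `HtmlDateHeaders` and the method name `removeRepeatedHeaders` are hypothetical, and the header row is assumed to always have exactly the shape shown in the question (an empty `<th>`, the date, then three empty `<th>` cells).

```java
import java.util.HashSet;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class HtmlDateHeaders {
    // A header row exactly as in the question; group 1 is the date.
    private static final Pattern HEADER_ROW = Pattern.compile(
            "<tr><th></th><th>(\\d{4}-\\d\\d-\\d\\d)</th>(?:<th></th>){3}</tr>");

    static String removeRepeatedHeaders(String html) {
        Set<String> seenDates = new HashSet<>();
        Matcher m = HEADER_ROW.matcher(html);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            // Set.add returns false when the date has been seen before,
            // in which case the whole header row is replaced by nothing.
            String replacement = seenDates.add(m.group(1)) ? m.group() : "";
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String header = "<tr><th></th><th>2016-01-18</th><th></th><th></th><th></th></tr>";
        String html = header + "<tr><td>1. </td><td>task1</td></tr> "
                + header + "<tr><td>2. </td><td>task2</td></tr>";
        System.out.println(removeRepeatedHeaders(html));
    }
}
```

Tracking the dates in a set keeps the logic readable and removes every later repetition of a date, at the cost of depending on the exact markup of the header row.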