The task is to save all the links found at a given URL to a file. While reading through a BufferedReader, a NullPointerException is thrown on the line if (buffer.indexOf("<a ") > 0) when I use a do-while loop, but with a for loop everything works, although in theory both loops do the same thing. Can you tell me why? I have previously used both kinds of loop without problems.

Crashes:

    public static void getLinks(String link, String file) {
        StringBuilder sb = new StringBuilder();
        try {
            URL url = new URL(link);
            HttpURLConnection connect = (HttpURLConnection) url.openConnection();
            BufferedReader input = new BufferedReader(new InputStreamReader(connect.getInputStream()));
            String buffer = "";
            String text = "";
            do {
                buffer = input.readLine();
                if (buffer.indexOf("<a ") > 0) { // NullPointerException
                    if (buffer.indexOf("</a") > 0) {
                        sb.append(buffer.substring(buffer.indexOf("<a "), buffer.indexOf("</a") + 4))
                                .append(System.lineSeparator());
                        text = "";
                    } else {
                        text = buffer.substring(buffer.indexOf("<a "));
                    }
                } else if (text.length() > 0) {
                    if (buffer.indexOf("</a") > 0) {
                        sb.append(text).append(text + buffer.substring(1, buffer.indexOf("</a") + 4))
                                .append(System.lineSeparator());
                        text = "";
                    } else {
                        text += buffer;
                    }
                }
            } while (buffer != null);
        } catch (IOException e) {
            System.out.println(e);
        }
        try (ObjectOutputStream links = new ObjectOutputStream(new FileOutputStream(file))) {
            links.writeObject(sb.toString());
            System.out.println("File " + file + " was saved!");
        } catch (IOException e) {
            System.out.println("Error save file!");
        }
    }

Works:

    public static void getLinks(String link, String file) {
        StringBuilder sb = new StringBuilder();
        try {
            URL url = new URL(link);
            HttpURLConnection connect = (HttpURLConnection) url.openConnection();
            BufferedReader input = new BufferedReader(new InputStreamReader(connect.getInputStream()));
            String buffer = "";
            String text = "";
            for (; (buffer = input.readLine()) != null;) { // works
                if (buffer.indexOf("<a ") > 0) {
                    if (buffer.indexOf("</a") > 0) {
                        sb.append(buffer.substring(buffer.indexOf("<a "), buffer.indexOf("</a") + 4))
                                .append(System.lineSeparator());
                        text = "";
                    } else {
                        text = buffer.substring(buffer.indexOf("<a "));
                    }
                } else if (text.length() > 0) {
                    if (buffer.indexOf("</a") > 0) {
                        sb.append(text).append(text + buffer.substring(1, buffer.indexOf("</a") + 4))
                                .append(System.lineSeparator());
                        text = "";
                    } else {
                        text += buffer;
                    }
                }
            }
        } catch (IOException e) {
            System.out.println(e);
        }
        try (ObjectOutputStream links = new ObjectOutputStream(new FileOutputStream(file))) {
            links.writeObject(sb.toString());
            System.out.println("File " + file + " was saved!");
        } catch (IOException e) {
            System.out.println("Error save file!");
        }
    }

  • Does execution enter the loop in the second case? - temq
  • Same as in the first case: buffer = input.readLine(), only it is built into the condition of the for (; (buffer = input.readLine()) != null;) loop - Sergey
  • I understand that the check is built in, but does it enter the loop, or does it exit immediately without a single iteration? - temq
  • @temq and why does that matter in the context of the question? In any case, do-while is inappropriate here and requires an additional null check in the body of the loop. - Regent
  • @Regent that much is clear, I just wanted to nudge the asker toward the idea that his code works, but not as intended, because the loop body is obviously expected to execute at least once rather than exit immediately. - temq

2 answers

Your do-while does not work the same way as the for: in the for loop, the check that the freshly read buffer is not null happens before each iteration. In the do-while, the check happens only after buffer has already been used, so it is effectively meaningless: if buffer is null, that becomes apparent on the line if (buffer.indexOf("<a ") > 0), where a NullPointerException is thrown, and execution never reaches the while.

It makes sense to replace the do-while with a while :

 while ((buffer = input.readLine()) != null) 
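
To reproduce the difference without a network connection, here is a minimal sketch that feeds both loop shapes from a StringReader. The class name ReadLoopDemo, the helper methods and the sample text are made up for illustration, and the condition is simplified to counting lines that contain "<a ":

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.StringReader;

class ReadLoopDemo {

    // Same shape as the do-while in the question: buffer is used before it is checked.
    static int countWithDoWhile(BufferedReader input) throws IOException {
        int count = 0;
        String buffer;
        do {
            buffer = input.readLine();
            // At end of input readLine() returns null, so this call throws NullPointerException.
            if (buffer.indexOf("<a ") >= 0) {
                count++;
            }
        } while (buffer != null);
        return count;
    }

    // The null check runs before the body, so the body never sees null.
    static int countWithWhile(BufferedReader input) throws IOException {
        int count = 0;
        String buffer;
        while ((buffer = input.readLine()) != null) {
            if (buffer.indexOf("<a ") >= 0) {
                count++;
            }
        }
        return count;
    }

    // Hypothetical two-line input standing in for the downloaded page.
    static BufferedReader sample() {
        return new BufferedReader(new StringReader("<p>text</p>\n<a href=\"x\">x</a>\n"));
    }

    public static void main(String[] args) throws IOException {
        System.out.println("while counted: " + countWithWhile(sample()));
        try {
            countWithDoWhile(sample());
        } catch (NullPointerException e) {
            System.out.println("do-while failed at end of input");
        }
    }
}
```

The do-while version processes both lines correctly and only fails on the third readLine() call, which is why the exception appears at the very end of the page rather than at the start.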

With Java 8, you can iterate over all the lines with for-each.

For example:

    Stream<String> stream = input.lines();
    for (String buffer : (Iterable<String>) stream::iterator) {
        ...
    }

Or at least:

    for (String buffer : input.lines().collect(Collectors.toList())) {
        ...
    }
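
As a self-contained illustration of BufferedReader.lines(), the sketch below keeps only the lines that contain an opening anchor tag. The class name LinesDemo, the method linesWithAnchors and the sample HTML are assumptions made for the example; it does not reproduce the multi-line tag handling from the question:

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.List;
import java.util.stream.Collectors;

class LinesDemo {

    // Returns every line that contains an opening anchor tag.
    // lines() ends the stream at end of input, so no null check is needed.
    static List<String> linesWithAnchors(BufferedReader input) {
        return input.lines()
                .filter(line -> line.contains("<a "))
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Hypothetical page content used instead of a real HTTP connection.
        String html = "<p>intro</p>\n<a href=\"one\">1</a>\n<div><a href=\"two\">2</a></div>\n";
        BufferedReader input = new BufferedReader(new StringReader(html));
        linesWithAnchors(input).forEach(System.out::println);
    }
}
```

Note that an I/O error during lines() surfaces as an UncheckedIOException rather than an IOException, so the catch block from the question would need adjusting.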
  • Thanks. I understand that while (buffer != null) { buffer = input.readLine(); is not suitable either. Is it then possible to somehow move the operation buffer = input.readLine() out of the while () condition, i.e. into the loop body? - Sergey
  • @Sergey yes, that is not suitable either. You can move it into the loop body, but why do that? - Regent
  • I want to understand the principle of how this can be done and draw analogies with how I did it before, because for some reason I never had this problem with do-while. It may turn out that I have been using it incorrectly all along. - Sergey
  • @Sergey it can be done this way. But, in my opinion, that code looks much worse than the version in which the loop's exit condition is in the while. - Regent
  • Thanks for the detailed answer, it helped a lot! Yes, the code looks better with the condition inside while. I needed to put the puzzle together in my head. - Sergey

Reading a file line by line involves reading a line and checking it on each iteration:

    while (true) {
        buffer = input.readLine();
        if (buffer == null) break;
        // ...
    }

Or:

    do {
        buffer = input.readLine();
        // buffer may be null here, so this check is mandatory:
        if (buffer == null) break;
        // ...
    } while (buffer != null); // Because of the check above, this check is pointless: the expression is always true

As @Regent correctly noted, this code can also be written as:

 while ((buffer = input.readLine()) != null) {} 

Although in reality your for loop is an exact counterpart of it: for (; cond;) is equivalent to while (cond).

However, many style guides explicitly prohibit using the assignment operator in conditions.

If the reading of the first line is considered to be the initialization of the loop, then we can write this code as:

    for (buffer = input.readLine(); buffer != null; buffer = input.readLine()) {
        // ...
    }

However, this obviously duplicates code.


In Java, when we need to iterate over the elements of some collection (for example, the lines of a file), we define an iterator.
Let's try:

    import java.io.BufferedReader;
    import java.io.IOException;
    import java.util.Iterator;

    public class BufferedReaderIterator implements Iterable<String> {
        private BufferedReader input;

        public BufferedReaderIterator(BufferedReader input) {
            this.input = input;
        }

        @Override
        public Iterator<String> iterator() {
            return new Iterator<String>() {
                @Override
                public boolean hasNext() {
                    try {
                        // Peek one character ahead to detect end of input, then rewind.
                        input.mark(1);
                        if (input.read() < 0) return false;
                        input.reset();
                        return true;
                    } catch (IOException e) {
                        return false;
                    }
                }

                @Override
                public String next() {
                    try {
                        return input.readLine();
                    } catch (IOException e) {
                        return null;
                    }
                }

                @Override
                public void remove() {
                    throw new UnsupportedOperationException();
                }
            };
        }
    }

And this is how the original loop looks now:

    for (String buffer : new BufferedReaderIterator(input)) {
        // ...
    }

We got rid of both duplication and assignment.


Alternatively, you can use a ready-made solution, for example org.apache.commons.io.LineIterator.
Admittedly, with it you will have to handle exceptions.

  • It is a pity that a compact for costs a whole extra class. But if you don't use Java 8 and solutions like the one your answer inspired me to write, you probably can't make it any shorter - Regent
  • @Regent, it is important to understand that the solution with .lines().collect, although it looks similar, is fundamentally different. It reads the entire file before the first iteration, which is perfectly acceptable for small files. - vp_arth
  • Of course. By analogy, I meant using for-each. Correct the description in the answer to avoid misunderstandings. - Regent
  • Thanks for participating. An iterator, of course, is unlikely to be used for such a task. But you have covered, as they say, all the possible options here. - Sergey
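
The difference between iterating via stream::iterator and via .lines().collect, raised in the comments above, can be made visible with a small sketch. Everything here is illustrative: the class name LazyVersusEagerDemo and the two-line input are made up, and peek merely records the moment at which each line passes through the stream pipeline:

```java
import java.io.BufferedReader;
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

class LazyVersusEagerDemo {

    // Hypothetical two-line input standing in for a file.
    static BufferedReader sample() {
        return new BufferedReader(new StringReader("a\nb\n"));
    }

    // for-each over stream::iterator: each line is pulled just before it is used.
    static List<String> lazyEvents() {
        List<String> events = new ArrayList<>();
        Stream<String> stream = sample().lines().peek(line -> events.add("read " + line));
        for (String line : (Iterable<String>) stream::iterator) {
            events.add("use " + line);
        }
        return events;
    }

    // collect(...) first: every line is pulled before the loop body runs once.
    static List<String> eagerEvents() {
        List<String> events = new ArrayList<>();
        List<String> all = sample().lines()
                .peek(line -> events.add("read " + line))
                .collect(Collectors.toList());
        for (String line : all) {
            events.add("use " + line);
        }
        return events;
    }

    public static void main(String[] args) {
        System.out.println(lazyEvents());  // [read a, use a, read b, use b]
        System.out.println(eagerEvents()); // [read a, read b, use a, use b]
    }
}
```

For small inputs the two are interchangeable; the collect variant simply holds all lines in memory at once.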