    public static void main(String[] args) throws FileNotFoundException, IOException {
        String file, FileR, line, result1;
        Scanner in = new Scanner(System.in);
        System.out.println("Введите имя файла, из которого считывать строки:"); // "Enter the name of the file to read lines from:"
        file = in.nextLine();
        Scanner out = new Scanner(System.in);
        System.out.println("Куда сохранить результат?"); // "Where should the result be saved?"
        FileR = out.nextLine();

        // ask for the encodings
        Scanner inEncoding = new Scanner(System.in, "UTF-8");
        System.out.println("Введите название кодировки входного файла:"); // "Enter the encoding of the input file:"
        String encodingStr = inEncoding.nextLine();
        System.out.println("Введите название кодировки выходного файла:"); // "Enter the encoding of the output file:"
        String encodingStrout = inEncoding.nextLine();

        // read from the file
        InputStream inFile = new FileInputStream(file);
        byte[] str = new byte[inFile.available()];
        inFile.read(str);
        if (encodingStr.compareToIgnoreCase("windows-1251") == 0) {
            line = new String(str, "windows-1251");
        } else if (encodingStr.compareToIgnoreCase("Unicode") == 0) {
            line = new String(str, "UTF-16");
        } else if (encodingStr.compareToIgnoreCase("UTF-8") == 0) {
            line = new String(str, "UTF-8");
        } else {
            return;
        }

        // the condition from the assignment
        Pattern p = Pattern.compile("[\\p{L}]*", Pattern.UNICODE_CHARACTER_CLASS);
        boolean matches = p.matcher(line).matches();
        if (matches) {
            // "The text contains no characters other than letters and spaces"
            result1 = "Текст не содержит символы, отличные от букв и пробела";
        } else {
            // "The text contains characters other than letters and spaces"
            result1 = "Текст содержит символы, отличные от букв и пробела";
        }

        OutputStream outFile = new FileOutputStream(FileR);
        if (encodingStrout.compareToIgnoreCase("windows-1251") == 0) {
            byte[] result = result1.getBytes("windows-1251");
            outFile.write(result);
        } else if (encodingStrout.compareToIgnoreCase("Unicode") == 0) {
            byte[] result = result1.getBytes("UTF-16");
            outFile.write(result);
        } else if (encodingStrout.compareToIgnoreCase("UTF-8") == 0) {
            byte[] result = result1.getBytes("UTF-8");
            outFile.write(result);
        }
    }
}

- The lines must be read from a specified file. The result is written to another specified file. The file names are taken from the command line. Provide error handling, as well as the ability to set the encoding of the input and output text. - Madina
- It outputs the result fine and saves it in the encoding I entered. But whichever encoding I set, it always says that there are characters other than letters and spaces, although the file contains only letters. I cannot figure out where the error is: in the transcoding, in the reading, or in the regular expression? - Madina
- Do not ask questions that consist of nothing but code. Describe the essence of your question so that it can be answered clearly and unambiguously. Try to solve the problem yourself, and if you run into difficulties, ask about those difficulties rather than posting your homework assignment in the comments. - m. vokhm
- First of all, I did not just post my homework in the comments. I added the full task statement to make it clear why the recoding is needed and why I read from a file and write to a file instead of simply entering text from the console. I did try to do it myself, and in the second comment I described the problem: the program outputs the result fine, but it wrongly reports that the text contains characters other than letters and spaces. I cannot figure out what causes this: incorrect reading of the lines from the file, the encoding, or the regular expression. - Madina
1 answer
There is no need to create a new Scanner instance for each question. One instance is enough for all of them.
Used resources (streams, scanner) must be closed.
To compare the entered string with one of the possible options, you can use a switch instead of an if-else chain (switch on strings is available starting with Java 7).
There is no error handling at all, not even the most primitive.
There is no space in your regular expression. Your pattern checks if the file contains only letters; if there are spaces or line breaks in the file, the pattern does not match.
Even if the pattern is corrected to take into account the spaces, there will be another problem: there may be a BOM in the file, which is also not a letter. This must also be taken into account.
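The last two points can be checked in isolation with a small sketch. It is only an illustration, not part of the original answer: the class name LettersOnlyCheck and the helpers stripBom and isLettersAndSpaces are hypothetical, and the sample text is built in memory instead of being read from a file. It assumes that the decoded string may begin with U+FEFF, which is what Java's UTF-8 decoder leaves in place when the file starts with a BOM.

```java
import java.nio.charset.StandardCharsets;
import java.util.regex.Pattern;

public class LettersOnlyCheck {
    // Letters of any alphabet plus any whitespace (spaces, tabs, line breaks).
    private static final Pattern LETTERS_AND_SPACES =
            Pattern.compile("[\\p{L}\\s]*", Pattern.UNICODE_CHARACTER_CLASS);

    // Hypothetical helper: drops a leading U+FEFF (the decoded BOM), if present.
    public static String stripBom(String text) {
        return text.startsWith("\uFEFF") ? text.substring(1) : text;
    }

    // Hypothetical helper: true if the text (ignoring a leading BOM)
    // consists only of letters and whitespace.
    public static boolean isLettersAndSpaces(String text) {
        return LETTERS_AND_SPACES.matcher(stripBom(text)).matches();
    }

    public static void main(String[] args) {
        // Two Cyrillic words separated by a space, followed by a line break.
        String words = "\u043f\u0440\u0438\u0432\u0435\u0442 \u043c\u0438\u0440\n";
        byte[] body = words.getBytes(StandardCharsets.UTF_8);

        // Simulate a UTF-8 file that starts with a BOM (bytes EF BB BF).
        byte[] bytes = new byte[body.length + 3];
        bytes[0] = (byte) 0xEF;
        bytes[1] = (byte) 0xBB;
        bytes[2] = (byte) 0xBF;
        System.arraycopy(body, 0, bytes, 3, body.length);

        String text = new String(bytes, StandardCharsets.UTF_8);
        System.out.println(isLettersAndSpaces(text));       // true
        System.out.println(isLettersAndSpaces("abc, def")); // false: the comma is not a letter

        // The pattern from the question has no whitespace in the class,
        // so it rejects any text that contains a space or a line break.
        Pattern lettersOnly = Pattern.compile("[\\p{L}]*", Pattern.UNICODE_CHARACTER_CLASS);
        System.out.println(lettersOnly.matcher("abc def").matches()); // false
    }
}
```

Checking the decoded string rather than the raw bytes means the same test applies whichever encoding the file was read with.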
So, as a first approximation, we get something like this:
    import java.io.FileInputStream;
    import java.io.FileOutputStream;
    import java.io.IOException;
    import java.io.InputStream;
    import java.io.OutputStream;
    import java.util.Scanner;
    import java.util.regex.Pattern;

    public class So_600623_FileEcodings {

        static String fixEncodingName(String encName) {
            switch (encName.toLowerCase()) {
                case "windows-1251": return "windows-1251";
                case "utf-16": return "UTF-16";
                case "utf-8": return "UTF-8";
                default:
                    System.out.println("Недопустимая кодировка: " + encName); // "Unsupported encoding: "
                    System.exit(1);
                    return null; // The compiler believes that one day
                                 // System.exit() may return control :)
            }
        }

        public static void main(String[] args) throws IOException {
            InputStream inpStream = null;
            OutputStream outStream = null;
            Scanner in = null;
            try {
                in = new Scanner(System.in);
                String inpFileName, outFileName, dataStr, resultStr;
                System.out.println("Введите имя файла, из которого считывать строки:"); // "Enter the name of the file to read lines from:"
                inpFileName = in.nextLine();
                System.out.println("Куда сохранить результат?"); // "Where should the result be saved?"
                outFileName = in.nextLine();

                // ask for the encodings
                System.out.println("Введите название кодировки входного файла:"); // "Enter the encoding of the input file:"
                String inpEncName = fixEncodingName(in.nextLine());
                System.out.println("Введите название кодировки выходного файла:"); // "Enter the encoding of the output file:"
                String outEncName = fixEncodingName(in.nextLine());

                // read from the file
                inpStream = new FileInputStream(inpFileName);
                byte[] bytes = new byte[inpStream.available()];
                inpStream.read(bytes);
                dataStr = new String(bytes, inpEncName);

                // The text may start with a BOM. Q&D workaround: Java's UTF-8 decoder keeps
                // it as U+FEFF, while the UTF-16 decoder consumes it, so check the decoded
                // string rather than the raw bytes to avoid dropping a real first letter.
                if (dataStr.startsWith("\uFEFF"))
                    dataStr = dataStr.substring(1);

                // the condition from the assignment
                // Pattern p = Pattern.compile("[\\p{L}]*", Pattern.UNICODE_CHARACTER_CLASS); // Where are the spaces?
                Pattern p = Pattern.compile("[\\p{L}\\s]*", Pattern.UNICODE_CHARACTER_CLASS);
                resultStr = p.matcher(dataStr).matches()
                        ? "Текст не содержит символов, отличных от букв и пробела"  // "...contains no characters other than letters and spaces"
                        : "Текст содержит символы, отличные от букв и пробела";     // "...contains characters other than letters and spaces"

                outStream = new FileOutputStream(outFileName);
                byte[] result = resultStr.getBytes(outEncName);
                outStream.write(result);
            } catch (IOException x) {
                System.out.println("Ошибка ввода-вывода:"); // "I/O error:"
                x.printStackTrace();
            } finally {
                // Open files must be closed
                if (outStream != null)
                    try {
                        outStream.close();
                    } catch (Exception x) {
                        System.out.println("Ошибка закрытия выходного файла:"); // "Error closing the output file:"
                        x.printStackTrace();
                    }
                if (inpStream != null)
                    try { inpStream.close(); } catch (Exception x) {}
                if (in != null)
                    try { in.close(); } catch (Exception x) {} // The scanner too
            }
        }
    }

- Thank you. I still have a couple of questions, but you have helped so much that I will dig through the literature myself. Special thanks for the link to BOM; I will take it into account from now on :) - Madina