I am downloading a file; here is a piece of the function:

    while ((bytesRead = is.read(data)) > 0) {
        try {
            sb.append(new String(data, 0, bytesRead, charset));
        } catch (UnsupportedEncodingException e) {
            Log.e(MY_LOG, "Invalid charset: " + e);
            // Append without charset (uses system's default charset)
            sb.append(new String(data, 0, bytesRead));
        }
    }

And I always end up in the catch (UnsupportedEncodingException e) block. But the file is saved and everything is fine ...

What does this charset mean? I understand this is the encoding ...

But how bad is it that I am writing the file without it?

This is how I get the encoding:

    String[] values = conn.getContentType().split(";");
    String charset = "";
    for (String value : values) {
        value = value.trim();
        if (value.toLowerCase().startsWith("charset=")) {
            charset = value.substring("charset=".length());
            break;
        }
    }

I understand that if the server sent me a file with an encoding, then I read it and write the file with the same encoding ...

  • What is in charset before the call sb.append(new String(data, 0, bytesRead, charset)); ? - Vladyslav Matviienko
  • @metalurgus Nothing, just the default value String charset = ""; As I understand it, that is because the server does not send a charset ... Could that be? - Aleksey Timoshchenko
  • Well then, look at what conn.getContentType() returns. You know better whether your server might not send the charset. - Vladyslav Matviienko
  • @metalurgus In general, the String[] values array contains only a single element, with the value application/text ... - Aleksey Timoshchenko
  • @metalurgus So is it normal that the charset is missing? Does it affect anything? - Aleksey Timoshchenko

2 answers

What does this charset mean? I understand this is the encoding ...

You understood correctly: the charset determines how a byte sequence is decoded into characters. In general, decoding the same byte array with different encodings gives you different strings. Here, by the way, is an example of a problem when using the wrong encoding.
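To illustrate the point about one byte array producing different strings, here is a minimal sketch. The class name DecodeDemo and its helper methods are made up for this illustration; only the standard java.nio.charset API is used.

```java
import java.nio.charset.StandardCharsets;

public class DecodeDemo {
    // Hypothetical helper: decode the bytes as UTF-8.
    static String asUtf8(byte[] bytes) {
        return new String(bytes, StandardCharsets.UTF_8);
    }

    // Hypothetical helper: decode the very same bytes as ISO-8859-1.
    static String asLatin1(byte[] bytes) {
        return new String(bytes, StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        // The letter "é" (U+00E9) encoded as UTF-8 is the two bytes 0xC3 0xA9.
        byte[] bytes = {(byte) 0xC3, (byte) 0xA9};

        System.out.println(asUtf8(bytes).length());   // 1 character: U+00E9
        System.out.println(asLatin1(bytes).length()); // 2 characters: U+00C3 U+00A9
    }
}
```

The bytes never change; only the charset passed to the String constructor decides which characters come out.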

But how bad is it that I am writing the file without it?

About as bad as ending up with text that consists entirely of ? characters, for example. That is the symbol some encodings substitute for every character they cannot represent.
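The ? substitution is easy to reproduce. The sketch below is only an illustration (ReplacementDemo and roundTripAscii are invented names); it relies on the documented behavior of String.getBytes(Charset), which replaces unmappable characters with the charset's default replacement bytes.

```java
import java.nio.charset.StandardCharsets;

public class ReplacementDemo {
    // Hypothetical helper: encode text as US-ASCII, then decode it back.
    // Characters that US-ASCII cannot represent are replaced during encoding.
    static String roundTripAscii(String text) {
        byte[] ascii = text.getBytes(StandardCharsets.US_ASCII);
        return new String(ascii, StandardCharsets.US_ASCII);
    }

    public static void main(String[] args) {
        // Six Cyrillic letters followed by plain ASCII text.
        String original = "\u041F\u0440\u0438\u0432\u0435\u0442 abc";
        // The Cyrillic letters are lost; the ASCII part survives.
        System.out.println(roundTripAscii(original));
    }
}
```

Once the characters have been replaced this way, the original text cannot be recovered from the resulting bytes.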

But the file is saved and everything is fine ...

Most likely, your system uses windows-1251, which covers both Russian and English characters, so your file is decoded correctly. But that is more of a coincidence. Your code is unlikely to work for a file with Chinese characters.

One more thing: you can get the default charset of your JVM with

 Charset.defaultCharset() 
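One possible way to use this, sketched under the assumption that an empty or unknown charset name should fall back to the JVM default instead of ending up in the catch block. The class CharsetFallback and the method resolve are hypothetical names, not part of the code in the question.

```java
import java.nio.charset.Charset;

public class CharsetFallback {
    // Hypothetical helper: return the named charset if the JVM supports it,
    // otherwise fall back to the JVM's default charset.
    static Charset resolve(String name) {
        try {
            if (name != null && !name.isEmpty() && Charset.isSupported(name)) {
                return Charset.forName(name);
            }
        } catch (IllegalArgumentException e) {
            // Thrown (as IllegalCharsetNameException) for syntactically
            // invalid names; treated here the same as "not supported".
        }
        return Charset.defaultCharset();
    }

    public static void main(String[] args) {
        System.out.println(resolve("UTF-8")); // the UTF-8 charset
        System.out.println(resolve(""));      // whatever the JVM default is
    }
}
```

The resulting Charset can be passed to new String(data, 0, bytesRead, charset); that constructor overload takes a Charset object and does not throw UnsupportedEncodingException.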

    Set the default value to charset = "UTF-8"; if the server passes this parameter, the variable charset is set to