The text is stored in a variable and is encoded in UTF-8, KOI-8, or windows-1251.
I need functions that can convert between these encodings.
A long time ago I copy-pasted (or wrote) this function, and I still use it for transcoding:
    import java.io.*;

    public static void convert(
            String infile,   // input file name; if null, reads from console/stdin
            String outfile,  // output file name; if null, writes to console/stdout
            String from,     // encoding of input file (e.g. UTF-8, windows-1251)
            String to)       // encoding of output file (e.g. UTF-8, windows-1251)
            throws IOException, UnsupportedEncodingException {
        // Set up byte streams.
        InputStream in;
        if (infile != null) in = new FileInputStream(infile);
        else in = System.in;
        OutputStream out;
        if (outfile != null) out = new FileOutputStream(outfile);
        else out = System.out;

        // Use the default encoding if no encoding is specified.
        if (from == null) from = System.getProperty("file.encoding");
        if (to == null) to = System.getProperty("file.encoding");

        // Set up character streams.
        Reader r = new BufferedReader(new InputStreamReader(in, from));
        Writer w = new BufferedWriter(new OutputStreamWriter(out, to));

        // Copy characters from input to output. The InputStreamReader
        // converts from the input encoding to Unicode, and the OutputStreamWriter
        // converts from Unicode to the output encoding. Characters that cannot be
        // represented in the output encoding are output as '?'.
        char[] buffer = new char[4096];
        int len;
        while ((len = r.read(buffer)) != -1)
            w.write(buffer, 0, len);
        r.close();
        w.flush();
        w.close();
    }
It works like clockwork: it successfully transcodes even from Chinese encodings to UTF-8. I think it will be easy to adapt for converting strings.
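One possible adaptation for in-memory data rather than files is sketched below. The class name Transcoder and the method transcode are made up for illustration and are not part of the function above; the sketch also assumes the JVM ships the named charsets (windows-1251 and KOI8-R are present in a full JDK).

```java
import java.nio.charset.Charset;

// Hypothetical helper, not from the original answer.
class Transcoder {
    // Re-encodes raw bytes from one charset to another.
    // Charset.forName throws if the charset name is unknown to this JVM.
    static byte[] transcode(byte[] data, String from, String to) {
        // Bytes in the source encoding -> Unicode string.
        String text = new String(data, Charset.forName(from));
        // Unicode string -> bytes in the target encoding.
        return text.getBytes(Charset.forName(to));
    }
}
```

As with the stream version, characters that have no representation in the target encoding are replaced with the charset's default replacement byte (usually '?'), so the conversion can be lossy.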
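    Charset cset = Charset.forName("UTF-8");
    ByteBuffer buf = cset.encode(strOld);
    // Copy only the encoded bytes; buf.array() may return a backing
    // array that is longer than the encoded content.
    byte[] b = new byte[buf.remaining()];
    buf.get(b);
    // Note: new String(b) decodes with the platform default charset.
    String str = new String(b);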
You probably need to look at the EncodingUtils class - I think it has all the necessary methods and functions.
Here is an example of converting from 1251 to 1252:
    var dt = new Java.Lang.String("some_string_in_1251");
    Java.Lang.String win1252String = new Java.Lang.String(dt.GetBytes("windows-1252"), "windows-1251");
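The same byte-reinterpretation idea in plain Java might look like the sketch below. It assumes the string was produced by decoding windows-1251 bytes as windows-1252 by mistake; the class name MojibakeRepair and the method reinterpret are hypothetical names chosen for this example.

```java
import java.nio.charset.Charset;

// Hypothetical helper, not from the original answer.
class MojibakeRepair {
    // Recovers text that was decoded with the wrong charset:
    // re-encode with the charset that was wrongly used, then
    // decode the resulting bytes with the charset that was actually meant.
    static String reinterpret(String garbled, String wrongCharset, String actualCharset) {
        byte[] raw = garbled.getBytes(Charset.forName(wrongCharset));
        return new String(raw, Charset.forName(actualCharset));
    }
}
```

This only works when the wrong decoding step was lossless. Bytes that the wrongly used charset does not define cannot be recovered this way.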
Source: https://ru.stackoverflow.com/questions/130834/