I ran into a VERY strange problem and could not find the answer on Google.

The task is to download an HTML page from an Android application and then work with it. Everything seems fine, except that the link contains Russian characters, and because of that an empty HTML page is downloaded.

I take the Russian part of the link from an EditText.
The link is built.
A blank page is downloaded.

I read about this problem; everyone says to encode the Russian part as UTF-8.
I encoded it.
A blank page was still downloaded.

To check the link itself, I put the link text into the same EditText.
I copy it.
I open it in the browser on the smartphone, and the link opens correctly!

But on the computer, in the Chrome browser, it opens an empty page again. I tried encoding not only the Russian part but the entire link; the problem remained.

I have re-read all the forums; everyone writes about UTF-8, but it does not work. What should I do?

The method that extracts the Russian part:

private String getGroup() {
    EditText textEdit = (EditText) getView().findViewById(R.id.groupEditText);
    String group = textEdit.getText().toString();
    return group;
}

The method that builds the link:

private String getUrlString() {
    String firstURL = getString(R.string.url_first);
    String group = getGroup();
    try {
        URLEncoder.encode(group, "utf-8");
    } catch (UnsupportedEncodingException e) {
        AlertDialog.Builder builder = new AlertDialog.Builder(getActivity());
        builder.setTitle(R.string.dialog_error);
        builder.setMessage(R.string.dialog_group_message);
        builder.setCancelable(true);
        builder.setPositiveButton(android.R.string.ok, new DialogInterface.OnClickListener() {
            @Override
            public void onClick(DialogInterface dialog, int which) {
                dialog.dismiss();
            }
        });
        AlertDialog dialog = builder.create();
        dialog.show();
    }
    String secondURL = getString(R.string.url_second);
    String finaURL = firstURL + group + secondURL;
    return finaURL;
}

And, just in case, the connection code:

URL url = new URL(timeTableURL);
URLConnection connection = url.openConnection();
BufferedReader bufReader = new BufferedReader(new InputStreamReader(connection.getInputStream()));
String inputLine;
createDir(getString(R.string.directory));
String fileName = getString(R.string.directory) + "/TimeTable" + timeTableGroup + ".html";
File file = new File(fileName);
if (!file.exists()) {
    file.createNewFile();
}
FileWriter fileWriter = new FileWriter(file.getAbsoluteFile());
BufferedWriter bufWriter = new BufferedWriter(fileWriter);
while ((inputLine = bufReader.readLine()) != null) {
    bufWriter.write(inputLine);
}
bufWriter.close();
bufReader.close();

    3 answers

    This line does nothing, because the result of the call is discarded:

     URLEncoder.encode(group, "utf-8"); 

    It should be

     group = URLEncoder.encode(group, "utf-8"); 

    If group is a GET parameter, this should be enough.

    • Thank you very much! Once again, I am convinced that it is better to first read the documentation :) - Dima Gurov
    • The rule here is simple. In Java, all strings are immutable, i.e. they cannot be changed. A new String object is always created when you try to change the content. Therefore, all methods that manipulate strings return a new object as the result. - Eugene Krivenja
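
    As a minimal sketch of the fix described in this answer: the encoded value is taken from the return value of URLEncoder.encode and only then concatenated into the link. The class name GroupUrlBuilder and the prefix/suffix strings are assumptions for illustration (they stand in for R.string.url_first and R.string.url_second, whose real values are not shown in the question); the sample URL is the one quoted elsewhere in this thread.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class GroupUrlBuilder {
    // prefix and suffix are hypothetical stand-ins for the string resources
    // url_first and url_second used in the question.
    static String buildUrl(String prefix, String group, String suffix) {
        try {
            // encode() returns a NEW String; the argument itself is never modified.
            String encodedGroup = URLEncoder.encode(group, "UTF-8");
            return prefix + encodedGroup + suffix;
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException("UTF-8 is not supported", e);
        }
    }

    public static void main(String[] args) {
        // "\u0430" is the Cyrillic letter "а", so the group is "а03-23".
        System.out.println(buildUrl(
                "https://eisgateway.mephi.ru/TimeTable/timetableshow.aspx?gr=",
                "\u043003-23",
                "&prep=&typ=gr"));
    }
}
```

    Only the parameter value goes through the encoder; the scheme, host, "?" and "&" are left untouched.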

    You can try converting the URL to IDN.

    Here are a couple of examples.

      val s = "test string с кирилицей"
      Log.d("log", "s в IDN ${IDN.toASCII(s)}")

    in the logs:

     s в IDN xn--test string -83k6baatow7hsa6j 

    and with a random URL in the .рф zone:

      val ss = "минобрнауки.рф"
      Log.d("log", "ss в IDN ${IDN.toASCII(ss)}")

    in the logs:

     ss в IDN xn--80abucjiibhv9a.xn--p1ai 

    so the resulting URL is http://xn--80abucjiibhv9a.xn--p1ai.
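
    IDN is a scheme for domain names, so one way to read the examples above is that only the host part of a link should go through IDN.toASCII, while the path and query are handled separately. The following is a rough sketch under that assumption; the class name IdnHostDemo and the method buildUrl are hypothetical, and the path/query argument is assumed to be already percent-encoded.

```java
import java.net.IDN;

class IdnHostDemo {
    // Converts only the host to its ASCII (Punycode) form and leaves the
    // scheme and the already-encoded path/query untouched.
    static String buildUrl(String scheme, String unicodeHost, String pathAndQuery) {
        String asciiHost = IDN.toASCII(unicodeHost);
        return scheme + "://" + asciiHost + pathAndQuery;
    }

    public static void main(String[] args) {
        // "\u0440\u0444" is the Cyrillic top-level domain "рф".
        System.out.println(buildUrl("http", "\u0440\u0444", "/"));
        // A host that is already ASCII is returned unchanged.
        System.out.println(buildUrl("https", "eisgateway.mephi.ru", "/TimeTable/"));
    }
}
```

    For a link whose host is already ASCII, this conversion changes nothing, so it does not replace encoding of the query parameter.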

    As it turned out, if you use Jsoup, everything is much simpler. On a separate thread, you can easily fetch the source of the page:

     Jsoup.connect("https://eisgateway.mephi.ru/TimeTable/timetableshow.aspx?gr=%D0%B003-23&prep=&typ=gr").get() 
    • I just tried converting the Russian part to IDN on its own and the entire link on its own. When I convert the entire link, the URL connection refuses to work with it (the application crashes), and when I convert only the Russian part, the link is again incorrect. What can I do about that? - Dima Gurov
    • Here is the actual link, just in case: eisgateway.mephi.ru/TimeTable/… where who knows what begins after "gr=...", and the part "%D0%B003-23" should in theory be "а03-23" (with a Cyrillic "а") - Dima Gurov
    • I added how this can be done to the answer. - Kota1921

    You need to encode the request string, like this:

     url = URLEncoder.encode(url, "utf-8"); 
    • URLEncoder is designed to encode request parameters only. It cannot be used to encode the entire request link. - Eugene Krivenja
    • I agree, I wrote garbage) - Android Android
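
    To illustrate the comment above, here is a small sketch of what happens when a whole link is passed through URLEncoder: the separators ":", "/", "?" and "=" are percent-encoded too, so the result is no longer a usable URL. The class name WholeUrlEncodingDemo and the example.com address are made up for the demonstration.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLEncoder;

class WholeUrlEncodingDemo {
    // Thin wrapper around URLEncoder.encode that hides the checked exception.
    static String encode(String text) {
        try {
            return URLEncoder.encode(text, "UTF-8");
        } catch (UnsupportedEncodingException e) {
            // Every Java platform is required to support UTF-8.
            throw new IllegalStateException("UTF-8 is not supported", e);
        }
    }

    public static void main(String[] args) {
        // Encoding the entire link also escapes the URL syntax characters.
        System.out.println(encode("http://example.com/show?gr=a"));
        // prints: http%3A%2F%2Fexample.com%2Fshow%3Fgr%3Da
    }
}
```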