On the Silpo promotions page, some products have no old price. As a result, my code does not work correctly: it assigns the Kiwi product the old price of the next product, the curd.

    private ArrayList<Double> findOldPrice(Document doc) {
        ArrayList<Double> list = new ArrayList<Double>();
        Elements div = doc.getElementsByClass("pr");
        Document document = Jsoup.parse(div.toString());
        Elements elements = document.select("span");
        for (int i = 0; i < elements.size(); i += 2) {
            double hrn = Double.parseDouble(elements.get(i).text());
            double kop = Double.parseDouble(elements.get(i + 1).text()) * 0.01;
            double newPrice = hrn + kop;
            list.add(newPrice);
        }
        return list;
    }

How can I get around this problem?

  • Are you asking us to write the parsing algorithm for that site for you? Have you tried checking for this case and handling it? - Yuriy SPb
  • I'm not asking you to do it for me; you most likely misunderstood. I posted only the part of the code that parses a single component. If you look closely at the page source, you will see that the product name can be obtained from <div class="p10"> or <h3>, and the new and old prices from <div class="price_2014_new"> and <div class="price_2014_old">. I can get all of that. The problem is that some products have no old price, and I haven't figured out how to add this data to a list as grouped entries, for example List<productName, newPrice, oldPrice>, with String values - driversti

1 answer

I solved this as follows:

    import org.jsoup.Jsoup;
    import org.jsoup.nodes.Document;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    import java.io.IOException;
    import java.util.ArrayList;
    import java.util.Arrays;
    import java.util.List;

    public class Test {

        private static List<String> silpoPages = Arrays.asList(
                "http://silpo.ua/ua/actions/priceoftheweek/?PAGEN_1=1&",
                "http://silpo.ua/ua/actions/priceoftheweek/?PAGEN_1=2&",
                "http://silpo.ua/ua/actions/priceoftheweek/?PAGEN_1=3&");

        public static void main(String[] args) {
            for (String silpoPage : silpoPages) {
                Document doc;
                Elements el;
                ArrayList<Element> list = new ArrayList<Element>();
                try {
                    doc = Jsoup.connect(silpoPage)
                            .userAgent("Mozilla/5.0 (Windows NT 5.1) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/27.0.1453.110 Safari/537.36")
                            .timeout(0)
                            .get();
                    el = doc.getElementsByClass("photo");
                    String name;
                    double newPrice, oldPrice;
                    for (Element element : el) {
                        name = element.select("h3").text();
                        newPrice = Double.parseDouble(element.select("div.price_2014_new").text()) / 100;
                        if (!element.select("div.price_2014_old").text().equals("")) {
                            oldPrice = Double.parseDouble(element.select("div.price_2014_old").text()) / 100;
                        } else {
                            oldPrice = 0;
                        }
                        System.out.println(name + " " + newPrice + "грн, " + oldPrice + "грн");
                    }
                } catch (IOException e) {
                    e.printStackTrace();
                    System.out.println("Cannot open the site!");
                }
            }
        }
    }

It may not be elegant, and the price formatting still needs fixing, but maybe it will be useful to someone ;)
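
For the grouped list and the price formatting, here is a minimal sketch of one possible approach, not part of the solution above. It assumes the price text read from each product block is a whole number of kopecks (as the division by 100 above implies). The names PriceRows, Product, parsePrice and format are hypothetical, and the sample rows are made-up stand-ins for the strings jsoup would return. A missing old price is stored as null rather than 0, so it cannot be mistaken for a real price.

```java
import java.math.BigDecimal;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

class PriceRows {

    /** One product row; oldPrice is null when the page shows no old price. */
    static final class Product {
        final String name;
        final BigDecimal newPrice;
        final BigDecimal oldPrice; // may be null

        Product(String name, BigDecimal newPrice, BigDecimal oldPrice) {
            this.name = name;
            this.newPrice = newPrice;
            this.oldPrice = oldPrice;
        }

        @Override
        public String toString() {
            String old = (oldPrice == null) ? "no old price" : format(oldPrice);
            return name + ": " + format(newPrice) + " (old: " + old + ")";
        }
    }

    /**
     * Converts price text given in kopecks (an assumption, e.g. "1299")
     * to hryvnias. Returns null for null or blank text.
     */
    static BigDecimal parsePrice(String text) {
        if (text == null || text.trim().isEmpty()) {
            return null;
        }
        return new BigDecimal(text.trim()).movePointLeft(2);
    }

    /** Formats a price with exactly two decimal places. */
    static String format(BigDecimal price) {
        return String.format(Locale.ROOT, "%.2f грн", price);
    }

    public static void main(String[] args) {
        // Made-up sample rows: name, new price text, old price text.
        String[][] scraped = {
            {"Kiwi", "1299", ""},
            {"Curd", "2450", "3100"},
        };

        List<Product> products = new ArrayList<>();
        for (String[] row : scraped) {
            products.add(new Product(row[0], parsePrice(row[1]), parsePrice(row[2])));
        }
        for (Product p : products) {
            System.out.println(p);
        }
    }
}
```

Because every Product is built from the texts of a single product block, the name and both prices stay together even when the old price is absent. BigDecimal also avoids the rounding noise that double can show when prices are printed.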