I'm trying to get the current dollar rate:

    # -*- coding: utf-8 -*-
    import urllib.request
    import xml
    import sys

    u = urllib.request.urlopen("http://www.cbr.ru/scripts/XML_daily.asp", timeout=10)

The XML_daily.asp file contains:

    ...
    <Valute ID="R01215">
        <NumCode>208</NumCode>
        <CharCode>DKK</CharCode>
        <Nominal>10</Nominal>
        <Name>Датских крон</Name>
        <Value>63,8151</Value>
    </Valute>
    <Valute ID="R01235">
        <NumCode>840</NumCode>
        <CharCode>USD</CharCode>
        <Nominal>1</Nominal>
        <Name>Доллар США</Name>
        <Value>35,0115</Value>
    </Valute>
    ...

How do I extract the dollar Value from this (the Valute element with ID="R01235")?

3 answers

You can use xml.etree.ElementTree.

    import xml.etree.ElementTree as ET

    tree = ET.parse('/tmp/XML_daily.asp')

Then pull out the desired currency using XPath:

    tree.findall('./Valute[@ID="R01235"]/Value')[0].text
  • I get this error: xml.etree.ElementTree.ParseError: not well-formed (invalid token): - Adam
  • @derkode, I don't know, everything works for me:

        $ curl "cbr.ru/scripts/XML_daily.asp" > /tmp/XML_daily.asp
        $ python3
        Python 3.2.3 (default, Feb 27 2014, 21:31:18)
        [GCC 4.6.3] on linux2
        Type "help", "copyright", "credits" or "license" for more information.
        >>> import xml.etree.ElementTree as ET
        >>>
        >>> tree = ET.parse('/tmp/XML_daily.asp')
        >>> tree.findall('./Valute[@ID="R01235"]/Value')[0].text
        '34,9043'

    - dzhioev
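The XPath lookup from this answer can also be tried without downloading anything. The sketch below parses the fragment from the question as a string. The `<ValCurs>` root element and the names `SAMPLE`, `usd_value` and `missing` are assumptions added for illustration, since the question elides the surrounding markup.

```python
import xml.etree.ElementTree as ET

# Fragment copied from the question. The <ValCurs> wrapper is an assumption:
# the question elides the root element, but a well-formed document needs one.
SAMPLE = """<ValCurs>
  <Valute ID="R01215">
    <NumCode>208</NumCode>
    <CharCode>DKK</CharCode>
    <Nominal>10</Nominal>
    <Name>Датских крон</Name>
    <Value>63,8151</Value>
  </Valute>
  <Valute ID="R01235">
    <NumCode>840</NumCode>
    <CharCode>USD</CharCode>
    <Nominal>1</Nominal>
    <Name>Доллар США</Name>
    <Value>35,0115</Value>
  </Valute>
</ValCurs>"""

# fromstring() returns the root element, so "./Valute" matches its children.
root = ET.fromstring(SAMPLE)

# findtext() returns the text of the first match, or None if nothing matches.
usd_value = root.findtext('./Valute[@ID="R01235"]/Value')
print(usd_value)

# An unknown ID yields None here, whereas findall(...)[0] would raise IndexError.
missing = root.findtext('./Valute[@ID="R00000"]/Value')
print(missing)
```

Using `findtext()` instead of `findall(...)[0].text` is a design choice: it makes the "currency not found" case an explicit `None` check rather than an exception.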

Here is a complete Python script that prints the current dollar rate (a slightly simplified version of jawev's answer):

    #!/usr/bin/env python3
    from urllib.request import urlopen
    from xml.etree import ElementTree as etree

    with urlopen("https://www.cbr.ru/scripts/XML_daily.asp", timeout=10) as r:
        print(etree.parse(r).findtext('.//Valute[@ID="R01235"]/Value'))
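Note that the value comes back as a string with a decimal comma, and the sample in the question suggests it is quoted per `Nominal` units (10 for DKK). If a number is needed, a minimal sketch of the conversion might look like this. The function name `rate_per_unit`, the trimmed `SAMPLE` and its `<ValCurs>` root are hypothetical and not part of the original answer.

```python
from decimal import Decimal
from xml.etree import ElementTree as etree


def rate_per_unit(xml_data, valute_id):
    """Return the rate for one unit of the currency as a Decimal.

    Hypothetical helper: assumes <Valute> elements are direct children of
    the root and that <Value> uses a comma as the decimal separator.
    """
    root = etree.fromstring(xml_data)
    valute = root.find('./Valute[@ID="%s"]' % valute_id)
    if valute is None:
        raise KeyError(valute_id)
    # "63,8151" -> Decimal("63.8151")
    value = Decimal(valute.findtext("Value").replace(",", "."))
    nominal = Decimal(valute.findtext("Nominal"))
    return value / nominal


# Trimmed copy of the fragment from the question; the <ValCurs> root is assumed.
SAMPLE = """<ValCurs>
  <Valute ID="R01215"><Nominal>10</Nominal><Value>63,8151</Value></Valute>
  <Valute ID="R01235"><Nominal>1</Nominal><Value>35,0115</Value></Valute>
</ValCurs>"""

print(rate_per_unit(SAMPLE, "R01235"))  # dollar rate
print(rate_per_unit(SAMPLE, "R01215"))  # rate for a single krone
```

`Decimal` is used rather than `float` so that the published digits are kept exactly. With live data, the bytes returned by `urlopen(...).read()` could presumably be passed as `xml_data` in the same way.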

    You can use the re module from the Python standard library, or use any parser; it depends on taste and requirements. For example, parser or grab.

    • Anyone who advises parsing *ML with regexps deserves to be shot on the spot :) - user6550