Good day.

I have a variable containing text such as "Дискотеки Москвы: анонс на 1 ноября" ("Discos of Moscow: announcement for November 1"). The text can change; only the phrase "анонс на" ("announcement for") stays the same, and it is always followed by <day> <month>.

Could you tell me how, after finding the key phrase, I can parse the following four words?

  • Is "анонс на" ("announcement for") the key phrase? Then "1 ноября" ("November 1") is not four words. - approximatenumber
  • Yes, "анонс на" is the key phrase. Four words, counting the two in the key phrase itself. That is, I need to extract "анонс на 1 ноября" ("announcement for November 1") into a variable. - Sergey
  • Related question: Parse French date in python - jfs

2 answers

To get a date object from the text that follows the key phrase, you can use a regular expression to extract the date string and the dateparser module to convert that string into a date object:

    #!/usr/bin/env python
    # -*- coding: utf-8 -*-
    from __future__ import unicode_literals
    import re

    import dateparser  # $ pip install dateparser

    text = "Дискотеки Москвы: анонс на 1 ноября"
    m = re.search(r"{phrase}\s+(\w+\s+\w+)".format(phrase=re.escape("анонс на")),
                  text, flags=re.UNICODE)
    date_string = m.group(1)
    print(dateparser.parse(date_string).date())  # -> 2016-11-01
  • Thanks for the answer, but it did not help. I am using Python 2: `text = answer['response'][i]['description']; text = text.lower(); m = re.search(r"{phrase}\s+(\w+\s+\w+)".format(phrase=re.escape("анонс на")), text); date_string = m.group(1); print dateparser.parse(date_string).date()` This fails even though the variable text contains: анонс на 28 октября (пятница 20.30-23.55) - Sergey
  • @jfs Thanks, that helped. P.S. I was adapting the code for Python 2 and apparently missed something along the way. - Sergey
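
The question asks for the key phrase together with the two words after it, and the comment above shows that calling `m.group(1)` breaks when nothing matches. A minimal Python 3 sketch that returns the whole four-word phrase, or `None` when the phrase is absent, might look like this. The function name `extract_announcement` and its parameters are hypothetical and not part of the answer above:

```python
import re


def extract_announcement(text, phrase="анонс на", extra_words=2):
    """Return the key phrase plus the next `extra_words` words, or None."""
    # Builds a pattern like r"анонс\ на(?:\s+\w+){2}".
    # re.escape keeps the phrase literal; the doubled braces survive str.format.
    pattern = r"{phrase}(?:\s+\w+){{{n}}}".format(
        phrase=re.escape(phrase), n=extra_words
    )
    # IGNORECASE avoids having to lowercase the input first.
    match = re.search(pattern, text, flags=re.IGNORECASE)
    # re.search returns None when there is no match, so check before group().
    return match.group(0) if match else None


print(extract_announcement("Дискотеки Москвы: анонс на 1 ноября"))
print(extract_announcement("🔴 Анонс на 28 октября (пятница 20.30-23.55)"))
print(extract_announcement("no key phrase here"))
```

Returning `None` instead of raising lets the caller decide what to do with descriptions that carry no announcement.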

Use regular expressions for this. For example:

    import re

    text = """
    Дискотеки Москвы: анонс на 1 ноября
    Дискотеки Москвы: анонс на 1 декабря
    Дискотеки СПб: анонс на 10 декабря
    Дискотеки СПб: анонс на 26 декабря
    """

    # A raw string keeps \d and \w from being read as string escapes.
    for date, month in re.findall(r'анонс на (\d+) (\w+)', text):
        print(date, month)

Console:

    1 ноября
    1 декабря
    10 декабря
    26 декабря

To find a single match, you can use the search function:

    import re

    text = "Дискотеки Москвы: анонс на 1 ноября"

    match = re.search(r'анонс на (\d+) (\w+)', text)
    if match:  # search returns None when nothing matches
        date = match.group(1)
        month = match.group(2)
        print(date, month)  # 1 ноября

P.S.

The examples are written for Python 3. In Python 2, either import the print function with from __future__ import print_function, or use the print statement instead.
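
The regular expression yields the day and the month as strings. If the dateparser dependency is not wanted, the two strings can be turned into a `datetime.date` with a lookup table of Russian genitive month names. This is only a sketch: the names `MONTHS` and `to_date` are hypothetical, and the year is an assumption that the caller must supply, because the text does not contain it.

```python
import datetime
import re

# Russian month names in the genitive case, as they appear after a day number.
MONTHS = {
    "января": 1, "февраля": 2, "марта": 3, "апреля": 4,
    "мая": 5, "июня": 6, "июля": 7, "августа": 8,
    "сентября": 9, "октября": 10, "ноября": 11, "декабря": 12,
}


def to_date(text, year):
    """Return the announced date, or None if no known date is found."""
    match = re.search(r"анонс на (\d+) (\w+)", text.lower())
    if not match:
        return None
    day, month_name = match.groups()
    month = MONTHS.get(month_name)
    if month is None:  # the word after the day is not a known month
        return None
    # datetime.date raises ValueError for impossible days such as 31 ноября.
    return datetime.date(year, month, int(day))


print(to_date("Дискотеки Москвы: анонс на 1 ноября", 2016))  # 2016-11-01
```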

  • Thank you, but it does not find any matches. Code: `text = answer['response'][0]['description']  # answer holds the JSON response; the text exists` `text = text.lower()` `match = re.findall(u' анонс на (\d+) (\w+)',text)` `print match` It returns an empty list, although the text contains the required phrase: 🔴 анонс на 28 октября (пятница 20.30-23.55) - Sergey
  • Your text parses fine for me; try the u prefix: u'анонс на (\d+) (\w+)'. It would also be better to state, at the end of your question, the Python version and the request that returns the data. - gil9red
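
One possible cause of the empty list, offered as an assumption because the thread does not confirm it: in Python 2, `\w` matches only ASCII letters, digits and the underscore unless `re.UNICODE` is passed, so it can never match "октября". The Python 3 sketch below mimics the Python 2 default with `re.ASCII` and shows that the match appears once Unicode-aware matching is on:

```python
import re

text = "🔴 анонс на 28 октября (пятница 20.30-23.55)"
pattern = r"анонс на (\d+) (\w+)"

# re.ASCII restricts \w to [a-zA-Z0-9_], like Python 2 without re.UNICODE.
ascii_only = re.findall(pattern, text, flags=re.ASCII)

# re.UNICODE is already the default for str patterns in Python 3;
# it is spelled out here because Python 2 needs it explicitly.
unicode_aware = re.findall(pattern, text, flags=re.UNICODE)

print(ascii_only)     # []
print(unicode_aware)  # [('28', 'октября')]
```

In Python 2 the equivalent call would pass a unicode pattern and `flags=re.UNICODE`, as the first answer does.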