This question has already been answered:
How can I split text into separate sentences? splitlines() is not suitable, since the whole text may be written on a single line.
The expression should ignore (that is, not split on):
1980
100rub.
100r.
100kop
100k
etc.
etc.
as well as combined punctuation marks.
Code here: http://ideone.com/pNpffv
parts = all_text.split('.')
First, "..." is called an ellipsis (at least it was when I was in school). Second, there is also the mark ?!?. You can approach the task "creatively" and drop the empty strings:
filter(lambda x: not re.match(r"^\s*$", x), re.split(r"[!.?\n]", all_text))
However, I suspect the punctuation marks should stay in the resulting list. In that case, just search by the pattern
[^!.?\n]+[!.?\n]+
That also does not give a 100% correct result on "In 1998 there was a default." - alexlz

s = "Properties are a little different. They need a special declaration since they're handled in a very different way. (Hmmmm... I may have figured out an obvious way around that, but I want to get this out the door first.) Here's how you'd mock out calls to a property. Note that unlike other calls, all the calls to an overridden property must be played back in order."

def srtip_sent(str_):
    # Characters treated as sentence terminators.
    separators = ['.', '?', '!']
    start = 0
    s_split = []
    for i in range(len(str_)):
        # Index the argument, not the global s.
        if str_[i] in separators:
            # Cut the piece up to and including the terminator.
            s_split.append(str_[start:i+1])
            start = i + 1
    # Return a list (not a lazy map object) of trimmed pieces.
    return [part.strip() for part in s_split]

srtip_sent(s)

['Properties are a little different.',
 "They need a special declaration since they're handled in a very different way.",
 '(Hmmmm.',
 '.',
 '.',
 'I may have figured out an obvious way around that, but I want to get this out the door first.',
 ") Here's how you'd mock out calls to a property.",
 'Note that unlike other calls, all the calls to an overridden property must be played back in order.']
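For comparison, the pattern quoted in the earlier comment, [^!.?\n]+[!.?\n]+, keeps a run of punctuation such as "..." or "?!" attached to its sentence instead of splitting it into pieces. A minimal sketch of that idea follows; the helper name split_keep_punct is made up here and is not part of the thread.

```python
import re

# Pattern from the comment: a run of non-terminators followed by a run
# of terminators, so "..." and "?!" stay attached to their sentence.
SENTENCE_RE = re.compile(r"[^!.?\n]+[!.?\n]+")

def split_keep_punct(text):
    """Hypothetical helper: sentences with their trailing punctuation."""
    return [piece.strip() for piece in SENTENCE_RE.findall(text)]

print(split_keep_punct("Wait... Really?! Yes."))
# ['Wait...', 'Really?!', 'Yes.']
```

Note that a trailing fragment with no terminator is dropped by this pattern, and abbreviations that end in a period are still treated as sentence ends.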
It does not work correctly with compound marks, for example an ellipsis, or with initials such as "А. С. Пушкин" (A. S. Pushkin). Nor with internal punctuation marks, as in:
"Что за хрень?" -- поинтересовалась Алиса. ("What the heck?" asked Alice.)
- VladD
Source: https://ru.stackoverflow.com/questions/197142/
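The cases raised in the comments (ellipsis, "?!", initials, a quotation followed by a lowercase continuation) can be partly covered with a few heuristics. The sketch below is only one possible approach, not the method from the thread: the function name split_sentences, the tiny ABBREVIATIONS set, and the rule "a sentence starts with an uppercase letter or an opening quote/bracket" are all assumptions made for illustration.

```python
import re

# Hypothetical, deliberately tiny abbreviation list; extend for real text.
ABBREVIATIONS = {"etc", "rub", "r", "kop", "k"}

# Candidate boundary: a run of terminators (covers "..." and "?!"),
# optional closing quotes/brackets, then whitespace.
BOUNDARY_RE = re.compile(r"""([.!?]+)["')\]]*\s+""")
LAST_WORD_RE = re.compile(r"(\w+)\Z")

def split_sentences(text):
    sentences = []
    start = 0
    for m in BOUNDARY_RE.finditer(text):
        following = text[m.end():m.end() + 1]
        # Assumption: a new sentence starts with an uppercase letter
        # or an opening quote/bracket; otherwise keep going.
        if not (following.isupper() or following in ('"', "'", "(", "[")):
            continue
        if m.group(1) == ".":
            word = LAST_WORD_RE.search(text[start:m.start()])
            if word:
                # Drop leading digits so "100rub" is checked as "rub".
                w = word.group(1).lstrip("0123456789")
                is_initial = len(w) == 1 and w.isupper()
                if is_initial or w.lower() in ABBREVIATIONS:
                    continue  # initial or known abbreviation, not an end
        sentences.append(text[start:m.end()].strip())
        start = m.end()
    tail = text[start:].strip()
    if tail:
        sentences.append(tail)
    return sentences

print(split_sentences("A. S. Pushkin wrote this. Really?! Yes..."))
# ['A. S. Pushkin wrote this.', 'Really?!', 'Yes...']
```

These rules are still heuristics: an abbreviation that really does end a sentence will not be split, and a sentence ending in a single capital letter is mistaken for an initial. For anything beyond simple text, a trained sentence tokenizer is likely a better fit.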