Good day! I'm learning to write Python. Please help. I have a CSV file that stores texts, and a list of words (five, to be precise). I need to check whether any of these five words occurs in each text in the file.

For example, the CSV contains texts such as: Good afternoon! My name is John! Hello! My name is Emma! My name is Emily!

Words: Good afternoon, Hello, Good evening.

My code is:

    #! /usr/bin/env python
    # -*- coding: utf-8 -*-
    import csv
    import unicodecsv as csv
    import sys
    reload(sys)
    sys.setdefaultencoding('utf-8')

    # Greetings to look for: 'Good afternoon', 'Hello', 'Good morning', 'Dear' (m.), 'Dear' (f.)
    result_read = ['Добрый день', 'Здравствуйте', 'Доброе утро', 'Уважаемый', 'Уважаемая']

    if __name__ == '__main__':
        print 'Starting...'
        with open('data.csv', 'r') as f:
            data = f.read().splitlines()
        # i = raw_input('Enter response: ')
        # data = [i]
        result = []
        for lines in data:
            #sentences = filter(lambda x: len(x) > 0, entry.split('.'))
            if result_read in lines: #and len(sentences) > 5 and all([len(x.split()) > 4 for x in sentences]):
                result.append((lines, 'соответствует'))  # 'matches'
            else:
                result.append((lines, 'не соответствует'))  # 'does not match'
        with open('result.csv', 'w') as f:
            for r in result:
                f.write(u'{}, {}\n'.format(r[0], r[1]))
        print 'Finished!'
  • "Good day" is not a single word, as a rule. Do you want the equivalent of the grep -Ff phrases input.csv command? Why does it matter that this is a CSV file rather than any other text file? - jfs
  • So what exactly is the problem? Is the code not working? Does it work, but not the way you want? Does it give an error message? - Enikeyschik
  • Good day! The code does not work correctly: once it sees a match it records it, but then it marks all the other lines as not matching, even when greeting words are present. - Alikhan Orynbassaruly

1 answer

To find the lines that contain at least one of the specified phrases, you can use the regex module (not tested):

    #!/usr/bin/env python3
    import fileinput

    import regex  # $ pip install regex

    phrases = ['Добрый день', 'Здравствуйте', 'Доброе утро', 'Уважаемый', 'Уважаемая']
    # \L<phrases> is a named list: it matches any one of the given strings
    found = regex.compile(r'\L<phrases>', phrases=phrases).search

    for line in fileinput.input():
        if found(line):
            print(line, end='')

Example:

 $ ./find-phrases input.csv 

This is a [slow] analog of the command: grep -Ff phrases input.csv
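
If you would rather keep the "matches / does not match" output from the question and avoid the extra dependency, here is a minimal sketch using only the standard library (Python 3). The likely culprit in the original code is `result_read in lines`, which tests the whole list against the string instead of each phrase in turn; `any(...)` does the per-phrase check. The helper names `label_line` and `label_file` are made up for illustration, the file names are the ones from the question, and I am assuming the files are UTF-8 and that a plain case-sensitive substring match is what is wanted.

```python
import csv

# Phrases to look for (copied from the question).
PHRASES = ['Добрый день', 'Здравствуйте', 'Доброе утро', 'Уважаемый', 'Уважаемая']


def label_line(line, phrases=PHRASES):
    """Return 'соответствует' if any phrase occurs in the line, else 'не соответствует'."""
    # Check each phrase separately; a list can never be "in" a string.
    return 'соответствует' if any(p in line for p in phrases) else 'не соответствует'


def label_file(src='data.csv', dst='result.csv'):
    """Write every input line together with its label to dst (hypothetical helper)."""
    # Assumption: both files are UTF-8 encoded.
    with open(src, encoding='utf-8') as fin, \
         open(dst, 'w', encoding='utf-8', newline='') as fout:
        writer = csv.writer(fout)  # quotes lines that themselves contain commas
        for line in fin:
            line = line.rstrip('\n')
            writer.writerow([line, label_line(line)])
```

Calling label_file() would read data.csv line by line and write result.csv with one labelled row per input line. Using csv.writer instead of string formatting keeps the output valid when a text contains a comma.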