I need help with the following problem: when I read a file with pandas.read_csv, some rows are lost. Here is what I do, step by step:
    # read the data into a DataFrame
    import pandas as pan
    d = pan.read_csv('card.txt', sep='|', encoding='cp1251')
    print(len(d))

Then I check the result by counting the lines of the file directly:
    # read the contents of the file
    import codecs
    fh = codecs.open('card.txt', 'r', encoding='cp1251')
    text = list()
    # split into lines
    for line in fh:
        line = line.rstrip()
        text.append(line)
    fh.close()
    print(len(text))

The result is len(text) - len(d) = 214000.
What could be the problem? I would be very grateful for the help.
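One cause that would produce a gap of this kind, assuming the file contains stray double quotes (this is a guess, the contents of card.txt are not shown here): a CSV parser that honours quote characters treats everything between an opening `"` and the next `"` as one field, including the line breaks in between, so several physical lines collapse into a single record. The header line and any blank lines also lower the row count, but only slightly. The sketch below reproduces the effect with the standard-library `csv` module on made-up sample data; `count_records` and `sample` are hypothetical names used only for illustration.

```python
import csv
import io

def count_records(text, delimiter='|', quoting=csv.QUOTE_MINIMAL):
    """Count the records a csv reader sees in `text` under a given quoting mode."""
    reader = csv.reader(io.StringIO(text), delimiter=delimiter, quoting=quoting)
    return sum(1 for _ in reader)

# Hypothetical sample: a quote opens on the second line and closes on the fourth,
# so a quote-aware parser merges lines 2-4 into one record.
sample = 'id|name\n1|"Alpha\n2|Beta\n3|Gamma"\n4|Delta\n'

physical_lines = len(sample.splitlines())                       # lines in the file
with_quotes = count_records(sample)                             # quote-aware parsing
without_quotes = count_records(sample, quoting=csv.QUOTE_NONE)  # quotes are plain text

print(physical_lines, with_quotes, without_quotes)
```

If the same thing is happening in the real file, passing `quoting=csv.QUOTE_NONE` to `read_csv` (after `import csv`) should bring `len(d)` close to the number of lines minus the header. Whether that is the right fix depends on whether the quotes in the data are meaningful.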