Hello. I've run into the following problem: how should I store a fairly large data table in Python, and how can I provide quick access to it?
The source data is a table of 4 million rows and 10 columns.
The initial plan was to dump the whole thing into a .pkl like this:
    import pickle

    infile = open('big.txt', 'r')
    flag = 1
    count = 0
    data = []
    while flag == 1:
        row = infile.readline()
        if row == '':  # end of file reached
            flag = 0
            print('export done')
            break
        data.append(row.split('\t'))
    outfile = open('big_dump.pkl', 'wb')
    pickle.dump(data, outfile)
    print('dump done')

However, this code exhausts RAM, presumably because it keeps the entire data list in memory before pickling it. Which direction should I look in, and which storage library should I choose?
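(For reference, a minimal sketch of one way to avoid the memory blow-up: stream the input line by line and pickle each row immediately instead of accumulating a list. The file names and the tab delimiter are taken from the code above; this does not yet solve keyed access.)

    import pickle

    # Stream rows to disk one at a time instead of building a 4M-row list in RAM.
    with open('big.txt', 'r') as infile, open('big_dump.pkl', 'wb') as outfile:
        for line in infile:
            pickle.dump(line.rstrip('\n').split('\t'), outfile)
    print('dump done')

    # Reading back means calling pickle.load on the file repeatedly until EOFError,
    # so access is sequential, not by key.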
I thought about PyTables, but there is little documentation for it in Russian. The key requirement is that, after the data is saved, it must be possible to retrieve rows from the file by key.
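(As an illustration of keyed on-disk access, here is a minimal sketch using the standard-library sqlite3 module. The database name, table and column names, and the assumption that the first column of each row is a unique key are all hypothetical, not from the data above.)

    import sqlite3

    conn = sqlite3.connect('big.db')
    conn.execute('CREATE TABLE IF NOT EXISTS big (key TEXT PRIMARY KEY, payload TEXT)')

    # Load: stream rows from the text file straight into the on-disk database;
    # the first column is assumed (hypothetically) to serve as the key.
    with open('big.txt', 'r') as infile:
        rows = ((cols[0], '\t'.join(cols[1:]))
                for cols in (line.rstrip('\n').split('\t') for line in infile))
        conn.executemany('INSERT OR REPLACE INTO big VALUES (?, ?)', rows)
    conn.commit()

    # Lookup by key without loading the whole table into RAM.
    print(conn.execute('SELECT payload FROM big WHERE key = ?', ('some_key',)).fetchone())
    conn.close()

The table stays on disk, so memory usage is bounded regardless of row count, and the PRIMARY KEY index makes single-key lookups fast.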