How to create a DataFrame with N columns from a 1D array (if the length of a 1D array is not a multiple of N, fill the missing elements with zeros)

Question

I have a column 'ID' whose values need to be spread across 10 other columns, filling them row by row. If any cells are left empty at the end, they should be filled with zeros.

I would like to use pandas as much as possible.

ID = [3, 2, 2, 10, 2, 1, 7, 5, 8, 9, 3, 1 ...]

Here is what I should get:

    import glob
    import os

    import pandas as pd

    # PATH_TO_DATA is assumed to be defined earlier in the notebook.

    def prepare_train_set(path_to_csv_files, session_length=10):
        df = pd.read_csv(path_to_csv_files)
        # Encode each site as an integer ID and count how often it occurs.
        df['ID'] = pd.factorize(df.site)[0]
        df['frequency'] = df.groupby('ID', as_index=False)['site'].transform(lambda s: s.count())
        dictionary = df[['site', 'ID', 'frequency']].loc[pd.unique(df['ID'])]
        dct = dictionary.set_index('site').T.to_dict('list')
        # Empty frame that still has to be filled from the 'ID' column.
        df2 = pd.DataFrame(columns=['site1', 'site2', 'site3', 'site4', 'site5', 'site6',
                                    'site7', 'site8', 'site9', 'site10', 'user_id'])
        print(df)
        print("======")
        print(dct)
        print("======")
        return dct

    for path in glob.glob(os.path.join(PATH_TO_DATA, '3users/user*[0-9].csv')):
        # print(os.path.isfile(path))
        prepare_train_set(path, 10)

Please give an example of the input DF (as text / CSV or as Python code) and what you want to get at the output.
@MaxU drive.google.com/open?id=1EcmlTDqZXyqzulTVVAyWo7v2qm6IzEu8 To begin with, I need to write a function that returns a dictionary and a new data frame.
Here is the task: github.com/Yorko/mlcourse_open/blob/master/jupyter_notebooks/…
It is best to give a small artificial example (3-5 lines) of input data and what you want to get at the output.
@MaxU I am processing many data frames, one per user, so user_id is simply the user's sequence number.

MaxU · Answer 1 · 2017-12-14T23:02:20

Example:

    In [73]: ID = np.random.randint(100, size=(24))

    In [74]: ID
    Out[74]:
    array([37, 33,  3,  7,  9, 30, 60, 20, 55, 97, 94,  4,  3, 87, 28, 22, 62,
           28, 97, 70,  3, 57, 21, 18])

    In [75]: %paste
    user_id = 1
    N = 10  # number of columns
    data = np.pad(ID, (0, int(np.ceil(len(ID) / N)) * N - len(ID)), mode='constant')
    df = pd.DataFrame(data.reshape(-1, N), columns=np.arange(1, N+1)) \
           .add_prefix('site') \
           .assign(user_id=user_id)
    ## -- End pasted text --

    In [76]: df
    Out[76]:
       site1  site2  site3  site4  site5  site6  site7  site8  site9  site10  user_id
    0     37     33      3      7      9     30     60     20     55      97        1
    1     94      4      3     87     28     22     62     28     97      70        1
    2      3     57     21     18      0      0      0      0      0       0        1
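The same idea can be wrapped in a small reusable function so that it can be called once per user file. This is only a sketch: the function name `to_sessions` and the parameter `n_cols` are made up for illustration, and the sample input is just the first 12 values shown in the question. The padding length is computed as `-len(ids) % n_cols`, which should give the same result as the `np.ceil` expression above.

```python
import numpy as np
import pandas as pd


def to_sessions(ids, n_cols=10, user_id=1):
    """Reshape a 1D sequence into rows of n_cols values, zero-padding the last row.

    Hypothetical helper; the name and signature are illustrative only.
    """
    ids = np.asarray(ids)
    # Number of zeros needed to reach the next multiple of n_cols
    # (0 if the length is already a multiple of n_cols).
    pad = -len(ids) % n_cols
    data = np.pad(ids, (0, pad), mode='constant')
    columns = [f'site{i}' for i in range(1, n_cols + 1)]
    return pd.DataFrame(data.reshape(-1, n_cols), columns=columns).assign(user_id=user_id)


# The first 12 values from the question, used here only as sample input.
sample = [3, 2, 2, 10, 2, 1, 7, 5, 8, 9, 3, 1]
df = to_sessions(sample, n_cols=10, user_id=1)
print(df)
```

For this sample the result has two rows: the first holds the first ten values, and the second holds 3 and 1 followed by eight zeros. The sketch assumes a non-empty input.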
