How to display information about duplicate data in a 2D array (DataFrame)?

Question

For example, there is an array:

    1 1 1 0 2
    0 1 1 2 1
    0 0 2 1 1

Expected output:

    0 - 4
    1 - 8
    2 - 3

Code:

    import numpy as np
    import pandas as pd

    # Step 1. Load the data file
    data_file = pd.read_excel('Arrayt.xlsx')

    # Step 2. Output the list of elements with their frequencies
    data_file.stack().value_counts().reset_index().rename(columns={'index':'val', 0:'count'})

    # I am currently going through tutorials on saving and displaying the data obtained in Step 2.

Originally, the task was formulated as follows:

An array is given:

    A11 A12 A13 A14 A15 A16 ... A1n

The required output is:

    A12-A11 None
    A13-A11 A13-A12 None
    A14-A11 A14-A12 A14-A13
    A15-A11 A15-A12 A15-A13
    A16-A11 A16-A12 A16-A13
    ...     ...     ...
    A1n-A11 A1n-A12 A1n-A12

I did this part in Excel. The array has already been imported into Python (thanks to forum member MaxU). Again thanks to MaxU's hint, each matrix element is now output together with its frequency.

The task is to move the whole implementation to Python.

I want to do this with a 100 by 100 array. To do this, I import the pandas package:

    import pandas as pd
    mydata = pd.io.excel.read_excel(open("C:\\Users\\User\\Downloads\\Arraytest.xlsx"))
    print(mydata)

But is importing pandas, as above, enough?
On StackOverflow (SO), it's customary to ask one question and not to change its essence afterwards.
If you want to ask another question, open a new one.

Answer 1 · 2017-02-11T21:12:07

How to read data from Excel into a Pandas.DataFrame:

    import pandas as pd
    df = pd.read_excel('C:\\Users\\User\\Downloads\\Arraytest.xlsx')

Answer to the question about duplicate data:

    In [131]: df
    Out[131]:
       0  1  2  3  4
    0  1  1  1  0  2
    1  0  1  1  2  1
    2  0  0  2  1  1

As a Pandas.Series:

    In [132]: df.stack().value_counts()
    Out[132]:
    1    8
    0    4
    2    3
    dtype: int64

    In [143]: df.stack().value_counts(sort=False)
    Out[143]:
    0    4
    1    8
    2    3
    dtype: int64
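As a possible alternative, the same frequencies can be obtained with NumPy, which the question already imports. This is only a minimal sketch: it assumes the 3x5 example is built in memory instead of being read from Excel, and the variable names are illustrative.

```python
import numpy as np
import pandas as pd

# Assumption: the example data from the question, built in memory (no Excel file).
df = pd.DataFrame([[1, 1, 1, 0, 2],
                   [0, 1, 1, 2, 1],
                   [0, 0, 2, 1, 1]])

# np.unique flattens the 2D array and returns the sorted unique values;
# return_counts=True additionally returns how often each value occurs.
values, counts = np.unique(df.to_numpy(), return_counts=True)

for value, count in zip(values, counts):
    print(f"{value} - {count}")
```

The result is ordered by value, like the `sort=False` output above for this data.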

As a Pandas.DataFrame:

    In [134]: df.stack().value_counts().reset_index().rename(columns={'index':'val', 0:'count'})
    Out[134]:
       val  count
    0    1      8
    1    0      4
    2    2      3

    In [144]: df.stack().value_counts(sort=False) \
                .reset_index().rename(columns={'index':'val', 0:'count'})
    Out[144]:
       val  count
    0    0      4
    1    1      8
    2    2      3
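Since the question also mentions saving the result, here is a hedged sketch of one way to do that. It names the columns with `rename_axis` and `reset_index(name=...)`, so it does not rely on the default column labels that `reset_index()` generates (as far as I know, these differ between pandas versions). The data is built in memory and written to an in-memory buffer; both are assumptions made for illustration, and a file path could be passed to `to_csv` instead.

```python
import io
import pandas as pd

# Assumption: the example data from the question, built in memory (no Excel file).
df = pd.DataFrame([[1, 1, 1, 0, 2],
                   [0, 1, 1, 2, 1],
                   [0, 0, 2, 1, 1]])

# Frequency table with explicit column names 'val' and 'count'.
freq = (df.stack()
          .value_counts()
          .rename_axis('val')           # name the index that holds the values
          .reset_index(name='count'))   # turn it into a column; name the counts

# Write the table as CSV text; a buffer is used here to keep the sketch self-contained.
buf = io.StringIO()
freq.to_csv(buf, index=False)
csv_text = buf.getvalue()
print(csv_text)
```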

Explanation:

    In [135]: df.stack()
    Out[135]:
    0  0    1
       1    1
       2    1
       3    0
       4    2
    1  0    0
       1    1
       2    1
       3    2
       4    1
    2  0    0
       1    0
       2    2
       3    1
       4    1
    dtype: int64
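To tie the steps together: `stack()` turns the 2D table into one long Series indexed by (row, column), and `value_counts()` then counts the values in that Series. A minimal self-contained sketch, assuming the example data is typed in directly rather than loaded from Excel, that prints the result in the "value - count" form asked for in the question:

```python
import pandas as pd

# Assumption: the example data from the question, built in memory (no Excel file).
df = pd.DataFrame([[1, 1, 1, 0, 2],
                   [0, 1, 1, 2, 1],
                   [0, 0, 2, 1, 1]])

stacked = df.stack()                          # 15 values, indexed by (row, column)
counts = stacked.value_counts().sort_index()  # frequency per value, ordered by value

# Format each (value, count) pair as "value - count".
lines = [f"{val} - {cnt}" for val, cnt in counts.items()]
print("\n".join(lines))
```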

Kirill Malyshev · Answer 2 · 2017-02-11T20:47:10

The set(a) call creates a set of the unique elements, which we then loop over, counting the occurrences of each one.

    a = [1, 1, 1, 0, 2, 0, 1, 1, 2, 1, 0, 0, 2, 1, 1]
    for i in set(a):
        print(str(i) + ' - ' + str(a.count(i)))
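Note that `a.count(i)` rescans the whole list for every unique value. For a larger 2D array (such as the 100 by 100 case mentioned in the question), a single pass with `collections.Counter` may be preferable. The following is just a sketch, assuming the data is held as a plain list of rows; the name `rows` is illustrative.

```python
from collections import Counter
from itertools import chain

# Assumption: the 2D array is a plain list of rows (the example from the question).
rows = [[1, 1, 1, 0, 2],
        [0, 1, 1, 2, 1],
        [0, 0, 2, 1, 1]]

# chain.from_iterable flattens the rows; Counter tallies every element in one pass.
counts = Counter(chain.from_iterable(rows))

for value in sorted(counts):
    print(f"{value} - {counts[value]}")
```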

