I have data that needs to be prepared for merging with another block of data. To do this, I would like to sort it chronologically. Sample source data:
    df1
                 pid  syear                   pgsbil               pgfamstd  \
    0            101   1984   [3] Fachhochschulreife   [1] verheiratet zus.
    1            101   1985   [3] Fachhochschulreife   [1] verheiratet zus.
    2            101   1986   [3] Fachhochschulreife   [1] verheiratet zus.
                 ...    ...                      ...                    ...
    6            102   1984  [1] Hauptschulabschluss   [1] verheiratet zus.
    7            102   1985  [1] Hauptschulabschluss   [1] verheiratet zus.
                 ...    ...                      ...                    ...
    484168  31433802   2012   [2] Realschulabschluss   [1] verheiratet zus.
    484169  31433901   2012               [4] Abitur  [2] verheiratet getr.

I tried to sort it using this code:
    DF1 = df1.sort_values(by='syear', ascending=1)

But instead of years, I get values that look, in my opinion, as if they were in a different encoding (and so does everything else!):
    Df1
    Out[53]:
                  pid   syear  pgsbil  pgfamstd  \
    248899  320797655  -32656      81       -95
    248825  891723238  -32419      43        43
    250014  345587954  -32377     NaN      -119
                  ...     ...     ...       ...
    250163  957561202   31108     -91        27
    250166  449665857   31554      -1        -1

Why do I get numbers in a different format when sorting the data? How do I fix this?
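As a sanity check, sorting by itself should only reorder rows, not change the values in them. A minimal sketch on made-up data (the `toy` frame and its values are assumptions for illustration, not the real `df1`):

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame; syear uses int16, the dtype reported for df1.
toy = pd.DataFrame({
    "pid": [102, 101, 101],
    "syear": np.array([1984, 1986, 1985], dtype="int16"),
})

sorted_toy = toy.sort_values(by="syear", ascending=True)

# The column holds the same values before and after; only the row order differs.
same_values = sorted(toy["syear"].tolist()) == sorted_toy["syear"].tolist()

print(sorted_toy)
print(same_values)
```

If the same check holds on the real data, the odd values were probably already in `df1` before the sort and simply moved to the top and bottom of the output.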
What do df1.syear.min(), df.syear.max() and df1.dtypes show? - MaxU

The df1.syear.agg(['min','max']) command gives the error AttributeError: 'Series' object has no attribute 'agg'. And for the second command: pid int32 / syear int16 / pgsbil category / pgfamstd category / pglabgro int32 / pgemplst category / dtype: object - user21

For df1.syear.min() I get -32656, and on the second, normal data, the result of df.syear.max() is 2012. However, when I run print(max(df1['syear'])), I get 31554. - user21
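The checks from the comments can be gathered into one short diagnostic. This is only a sketch: the small `df1` below is a hypothetical stand-in for the real frame, and the plausible year range 1984 to 2012 is an assumption taken from the sample shown in the question. The AttributeError on `.agg` may indicate an older pandas release, so the sketch sticks to `min()` and `max()`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for df1; replace with the real frame.
df1 = pd.DataFrame({
    "pid": np.array([101, 101, 102, 320797655], dtype="int32"),
    "syear": np.array([1984, 1985, 1984, -32656], dtype="int16"),
})

# Range of values that the column's integer dtype can represent.
info = np.iinfo(df1["syear"].dtype)
print(df1["syear"].dtype, info.min, info.max)

# Actual range of values in the column.
year_min, year_max = df1["syear"].min(), df1["syear"].max()
print(year_min, year_max)

# Rows whose year falls outside the assumed plausible range (1984..2012).
suspicious = df1[(df1["syear"] < 1984) | (df1["syear"] > 2012)]
print(len(suspicious))
print(suspicious)
```

Rows that land in `suspicious` are the ones worth tracing back to the step where `df1` was loaded or built.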