Changing a many column dataframe to a single column without losing information in python
I want to see changes in cell values when comparing multiple dataframes. These dataframes are formed from JSON data, which forms manycolumn tables, and I cannot easily change this data source. Let's say there are 10 dataframes with 10 rows and 10 columns (equally labelled). I'd like to compare the information, by turning each dataframe into 100 row and 1 column.
for 3x3 example:
import pandas as pd
data = [{'a':1,'b':2,'c':3},{'a':10,'b':20,'c':30},{'a':100,'b':200,'c':300}]
df = pd.DataFrame(data)
df.index = ['x','y','z']
gives this table
a b c
x 1 2 3
y 10 20 30
z 100 200 300
but I would like to have:
col
xa 1
xb 2
xc 3
ya 10
yb 20
yc 30
za 100
zb 200
zc 300
so that I may then add many columns and compare values changes.
Can somebody advise me on how to do this using pandas? It is okay if a third colum is required, i.e.:
1 2 3
x a 1
x b 2
x c 3
y a 10
y b 20
y c 30
z a 100
z b 200
z c 300
2 answers

Try something like this
import pandas as pd data = [{'a':1,'b':2,'c':3},{'a':10,'b':20,'c':30},{'a':100,'b':200,'c':300}] df = pd.DataFrame(data) df.index = ['x','y','z'] df_us = df.unstack().reset_index() df_us.columns = [i for i in range(df_us.shape[1])] df_us = df_us.sort_values(by=2)

Use
DataFrame.stack
withSeries.to_frame
and then flattenMultiIndex
toindex
withmap
:df_us = df.stack().to_frame('col') df_us.index = df_us.index.map(lambda x: f'{x[0]}{x[1]}') print (df_us) col xa 1 xb 2 xc 3 ya 10 yb 20 yc 30 za 100 zb 200 zc 300
For 3 columns:
df_us = df.stack().reset_index() df_us.columns = [0,1,2] print (df_us) 0 1 2 0 x a 1 1 x b 2 2 x c 3 3 y a 10 4 y b 20 5 y c 30 6 z a 100 7 z b 200 8 z c 300