Converting columns with dates in their names to separate rows in Python

I already got an answer to this question in R and am wondering how it can be implemented in Python.

Let's say we have a pandas DataFrame like this:

import pandas as pd
d = pd.DataFrame({'2019Q1':[1], '2019Q2':[2], '2019Q3':[3]})

which displays like this:

   2019Q1  2019Q2  2019Q3
0       1       2       3

How can I transform it to look like this:

Year    Quarter    Value
2019    1          1
2019    2          2
2019    3          3

2 answers

  • answered 2019-11-08 13:47 jezrael

    Use str.split with expand=True on the columns to get a MultiIndex, then reshape with DataFrame.unstack; finally, clean up the data with Series.reset_index and Series.rename_axis:

    d = pd.DataFrame({'2019Q1':[1], '2019Q2':[2], '2019Q3':[3]})
    
    d.columns = d.columns.str.split('Q', expand=True)
    df = (d.unstack(0)
           .reset_index(level=2, drop=True)
           .rename_axis(('Year','Quarter'))
           .reset_index(name='Value'))
    print (df)
       Year Quarter  Value
    0  2019       1      1
    1  2019       2      2
    2  2019       3      3
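
Note that Year and Quarter come back as strings here, because they are produced by a string split. If integer columns are wanted, one possible follow-up is to cast with astype. This is a sketch that assumes every column name has the form <year>Q<quarter>:

```python
import pandas as pd

d = pd.DataFrame({'2019Q1': [1], '2019Q2': [2], '2019Q3': [3]})

# Split "2019Q1" into ("2019", "1") so the columns become a two-level MultiIndex.
d.columns = d.columns.str.split('Q', expand=True)

df = (d.unstack(0)
       .reset_index(level=2, drop=True)       # drop the original row label
       .rename_axis(['Year', 'Quarter'])
       .reset_index(name='Value')
       # Assumption: both parts are purely numeric, so the cast cannot fail.
       .astype({'Year': int, 'Quarter': int}))
print(df)
```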
    

    Thank you @Jon Clements for another solution:

    df = (d.melt()
           .variable
           .str.extract(r'(?P<Year>\d{4})Q(?P<Quarter>\d)')
           .assign(Value=d.T.values.flatten()))
    print (df)
       Year Quarter  Value
    0  2019       1      1
    1  2019       2      2
    2  2019       3      3
    

    Alternative with split:

    df = (d.melt()
           .variable
           .str.split('Q', expand=True)
           .rename(columns={0:'Year',1:'Quarter'})
           .assign(Value=d.T.values.flatten()))
    print (df)
       Year Quarter  Value
    0  2019       1      1
    1  2019       2      2
    2  2019       3      3
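
Both melt-based variants take the values from d.T.values.flatten(), which relies on that array coming out in the same order as the melted rows. A minimal sketch of a variation that keeps the values melt already produces follows; the names melted, parts and period are illustrative and not from the answer:

```python
import pandas as pd

d = pd.DataFrame({'2019Q1': [1], '2019Q2': [2], '2019Q3': [3]})

# melt keeps each value on the same row as the column name it came from.
melted = d.melt(var_name='period', value_name='Value')

# Named groups become the column names of the extracted frame.
parts = melted['period'].str.extract(r'(?P<Year>\d{4})Q(?P<Quarter>\d)')

# Both frames share melt's index, so join lines the rows up by label.
df = parts.join(melted['Value'])
print(df)
```

As in the variants above, Year and Quarter are strings in the result.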
    

  • answered 2019-11-08 14:03 Erfan

    Using DataFrame.stack with DataFrame.pop and Series.str.split:

    df = d.stack().reset_index(level=1).rename(columns={0:'Value'})
    df[['Year', 'Quarter']] = df.pop('level_1').str.split('Q', expand=True)
    
       Value  Year Quarter
    0      1  2019       1
    0      2  2019       2
    0      3  2019       3
    

    If you care about the order of columns, use reindex:

    df = df.reindex(['Year', 'Quarter', 'Value'], axis=1)
    
       Year Quarter  Value
    0  2019       1      1
    0  2019       2      2
    0  2019       3      3
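
The result above keeps the original row label, so every row has index 0. If a fresh 0, 1, 2, ... index is preferred, as in the other answer's output, a possible last step is reset_index(drop=True). This is a sketch and not part of the original answer:

```python
import pandas as pd

d = pd.DataFrame({'2019Q1': [1], '2019Q2': [2], '2019Q3': [3]})

# Same steps as above: stack the columns into rows, then split the labels.
df = d.stack().reset_index(level=1).rename(columns={0: 'Value'})
df[['Year', 'Quarter']] = df.pop('level_1').str.split('Q', expand=True)

# Reorder the columns and replace the repeated 0 labels with 0..n-1.
df = df.reindex(['Year', 'Quarter', 'Value'], axis=1).reset_index(drop=True)
print(df)
```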