Create a list of values from a pandas column based on the values of another column

I have a dataframe containing 3 columns. I want to collect two things into a list: the value in the first column that corresponds to the last entry in the second column, and the values in the first column whose associated values in the second column differ from that last entry by at least 8. In the example below, 18 is the last entry of col2 and therefore the reference, so I want its associated value from col1 in the list as well. The output should be a data frame. How can I do this in pandas?

col1  col2   col3
 a      0      1
 b      2      1
 c      13     1
 d      18     1

The output that I want is:

    col1   col3
 [d, b, a]  1

Thanks in advance.

1 answer

  • answered 2018-11-07 22:45 AmourK

    The way I read your question, d should not be included, since 18 - 18 = 0 < 8.

    Regardless, I took a three-step approach to this problem, keeping the last row explicitly so that d still ends up in the list.

    import pandas as pd

    # Get the reference value: the last entry in col2
    last_entry = df["col2"].iloc[-1]

    # Select only rows whose col2 is at least 8 below the reference,
    # or the last row itself (assumes the default 0..n-1 integer index)
    qry = "{ref} - col2 >= 8 or index == {idx}".format(ref=last_entry, idx=len(df) - 1)
    diff_gt_8 = df.query(qry)

    # For each value of col3, collect the matching col1 values into a list
    # and convert the result to a DataFrame
    pd.DataFrame(diff_gt_8.groupby("col3")["col1"].apply(list))
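
For anyone who wants to run this end to end, here is a minimal self-contained sketch of the same three steps using a boolean mask instead of `query`. The sample frame is copied from the question. The helper name `collect_col1` and its `min_diff` parameter are made up for illustration, and reversing the rows is only an assumption about how the requested `[d, b, a]` order is meant to come about (reference row first).

```python
import numpy as np
import pandas as pd

# Sample data copied from the question
df = pd.DataFrame({
    "col1": ["a", "b", "c", "d"],
    "col2": [0, 2, 13, 18],
    "col3": [1, 1, 1, 1],
})


def collect_col1(frame, min_diff=8):
    """Hypothetical helper: list col1 values per col3 group for the selected rows."""
    # Reference value: the last entry in col2
    ref = frame["col2"].iloc[-1]

    # Mark the last row by position, so this does not depend on the index labels
    is_last = np.arange(len(frame)) == len(frame) - 1

    # Keep rows at least min_diff below the reference, plus the last row itself
    keep = (ref - frame["col2"] >= min_diff) | is_last

    # Reverse the row order so the reference row comes first (assumed ordering)
    selected = frame[keep].iloc[::-1]

    # One list of col1 values per col3 group, returned as a DataFrame
    grouped = selected.groupby("col3")["col1"].apply(list).reset_index()
    return grouped[["col1", "col3"]]


result = collect_col1(df)
print(result)
```

With the sample data, c is dropped because 18 - 13 = 5 is below the threshold, while d is kept only because it is the last row.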