Copy values of a column into an array in Python

I want to copy the values of a column into an array and then loop over that array, but on every iteration the loop sets the whole original column to the first value of the array.

For example, here is the original dataset:

Score   Col1    Col2    Col3
1         2       6      1
2         5       0      1
3         1       13     1
4         1        0     0

The result I want is

Score   Col1    Col2    Col3
1         2       6      1
1         5       0      1
1         1       13     1
1         1        0     0

Score   Col1    Col2    Col3
2         2       6      1
2         5       0      1
2         1       13     1
2         1        0     0

Score   Col1    Col2    Col3
3         2       6      1
3         5       0      1
3         1       13     1
3         1        0     0

Score   Col1    Col2    Col3
4         2       6      1
4         5       0      1
4         1       13     1
4         1        0     0

But using my code I'm getting the results like

Score   Col1    Col2    Col3
1         2       6      1
1         5       0      1
1         1       13     1
1         1        0     0

Score   Col1    Col2    Col3
1         2       6      1
1         5       0      1
1         1       13     1
1         1        0     0

Score   Col1    Col2    Col3
1         2       6      1
1         5       0      1
1         1       13     1
1         1        0     0

Score   Col1    Col2    Col3
1         2       6      1
1         5       0      1
1         1       13     1
1         1        0     0

This is the code I'm using; it's quite simple:

df_arr = df1['Score'].values
for i in df_arr:
    df1['Score'] = i
    print(df1)

However, if I add a duplicate of the 'Score' column, e.g. 'Score1', and build the array from that column instead, the loop gives the right results.

df_arr = df1['Score1'].values
for i in df_arr:
    df1['Score'] = i
    print(df1)

Edit: What I want is, for each value in my array, a copy of the dataset in which the whole first column is replaced by that value. I have provided a sample as well.

2 answers

  • answered 2018-07-15 15:42 Neil

    So, I'm pretty sure this is what you want. It creates a deep copy of the dataframe on each iteration, so the new score is written to the copy and the original dataframe object is never changed.

    df = df[["Score", "Col1", "Col2", "Col3"]]
    
    df_list = []
    for score in df["Score"]:
        new_df = df.copy()
        new_df["Score"] = score
        df_list.append(new_df)
    
    for temp_df in df_list:
        print(temp_df.to_string(index=False), "\n")
    

    Output (Note: I changed the order per OP's request):

    Score Col1 Col2 Col3
       1    2    6    1
       1    5    0    1
       1    1   13    1
       1    1    0    0 
    
    Score Col1 Col2 Col3
       2    2    6    1
       2    5    0    1
       2    1   13    1
       2    1    0    0 
    
    Score Col1 Col2 Col3
       3    2    6    1
       3    5    0    1
       3    1   13    1
       3    1    0    0 
    
    Score Col1 Col2 Col3
       4    2    6    1
       4    5    0    1
       4    1   13    1
       4    1    0    0 
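
As an aside on the loop from the question: a likely explanation (an assumption, since it depends on the pandas version) is that `.values` returned a view of the column's data, so overwriting `df1['Score']` also overwrote the array being iterated. Below is a minimal sketch that sidesteps this by taking an independent snapshot with `tolist()` before the loop. The sample frame is rebuilt from the question's table, and the names `scores` and `seen` are only illustrative.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df1 = pd.DataFrame({
    "Score": [1, 2, 3, 4],
    "Col1": [2, 5, 1, 1],
    "Col2": [6, 0, 13, 0],
    "Col3": [1, 1, 1, 0],
})

# tolist() builds a plain Python list, which cannot share memory with
# the dataframe, so later writes to df1["Score"] leave it untouched.
scores = df1["Score"].tolist()

seen = []  # records which score each iteration actually used
for score in scores:
    df1["Score"] = score  # broadcast the scalar to the whole column
    seen.append(int(df1["Score"].iloc[0]))
    print(df1.to_string(index=False), "\n")
```

Note that this still modifies `df1` in place, so after the loop the column holds the last score everywhere. If the original frame has to survive, copying it per iteration as in the answer above is the safer choice.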
    

  • answered 2018-07-15 15:55 sacul

    You can create a dictionary containing all your dataframes using a dictionary comprehension. The advantage is that each dataframe is stored under a descriptive key, so you can look it up by its score.

    df_dict = {'Score_'+str(i): df[['Col1', 'Col2', 'Col3']].assign(Score=i) for i in df.Score.unique()}
    

    Then, you can access each dataframe as you would any dictionary. For instance, to get the dataframe where Score is 2, you can use:

    df_dict['Score_2']
    
       Col1  Col2  Col3  Score
    0     2     6     1      2
    1     5     0     1      2
    2     1    13     1      2
    3     1     0     0      2
    

    You can also see all the dataframes that have been created by looking at the keys of your dictionary:

    >>> df_dict.keys()
    dict_keys(['Score_1', 'Score_2', 'Score_3', 'Score_4'])
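
For completeness, here is a self-contained sketch of the same dictionary idea, with the sample frame rebuilt from the question's table (an assumption about how `df` was created). It calls `assign` on the full frame rather than on the three selected columns: `assign` returns a new dataframe, and overwriting an existing column keeps that column in its original position, so `Score` stays first.

```python
import pandas as pd

# Sample data rebuilt from the table in the question.
df = pd.DataFrame({
    "Score": [1, 2, 3, 4],
    "Col1": [2, 5, 1, 1],
    "Col2": [6, 0, 13, 0],
    "Col3": [1, 1, 1, 0],
})

# One new dataframe per distinct score; df itself is never modified
# because assign() works on a copy.
df_dict = {
    "Score_" + str(s): df.assign(Score=s)
    for s in df["Score"].unique()
}

# Walk through the dictionary in insertion order and print each frame.
for name, frame in df_dict.items():
    print(name)
    print(frame.to_string(index=False), "\n")
```

Because `unique()` drops repeated scores, a column with duplicates would produce fewer dataframes than rows, which may or may not be what you want.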