How to replace values in a pandas DataFrame

My goal is to write a program that replaces the unique values in a pandas DataFrame.

The following code attempts the operation:

    # replace values
    print(f" {s1['A1'].value_counts().index}")
    for i in s1['A1'].value_counts().index:
        s1['A1'].replace(i,1)

    print(f" {s2['A1'].value_counts().index}")
    for i in s2['A1'].value_counts().index:
        s2['A1'].replace(i,2)

    print("s1 after replacing values")
    print(s1)
    print("******************")
    print("s2 after replacing values")
    print(s2)
    print("******************")

Expected: The values in column A1 of the first dataframe, s1, should be replaced with 1s. The values in column A1 of the second dataframe, s2, should be replaced with 2s.

Actual:

     Int64Index([8, 5, 2, 7, 6], dtype='int64')
     Int64Index([2, 8, 5, 6, 7, 4, 3], dtype='int64')
    s1 after replacing values
        A1        A2   A3  Class
    3    5  0.440671  2.3      1
    9    8  0.070035  2.9      1
    14   2  0.868410  1.5      1
    29   6  0.587487  2.6      1
    34   8  0.652936  3.0      1
    38   8  0.181508  3.0      1
    45   8  0.953230  3.0      1
    54   7  0.737604  2.7      1
    68   5  0.187475  2.2      1
    70   5  0.511385  2.3      1
    71   8  0.688134  3.0      1
    73   2  0.054908  1.5      1
    87   8  0.461797  3.0      1
    90   2  0.756518  1.5      1
    91   2  0.761448  1.5      1
    93   5  0.858036  2.3      1
    94   5  0.306459  2.2      1
    98   5  0.692804  2.2      1
    ******************
    s2 after replacing values
        A1        A2   A3  Class
    0    2  0.463134  1.5      3
    1    8  0.746065  3.0      3
    2    6  0.264391  2.5      2
    4    2  0.410438  1.5      3
    5    2  0.302902  1.5      2
    ..  ..       ...  ...    ...
    92   5  0.775842  2.3      2
    95   5  0.844920  2.2      2
    96   5  0.428071  2.2      2
    97   5  0.356044  2.2      3
    99   5  0.815400  2.2      3

Any help understanding how to replace the values in these dataframes would be greatly appreciated. Thank you.

1 answer

  • answered 2021-10-24 19:20 Evan Gertis

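    The documentation for the replace method can be confusing on this point. By default, replace returns a new object instead of modifying the column in place, so you need to assign the result back to the dataframe.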

        # replace values
        print(f" {s1['A1'].value_counts().index}")
        for i in s1['A1'].value_counts().index:
            print(f"s1['A1'].replace({i},1)")
            # replace() returns a new Series, so assign it back to the column
            s1['A1'] = s1['A1'].replace(i,1)

        print(f" {s2['A1'].value_counts().index}")
        for i in s2['A1'].value_counts().index:
            print(f"s2['A1'].replace({i},2)")
            s2['A1'] = s2['A1'].replace(i,2)

    The docs do not spell this out: https://pandas.pydata.org/docs/reference/api/pandas.DataFrame.replace.html.
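
As a rough illustration of the point made in the answer above, the sketch below uses a small made-up frame as a stand-in for s1 (the data is hypothetical; only the column name A1 comes from the question). It first shows that calling replace without assigning the result leaves the column untouched, then shows one possible way to drop the loop by passing all distinct values to a single replace call.

```python
import pandas as pd

# Hypothetical stand-in for s1; the real frame in the question is larger.
s1 = pd.DataFrame({"A1": [5, 8, 2, 6, 8], "A3": [2.3, 2.9, 1.5, 2.6, 3.0]})

# replace() returns a new Series. Without an assignment the result is
# simply discarded and the original column keeps its values.
replaced = s1["A1"].replace(5, 1)
unchanged = s1["A1"].tolist()

# A list of values to replace can be mapped to one scalar in a single call,
# which avoids looping over value_counts().index.
distinct_values = s1["A1"].unique().tolist()
s1["A1"] = s1["A1"].replace(distinct_values, 1)
result = s1["A1"].tolist()

print(unchanged)
print(result)
```

If every value in the column really should become the same constant, a plain assignment such as s1["A1"] = 1 would likely be simpler still. The replace form is mainly useful when only some of the values are meant to change.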
