# I want to calculate the percentage, but all I am getting is the sum in a pandas DataFrame

I want to calculate the percentage, but all I am getting is the sum. Please help me get the percentage value in the cells, rather than the count, in a pandas DataFrame in Python.

Code:

``````ds_data = data[(data.JobTitle == 'Data Analyst')
               | (data.JobTitle == 'Data Engineer')
               | (data.JobTitle == 'Data Scientist')]
agg_func = {'Education': {
    'Masters': lambda x: sum(i == 'Masters' for i in x),
    'Bachelor': lambda x: sum(i == 'Bachelors (4 years)' for i in x),
    'None': lambda x: sum(i == 'None (no degree completed)' for i in x),
    'Doctorates': lambda x: sum(i == 'Doctorate/PhD' for i in x),
    'Associates': lambda x: sum(i == 'Associates (2 years)' for i in x)}}
function = ds_data.groupby(['JobTitle']).agg(agg_func).reset_index()
function.columns = function.columns.droplevel(0)
function
``````

I've taken the liberty of defining a function to contain the math, since it is cleaner than copying and pasting the same expression five times.

To get the percentage, divide the count of matches by the total number of items in the group, that is, the length of the Series passed to the function, and multiply by 100.

``````def calc_percentage(x, degree):
    # x is the group's 'Education' Series; count matches and divide by group size
    return (sum(i == degree for i in x) / len(x)) * 100

agg_func = {
    'Education': {
        'Masters': lambda x: calc_percentage(x, 'Masters'),
        'Bachelor': lambda x: calc_percentage(x, 'Bachelors (4 years)'),
        'None': lambda x: calc_percentage(x, 'None (no degree completed)'),
        'Doctorates': lambda x: calc_percentage(x, 'Doctorate/PhD'),
        'Associates': lambda x: calc_percentage(x, 'Associates (2 years)')
    }
}
``````
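Nested dict renaming in `agg` no longer works in pandas 1.0 and later, so here is a minimal sketch of a different technique that produces the same kind of per-job-title percentages: `pd.crosstab` with `normalize='index'`. The sample frame below is made up for illustration, since the real `data` is not shown in the question, and the column labels come out as the raw `Education` values rather than the short names used above.

```python
import pandas as pd

# Hypothetical sample data; the real `data` frame is not shown in the question.
data = pd.DataFrame({
    'JobTitle': ['Data Analyst', 'Data Analyst', 'Data Engineer',
                 'Data Scientist', 'Data Scientist', 'Data Scientist',
                 'Data Scientist', 'Web Developer'],
    'Education': ['Masters', 'Bachelors (4 years)', 'Masters',
                  'Doctorate/PhD', 'Masters', 'Masters',
                  'Bachelors (4 years)', 'Masters'],
})

# Keep only the three job titles of interest.
titles = ['Data Analyst', 'Data Engineer', 'Data Scientist']
ds_data = data[data['JobTitle'].isin(titles)]

# normalize='index' divides each row by its own total, so every row sums to 1;
# multiplying by 100 turns the fractions into percentages.
pct = pd.crosstab(ds_data['JobTitle'], ds_data['Education'],
                  normalize='index') * 100
print(pct.round(1))
```

Each row of `pct` is one job title and sums to 100, which matches what `calc_percentage` computes when it divides by `len(x)`.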

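If we use dict renaming (deprecated since pandas 0.20 and removed in pandas 1.0), we can first compute the total number of rows and then use it inside the lambda functions to get the percentage. Note that this divides by the row count of the whole filtered frame, so each cell is a share of all rows rather than of its own job title: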

``````ds_data = data[(data.JobTitle == 'Data Analyst') | (data.JobTitle == 'Data Engineer')
               | (data.JobTitle == 'Data Scientist')]
# .shape is a (rows, columns) tuple, so take its first element for the row count
ds_data_nrows = ds_data.shape[0]
agg_func = {'Education': {
    'Masters': lambda x: (sum(i == 'Masters' for i in x) / ds_data_nrows) * 100,
    'Bachelor': lambda x: (sum(i == 'Bachelors (4 years)' for i in x) / ds_data_nrows) * 100,
    'None': lambda x: (sum(i == 'None (no degree completed)' for i in x) / ds_data_nrows) * 100,
    'Doctorates': lambda x: (sum(i == 'Doctorate/PhD' for i in x) / ds_data_nrows) * 100,
    'Associates': lambda x: (sum(i == 'Associates (2 years)' for i in x) / ds_data_nrows) * 100}}
function = ds_data.groupby(['JobTitle']).agg(agg_func).reset_index()
function.columns = function.columns.droplevel(0)
function
``````
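Which denominator to use depends on what the percentage should mean: dividing by the group length gives a share within each job title, while dividing by the total row count gives a share of the whole filtered frame. The sketch below compares the two on a tiny made-up frame (the data and the names `within_group` and `of_total` are assumptions for illustration), using `pd.crosstab` for the counts instead of the nested dict.

```python
import pandas as pd

# Hypothetical sample data standing in for the filtered `ds_data` frame.
ds_data = pd.DataFrame({
    'JobTitle': ['Data Analyst', 'Data Analyst',
                 'Data Scientist', 'Data Scientist'],
    'Education': ['Masters', 'Bachelors (4 years)', 'Masters', 'Masters'],
})

# Raw counts: one row per job title, one column per education level.
counts = pd.crosstab(ds_data['JobTitle'], ds_data['Education'])

# Share of each job title's own rows (each row of the table sums to 100).
within_group = counts.div(counts.sum(axis=1), axis=0) * 100

# Share of all rows in ds_data (the whole table sums to 100).
of_total = counts / len(ds_data) * 100

print(within_group)
print(of_total)
```

With this data, Masters among Data Scientists is 100% within the group but 50% of the total, so the two tables answer different questions.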