df=df.groupby('Dates')['OrderQuantity'].sum() error

In my data set, I have a list of dates in one column and quantities in another. Some of the dates appear more than once, representing different orders made on the same day. I want to find the sum of the quantities ordered on each day, so that each date shows up in the Dates column once, with the total number of items purchased that day in the OrderQuantity column. I am currently using df=df.groupby('Dates')['OrderQuantity'].sum(), but the result copies the first sum it finds into every following row with a quantity > 0. Here is my code:

import pandas as pd

import numpy as np

df=pd.read_excel('stackoverflowexample.xlsx')

df=df.groupby('Dates')['OrderQuantity'].sum()

df.to_csv("materialrows.csv")

df=pd.read_csv("materialrows.csv")

array = np.zeros((11,2))

j=0
for i in df['Dates']:
    array[i][0] = i
    array[i][1] = df['OrderQuantity'][j]
    j+1

for i in range(1,15):
    if array[i][0] == 0:
        array[i][0] = array[i-1][0] + 1
    
x=pd.DataFrame(data = array, columns = ["Dates","OrderQuantity"])   

x=x.iloc[1:, :]
x=x['OrderQuantity']
print(x)

df=df.groupby('Dates')['OrderQuantity'].sum()
df.to_csv("materialrows.csv")

df=pd.read_csv("materialrows.csv")

array = np.zeros((11,2))

j=0
for i in df['Dates']:
    array[i][0] = i
    array[i][1] = df['OrderQuantity'][j]
    j+1

for i in range(1,15):
    if array[i][0] == 0:
        array[i][0] = array[i-1][0] + 1
        
y=pd.DataFrame(data = array, columns = ["Dates","OrderQuantity"])   
  
y=y.iloc[1:, :]

y=y['OrderQuantity']

print(y)

Here is what the 'stackoverflowexample' Excel file looks like:

Dates OrderQuantity
1     3
1     4
2     3 
3     8
4     1
5     2
6     6 
7     1
7     2
7     5
8     1
9     2
10    2

Here is the current result of my code:

1    7
2    7
3    7
4    7
5    7
6    7
7    7
8    7
9    7
10   7

Here is the result I want:

    1    7
    2    3
    3    8
    4    1
    5    2
    6    6
    7    8
    8    1
    9    2
    10   2

Any help would be greatly appreciated!

1 answer

  • answered 2021-07-27 17:18 KalaJatt

    This df=df.groupby('Dates')['OrderQuantity'].sum() returns a Series. Adding the as_index=False argument returns a DataFrame instead:

    df = df.groupby('Dates',as_index=False)['OrderQuantity'].sum()
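
To make the Series versus DataFrame difference concrete, here is a minimal sketch. It assumes the spreadsheet holds exactly the rows shown in the question and builds them inline rather than reading the Excel file; the names `as_series` and `as_frame` are only illustrative.

```python
import pandas as pd

# Sample data copied from the table in the question (built inline, not read from Excel).
df = pd.DataFrame({
    "Dates": [1, 1, 2, 3, 4, 5, 6, 7, 7, 7, 8, 9, 10],
    "OrderQuantity": [3, 4, 3, 8, 1, 2, 6, 1, 2, 5, 1, 2, 2],
})

# Default behaviour: the group keys become the index, so the result is a Series.
as_series = df.groupby("Dates")["OrderQuantity"].sum()

# as_index=False keeps 'Dates' as an ordinary column, so the result is a DataFrame.
as_frame = df.groupby("Dates", as_index=False)["OrderQuantity"].sum()

print(type(as_series).__name__)
print(type(as_frame).__name__)
print(as_frame)
```

Either form holds one total per date; the DataFrame form just avoids the round trip through a CSV file to get 'Dates' back as a column.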
    

    Try re-running the cells in your notebook.
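
Separately from the groupby call, the repeated 7s look consistent with the loop in the question: the line `j+1` computes a value and throws it away, so `j` stays at 0 and every row receives the first total. The sketch below is one possible fix under that assumption, not a confirmed diagnosis. The `totals` frame is a hypothetical stand-in for the grouped result, hard-coded from the expected output in the question. The second loop (`range(1,15)`) is left out, since it would index past the end of an 11-row array.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the grouped totals (values taken from the question).
totals = pd.DataFrame({
    "Dates": [1, 2, 3, 4, 5, 6, 7, 8, 9, 10],
    "OrderQuantity": [7, 3, 8, 1, 2, 6, 8, 1, 2, 2],
})

# Row 0 is unused; rows 1..10 are addressed by date, as in the question.
array = np.zeros((11, 2))

j = 0
for i in totals["Dates"]:
    array[i][0] = i
    array[i][1] = totals["OrderQuantity"].iloc[j]  # positional lookup of row j
    j += 1  # `j+1` alone would leave j unchanged

# Drop the unused row 0 and show the per-date totals.
print(array[1:, 1])
```

With the counter actually advancing, each date picks up its own total instead of the first one.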
