Calculate a weighted average over several columns with NA

I have a data frame like this one:

ID  duration1   duration2   total_duration  quantity1   quantity2
 1     5            2             7             3         1
 2     NA           4             4             3         4
 3     5            NA            5             2         NA

I would like to compute a weighted mean for each subject like this:

df$weighted_mean <- (df$duration1 * df$quantity1 + df$duration2 * df$quantity2) / df$total_duration

But because some values are NA, this command returns NA for those rows, and it is not very elegant.

The desired result is:

ID  duration1   duration2   total_duration  quantity1   quantity2   weighted_mean
 1     5            2             7             3         1          2.43
 2     NA           4             4             3         4          4
 3     5            NA            5             2         NA         2

Thanks in advance for the help

1 answer

  • answered 2022-05-07 07:59 jay.sf

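    You could exploit the na.rm= argument of rowSums: a product with a missing factor is NA, and rowSums then drops it from the row total.

    # cbind() gives one column per duration * quantity product;
    # rowSums(..., na.rm=TRUE) adds them per row, skipping NA products.
    transform(df, z=rowSums(cbind(duration1*quantity1, duration2*quantity2),
                            na.rm=TRUE)/total_duration)
    #   ID duration1 duration2 total_duration quantity1 quantity2        z
    # 1  1         5         2              7         3         1 2.428571
    # 2  2        NA         4              4         3         4 4.000000
    # 3  3         5        NA              5         2        NA 2.000000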
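
    If there are more than two duration/quantity pairs, the same idea can be wrapped in a small function. This is only a sketch: the helper name weighted_mean_by_row is made up for illustration, and it assumes the columns are named durationK and quantityK with matching numeric suffixes, as in the question.

    ```r
    # Rebuild the example data from the question.
    df <- data.frame(
      ID = 1:3,
      duration1 = c(5, NA, 5),
      duration2 = c(2, 4, NA),
      total_duration = c(7, 4, 5),
      quantity1 = c(3, 3, 2),
      quantity2 = c(1, 4, NA)
    )

    # Hypothetical helper: pairs each durationK column with quantityK.
    weighted_mean_by_row <- function(data, total = "total_duration") {
      # Matches duration1, duration2, ... but not total_duration.
      dur_cols <- grep("^duration[0-9]+$", names(data), value = TRUE)
      qty_cols <- sub("^duration", "quantity", dur_cols)
      # Element-wise products; a missing factor yields NA.
      products <- as.matrix(data[dur_cols]) * as.matrix(data[qty_cols])
      # NA products are skipped, so they contribute nothing to the row total.
      rowSums(products, na.rm = TRUE) / data[[total]]
    }

    df$weighted_mean <- weighted_mean_by_row(df)
    round(df$weighted_mean, 2)
    # [1] 2.43 4.00 2.00
    ```

    Note that a row in which every product is NA gets 0 rather than NA, because rowSums with na.rm=TRUE returns 0 for an all-NA row.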
    
