How to aggregate over multiple variables
I have some data which has the columns company, amount of donation and party.
I want to have a new list where for each company the sum of donated money, for each party, is listed.
So far I know how to aggregate the "amount" column, if I only have one company
df <- df %>% filter(company == "c1" ) %>% select(amount, party) test<-aggregate(df$amount, by=list(party=a$party), FUN=sum)
Is there a way not to use a loop function for every company?
Thanks for helping
To extend what Jonny commented - here's an example that may go into right direction:
library(tidyverse) df <- tibble(company = rep(letters[1:3],4), donation = runif(12, min = 1, max = 100), party = rep(LETTERS[25:26], 6)) df %>% group_by(company, party) %>% summarize(donation = sum(donation))
Use the package sqldf, dplyr
If I understood your question correctly, this is one way you could do it if there are not many companies.
I am calling the original table TB1 First assign each company a unique ID.
TB1$ID <- group_indices(TB1$Company)Columns in TB1 are Company, Donation, Party,ID
sqldf(" SELECT a.Party, CompanyOne.Donation, CompanyTwo.Donation FROM TB1 a LEFT OUTER JOIN (Select sum(Donation) as CompanyOneDonation, Party FROM TB1 a WHERE Company = (CompanyOne) GROUP BY PARTY, COMPANY) as CompanyOne ON a.ID = CompanyOne.ID LEFT OUTER JOIN (Select sum(Donation) as CompanyTwoDonation, Party FROM TB1 WHERE Company = (CompanyTwo) GROUP BY PARTY, COMPANY) as CompanyTwo ON CompanyTwo.ID = CompanyOne.ID GROUP BY a.Party ")
Please let me know if this works for you.