ggplot2, one barplot, multiple variables

I'm working on coding qualitative data from a survey and I'm trying to make one ggplot2 barplot.

The items are open-ended questions. For example, one question/item is 'what mental health services does your community provide?'.

Each question/item, such as that example, is a column in my data table. For each item, I created additional columns that code the original open-ended responses as dichotomous (yes/no) variables.

For example, for the question/item 'what mental health services does your community provide?', I created three additional columns:

'services provided by emergency departments', 'services provided by clinics', and 'services provided by schools'.

If a respondent endorsed one of these three sub-categories in their open-ended response, I coded that column as 'yes'; if not, as 'no'.

So I have five columns: one id, one containing the original open-ended response, and three sub-category columns coded as yes or no for each person.

df<-structure(list(id = 1:20, other_mh_services = c("school services and emergency room", 
"mental health clinic", "mental health clinic and schools services", 
"none", "mental health clinic", "school services and emergency room", 
"mental health clinic", "mental health clinic and schools services", 
"none", "mental health clinic", "school services and emergency room", 
"mental health clinic", "mental health clinic and schools services", 
"none", "mental health clinic", "school services and emergency room", 
"mental health clinic", "mental health clinic and schools services", 
"none", "mental health clinic"), school = c("yes", "no", "yes", 
"no", "no", "yes", "no", "yes", "no", "no", "yes", "no", "yes", 
"no", "no", "yes", "no", "yes", "no", "no"), er = c("yes", "no", 
"no", "no", "no", "yes", "no", "no", "no", "no", "yes", "no", 
"no", "no", "no", "yes", "no", "no", "no", "no"), clinic = c("no", 
"yes", "yes", "no", "yes", "no", "yes", "yes", "no", "yes", "no", 
"yes", "yes", "no", "yes", "no", "yes", "yes", "no", "yes")), class = "data.frame", row.names = c(NA, 
-20L))

e.g.

ID  Item1. Other mental health services?       Item1.school  Item1.ER  Item1.clinic
1   school services and emergency room         yes           yes       no
2   mental health clinic                       no            no        yes
3   mental health clinic and schools services  yes           no        yes
4   none                                       no            no        no

I'd like to create one barplot, or histogram, which has each item subcategory (columns 3-5) on the x-axis, and on the y-axis, the number of people who responded 'yes'.

[Image: example plot]

Any suggestions on how to do that in ggplot2?

1 answer

  • answered 2021-05-03 19:54 Leo Ohyama

    Without a proper reproducible dataset I can only make an educated guess on what you want.

    I made up this data, in which there are two possible answers to the question (ER or School):

    # Simulated responses: 1 = yes, 0 = no (values are random, so counts vary between runs)
    df <- data.frame(ID = 1:100, question = 1:100,
                     School = sample(c(0, 1), replace = TRUE, size = 100),
                     ER = sample(c(0, 1), replace = TRUE, size = 100))
    

    I also created a small data frame with every combination of these answers (four in total) and labelled them A, B, C, and D. I also added a column that encodes each combination as a string of 1s and 0s (1 being yes, 0 being no), e.g. "1 0".

    # One row per distinct (School, ER) pair, a letter label, and a "School ER" key string
    possible_combinations <- unique(df[, c("School", "ER")])
    possible_combinations$combo_type <- c("A", "B", "C", "D")
    possible_combinations$combos <- paste(possible_combinations$School, possible_combinations$ER)
    

    I want to mark each row of the original dataset with the combination of answers it contains, based on the possible combinations above, so I make an empty column in the original dataset:

    df$combo<-NA
    

    Now I simply run a for loop that compares each row's combination of answers with the previously established combinations and fills in this empty column with A, B, C, or D.

    for (i in seq_len(nrow(df))) {
      # Build this row's "School ER" key and look up its letter label
      combo <- paste(df$School[i], df$ER[i])
      df$combo[i] <- possible_combinations$combo_type[possible_combinations$combos == combo]
    }
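
    The same lookup can also be done without a loop. The following is only a sketch under a few assumptions: the data frame is a stand-in simulated with a fixed seed, and the labels are taken from `LETTERS` so the code still runs if fewer than four combinations happen to occur. `match()` returns, for each row's key, the position of that key in the lookup table.

    ```r
    # Stand-in data with the same shape as above (seed chosen arbitrarily)
    set.seed(42)
    df <- data.frame(ID = 1:100,
                     School = sample(c(0, 1), size = 100, replace = TRUE),
                     ER = sample(c(0, 1), size = 100, replace = TRUE))

    # Lookup table: one row per distinct (School, ER) pair
    possible_combinations <- unique(df[, c("School", "ER")])
    possible_combinations$combo_type <- LETTERS[seq_len(nrow(possible_combinations))]
    possible_combinations$combos <- paste(possible_combinations$School,
                                          possible_combinations$ER)

    # Vectorised lookup: build every row's key at once, then index the labels
    row_keys <- paste(df$School, df$ER)
    df$combo <- possible_combinations$combo_type[
      match(row_keys, possible_combinations$combos)
    ]
    ```

    This gives the same `combo` column as the loop, and `df$combo <- NA` is no longer needed beforehand.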
    

    Now it's just a matter of plotting with bar plots using ggplot after summarising the total counts of each answer based on unique combinations:

    library(dplyr)    # group_by(), summarise(), %>%
    library(ggplot2)

    df %>% group_by(combo) %>%
      summarise(total = n()) %>%
      ggplot(aes(x = combo, y = total)) +
      geom_col(width = 0.5) +
      theme_bw()
    

    [Image: final plot]
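
    If what is wanted is instead one bar per yes/no column, with the number of 'yes' responses on the y-axis as the question describes, a minimal sketch could look like the following. It assumes the question's column names (`school`, `er`, `clinic`) and rebuilds the posted data compactly with `rep()`, since the 20 rows are a five-row pattern repeated four times. The names `yes_counts`, `counts`, and `p` are just illustrative.

    ```r
    library(ggplot2)

    # Compact stand-in for the question's data (five-row pattern repeated four times)
    df <- data.frame(
      id     = 1:20,
      school = rep(c("yes", "no", "yes", "no", "no"), times = 4),
      er     = rep(c("yes", "no", "no", "no", "no"), times = 4),
      clinic = rep(c("no", "yes", "yes", "no", "yes"), times = 4)
    )

    # Comparing to "yes" gives a logical matrix; colSums() counts TRUE per column
    yes_counts <- colSums(df[, c("school", "er", "clinic")] == "yes")

    # One row per sub-category, ready for plotting
    counts <- data.frame(category = names(yes_counts),
                         n_yes = as.integer(yes_counts))

    # One bar per sub-category, bar height = number of 'yes' responses
    p <- ggplot(counts, aes(x = category, y = n_yes)) +
      geom_col(width = 0.5) +
      labs(x = "Sub-category", y = "Number of 'yes' responses") +
      theme_bw()
    p
    ```

    With the data as posted, this gives 8 for school, 4 for er, and 12 for clinic. The same reshaping could also be done with `tidyr::pivot_longer()` if the data is already being handled in a dplyr pipeline.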