# Subsetting for proportional representation in R

I can't wrap my tiny brain around this one. One data frame (`df`) contains observations, each with a gender and an age bracket. I'm trying to write a function that returns a subset of its rows in which each age-gender combination appears in a proportion roughly equal to the corresponding value in the `props` data frame. Ideally, the function will trim as few observations as possible. The results can be approximate: each group's share of the output should be within 5% of the desired proportion at a minimum, and the deviation should generally be as small as possible.

```
ages <- c("18-29", "30-39", "40-49", "50-59", "60+")
genders <- c("M", "F")

set.seed(101)
df <- data.frame("id" = paste0("p", 1:500),
                 "gender" = sample(genders, replace = TRUE, size = 500),
                 "age" = sample(ages, replace = TRUE, size = 500))

# "gender" is recycled (M, F, M, ...); with five age brackets, each
# age-gender pair still appears exactly once across the ten rows.
props <- data.frame("age" = c(ages, ages),
                    "gender" = genders,
                    "pcts" = c(.0835, .1145, .1145, .1145, .073,
                               .0835, .1145, .1145, .1145, .073))

select_max <- function(df, props) {

  ....

  return(subset)
}
```
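For reference, here is a minimal sketch of how the "within 5%" criterion could be checked for any candidate subset. The helper name `check_props` and its `tol` argument are hypothetical, and reading "within 5%" as a relative deviation from the target (rather than 5 percentage points) is an assumption. The data setup is repeated so the snippet runs on its own.

```r
ages <- c("18-29", "30-39", "40-49", "50-59", "60+")
genders <- c("M", "F")
set.seed(101)
df <- data.frame(id = paste0("p", 1:500),
                 gender = sample(genders, replace = TRUE, size = 500),
                 age = sample(ages, replace = TRUE, size = 500))
props <- data.frame(age = c(ages, ages), gender = genders,
                    pcts = c(.0835, .1145, .1145, .1145, .073,
                             .0835, .1145, .1145, .1145, .073))

# Hypothetical helper: compare each group's share in `sub` with its target.
# Assumption: "within 5%" means relative error <= tol.
check_props <- function(sub, props, tol = 0.05) {
  # Count rows per age-gender combination present in `sub`
  counts <- as.data.frame(table(age = sub$age, gender = sub$gender),
                          responseName = "n", stringsAsFactors = FALSE)
  out <- merge(props, counts, by = c("age", "gender"), all.x = TRUE)
  out$n[is.na(out$n)] <- 0                    # combinations absent from `sub`
  out$observed <- out$n / nrow(sub)           # share of each group in `sub`
  out$rel_err <- abs(out$observed - out$pcts) / out$pcts
  out$ok <- out$rel_err <= tol
  out
}

res <- check_props(df, props)   # the untrimmed data, as a baseline
print(res)
```

Running it on the full `df` shows how far the raw sample is from the targets before anything is trimmed.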

I experimented with solutions using least common multiples and greatest common divisors, but these fell apart when the proportions didn't work nicely together. I'm considering a solution which adds and subtracts rows one at a time until it gets close enough to the desired proportions, but I feel there must be some more elegant solution. All help is appreciated. This is a fun one, for sure.
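One non-iterative idea, sketched here under assumptions rather than offered as a definitive answer: if the output has `N` rows, group `g` must supply about `N * pcts[g]` of them, so `N` can be at most `n[g] / pcts[g]` for every group. Taking the minimum of those ratios gives the largest feasible total, and each group is then sampled down to `floor(N * pcts[g])` rows. This assumes every `pcts` value is positive and every combination in `df` is listed in `props`. It also targets the proportions as exactly as rounding allows, so it may trim slightly more rows than a method that exploits the full 5% tolerance.

```r
ages <- c("18-29", "30-39", "40-49", "50-59", "60+")
genders <- c("M", "F")
set.seed(101)
df <- data.frame(id = paste0("p", 1:500),
                 gender = sample(genders, replace = TRUE, size = 500),
                 age = sample(ages, replace = TRUE, size = 500))
props <- data.frame(age = c(ages, ages), gender = genders,
                    pcts = c(.0835, .1145, .1145, .1145, .073,
                             .0835, .1145, .1145, .1145, .073))

# Sketch only: assumes all pcts > 0 and that props covers every group in df.
select_max <- function(df, props) {
  # One key per row / per target group, e.g. "18-29 M"
  key_df    <- paste(df$age, df$gender)
  key_props <- paste(props$age, props$gender)

  # Observed count per group, in the same order as the rows of props
  n <- as.vector(table(factor(key_df, levels = key_props)))

  # Largest total N such that N * pcts <= n holds for every group
  N <- min(n / props$pcts)

  # Rows to keep per group; the small epsilon guards against
  # floating-point results like 49.9999999 for the limiting group
  k <- floor(N * props$pcts + 1e-9)

  # Draw k[i] row indices at random from group i
  keep <- unlist(lapply(seq_along(k), function(i) {
    idx <- which(key_df == key_props[i])
    idx[sample.int(length(idx), k[i])]
  }))
  df[sort(keep), ]
}

sub <- select_max(df, props)
nrow(sub)                                   # rows retained
round(table(sub$age, sub$gender) / nrow(sub), 4)   # achieved proportions
```

Because the per-group counts are rounded down, the achieved shares differ slightly from `pcts`; the gap shrinks as the retained total grows.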