Create a column that identifies if all conditions are met

I have a data frame with numeric values. I want to check, for each row, whether it meets certain criteria, and create a new column that is TRUE if all criteria are met. Example criteria: Current.eGFR is greater than or equal to 15, or less than 60, and Decline.12month is less than or equal to -4.

This is the head() of the data frame:

     ID Current.eGFR Decline.12month Decline.24.month
1   13         18.0            -1.3             -8.9
2   19         17.6             1.5             -2.3
3 1063         20.1            -5.3            -10.4
4  700         28.0            -0.2             -2.7
5 1518         14.6           -14.7            -45.2
6  197         19.0           -13.0             -5.1

3 answers

  • answered 2019-12-07 23:05 akrun

    One option is to use the comparison operators (>=, <, <=) combined with &

    df1$newcol <- with(df1, (Current.eGFR >= 15 & Current.eGFR < 60) &
                   Decline.12month <= -4)
    df1$newcol
    #[1] FALSE FALSE  TRUE FALSE FALSE  TRUE
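
    When there are more than a few criteria, the same idea can be written by collecting each condition in a list and folding the list with &. A minimal sketch on the question's data; the names conds and all_met are made up for illustration:

    ```r
    # Same six rows as in the question (only the columns the criteria use).
    df1 <- data.frame(
      ID = c(13L, 19L, 1063L, 700L, 1518L, 197L),
      Current.eGFR = c(18, 17.6, 20.1, 28, 14.6, 19),
      Decline.12month = c(-1.3, 1.5, -5.3, -0.2, -14.7, -13)
    )

    # One logical vector per criterion (hypothetical names).
    conds <- with(df1, list(
      egfr_min = Current.eGFR >= 15,
      egfr_max = Current.eGFR < 60,
      decline  = Decline.12month <= -4
    ))

    # Fold the list with element-wise AND: TRUE only where every criterion holds.
    df1$all_met <- Reduce(`&`, conds)
    df1$all_met
    #[1] FALSE FALSE  TRUE FALSE FALSE  TRUE
    ```

    Adding or removing a criterion then only means editing the list, not the combined expression.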
    

    data

    df1 <- structure(list(ID = c(13L, 19L, 1063L, 700L, 1518L, 197L),
    Current.eGFR = c(18, 
    17.6, 20.1, 28, 14.6, 19), Decline.12month = c(-1.3, 1.5, -5.3, 
    -0.2, -14.7, -13), Decline.24.month = c(-8.9, -2.3, -10.4, -2.7, 
    -45.2, -5.1)), class = "data.frame", row.names = c("1", "2", 
    "3", "4", "5", "6"))
    

  • answered 2019-12-07 23:30 G. Grothendieck

    First note that we need Current.eGFR >= 15 and Current.eGFR < 60, since every number would satisfy the condition if it were really "or". Compare:

    1:70 >= 15 | 1:70 < 60  # bad - result is *always* TRUE
    ##  [1] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
    ## [16] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
    ## [31] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
    ## [46] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
    ## [61] TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE TRUE
    
    1:70 >= 15 & 1:70 < 60  # good
    ##  [1] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
    ## [13] FALSE FALSE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
    ## [25]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
    ## [37]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE
    ## [49]  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE  TRUE FALSE
    ## [61] FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE FALSE
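
    The range test can also be wrapped in a small helper so that the & cannot be mistyped as |. This is only a sketch: in_range is a hypothetical name, not a base R function. Note that dplyr::between() is not a drop-in replacement here, because it includes both endpoints while the question excludes 60.

    ```r
    # Hypothetical helper: TRUE where lower <= x < upper (upper bound excluded).
    in_range <- function(x, lower, upper) {
      x >= lower & x < upper
    }

    in_range(c(14.6, 15, 59.9, 60), 15, 60)
    ## [1] FALSE  TRUE  TRUE FALSE
    ```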
    

    Making that correction, use transform to create the new column.

    transform(mydf, ok = Current.eGFR >= 15 & Current.eGFR < 60 & Decline.12month <= -4)
    

    giving:

        ID Current.eGFR Decline.12month Decline.24.month    ok
    1   13         18.0            -1.3             -8.9 FALSE
    2   19         17.6             1.5             -2.3 FALSE
    3 1063         20.1            -5.3            -10.4  TRUE
    4  700         28.0            -0.2             -2.7 FALSE
    5 1518         14.6           -14.7            -45.2 FALSE
    6  197         19.0           -13.0             -5.1  TRUE
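
    One caveat that the question does not raise: if any of the columns can contain NA, the combined condition is NA rather than FALSE for those rows (unless another criterion is already FALSE). Below is a sketch on made-up values showing one way to force a strict TRUE/FALSE column. Whether missing data should count as "not met" is an assumption to check against your own rules.

    ```r
    # Made-up values for illustration; rows 2 and 3 each have a missing input.
    x <- data.frame(Current.eGFR    = c(20, NA, 10, 30),
                    Decline.12month = c(-5, -5, NA, -1))

    ok <- with(x, Current.eGFR >= 15 & Current.eGFR < 60 & Decline.12month <= -4)
    ok
    ## [1]  TRUE    NA FALSE FALSE

    # Treat "cannot tell" as "not met".
    ok_strict <- ok & !is.na(ok)
    ok_strict
    ## [1]  TRUE FALSE FALSE FALSE
    ```

    Row 3 is already FALSE before the fix because FALSE & NA evaluates to FALSE in R.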
    

    Note

    The input mydf in reproducible form is assumed to be as follows.

    Lines <- "     ID Current.eGFR Decline.12month Decline.24.month
    1   13         18.0            -1.3             -8.9
    2   19         17.6             1.5             -2.3
    3 1063         20.1            -5.3            -10.4
    4  700         28.0            -0.2             -2.7
    5 1518         14.6           -14.7            -45.2
    6  197         19.0           -13.0             -5.1"
    mydf <- read.table(text = Lines)
    

  • answered 2019-12-08 07:35 MalditoBarbudo

    The tidyverse way, just for completeness:

    library(dplyr)
    #> 
    #> Attaching package: 'dplyr'
    #> The following objects are masked from 'package:stats':
    #> 
    #>     filter, lag
    #> The following objects are masked from 'package:base':
    #> 
    #>     intersect, setdiff, setequal, union
    
    df1 <- structure(list(ID = c(13L, 19L, 1063L, 700L, 1518L, 197L),
    Current.eGFR = c(18, 
    17.6, 20.1, 28, 14.6, 19), Decline.12month = c(-1.3, 1.5, -5.3, 
    -0.2, -14.7, -13), Decline.24.month = c(-8.9, -2.3, -10.4, -2.7, 
    -45.2, -5.1)), class = "data.frame", row.names = c("1", "2", 
    "3", "4", "5", "6"))
    
    df1 %>%
      mutate(
        # the comparison already yields TRUE/FALSE, so no if_else() wrapper is needed
        conditions_met = Current.eGFR >= 15 & Current.eGFR < 60 & Decline.12month <= -4
      )
    #>     ID Current.eGFR Decline.12month Decline.24.month conditions_met
    #> 1   13         18.0            -1.3             -8.9          FALSE
    #> 2   19         17.6             1.5             -2.3          FALSE
    #> 3 1063         20.1            -5.3            -10.4           TRUE
    #> 4  700         28.0            -0.2             -2.7          FALSE
    #> 5 1518         14.6           -14.7            -45.2          FALSE
    #> 6  197         19.0           -13.0             -5.1           TRUE
    

    Created on 2019-12-08 by the reprex package (v0.3.0)
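
    If it helps to see which criterion failed for a given row, each test can be kept as its own column before combining them. A sketch along the lines of the pipeline above, assuming dplyr is installed; the column names egfr_in_range and fast_decline are invented for illustration:

    ```r
    library(dplyr)

    # Same six rows as in the question (only the columns the criteria use).
    df1 <- data.frame(
      ID = c(13L, 19L, 1063L, 700L, 1518L, 197L),
      Current.eGFR = c(18, 17.6, 20.1, 28, 14.6, 19),
      Decline.12month = c(-1.3, 1.5, -5.3, -0.2, -14.7, -13)
    )

    res <- df1 %>%
      mutate(
        egfr_in_range  = Current.eGFR >= 15 & Current.eGFR < 60,  # hypothetical name
        fast_decline   = Decline.12month <= -4,                   # hypothetical name
        conditions_met = egfr_in_range & fast_decline             # uses the two columns above
      )

    res$conditions_met
    #> [1] FALSE FALSE  TRUE FALSE FALSE  TRUE
    ```

    Row 5, for example, shows fast_decline as TRUE but egfr_in_range as FALSE, which makes it clear that it fails only on the eGFR range.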