Search and mass-convert character columns to dates in R with dplyr without explicit specification

I have a messy data frame with a thousand variables and want to automate the conversion of specific columns to dates without having to specify each column explicitly. All columns to convert have "Date" in their name. Most are in mdy format, but some are dmy. A very small proportion of values (<0.1%) contain errors or malformed dates.

I tried:

df %>% select(contains("Date")) %>% as_Date() #Does not work
df %>%  select(contains("Date"))  %>% mdy() #selecting only the columns with dates, does not work
df %>% select(contains("Date")) %>% parse_date_time( c("mdy", "dmy")) #also does not work

I think I'm missing something fundamental.

2 answers

  • answered 2021-10-23 11:35 Chris Ruehlemann

    Here's a solution based on lubridate:

    Toy data:

    df <- data.frame(Date1 = c("01-Mar-2015", "31-01-2012", "15/01/1999"), 
                     Var_Date = c("01-02-2018", "01/08/2016", "17-09-2007"), 
                     More_Dates = c("27/11/2009", "22-Jan-2013", "20-Nov-1987"))
    
    # define formats:
    formats <- c("%d-%m-%Y", "%d/%m/%Y", "%d-%b-%Y")
    

    A dplyr solution:

    library(dplyr)
    library(lubridate)
    df %>% 
      mutate(across(contains("Date"), 
                    ~ parse_date_time(., orders = formats))) %>%
      mutate(across(contains("Date"),
                    ~ format(., "%d-%m-%Y")))
           Date1   Var_Date More_Dates
    1 01-03-2015 01-02-2018 27-11-2009
    2 31-01-2012 01-08-2016 22-01-2013
    3 15-01-1999 17-09-2007 20-11-1987
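
    Note that format() turns the parsed values back into character strings. If real Date columns are wanted, and the rare malformed entries mentioned in the question should be counted rather than silently lost, something like the following sketch might help. The toy data frame, the "not a date" entry, and the names toy, converted and failures are assumptions made for illustration only.

    ```r
    library(dplyr)
    library(lubridate)

    # Hypothetical toy data: one deliberately malformed entry and one non-date column
    toy <- data.frame(Date1    = c("01-Mar-2015", "31-01-2012", "15/01/1999"),
                      Var_Date = c("01-02-2018", "01/08/2016", "not a date"),
                      Other    = c("a", "b", "c"))

    formats <- c("%d-%m-%Y", "%d/%m/%Y", "%d-%b-%Y")

    # across() applies the parser column by column; as.Date() keeps a Date class
    # instead of formatting back to character. quiet = TRUE suppresses the
    # "failed to parse" warning, unparseable entries become NA.
    converted <- toy %>%
      mutate(across(contains("Date"),
                    ~ as.Date(parse_date_time(.x, orders = formats, quiet = TRUE))))

    # Per column: entries that were present in the input but could not be parsed
    date_names <- names(select(toy, contains("Date")))
    failures <- sapply(date_names,
                       function(nm) sum(is.na(converted[[nm]]) & !is.na(toy[[nm]])))
    ```

    Inspecting failures afterwards shows which columns still contain unparsed entries, so they can be checked by hand.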
    

    A base R solution:

    library(lubridate)
    # "formats" is the vector of formats defined above
    df[,grepl("Date", names(df))] <- apply(df[,grepl("Date", names(df))], 2, 
                      function(x) format(parse_date_time(x, orders = formats), "%d-%m-%Y"))
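
    Because apply() works on a matrix, the columns it returns here are character. If Date columns are preferred in base R, a variant with lapply() could look like the sketch below. The toy data frame and the names toy and date_cols are assumptions for illustration, not part of the code above.

    ```r
    library(lubridate)

    # Hypothetical toy data with one column that must stay untouched
    toy <- data.frame(id         = 1:3,
                      Date1      = c("01-Mar-2015", "31-01-2012", "15/01/1999"),
                      More_Dates = c("27/11/2009", "22-Jan-2013", "20-Nov-1987"))

    formats <- c("%d-%m-%Y", "%d/%m/%Y", "%d-%b-%Y")

    # Logical index of the columns whose name contains "Date"
    date_cols <- grepl("Date", names(toy))

    # lapply() returns a list of vectors, so each column keeps its Date class
    toy[date_cols] <- lapply(toy[date_cols],
                             function(x) as.Date(parse_date_time(x, orders = formats)))
    ```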
    

  • answered 2021-10-23 13:22 akrun

    We could use parse_date from parsedate

    library(parsedate)
    library(dplyr)
    df %>%
        mutate(across(everything(), parse_date))
           Date1   Var_Date More_Dates
    1 2015-03-01 2018-01-02 2009-11-27
    2 2012-01-31 2016-01-08 2013-01-22
    3 1999-01-15 2007-09-17 1987-11-20
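
    In the output above, ambiguous values such as "01-02-2018" are read month first, while entries that cannot be month first ("31-01-2012") are read day first. If that rule should be made explicit with lubridate, as the question's "mostly mdy, sometimes dmy" data suggests, one possible sketch is to try mdy() and fall back to dmy() only where it fails. The helper name parse_mdy_then_dmy and the toy data are hypothetical.

    ```r
    library(dplyr)
    library(lubridate)

    # Hypothetical helper: month-first parse, day-first only where that gives NA
    parse_mdy_then_dmy <- function(x) {
      coalesce(mdy(x, quiet = TRUE), dmy(x, quiet = TRUE))
    }

    # Hypothetical toy data, including one entry that matches neither order
    toy <- data.frame(Some_Date = c("01-02-2018", "31-01-2012", "15/01/1999", "garbage"),
                      Other     = c("a", "b", "c", "d"))

    converted <- toy %>%
      mutate(across(contains("Date"), parse_mdy_then_dmy))
    ```

    Entries that fit neither order stay NA. Values that are valid in both orders are always taken as mdy, so this only makes sense when mdy really is the dominant format.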
    

    data

    df <- structure(list(Date1 = c("01-Mar-2015", "31-01-2012", "15/01/1999"
    ), Var_Date = c("01-02-2018", "01/08/2016", "17-09-2007"), More_Dates = c("27/11/2009", 
    "22-Jan-2013", "20-Nov-1987")),
     class = "data.frame", row.names = c(NA, 
    -3L))
    
