Dealing with time formats beyond 24 hours and doing some maths

In my data.table, I have a character column representing time, and I need to run some tests and do some maths on this column.

I chose lubridate::hms() to convert the character values because some times are greater than 24 hours, e.g. "26:35:20".

Here is a reprex and the first steps:

library(data.table)
library(lubridate)

foo <- data.table(id = 1:5,
                  mytime = c("03:45:12", "17:56:00", "26:32:63", "09:00:00", "35:05:55"))

foo[, hmstime := hms(mytime)]

This:

testingtime <- hms("12:00:00")
addingtime <- hms("24:00:00")

testingtime + hours(24) 
testingtime + addingtime

And this:

foo[, addtime := hmstime + addingtime]

foo[hmstime < testingtime, "test1" := "ok"]

run fine.

But this doesn't run:

foo[hmstime < testingtime, "test2" := hmstime + addingtime]

And I can't seem to find out why... Maybe I was on the wrong track from the start? But I can't find another way to achieve this.

Many thanks!

1 answer

  • answered 2021-10-23 13:14 r2evans

    Before that call, test2 does not exist. A conditional assignment to a new column normally fills all the other rows with NA, but the data returned by that call is not so easily worked into that flow:

    ### non-assignment, just evaluating
    dput(foo[hmstime < testingtime, hmstime + addingtime])
    # new("Period", .Data = c(12, 0, 12, 0, 12), year = c(0, 0, 0, 
    # 0, 0), month = c(0, 0, 0, 0, 0), day = c(0, 0, 0, 0, 0), hour = c(27, 
    # 41, 50, 33, 59), minute = c(45, 56, 32, 0, 5))
    

    My guess (without diving into data.table-internals) is that it doesn't know how to work a new(...) object interspersed with NAs into the new column. Admittedly, it's not something I do frequently, either.
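    To see what kind of object is involved, the structure of a Period can be inspected directly. This is only a small sketch using base R's isS4() and slotNames() on a made-up two-element vector (the name p is illustrative); it shows the layout of the object, not what data.table does with it internally:

    ```r
    library(lubridate)

    # Hypothetical two-element example, not taken from foo
    p <- hms(c("03:45:12", "26:32:10"))

    # A Period is an S4 object: the seconds sit in the underlying numeric
    # vector (.Data) and the other units in separate slots of the same length.
    isS4(p)
    slotNames(p)
    p@hour    # hours component of each element
    p@.Data   # seconds component of each element
    ```

    Any code that subsets only the underlying vector, and not the other slots along with it, would leave the slots with mismatched lengths, which is consistent with the guess above.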

    Workaround: assign the new value to all rows, then NA-out the remainder.

    foo[, "test2" := hmstime + addingtime][hmstime >= testingtime, test2 := NA]
    foo
    #       id   mytime     hmstime     addtime  test1       test2
    #    <int>   <char>    <Period>    <Period> <char>    <Period>
    # 1:     1 03:45:12  3H 45M 12S 27H 45M 12S     ok 27H 45M 12S
    # 2:     2 17:56:00  17H 56M 0S  41H 56M 0S   <NA>        <NA>
    # 3:     3 26:32:63 26H 32M 63S 50H 32M 63S   <NA>        <NA>
    # 4:     4 09:00:00    9H 0M 0S   33H 0M 0S     ok   33H 0M 0S
    # 5:     5 35:05:55  35H 5M 55S  59H 5M 55S   <NA>        <NA>
    

    I wanted to suggest prefilling with a throw-away Period value (such as NA), but that fails too:

    foo[, test2 := hmstime[NA]][hmstime < testingtime, test2 := hmstime + addingtime]
    # Error in `[.data.table`(foo[, `:=`(test2, hmstime[NA])], hmstime < testingtime,  : 
    #   Supplied 5 items to be assigned to 2 items of column 'test2'. If you wish to 'recycle' the RHS please use rep() to make this intent clear to readers of your code.
    

    suggesting that the issue really is in subsetting the new(...) object to fit into the column. Note that the first workaround works because it assigns a plain NA into an existing column; it does not try to fit a new(...) object into a column.

    This might warrant a bug report or feature request for data.table; over to you.


    Another workaround that is much closer to the canonical data.table way of doing things, thanks @Frank:

    foo[hmstime < testingtime, test3 := hmstime[.I] + addingtime]
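    If keeping a Period column is not essential, one more option (not part of the suggestions above, and only a sketch) is to store the times as plain numeric seconds with lubridate's period_to_seconds(), so that the conditional := works on an ordinary numeric column. The names secs, secs2, limit and offset are made up for illustration:

    ```r
    library(data.table)
    library(lubridate)

    foo <- data.table(id = 1:5,
                      mytime = c("03:45:12", "17:56:00", "26:32:63", "09:00:00", "35:05:55"))

    # Store the time as plain numeric seconds instead of a Period.
    foo[, secs := period_to_seconds(hms(mytime))]

    limit  <- 12 * 3600   # same threshold as testingtime ("12:00:00")
    offset <- 24 * 3600   # same offset as addingtime ("24:00:00")

    # Conditional assignment on a numeric column; the other rows get NA.
    foo[secs < limit, secs2 := secs + offset]

    # Convert back to a Period only when a readable form is needed.
    # seconds_to_period() normalises its result, so the display may differ
    # from the "27H 45M 12S" style shown above.
    seconds_to_period(foo$secs2[!is.na(foo$secs2)])
    ```

    The trade-off is that the column no longer prints as hours, minutes and seconds, but comparisons and arithmetic stay simple.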
    
