Dates - Times


Dates and times deserve their own page because they behave differently from the data types we’ve seen so far.

Packages

library(tidyverse)
library(lubridate)
library(dplyr)     # for mutate(); already attached by library(tidyverse)

In R, there are three types of data that refer to an instant in time:

  • A date (“2016-08-16”)
  • A time within a day (“20:11:59 UTC”)
  • And a date-time. This is a date plus a time (“2018-03-31 18:15:48 UTC”)

The time is given in UTC, which stands for Coordinated Universal Time. This is the primary standard by which the world regulates clocks and time.

Dates are stored and represented as an object of the Date class

  • Dates are stored internally as the number of days since 1970-01-01
  • Times are represented by the POSIXct or the POSIXlt class
  • Times are stored internally as the number of seconds since 1970-01-01 for POSIXct
  • Times are stored as a list of components (seconds, minutes, hours, ...) for POSIXlt
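A minimal sketch of these internal representations, using only base R. The variable names d, ct, and lt are illustrative, and the time zone is fixed to UTC so the numbers are reproducible.

```r
# A Date is a count of days since 1970-01-01
d <- as.Date("1970-01-10")
unclass(d)            # 9

# A POSIXct is a count of seconds since 1970-01-01 00:00:00 UTC
ct <- as.POSIXct("1970-01-02 00:00:00", tz = "UTC")
as.numeric(ct)        # 86400

# A POSIXlt is a list of components (sec, min, hour, ...)
lt <- as.POSIXlt(ct)
lt$hour               # 0
is.list(unclass(lt))  # TRUE
```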

Lubridate


ymd

When you run ymd(), R returns the date in yyyy-mm-dd format. The variants work the same way for any order of the components. For example, mdy() reads month, day, and year, and R still returns the date in yyyy-mm-dd format.

These functions also take unquoted numbers and convert them into the yyyy-mm-dd format.

ymd("20210120")   # also works unquoted: ymd(20210120)
[1] "2021-01-20"
ymd("2021-01-20") 
[1] "2021-01-20"

mdy

mdy("January 20th, 2021")
[1] "2021-01-20"

dmy

dmy() reads day, month, and year. R still returns the date in yyyy-mm-dd format.

dmy("20-Jan-2021") 
[1] "2021-01-20"

Dates


today

To get the current date, run the today() function. The date appears as year, month, and day.

today()
[1] "2024-11-01"

now

To get the current date-time you can run the now() function. Note that the time appears to the nearest second.

now()
[1] "2024-11-01 15:35:43 CDT"

When working with R, there are three ways you are likely to create dates and date-times:

  • From a string
  • From individual date-time components
  • From an existing date/time object

R creates dates in the standard yyyy-mm-dd format by default.
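Most of this page deals with strings. As a hedged sketch of the second route, lubridate's make_date() and make_datetime() build a value from separate numeric components; the values and the names d and dt below are made up for illustration.

```r
library(lubridate)

# Build a Date from year, month, and day
d <- make_date(year = 2021, month = 1, day = 20)
d    # "2021-01-20"

# Build a date-time from year, month, day, hour, minute, second
# (make_datetime() uses UTC unless tz is given)
dt <- make_datetime(2021, 1, 20, 20, 11, 59)
dt   # "2021-01-20 20:11:59 UTC"
```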

Convert


Integer to Date

date_integer <- c(20120101, 20120104, 20120107, 20120110, 20120113, 20120116, 
                  20120119, 20120122)
date_integer
[1] 20120101 20120104 20120107 20120110 20120113 20120116 20120119 20120122

as.Date

  • The first option is to use as.Date()
as.Date(as.character(date_integer), "%Y%m%d")
[1] "2012-01-01" "2012-01-04" "2012-01-07" "2012-01-10" "2012-01-13"
[6] "2012-01-16" "2012-01-19" "2012-01-22"

ymd

  • Or you can use ymd()
ymd(date_integer)
[1] "2012-01-01" "2012-01-04" "2012-01-07" "2012-01-10" "2012-01-13"
[6] "2012-01-16" "2012-01-19" "2012-01-22"

String to Date

as.Date

Dates can be coerced from a character string using as.Date(). The result prints like a character string, but it is a Date object, not a string.

# Coerce a character string to a 'Date' object
x <- as.Date("1970-01-01")
x
[1] "1970-01-01"
class(x)
[1] "Date"

unclass

You can see the internal representation of a Date object by using the unclass() function. Remember that a Date is stored as the number of days since 1970-01-01.

unclass(x)                          
[1] 0
unclass(as.Date("1970-01-02")) 
[1] 1
unclass(as.Date("1960-12-25")) 
[1] -3294

Date/time data often comes as strings. You can convert strings into dates and date-times using the tools provided by lubridate. These tools automatically work out the date/time format:

  • First, identify the order in which the year, month, and day appear in your dates.
  • Then, arrange the letters y, m, and d in the same order.
  • That gives you the name of the lubridate function that will parse your date.

For example, for the date 2021-01-20, the order is year, month, day, so you use ymd().
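A small sketch of that naming rule: the same calendar day written in three different orders, each parsed by the function whose letters match the order. The vector name parsed is illustrative.

```r
library(lubridate)

parsed <- c(
  ymd("2021-01-20"),   # year, month, day
  mdy("01/20/2021"),   # month, day, year
  dmy("20.01.2021")    # day, month, year
)
parsed
length(unique(parsed))  # 1: every string parsed to the same Date
```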

format(as.Date)

  • We already saw how to use as.Date(). Wrapping it in format() converts the resulting Date back to a string in whatever layout we specify.
  • This is useful when we want to convert a string to a date and display it in a specific way.
  • I’ll start the code here; the rest appears under Split & Extract further down.
  • All we do is convert started_at with as.Date(), format it with the format = "%m%d%Y" option, and store the result in a new column called date with mutate().
test_trip <- trips19 |> 
        mutate(date = format(as.Date(started_at), format = "%m%d%Y"))
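Since trips19 is not available here, the following is a self-contained sketch of the same expression in base R. The started_at values are hypothetical stand-ins for the real column.

```r
# Hypothetical timestamps standing in for the started_at column
started_at <- as.POSIXct(c("2019-01-01 00:04:37", "2019-03-15 17:20:00"),
                         tz = "UTC")

# as.Date() drops the time of day; format() then returns a character string
date_chr <- format(as.Date(started_at), format = "%m%d%Y")
date_chr         # "01012019" "03152019"
class(date_chr)  # "character"
```

Note that the result is text, not a Date, so it can no longer be used for date arithmetic.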

Date to String

toString

y <- toString("12/26/2024")   # the input is already a string, so it comes back unchanged
y
[1] "12/26/2024"

format

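This one is very similar to format(as.Date()). When the value is already a Date, format() alone returns it as a character string in the layout given by the format codes.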

beDate = format(latest_weight$ActivityDate, "%m-%d-%y")
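A minimal base R sketch of Date-to-string conversion with a fixed date, so the results are reproducible. The variable d is illustrative.

```r
d <- as.Date("2024-12-26")

format(d, "%m-%d-%y")   # "12-26-24"    (two-digit year)
format(d, "%m/%d/%Y")   # "12/26/2024"  (four-digit year)
as.character(d)         # "2024-12-26"  (default ISO layout)

class(format(d, "%m-%d-%y"))  # "character": no longer a Date
```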

Date to Datetime

The ymd() function and its variations create dates.

ymd_hms

To create a date-time, add an underscore and one or more of the letters h, m, and s (hours, minutes, seconds) to the name of the parsing function:

ymd_hms("2021-01-20 20:11:59")
[1] "2021-01-20 20:11:59 UTC"
mdy_hm("01/20/2021 08:01")
[1] "2021-01-20 08:01:00 UTC"
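The examples above parse strings. If you already hold a Date object, one option is lubridate's as_datetime(), sketched here with illustrative names d and dt.

```r
library(lubridate)

d  <- ymd("2021-01-20")   # a Date
dt <- as_datetime(d)      # a POSIXct date-time at midnight UTC

class(d)    # "Date"
class(dt)   # "POSIXct" "POSIXt"
format(dt, "%Y-%m-%d %H:%M:%S")  # "2021-01-20 00:00:00"
```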

Datetime to Date

What about switching back to a date? Earlier we used now() and got this value:

now()
[1] "2024-11-01 15:35:43 CDT"

What if we want the date only?

as_date

as_date(now())
[1] "2024-11-01"
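Because now() changes on every run, here is the same conversion with a fixed date-time. The name dt is illustrative.

```r
library(lubridate)

dt <- ymd_hms("2021-01-20 20:11:59")  # fixed date-time in UTC

as_date(dt)   # "2021-01-20" (lubridate)
as.Date(dt)   # "2021-01-20" (base R)
```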

Split & Extract


To extract specific parts of a date, such as the day, month, or year, lubridate provides the accessor functions listed below: day(), wday(), month(), and year().

day

wday

wday() returns the day of the week. The week_start argument sets which day counts as day 1: with week_start = 1 the week starts on Monday, so Monday = 1 and Sunday = 7.

wday(x, label=TRUE, week_start = 1)  # for monday
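A sketch of how week_start changes the numbering, using a fixed date that falls on a Wednesday. The week_start values are written out explicitly so the result does not depend on session options.

```r
library(lubridate)

x <- ymd("2021-01-20")   # a Wednesday

wday(x, week_start = 7)  # 4: Sunday = 1 (7 is lubridate's usual default)
wday(x, week_start = 1)  # 3: Monday = 1

# With label = TRUE the result is an ordered factor of day names
wday(x, label = TRUE, week_start = 1)
```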

month

year

Here is an example where we parse ActivityDate with mdy() and store the result in a new column, DateOfActivity, so the original data is left unchanged. Then we extract the day, month, and year and store each in its own new column (day, month, year).

dailyactivity_df_3_4 %>%
        mutate(DateOfActivity = mdy(ActivityDate)) %>%
        mutate(day = day(DateOfActivity)) %>%  
        mutate (month = month(DateOfActivity)) %>%
        mutate (year = year(DateOfActivity)) %>% 
        glimpse()

timeframes

Here is a complete example from a project I worked on, where I took a column (started_at), converted it to a date with as.Date(), formatted it as mmddYYYY, and saved the result in date.

I also extracted the month, weekday, and quarter as strings.

# LET'S BREAK started_at INTO >DATE,YEAR,QUARTER,MONTH(NUM),DAY(NUM),WEEKDAY(STRING)
test_trip <- trips19 %>% 
        mutate(
                date = format(as.Date(started_at), format = "%m%d%Y"),      # date as an mmddYYYY string
                week_day = weekdays(started_at),                            # full weekday name
                month_wor = months(started_at),                             # full month name
                quarter = quarters(started_at),                             # quarter ("Q1" to "Q4")
                num_day = day(started_at),                                  # day of the month as a number
                blah = wday(started_at),                                    # day of the week as a number, Sunday = 1
                blah_blah = wday(started_at, label = TRUE),                 # abbreviated weekday name (ordered factor)
                blue = format(as.Date(started_at), format = "%A"),          # full weekday name, like week_day
                month = format(as.Date(started_at), format = "%m"),         # month as a two-digit string
                day = format(as.Date(started_at), format = "%d"),           # day of the month as a two-digit string
                year = format(as.Date(started_at), format = "%Y")           # %Y gives four digits, %y gives two
                )

Time


You can always load the lubridate package if you plan on working with time

  • POSIXct is just a very large number under the hood (seconds since 1970-01-01). It is a useful class when you want to store times in something like a data frame. You can coerce a number using as.POSIXct().

  • POSIXlt is a list underneath, and it stores other useful information such as the day of the week, day of the year, month, and day of the month. You can coerce to it using as.POSIXlt().

Some generic functions that work on dates and times:

  • weekdays(): gives the day of the week
  • months(): gives the month name
  • quarters(): gives the quarter ("Q1" to "Q4")
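A hedged base R sketch of coercing a number to POSIXct and calling the generic helpers on it. The number is the same count of seconds that unclass() prints further down this page; the origin and tz arguments are spelled out so the example does not depend on the R version or the local time zone.

```r
# Coerce a number of seconds since 1970-01-01 UTC to POSIXct
ct <- as.POSIXct(1730493344, origin = "1970-01-01", tz = "UTC")
format(ct, "%Y-%m-%d %H:%M:%S")   # "2024-11-01 20:35:44"

# The same instant as a POSIXlt list
lt <- as.POSIXlt(ct)
lt$mday        # 1

# Generic helpers work on Date, POSIXct, and POSIXlt alike
quarters(ct)   # "Q4"
weekdays(ct)   # day name in the session's locale
months(ct)     # month name in the session's locale
```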

POSIXlt

x <- Sys.time()
x
[1] "2024-11-01 15:35:44 CDT"
p <-  as.POSIXlt(x)
names(unclass(p))
 [1] "sec"    "min"    "hour"   "mday"   "mon"    "year"   "wday"   "yday"  
 [9] "isdst"  "zone"   "gmtoff"
p$sec   #extract seconds
[1] 44.03039
p$mday  #day of the month 
[1] 1

POSIXct

As the code below shows, a POSIXct object has no list elements. We can coerce it from POSIXct to POSIXlt by using as.POSIXlt().

x <- Sys.time()
x
[1] "2024-11-01 15:35:44 CDT"
unclass(x)
[1] 1730493344
p <- as.POSIXct(x)
p
[1] "2024-11-01 15:35:44 CDT"
unclass(p)      # p was already POSIXct, so there are no list elements
[1] 1730493344
names(unclass(p))
NULL
x$sec          # error: $ does not work on a POSIXct, so it must be coerced to POSIXlt first
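To see the failure without stopping a script, one option is to wrap the call in tryCatch(). This is only a sketch, and msg is an illustrative name.

```r
x <- Sys.time()

# $ fails on a POSIXct because it is a plain number, not a list;
# tryCatch() captures the error message instead of halting
msg <- tryCatch(x$sec, error = function(e) conditionMessage(e))
msg

# After coercion to POSIXlt the component is available
as.POSIXlt(x)$sec
```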

coerce

Since Sys.time() is POSIXct by default, I can coerce it to being POSIXlt by using: as.POSIXlt()

p <- as.POSIXlt(x)
p$sec
[1] 44.21886
# or, starting from a fresh timestamp:
t1 <- Sys.time()
t1
[1] "2024-11-01 15:35:44 CDT"
class(t1)
[1] "POSIXct" "POSIXt" 
unclass(t1)
[1] 1730493345
t2 <- as.POSIXlt(Sys.time())
class(t2) 
[1] "POSIXlt" "POSIXt" 
t2 
[1] "2024-11-01 15:35:44 CDT"
unclass(t2)
$sec
[1] 44.6634

$min
[1] 35

$hour
[1] 15

$mday
[1] 1

$mon
[1] 10

$year
[1] 124

$wday
[1] 5

$yday
[1] 305

$isdst
[1] 1

$zone
[1] "CDT"

$gmtoff
[1] -18000

attr(,"tzone")
[1] ""    "CST" "CDT"
attr(,"balanced")
[1] TRUE

str(unclass)

To get a more compact view of the output of unclass(), wrap it in str():

str(unclass(t2))
List of 11
 $ sec   : num 44.7
 $ min   : int 35
 $ hour  : int 15
 $ mday  : int 1
 $ mon   : int 10
 $ year  : int 124
 $ wday  : int 5
 $ yday  : int 305
 $ isdst : int 1
 $ zone  : chr "CDT"
 $ gmtoff: int -18000
 - attr(*, "tzone")= chr [1:3] "" "CST" "CDT"
 - attr(*, "balanced")= logi TRUE

extract elements

If we want to just use the minutes from t2 above we use

t2$min
[1] 35
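Some POSIXlt components are offsets rather than calendar values, which explains the mon = 10 and year = 124 shown in the output above. A short sketch with a fixed timestamp (p is an illustrative name):

```r
p <- as.POSIXlt("2024-11-01 15:35:44", tz = "UTC")

p$mon  + 1      # 11: mon counts months from 0 (January = 0)
p$year + 1900   # 2024: year counts years since 1900
p$wday          # 5: day of week from 0 (Sunday = 0), so 5 is Friday
p$yday + 1      # 306: yday counts days from 0 (January 1 = 0)
```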

weekday

weekdays() returns the day of the week. Let’s store today’s date in d1 and extract the day of the week from it.

d1 <- today()
weekdays(d1)
[1] "Friday"

months

months() does the same for the month of the year, and it works on both dates and times.

months(d1)
[1] "November"
months(t1)
[1] "November"

strptime

strptime() converts character strings to date-times when they are written in a non-standard format. You pass it a format string that describes the layout, as in the examples below. Check ?strptime for the full list of format codes.

datestring <- c("January 10, 2012 10:40", "December 9, 2011 9:10")
x <- strptime(datestring, "%B %d, %Y %H:%M") 
x  
[1] "2012-01-10 10:40:00 CST" "2011-12-09 09:10:00 CST"
class(x) 
[1] "POSIXlt" "POSIXt" 

strptime 2

Store this in t3: “October 17, 1986 08:24”

t3 <- "October 17, 1986 08:24"
t4 <- strptime(t3, "%B %d, %Y %H:%M")
t4 
[1] "1986-10-17 08:24:00 CDT"
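A further sketch with a numeric-only layout, so parsing does not depend on the locale's month names. The names t5 and t5_ct are illustrative, and the time zone is fixed to UTC for reproducibility.

```r
t5 <- strptime("10/17/1986 08:24", "%m/%d/%Y %H:%M", tz = "UTC")
class(t5)                        # "POSIXlt" "POSIXt"

# Convert to POSIXct, the more compact class for data frame columns
t5_ct <- as.POSIXct(t5)
format(t5_ct, "%Y-%m-%d %H:%M")  # "1986-10-17 08:24"

# A format that does not match the string yields NA rather than an error
is.na(strptime("1986-10-17", "%m/%d/%Y"))  # TRUE
```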

difftime

We can subtract one time from another. For example, we assigned Sys.time() to t1 earlier, so we can check whether the current time is later than t1 and then subtract one from the other. Or we can use difftime(), which lets us choose the units of the result. Of course, if the two values are within minutes of each other, as in the first example, asking for the difference in days is not very useful because it will be close to zero. But let’s use it anyway:

Sys.time() > t1        #this will tell us if time has elapsed
[1] TRUE
Sys.time() - t1          # gives the difference, in this case in seconds
Time difference of 0.655916 secs
difftime(Sys.time(), t1, units ='days') 
Time difference of 8.132116e-06 days

difftime() calculates the difference between two time values. In many cases it gives the right answer in a form that is not convenient for further calculations. In the example below, it returned the correct value in seconds but as a difftime object (shown as drtn in a tibble), so each value in the dataset is followed by its unit, like this: 360 secs, 445 secs, and so on.

So I coerced it into a number with as.numeric() right in the code, which gives the value without the secs part.

#____________________CALCULATE ride_length AND DROP tripduration
all_trips19_20 <- all_trips19_20 %>%
        mutate(ride_length = as.numeric(difftime(ended_at, started_at, units = 'secs')))
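A self-contained sketch of the same coercion with two hypothetical timestamps six minutes apart; in the project code above, the columns started_at and ended_at play this role.

```r
# Hypothetical ride start and end times, six minutes apart
started_at <- as.POSIXct("2024-11-01 15:29:44", tz = "UTC")
ended_at   <- as.POSIXct("2024-11-01 15:35:44", tz = "UTC")

d <- difftime(ended_at, started_at, units = "secs")
d                              # Time difference of 360 secs
class(d)                       # "difftime"

as.numeric(d)                  # 360, a plain number
as.numeric(d, units = "mins")  # 6, converted to minutes on the way out
```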

as.numeric

Aside from using difftime(), you can always subtract two dates directly and convert the result to a number:

date1 <- as.Date("1970-01-01")
date1
[1] "1970-01-01"
date2 <- as.Date("2012-06-21")
date2
[1] "2012-06-21"
as.numeric(date1 - date2)
[1] -15512
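A short sketch comparing the routes on the same two dates: subtracting and then converting, converting and then subtracting, and asking difftime() for a different unit.

```r
date1 <- as.Date("1970-01-01")
date2 <- as.Date("2012-06-21")

# Subtracting Dates gives a difftime in days
date2 - date1                            # Time difference of 15512 days

# Converting first and then subtracting gives the same number
as.numeric(date2) - as.numeric(date1)    # 15512

# difftime() can report the same gap in other units
difftime(date2, date1, units = "weeks")  # Time difference of 2216 weeks
```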