r - Subset rows according to a range of time (incl. minutes)
My question is a follow-up to the question raised here by user wet feet.
This is the modified dataset:
                date_time loc_id node  energy   kgco2
    1 2009-02-27 00:11:08     87  103 0.00000 0.00000
    2 2009-02-27 01:05:05     87  103 7.00000 3.75900
    3 2009-02-27 02:05:05     87  103 6.40039 3.43701
    4 2009-02-28 02:10:05     87  103 4.79883 2.57697
    5 2009-02-28 04:05:05     87  103 4.10156 2.20254
    6 2009-02-28 05:05:05     87  103 2.59961 1.39599
    7 2009-03-01 03:20:05     87  103 2.59961 1.39599
I am trying to get the rows that fall within a specific time interval, regardless of the date, e.g. from 02:05:00 to 03:30:00:
    3 2009-02-27 02:05:05     87  103 6.40039 3.43701
    4 2009-02-28 02:10:05     87  103 4.79883 2.57697
    7 2009-03-01 03:20:05     87  103 2.59961 1.39599
Applying the solution from the linked question (hour from the lubridate package), however, doesn't suffice, since I have to consider the minutes of the interval. I could use the interval function from the lubridate package to include minutes, but since the data frame covers different dates, that wouldn't help.
I am particularly curious whether there is a solution that makes use of dplyr's filter verb. Or is using the xts package the way forward?
In the code below, for each row we calculate the number of seconds since midnight, then check whether that value is within the time range in question, which is also converted to seconds since midnight. I've included code to set up the data in datetime format (and in the UTC time zone), since the data sample wasn't provided in reproducible form.
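As a quick illustration of the seconds-since-midnight idea, here is a minimal base R sketch for a single timestamp taken from the sample data. The variable names (`x`, `midnight`, `secs`, `lower`, `upper`, `in_range`) are made up for this example and are not part of the answer's code.

```r
# One timestamp from the sample data, in UTC
x <- as.POSIXct("2009-02-28 02:10:05", tz = "UTC")

# Midnight of the same day, also in UTC
midnight <- as.POSIXct(format(x, "%Y-%m-%d"), tz = "UTC")

# Seconds elapsed since midnight: 2*3600 + 10*60 + 5
secs <- as.numeric(x) - as.numeric(midnight)

# Interval bounds 02:05:00 and 03:30:00 expressed as seconds since midnight
lower <- 2 * 3600 + 5 * 60
upper <- 3 * 3600 + 30 * 60

# TRUE when the time of day lies inside the interval, whatever the date
in_range <- secs >= lower & secs <= upper
```

Because the date is subtracted out, the same comparison works for every row no matter which day it belongs to.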
1. Set up the data frame
    library(lubridate)
    library(tidyverse)

    # The unnamed first column becomes the row names; date and time are read as two columns
    dat = read.table(text="date_time time loc_id node energy kgco2
    1 2009-02-27 00:11:08 87 103 0.00000 0.00000
    2 2009-02-27 01:05:05 87 103 7.00000 3.75900
    3 2009-02-27 02:05:05 87 103 6.40039 3.43701
    4 2009-02-28 02:10:05 87 103 4.79883 2.57697
    5 2009-02-28 04:05:05 87 103 4.10156 2.20254
    6 2009-02-28 05:05:05 87 103 2.59961 1.39599
    7 2009-03-01 03:20:05 87 103 2.59961 1.39599",
    header=TRUE, stringsAsFactors=FALSE)

    # Join date and time (separated by a space) into one POSIXct column in UTC
    dat$date_time = as.POSIXct(paste(dat$date_time, dat$time), tz="UTC")
    dat = dat %>% select(-time)
2. Helper function to convert hms time strings to seconds since midnight
    hms_to_numeric = function(x) {
      # Attach an arbitrary date so the string parses as a datetime
      x = as.POSIXct(paste("2010-01-01", x), tz="UTC")
      3600 * hour(x) + 60 * minute(x) + second(x)
    }
3. Filter the data to include only rows within the time range
    dat %>%
      filter(between(as.numeric(date_time) - as.numeric(as.POSIXct(substr(date_time, 1, 10), tz="UTC")),
                     hms_to_numeric("02:05:00"),
                     hms_to_numeric("03:30:00")))
                date_time loc_id node  energy   kgco2
    1 2009-02-27 02:05:05     87  103 6.40039 3.43701
    2 2009-02-28 02:10:05     87  103 4.79883 2.57697
    3 2009-03-01 03:20:05     87  103 2.59961 1.39599
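A possible variation, offered only as a sketch: the seconds since midnight can also be computed directly from the lubridate accessors `hour()`, `minute()` and `second()` inside a dplyr pipeline, without going through character strings. The data frame `dat2` and the temporary column `secs` below are hypothetical stand-ins that carry only the timestamps from the sample; the remaining columns are omitted for brevity.

```r
library(dplyr)
library(lubridate)

# Hypothetical stand-in for the data: only the timestamps from the example, in UTC
dat2 <- data.frame(
  date_time = as.POSIXct(c("2009-02-27 00:11:08", "2009-02-27 01:05:05",
                           "2009-02-27 02:05:05", "2009-02-28 02:10:05",
                           "2009-02-28 04:05:05", "2009-02-28 05:05:05",
                           "2009-03-01 03:20:05"), tz = "UTC")
)

res <- dat2 %>%
  # Time of day as seconds since midnight, independent of the date
  mutate(secs = 3600 * hour(date_time) + 60 * minute(date_time) + second(date_time)) %>%
  # Keep rows between 02:05:00 and 03:30:00 (bounds inclusive)
  filter(between(secs, 2 * 3600 + 5 * 60, 3 * 3600 + 30 * 60)) %>%
  # Drop the temporary column again
  select(-secs)
```

On this sample the filter keeps the same three timestamps as the output above. Note that the accessors read the hour in the time zone stored on the column, so this assumes `date_time` is already in the zone the interval refers to.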