I got an email from a gentleman boasting labor data that even the Bureau of Labor Statistics hasn’t published yet!
Retale, a tool that helps you scour advertisements for deals in your area, has produced a cool infograph on what Americans are doing right now. It shows what Americans are doing at different times of the day: at 9 PM, 34% are watching TV, 7% are still working, and so on.
The data is pretty reliable, as it is from the Bureau of Labor Statistics. You can even break down the graphs by demographic: men, women, 45- to 54-year-olds, retirees, and so on.
Even though I am not American, the model predicts correctly that I am watching TV right now. The TV is on in the background as I type this, so close enough.
Check out the infograph here.
Say you have a dataset, where each row has a date or time, and something is recorded for that date and time. If each row is a unique date – great! If not, you may have rows with the same date, and you have to combine records for the same date to get a daily tally.
Here is how you can make a daily tally (or a monthly or yearly one; the frequency of tallies is not important):
- convert the dates to numbers. R counts 01/01/1970 as day 0, 02/01/1970 as day 1, …, 07/03/2010 as day 14675; 31/12/1969 is day -1.
- use a “for loop” to lump entries from the same date together
- calculate the daily tally by counting the number of rows in each daily lump (I do this below), or by adding up all entries in a particular column within each daily lump
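The date-to-number step can be checked directly in R. This is only a small sketch with made-up date strings (written day/month/year), showing that R stores a date as the number of days since 01/01/1970, which it treats as day 0:

```r
# Example dates in day/month/year order (illustrative values, not from the dataset)
d <- as.Date(c("01/01/1970", "02/01/1970", "07/03/2010", "31/12/1969"),
             format = "%d/%m/%Y")

# as.numeric() exposes the day count R keeps internally for each date
day_number <- as.numeric(d)
print(day_number)  # 0, 1, 14675, -1
```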
To get the daily total:
summary(rott[,2]) #look at the raw date column first
rott[,2]<-as.numeric(as.Date(rott[,2], format="%m/%d/%Y")) #convert dates to day numbers
daily<-matrix(0, nrow=184, ncol=1) #one row per day, filled in by the loop
for(i in 1:184){ #my data spans 184 days from 7th March to 6th Sept 2010
rott.i<-rott[rott[,2]==14674+i,,drop=FALSE] #all rows recorded on the i-th day
daily[i,1]<-nrow(rott.i) #7th March 2010 is day 14675 in R's day count from 01/01/1970
}
acf(daily,main="Autocorrelation of Timeseries") #ACF!
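The same kind of tally can also be made without a loop. The sketch below is one possible alternative, not the method used above. It assumes a toy vector of month/day/year strings (hypothetical values standing in for the date column) and uses table() on a factor whose levels cover every day in the span, so days with no records still get a count of zero:

```r
# Toy data standing in for the date column (hypothetical values)
dates <- c("3/7/2010", "3/7/2010", "3/9/2010",
           "3/10/2010", "3/10/2010", "3/10/2010")
day_number <- as.numeric(as.Date(dates, format = "%m/%d/%Y"))

# List every day in the span so that empty days are counted as zero
all_days <- seq(min(day_number), max(day_number))
daily_counts <- as.vector(table(factor(day_number, levels = all_days)))
print(daily_counts)  # 2, 0, 1, 3
```

To sum a column per day instead of counting rows, tapply() or aggregate() on the same day numbers would be the natural replacement for table().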