Time is free, but it's priceless. You can't own it, but you can use it. You can't keep it, but you can spend it. Once you've lost it you can never get it back.

01 Apr 2009

An algorithm to convert a date from RFC 1123 (or with slight modification any date string) into Unix epoch time.
Today at work I had an interesting problem that needed a solution. At times, the program that i'm working on needs to read the http date header response (in RFC 1123 format) and convert it into unix epoch time. If that sounded Greek to you, basically it means that once in a while when a webpage is requested, our program needs to turn the date that the webpage says is "right now" and turn it into a number that the program can use to do time synchronization. For example, Wed, 25 Feb 2009 17:20:17 GMT becomes 1235582417000.

Normally this is a trivial task that one doesn't need to think about because almost all languages provide date utilities to do this sort of thing for you. The language in use on this particular project is called Lua. We chose this language for various reasons that aren't important to this discussion. However, if you're looking for a light-weight, powerful, and fast language that straddles the barrier between a scripting language and a full blown language like C or Java, you should give Lua a try.

Like most other languages, Lua has date methods that can easily turn an RFC 1123 date into epoch time. So what's the point of this article? It's not to show you how to use the built-in Lua function. It's to show you how to manually do the computation yourself. And why would I want to do such a thing? In this case, the reason happens to be because the implementation of os.date (which is the Lua date utility function) returns different values depending on which operating system is in use. If it's run on Windows, it returns a different value than on a unix, linux, or bsd system. But before you go cursing Windows ... the windows implementation is actually the one that gave the expected response! Since our program is going to run on a variety of platforms, this important method must work identically across them all. Hence, my task for the day: write my own conversion function.

The first order of business was to write a function that would determine if a date was a leap year. A quick Google search yielded the following facts: A year is a leap year IF it is divible by 4 (such as 2008), unless it's a century (such as 1900). But every 4th century is in fact a leap year (such as 1600, 2000, 2400). Now some people might sputter and go "but wait, if the date is before such and such, or after such and such, the algorithm breaks down. However, i don't really need to worry about dates prior to NOW and if this is still in use in 30 years, well ... we'll have a whole new Y2K thing to worry about anyway. :) To code the algorithm up in lua I wrote the following function. This could easily be adapted to any other programming language (and before you deride me for not using a tertiary operator ... there is no such thing in Lua):
function isLeapYear(year)
   if (year % 4 == 0 and year % 100 ~= 0) or (year % 400 == 0) then
        return true
    else
        return false
    end
end
Next I needed some tables (as they're called in Lua) to contain mapping functions for the conversion.
-- map an abbreviated month name (Jan, Feb, etc..) to a 1-based month index value
local monthEnum = {
    Jan=1, Feb=2, Mar=3, Apr=4, May=5, Jun=6, Jul=7, Aug=8, Sep=9, Oct=10, Nov=11, Dec=12
}

-- based on a numeric month val, determine how many days occurred (in a year) prior to that month - (does not count leap years)
local daysBeforeMonthEnum = {
    ['1']=0, ['2']=31, ['3']=31+28, ['4']=31+28+31, ['5']=31+28+31+30, ['6']=31+28+31+30+31, ['7']=31+28+31+30+31+30,
    ['8']=31+28+31+30+31+30+31, ['9']=31+28+31+30+31+30+31+31, ['10']=31+28+31+30+31+30+31+31+30,
    ['11']=31+28+31+30+31+30+31+31+30+31, ['12']=31+28+31+30+31+30+31+31+30+31+30
}
Obviously the daysBeforeMonthEnum table can be optimized so that the values are pre-calculated rather than having to be calculated each time a lookup occurs. I did it this way during development to make sure i got the values correct. It also helps anyone reading this to understand the intention of the mapping.
And finally, the method that does all the work. Simply call this with an RFC 1123 date string, and it will return a long with the correct epoch time.
function convertHeaderTimeToEpochTime(date)
    -- break up the date string into its constituant parts
    local year = tonumber(date:sub(13,16))
    local month = monthEnum[date:sub(9,11)] -- using the lookup table
    local day = tonumber(date:sub(6,7))
    local hour = tonumber(date:sub(18,19))
    local min = tonumber(date:sub(21,22))
    local sec = tonumber(date:sub(24,25))

    local i -- loop counter.  declare everything local or else it automatically goes to global scope in lua
    local time = 0 -- will hold the calculated time value
    local leap = isLeapYear(year) -- whether or not the current year is a leap year

    -- calculate seconds in each year from the epoch start (1970) up to (but not including) the current year
    for i = year-1,1970,-1 do
        if isLeapYear(i) then
           time = time + (366 * 24 * 3600) -- leap years have 366 days
        else
           time = time + (365 * 24 * 3600) -- normal years have 365 days
        end
    end

    -- calculate how many seconds have elapsed so far in this year prior to the current month
    local monthDays = daysBeforeMonthEnum[tostring(month)] -- using the lookup table
    if month > 2 and leap then monthDays = monthDays + 1 end -- account for leap years
    time = time + (monthDays * 24 * 3600)

    -- calculate the rest
    time = time + ((day-1) * 24 * 3600) -- how many seconds in the days of the current month prior to today
    time = time + hour * 3600 -- how many seconds have elapsed so far today up to the current hour
    time = time + min * 60 -- how many seconds have elapsed so far today up to the current minute
    time = time + sec -- add in the remaining seconds of the current minute

    -- since the RFC 1123 doesn't give millisecond precision, assume 0 and convert seconds to milliseconds
    time = time * 1000

    return time
end
Whew, that was a mouthful huh? Took me an hour or two to work out all the kinks. I even wrote some unit tests to ensure that everything is working (I'll write up an article on the benefits [read: necessity] of good unit tests another time). I could optimize this method slightly by storing all the calculations as constants instead of recalculating them all the time, but just as i mentioned before, it helped me out during development to make sure i had the values correct. Plus this method is only called when a page is loaded, which is not part of a time-critical loop or calculation, so if it takes an extra 2-3 milliseconds it's not a big deal.

All in all, a fun diversion. It's good to break up the big hairy tasks with something fun and different now and again.