Monday, August 18, 2008

Subtract Two Dates [Unix]

In this post, I will describe how you can subtract two dates (which are in the form yyyyMMdd) to get the time between them. My approach involves first parsing both dates in order to extract the year, month and day, converting them to seconds using floating point math and finally subtracting one from the other.

Bash doesn't support floating point arithmetic, but I won't let that stop me. I will use awk instead. Another possible candidate is bc.

1. Parsing the date
We need to extract the year, month and day from our date, which is in yyyyMMdd format, e.g. 20080818. The year is the first four characters, the month the next two and the day the final two. We can do this with awk's substr function, as shown below. substr(s,m,n) returns the n-character substring of s that begins at position m (positions start at 1).

today=20080818
echo $today | awk '{
 year=substr($1,1,4);
 month=substr($1,5,2);
 day=substr($1,7,2);
 print year, month, day;   # prints: 2008 08 18
 }'
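If you don't like awk, you can also get substrings using the expr command. Note that the substr operator is not part of standard expr; the example below uses the /usr/ucb/expr found on Solaris, and GNU expr accepts the same syntax: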
today=20080818
year=`/usr/ucb/expr substr $today 1 4`
month=`/usr/ucb/expr substr $today 5 2`
day=`/usr/ucb/expr substr $today 7 2`
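Because the substr operator of expr is not available on every system, here is a minimal sketch of a third option that relies only on cut, which selects characters by position. The variable names simply mirror the ones used above.

```shell
today=20080818

# cut -cM-N keeps characters M through N (1-based, inclusive)
year=$(printf '%s' "$today" | cut -c1-4)
month=$(printf '%s' "$today" | cut -c5-6)
day=$(printf '%s' "$today" | cut -c7-8)

echo "$year $month $day"   # prints: 2008 08 18
```

The month and day keep their leading zeros here (08, not 8), which is fine for awk but worth remembering if you pass them to shell arithmetic.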
2. Converting to seconds
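We will use the formula below to approximate the number of days since 1/1/1970. It treats every year as 365.25 days and every month as 30.5 days, so the result is an estimate rather than an exact day count. The month term also counts the current month in full, but that constant offset cancels out when two dates are subtracted: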
(year-1970)*365.25 + month*30.5 + day

Now convert the days to seconds by multiplying by 24 * 60 * 60, the number of seconds in a day:
((year-1970)*365.25 + month*30.5 + day) * 24 * 60 * 60
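As a quick check of the arithmetic, the formula can be wrapped in a small shell function. This is only a sketch: the name date_to_secs is made up for illustration, and printf with %.0f is used so that the result prints as a whole number rather than in exponent notation.

```shell
# date_to_secs is a hypothetical helper name, not part of the original script.
# It prints the approximate number of seconds since 1/1/1970 for a yyyyMMdd date.
date_to_secs() {
  echo "$1" | awk '{
    year  = substr($1, 1, 4)
    month = substr($1, 5, 2)
    day   = substr($1, 7, 2)
    days  = (year - 1970) * 365.25 + month * 30.5 + day
    printf "%.0f\n", days * 24 * 60 * 60
  }'
}

date_to_secs 20080818   # 14141.5 days -> prints 1221825600
```

For 20080818 the terms are 38 * 365.25 = 13879.5, 8 * 30.5 = 244 and 18, giving 14141.5 days.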

3. Subtracting the times
Once we have both dates in seconds, we can subtract them. Bash arithmetic only handles integers, so to stay safe with floating point values we again use awk or bc:
echo $seconds1 $seconds2 | awk '{print $1 - $2}'
or
echo $seconds1 - $seconds2 | bc
Putting it all together
Here is the complete shell script which takes two dates and returns the number of seconds between them:
#!/usr/bin/bash

date1=$1
date2=$2

#give the dates to awk
echo $date1 $date2 | awk '{

#parse
year1=substr($1,1,4);
month1=substr($1,5,2);
day1=substr($1,7,2);

year2=substr($2,1,4);
month2=substr($2,5,2);
day2=substr($2,7,2);

#get seconds
secs1=((year1 - 1970)*365.25+(month1*30.5)+day1)*24*60*60;
secs2=((year2 - 1970)*365.25+(month2*30.5)+day2)*24*60*60;

#subtract
print secs1 - secs2;
}'
As you can see, all of the computation is in awk! So we could put all of our awk code (everything between the single quotes) into a separate file called subtractDates.awk, for instance, and run it on the command line like this:
echo 20080830 20080818 | awk -f subtractDates.awk
For these two dates it prints 1036800, which is 12 days' worth of seconds.

Now that we have the number of seconds between two dates, we can convert that to days, months and years using a similar approach if we have to.
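Going back from seconds to days is just the reverse division. The sketch below assumes the difference is already in a shell variable or argument; secs_to_days is a made-up name for illustration only.

```shell
# secs_to_days is a hypothetical helper name used only for this example.
# It divides a number of seconds by the seconds in a day (24 * 60 * 60).
secs_to_days() {
  echo "$1" | awk '{ printf "%.1f\n", $1 / (24 * 60 * 60) }'
}

secs_to_days 1036800   # prints 12.0
```

Dividing the days again by 30.5 or 365.25 would give approximate months or years. Keep in mind that these are the same approximations used in the forward conversion, so results across month boundaries can be off: for 20080301 and 20080228 the formula yields 3.5 days, although the dates are only 2 days apart.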

4 comments:

  1. Anonymous, 8:17 PM

    thanks, it helped me a lot!

  2. Anonymous, 2:21 PM

    Another really helpful titbit.
    Good work Fahd!!

  3. Anonymous, 10:14 PM

    I'm a little bit unhappy with that 365.25 days / year approximation. What if the two dates subtracted are in the same (non-leap) year?

    I'd go with almost the same idea, just letting bigger brains (awk developers) do the guesstimation for me, by using something along the lines of:

    # assume I have the time as hh:mm

    sub(/:/, " ", time);
    seconds=mktime("2009 01 14 " time " 00");

  4. Try this function:

    ########################################################################################
    # date2num : calculate daynumber based on year, month and day #
    # #
    # Arguments : gregdate - date in "CCYYMMDD" format #
    # #
    # Returns : number of days since 0000-01-01 #
    # #
    ########################################################################################
    function date2num( gregdate, _year, _month, _day, _jdays, _epoch, _olympiad, _jaar, _eeuw, _absolute)
    {
        _year = substr(gregdate, 1, 4) + 0
        _month = substr(gregdate, 5, 2) + 0
        _day = substr(gregdate, 7, 2) + 0
        if ((_year % 4 == 0) && ((_year % 100 != 0) || (_year % 400 == 0))) {
            split ( "0/31/60/91/121/152/182/213/244/274/305/335/366", _jdays, "/")
        }
        else {
            split ( "0/31/59/90/120/151/181/212/243/273/304/334/365", _jdays, "/")
        }
        _jaar = _year - 1
        _epoch = int( _jaar / 400 )
        _jaar -= _epoch * 400
        _eeuw = int( _jaar / 100 )
        _jaar -= _eeuw * 100
        _olympiad = int ( _jaar / 4 )
        _jaar -= _olympiad * 4
        _absolute = ( _epoch * 146097) + ( _eeuw * 36524 ) + ( _olympiad * 1461 ) + ( _jaar * 365 ) - 1

        return _absolute + _jdays[_month] + _day
    }

