fbpx

CALENDAR TABLES IN T-SQL

วิธีการรับมือกับการเปลี่ยนแปลงในการงาน และ ชีวิต

CALENDAR TABLES IN T-SQL

In an earlier blog, we covered a type of auxiliary table (the Tally Table) that can provide a lot of querying flexibility if you have one in your database or construct one in-line to your query.  Today we’re going to talk about another: the Calendar table.

The basic concept behind a Calendar table is that each row is a date and the columns in the table represent complex date calculations that otherwise you’d need to perform manually in your query.  If the calculations are done beforehand, they can tremendously speed up your query because you can avoid them when you need them.

We’re also going to show you a really neat and fast FUNCTION for generating a Calendar table that you could use instead of having a pre-defined, auxiliary table in your database.

THE GENERATECALENDAR FUNCTION

We’ll start by proposing a utility FUNCTION for performing various complex date calculations all in one shot.  Originally I had the idea that this might be nearly as fast as having a separate Calendar table.  A friend of mine, SQL MVP Jeff Moden, got wind of what I was doing and he rewrote my version into one that was significantly faster.  So without further ado, here is that FUNCTION.

CREATE FUNCTION [dbo].[GenerateCalendar] 
        (
        @FromDate   DATETIME, 
        @NoDays     INT			
        )
-- Generates a calendar table with sequential day numbering (@FromDate = SeqNo 1).
-- See RETURNS table (comments) for meaning of each column.
-- Notes: 1) Max for NoDays is 65536, which runs in just over 2 seconds.
--
-- Example calls to generate the calendar:
-- 1) Forward for 365 days starting today:
-- DECLARE @Date DATETIME
-- SELECT @Date = GETDATE()
-- SELECT * 
-- FROM dbo.GenerateCalendar(@Date, 365) 
-- ORDER BY SeqNo;
-- 2) Backwards for 365 days back starting today:
-- DECLARE @Date DATETIME
-- SELECT @Date = GETDATE()
-- SELECT * 
-- FROM dbo.GenerateCalendar(@Date, -365) 
-- ORDER BY SeqNo;
-- 3) For only the FromDate:
-- DECLARE @Date DATETIME
-- SELECT @Date = GETDATE()
-- SELECT * 
-- FROM dbo.GenerateCalendar(@Date, 1);
-- 4) Including only the last week days of each month:
-- Note: Seq no in this case are as if all dates were generated
-- DECLARE @Date DATETIME
-- SELECT @Date = GETDATE()
-- SELECT * 
-- FROM dbo.GenerateCalendar(@Date, 365) 
-- WHERE Last = 1 ORDER BY SeqNo;
RETURNS TABLE WITH SCHEMABINDING AS
 RETURN
--===== High speed code provided courtesy of SQL MVP Jeff Moden (idea by Dwain Camps)
--===== Generate sequence numbers from 1 to 65536 (credit to SQL MVP Itzik Ben-Gen)
   WITH  E1(N) AS (SELECT 1 UNION ALL SELECT 1), --2 rows
         E2(N) AS (SELECT 1 FROM E1 a, E1 b),    --4 rows
         E4(N) AS (SELECT 1 FROM E2 a, E2 b),    --16 rows
         E8(N) AS (SELECT 1 FROM E4 a, E4 b),    --256 rows
        E16(N) AS (SELECT 1 FROM E8 a, E8 b),    --65536 rows
   cteTally(N) AS (
SELECT TOP (ABS(@NoDays)) ROW_NUMBER() OVER (ORDER BY (SELECT NULL)) FROM E16)
        -- [SeqNo]=Sequential day number (@FromDate always=1) forward or backwards
 SELECT [SeqNo]     = t.N,
        -- [Date]=Date (with 00:00:00.000 for the time component) 
        [Date]      = dt.DT,  
        -- [Year]=Four digit year 
        [Year]      = dp.YY,
        -- [YrNN]=Two digit year 
        [YrNN]      = dp.YY % 100,
        -- [YYYYMM]=Integer YYYYMM (year * 100 + month) 
        [YYYYMM]    = dp.YY * 100 + dp.MM,
        -- [BuddhaYr]=Year in Buddhist calendar 
        [BuddhaYr]  = dp.YY + 543, 
        -- [Month]=Month (as an INT) 
        [Month]     = dp.MM, 
        -- [Day]=Day (as an INT) 
        [Day]       = dp.DD,
        -- [WkDNo]=Week day number (based on @@DATEFIRST) 
        [WkDNo]     = DATEPART(dw,dt.DT),
        -- Next 3 columns dependent on the language setting so may not work for non-English 
        -- [WkDName]=Full name of the week day, e.g., Monday, Tuesday, etc. 
        [WkDName]   = CONVERT(NCHAR(9),dp.DW), 
        -- [WkDName2]=Two characters for the week day, e.g., Mo, Tu, etc. 
        [WkDName2]  = CONVERT(NCHAR(2),dp.DW),  
        -- [WkDName3]=Three characters for the week day, e.g., Mon, Tue, etc. 
        [WkDName3]  = CONVERT(NCHAR(3),dp.DW),  
        -- [JulDay]=Julian day (day number of the year) 
        [JulDay]    = dp.DY,
        -- [JulWk]=Week number of the year 
        [JulWk]     = dp.DY/7+1,
        -- [WkNo]=Week number 
        [WkNo]      = dp.DD/7+1,
        -- [Qtr]=Quarter number (of the year) 
        [Qtr]       = DATEPART(qq,dt.Dt),                       
        -- [Last]=Number the weeks for the month in reverse 
        [Last]      = (DATEPART(dd,dp.LDtOfMo)-dp.DD)/7+1,
        -- [LdOfMo]=Last day of the month 
        [LdOfMo]    = DATEPART(dd,dp.LDtOfMo),
        -- [LDtOfMo]=Last day of the month as a DATETIME
        [LDtOfMo]   = dp.LDtOfMo                                
   FROM cteTally t
  CROSS APPLY 
  ( --=== Create the date
        SELECT DT = DATEADD(dd,(t.N-1)*SIGN(@NoDays),@FromDate)
  ) dt
  CROSS APPLY 
  ( --=== Create the other parts from the date above using a "cCA"
    -- (Cascading CROSS APPLY (cCA), courtesy of Chris Morris)
        SELECT YY        = DATEPART(yy,dt.DT), 
                MM        = DATEPART(mm,dt.DT), 
                DD        = DATEPART(dd,dt.DT), 
                DW        = DATENAME(dw,dt.DT),
                Dy        = DATEPART(dy,dt.DT),
                LDtOfMo   = DATEADD(mm,DATEDIFF(mm,-1,dt.DT),-1)
 
  ) dp;

If you’ve read my blog on Tally tables, you’ll recognize the Ben-Gan style in-line Tally table that this FUNCTION uses (which will work in SQL 2005), and that it is limited to generating 65,536 rows (days) for your calendar table.  You may also note that it is a schema-bound, in-line Table Valued FUNCTION (iTVF), done that way to get the best possible performance out of it.  Let’s see how we would use it to populate a Calendar table starting on 01 Jan 2010.

SELECT [Date], [Year], [YrNN], [YYYYMM], [BuddhaYr], [Month], [Day]
    ,[WkDNo], [WkDName], [WkDName2], [WkDName3], [JulDay], [JulWk]
    ,[WkNo], [Qtr], [Last], [LdOfMo], [LDtOfMo]
INTO dbo.Calendar 
FROM dbo.GenerateCalendar('2010-01-01', 65536);
 
ALTER TABLE dbo.Calendar ALTER COLUMN [Date] DATETIME NOT NULL;
ALTER TABLE dbo.Calendar ALTER COLUMN [Year] INT NOT NULL;
ALTER TABLE dbo.Calendar ALTER COLUMN [Month] INT NOT NULL;
ALTER TABLE dbo.Calendar ALTER COLUMN [YYYYMM] INT NOT NULL;
ALTER TABLE dbo.Calendar ALTER COLUMN [Day] INT NOT NULL;
GO
ALTER TABLE dbo.Calendar ADD CONSTRAINT Cal_pk PRIMARY KEY([Date]);
ALTER TABLE dbo.Calendar ADD CONSTRAINT Cal_ix1 UNIQUE NONCLUSTERED([Year], [Month], [Day]);
ALTER TABLE dbo.Calendar ADD CONSTRAINT Cal_ix2 UNIQUE NONCLUSTERED([YYYYMM], [Day]);
 
SELECT TOP 10 [Date], [YYYYMM], [WkDName2], [WkNo], [Last]
FROM dbo.Calendar;

The final SELECT returns selected columns that we’ll use in some examples that follow for just the first few rows of the table.  Note that the CLUSTERED INDEX on [Date] ensures these rows are returned in date sequence.  We’ve also included a couple of UNIQUE NONCLUSTERED INDEXes to support the ways we may want to retrieve data from this table.

Date                     YYYYMM   WkDName2  WkNo   Last
2010-01-01 00:00:00.000  201001   Fr        1      5
2010-01-02 00:00:00.000  201001   Sa        1      5
2010-01-03 00:00:00.000  201001   Su        1      5
2010-01-04 00:00:00.000  201001   Mo        1      4
2010-01-05 00:00:00.000  201001   Tu        1      4
2010-01-06 00:00:00.000  201001   We        1      4
2010-01-07 00:00:00.000  201001   Th        2      4
2010-01-08 00:00:00.000  201001   Fr        2      4
2010-01-09 00:00:00.000  201001   Sa        2      4
2010-01-10 00:00:00.000  201001   Su        2      4

The last entry in this calendar table is 2189-06-06, which is probably far enough into the future for any reasonable use.  Note also that the columns [Date] and [LDtOfMo] could as easily have been CAST to a DATE data type, assuming we’re working in SQL 2008 or later because we’ll never be interested in the time component.

SIMPLE CASES OF USING OUR CALENDAR TABLE

While these are “simple” cases for using our Calendar table, try doing them without one and you might come to the conclusion that they may not be so quite so simple without the Calendar table.
Suppose we are interested in finding the first Monday of each month in the year 2014.  The WkNo makes this really simple.

SELECT [Date]
FROM dbo.Calendar
WHERE [Date] BETWEEN ‘2014-01-01’ AND ‘2014-12-31’ AND
    [WkDName2] = ‘MO’ AND [WkNo] = 1;

Using the [Date] field as we have done ensures this query uses a Clustered Index Seek that returns these results.

Date
2014-01-06 00:00:00.000
2014-02-03 00:00:00.000
2014-03-03 00:00:00.000
2014-05-05 00:00:00.000
2014-06-02 00:00:00.000
2014-08-04 00:00:00.000
2014-09-01 00:00:00.000
2014-10-06 00:00:00.000
2014-11-03 00:00:00.000
2014-12-01 00:00:00.000

The interested reader may confirm that these are all Mondays.
Likewise the [Last] column, which numbers the days backwards as weeks within the month, is designed specifically to address a requirement like “list all of the dates that are the last Friday in a month.”
SELECT [Date]
FROM dbo.Calendar
WHERE [Date] BETWEEN ‘2014-01-01’ AND ‘2014-12-31’ AND
[WkDName2] = ‘FR’ AND [Last] = 1;

Once again, because of the way we are using the [Date] in the WHERE filter, the result is a Clustered Index Seek, returning these results.

Date
2014-01-31 00:00:00.000
2014-02-28 00:00:00.000
2014-03-28 00:00:00.000
2014-04-25 00:00:00.000
2014-05-30 00:00:00.000
2014-06-27 00:00:00.000
2014-07-25 00:00:00.000
2014-08-29 00:00:00.000
2014-09-26 00:00:00.000
2014-10-31 00:00:00.000
2014-11-28 00:00:00.000
2014-12-26 00:00:00.000

Another requirement that can be a bit challenging is to calculate the last date of a particular month.  Consider the query and results below listing the first and last day of each month, and specifically how it returns the leap day for February.

SELECT [Date], [LDtOfMo]
FROM dbo.Calendar
WHERE [Date] BETWEEN '2012-01-01' AND '2012-12-31' AND
    [Day] = 1;
Date                     LDtOfMo
2012-01-01 00:00:00.000  2012-01-31 00:00:00.000
2012-02-01 00:00:00.000  2012-02-29 00:00:00.000
2012-03-01 00:00:00.000  2012-03-31 00:00:00.000
2012-04-01 00:00:00.000  2012-04-30 00:00:00.000
2012-05-01 00:00:00.000  2012-05-31 00:00:00.000
2012-06-01 00:00:00.000  2012-06-30 00:00:00.000
2012-07-01 00:00:00.000  2012-07-31 00:00:00.000
2012-08-01 00:00:00.000  2012-08-31 00:00:00.000
2012-09-01 00:00:00.000  2012-09-30 00:00:00.000
2012-10-01 00:00:00.000  2012-10-31 00:00:00.000
2012-11-01 00:00:00.000  2012-11-30 00:00:00.000
2012-12-01 00:00:00.000  2012-12-31 00:00:00.000

The real utility of having a Calendar table will be clearer from the example described in the next section.

USING THE CALENDAR TABLE TO FILL IN A MISSING MONTH

Let’s suppose that you have a table that contains sales orders.  We’ll simplify the situation slightly by ignoring the header/detail complexity of a true sales order, and just assume that the total for the order appears on the Sales order’s header.

CREATE TABLE #SalesOrders
(
    SONumber        INT IDENTITY PRIMARY KEY
    ,SODate         DATE
    ,Customer       VARCHAR(20)
    ,SalesAmount    MONEY
);
 
INSERT INTO #SalesOrders
VALUES('2013-01-15', 'AIS', 25000)
    ,('2013-01-31', 'AIS', 15000)
    ,('2013-03-01', 'AIS', 20000)
    ,('2013-03-02', 'AIS', 12000)
    ,('2013-03-05', 'AIS', 10000)
    ,('2013-01-15', 'DTAC', 25000)
    ,('2013-03-01', 'DTAC', 5000)
    ,('2013-03-15', 'DTAC', 11000);
 
SELECT *
FROM #SalesOrders;

We can see by inspecting the results that follow, that no sales were made in February.

SONumber  SODate       Customer   SalesAmount
1         2013-01-15   AIS        25000.00
2         2013-01-31   AIS        15000.00
3         2013-03-01   AIS        20000.00
4         2013-03-02   AIS        12000.00
5         2013-03-05   AIS        10000.00
6         2013-01-15   DTAC       25000.00
7         2013-03-01   DTAC       5000.00
8         2013-03-15   DTAC       11000.00

Suppose that our requirement was to list total sales by month for the first quarter of 2013.  We can use our Calendar table to fill in the missing month.

SELECT [YYYYMM], Sales=ISNULL(SUM(SalesAmount), 0)
FROM dbo.Calendar a
LEFT JOIN #SalesOrders b ON a.[Date] = DATEADD(month, DATEDIFF(month, 0, SODate), 0)
WHERE [Date] BETWEEN '2013-01-01' AND '2013-03-31' AND
    [Day] = 1
GROUP BY [YYYYMM];

In our Tally tables blog entry, we showed you how you could truncate a DATETIME value to either the day or hour.  DATEADD(month, DATEDIFF(month, 0, SODate), 0) simply truncates the DATE to the month (setting the day to the first of the month).  It should come as no surprise to our readers that the results from this query exactly matches the requirement that we specified.

YYYYMM  Sales
201301  65000.00
201302  0.00
201303  58000.00

We’ll leave it as an exercise for our interested readers to figure out how to list sales by company by month (so that there should be two rows for February each with a value of zero).
Note that just like a Tally table; a Calendar table can also be used to expand a row set if that happens to be what you need to do.

USING THE CALENDAR TABLE IN CONJUNCTION WITH A HOLIDAYS TABLE

Suppose we have a table of holidays, coincidentally containing all of the Aware holidays for 2014:

CREATE TABLE dbo.Holidays
(
    FromDate    DATETIME PRIMARY KEY
    ,ToDate     DATETIME
    ,Holiday    VARCHAR(100)
);
 
INSERT INTO dbo.Holidays
SELECT '2014-01-01', NULL, 'New Year''s Day'
UNION ALL SELECT '2014-02-14', NULL, 'Makha Bucha Day'
UNION ALL SELECT '2014-04-07', NULL, 'Substitution for Chakri Day'
UNION ALL SELECT '2014-04-14', '2014-04-15', 'Songkran Festival'
UNION ALL SELECT '2014-05-01', NULL, 'National Labor Day'
UNION ALL SELECT '2014-05-05', NULL, 'Coronation Day'
UNION ALL SELECT '2014-05-13', NULL, 'Wisakha Bucha Day'
UNION ALL SELECT '2014-07-11', NULL, 'Asarnha Bucha Day'
UNION ALL SELECT '2014-08-12', NULL, 'HM the Queen''s Birthday'
UNION ALL SELECT '2014-10-23', NULL, 'Chulalongkorn Day'
UNION ALL SELECT '2014-12-05', NULL, 'HM the King''s Birthday'
UNION ALL SELECT '2014-12-10', NULL, 'Constitution Day'
UNION ALL SELECT '2014-12-31', NULL, 'New Year''s Eve';
 
SELECT *
FROM dbo.Holidays;
FromDate     ToDate       Holiday
2014-01-01   NULL         New Year's Day 2014-02-14 NULL Makha Bucha Day 2014-04-07 NULL Substitution for Chakri Day 2014-04-14 2014-04-15 Songkran Festival 2014-05-01 NULL National Labor Day 2014-05-05 NULL Coronation Day 2014-05-13 NULL Wisakha Bucha Day 2014-07-11 NULL Asarnha Bucha Day 2014-08-12 NULL HM the Queen's Birthday
2014-10-23   NULL         Chulalongkorn Day
2014-12-05   NULL         HM the King's Birthday 2014-12-10 NULL Constitution Day 2014-12-31 NULL New Year's Eve

Let’s suppose that our assignment is to calculate pay day for each month in 2014.  Aware pays its employees on the last day of the month, the Friday before that if the last day of the month occurs on a weekend or the previous work day if the normal pay day happens to be a holiday.  Let’s do this in two parts.
Firstly, we know that none of the holidays plus weekend days can span more than seven days, so we just need to find (using the [Last] column of our Calendar table) all of the work days in the last week of the month.

SELECT [Date], [WkDName2], FromDate
FROM dbo.Calendar a
LEFT JOIN dbo.Holidays b ON a.[DATE] BETWEEN FromDate AND ISNULL(ToDate, FromDate) 
WHERE [Date] BETWEEN '2014-01-01' AND '2014-12-31' AND
    [Last] = 1 AND
    -- Remove Saturdays and Sundays
    [WkDName2] NOT IN ('SA', 'SU') AND
    -- Remove holidays
    FromDate IS NULL;

The above query actually generates 72 rows of results so let’s just look at a few of them (for May and December).

Date                     WkDName2  FromDate
...
2014-05-26 00:00:00.000  Mo        NULL
2014-05-27 00:00:00.000  Tu        NULL
2014-05-28 00:00:00.000  We        NULL
2014-05-29 00:00:00.000  Th        NULL
2014-05-30 00:00:00.000  Fr        NULL
...
2014-12-25 00:00:00.000  Th        NULL
2014-12-26 00:00:00.000  Fr        NULL
2014-12-29 00:00:00.000  Mo        NULL
2014-12-30 00:00:00.000  Tu        NULL

It should be no surprise that FromDate is always NULL, because of the WHERE filter.  What’s important to note is that the last date shown for May is the Friday (31 May is a Saturday so it was eliminated) and that the last date for December is the 30th (because the New Year’s Eve holiday was eliminated).

It now becomes a relatively simple matter to use ROW_NUMBER() to identify the last working day of the month (note the DESC sort), like this:

SELECT [Date]
FROM
(
    SELECT [Date], [WkDName2], FromDate
        ,rn=ROW_NUMBER() OVER (PARTITION BY [YYYYMM] ORDER BY [Date] DESC) 
    FROM dbo.Calendar a
    LEFT JOIN dbo.Holidays b ON a.[DATE] BETWEEN FromDate AND ISNULL(ToDate, FromDate) 
    WHERE [Date] BETWEEN '2014-01-01' AND '2014-12-31' AND
        [Last] = 1 AND
        -- Remove Saturdays and Sundays
        [WkDName2] NOT IN ('SA', 'SU') AND
        -- Remove holidays
        FromDate IS NULL
) a
WHERE rn=1;

So here is a list of Aware’s pay days in 2014:

Date
2014-01-31 00:00:00.000
2014-02-28 00:00:00.000
2014-03-31 00:00:00.000
2014-04-30 00:00:00.000
2014-05-30 00:00:00.000
2014-06-30 00:00:00.000
2014-07-31 00:00:00.000
2014-08-29 00:00:00.000
2014-09-30 00:00:00.000
2014-10-31 00:00:00.000
2014-11-28 00:00:00.000
2014-12-30 00:00:00.000

Now that’s a query we can all be interested in!
But what is really important here is that by combining our two tables, we’ve got a really nice way of identifying working days and numbering them within a month if need be.

CONCLUSION

We have described how to create an auxiliary Calendar table in your SQL database and provided a high-performance, alternative FUNCTION (GenerateCalendar) that you can use instead if you want to.  Since the FUNCTION returns a table, it can be substituted pretty much wherever the Calendar table is used (after FROM), but you do need to take care to specify the correct number of days you’ll need to drive the results.

Note that the GenerateCalendar FUNCTION is also somewhat unique in that it will generate days backwards from a starting date (by supplying a negative number of days).  Some sample queries are embedded in its comments to guide you.

We’ve described three cases where a Calendar table can be useful:

  • To simplify what would otherwise be complex date calculations.
  • To fill in dates (e.g., a month) where those dates are missing from your data.
  • To combine with a Holidays table to calculate working days.

Note that there are some people that advocate marking days in your main Calendar table as holidays.  I personally do not like that approach, mainly because Thailand has the tendency to create new holidays on a whim.  I’d rather insert a new row into my Holidays table than update an existing row in my Calendar table, but there are other reasons why you may want to keep them separate.

Just like Tally tables, it is possible to create an in-line Calendar table that contains only the dates you need for your specific query.  While you may still need to perform any specific date calculations you need within that query (like obtaining the month or day as an integer), it may still offer some usability in cases where you need to fill in dates like in our Sales order example.  We’ll also leave this as an exercise for our interested readers.

We hope you found this blog entry on Calendar tables to be interesting, informative and at least a little bit of fun.

For additional reading on Calendar tables, you can look at these three articles by Todd Fifield: