Date Dimension with Global Holiday Calendar

March 21, 2022

Overview

A date dimension is an essential table in a data model that allows to analyze performance more effectively across different time periods. It should be included in every dimensional model that contains a date or requires date intelligence as part of the analysis. The date dimension plays an important role in the data warehouse design. It provides the ability to study behavior and trend the data over a period of time. The Date Dimension contains a continuous range of dates that cover the entire date period required for the analysis. It also includes columns that will filter the data by almost any date logic. It includes the day of the week, workdays, weekends, quarters, months, years.

A key problem when working with dates is the impact of holidays and weekends. To manage this complexity, a holiday calendar is used. A holiday calendar implementation keeps track of which dates are holidays and which are weekends. Different countries have different holiday dates and thus different calendars. It is not unusual for individual exchanges or other financial entities to have their own calendar. Each holiday calendar represents a consistent set of weekends and holidays. In all cases they refer to a specific location such as Subdivisions. The holiday calendar consists of various countries and subdivisions.

Another key problem when working with dates is calculating one date relative to another taking into account holidays, weekends and other conventions. To handle this, a set of date manipulations are provided in the table below (DIM_DATE table). Hence the holiday calendar and date_dim table are linked together.

Data Model

The data model below illustrates that tables are related to each other and we have derived the data model using the SQL DBM (SQL Database Modeler is an online browser-based database design tool that also imports existing DB, generates SQL, and allows collaboration).

Data Tables

There are five data tables as described below. These tables contain the date dimension and holiday calendars for countries and first level subdivisions.

  • DIM_DATE
  • COUNTRY_CODE
  • SUBDIVISION_CODE
  • HOLIDAY_CALENDAR_COUNTRY
  • HOLIDAY_CALENDAR_SUBDIVISION

Date Dimension Table

The date dim table contains the date range from January 01, 1900 to December 31, 2100. The fields with its description are defined below.

Colum NameData TypeDescription
DATE_PKEYNUMBER(9,0)Primary Key for the table, which contains the unique date values.
DATEDATEIt is the date field of the date dimension
FULL_DATE_DESCVARCHAR(64)it contains the day name along with its actual long date format e.g
Monday, January 01, 1900
DAY_NUM_IN_WEEKNUMBER(1,0)It is the day number in a week which starts from Sunday as the first day in the week.
DAY_NUM_IN_WEEK_ISO_8601NUMBER(1,0)It is the day number in a week which starts from Monday as the first day in the week as in ISO 8601
DAY_NUM_IN_MONTHNUMBER(2,0)It is the day number in a Month such as 30 days.
DAY_NUM_IN_YEARNUMBER(3,0)It is the day number in a Year such as 365 days.
DAY_NAMEVARCHAR(10)Full name of the day name e.g Sunday
DAY_ABBREVVARCHAR(3)Abbreviation of the day name e.g Sun
WEEKDAY_INDVARCHAR(64)It indicates, if it is weekday or weekend
MONTH_END_INDVARCHAR(64)It indicates the month end or not a month end
WEEK_BEGIN_DATE_NKEYNUMBER(9,0)start date of the week in the format of 19000101
WEEK_BEGIN_DATEDATEstart date of the week in the format of 1900-01-01
WEEK_END_DATE_NKEYNUMBER(9,0)End date of the week in the format of 19000107
WEEK_END_DATEDATEEnd date of the week in the format of 1900-01-07
WEEK_NUM_IN_YEARNUMBER(9,0)It is the week number in the year e.g 1st week of the year
MONTH_NAMEVARCHAR(10)Full name of the month e.g January
MONTH_ABBREVVARCHAR(3)Abbreviation of the month e.g Jan
MONTH_NUM_IN_YEARNUMBER(2,0)Month Number of the year e.g 1 is for January
YEARMONTHVARCHAR(10)It is the year and month number e.g 1900-01 for January 1900
QUARTERNUMBER(1,0)Quarter number is the year
YEARQUARTERVARCHAR(10)It is the year and quarter number e.g 1900-01 first quarter of the year 1900
YEARNUMBER(5,0)Year field

Country Code

Country code has all the available countries and its ISO 3166 country code. The code is a two letter code which defines the country name. The other field has the actual country name.

Colum NameData TypeDescription
COUNTRY_CODEVARCHAR(5)It is the country code of the country e.g CA is for Canada
COUNTRYVARCHAR(100)Full name of the country

 

Subdivision Code

Subdivisions belong to countries. Hence each subdivision has a mapping country as shown in the above data model.

Colum NameData TypeDescription
SUBDIVISIONVARCHAR(50)Subdivision code primary key, it is the combination of the subdivision and country code.
SUBDIVISION_CODEVARCHAR(25)It is the subdivision code e.g AB is for Alberta in Canada
COUNTRY_CODEVARCHAR(5)It is the country code
SUBDIVISION_NAMEVARCHAR(200)Full name of the subdivision

 

Holiday Calendar of Country

This calendar table contains all the holidays of the particular country. This table has 40 years (20 years previous data and 20 years future data) holiday data for the countries.

Colum NameData TypeDescription
HOL_DATE_PKEYVARCHAR(15)It is the holiday calendar primary key consists of the date and the country code
DATE_FKEYNUMBER(9,0)It is linked to the dim_date primary key to get the details of the date.
HDATEDATEIt is the actual date of the holiday
HOLIDAY_NAMEVARCHAR(500)It is the full name of the holiday.
OBSERVED_INDVARCHAR(10)It shows if it is an observed holiday.
COUNTRY_FKEYVARCHAR(10)Country code of the country

Holiday Calendar of Subdivision

This calendar table contains all the holidays of the particular subdivision. This table has 40 years (20 years previous data and 20 years future data) holiday data for the subdivisions.

Colum NameData TypeDescription
HOL_DATE_PKEYVARCHAR(50)It is the holiday calendar primary key consists of the date and the subdivision code
DATE_FKEYNUMBER(9,0)It is linked to the dim_date primary key to get the details of the date.
HDATEDATEIt is the actual date of the holiday
HOLIDAY_NAMEVARCHAR(500)It is the full name of the holiday.
OBSERVED_INDVARCHAR(10)It shows if it is an observed holiday.
SUBDIVISIONVARCHAR(50)Subdivision code of the actual subdivision

Usage

The most obvious advantage to using a calendar table is that we don’t have to create calculated fields for every component of our date. While doing data analysis, there might be a requirement to exclude weekends and holidays from an analysis, hence we have the weekend indicator and holiday calendars for a better analysis. There are few examples provided below. Since our calendar table contains every date, we can first scaffold our data to ensure all dates are counted. We can then use the weekend and holiday indicators to properly exclude or count days. Calendar table scaffolds also allow us to easily count business days between two dates.

Lookup

  1. Find out the holidays for USA

2. Find out the holidays for Alberta subdivision under Canada

3. Find the observed holidays in Alberta for 2021

4. Show the total number of weeks in 2022

How to subscribe

To subscribe to the dataset go to this link and click on GET button.

Summary

It is important to include a robust date dimension in the model, as it will allow the report to take advantage of date intelligence functionality.

  • Using an existing date dimension in the source layer is a great option if it already exists and is reliably updated and maintained.
  • The SQL method takes advantage of the speed in processing in the data source server itself. It can potentially improve the performance of generating the date dimension.
  • The date dimension table may lack the flexibility of customization if it is brought into the data model directly from the source layer. Any necessary modifications may be dependent on an IT or DBA team, resulting in a loss of control of development time.
  • If a custom date dimension is required, SQL is characterized by limited date intelligence functions. It can be more difficult to create all the columns needed when compared to other languages.

Download Sample Data

The samples below are the date dimension and holiday calendar for the year 2021 of Germany and it’s Subdivisions. It can be downloaded from the links below. The files below are downloadable in TSV format.

List of countries that support Date Dimension

COUNTRY_CODECOUNTRY
AOAngola
ARArgentina
AWAruba
AUAustralia
ATAustria
AZAzerbaijan
BDBangladesh
BYBelarus
BEBelgium
BWBotswana
BRBrazil
BGBulgaria
BIBurundi
CACanada
CLChile
COColombia
CWCuraçao
CZCzechia
DKDenmark
DJDjibouti
DODominican Republic
ETEthiopia
HKHong Kong
ISIceland
ILIsrael
ITItaly
KEKenya
LSLesotho
LTLithuania
LULuxembourg
NGNigeria
NONorway
PTPortugal
RURussia
SASaudi Arabia
SKSlovakia
TWTaiwan
VNVietnam
EGEgypt
EEEstonia
HNHonduras
HUHungary
KRKorea
NINicaragua
RORomania
TNTunisia
UZUzbekistan
ZMZambia
JMJamaica
ZASouth Africa
CNChina
HRCroatia
GEGeorgia
DEGermany
INIndia
JPJapan
KZKazakhstan
MWMalawi
MXMexico
NZNew Zealand
SZSwaziland
FIFinland
FRFrance
GRGreece
IEIreland
IMIsle of Man
MYMalaysia
MZMozambique
NLNetherlands
PEPeru
RSSerbia
SGSingapore
CHSwitzerland
TRTurkey
UAUkraine
AEUnited Arab Emirates
USUnited States
NANamibia
SESweden
ZWZimbabwe
LVLatvia
MKNorth Macedonia
PYParaguay
PLPoland
ESSpain
VEVenezuela
MAMorocco
SISlovenia
GBUnited Kingdom
UYUruguay