Date Dimension with Global Holiday Calendar
Overview
A date dimension is an essential table in a data model that allows to analyze performance more effectively across different time periods. It should be included in every dimensional model that contains a date or requires date intelligence as part of the analysis. The date dimension plays an important role in the data warehouse design. It provides the ability to study behavior and trend the data over a period of time. The Date Dimension contains a continuous range of dates that cover the entire date period required for the analysis. It also includes columns that will filter the data by almost any date logic. It includes the day of the week, workdays, weekends, quarters, months, years.
A key problem when working with dates is the impact of holidays and weekends. To manage this complexity, a holiday calendar is used. A holiday calendar implementation keeps track of which dates are holidays and which are weekends. Different countries have different holiday dates and thus different calendars. It is not unusual for individual exchanges or other financial entities to have their own calendar. Each holiday calendar represents a consistent set of weekends and holidays. In all cases they refer to a specific location such as Subdivisions. The holiday calendar consists of various countries and subdivisions.
Another key problem when working with dates is calculating one date relative to another taking into account holidays, weekends and other conventions. To handle this, a set of date manipulations are provided in the table below (DIM_DATE table). Hence the holiday calendar and date_dim table are linked together.
FlowForward.
All Things Data Engineering
Straight to Your Inbox!
Data Model
The data model below illustrates that tables are related to each other and we have derived the data model using the SQL DBM (SQL Database Modeler is an online browser-based database design tool that also imports existing DB, generates SQL, and allows collaboration).
Data Tables
There are five data tables as described below. These tables contain the date dimension and holiday calendars for countries and first level subdivisions.
- DIM_DATE
- COUNTRY_CODE
- SUBDIVISION_CODE
- HOLIDAY_CALENDAR_COUNTRY
- HOLIDAY_CALENDAR_SUBDIVISION
Date Dimension Table
The date dim table contains the date range from January 01, 1900 to December 31, 2100. The fields with its description are defined below.
Colum Name | Data Type | Description |
---|---|---|
DATE_PKEY | NUMBER(9,0) | Primary Key for the table, which contains the unique date values. |
DATE | DATE | It is the date field of the date dimension |
FULL_DATE_DESC | VARCHAR(64) | it contains the day name along with its actual long date format e.g Monday, January 01, 1900 |
DAY_NUM_IN_WEEK | NUMBER(1,0) | It is the day number in a week which starts from Sunday as the first day in the week. |
DAY_NUM_IN_WEEK_ISO_8601 | NUMBER(1,0) | It is the day number in a week which starts from Monday as the first day in the week as in ISO 8601 |
DAY_NUM_IN_MONTH | NUMBER(2,0) | It is the day number in a Month such as 30 days. |
DAY_NUM_IN_YEAR | NUMBER(3,0) | It is the day number in a Year such as 365 days. |
DAY_NAME | VARCHAR(10) | Full name of the day name e.g Sunday |
DAY_ABBREV | VARCHAR(3) | Abbreviation of the day name e.g Sun |
WEEKDAY_IND | VARCHAR(64) | It indicates, if it is weekday or weekend |
MONTH_END_IND | VARCHAR(64) | It indicates the month end or not a month end |
WEEK_BEGIN_DATE_NKEY | NUMBER(9,0) | start date of the week in the format of 19000101 |
WEEK_BEGIN_DATE | DATE | start date of the week in the format of 1900-01-01 |
WEEK_END_DATE_NKEY | NUMBER(9,0) | End date of the week in the format of 19000107 |
WEEK_END_DATE | DATE | End date of the week in the format of 1900-01-07 |
WEEK_NUM_IN_YEAR | NUMBER(9,0) | It is the week number in the year e.g 1st week of the year |
MONTH_NAME | VARCHAR(10) | Full name of the month e.g January |
MONTH_ABBREV | VARCHAR(3) | Abbreviation of the month e.g Jan |
MONTH_NUM_IN_YEAR | NUMBER(2,0) | Month Number of the year e.g 1 is for January |
YEARMONTH | VARCHAR(10) | It is the year and month number e.g 1900-01 for January 1900 |
QUARTER | NUMBER(1,0) | Quarter number is the year |
YEARQUARTER | VARCHAR(10) | It is the year and quarter number e.g 1900-01 first quarter of the year 1900 |
YEAR | NUMBER(5,0) | Year field |
Country Code
Country code has all the available countries and its ISO 3166 country code. The code is a two letter code which defines the country name. The other field has the actual country name.
Colum Name | Data Type | Description |
---|---|---|
COUNTRY_CODE | VARCHAR(5) | It is the country code of the country e.g CA is for Canada |
COUNTRY | VARCHAR(100) | Full name of the country |
Subdivision Code
Subdivisions belong to countries. Hence each subdivision has a mapping country as shown in the above data model.
Colum Name | Data Type | Description |
---|---|---|
SUBDIVISION | VARCHAR(50) | Subdivision code primary key, it is the combination of the subdivision and country code. |
SUBDIVISION_CODE | VARCHAR(25) | It is the subdivision code e.g AB is for Alberta in Canada |
COUNTRY_CODE | VARCHAR(5) | It is the country code |
SUBDIVISION_NAME | VARCHAR(200) | Full name of the subdivision |
Holiday Calendar of Country
This calendar table contains all the holidays of the particular country. This table has 40 years (20 years previous data and 20 years future data) holiday data for the countries.
Colum Name | Data Type | Description |
---|---|---|
HOL_DATE_PKEY | VARCHAR(15) | It is the holiday calendar primary key consists of the date and the country code |
DATE_FKEY | NUMBER(9,0) | It is linked to the dim_date primary key to get the details of the date. |
HDATE | DATE | It is the actual date of the holiday |
HOLIDAY_NAME | VARCHAR(500) | It is the full name of the holiday. |
OBSERVED_IND | VARCHAR(10) | It shows if it is an observed holiday. |
COUNTRY_FKEY | VARCHAR(10) | Country code of the country |
Holiday Calendar of Subdivision
This calendar table contains all the holidays of the particular subdivision. This table has 40 years (20 years previous data and 20 years future data) holiday data for the subdivisions.
Colum Name | Data Type | Description |
---|---|---|
HOL_DATE_PKEY | VARCHAR(50) | It is the holiday calendar primary key consists of the date and the subdivision code |
DATE_FKEY | NUMBER(9,0) | It is linked to the dim_date primary key to get the details of the date. |
HDATE | DATE | It is the actual date of the holiday |
HOLIDAY_NAME | VARCHAR(500) | It is the full name of the holiday. |
OBSERVED_IND | VARCHAR(10) | It shows if it is an observed holiday. |
SUBDIVISION | VARCHAR(50) | Subdivision code of the actual subdivision |
Usage
The most obvious advantage to using a calendar table is that we don’t have to create calculated fields for every component of our date. While doing data analysis, there might be a requirement to exclude weekends and holidays from an analysis, hence we have the weekend indicator and holiday calendars for a better analysis. There are few examples provided below. Since our calendar table contains every date, we can first scaffold our data to ensure all dates are counted. We can then use the weekend and holiday indicators to properly exclude or count days. Calendar table scaffolds also allow us to easily count business days between two dates.
Lookup
- Find out the holidays for USA
1 2 3 4 5 6 7 8 9 10 11 |
/ ********* Find the holidays for year 2022 of USA ***********/ SELECT H.HDATE AS DATE, H.HOLIDAY_NAME, C.COUNTRY FROM COUNTRY_CODE C INNER JOIN HOLIDAY_CALENDAR_COUNTRY H ON C.COUNTRY_CODE = H.COUNTRY_FKEY INNER JOIN DIM_DATE D ON D.DATE_PKEY = H.DATE_FKEY WHERE C.COUNTRY = 'United States' AND D.YEAR = '2022' |
2. Find out the holidays for Alberta subdivision under Canada
1 2 3 4 5 6 7 8 9 10 11 12 13 |
/ ********* Find the holidays for year 2023 of CANADA and its Subdivision British Columbia ***********/ SELECT HS.HDATE AS DATE, HS.HOLIDAY_NAME, C.COUNTRY, S.SUBDIVISION_NAME FROM HOLIDAY_CALENDAR_SUBDIVISION HS INNER JOIN SUBDIVISION_CODE S ON HS.SUBDIVISION = S.SUBDIVISION INNER JOIN COUNTRY_CODE C ON S.COUNTRY_CODE = C.COUNTRY_CODE INNER JOIN DIM_DATE D ON D.DATE_PKEY = HS.DATE_FKEY WHERE C.COUNTRY = 'Canada' AND D.YEAR = '2023' AND S.SUBDIVISION_NAME = 'British Columbia' |
3. Find the observed holidays in Alberta for 2021
1 2 3 4 5 6 7 8 9 10 11 12 13 |
/************ Find the observed holidays in Alberta for 2021 ***********/ SELECT HS.HDATE AS DATE, HS.HOLIDAY_NAME, C.COUNTRY, S.SUBDIVISION_NAME, HS.OBSERVED_IND FROM HOLIDAY_CALENDAR_SUBDIVISION HS INNER JOIN SUBDIVISION_CODE S ON HS.SUBDIVISION = S.SUBDIVISION INNER JOIN COUNTRY_CODE C ON S.COUNTRY_CODE = C.COUNTRY_CODE INNER JOIN DIM_DATE D ON D.DATE_PKEY = HS.DATE_FKEY WHERE C.COUNTRY = 'Canada' AND D.YEAR = '2021' AND S.SUBDIVISION_NAME = 'Alberta' AND HS.OBSERVED_IND = 'true' |
4. Show the total number of weeks in 2022
1 2 3 4 5 6 7 |
/************ Show the total number of weeks in second quarter of 2022 ***********/ SELECT COUNT(DISTINCT WEEK_NUM_IN_YEAR) AS WEEK_COUNT FROM DIM_DATE WHERE YEAR = '2022' and QUARTER = '2' |
How to subscribe
To subscribe to the dataset go to this link and click on GET button.
Summary
It is important to include a robust date dimension in the model, as it will allow the report to take advantage of date intelligence functionality.
- Using an existing date dimension in the source layer is a great option if it already exists and is reliably updated and maintained.
- The SQL method takes advantage of the speed in processing in the data source server itself. It can potentially improve the performance of generating the date dimension.
- The date dimension table may lack the flexibility of customization if it is brought into the data model directly from the source layer. Any necessary modifications may be dependent on an IT or DBA team, resulting in a loss of control of development time.
- If a custom date dimension is required, SQL is characterized by limited date intelligence functions. It can be more difficult to create all the columns needed when compared to other languages.
Download Sample Data
The samples below are the date dimension and holiday calendar for the year 2021 of Germany and it’s Subdivisions. It can be downloaded from the links below. The files below are downloadable in TSV format.
- Country Code (This file contains the ISO country code of germany)
- Date Dimension (This file contains the date dimension for 2021)
- Holidays of Germany and Subdivisions (This file contain the holidays of Germany and its subdivisions)
- Germany Holidays (This file contains the holidays of Germany for 2021)
- Holidays of Subdivision of Germany (This file contains the holidays of Subdivision Germany for 2021)
- Subdivisions of Germany (This file contains the name of subdivision and it’s ISO code)
List of countries that support Date Dimension
COUNTRY_CODE | COUNTRY |
---|---|
AO | Angola |
AR | Argentina |
AW | Aruba |
AU | Australia |
AT | Austria |
AZ | Azerbaijan |
BD | Bangladesh |
BY | Belarus |
BE | Belgium |
BW | Botswana |
BR | Brazil |
BG | Bulgaria |
BI | Burundi |
CA | Canada |
CL | Chile |
CO | Colombia |
CW | Curaçao |
CZ | Czechia |
DK | Denmark |
DJ | Djibouti |
DO | Dominican Republic |
ET | Ethiopia |
HK | Hong Kong |
IS | Iceland |
IL | Israel |
IT | Italy |
KE | Kenya |
LS | Lesotho |
LT | Lithuania |
LU | Luxembourg |
NG | Nigeria |
NO | Norway |
PT | Portugal |
RU | Russia |
SA | Saudi Arabia |
SK | Slovakia |
TW | Taiwan |
VN | Vietnam |
EG | Egypt |
EE | Estonia |
HN | Honduras |
HU | Hungary |
KR | Korea |
NI | Nicaragua |
RO | Romania |
TN | Tunisia |
UZ | Uzbekistan |
ZM | Zambia |
JM | Jamaica |
ZA | South Africa |
CN | China |
HR | Croatia |
GE | Georgia |
DE | Germany |
IN | India |
JP | Japan |
KZ | Kazakhstan |
MW | Malawi |
MX | Mexico |
NZ | New Zealand |
SZ | Swaziland |
FI | Finland |
FR | France |
GR | Greece |
IE | Ireland |
IM | Isle of Man |
MY | Malaysia |
MZ | Mozambique |
NL | Netherlands |
PE | Peru |
RS | Serbia |
SG | Singapore |
CH | Switzerland |
TR | Turkey |
UA | Ukraine |
AE | United Arab Emirates |
US | United States |
NA | Namibia |
SE | Sweden |
ZW | Zimbabwe |
LV | Latvia |
MK | North Macedonia |
PY | Paraguay |
PL | Poland |
ES | Spain |
VE | Venezuela |
MA | Morocco |
SI | Slovenia |
GB | United Kingdom |
UY | Uruguay |