This post is a part of Bit Series blog.
Every data-warehouse project needs to have calendar table as a dimension, which hosts all possible combinations of dates. This table can be used as a dimension or in many cases to derive the dates.
With this post, we will be seeing on how we can generate a calendar table directly using Vertica SQL. The Vertica SQL will be using a special Time series function which is inbuilt inside vertica.
The time series function of vertica enables us to perform analytics based on time-series and gap filling. You can get more information about it here.
For now we will stick to this user of generating the calendar using this function as below
CREATE TABLE calendar AS SELECT slice_time AS actual_timestamp , CAST(slice_time AS DATE) AS DATE , date_part('WEEK', slice_time) AS week, date_part('YEAR', slice_time) AS YEAR, date_part('MONTH', slice_time) AS MONTH, date_part('DAY', slice_time) AS DAY, date_part('QUARTER', slice_time) AS quarter FROM ( SELECT '2019-01-01 00:00:00' AS sample_date UNION SELECT '2019-12-31 00:00:00' AS sample_date ) a TIMESERIES slice_time AS '1 DAY' OVER(ORDER BY sample_date::VARCHAR::TIMESTAMP)
So, what are we doing here
1. We take 2 sample dates, between which we need to fill the gap, suppose you want the calendar to be for year 2019, take the date as shown above in subquery
2. We are using 1 vertica function here, TIMESERIES , slice_time is just an alias here. TIMESERIES is going to generate the dates based on interval specified, in this case ‘1 DAY’
3. Lastly the TIMESERIES function will always require a ORDER by operation on timestamp column. Here we are just sorting the data based on timestamp generated.
4. Once we get a daily timestamp, we are just using standard date functions to derive various date time properties of that timestamp, like day, week, quarter, year, month, etc.
5. Finally, we are just storing the data from the select statement in the CALENDAR table
You can check the results of this table, and it can be used easily as dimension table for Calendar use case.
Go ahead and try generating this data at your end, there are many possibilities, where you can use this data.