Vertica

How to build a calendar using Vertica SQL (bit 5)

This post is a part of Bit Series blog.

Every data-warehouse project needs to have calendar table as a dimension, which hosts all possible combinations of dates. This table can be used as a dimension or in many cases to derive the dates.

With this post, we will be seeing on how we can generate a calendar table directly using Vertica SQL. The Vertica SQL will be using a special Time series function which is inbuilt inside vertica.

The time series function of vertica enables us to perform analytics based on time-series and gap filling. You can get more information about it here.

For now we will stick to this user of generating the calendar using this function as below

CREATE TABLE
    calendar AS
SELECT
    slice_time                       AS actual_timestamp ,
    CAST(slice_time AS DATE)         AS DATE ,
    date_part('WEEK', slice_time)    AS week,
    date_part('YEAR', slice_time)    AS YEAR,
    date_part('MONTH', slice_time)   AS MONTH,
    date_part('DAY', slice_time)     AS DAY,
    date_part('QUARTER', slice_time) AS quarter
FROM
    (
        SELECT
            '2019-01-01 00:00:00' AS sample_date
        UNION
        SELECT
            '2019-12-31 00:00:00' AS sample_date ) a TIMESERIES slice_time AS '1 DAY' OVER(ORDER BY
    sample_date::VARCHAR::TIMESTAMP)

So, what are we doing here
1. We take 2 sample dates, between which we need to fill the gap, suppose you want the calendar to be for year 2019, take the date as shown above in subquery
2. We are using 1 vertica function here, TIMESERIES , slice_time is just an alias here. TIMESERIES is going to generate the dates based on interval specified, in this case ‘1 DAY’
3. Lastly the TIMESERIES function will always require a ORDER by operation on timestamp column. Here we are just sorting the data based on timestamp generated.
4. Once we get a daily timestamp, we are just using standard date functions to derive various date time properties of that timestamp, like day, week, quarter, year, month, etc.
5. Finally, we are just storing the data from the select statement in the CALENDAR table

You can check the results of this table, and it can be used easily as dimension table for Calendar use case.

Go ahead and try generating this data at your end, there are many possibilities, where you can use this data.

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out /  Change )

Google photo

You are commenting using your Google account. Log Out /  Change )

Twitter picture

You are commenting using your Twitter account. Log Out /  Change )

Facebook photo

You are commenting using your Facebook account. Log Out /  Change )

Connecting to %s