woensdag 30 augustus 2023

Building a Date dimension in Snowflake

Introduction

In the past, I was used to use a Date dimension (or I call it sometimes a Calendar dimension) in SQL Server and I've built it many times in SQL Server, but now for an exercise I decided to build it in Snowflake. Disclaimer is that it is not very performant at the moment because it's a row by row insertion and that is one thing where Snowflake doesn't perform not very well. But so be it. I don't expect that this stored procedure is executed very often. There, it could be an option to use it in a project. But on the other hand, I will look for faster solutions and I've already found some interesting options. But that is something for a future blogpost.


DISCLAIMER : You can use this procedure, but there are better ways to do this in Snowflake. I'll come back with a better solution. It was an exercise for me to rebuild the SQL Server stored procedure in Snowflake. 

Look for a better way here

The stored procedure


Here is the stored procedure : 

CREATE OR REPLACE PROCEDURE SP_CreateCalenderDimension(t_startdate TIMESTAMP_NTZ(9), t_enddate TIMESTAMP_NTZ(9))
RETURNS VARCHAR NOT NULL
LANGUAGE SQL
AS
-- DECLARE
--     t_startdate TIMESTAMP_NTZ(9) DEFAULT '2023-01-01';
--     t_enddate TIMESTAMP_NTZ(9) DEFAULT '2023-12-31';
BEGIN
    LET t_loopdate TIMESTAMP_NTZ(9) := t_startdate;
    LET i_calendarid INT;
    LET d_calendershortdate DATE;
    LET i_dayweek INT;                -- Monday is the first day of the week
    LET i_daymonth INT;
    LET i_dayyear INT;
    LET s_shortdayname STRING;
    LET s_longdayname STRING;
    LET i_isoweekyear INT;
    LET i_monthnumber INT;
    LET i_year INT;
    LET i_isoyear INT := TO_NUMBER(DATE_PART(year, :t_startdate));

    LET s_yearname STRING;
    LET i_quarter INT;
    LET s_yyyymm STRING(6);
    LET s_mmyyyy STRING(6);
    LET i_weekday INT;
    LET i_firstdayofmonth DATE;
    LET i_lastdayofmonth DATE;
    LET i_firstdayofquarter DATE;
    LET i_lastdayofquarter DATE;
    LET i_firstdayofyear DATE;
    LET i_lastdayofyear DATE;
    LET i_isomonthnumber INT;
    LET i_isoquarter INT;
    LET i_isoweekofyear INT;
    LET s_wwyyyy STRING(6);
    LET s_yyyyww STRING(6);
    LET b_changeisoweekyear CHAR(1) := 'N';
    
    CREATE OR REPLACE TABLE SP_DB.PUBLIC.CALENDAR (
    	CalendarID INT NOT NULL,
    	CalendarDate TIMESTAMP_NTZ(9) NOT NULL,
    	CalendarShortDate DATE NOT NULL,
    	CalendarDayOfWeek INT  NOT NULL,
    	CalendarDayOfMonth INT NOT NULL,
--        CalendarDayOfQuarter INT  NULL,	   
    	CalendarDayOfYear INT NOT NULL,
    	CalendarShortDayName STRING NOT  NULL,
    	CalendarLongDayName STRING NOT NULL,
        --  CalendarIsWorkDayOfMonth
        --  DaySuffix
        --  DayOfQuarter
        CalendarFirstDayOfMonth DATE NOT NULL,
        CalendarLastDayOfMonth DATE NOT NULL,
        CalendarFirstDayOfQuarter DATE NULL,
        CalendarLastDayOfQuarter DATE NULL,
        CalendarFirstDayOfYear DATE NULL,
        CalendarLastDayOfYear DATE NULL,
        CalendarIsWeekDay INT NULL,
        --  IsHoliday
        --  HolidayDescription

    -- Week
        CalendarISOWeekOfYear NUMBER(38,0) NULL, 
        CalendarWWYYYY STRING(6) NULL, 
        CalendarYYYYWW STRING(6) NULL, 
    
    -- Month
        CalendarMonthNumber NUMBER(38,0) NULL,
        CalendarISOMonthNumber NUMBER(38,0) NULL,    -- 445 pattern
        --  MonthName
        --  MonthOfQuarter
        CalendarMMYYYY STRING(6) NULL,         
        CalendarYYYYMM STRING(6) NULL,
        --  MonthYear
        --  YearMonth
        --  ISOMonth

    -- Quarter
        CalendarQuarter NUMBER(38,0)  NULL,
        CalendarISOQuarter NUMBER(38,0) NULL, 
        --  QuarterName

    -- Year
        CalendarYear NUMBER(38,0)  NULL,
        CalendarISOYear NUMBER NULL,
        CalendarYearName STRING NULL
    );
  
    WHILE (t_loopdate <= t_enddate) DO    
        -- day
        SET i_calendarid := TO_NUMBER(TO_CHAR(:t_loopdate,'yyyymmdd'));
        SET d_calendershortdate := TO_DATE(:t_loopdate);
        SET i_dayweek := DATE_PART(dayofweekiso, :t_loopdate);      
        SET i_daymonth := DATE_PART(day, :t_loopdate);
        SET i_dayyear := DATE_PART(dayofyear, :t_loopdate);
        SET s_shortdayname := DAYNAME(:t_loopdate);
        SET s_longdayname := CASE 
            WHEN :s_shortdayname = 'Sun' THEN 'Sunday' 
            WHEN :s_shortdayname = 'Mon' THEN 'Monday' 
            WHEN :s_shortdayname = 'Tue' THEN 'Tuesday' 
            WHEN :s_shortdayname = 'Wed' THEN 'Wednesday' 
            WHEN :s_shortdayname = 'Thu' THEN 'Thursday' 
            WHEN :s_shortdayname = 'Fri' THEN 'Friday' 
            WHEN :s_shortdayname = 'Sat' THEN 'Saturday' 
        END;
        SET i_weekday := CASE WHEN DATE_PART(dayofweekiso, :t_loopdate) NOT IN (6,7) THEN 1 ELSE 0 END;
        SET i_firstdayofmonth := TO_DATE(DATEADD(day, - (DATE_PART(day, :t_loopdate) - 1), :t_loopdate));
        SET i_lastdayofmonth := LAST_DAY(:t_loopdate, 'month');
        SET i_firstdayofquarter := TO_DATE(DATEADD(quarter, DATEDIFF(quarter, '2000-01-01'::TIMESTAMP, :t_loopdate), '2000-01-01'::TIMESTAMP));
        SET i_lastdayofquarter :=  LAST_DAY(:t_loopdate, 'quarter');
        SET i_firstdayofyear := TO_DATE('01/01/' || TO_VARCHAR(DATE_PART(year, :t_loopdate)));
        SET i_lastdayofyear := LAST_DAY(:t_loopdate, 'year');

        -- week
        SET i_isoweekyear := DATE_PART(weekiso, :t_loopdate); 
 
    	-- Determine the month based on the weeknumber (based on a 4-4-5 period)
    	SET i_isomonthnumber  := CASE	WHEN :i_isoweekyear >= 1 AND :i_isoweekyear <= 4 Then 1		--4
    									WHEN :i_isoweekyear >= 5 AND :i_isoweekyear <= 8 Then 2		--4
    									WHEN :i_isoweekyear >= 9 AND :i_isoweekyear <= 13 Then 3	--5
    									WHEN :i_isoweekyear >= 14 AND :i_isoweekyear <= 17 Then 4	--4	
    									WHEN :i_isoweekyear >= 18 AND :i_isoweekyear <= 21 Then 5	--4
    									WHEN :i_isoweekyear >= 22 AND :i_isoweekyear <= 27 Then 6	--5
    									WHEN :i_isoweekyear >= 28 AND :i_isoweekyear <= 31 Then 7	--4
    									WHEN :i_isoweekyear >= 32 AND :i_isoweekyear <= 35 Then 8	--4
    									WHEN :i_isoweekyear >= 36 AND :i_isoweekyear <= 40 Then 9	--5
    									WHEN :i_isoweekyear >= 41 AND :i_isoweekyear <= 44 Then 10	--4
    									WHEN :i_isoweekyear >= 45 AND :i_isoweekyear <= 48 Then 11	--4	
    									WHEN :i_isoweekyear >= 49 AND :i_isoweekyear <= 53 Then 12	--5
    							END;
    
        -- Month
        SET i_monthnumber := DATE_PART(month, :t_loopdate);

        SET i_quarter := DATE_PART(quarter, :t_loopdate);
        SET s_yyyymm := TO_VARCHAR(DATE_PART(year, :t_LoopDate)) || RIGHT('0' || TO_VARCHAR(DATE_PART(month, :t_LoopDate)),2);
        SET s_mmyyyy := RIGHT('0' || TO_VARCHAR(DATE_PART(month, :t_LoopDate)),2) || TO_VARCHAR(DATE_PART(year, :t_LoopDate));

        -- Quarter
    	SET i_isoquarter  := CASE WHEN :i_isoweekyear >= 1 AND  :i_isoweekyear <= 13 Then 1	
    	 						  WHEN :i_isoweekyear >= 14 AND :i_isoweekyear <= 27 Then 2	
    	 						  WHEN :i_isoweekyear >= 28 AND :i_isoweekyear <= 40 Then 3	
    	 						  WHEN :i_isoweekyear >= 41 AND :i_isoweekyear <= 53 Then 4		
                            END;

        -- Year
        SET i_year := DATE_PART(year, :t_loopdate);
        SET s_yearname := 'CY ' || TO_VARCHAR(:i_year);


        -- Post processing (dependent on isoyear)
        SET s_wwyyyy := RIGHT('0' || TO_VARCHAR(:i_isoweekyear),2) || TO_VARCHAR (:i_isoweekyear);
    	SET s_yyyyww := TO_VARCHAR (:i_isoweekyear) || RIGHT('0' || TO_VARCHAR(:i_isoweekyear),2);

        IF (:i_isoweekyear = 1 AND :b_changeisoweekyear = 'Y') THEN
            SET i_isoyear := :i_isoyear + 1; 
		    SET b_changeisoweekyear := 'N';
        END IF;
        IF (:i_isoweekyear = 2) THEN 
            SET b_changeisoweekyear := 'Y';
        END IF;
        
        INSERT INTO Calendar (
            CalendarID,                 
            CalendarDate, 
            CalendarShortDate, 
            CalendarDayOfWeek,
            CalendarDayOfMonth, 
            CalendarDayOfYear,
            CalendarShortDayName,
            CalendarLongDayName,
            CalendarFirstDayOfMonth,
            CalendarLastDayOfMonth,
            CalendarFirstDayOfQuarter,
            CalendarLastDayOfQuarter,
            CalendarFirstDayOfYear,
            CalendarLastDayOfYear,

            
            CalendarISOWeekOfYear,
            CalendarYYYYWW,
            CalendarWWYYYY,
            
            CalendarMonthNumber,
            CalendarISOMonthNumber,
            CalendarYear,
            CalendarISOYear,
            CalendarYearName,
            CalendarQuarter,
            CalendarISOQuarter,
            CalendarYYYYMM,
            CalendarMMYYYY,
            CalendarIsWeekDay
            ) 
        SELECT     
            :i_calendarid, 
            :t_loopdate, 
            :d_calendershortdate, 
            :i_dayweek,               
            :i_daymonth, 
            :i_dayyear,
            :s_shortdayname, 
            :s_longdayname,
            :i_firstdayofmonth,
            :i_lastdayofmonth, 
            :i_firstdayofquarter,
            :i_lastdayofquarter, 
            :i_firstdayofyear, 
            :i_lastdayofyear, 
            
            :i_isoweekyear,
            :s_yyyyww,
            :s_wwyyyy,
            :i_monthnumber,
            :i_isomonthnumber,
            :i_year,
            :i_isoyear,
            :s_yearname,
            :i_quarter,
            :i_isoquarter,
            :s_yyyymm,
            :s_mmyyyy,
            :i_weekday;

        SET t_loopdate := DATEADD(DAY, 1, t_loopdate);
    END WHILE;
    
   RETURN t_loopdate;
END;

I haven't found a way to define a default value for the parameters, so I will fill some dates in the callee parameters:

CALL SP_CreateCalenderDimension('2023-01-01', '2023-12-31');

Inspection of the calendar table :

SELECT * FROM Calendar;




Final Thoughts 

In SQL Server it's performing well, but in Snowflake it's performing less. On the one side it's a stored procedure I do expect that it's not necessary to execute often. 

But future improvements could be another smarter way of generating the calendar dimension. I will look into that. I already found some examples with GENERATOR. Next improvement could be including (dutch) holidays.

Hennie

Geen opmerkingen:

Een reactie posten